Original article
Regulatory crosstalk between lineage-survival oncogenes KLF5, GATA4 and GATA6 cooperatively promotes gastric cancer development
  1. Na-Yu Chia1,2,
  2. Niantao Deng1,3,
  3. Kakoli Das1,
  4. Dachuan Huang4,
  5. Longyu Hu1,5,
  6. Yansong Zhu1,
  7. Kiat Hon Lim6,
  8. Ming-Hui Lee7,
  9. Jeanie Wu7,
  10. Xin Xiu Sam6,
  11. Gek San Tan6,
  12. Wei Keat Wan6,
  13. Willie Yu4,
  14. Anna Gan4,
  15. Angie Lay Keng Tan1,
  16. Su-Ting Tay1,
  17. Khee Chee Soo8,
  18. Wai Keong Wong9,
  19. Lourdes Trinidad M Dominguez9,
  20. Huck-Hui Ng10,
  21. Steve Rozen1,2,
  22. Liang-Kee Goh1,11,
  23. Bin-Tean Teh1,4,
  24. Patrick Tan1,5,7,10
  1. 1Cancer and Stem Cell Biology program, Duke-NUS Graduate Medical School Singapore, Singapore, Singapore
  2. 2A*STAR-Duke-NUS Neuroscience Partnership, Duke-NUS Graduate Medical School Singapore, Singapore, Singapore
  3. 3NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore
  4. 4Laboratory of Cancer Epigenome, National Cancer Centre, Singapore, Singapore
  5. 5Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
  6. 6Department of Pathology, Singapore General Hospital, Singapore, Singapore
  7. 7Cellular and Molecular Research, National Cancer Centre, Singapore, Singapore
  8. 8Department of Surgical Oncology, National Cancer Centre, Singapore, Singapore
  9. 9Dept of General Surgery, Singapore General Hospital, Singapore, Singapore
  10. 10Genome Institute of Singapore, Singapore, Singapore
  11. 11Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
  1. Correspondence to Dr Patrick Tan, Cancer and Stem Cell Biology, Duke-NUS Graduate Medical School Singapore, 8 College Road, Singapore 169857, Singapore; gmstanp{at}


Objective Gastric cancer (GC) is a deadly malignancy for which new therapeutic strategies are needed. Three transcription factors, KLF5, GATA4 and GATA6, have been previously reported to exhibit genomic amplification in GC. We sought to validate these findings, investigate how these factors function to promote GC, and identify potential treatment strategies for GCs harbouring these amplifications.

Design KLF5, GATA4 and GATA6 copy number and gene expression were examined in multiple GC cohorts. Chromatin immunoprecipitation with DNA sequencing was used to identify KLF5/GATA4/GATA6 genomic binding sites in GC cell lines, and integrated with transcriptomics to highlight direct target genes. Phenotypical assays were conducted to assess the function of these factors in GC cell lines and xenografts in nude mice.

Results KLF5, GATA4 and GATA6 amplifications were confirmed in independent GC cohorts. Although factor amplifications occurred in distinct sets of GCs, they exhibited significant mRNA coexpression in primary GCs, consistent with KLF5/GATA4/GATA6 cross-regulation. Chromatin immunoprecipitation with DNA sequencing revealed a large number of genomic sites co-occupied by KLF5 and GATA4/GATA6, primarily located at gene promoters and exhibiting higher binding strengths. KLF5 physically interacted with GATA factors, supporting KLF5/GATA4/GATA6 cooperative regulation on co-occupied genes. Depletion and overexpression of these factors, singly or in combination, reduced and promoted cancer proliferation, respectively, in vitro and in vivo. Among the KLF5/GATA4/GATA6 direct target genes relevant for cancer development, one target gene, HNF4α, was also required for GC proliferation and could be targeted by the antidiabetic drug metformin, revealing a therapeutic opportunity for KLF5/GATA4/GATA6 amplified GCs.

Conclusions KLF5/GATA4/GATA6 may promote GC development by engaging in mutual crosstalk, collaborating to maintain a pro-oncogenic transcriptional regulatory network in GC cells.

  • Gastric Cancer

Significance of this study

What is already known about this subject?

  • Frequent amplifications of KLF5, GATA4 and/or GATA6 have been reported in gastric cancer (GC) and other GI cancers, indicating their important role in GI cancer development.

  • Little is known about how KLF5/GATA4/GATA6 promotes GC tumorigenesis, whether cooperatively or independently, and which genes are the direct targets of these transcription factors.

What are the new findings?

  • KLF5/GATA4/GATA6 exhibited significant mRNA coexpression pattern, despite being amplified in distinct sets of tumours.

  • Chromatin immunoprecipitation with DNA sequencing revealed a large number of genomic sites co-occupied by KLF5 and GATA4/GATA6, suggesting that these factors may cooperate to regulate common downstream genes.

  • Depletion and overexpression of these factors, singly or in combination, reduced and promoted cancer proliferation, respectively, in vitro and in vivo.

  • HNF4α, a direct target of KLF5/GATA4/GATA6, was required for GC proliferation, and could be targeted by the antidiabetic drug metformin.

How might it impact on clinical practice in the foreseeable future?

  • KLF5/GATA4/GATA6 amplifications were observed in ∼30% of all GC cases, representing a significant subgroup for potential targeted therapy.

  • Metformin provides a targeted-therapy option for KLF5/GATA4/GATA6 amplified GCs through their direct target HNF4α.


Gastric cancer (GC) is a leading cause of global cancer mortality, responsible for ∼700 000 deaths annually, and is still highly prevalent in many parts of Asia, Eastern Europe and South America.1 Most patients with GC are diagnosed at advanced disease stages, and overall 5-year survival rates for patients with resectable GC remain low at 10–30% despite clinical advances in surgery and therapy.2 ,3 There is an urgent need to identify specific genetic alterations associated with GC, and to exploit these alterations to reveal novel therapeutic opportunities.

Most GCs are adenocarcinomas and frequently exhibit somatic copy number alterations (SCNAs).4 ,5 Importantly, genes exhibiting recurrent SCNA can represent therapeutic targets, as exemplified by the receptor tyrosine kinase ERBB2/HER2, amplified in 8–10% of GCs.3 We and others have recently delineated the near-complete repertoire of genes frequently affected by recurrent SCNAs in GC.5 ,6 Besides oncogenes such as KRAS, EGFR and ERBB2, which can regulate oncogenesis in multiple tissues, we also observed genomic amplification of three transcription factors, KLF5, GATA4 and GATA6, in GC.5 KLF5 is a Krüppel-like transcription factor with diverse functions in cell differentiation and embryonic development,7 ,8 while GATA4 and GATA6 are required for development and differentiation of endodermal and mesodermal tissues.9 ,10 These findings are intriguing as previous research has shown that cancers of certain lineages may sometimes amplify transcription factors involved in early development of that lineage, such as MITF in melanoma11 and NKX2-1 in lung cancer.12 Such factors, termed ‘lineage-survival oncogenes’,13 may promote cancer in a tissue-specific manner by reactivating early developmental programmes.

Deregulation of KLF5, GATA4 and GATA6 as single factors has previously been reported in tumours, particularly in GI cancers. For example, GATA6 has been implicated in oesophageal and pancreatic cancers14 ,15 while KLF5 may be involved in pancreatic16 and intestinal tumours16 ,17 (see Discussion). However, additional research is required to clarify the exact functions of these factors in cancer development. For example, it remains unknown which genes are direct targets of these factors, and it is unclear if KLF5, GATA4 and GATA6 each act independently to promote cancer, or if they are able to engage in crosstalk and collaborate with one another. From a therapeutic standpoint, KLF5, GATA4 and GATA6 are also considered unattractive targets, as transcription factors are traditionally ‘undruggable’. It is thus essential to identify robust downstream targets of these factors which can be targeted in primary human GCs, as these might suggest promising avenues for treating KLF5/GATA4/GATA6 amplified tumours. Indeed, similar strategies have proven successful for MITF in melanoma.18

Here, we hypothesised that KLF5, GATA4 and GATA6 are lineage-survival oncogenes in GC acting in a collaborative fashion. To investigate this possibility, we mapped regions of KLF5, GATA4 and GATA6 genomic occupancy in GC cell lines and found that KLF5, GATA4 and GATA6 are likely to engage in mutual crosstalk to drive GC development. Moreover, one KLF5/GATA4/GATA6 target in GC cell lines and primary tumours, the HNF4α nuclear receptor, may provide a therapeutic opportunity for KLF5/GATA4/GATA6-amplified GCs by imparting sensitivity to the antidiabetic drug metformin.

Materials and methods

Copy number, gene expression and transcription factor independency analysis

Genomic identification of significant targets in cancer was used to characterise copy number amplifications. Expression profiling data was obtained/generated using Affymetrix U133 Plus 2.0 arrays (Santa Clara, California, USA). Independent amplification analysis was calculated using Fisher's tests and Mutual Exclusivity Modules (MEMo) in cancer (see online supplementary text).

Cell culture, proliferation and colony formation assays

YCC3 (from Yonsei Cancer Centre, South Korea), AGS and Hela (from American Type Culture Collection) and KATO-III cells (Japan Health Science Research Resource Bank, Japan) were used. Proliferation assays were performed using the Promega CellTiter 96 System (Promega, Madison, Wisconsin, USA). Colony formation assays were performed in six-well dishes over 3 weeks. Colonies were counted using Image J software.19

Chromatin immunoprecipitation, chromatin immunoprecipitation sequencing, sequential chromatin immunoprecipitation

Cells were cross-linked with 1% formaldehyde. Chromatin was extracted and sonicated to 500 bp. KLF5 (sc-22797), GATA4 (sc-1237) and GATA6 (sc-9055) antibodies were used for chromatin immunoprecipitation (ChIP) (Santa Cruz Biotechnology, Dallas, Texas, USA). 10 ng of ChIPed DNA was used for ChIP with DNA sequencing (ChIP-seq) library construction following manufacturer protocols (Illumina, San Diego, California, USA). Input DNA from cells prior to immunoprecipitation was used to normalise ChIP-seq peak calling. Sequential ChIP uses two rounds of immunoprecipitation, with the first antibody cross-linked onto magnetic beads. The immunoprecipitated DNA from the first antibody was eluted for the second immunoprecipitation. Immunoprecipitated DNA (from ChIP) or second immunoprecipitated DNA (from sequential ChIP) was purified using phenol-chloroform and subjected to quantitative real-time PCR or sequencing on an Illumina GAIIx sequencer (see online supplementary text for additional details).

ChIP-seq motif analysis

ChIP-seq sequence reads were mapped to University of California Santa Cruz (UCSC) Genome Build hg19. Peaks were called using MACS V.1.4.1 (see online supplementary text), using the default peak calling threshold (p<1×10−5) similar to other studies.20–22 De novo motif analysis was performed using Weeder and multiple expectation-maximization for motif elicitation (MEME), and compared against the JASPAR database.

Cell transfections

Plasmid and siRNA transfections were performed using Fugene 6 (Promega) and Dharmafect 1 (Dharmacon, Lafayette, Colorado, USA), respectively, according to the manufacturer's protocol. KLF5, GATA4, GATA6 and HNF4α siRNAs (Dharmacon), and KLF5 and HNF4α overexpression plasmids (Origene, Rockville, Maryland, USA) were used.

Animal studies

The ethical use of animals in this study was approved by the institutional care and use committee of the SingHealth Institutional Review Board in Singapore under protocol number #2011/SHS/636. Athymic NCr nude age-matched male mice between 5 weeks and 7 weeks old were subcutaneously implanted with parental cells or genetically engineered cells (eg KLF5-silenced) into the lower flanks. Tumour diameters were measured twice per week.

Western blotting and Co-immunoprecipitation (Co-IP)

For western blotting, cells were lysed using radio-immunoprecipitation assay (RIPA) buffer (Sigma, St. Louis, Missouri, USA) in the presence of protease inhibitors. Lysates were probed with KLF5 (sc-22797), GATA4 (sc-1237), GATA6 (sc-9055) or HNF4α (sc-8987) antibodies (Santa Cruz Biotechnology). For coimmunoprecipitation experiments, nuclear extracts were isolated using a NE-PER Nuclear and Cytoplasmic Extraction Kit (Pierce, Rockford, Illinois, USA) according to the manufacturer's protocol. Antibodies were incubated with Protein A/G Plus Agarose beads (Pierce) for 2 h at room temperature. The beads were then washed twice with cell lysis buffer and nuclear extract was subsequently added to the antibody-bound beads. The mixture was nutated on a rotator overnight at 4°C. For in vitro translation experiments, KLF5 and GATA4 were translated in vitro using the TnT® Quick coupled Transcription/Translation kit (Promega). Translated proteins were mixed on a rotator at 4°C to allow for interaction. Independently, 5 µg of antibodies were added to protein A/G beads and nutated for 2 h at room temperature. After 2 h, the antibody beads were washed twice and added to the protein mixture. The mixture was rotated at 4°C overnight, and the next day the beads were washed thrice. To elute the proteins, 60 µL of 2× sample buffer was added to the beads, and 10 µL of the sample was analysed by gel electrophoresis.

Fluorescent in situ hybridisation (FISH), immunohistochemistry (IHC)

KLF5 BAC clones and GATA4 and GATA6 SureFish probes from Agilent Technologies (Santa Clara, California, USA) were used for fluorescent in situ hybridisation (FISH). Protein expression of KLF5, GATA4, GATA6 and HNF4α was assessed by immunohistochemistry (IHC) on samples with high expression of KLF5/GATA4/GATA6.

Data accession

Data sets used are: Singapore primary GC cohort Affymetrix SNP6.0—Gene Expression Omnibus (GEO) Accession Number GSE31168, Singapore primary GC Cohort Affymetrix Plus2.0—GSE15459, Validation cohort Affymetrix U133A and Plus 2.0—GSE34942, GSE35809 and GSE37023; KLF5/GATA4/GATA6 ChIP-seq and perturbations—GSE51706. TCGA GC copy number segmentation data was downloaded from the TCGA data portal (December 2012).

Additional methodological details covering MEMo, MACS, ChIP-seq and electrophoretic mobility shift assays are provided in the online supplementary text.


Tissue-specific copy number amplification of KLF5, GATA4 and GATA6 in GC

Using high-resolution single nucleotide polymorphism (SNP) arrays, we previously found that GATA4, GATA6 and KLF5 were recurrently amplified in GC (figure 1A, top).5 We sought to confirm the tissue-specificity of these amplifications, and validate these findings in an independent GC cohort. We confirmed the SNP array data by demonstrating KLF5, GATA4 and GATA6 amplification in primary GCs and cell lines by FISH (see online supplementary figure S1A). In three independent SCNA studies of other tumour types, including glioblastoma,23 lung adenocarcinoma12 and 26 cancer types,4 KLF5, GATA4 and GATA6 amplifications were not prevalent (see online supplementary table S1). We also validated the presence of KLF5/GATA4/GATA6 amplifications in an independent cohort of 254 GCs from the TCGA project (Dec 2012). In the TCGA cohort, we again observed recurrent amplification of KLF5, GATA4 and GATA6 in a comparable proportion of GCs (figure 1A, bottom).

Figure 1

KLF5 and GATA factors are independently amplified but coexpressed in gastric cancer (GC). (A) Focal genomic amplifications of KLF5, GATA4 and GATA6 in GC. Chromosomal regions are on the x axis, and genomic identification of significant targets in cancer (GISTIC) computed false discovery rate q-values on the y axis. Top plot—Singapore cohort (SG), bottom plot—TCGA cohort (TCGA). Genes within or directly adjacent to GISTIC peaks are shown. GATA4, GATA6 and KLF5 are in red. (B) KLF5, GATA4 and GATA6 independent amplification. Each column represents an individual tumour exhibiting either KLF5, GATA4 or GATA6 amplification (rows). Red gradient—degree of amplification. The independent amplification pattern is significant (p<10−4, Fisher's test; see Results) for both cohorts. (C) KLF5, GATA4 and GATA6 gene expression in GCs with or without KLF5/GATA4/GATA6 CNAs. GCs with transcription factor amplifications express significantly higher transcription factor levels. *KLF5: p=6×10−4; GATA4: p=4×10−5; GATA6: p=5×10−3. (D) and (E). Correlation plots comparing KLF5/GATA4/GATA6 gene expression in GC cohorts. The first cohort matches the copy number data. The second validation cohort is independent of the first cohort (see Methods). r, correlation value; P, correlation significance. MYC was included as a negative control.

KLF5, GATA4 and GATA6 are independently amplified but coexpressed in GC

KLF5, GATA4 and GATA6 were each recurrently amplified in ∼10% of patients with GC, and collectively ∼30% of GCs exhibited amplification of at least one factor. In the Singapore GC cohort, individual tumours exhibiting amplification of one factor (KLF5, GATA4 or GATA6) seldom coamplified the other two factors (p<10−4, Fisher's test) (figure 1B, top and see online supplementary table S2). This pattern, which we refer to as independent amplification, was not observed with either MYC, another transcription factor amplified in GC, or TP53, which is not frequently amplified (see online supplementary figure S1B and S1C). The KLF5, GATA4 and GATA6 independent amplification pattern was also observed in the separate TCGA cohort (p<10−4, Fisher's test) (figure 1B, bottom and see online supplementary table S2). We additionally replicated the statistical significance of this amplification pattern using MEMo, a Monte Carlo based sampling strategy previously used to assess the independence of oncogenic pathways in cancer.24 However, when we extended this analysis to all cases within the cohorts, including patients lacking amplifications, the significance of this pattern was not retained (p=0.71; Fisher's test). As such, the observation of KLF5, GATA4 and GATA6 independent amplifications should be regarded as a hypothesis-generating, rather than definitive, finding.
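The kind of co-amplification test described above can be illustrated with a 2×2 contingency table. The following is a minimal sketch only: the function name, the table layout, the one-sided direction and the counts are all assumptions made for illustration, since the text does not state how the tables were built, and none of the numbers are data from the study.

```python
from math import comb


def fisher_left_tail(a, b, c, d):
    """One-sided Fisher's exact test for depletion of the top-left cell.

    Hypothetical table layout (not taken from the paper):
                        gene2 amplified   gene2 not amplified
    gene1 amplified            a                  b
    gene1 not amplified        c                  d

    Returns P(X <= a) under the hypergeometric null of independence.
    """
    row1 = a + b              # tumours with gene1 amplified
    col1 = a + c              # tumours with gene2 amplified
    n = a + b + c + d         # all tumours in the table
    lowest = max(0, row1 + col1 - n)   # smallest feasible co-amplified count
    denom = comb(n, col1)
    p = 0.0
    for x in range(lowest, a + 1):
        p += comb(row1, x) * comb(n - row1, col1 - x) / denom
    return p


# Toy counts: only tumours carrying at least one amplification (d = 0).
p_amplified_only = fisher_left_tail(0, 10, 10, 0)

# Same toy counts after adding 80 tumours without either amplification.
p_all_cases = fisher_left_tail(0, 10, 10, 80)
```

With these toy counts the p value is very small when only amplified tumours are tabulated and much larger once non-amplified tumours are added, which is the same qualitative sensitivity to the choice of denominator noted in the paragraph above.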

We examined gene expression profiles of KLF5, GATA4 and GATA6 in the Singapore GC cohort, where copy number alteration (CNA) and gene expression information were available. For all three factors (KLF5, GATA4 and GATA6), tumours with high CNA of a factor exhibited significantly increased expression of that factor compared with tumours with no or low amplification (‘copy number driven expression’, figure 1C). Unexpectedly, and in contrast to the independent CNA amplification pattern, we discovered that in individual tumours, KLF5, GATA4 and GATA6 were significantly positively correlated in their gene expression levels, with several GCs exhibiting high coexpression of all three factors (figure 1D). Coexpression of the three factors was associated with GCs of Lauren's intestinal subtype (p=0.004; Fisher's exact test) and remained significant in analyses confined to intestinal-type tumours only (see online supplementary figure S1D). This finding was not limited to this Singapore cohort (n=193 primary tumours), as KLF5, GATA4 and GATA6 were also significantly coexpressed in a separate validation cohort of 207 GCs from Singapore (n=108), Australia (n=70) and the UK (n=29)25 ,26 (figure 1E) (Affymetrix expression data was not available for the TCGA cohort). Using IHC, we observed that KLF5, GATA4 and GATA6 are coexpressed at the protein level in primary GCs (see online supplementary figure S1E). It is possible that KLF5, GATA4 and GATA6 may act in a cross-regulatory manner, where amplification of one factor may induce expression of the other factors (see later).
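A pairwise coexpression analysis of this kind can be sketched as follows. This is an illustrative assumption rather than the authors' pipeline: the text reports a correlation value r without naming the coefficient, so Pearson's r is used here, and the function names and expression values are hypothetical toy inputs in which MYC is constructed to be uncorrelated, mirroring its role as a negative control.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(expression):
    """Return {(gene1, gene2): r} for every unordered pair of genes."""
    genes = sorted(expression)
    return {
        (g1, g2): pearson_r(expression[g1], expression[g2])
        for g1, g2 in combinations(genes, 2)
    }


# Hypothetical expression values for four tumours (toy data only).
toy_expression = {
    "KLF5": [1.0, 2.0, 3.0, 4.0],
    "GATA4": [2.0, 4.0, 6.0, 8.0],
    "GATA6": [1.5, 2.5, 3.5, 4.5],
    "MYC": [3.0, 1.0, 4.0, 2.0],
}
toy_correlations = pairwise_correlations(toy_expression)
```

In practice a significance estimate for each r would also be needed; that step is omitted here to keep the sketch short.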

Genome-wide binding profiles of KLF5, GATA4 and GATA6 reveal targeting of common downstream pathways

To identify direct binding targets of KLF5, GATA4 and GATA6 in GC cells, we performed ChIP and ChIP-seq of these factors in three GC cell lines (YCC3, AGS and KATO-III). These lines expressed high levels of KLF5, GATA4 and GATA6 relative to other lines (see online supplementary figure S2A), and exhibited either KLF5 (KATO-III), GATA4 (YCC3) or GATA6 (AGS) focal amplifications (see online supplementary figure S2B). Antibodies used for ChIP were evaluated by western blotting and siRNA knockdown assays (see online supplementary figure S2C). Binding sites of these three factors were computed using MACS, a validated algorithm previously used in several major transcription factor ChIP-seq studies including those from the ENCODE consortium.27

Motif analysis of the KLF5 ChIP-seq peaks revealed that the top-ranked de novo consensus binding motif matched previously known KLF factors, while the GATA4/GATA6 de novo motifs matched GATA-related motifs bound by Evi1 ((T/A)GATA(A/G)) (figure 2B and see online supplementary figure S3A).28 We observed binding of each transcription factor onto its own regulatory region (figure 2A), and also cross-binding to the regulatory regions of the other factors (see online supplementary figure S3B). Specifically, the KLF5 regulatory region was occupied by GATA4 and GATA6, the GATA4 regulatory region was occupied by GATA6, and the GATA6 regulatory region was occupied by KLF5 and GATA4 (see online supplementary figure S3B). These binding patterns are consistent with potential auto-regulatory and cross-regulatory activities of KLF5, GATA4 and GATA6.
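For readers unfamiliar with consensus notation, the GATA-related motif (T/A)GATA(A/G) quoted above can be scanned for on both DNA strands with a regular expression. This is a minimal sketch for illustration only: the function names and test sequences are hypothetical, and an exact consensus match is a much cruder criterion than the de novo, matrix-based motif discovery performed with Weeder and MEME.

```python
import re

# (T/A)GATA(A/G) written as a regular expression; the lookahead allows
# overlapping matches to be reported.
GATA_PATTERN = re.compile(r"(?=[TA]GATA[AG])")
MOTIF_LENGTH = 6
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return sorted (start, strand) pairs for consensus matches.

    Starts are 0-based positions of the leftmost base on the forward strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in GATA_PATTERN.finditer(seq)]
    rc = reverse_complement(seq)
    for m in GATA_PATTERN.finditer(rc):
        # Convert a reverse-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - MOTIF_LENGTH, "-"))
    return sorted(hits)


forward_hit = find_gata_sites("CCAGATAAGG")   # motif on the forward strand
reverse_hit = find_gata_sites("CCTTATCTGG")   # motif on the reverse strand
```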

Figure 2

Genomic binding sites of KLF5, GATA4 and GATA6 in gastric cancer (GC) cell lines. (A) KLF5, GATA4 and GATA6 binding on their respective promoters in YCC3, AGS and KATO-III GC cells. The KLF5, GATA4 and GATA6 gene loci are shown (top, middle and bottom). The red bar highlights the promoter region. (B) De novo binding motif analysis of KLF5, GATA4 and GATA6 chromatin immunoprecipitation with DNA sequencing (ChIP-seq) data in YCC3, AGS and KATO-III cells (p<3×10−7). The fourth row depicts the previously described KLF (from KLF4) and GATA (from EVI1) binding motifs from the JASPAR database. (C) Overlap in KLF5, GATA4 and GATA6 binding sites among all three GC lines; YCC3, AGS and KATO-III. The extent of overlap (ie, commonly-bound sites) for each of the transcription factors is significant (p<0.01). (D) Comparison of common GATA6 binding sites in GC cells with GATA6 binding sites in differentiating and proliferating Caco-2 intestinal cancer (IC) cells. A stronger degree of overlap is observed with proliferating IC cells (39.9%) compared with differentiating cells (9.6%). (E) Ingenuity pathway analysis of KLF5, GATA4 and GATA6 binding sites. All three factors are observed to target pathways related to development, cell movement, cell death and survival, growth and proliferation (all p<10−5).

To identify conserved KLF5, GATA4 and GATA6 binding sites, we identified an overlapping ‘core’ set of 1228 (KLF5), 1800 (GATA4) and 3682 (GATA6) common binding sites in multiple lines (figure 2C). Supporting the reproducibility and specificity of our data, we observed a significant overlap of GATA6 core sites between our cell lines and an independent GC line (HUG1N) from a separate study (see online supplementary figure S3C).29 Because recent data has also implicated GATA6 in other GI cancers, we then compared the GC GATA6 core binding sites with previously reported GATA6 binding profiles in Caco-2 colorectal cancer cells.30 We observed a significant overlap in GATA6 bound genes between GC cells and proliferating Caco-2 cells, but not in differentiated Caco-2 cells (figure 2D). These findings suggest that GATA6 may function similarly in gastric and intestinal cancer cells undergoing high proliferation.

We applied Ingenuity pathway analysis to identify biological functions associated with genes targeted by KLF5, GATA4 and GATA6. We observed a strong overlap of cellular functions related to the three factors, including development, cell movement, cell death and survival, and growth and proliferation (p<10−5 for all biological functions, figure 2E). The similarities of these functions suggest that KLF5, GATA4 and GATA6 may target common downstream pathways and possibly common genes in GC cells.

KLF5 and GATA factors co-localise at gene promoters and physically interact

To assess if KLF5, GATA4 and GATA6 might coregulate common genes, we compared the genomic binding sites of these three factors on a whole-genome scale. In this analysis, common regions were defined as genomic locations where binding events overlapped by at least one base pair. An all-pairwise comparison revealed that GATA4 and GATA6 exhibited a strong overlap of binding sites in GC cells (figure 3A). For this reason, we treated the GATA4 and GATA6 binding sites as a collective unit, termed the GATAs binding sites. We investigated if KLF5 and GATAs might co-target a specific population of binding sites in the genome. By overlapping the KLF5 binding sites with GATAs binding events, we identified 1441, 4395 and 10932 sites commonly bound by KLF5 and GATAs (KLF5/GATAs) in YCC3, AGS and KATO-III cells, respectively (figure 3B). Within each cell line, the overlap between KLF5 and GATAs binding regions was significantly greater than expected by chance (p<0.01). The KLF5/GATAs binding overlap remained significant even after varying the specific ChIP-seq peak-calling threshold (see online supplementary table S3), and also when the overlap analysis was restricted to ‘core’ binding sites commonly observed in either all three lines, or two out of three lines (see online supplementary figure S4A).
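The one-base-pair overlap criterion used to define common regions can be sketched as below. The peak representation (chromosome, start, end with half-open coordinates), the function name and the coordinates are assumptions for illustration and are not taken from the study; the nested loop is chosen for readability, not speed.

```python
from collections import defaultdict


def overlapping_peaks(peaks_a, peaks_b, min_overlap=1):
    """Return peaks from peaks_a that overlap any peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates,
    so two peaks that merely abut share zero bases.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    shared = []
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom.get(chrom, []):
            # Number of bases shared by the two intervals (<= 0 if disjoint).
            if min(end, b_end) - max(start, b_start) >= min_overlap:
                shared.append((chrom, start, end))
                break
    return shared


# Hypothetical peak coordinates (toy data only).
klf5_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
gata_peaks = [("chr1", 199, 300), ("chr1", 600, 700), ("chr3", 100, 200)]
klf5_gata_shared = overlapping_peaks(klf5_peaks, gata_peaks)
```

Only the first KLF5 peak is reported as shared in this toy example: it overlaps a GATA peak by exactly one base, whereas the second peak only abuts its neighbour and the third lies on a chromosome with no GATA peaks. Assessing whether an observed overlap exceeds chance would require an additional background model, which is not sketched here.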

Figure 3

KLF5-GATAs binding sites are enriched at promoter regions and KLF5/GATAs physically interact. (A) Pairwise comparison of KLF5, GATA4, GATA6 and MYC binding. Colours reflect binding site colocalisation frequency (orange/yellow—increased colocalisation). GATA4 and GATA6 exhibit substantial binding overlap. (B) Overlap of KLF5 and GATAs binding sites in gastric cancer (GC) cells. There is a significant overlap of common KLF5 and GATAs binding sites (p<0.01). (C) Genomic addresses of KLF5 only, GATAs only, and KLF5/GATAs binding sites. Promoter, enhancer and intergenic regions—yellow, green and purple segments. KLF5/GATAs sites are enriched at promoters. (D) Chromatin immunoprecipitation with DNA sequencing (ChIP-seq) binding intensities of KLF5-GATAs binding sites compared with single factors. (Top) KLF5-GATAs binding sites (red) compared with KLF5 only (blue) and all KLF5 sites (grey). Individual columns—different GC lines. (Bottom) Comparisons of GATAs binding sites. (E) Proximity graphs reveal GATA motifs adjacent to KLF binding sites (left), and vice versa (right), in all three GC lines. (F) Electrophoretic mobility shift assays demonstrating DNA-protein complex assembly using a wild type KLF/GATA comotif probe (WT) and GC line nuclear extracts (AGS and YCC3). Complex formation is inhibited by a non-labelled competitor probe (500-fold excess) or mutant probes with disrupted KLF or GATA motifs (Mut(KLF), Mut(GATA)). (G) Super-shift assays caused by addition of KLF5 and GATA4 antibodies (red arrows), compared with IgG isotype controls, support the presence of KLF5 and GATA4 in the complexes. (H) Co-IP of KLF5, GATA4 and GATA6. YCC3 (left) and AGS (right) nuclear extracts were immunoprecipitated with GATA6 antibodies and immunoblotted with KLF5, GATA4 and GATA6 (positive control) antibodies. Immunoprecipitation with IgG antibodies served as a negative control.

Genomic sites commonly bound by both KLF5/GATAs exhibited features distinct from sites bound by single factors. First, KLF5/GATAs binding sites were enriched at gene promoter regions, while sites bound by single factors were largely associated with intronic and enhancer regions (figure 3C). Second, comparing ChIP-seq peak intensities, KLF5/GATAs binding sites exhibited greater binding intensities compared with sites bound by individual factors (figure 3D), consistent with KLF5 cooperating with GATA factors to enhance genomic binding. Third, when we explored DNA sequences in proximity to KLF5/GATAs binding sites, we observed strong enrichment of GATA motifs flanking sites bound by KLF5, and reciprocally a strong enrichment of KLF motifs nearby sites bound by GATA factors (figure 3E). In contrast, such motif enrichments were not observed at sites singly bound by KLF5 and not GATAs, or vice versa (see online supplementary figure S4B).

To further evaluate DNA:protein interactions between KLF5/GATA factors on DNA regions exhibiting KLF/GATA co-motifs, we performed in vitro electrophoretic mobility shift assays. Incubation of nuclear extracts from AGS and YCC3 cells with a DNA probe from the HNF4α promoter region with GATA/KLF binding motifs (described later) revealed the presence of DNA:protein complexes, that could be competed away with 500-fold excess of unlabelled probe (figure 3F). Notably, no complexes were formed using DNA probes mutated in the KLF or GATA motifs (figure 3F), indicating that intact KLF and GATA motifs are specifically required for protein:DNA complex formation. To verify the presence of KLF5 and GATA factors within these complexes, we incubated the protein:DNA complexes with antibodies to KLF5 or GATA factors. Addition of KLF5 and GATA4 antibodies caused a discernible ‘super-shift’ of the complex in both GC lines, while addition of GATA6 antibodies caused dissolution of the complex (figure 3G and see online supplementary figure S4C). Taken collectively, these results further support that KLF5 and GATA factors are capable of specifically binding to DNA containing KLF/GATA motifs to form DNA:protein complexes.

We hypothesised that KLF5, GATA4 and GATA6 might physically interact, and performed co-immunoprecipitation experiments. In YCC3 and AGS GC cell line nuclear extracts, GATA6 immunoprecipitation resulted in successful co-immunoprecipitation of GATA4 and KLF5 (figure 3H). Interactions between KLF5 and GATA4 were also observed in co-immunoprecipitation experiments using in vitro-translated proteins, demonstrating that KLF5 and GATA factors can directly physically interact (see online supplementary figure S4D). These results suggest that KLF5, GATA4 and GATA6 may physically interact in GC cells, and collaborate to co-target common gene promoters in GC.

KLF5, GATA4 and GATA6 functionally collaborate to promote GC tumorigenesis

To test if KLF5, GATA4 and GATA6 are functionally required for GC tumorigenesis, we performed RNAi experiments targeting these factors. For each factor, three independent siRNAs were used (see online supplementary figure S5A). Single-factor depletion of KLF5, GATA4 or GATA6 reduced the proliferation rate of YCC3 cells (figure 4A). Moreover, simultaneous depletion of all three factors using three different combinations of siRNA pools resulted in a decrease in proliferation rate in YCC3 and also AGS cells, another GC cell line, indicating that these findings are not cell-line specific (see online supplementary figure S5B,C). We also observed a significant reduction in soft agar colony formation when all three transcription factors were depleted (figure 4B). In murine xenograft assays, smaller tumours were observed in vivo from YCC3 cells depleted of KLF5 compared with parental cells (figure 4C). We were unable to test the in vivo effects of triple-factor knockdown, due to technical difficulties in establishing lines stably depleted of all three factors.

Figure 4

KLF5, GATA4 and GATA6 regulate proliferation of gastric cancer cells. (A) Silencing of KLF5, GATA4 and GATA6 (graph from left to right) using three independent siRNAs reduces the proliferation of YCC3 cells compared with controls (siNT- negative control siRNA that does not target known human genes). (B) Colony formation assays of YCC3 cells with simultaneous siRNA knockdown of all three transcription factors KLF5, GATA4 and GATA6 (3TF). siNT represents the negative siRNA control. (C) Tumour volumes in xenograft mice injected with KLF5 knockdown-stable YCC3 cells. (D) Relative proliferation rate of Hela cells overexpressing either single transcription factors (KLF5/GATA4/GATA6), or all three transcription factors (3TF). (***p<10−4). (E) Relative numbers of Hela cell colonies from colony formation assays. Cells were transfected with single transcription factors (KLF5/GATA4/GATA6) or a combination of all three transcription factors (3TF). (*p<0.05). (F) Tumour volumes in xenograft mice injected with Hela cells and derivatives: negative control stable line (GFP), KLF5 overexpressing stable line, and stable line with coexpression of all three transcription factors (3TF). All data in the line graphs represent+SEM, n=3. * represents p values.

To investigate if the pro-oncogenic effects of KLF5, GATA4 and GATA6 are restricted to GC cells, we expressed KLF5, GATA4 or GATA6 in a completely different tissue type. We selected a cervical cancer line (Hela cells) that does not exhibit high expression of KLF5, GATA4 or GATA6. Single-factor overexpression of KLF5, GATA4 or GATA6 did not significantly increase Hela cell proliferation rate or anchorage-independent growth; however, simultaneous overexpression of all three factors resulted in a significant growth enhancement in vitro (figure 4D,E). In mouse xenograft experiments, overexpression of all three factors also resulted in greater tumour growth compared with single factors (figure 4F). Hence, these results provide further evidence that KLF5, GATA4 and GATA6 can collaborate to enhance tumour development.

Integrating expression profiles with genomic occupancy data identifies KLF5/GATA4/GATA6 target genes regulated in primary GC

To identify transcriptomic programmes regulated by KLF5, GATA4 and GATA6, we independently silenced each factor and measured gene expression changes relative to non-targeting siRNA controls by DNA microarrays and gene-specific qPCR. Supporting their cross-regulatory binding patterns (see online supplementary figure S3B), depletion of one factor in YCC3 cells affected the gene expression of the other factor(s). For example, GATA4 exhibited binding to the promoter regions of KLF5 and GATA6, and GATA4 silencing decreased KLF5 and GATA6 expression (see online supplementary figure S6A). To understand if KLF5, GATA4 and GATA6 transcriptionally coregulate common genes in a global fashion, we next performed an all-pairwise comparison of genes differentially regulated after knockdown of each factor in YCC3, AGS and KATO-III cells. We observed strong positive global correlations between genes regulated by KLF5, GATA4 and GATA6 (figure 5A, r=0.81–0.85; p<2.2×10^−16). This positive correlation was not observed when compared with genes regulated by MYC (see online supplementary figure S6B). This result suggests that KLF5, GATA4 and GATA6 are likely to transcriptionally regulate many common genes in GC.
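
The all-pairwise comparison described above can be illustrated with a minimal sketch. It assumes that each knockdown is summarised as a list of per-probe log2 fold changes relative to siNT, and that Pearson correlation is the statistic of interest; the text does not specify either. The function names and the toy values are hypothetical and are not data from this study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(log_fold_changes):
    """Map each pair of knockdowns to the correlation of their per-probe changes."""
    return {
        (a, b): pearson_r(log_fold_changes[a], log_fold_changes[b])
        for a, b in combinations(sorted(log_fold_changes), 2)
    }


# Hypothetical log2 fold changes (knockdown vs siNT) for five probes.
# These are illustrative values only, not measurements from the study.
toy = {
    "siKLF5": [-1.2, 0.8, -0.5, 1.5, 0.1],
    "siGATA4": [-1.0, 0.6, -0.7, 1.1, 0.3],
    "siGATA6": [-0.9, 0.9, -0.4, 1.3, 0.0],
}

for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 2))
```

In such a comparison, each probe contributes one point per pair of knockdowns, so a high r indicates that probes tend to move in the same direction and by similar amounts when either factor is silenced.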

Figure 5

Global concordance of KLF5/GATA4/GATA6 regulated transcriptomes. (A) Pairwise comparisons of genes exhibiting transcriptional changes following KLF5, GATA4 and GATA6 silencing in YCC3, AGS and KATO-III cells. Every point represents a single microarray probe. A significant positive correlation is observed in genes regulated after transcription factor knockdown, demonstrating that the factors globally regulate large sets of common genes. Correlation coefficients for all graphs are r=0.81–0.85; p<2.2×10^−16. (B) Integration of genomic occupancy data and transcriptional information identifies direct KLF5, GATA4 and GATA6 target genes. Each column depicts the overlap between genes either downregulated (top) or upregulated (bottom) after transcription factor silencing, intersected with KLF5/GATAs chromatin immunoprecipitation with DNA sequencing (ChIP-seq) data. Data are shown for YCC3, AGS and KATO-III cells (left to right).

To identify genes directly regulated by the three factors, we integrated the expression and genomic occupancy data, yielding 749 genes directly regulated and bound by KLF5, GATA4 and GATA6 in at least two cell lines, including 413 upregulated genes and 336 downregulated genes (figure 5B and see online supplementary table S4). From these, we then identified those specific genes demonstrating a similar pattern of regulation in primary tumours. We defined a high-confidence set of 12 genes highly expressed in KLF5/GATA4/GATA6 amplified tumours, and 13 specifically repressed genes (see online supplementary table S5). Genes highly expressed in KLF5/GATA4/GATA6 amplified tumours included genes related to transcription factors (HNF4α, NCOA3), epigenetic processes (PRMT1) and cell adhesion-related proteins (FN1, SPTBN1). Conversely, genes repressed in KLF5/GATA4/GATA6 amplified tumours included genes such as SOX2 and MLL, and tumour suppressors such as CDKN1A and SYNE1. These results suggest that KLF5/GATA4/GATA6 may cooperate to regulate an assemblage of genes involved in GC oncogenesis.
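
The "regulated and bound in at least two cell lines" criterion can be sketched as a set intersection followed by a count across cell lines. This is a simplified illustration: it assumes one regulated-gene set and one ChIP-seq-bound gene set per cell line, whereas the actual analysis may have combined the three factors differently. The function name and the gene identifiers are hypothetical.

```python
from collections import Counter


def direct_targets(regulated, bound, min_lines=2):
    """Genes that are both regulated and bound in at least `min_lines` cell lines.

    `regulated` and `bound` map a cell line name to a set of gene identifiers.
    """
    counts = Counter()
    for cell_line, genes in regulated.items():
        # A gene counts for this cell line only if it is also ChIP-seq bound there.
        counts.update(set(genes) & set(bound.get(cell_line, ())))
    return {gene for gene, n in counts.items() if n >= min_lines}


# Hypothetical gene sets for illustration only; not data from the study.
regulated = {
    "YCC3": {"GENE_A", "GENE_B", "GENE_C"},
    "AGS": {"GENE_A", "GENE_C", "GENE_D"},
    "KATO-III": {"GENE_B", "GENE_D"},
}
bound = {
    "YCC3": {"GENE_A", "GENE_B"},
    "AGS": {"GENE_A", "GENE_D"},
    "KATO-III": {"GENE_B", "GENE_D", "GENE_E"},
}

print(sorted(direct_targets(regulated, bound)))
```

In this toy example GENE_C changes expression in two cell lines but is never bound, and GENE_E is bound but never regulated, so neither would be called a direct target.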

HNF4α is a direct KLF5/GATA4/GATA6 transcriptional target in GC and a potential predictor of metformin response

Among the direct KLF5/GATA4/GATA6 targets, we focused on hepatocyte nuclear factor-4α (HNF4α). HNF4α overexpression has been reported in many cancer types,31 although the exact mechanism of HNF4α upregulation may vary. Consistent with HNF4α being a direct target of KLF5, GATA4 and GATA6, HNF4α was expressed at significantly higher levels in tumours (all tumours and intestinal-type tumours only) with KLF5/GATA4/GATA6 amplification relative to tumours lacking KLF5/GATA4/GATA6 amplification (figure 6A and see online supplementary figure S7A). By IHC, HNF4α was also expressed at the protein level in tumours co-expressing KLF5, GATA4 and GATA6 (see online supplementary figure S7B). Analysis of the ChIP-seq genomic occupancy data confirmed KLF5, GATA4 and GATA6 binding to the HNF4α promoter (see online supplementary figure S7C) and sequence analysis of the HNF4α promoter confirmed the presence of GATA/KLF comotifs (see online supplementary figure S7D). We confirmed the ChIP-seq binding results using ChIP-qPCR (figure 6B), and demonstrated simultaneous occupancy of KLF5, GATA4 and GATA6 at the same HNF4α promoter by sequential-ChIP assays, where a first round ChIP was performed using KLF5 antibodies, followed by a second round with GATA4, GATA6 or IgG antibodies (figure 6C). Further supporting HNF4α as a direct transcriptional target gene of KLF5 and GATA factors, significant HNF4α downregulation at the mRNA and protein levels was observed in YCC3 cells depleted of KLF5 and GATA6, while the effects of GATA4 knockdown were weaker (figure 6D,E).

Figure 6

Direct regulation of HNF4α by KLF5/GATA4/GATA6. (A) Expression of HNF4α in normal gastric tissues, gastric cancers without amplified KLF5/GATA4/GATA6, and with amplified KLF5/GATA4/GATA6. (*p=0.02, **p=10^−4 and ***p=2.3×10^−7). (B) Q-PCR experiment demonstrating KLF5, GATA4 and GATA6 binding on the HNF4α promoter. (C) Sequential chromatin immunoprecipitation (ChIP) demonstrating concurrent binding of KLF5 and GATA factors on the HNF4α promoter. The first ChIP was performed using KLF5 antibodies, the second ChIP was performed using GATA4, GATA6 or IgG antibodies. (D) HNF4α mRNA expression in YCC3 cells after KLF5, GATA4 or GATA6 silencing. HNF4α shows significant downregulation when KLF5 and GATA6 are silenced (*p<0.05). (E) HNF4α protein in YCC3 cells after KLF5, GATA4 and GATA6 silencing. β-actin served as a normalisation control. (F) Reduction in YCC3 proliferation rate after HNF4α silencing (p<0.005) using three independent HNF4α siRNAs. All data in bar graphs represent+SEM, n=3.

Finally, to address a functional role for HNF4α in GC cells, we depleted HNF4α using three independent siRNAs (see online supplementary figure S7E). siRNA silencing of HNF4α in YCC3 and AGS cells resulted in a significant decrease in cell proliferation (figure 6F and see online supplementary figure S7F). To identify potential therapeutic options for patients with KLF5/GATA4/GATA6-amplified GC, we investigated if HNF4α could be targeted. Metformin, an oral antidiabetic drug, is known to activate adenosine monophosphate (AMP)-activated protein kinase (AMPK), and one downstream consequence of AMPK activation is the transcriptional repression of HNF4α.32 We treated YCC3 cells with 10 nM of metformin and observed downregulation of HNF4α protein (figure 7A), and a significant decrease in cell proliferation (figure 7B). We found that the effects of metformin on GC growth are due, at least in part, to the presence of HNF4α, as HNF4α depletion using siRNAs sensitised GC cell proliferation to metformin (figure 7C) while HNF4α overexpression rescued the metformin effects (figure 7D,E).
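
The sensitisation readout can be illustrated with a short sketch of the metformin/DMSO ratio, the quantity plotted on the y axis of figure 7C. Dividing by the DMSO value within each siRNA condition separates the effect of metformin from the effect that the knockdown itself has on proliferation. The function names and all numbers below are hypothetical and are not measurements from this study.

```python
def metformin_to_dmso_ratio(metformin_rate, dmso_rate):
    """Proliferation under metformin relative to the DMSO vehicle control."""
    if dmso_rate <= 0:
        raise ValueError("DMSO proliferation rate must be positive")
    return metformin_rate / dmso_rate


def is_sensitised(control_ratio, knockdown_ratio):
    """True when the knockdown lowers the metformin/DMSO ratio below the control's."""
    return knockdown_ratio < control_ratio


# Hypothetical proliferation rates in arbitrary units, for illustration only.
rates = {
    "siNT": {"DMSO": 1.00, "metformin": 0.80},
    "siHNF4A": {"DMSO": 0.60, "metformin": 0.30},
}

ratios = {
    condition: metformin_to_dmso_ratio(values["metformin"], values["DMSO"])
    for condition, values in rates.items()
}
print(ratios)
print(is_sensitised(ratios["siNT"], ratios["siHNF4A"]))
```

A lower ratio in the HNF4α-silenced cells than in the siNT cells would indicate that metformin suppresses proliferation more strongly once HNF4α is depleted, beyond the reduction caused by the knockdown alone.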

Figure 7

Metformin inhibition of gastric cancer proliferation. (A) HNF4α protein in YCC3 cells after HNF4α silencing (left), or after treatment with metformin (right). β-actin was used as a normalisation control. (B) Proliferation of YCC3 cells after metformin treatment. Dimethyl sulfoxide (DMSO) (vehicle) treatment was used as a negative control. (C) Sensitisation of YCC3 cells pretreated with HNF4α siRNA to metformin. The graph shows the proliferation rate of YCC3 cells presilenced with siNT or siHNF4α and subsequently subjected to DMSO or metformin treatment. The y axis depicts ratios of metformin/DMSO proliferation rates. (*p<0.05). (D) Overexpression of HNF4α protein in YCC3 cells. (E) HNF4α overexpression rescues the effects of metformin. The graph shows the proliferation rate of YCC3 cells overexpressing either GFP or HNF4α and subsequently subjected to metformin treatment. The y axis depicts the absolute cell proliferation rate. (*p<0.05). All data in bar graphs represent+SEM, n=3.


SCNAs are commonly seen in solid epithelial tumours, and a major goal of cancer genomics involves identifying specific SCNAs driving malignant outgrowth. Here, we found that KLF5, GATA4 and GATA6 are collectively amplified in ∼25–30% of GCs. Integration of SCNA and gene expression profiles of these factors revealed an unexpected layer of complexity—while distinct GCs tended to singly amplify either KLF5, GATA4 or GATA6, all three factors were significantly coexpressed in tumours. The independent amplification pattern of KLF5, GATA4 and GATA6, while not significant in the entire cohort and thus rightfully regarded as a hypothesis-generating finding, calls to mind similar ‘mutual exclusivity’ patterns observed for certain somatic mutations, such as EGFR and KRAS mutations in lung cancer.33 Traditionally, such patterns have been attributed to pathway redundancy—where different tumours activate the same oncogenic pathway via mutation at different nodes. Our results for KLF5, GATA4 and GATA6 suggest that patterns of exclusivity may also be caused by the affected factors crosstalking with each other, such that amplification of one factor can induce expression of the other factors.

Unlike classical oncogenes and tumour suppressor genes such as EGFR, TP53 and KRAS, lineage-survival oncogenes are typically transcription factors which exhibit tissue-restricted expression and are involved in early development.13 Consistent with such factors often possessing dual developmental roles governing proliferation and differentiation, lineage-survival oncogenes can also exert context-dependent effects in cancer. For example, NKX2-1, a lineage-survival oncogene in lung cancer, has been shown to exert oncogenic and tumour suppressive activities.34 Prior evidence suggests that KLF5, GATA4 and GATA6 are primarily involved in cancers of the GI tract. In GC, GATA4 and GATA6 are expressed in the developing stomach35 and may be regulated by DNA methylation.36 Similarly, KLF5 expression in GCs can be induced by Helicobacter pylori infection and metaplasia-inducing factors such as CDX1.37 ,38 However, the specific functions of these factors in GC, their downstream target genes, and their ability to crosstalk with one another, have remained largely unexplored.

Genomic ontology and occupancy analysis revealed that KLF5, GATA4 and GATA6 appear to regulate many common biological functions. We observed a significant proportion of genomic sites commonly bound by KLF5 and GATAs, largely localised to gene promoter regions bearing joint KLF5 and GATA binding motifs. Co-immunoprecipitation analysis revealed that KLF5, GATA4 and GATA6 can physically interact. Functional studies revealed that overexpression of all three factors enhanced cell proliferation compared with single factor overexpression. Taken collectively, our data suggest that lineage-specific oncogenes may not act in isolation but in partnership with other oncogenic transcription factors. These findings may be conceptually similar to other collaborative transcriptional complexes in cancer such as SOX2/FOXE1 in oesophageal squamous cell carcinoma.39

We searched for downstream targets of KLF5/GATA4/GATA6 that might highlight opportunities for therapy. Similar strategies have proven successful for MITF in melanoma, identifying BCL2 (which is targetable) as a MITF-regulated gene.18 Among genes directly regulated by KLF5/GATA4/GATA6 (see online supplementary table S4), we focused on HNF4α. First, HNF4α has been implicated in other cancers, including lung cancer,40 hepatocellular carcinoma41 and colorectal cancer,42 suggesting the functional importance of this gene in tumorigenesis. Second, HNF4α is involved in the development of visceral endoderm, a process also involving GATA factors.9 Third, HNF4α can be indirectly inhibited by activation of AMPK signalling using the approved antidiabetic drug metformin.32 In GC, high expression of HNF4α has been reported,43 which may result from deregulated AMPK signalling.44 In our study, the proliferation of GC cells was reduced when HNF4α was depleted, and metformin treatment similarly inhibited HNF4α protein expression and reduced GC proliferation. HNF4α was shown to be a true target of metformin, as HNF4α silencing sensitised GC cells to metformin while HNF4α overexpression provided rescue. These results suggest that metformin, or metformin-like drugs, might represent a possible therapy for KLF5/GATA4/GATA6-amplified GCs.

In conclusion, our study suggests that KLF5, GATA4 and GATA6 represent lineage-survival oncogenes in GC. These factors act in an intimate, cross-regulatory and collaborative manner to target common downstream genes relevant to GC development and proliferation. One of these target genes, HNF4α, may represent a potential drug target as it can be inhibited by metformin. Future work will focus on elucidating the role of other KLF5/GATA4/GATA6 target genes in GC.


We acknowledge the contributions of the Duke-NUS Genome Biology Facility for genomic profiling services. We thank Lin-Chuen Tan for technical support. We acknowledge the use of TCGA data prior to publication of the TCGA Gastric Cancer Marker Study. TCGA GC copy number segmentation data were downloaded from the TCGA data portal (December 2012).



Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.



  • N-YC and ND contributed equally.

  • Contributors N-YC led the wet-bench experimentation. ND led the bioinformatic analysis. KD, XXS, GST, LTMD and WKW performed IHC and FISH experiments. YZ and LH assisted with cell culture experiments. DH performed EMSA experiments. KHL provided pathology analysis. M-HL, JW, WY, AG, ALKT and STT provided genomic profiling (sequencing and microarray) and data preprocessing services. KCS and WKW contributed clinical tissues. H-HN, SR and B-TT oversaw ChIP-seq, bioinformatics and cell biology experiments, respectively. L-KG and PT provided project oversight. N-YC and PT wrote the paper.

  • Funding This project was supported by the following grants: NMRC/TCR/009-NUHS/2013, BMRC 10/1/33/19/676 and BMRC 10/1/24/19/665.

  • Competing interests None.

  • Patient consent Obtained.

  • Provenance and peer review Not commissioned; externally peer reviewed.
