Objective Genomic structural variations (SVs) causing rewiring of cis-regulatory elements remain largely unexplored in gastric cancer (GC). To identify SVs affecting enhancer elements in GC (enhancer-based SVs), we integrated epigenomic enhancer profiles revealed by paired-end H3K27ac ChIP-sequencing from primary GCs with tumour whole-genome sequencing (WGS) data (PeNChIP-seq/WGS).
Design We applied PeNChIP-seq to 11 primary GCs and matched normal tissues combined with WGS profiles of >200 GCs. Epigenome profiles were analysed alongside matched RNA-seq data to identify tumour-associated enhancer-based SVs with altered cancer transcription. Functional validation of candidate enhancer-based SVs was performed using CRISPR/Cas9 genome editing, chromosome conformation capture assays (4C-seq, Capture-C) and Hi-C analysis of primary GCs.
Results PeNChIP-seq/WGS revealed ~150 enhancer-based SVs in GC. The majority (63%) of SVs linked to target gene deregulation were associated with increased tumour expression. Enhancer-based SVs targeting CCNE1, a key driver of therapy resistance, occurred in 8% of patients frequently juxtaposing diverse distal enhancers to CCNE1 proximal regions. CCNE1-rearranged GCs were associated with high CCNE1 expression, disrupted CCNE1 topologically associating domain (TAD) boundaries, and novel TAD interactions in CCNE1-rearranged primary tumours. We also observed IGF2 enhancer-based SVs, previously noted in colorectal cancer, highlighting a common non-coding genetic driver alteration in gastric and colorectal malignancies.
Conclusion Integrated paired-end NanoChIP-seq and WGS of gastric tumours reveals tumour-associated regulatory SVs in regions associated with both simple and complex genomic rearrangements. Genomic rearrangements may thus exploit enhancer-hijacking as a common mechanism to drive oncogene expression in GC.
- gastric cancer
Significance of this study
What is already known on this subject?
Previous structural variation (SV) studies in gastric cancer (GC) have focused on gene fusions and disruptions at 3’ untranslated regions of genes. However, SVs affecting non-coding distal enhancer elements, particularly those leading to increased oncogene expression in GC, remain largely unknown.
High CCNE1 expression in GC is associated with both poor patient survival and resistance to anti-HER2 targeted therapies. CCNE1 overexpression has been largely attributed to copy number amplification.
In colorectal cancer, genomic rearrangements affecting the IGF2 oncogene have been observed.
What are the new findings?
Using an experimental strategy combining paired-end enhancer profiling with whole-genome sequencing, we identified GC enhancer-based SVs.
We observed recurrent enhancer-hijacking events at CCNE1 (8%) and IGF2 (4%).
Functional analysis using CRISPR/Cas9 genome editing and chromosome conformation assays reveals that hijacked enhancers are important for maintaining high CCNE1 expression.
Hi-C analysis of primary GCs revealed novel long-range chromatin interactions at genomic locations exhibiting CCNE1 and IGF2 rearrangements.
How might it impact on clinical practice in the foreseeable future?
Our results reveal that besides copy number amplification, CCNE1 expression in GC is also driven by enhancer hijacking.
Diagnostic platforms may need to incorporate non-coding genomic regions and epigenomic information to accurately detect enhancer hijacking events in cancer.
Gastric cancer (GC) is a leading cause of global cancer mortality with high incidence in East Asia, Eastern Europe and Central/South America.1 Previous GC genomic studies have largely focused on protein-coding genes such as TP53, ARID1A, ERBB2 and RHOA.2 3 While important to our understanding of GC,3 these studies have been confined to protein-coding genomic regions representing less than 2% of the human genome.2–4 In contrast, recurrent structural variations (SVs) including rearrangements, translocations and inversions remain poorly described in GC.5
Initial SV studies in GC focused on identifying gene fusions6–8 and the 3’ untranslated regions of genes including CCND1,9 FGFR2 10 and PD-L1.11 In other cancers, studies have uncovered a novel class of genetic driver alterations termed ‘enhancer hijacking’, where distal enhancers are translocated to the promoters of oncogenes such as GFI1, IGF2 and MYC.12–15 Detecting tumour-associated ‘enhancer-based SVs’ is challenging, often requiring synthesising outputs from different SV detection algorithms,16 prioritising SV breakpoints located at non-coding and often heterologous genomic regions, and discerning functionally relevant SVs from passenger SVs caused by regional genomic instability.17 Clinically, the observation that tumour-gene expression can be driven by recruitment of distal enhancer elements may require expanding current diagnostic platforms to non-coding genomic regions and integrating epigenetic information to identify functionally important SVs.
Enhancers exhibit high tissue-specificity and context-specificity18 and active enhancers are often marked by histone H3 Lys27 acetylation (H3K27ac).19 Previous studies have employed single-end H3K27ac ChIP-seq profiling for enhancer identification.19 20 Here, we performed H3K27ac ChIP-seq paired-end sequencing to identify SVs directly associated with acetylated genomic regions, an approach previously tested in B-cell lymphoma.21 By integrating SVs associated with acetylated chromatin identified by paired-end NanoChIP-seq (PeNChIP-seq) with whole-genome sequencing (WGS) analysis of a large GC cohort, we observed enhancer hijacking events in GC targeting CCNE1, IGF2 and CCND1. As high expression of CCNE1 has been associated with poor survival and resistance to ERBB2-directed therapy in GC,22 23 our results suggest that in addition to measuring CCNE1 gene amplification, diagnostic panels may also need to consider detecting CCNE1 enhancer-based SVs. Moreover, the identification of novel targets in GC such as IGF2 suggests the possibility of new therapeutic interventions.
Materials and methods
‘Normal’ (ie, non-malignant) samples refer to stomach samples harvested from sites distant from the tumour and exhibiting no visible evidence of tumour or intestinal metaplasia/dysplasia on surgical assessment.
Patient and public involvement
This research was done without patient involvement. Patients were not invited to comment on the study design and were not consulted to develop patient relevant outcomes or interpret the results. Patients were not invited to contribute to the writing or editing of this document for readability or accuracy.
Paired-end Nano-ChIPseq was performed as previously described19 with slight modifications. Fresh-frozen tissues were dissected in liquid nitrogen to obtain ~5 mg sized pieces for ChIP, performed using H3K27ac antibodies (ab4729, Abcam). After recovery of ChIP and input DNA, whole-genome-amplification was performed using the WGA4 kit (Sigma-Aldrich) and BpmI-WGA primers. About 30 ng of amplified DNA was used for each sequencing library (New England Biolabs).
SV detection and analyses
Enhancer-based SVs were detected using LUMPY3 and DELLY,4 followed by manual curation to remove SVs with features suggestive of artefacts such as repetitive or low-complexity DNA sequences and overlapping read pairs mapping to multiple distant genomic sites. We applied LUMPY (V.0.2.13).24 Each sample was genotyped using SVTyper.25 Enhancer-based SV and breakpoint lists were structured in VCF format. Files in BEDPE format were generated using the ‘vcfToBedpe’ script (https://github.com/arq5x/lumpy-sv/blob/master/scripts/vcfToBedpe), allowing detection of common or exclusive breakpoints using pairToPair or pairToBed. We excluded SVs exhibiting: (1) length (SVLEN) less than 50 bp (only applicable for SVs classified as deletions or tandem duplications), (2) neither breakpoint localised within predicted H3K27ac enriched regions (MACS2, FDR≤5%), (3) one or more breakpoints localised within ENCODE blacklisted or low complexity regions, (4) lack of support from either discordant paired-end reads or split reads. After manual curation (above), a further 16 germline enhancer-based SVs were excluded. Enhancer-based SVs used for downstream analysis included: (1) SVs detected in both PeNChIP-seq and matched input using the pairToPair (-slop 20) tool, (2) SVs associated with a minimum quality score (QUAL) of 100. Identification of high-confidence SVs using WGS profiles from GC and normal samples5 was performed using LUMPY. GC-matched normal WGS profiles were analysed from in-house data, TCGA, ICGC and publicly available databases.2
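The four exclusion criteria listed above can be sketched as a single predicate. This is only an illustration under stated assumptions: the `SVCall` record and its field names are hypothetical and do not mirror the LUMPY VCF schema, and the per-breakpoint peak and blacklist flags are assumed to have been computed beforehand (for example by interval overlap against MACS2 peaks and the ENCODE blacklist).

```python
from dataclasses import dataclass


@dataclass
class SVCall:
    """One SV call. Field names are illustrative, not a real VCF schema."""
    svtype: str               # e.g. "DEL", "DUP", "INV", "BND"
    svlen: int                # absolute length in bp (0 when not applicable)
    bp_in_h3k27ac: tuple      # (bool, bool): breakpoint lies in an H3K27ac peak
    bp_in_blacklist: tuple    # (bool, bool): breakpoint lies in blacklisted DNA
    paired_end_support: int   # number of discordant read pairs
    split_read_support: int   # number of split reads


def passes_filters(sv: SVCall, min_len: int = 50) -> bool:
    """Return True if the call survives exclusion criteria (1) to (4)."""
    # (1) Length filter, applied only to deletions and tandem duplications.
    if sv.svtype in {"DEL", "DUP"} and sv.svlen < min_len:
        return False
    # (2) At least one breakpoint must fall in an H3K27ac-enriched region.
    if not any(sv.bp_in_h3k27ac):
        return False
    # (3) No breakpoint may fall in a blacklisted or low-complexity region.
    if any(sv.bp_in_blacklist):
        return False
    # (4) Both evidence types (discordant pairs and split reads) are required.
    if sv.paired_end_support == 0 or sv.split_read_support == 0:
        return False
    return True
```

The QUAL threshold and the comparison against matched input are separate inclusion steps and are deliberately left out of this predicate.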
Paired-end ChIP-seq, Hi-C, 4C-seq and Capture-C data are available in Gene Expression Omnibus (GSE118392; https://www.ncbi.nlm.nih.gov/geo/). Histone ChIP-seq from MKN7 (wild type; GSE97838, GSE97837), colorectal cancer lines (GSE77737), breast cancer lines (GSE85158), in-house RNA-seq (GSE85465) and H3K27ac single-end ChIP-seq (GSE85465) were reanalysed. Normalised gene expression matrices from TCGA STAD (gdac.broadinstitute.org_STAD.Merge_rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes_normalized__data.Level_3.2016012800.0.0) were used. WGS profiles were processed as described in Guo et al.5
Additional methods associated with RNA-seq, Hi-C, Capture-C, 4C-seq and CapStarr-seq are provided in the online supplementary information.
Detecting PeNChIP-seq regulatory structural variants
We performed paired-end H3K27ac PeNChIP-seq on a discovery set of 11 primary GCs and matched normal gastric samples (total 22 samples; online supplementary table 1). Comparison of PeNChIP-seq against previous single-end H3K27ac profiles of the same samples19 26 confirmed a high degree of correlation (0.84≤ρ≤0.91, Spearman’s rank correlation coefficient; online supplementary figure 1A, online supplementary table 2). To detect enhancer-based SVs (SVs associated with H3K27ac acetylated regions), we integrated outputs from two SV callers (LUMPY and DELLY).24 27 We prioritised SVs observed in both PeNChIP-seq and input controls, and/or PeNChIP-seq SVs with high quality sequence scores (online supplementary figure 1B). To be nominated, enhancer-based SVs were also required to demonstrate support by discordant paired-end and also split reads.
As PeNChIP-seq is a relatively new technique for SV discovery, we assessed the concordance of PeNChIP-seq detected enhancer-based SVs to conventional single-end enhancer ChIP-seq with WGS (enhancer-ChIP+WGS). To establish the enhancer-ChIP+WGS data set, we integrated single-end H3K27ac ChIP-seq and WGS structural variant (SV) data from 11 normal gastric samples. For each sample, we identified 72 enhancer-based SVs (average; SD=21 SVs), corresponding to 2.2% of all WGS SVs (average; SD=0.6%) comparable to other studies.28 Independently, we also identified for the same samples 310 PeNChIP-seq enhancer-based SVs (online supplementary table 3). An average of 16% of enhancer-ChIP+WGS SVs were detected using PeNChIP-seq (online supplementary figure 1C). Closer inspection highlighted two main factors explaining the 84% of SVs not detected by PeNChIP-seq. First, for many undetected SVs, we observed low PeNChIP-seq sequence coverage at breakpoints (median six reads for undetected SVs vs 20 for detected SVs, p value <2.2×10⁻¹⁶, one-sided Wilcoxon’s rank sum test; online supplementary figure 2A). Second, unlike PeNChIP-seq SVs where discordant/split aligned reads are directly associated with H3K27ac sequencing reads, enhancer-ChIP+WGS SVs comprise SVs whose breakpoints overlap with ‘genomic regions associated with histone acetylation’, where the latter regions are inferred based on initial H3K27ac reads mapping to the genome followed by subsequent in silico 3’ extension of the read cluster. This results in the inferred acetylated regions in enhancer-ChIP+WGS being larger/longer than the actual H3K27ac read cluster (typical enhancer length ~550 bp; see online supplementary methods) and a substantial number of enhancer-ChIP+WGS SVs mapping not directly but ‘nearby’ a region of H3K27ac ChIP-seq sequencing reads (online supplementary figure 2B).
Reciprocally, 61% of PeNChIP-seq enhancer-based SVs were rediscovered using conventional WGS5 (online supplementary figure 3). As an example, in sample N980417, both PeNChIP-seq and WGS identified a ~4 Kb deletion associated with acetylated chromatin at the SEC22B promoter (online supplementary figure 4). Three factors explain the 39% of SVs that were not detected by matched WGS. First, a subset of germline PeNChIP-seq SVs missed by matched WGS profiles were detected in unmatched WGS profiles from other individuals; inclusion of these unmatched WGS SVs improved the percentage of PeNChIP-SVs detected by WGS from 61% to 68% (online supplementary table 3). Second, categorisation of the undetected SVs by deletions, tandem duplications, inversions and complex variants revealed that undetected SVs are enriched in complex variants (76%) which are known to be analytically challenging (online supplementary figure 5); we found that excluding complex SVs from analysis improved the percentage of PeNChIP-seq SVs detected by WGS from 68% to 84%. Third, using orthogonal PCR and Sanger sequencing, we selected and experimentally validated five out of five PeNChIP-seq specific SVs undetected in the matched WGS profiles (100% success rate; online supplementary table 3; online supplementary figure 6) (see Discussion section).
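The concordance percentages above depend on deciding when two SV calls from different assays describe the same event. A minimal sketch of one such rule follows, assuming that calls match when both breakpoints agree within a small tolerance; the function names are hypothetical, and this is not the pairToPair implementation, only an approximation of its slop-based matching.

```python
def breakpoints_match(a, b, slop=20):
    """a, b: ((chrom1, pos1), (chrom2, pos2)).

    True if both ends agree within `slop` bp, in either orientation.
    """
    def close(p, q):
        return p[0] == q[0] and abs(p[1] - q[1]) <= slop

    same_order = close(a[0], b[0]) and close(a[1], b[1])
    swapped = close(a[0], b[1]) and close(a[1], b[0])
    return same_order or swapped


def fraction_rediscovered(query, reference, slop=20):
    """Fraction of `query` SVs with at least one match in `reference`."""
    if not query:
        return 0.0
    hits = sum(
        any(breakpoints_match(q, r, slop) for r in reference) for q in query
    )
    return hits / len(query)
```

Under this rule, widening `slop` raises apparent concordance at the cost of more spurious matches, which is one reason complex variants are harder to reconcile between assays.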
Identification of tumour-associated enhancer-based SVs
We then applied PeNChIP-seq to the GC samples to identify tumour-associated enhancer-based SVs after removing germline SVs observed in patient-matched normal tissues, in-house matched as well as unmatched gastric WGS profiles and public germline databases (see Methods section). We identified 148 tumour-associated enhancer-based SVs (online supplementary table 4), divided into four rearrangement categories—complex variants (67%), tandem duplications (16%), deletions (16%) and inversions (1%) (figure 1A). Of these tumour-associated enhancer-based SVs, 20% (n=30) exhibited breakpoints localised at H3K27ac-enriched promoter regions (±2.5 Kb from annotated transcription start sites (TSSs) of protein-coding genes from GENCODE 19).
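The promoter assignment described above (a breakpoint within ±2.5 Kb of an annotated TSS) reduces to a distance check. The sketch below assumes TSS coordinates have already been loaded into a dictionary keyed by chromosome; the function names and the coordinates used in any example call are hypothetical.

```python
def near_tss(chrom, pos, tss_by_chrom, flank=2500):
    """True if `pos` lies within `flank` bp of any TSS on `chrom`."""
    return any(abs(pos - tss) <= flank for tss in tss_by_chrom.get(chrom, []))


def promoter_fraction(breakpoints, tss_by_chrom, flank=2500):
    """Fraction of (chrom, pos) breakpoints that fall in a promoter window."""
    if not breakpoints:
        return 0.0
    hits = sum(near_tss(c, p, tss_by_chrom, flank) for c, p in breakpoints)
    return hits / len(breakpoints)
```

A linear scan over TSS positions is adequate for a few hundred SVs; a sorted list with binary search would be the natural replacement for larger call sets.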
To evaluate their transcriptional impact, we linked the SVs to their predicted target genes using the GREAT tool29 (online supplementary table 5). The median distance of enhancer-based SVs to their predicted target genes was 16.6 Kb, significantly shorter than SVs from tumour WGS-only data where most SVs are likely caused by general chromosomal instability (31.6 Kb, n=986, p value=2.1×10⁻⁶, one-sided Wilcoxon’s test) (online supplementary figure 7). Quantifying target gene expression, 37%–42% of SVs in each class were associated with highly deregulated (>fourfold) target gene expression compared with normal gastric tissues (figure 1B). Most of the tumour-associated enhancer-based SVs (63%) were associated with gene activation rather than repression, consistent with H3K27ac being a marker of gene activation (figure 1C, online supplementary figure 8).
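The >fourfold deregulation threshold can be expressed as a symmetric cut-off on the log2 fold change between tumour and normal expression. The sketch below is an assumption-laden illustration: the pseudocount of 1 and the function names are not taken from the study, and the expression values are presumed to be on a linear (not log) scale.

```python
import math


def log2_fold_change(tumour, normal, pseudocount=1.0):
    """log2 ratio of tumour to normal expression (pseudocount is an assumption)."""
    return math.log2((tumour + pseudocount) / (normal + pseudocount))


def classify(tumour, normal, fold=4.0):
    """Label a target gene 'up', 'down' or 'unchanged' at a given fold cut-off."""
    lfc = log2_fold_change(tumour, normal)
    threshold = math.log2(fold)
    if lfc > threshold:
        return "up"
    if lfc < -threshold:
        return "down"
    return "unchanged"
```

Counting the "up" and "down" labels across SV-linked genes gives the activation versus repression split reported in the text.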
We proceeded to investigate associations between tumour-associated enhancer-based SVs, somatic copy number amplification and elevated gene expression. Forty-nine of the enhancer-based SVs (33%) were associated with elevated expression of non-amplified target genes (online supplementary figure 9), including IGF2, CCR4 and TWIST2 (>twofold compared with normal), consistent with enhancer-based SVs driving the expression of these non-amplified target genes. Additionally, 19 enhancer-driven SVs (13%) were associated with elevated expression in amplified target genes, such as CCNE1 and CCND1. This raises the possibility that for target genes exhibiting amplification and enhancer-driven SVs, enhancer-driven SVs may interact with gene dosage to collaboratively up-regulate gene expression13 (see next section).
To determine the prevalence of these tumour-associated enhancer-based SVs in a larger GC cohort, we integrated the genomic regions identified by PeNChIP-seq with WGS profiles of 208 primary GCs (online supplementary table 6).5 We also leveraged this analysis to identify rearranged genes revealed by standard WGS SV analysis (WGS-only). Applying non-overlapping 50 Kb genomic windows to determine the most frequent SVs arising from analysing WGS-only data (‘WGS-only SVs’), we identified 297 windows (~0.5%; online supplementary table 7) exhibiting somatic SVs in at least five patients (>2% of GC samples), excluding active long interspersed nuclear element-1 (LINE-1) regions30 (figure 1D, green). Consistent with SVs frequently occurring at fragile sites due to chromosomal damage from replication stress,31 seven of the top ten WGS-only SV hotspots overlapped with common known fragile sites at FHIT, WWOX, MACROD2, PARK2 and IMMP2L.31 At the transcriptional level, across two independent GC cohorts (TCGA and in-house) 20% of the WGS-only SVs were associated with deregulated target gene expression in at least one cohort (online supplementary figure 10), and even after excluding regions associated with fragile sites, none of the top WGS-SV hotspots (n=5) were associated with consistent target gene deregulation (online supplementary figure 11).
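The hotspot definition above (non-overlapping 50 Kb windows with somatic SVs in at least five patients) can be sketched as a simple binning step. The input layout and function name are hypothetical, and the exclusion of active LINE-1 regions is assumed to have been applied upstream.

```python
from collections import defaultdict


def recurrent_windows(breakpoints, window=50_000, min_patients=5):
    """Find fixed-size genomic windows hit by SVs in enough distinct patients.

    breakpoints: iterable of (patient_id, chrom, pos) tuples.
    Returns {(chrom, start, end): number_of_patients}.
    """
    patients = defaultdict(set)
    for patient, chrom, pos in breakpoints:
        # Integer division assigns each position to exactly one window.
        patients[(chrom, pos // window)].add(patient)
    return {
        (chrom, idx * window, (idx + 1) * window): len(ids)
        for (chrom, idx), ids in patients.items()
        if len(ids) >= min_patients
    }
```

Using a set per window means a patient with several breakpoints in the same window is counted once, which matches a per-patient recurrence threshold.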
About 60% of the top PeNChIP-seq SVs (n=10) exhibited deregulated target gene expression in at least one cohort (online supplementary figure 10). We confirmed that these hotspots were enriched for tumour-associated enhancer-based SVs compared with germline enhancer-based SVs (ratio >1.0, empirical p value ≤6×10⁻⁴; see online supplementary results, online supplementary methods). To prioritise functionally relevant enhancer-based SVs, we selected PeNChIP-seq SVs associated with elevated target gene expression and also with one breakpoint recurrently observed in a larger GC series. We identified 68 enhancer SVs (out of 148) associated with elevated target gene expression; of these, 47 exhibited recurrent breakpoints across multiple GCs. Pathway analysis of target genes linked to the 68 enhancer-based SVs revealed that they are enriched in pathways related to regulation of cell communication, signalling, and the mitotic cell cycle (p value <1×10⁻⁴, GOrilla),32 which are plausibly related to cancer. CCNE1 and IGF2 displayed consistent trends in both cohorts (online supplementary figure 11), exhibiting gene upregulation in the presence of enhancer-based SVs (yellow highlight, figure 1E,G). CCNE1 and IGF2 were not highlighted by WGS-only SV analysis (figure 1F,H).
Recurrent CCNE1 enhancer-based SVs in GC
CCNE1 is a cell cycle regulator upregulated in several malignancies including GC, ovarian and breast cancer.33–35 CCNE1 is associated with resistance to targeted therapies9 including ERBB2-directed treatment,22 36 and CCNE1 genomic amplifications have been reported.33 However, CCNE1 as a target of enhancer-based SVs in GC has not been described. In our cohort, CCNE1-enhancer-based SVs were observed in 8% of GC patients (16 patients; online supplementary table 8), involving seven intra-chromosomal rearrangements, four inter-chromosomal translocations, three tandem duplications and two deletions. Analysing CCNE1 amplifications and enhancer-driven SVs, we found that GCs harbouring CCNE1-enhancer-based SVs but lacking CCNE1 amplification exhibited elevated CCNE1 expression, while GCs with both CCNE1-enhancer-based SVs and CCNE1 amplifications exhibited even greater CCNE1 expression (figure 1E, G, online supplementary figure 12). These results suggest that for target genes exhibiting amplification, enhancer-based SVs may interact with conventional gene dosage to collaboratively up-regulate target gene expression. Notably, in cases where CCNE1 enhancer hijacking was observed, we were unable to observe evidence of CCNE1 fusion genes, reducing the possibility of CCNE1 gene fusions as a causal factor to the dysregulation of CCNE1 expression.
Examination of breakpoints associated with the CCNE1 SVs (online supplementary figure 13A) revealed that 63% of cases (n=10 or 4.8% of the entire study) were associated with one breakpoint consistently localised close to the CCNE1 TSS (−31 Kb to +51 Kb), with partner breakpoints at heterologous regions including up to 10 Mb apart on the same chromosome.15 The remaining six cases also exhibited breakpoints relatively close to the CCNE1 TSS (+52 Kb to +188 Kb), comparable to distances seen in other enhancer hijacking scenarios.14 For one CCNE1-rearranged GC (GC T2000877), the region translocated upstream to CCNE1 was enriched for H3K27ac signals relative to input controls (figure 2A). We orthogonally confirmed this CCNE1 rearrangement (figure 2B) using Sanger sequencing and PCR (figure 2C, D). This enhancer signal was not present in matched normal gastric tissues (figure 2A), indicating that it is tumour-associated. For another CCNE1-rearranged case (T990275, figure 2E), integration of the enhancer-based SV with H3K27ac and CapStarr-seq profiles (a high-throughput method of measuring functional enhancer activity37) from SNU16 GC cells confirmed that the translocating partner region (figure 2E, left) was H3K27 acetylated and exhibited functional transcriptional stimulation activity, consistent with enhancer activity.
Extending our analysis beyond primary tumours, we then interrogated WGS data of GC cell lines (CLS145, MKN7, SCH, IST1 and YCC18) exhibiting high CCNE1 expression (online supplementary figure 13B). Two lines (MKN7, IST1) harboured CCNE1-enhancer-based SVs (online supplementary table 9) co-occurring with CCNE1 amplification. Analysis of MKN7 cells revealed that the top four CCNE1 SVs supported by the highest number of supporting reads (left, figure 2F) included an inter-chromosomal event (CTX), two intra-chromosomal rearrangements (ITX) and a separate 3.8 Mb tandem duplication (DUP) (validated by Sanger sequencing, online supplementary figure 13C,D). Examining putative hijacked enhancer elements using matched H3K27ac profiles (right, figure 2F) and 4C-seq (online supplementary figure 13E), we confirmed that these hijacked regions were indeed associated with acetylated chromatin (H3K27ac) at promoter-distal regions (blue, black rectangles, figure 2F). Using 4C-seq, we further confirmed that the putative hijacked enhancers indeed exhibited long-range cis-interactions with the CCNE1 promoter, despite originating from distal regions (Figure 2F, online supplementary figure 13E). These SVs are thus likely to juxtapose active enhancers (blue, black rectangles figure 2F) from distal loci to CCNE1-adjacent regions, resulting in novel enhancer-promoter interactions (green, figure 2F).
To demonstrate a causal role between hijacked enhancers and CCNE1 expression, we selected the tandem duplication-associated enhancer event for testing, as this enhancer is the nearest SV to the CCNE1 promoter, associated with cis-interactions with the CCNE1 promoter, and exhibits high H3K27ac signals. Using CRISPR/Cas9 genome editing, we deleted the hijacked enhancer region in MKN7 cells (scissor, figure 2F). After confirming CRISPR deletion efficiencies (online supplementary figure 13F), we used RT-qPCR to compare CCNE1 expression levels between enhancer-deleted MKN7 cells and control MKN7 cells co-expressing Cas9. Experiments were performed across five independent biological replicates with three technical replicates each. We observed a 20% reduction in CCNE1 expression on deletion of the hijacked enhancer (p value <0.0001, two-sided t-test). We further performed two control experiments, also using Cas9-coexpressing MKN7 cells, testing either a non-targeting control sgRNA, or deletion of a 100 Kb region in the same genomic neighbourhood as CCNE1 but outside the CCNE1 topologically associating domain (TAD); these showed no alteration of CCNE1 transcripts (figure 2G). Taken collectively, these results reveal that in primary GCs and cell lines, CCNE1 enhancer-based SVs frequently juxtapose distal active enhancers to the CCNE1 proximal region, driving CCNE1 overexpression in conjunction with CCNE1 amplification.
CCNE1 enhancer hijacking events disrupt TAD boundaries
Human genomes are partitioned into TADs which are 3D chromosomal domains largely invariant across cell types (mean 830 Kb size).38 Within individual TADs, genomic interactions between regulatory elements are strong, but are insulated between TADs.38 We proceeded to investigate CCNE1 enhancer-promoter interactions and their association with TAD boundaries. First, to survey regulatory interactions associated with the CCNE1 promoter in the absence of genomic rearrangements, we applied Capture-C technology onto a GC cell line (SNU16) lacking CCNE1 enhancer-based SVs (yellow tracks, figure 3A; online supplementary table 10). We observed multiple interactions between the CCNE1 promoter (viewpoint, figure 3A) and putative distal enhancers (H3K27ac+/H3K4me1+/H3K4me3-/CapStarr-seq+; figure 3A), indicating that the CCNE1 promoter can exhibit long-range interactions both upstream (solid yellow line) and downstream of the CCNE1 gene (dotted yellow line, figure 3A). Similar results were obtained using 4C analysis (online supplementary figure 13E). Second, to map CCNE1 promoter-enhancer interactions across TADs, we integrated the Capture-C interactions with TAD domains predicted from Hi-C chromosome conformation capture sequencing using data from SNU16 GC cells (online supplementary table 11), human embryonic stem cells, and human foetal lung cells (IMR90). As predicted, the CCNE1 promoter-enhancer interactions were indeed bounded within a single TAD domain, across all three cell types (figure 3B). These results suggest that in GCs lacking CCNE1 enhancer-based SVs, CCNE1 is regulated via promoter-enhancer interactions that generally do not span more than 100–200 Kb and are bounded within a typical TAD (average 830 Kb).38
Importantly, we then found that several of the CCNE1-enhancer-based SVs were predicted to disrupt these pre-existing TAD boundaries, which might lead to the pathological rewiring of gene-enhancer interactions.13 39 Specifically, we observed 10 cases of putative TAD disruptions in eight GC primary samples and two GC cell lines, through inter-chromosomal (n=5) or intra-chromosomal (n=3) rearrangements, one tandem duplication and one deletion (figure 3C), and of these seven out of 10 cases (filled dot, figure 3C) showed one breakpoint close to the CCNE1 promoter. As an example, we observed predicted enhancers (supported by H3K27ac and CapStarr-seq) at the chr2 KIAA1211L locus translocated close to the CCNE1 promoter, disrupting pre-existing TAD boundaries (figure 2E).
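Whether an SV is predicted to disrupt a TAD boundary can be approximated by asking whether its two breakpoints fall in different domains. The sketch below makes two assumptions that are not taken from the study: TADs are represented only by a sorted list of boundary coordinates per chromosome, and any inter-chromosomal event is treated as joining separate domains. Function names and coordinates are hypothetical.

```python
from bisect import bisect_right


def tad_index(boundaries, pos):
    """Index of the domain containing `pos`, given sorted boundary positions."""
    return bisect_right(boundaries, pos)


def crosses_tad_boundary(sv, tad_boundaries):
    """sv: ((chrom1, pos1), (chrom2, pos2)); tad_boundaries: {chrom: sorted list}."""
    (chrom1, pos1), (chrom2, pos2) = sv
    if chrom1 != chrom2:
        # Assumption: inter-chromosomal events always join distinct domains.
        return True
    boundaries = tad_boundaries.get(chrom1, [])
    return tad_index(boundaries, pos1) != tad_index(boundaries, pos2)
```

A prediction of this kind only flags candidates; direct confirmation still requires chromatin conformation data from the rearranged sample.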
To directly confirm TAD disruptions caused by CCNE1-enhancer-based SVs, we performed Hi-C analysis (online supplementary table 11) on a CCNE1-rearranged primary GC containing both a CCNE1 intra-chromosomal rearrangement and a CCNE1 tandem duplication (T2000877). We visualised TAD interactions in this region as a contact matrix (chr19:30–32 Mb; top, figure 3D) and as a negative control, we compared similar contact matrices from SNU16 GC cells where no CCNE1 SVs were observed (bottom, figure 3D). Besides confirming high numbers of interactions (in red) within pre-existing TADs (yellow box, Arrowhead at 10 Kb resolution), we also observed de novo interaction patterns at breakpoints associated with the cross-TAD CCNE1 rearrangement and tandem duplication (black dotted circle, figure 3D). These results demonstrate that CCNE1 enhancer-based SVs may rewire local regulatory circuits (figures 2B and 3E) by disrupting pre-existing TAD boundaries and causing new long-range chromatin interactions.
PeNChIP-seq also reveals IGF2 and CCND1 enhancer hijacking in GC
Besides CCNE1, our analysis further revealed enhancer-based SVs affecting IGF2 and CCND1 (online supplementary table 12) collectively occurring in 4.8% of GCs (IGF2, eight cases; CCND1, two cases). At IGF2, enhancer-based SVs were observed in the form of tandem duplications (figure 4A, online supplementary figure 14A, online supplementary table 12) and linked to IGF2 overexpression in both primary GCs (figure 4B) and cell lines (online supplementary figure 14B,C). Associations between IGF2 SVs and IGF2 overexpression have been observed in colorectal cancer.13 Comparison of primary GC histone profiles against matched normal gastric tissues confirmed that the hijacked enhancer is tumour-associated (online supplementary figure 14D). When compared against gastric, colorectal and breast cancer cell lines,20 we observed similar enhancer acetylation patterns in gastric and colorectal cancer but not in breast cancer (online supplementary figure 14D–F), suggesting that IGF2 enhancer-based SVs are likely lineage-specific and shared across gastro-intestinal malignancies. We also found that IGF2 enhancer-based SVs are associated with TAD disruptions, as IGF2-rearranged primary GCs (T990275) (top, figure 4C) exhibited novel chromatin interactions at the IGF2 locus between pre-existing contact domains (yellow) and peaks (black dotted circles, figure 4C) compared with SNU16 GC cells that have a wild-type IGF2 genomic architecture (figure 4C). Survival analysis revealed that patients exhibiting high IGF2 expression showed poor overall survival compared with patients exhibiting low IGF2 expression (TCGA STAD, online supplementary figure 15; p value=2.5×10⁻², log-rank test). In a multivariate analysis, the association with survival remained statistically significant (Cox regression p value=2.3×10⁻²; HR=1.8, 95% CI 1.08 to 2.90) after adjusting for other risk factors, such as age, stage, patient locality and histological subtype.
Finally, we observed tumour-associated enhancer-based SVs (figure 4D) supported by discordant paired and split reads at the CCND1 locus. Similar to CCNE1, the CCND1 SVs were associated with both CCND1 amplification and marked upregulation of CCND1 relative to other GC samples lacking CCND1 rearrangements (figure 4E). Analysis of the CCND1-partner breakpoint revealed that it involved a predicted enhancer associated with the PDHX locus in GC, distinct from previous CCND1-hijacking events in lymphoma which involve IGH.21 Taken collectively, these cases demonstrate that in GC, oncogenes such as CCNE1, IGF2 and CCND1 can be activated by rewiring regulatory circuits via enhancer-hijacking.
SVs at non-coding regions can cause enhancer-hijacking, where distal enhancer elements are juxtaposed against cancer genes. Initially identified in medulloblastoma through the activation of the GFI1 and GFI1B proto-oncogenes,12 enhancer hijacking events have two major consequences—first, tissue-specific enhancer elements are relocalised to the vicinity of oncogene promoters thereby instigating oncogene activity. Second, genomic rearrangements SVs may disrupt wild-type TAD boundaries, resulting in aberrant enhancer-promoter interactions between regulatory elements normally insulated from one another. In the current study, we employed PeNChIP-seq to identify enhancer hijacking events in GC. Compared with conventional enhancer-ChIP+WGS, benchmarking analysis revealed that only a subset of enhancer-based SVs were identified detected by PeNChIP-seq (online supplementary figure 1C). The performance of PeNChIP-seq may be improved by technical and algorithmic modifications, such as decreased DNA shearing to increase the size of library fragments, increasing the PeNChIP-seq sequencing depth, and more sophisticated methods to dissect complex SV events.
We identified tumour enhancer-based SVs associated with upregulation of CCNE1 and IGF2 expression in GC. In GC, CCNE1 has been associated with copy number amplification,33 and in liver cancer as a target of viral integration and enhancer hijacking.40 Our results suggest that for GC target genes exhibiting amplification such as CCNE1, enhancer-based SVs may interact with conventional gene dosage to collaboratively up-regulate target gene expression (online supplementary figure 12). Supporting this hypothesis, we experimentally confirmed that deletion of a CCNE1-hijacked enhancer in MKN7 cells (harbouring both CCNE1 enhancer-based SVs and CCNE1 amplification) caused decreased CCNE1 expression. Cyclin E1 (CCNE1) encodes a cyclin that stimulates Rb phosphorylation by CDK2, controlling the transition of cells from G1 to S phase. Breakpoints at this locus were detected in a significant fraction (8%) of GC patients, comparable to rates of ERBB2 amplification, the molecular target of trastuzumab.41 From a therapeutic perspective, overexpression of CCNE1 has been associated with poorer patient survival,23 and CCNE1 co-amplification has been reported as a collaborative oncogenic event conferring resistance to ERBB2-directed therapies in gastro-oesophageal malignancies.22 Cell lines harbouring CCNE1 amplification have also been associated with resistance to the CDK4/6 inhibitor abemaciclib,9 and MKN7 GC cells have been reported to be sensitive to bortezomib,42 highlighting a possible treatment strategy for CCNE1-addicted GCs. Notably, the CCNE1 genomic locus also contains URI1, which has been reported as an oncogene in ovarian and colorectal cancer.43 44 However, comparison of expression data demonstrated that CCNE1-rearrangments resulted in greater upregulation of CCNE1 than URI1 (CCNE1: median 40×, average 29×; URI1: median 7×, average=5×), motivating us to focus on CCNE1.
Besides CCNE1, our findings also identified IGF2 as another target of enhancer hijacking in GC. In gastric tumours, IGF2 enhancer hijacking employed a mechanism very similar to that previously reported in colorectal cancer,13 involving a tandem duplication that causes de novo formation of a TAD interaction domain (confirmed by Hi-C), mobilising a tissue-specific super-enhancer normally inaccessible to IGF2. IGF2 is known to play a role in the growth and proliferation of cells, and in our study IGF2 upregulation in GC was dramatic (>100-fold). In colorectal cancer, high IGF2 expression has been associated with poor prognosis,45 and IGF2 knockdown in paclitaxel-resistant ovarian cancer cells restored drug sensitivity.46 In addition, intron 8 of IGF2 contains the microRNA MIR483, which has also been reported to exhibit oncogenic potential.47 Co-occurrence of this enhancer hijacking event in both gastric and colorectal cancer supports non-coding enhancer hijacking as a dominant mechanism of IGF2 overexpression in gastro-intestinal malignancies.
Our study is one of the first to experimentally confirm the establishment of de novo chromatin interactions at the locations of enhancer-based SVs in primary epithelial tumours, as shown by Hi-C data for both CCNE1 and IGF2. These results are consistent with recent studies in other tumour types highlighting somatic copy number alterations and genome rearrangements as a mechanism by which local insulator domains may be disrupted, leading to the creation of de novo chromatin and regulatory interactions.13 14 Such interactions have been verified experimentally in our present study through Hi-C analysis of primary GCs, and in other studies by 4C/CTCF-ChIPseq assays in colorectal cancer spheroids,6 by in situ Hi-C analysis of neuroblastoma cell lines,11 and recently in primary diffuse large B-cell lymphoma.48
The observation that tumour-gene expression can be driven by recruitment of enhancer elements may have diagnostic and therapeutic implications. First, since enhancer-based SV breakpoints usually occur within non-exonic regions, these events are unlikely to be detected using exome-based or panel-based tumour sequencing where only protein-coding exons are examined. Second, diagnostic platforms may require incorporation of epigenetic information to distinguish functionally important SVs driving tumour gene expression from bystander SVs arising from background genomic instability. Third, examining gene targets of enhancer hijacking may reveal new therapeutic targets missed by exome-based mutation screening. In the case of IGF2, recent studies have shown that IGF2-overexpressing colorectal cancers may exhibit sensitivity to IGF1R/INSR inhibitor compounds.49 Fourth, for gene targets of enhancer hijacking that are themselves considered ‘undruggable’ (eg, CCNE1), disrupting the activity of the hijacked enhancer, by either genome-editing or targeting epigenetic complexes associated with the hijacked enhancer, may represent another therapeutic opportunity.
In conclusion, our findings highlight the dynamic interplay between the cancer genome and epigenome.19 26 We note that besides CCNE1, IGF2 and CCND1, we also observed tumour-associated enhancer-based SVs causing upregulation of several other genes without previously described roles in GC, such as DLGAP1 and PKDCC, which may represent novel oncogenes. Our results thus demonstrate how non-coding rearrangements may influence tumour gene expression in GC through the rewiring of cis-regulatory elements.
We thank all members of the Singapore Gastric Cancer Consortium for their contributions and support, and the Duke-NUS Genome Biology Facility for sequencing services.
WFO and AMN contributed equally.
Contributors PT led the study and was involved in its conception, design, data collection and assembly, and manuscript writing. WFO, AMN and KJL were involved in study conception, design, data analysis and manuscript writing. JQL, YG, SJL, TN, JXT and WKL were involved in data analysis. AMN, SZ, MX, AM, SWTH, XY, CX, XO and YNL were involved in performing experiments. ML and AL-KT provided genomic profiling (sequencing and microarray) and data preprocessing services. AK, KPW, SR, BTT, SL, AJS provided facilities, reagents and intellectual input. WFO and AMN contributed equally to this article. All authors were involved in proof-reading and gave final approval of the manuscript.
Funding This work was supported by the National Research Foundation Singapore under its Translational and Clinical Research (TCR) Flagship Programme and administered by the Singapore Ministry of Health’s National Medical Research Council (TCR/009-NUHS/2013) and grant NMRC/STaR/0026/2015. Other sources of support include A*STAR A*ccelerate GAP fund (ETPL/15-R15 GAP-0021), core funding from Duke-NUS Medical School, and the Cancer Science Institute of Singapore, NUS, under the National Research Foundation Singapore and the Singapore Ministry of Education under its Research Centres of Excellence initiative.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Primary patient samples were obtained from the SingHealth tissue repository with Institutional Review Board approval and signed patient informed consent.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. All data relevant to the study are included in the article or uploaded as supplementary information.