Article Text

PDF

Original article
Integrated genomic analysis of recurrence-associated small non-coding RNAs in oesophageal cancer
  1. Hee-Jin Jang1,2,3,
  2. Hyun-Sung Lee1,2,4,5,
  3. Bryan M Burt4,
  4. Geon Kook Lee5,
  5. Kyong-Ah Yoon5,6,
  6. Yun-Yong Park2,
  7. Bo Hwa Sohn2,
  8. Sang Bae Kim2,
  9. Moon Soo Kim1,
  10. Jong Mog Lee1,
  11. Jungnam Joo7,
  12. Sang Cheol Kim8,
  13. Ju Sik Yun9,
  14. Kook Joo Na9,
  15. Yoon-La Choi10,
  16. Jong-Lyul Park11,
  17. Seon-Young Kim11,
  18. Yong Sun Lee12,
  19. Leng Han13,14,
  20. Han Liang13,
  21. Duncan Mak15,
  22. Jared K Burks15,
  23. Jae Ill Zo16,
  24. David J Sugarbaker4,
  25. Young Mog Shim16,
  26. Ju-Seog Lee2
  1. 1Center for Lung Cancer, Research Institute and Hospital, National Cancer Center, Goyang, Gyeonggi, Republic of Korea
  2. 2Division of Cancer Medicine, Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
  3. 3Department of Molecular Oncology, The Graduate School of Medicine, Seoul National University, Seoul, Republic of Korea
  4. 4Division of Thoracic Surgery, Michael E. DeBakey Department of Surgery, Baylor College of Medicine, Houston, Texas, USA
  5. 5Lung Cancer Branch, Research Institute and Hospital, National Cancer Center, Goyang, Gyeonggi, Republic of Korea
  6. 6College of Veterinary Medicine, Konkuk University, Seoul, Republic of Korea
  7. 7Biometric Research Branch, Research Institute and Hospital, National Cancer Center, Goyang, Gyeonggi, Republic of Korea
  8. 8Department of Biomedical Informatics, Center for Genome Science, National Institute of Health, KCDC, Choongchung-Buk-do, Republic of Korea
  9. 9Lung and Esophageal Cancer Clinic, Department of Thoracic and Cardiovascular Surgery, Chonnam National University Hwasun Hospital, Hwasun, Jeollanamdo, Republic of Korea
  10. 10Department of Pathology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
  11. 11Department of Functional Genomics, University of Science and Technology, Medical Genomics Research Center, KRIBB, Daejeon, Republic of Korea
  12. 12Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, Texas, USA
  13. 13Division of Quantitative Sciences, Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
  14. 14Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston McGovern Medical School, Houston, Texas, USA
  15. 15Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
  16. 16Department of Thoracic and Cardiovascular Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
  1. Correspondence to Dr Hyun-Sung Lee, Division of Thoracic Surgery, Michael E. DeBakey Department of Surgery, Baylor College of Medicine, Houston, TX 77030, USA; Hyun-Sung.Lee{at}bcm.edu; Dr Young Mog Shim, Department of Thoracic Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea; ymshim{at}skku.edu; Dr Ju-Seog Lee, Department of Systems Biology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; jlee{at}mdanderson.org

Abstract

Objective Oesophageal squamous cell carcinoma (ESCC) is a heterogeneous disease with variable outcomes that are challenging to predict. A better understanding of the biology of ESCC recurrence is needed to improve patient care. Our goal was to identify small non-coding RNAs (sncRNAs) that could predict the likelihood of recurrence after surgical resection and to uncover potential molecular mechanisms that dictate clinical heterogeneity.

Design We developed a robust prediction model for recurrence based on the analysis of the expression profile data of sncRNAs from 108 fresh frozen ESCC specimens as a discovery set and assessment of the associations between sncRNAs and recurrence-free survival (RFS). We also evaluated the mechanistic and therapeutic implications of sncRNA obtained through integrated analysis from multiple datasets.

Results We developed a risk assessment score (RAS) for recurrence with three sncRNAs (microRNA (miR)-223, miR-1269a and nc886) whose expression was significantly associated with RFS in the discovery cohort (n=108). RAS was validated in an independent cohort of 512 patients. In multivariable analysis, RAS was an independent predictor of recurrence (HR, 2.27; 95% CI, 1.26 to 4.09; p=0.007). This signature implies the expression of ΔNp63 and multiple alterations of driver genes like PIK3CA. We suggested therapeutic potentials of immune checkpoint inhibitors in low-risk patients, and Polo-like kinase inhibitors, mammalian target of rapamycin (mTOR) inhibitors, and histone deacetylase inhibitors in high-risk patients.

Conclusion We developed an easy-to-use prognostic model with three sncRNAs as robust prognostic markers for postoperative recurrence of ESCC. We anticipate that such a stratified and systematic, tumour-specific biological approach will potentially contribute to significant improvement in ESCC treatment.

  • RNA EXPRESSION
  • OESOPHAGEAL CANCER
  • SURGICAL ONCOLOGY
  • CANCER IMMUNOBIOLOGY
  • GENE EXPRESSION

Statistics from Altmetric.com

Significance of this study

What is already known on this subject?

  • Oesophageal cancer results in about 400 000 deaths, making the disease the sixth-leading cause of cancer-related deaths. Oesophageal squamous cell carcinoma (ESCC) is particularly prevalent in Asia and Africa where it accounts for more than 90% of all oesophageal cancer cases.

  • Although combination therapy with surgery, chemotherapy and radiotherapy has improved survival of ESCC, the 5-year overall survival (OS) rate is at best 20%.

  • Reliable and reproducible prognostic markers identifying patients at high risk of ESCC recurrence after surgery have not been established.

What are the new findings?

  • For easy translation of our findings to the clinic, we developed a recurrence risk assessment score (RAS) using three small non-coding RNAs (sncRNAs) from fresh frozen tissue or formalin-fixed, paraffin-embedded ESCC specimens as robust prognostic markers for predicting postoperative recurrence of ESCC and validated this score in independent Eastern, Western and lung squamous cell carcinoma cohorts comprising over 500 patients.

  • The expression of oncogenic isoforms ΔNp63 was significantly higher in the high-RAS group and, interestingly, this higher expression correlated with amplification of TP63, further suggesting that the activation of alternative promoters and alternative splicing of TP63 are major mechanisms for the regulation of TP63 in squamous carcinoma.

  • For the low-risk group, we tested the immunogenicity of the tumour with time-of-flight mass cytometry under the hypothesis that the low-risk group has immunogenic tumours because a plethora of immune genes are upregulated. The upregulation of programmed cell death ligand 1 (PD-L1) in low-risk ESCC and no alteration of PD-L1 in high-risk ESCC, in response to loaded dendritic cells, suggest the therapeutic potential of immune checkpoint inhibitors.

  • We found that the high-risk group was more sensitive to Polo-like kinase (PLK) inhibitors, mammalian target of rapamycin (mTOR) inhibitors and histone deacetylase (HDAC) inhibitors by using the Genomics of Drug Sensitivity in Cancer Database, the largest public resource of information on the sensitivity of almost 700 cancer cells to 140 drugs.

How might it impact on clinical practice in the foreseeable future?

  • This study provides suggestive evidence, based on cancer genomics, for clinical trials for immune checkpoint inhibitors in patients with low-risk ESCC and for targeted agents, such PLK inhibitors, mTOR inhibitors and HDAC inhibitors, in patients with high-risk ESCC. Our analyses of RAS for recurrence demonstrate the potential of sncRNA signatures as robust prognostic and predictive markers for precision medicine.

Introduction

Oesophageal cancer (ESCA) is the eighth most common cancer worldwide. Of the 456 000 new cases estimated in 2012, 398 000 were squamous cell carcinoma (ESCC), 52 000 were oesophageal adenocarcinoma (EAC) and 6000 were other carcinomas. ESCA results in about 400 000 deaths, making the disease the sixth-leading cause of cancer-related deaths.1 ,2 ESCC is particularly prevalent in Asia and Africa where it accounts for more than 90% of all ESCA cases.3 Although combination therapy with surgery, chemotherapy and radiotherapy has improved survival of ESCC, the 5-year overall survival (OS) rate is at best 20%.1

Reliable and reproducible prognostic markers identifying patients at high risk of ESCC recurrence after surgery have not been established. In recent decades, non-coding RNAs (ncRNAs), especially microRNAs (miRNAs), have been found to play important roles in carcinogenesis and tumour progression. For ESCC, several miRNAs4–7 and long ncRNAs8 have been investigated as biomarkers or therapeutic targets. The translation of these findings into clinical approaches should be supported by robust validation in large independent patient cohorts and a better understanding of the underlying biology associated with ncRNAs.

In this study, we applied systems biology approaches to evaluate the expression patterns of small ncRNAs (sncRNAs) and to develop and validate a robust prognostic prediction model for ESCC. We propose molecular mechanisms and potential therapeutics dictating clinical outcomes that were well reflected in the prediction model.

Methods

Patients and study procedures

Eligibility and exclusion criteria

We enrolled patients who (1) had histologically confirmed ESCC confined to the thorax, (2) had undergone only complete oesophagectomy with adequate lymph node dissection, (3) had available paired tumour and adjacent normal tissue in the tumour bank (discovery set) or formalin-fixed, paraffin-embedded (FFPE) tissue block (validation set) and (4) were followed up completely.

The patient enrolment process and study scheme are illustrated in figure 1 and online supplementary figure S1. The clinical and pathological characteristics of the discovery and validation sets appear in online supplementary table S1. To generate the sncRNA expression profile (see online supplementary table S2), we obtained 108 pairs of fresh frozen ESCC and adjacent normal tissue samples from surgically resected specimens collected between 2002 and 2009 at the National Cancer Center Korea (discovery set). We included samples from patients with thoracic ESCC who underwent complete oesophagectomy with adequate lymph node dissection and did not receive any perioperative chemotherapy or radiotherapy. For external validation, we used 214 FFPE ESCC specimens collected at Samsung Medical Center and Chonnam National University between 2001 and 2011. For external validation in the Western cohorts, we used The Cancer Genome Atlas (TCGA) database. All human samples were collected with informed patient consent and the study protocol was approved by each institution's ethics committee.

Figure 1

Flow diagram showing the patient cohort enrolment. Summary of the clinical study design is shown. Left panel: the discovery cohort enrolment comprised 108 ESCC samples obtained from fresh frozen tissues in NCCK. Right panel: independent validation processes for other 214 ESCCs, 151 TCGA ESCAs and 147 TCGA LUSCs. CNU, Chonnam National University; ESCA, oesophageal cancer; ESCC, oesophageal squamous cell carcinoma; FFPE formalin-fixed, paraffin-embedded; LUSC, lung squamous cell carcinoma; NCCK, National Cancer Center Korea; RAS, Risk Assessment Score; SMC Samsung Medical Center; sncRNAs, small non-coding RNAs; TCGA, The Cancer Genome Atlas.

Determination of sample size in the discovery and validation sets

The discovery study was designed to detect sufficient differences in the 5-year recurrence-free survival (RFS) rate between the low-risk and high-risk groups, as defined by the GeneChip miRNA 2.0 array (Affymetrix, Santa Clara, California, USA). The sample size and power calculations for the discovery study were determined in two ways considering the difference in survival and fold change. First, we determined that a total of 52 events were needed to predict survival using log2 sncRNA expression with a standard deviation (SD) of 1.67 according to our own experimental data and to detect a hazard ratio (HR) of 1.5 with 80% power. Since the internally validated 5-year RFS rate was about 50%, a total of 104 samples were needed to observe 52 events. With this sample size, the study had 90% power to detect a 5-year RFS rate difference of 30% from the 30% reference rate of the high-risk group defined by sncRNA expression at a two-sided type I error rate of 0.05. Second, the study was powered to detect significantly overexpressed or under-expressed sncRNAs in cancer cells compared with normal cells. To detect twofold expression change with 90% power at a multiple testing corrected type I error rate of 0.05/900, a total of 79 samples were needed. Accounting for about 10% data loss to experimental failure and dropouts, a total sample size of over 90 was needed. According to these two calculations, we sought to study a sample size of over 100 for the discovery study.

The validation study was designed to detect sufficient difference in the 5-year RFS rate between the low-risk and high-risk groups, as defined by the three candidates of sncRNAs found in the discovery set. We assumed that the proportion of the low-risk group would be about 30%. Moreover, the internally validated 5-year RFS rate of the high-risk group was about 30%, whereas the RFS rate of the low-risk group was over 70%. We therefore powered the study to detect a 30% 5-year RFS rate difference from the 30% reference rate of the high-risk group, a conservative effect of size derived from the previous finding. To have 90% power at a two-sided 5% type I error rate, the study required 190 patients (57 in the low-risk group and 133 in the high-risk group) to observe 104 events. Accounting for data loss (about 10%) to experimental failure and dropouts, we set the sample size of the study to exceed 210.

Small non-coding RNA microarray hybridisation and analysis

For sncRNA expression array analysis, we used a GeneChip miRNA 2.0 array. We labelled 1 μg of total RNA with the FlashTag Biotin HSR RNA Labeling Kit (Genisphere LLC, Hatfield, Pennsylvania, USA) following the manufacturer's recommendations. Then, labelled sncRNA was hybridised to the array and incubated as described in the manufacturer's protocol (Affymetrix). Unsupervised hierarchical clustering of sncRNA expression data from ESCC and adjacent normal tissues in the discovery set revealed marked distinctions between tumour tissues and normal tissues, suggesting that the expression of sncRNA well reflects the molecular characteristics of ESCC tumours (see online supplementary figure S2).

Selection of reference gene for normalisation of quantitative real time-PCR

We adopted a normalisation strategy with multiple reference genes, which was proposed in a previous study,9 to overcome the limitations of using a single reference gene for the normalisation of miRNA expression from quantitative reverse transcription PCR (qRT-PCR) experiments.9–13 We first assessed the gene expression stability of putative normaliser genes with geNorm9 software in microarray data. Then, we selected two miRNAs (miR-132 and miR-652) with high expression stability (see online supplementary figure S3) and calculated the geometric mean value of the two miRNAs as the normaliser of miRNAs.

To determine the reference gene for the normalisation of nc886 expression, we searched previous studies of ncRNAs and selected three major reference genes: β-actin, cyclophilin A (PPIA)14 and small nuclear RNA 6. We computed the expression stability with geNorm5 and selected endogenous gene, PPIA, because it showed a relatively stable expression in ESCC.

Integrative analysis from the TCGA dataset

To determine how the sncRNA signature is conserved in ESCA from the Western cohort and the lung squamous cell carcinoma (LUSC) set, we obtained miRNA sequencing data of ESCA and LUSC from TCGA portal (https://tcga-data.nci.nih.gov/tcga/). Normalised reads per kilobase per million values were transformed logarithmically and centralised by subtracting the mean of each channel and further normalised to equalise the variance. To analyse TCGA and discovery set data together, we normalised the expression data from the discovery cohort, as previously described herein and divided the data by the SD of the channel to determine which cancer type was more similar to ESCC in terms of sncRNA expression.

We leveraged the ESCA (n=151) and LUSC data (n=147) including mRNA and miRNA sequencing, somatic mutation, DNA copy number alteration and clinical data from the TCGA portal and the cBioPortal for Cancer Genomics (http://www.cbioportal.org/public-portal/index.do).15 To visualise genomic alterations in multiple genes, we used OncoPrint generated by cBioPortal for Cancer Genomics.15 Then, we analysed the frequency of genetic alteration in major genes. The experimental and analytical methods for generating sncRNA and mRNA expression profile data and the integrated analyses are shown in the online supplementary appendix.

Statistical analysis

Student's t-tests, χ2 tests and Fisher's exact tests were used to compare clinicopathological data. Survival curves were generated with the Kaplan-Meier's method and intergroup comparisons were performed with the log-rank test. Our primary end point was RFS. Univariable Cox regression analysis was used to determine whether sncRNAs were associated with RFS. We performed multivariable Cox proportional hazards regression analysis to evaluate independent prognostic factors associated with survival and we used gene signature, tumour node metastasis (TNM) stage, smoking, age and sex as variables. We used data from the discovery cohort to develop a formula for a risk assessment score (RAS) based on a linear combination of the expression level of sncRNAs weighed by regression coefficients derived from multivariable Cox regression analyses. We analysed correlations between microarray data and qRT-PCR data with the Spearman's correlation coefficient. To select genes with statistically significant alterations from integrated analysis of data from TCGA, we used χ2 or Fisher's exact test. Statistical significance was accepted as p<0.05 and all tests were two-tailed. All statistical analyses were performed with SPSS (V.20.0; SPSS, Chicago, Illinois, USA) and R language and software environment (http://www.r-project.org).

Results

Development of risk assessment score for ESCC recurrence with sncRNAs

Multistep analyses were carried out to identify robust, prognostic sncRNAs from ESCC tumours. First, among 901 sncRNAs with differential expressions between tumours and adjacent normal tissues (see online supplementary figure S2), we selected sncRNAs whose expression showed at least a twofold difference from the median value of at least 30 samples, with a SD >1. This selection yielded 162 sncRNAs. Second, 8 of these 162 sncRNAs were significantly associated (p<0.05) with RFS in univariable Cox regression analysis (see online supplementary table S3). Finally, backward stepwise selection in multivariable Cox regression analysis revealed that only three sncRNAs (miR-223, miR-1269a and miR-886-3p) were independent predictors of RFS (p<0.05). High expression of miR-886-3p and miR-223 was significantly associated with longer RFS (p=0.03 and p=0.004, respectively, by the log-rank test), but high expression of miR-1269a was significantly associated with shorter RFS (p=0.045) (see online supplementary figure S4).

We measured the size of miR-886 via northern blot analysis of ESCC cell lines and confirmed that miR-886 is not a miRNA but rather a novel sncRNA 101 nucleotides long (see online supplementary figure S5) (ncRNA886 (nc886)).16 ,17 We remeasured the expression of three sncRNAs (two miRNAs and one sncRNAs) from the same tissues with qRT-PCR. The correlation between the two measurements was significantly high (r>0.6; p<0.001) (see online supplementary figure S6 and table S4).

We next developed a recurrence risk assessment model with Cox regression coefficients and the expression level of each sncRNA measured with qRT-PCR: Raw Risk Assessment Score for Recurrence (RASraw)=(−0.084 × expression value of nc886)+(−0.227 × expression value of miR-223)+(0.137 × expression value of miR-1269a). To generate a dynamic range score from 0 to 100, we reformulated the RASraw: RAS=80×eRASraw–24.152. For results less than 0, the score was considered 0 and, for results greater than 100, it was considered 100. Cut-off points were specified in advance to reflect prognostic differences observed in the training cohort: low (RAS≤30), intermediate (30<RAS≤65) and high risk (RAS>65) (see figure 2A and online supplementary figure S7).18 In the discovery cohort, the 5-year RFS rates were 60.8%, 44.8% and 30.3% for the low-risk, intermediate-risk and high-risk groups, respectively (p=0.031, log-rank test, figure 2B).

Figure 2

Development and validation of the prediction model with recurrence-associated small non-coding RNAs (sncRNAs) are shown. (A) Distribution of recurrence assessment score is shown. We constructed a risk score formula with Cox regression coefficients of three sncRNAs. Patients were grouped into three categories according to the modified risk prediction model on a scale from 0 to 100 (Panel A, upper). Each column represents an individual patient. The colour in cells reflects relative expression levels. Kaplan-Meier curves in discovery (B) and validation sets (C and D) show discrete survival curves according to risk categories. (E and F) Recurrence-free survival according to risk categories in TCGA ESCA and TCGA LUSC, respectively is shown. ESCA, oesophageal cancer; LUSC, lung squamous cell carcinoma; RAS, Risk Assessment Score; RFS, recurrence-free survival; TCGA, The Cancer Genome Atlas.

Validation of RAS in independent cohorts

We sought to validate RAS in an independent, blinded cohort of patients with ESCC with RNAs from FFPE tissues and qRT-PCR methods. When patients in the validation cohort (n=214) were stratified according to their RAS with the same cut-off values used in the discovery set, the 5-year RFS rate for the low-risk, intermediate-risk and high-risk patient groups were 89.1%, 75.5% and 48.7%, respectively (p<0.001, log-rank test) (figure 2C). The difference in the survival between discovery and validation cohorts is attributed to accessibility of FFPE specimens in the validation cohort. The validation cohort had earlier cancer staging and lower tumour location than those in the discovery cohort. In each stage, especially stages II and III, the high-risk group had worse prognosis in the validation set (figure 2D). Interestingly, the RFS rate of patients with high risk in stage II was similar to that of patients with low/intermediate risk in stage III, thus suggesting that RAS is an independent predictor of RFS among varying stages. In addition, to determine whether this signature is conserved in Western cohorts and in other surgically resected squamous cell carcinomas, we analysed genomic data of ESCA (N=151) and LUSC (n=147) from TCGA, because LUSC shares highly similar genetic alterations with ESCC.19–21 Patients with high RAS levels consistently showed the worst prognosis (see figure 2E, F and online supplementary figure S8). This result shows the robustness of RAS based on three sncRNAs, regardless of the technological platform (see online supplementary figure S9) or the source of RNAs.

In the validation ESCC cohort, continuous RAS was significantly associated with risk of recurrence (HR for a 25-unit increase in score, 1.78; 95% confidence interval (CI), 1.35 to 2.34; p=3.7×10−5; figure 3). In univariable analysis, TNM stage, age, smoking status and RAS were significant predictors of RFS and OS (see online supplementary table S5). Multivariable regression analysis revealed that RAS (HR=2.27, p=0.007) and TNM stage (HR=4.13, p<0.001) were independent prognostic factors of RFS and OS in patients with ESCC (see online supplementary table S6).

Figure 3

Relationship between the continuous risk assessment score (RAS) for recurrence and 5-year recurrence risk with 95% CIs is shown. Red dotted lines represent 95% CIs. Rug plots represent the distribution of RAS in the validation cohort.

Integrated analysis based on sncRNA signature

To gain insight into the biology of sncRNAs associated with prognosis, we performed mRNA microarray assays in the discovery set and analysed mRNA sequencing data of ESCA and LUSC from the TCGA project. 613 common genes were significantly correlated with RAS in all three cohorts (see online supplementary table S7 and figure S10). Enrichment analysis of these genes through gene ontology revealed that genes related to cell mitosis, chromatin remodelling, histone modification and DNA repair were significantly activated in the high RAS group, but genes related to cell-to-cell cytokine signalling were dominant in the low RAS group. Among them, tumor protein P63 (TP63), a marker of squamous histology and a member of the TP53 family, was overexpressed in the high RAS group. In addition, the expression of genes related to histone modification, such as histone acetyltransferase (HAT), E1A binding protein P300 (EP300) and histone deacetylase 2 (HDAC2) and the expression of chromatin remodelling genes, such as CCCTC-binding factor (CTCF), were simultaneously increased. In contrast, the low RAS group showed upregulation of immunoregulatory genes with cell-to-cell cytokine signalling functions, such as colony stimulating factor 3 (CSF3), interleukin (IL)8 and IL4R (figure 4A).

Figure 4

Patterns of genetic alteration in oesophageal cancer and LUSC according to RAS are shown. (A) The expression of 613 common genes is correlated with the risk assessment score (RAS). (B) Expression of the isoforms of TP63 (TAp63 (tumour-suppressive) and ΔNp63 (oncogenic)), TP53 mutation and copy number alteration of TP63 and PIK3CA in TCGA ESCA is shown. (C) Expression of the isoforms of TP63, TP53 mutation and copy number alteration of TP63 and PIK3CA in TCGA LUSC is shown. EAC, oesophageal adenocarcinoma; ESCA oesophageal carcinoma; ESCC, oesophageal squamous cell carcinoma; LUSC, lung squamous cell carcinoma; RAS, Risk Assessment Score; TCGA, The Cancer Genome Atlas.

To elucidate the biological linkage between sncRNA signature and TP63, we analysed the expression patterns of TP63 isoforms in ESCA and LUSC (n=134) from the TCGA project. We investigated six isoforms of TP63 that have different biological activities. TAp63α, β and γ with a transcription activation (TA) domain have tumour suppressive activity, but ΔNp63 α, β and γ without a TA domain have oncogenic activity, as they serve as competitive inhibitors of TAp63α, β and γ.22 As a result, the expression of oncogenic isoforms, ΔNp63β and γ were significantly higher in the high RAS group of both ESCA and LUSC cohorts (figure 4B, C), suggesting that the tumour suppressive activity of TP63 is blocked by its oncogenic isoforms in this group. Interestingly, higher expression of ΔNp63β and γ correlated with amplification of TP63, further suggesting that the activation of alternative promoters and alternative splicing of TP63 is a major mechanism in the regulation of TP63 in squamous cell carcinoma. Additionally, TP53 mutation and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) amplification are significantly higher in the high RAS group (figure 4C). Taken together, these data strongly support mechanistic roles of the sncRNA signature as a prognostic marker, as this signature is associated with the activation of ΔNp63 and multiple alterations of driving genes, such as TP53, TP63 and PIK3CA.

Therapeutic implication for risk-stratified ESCC

To detect suitable targeted drugs for low-risk and high-risk patients, we used the Genomics of Drug Sensitivity in Cancer Database (http://www.cancerRxgene.org), the largest public resource for information on the sensitivity of almost 700 cancer cells to 140 drugs.23 ,24 A total of 55 cell lines of ESCA, lung squamous cell and head and neck squamous cell carcinomas were enrolled and classified into three risk groups by using a prediction model with proof of its reproducibility (see online supplementary figures S11 and S12).

The drug sensitivity test revealed that cell lines with a high-risk signature were sensitive to Polo-like kinase (PLK) inhibitor—which blocks cell mitosis and regulates chromosomal remodelling—and mammalian target of rapamycin (mTOR) inhibitor—which inhibits downstream of PIK3CA directly. Likewise, all cell lines were sensitive to HDAC inhibitor—which disturbs the reset of chromatin by removing acetyl groups (see figure 5A and online supplementary figure S13). These results were consistent with our previous integrated genome analysis. Furthermore, the apoptosis assay of risk-stratified ESCC cell lines for three HDAC inhibitors (Panobinostat, Dacinostat (LAQ824) and Vorinostat), two PLK inhibitors (BI2536 and BI6727 (Volasertib)) and two mTOR inhibitors (BEZ235 (Dactolisib) and Tacrolimus (FK506)) demonstrated that TE-8 high-risk cell line had dominant signals of the apoptosis compared with Het-1a and TE-1 cell lines with lower risk (figure 5B).

Figure 5

Therapeutic implications are shown for high-risk oesophageal cancer. (A) A prediction model derived from a risk assessment score (RAS)-correlated 613 mRNAs signature revealed that TE-8 ESCC cell lines are a high-risk group and TE-1 cell line is a lower-risk group. Drug rearrangement in high-risk group with the Genomics of Drug Sensitivity in Cancer Database is shown. The IC50 values of each drug within the therapeutic range of concentration are compared between groups. (B) Apoptosis assay of risk-stratified oesophageal cell lines in the candidate targeted drugs is shown. The IC50 of each drug for high-risk cell line was administrated and then all cells were captured 24 hours later. ESCC, oesophageal squamous cell carcinoma; HDAC, histone deacetylase; IC50 half-maximal inhibitory concentration; mTOR, mammalian target of rapamycin; PLK, Polo-like kinase.

Given the upregulation of immunoregulatory genes in the low RAS group, we hypothesised that such gene expression profile in the ESCC cell would stimulate an immunogenic tumour microenvironment.25 To investigate immunogenicity26 on ESCC cells and dendritic cells, we used a model system in which either the low-risk ESCC cell line TE-1 or the high-risk cell line TE-8 was cultured in the presence of immature dendritic cells (imoDCs) or matured monocyte-derived dendritic cells (moDCs) and peripheral blood mononuclear cells (PBMCs) obtained from healthy human donors. Flow cytometry showed that CD40 and CD86, markers of mature dendritic cells, were activated by pooled IgG-coated TE-1 or TE-8 tumour cells with poly(I:C) for 24 hours. Interestingly, imoDCs were also stimulated by TE-1 cells fixed with 2% paraformaldehyde only and not by fixed TE-8 cells, implying that TE-1 cells are more immunogenic to dendritic cells (figure 6A). We then performed single-cell analysis with time-of-flight mass cytometry (see online supplementary table S8) and applied viSNE27 and SPADE28 ,29 software algorithms to evaluate the immune landscape. Immunophenotyping with 17 metal-conjugated antibodies (figure 6B) revealed that the expression of programmed cell death ligand 1 (PD-L1) (whose expression on tumour or antigen-presenting cells suppresses antitumour CD8+ T cells by binding its receptor, programmed cell death 1 (PD-1), found on activated T cells30) was upregulated in only TE-1 cancer cells, but not altered in TE-8 cells after incubation with moDCs and PBMCs. These results suggest that low-risk ESCC cells have immunogenicity that leads to the expression of PD-L1 and coordinates a tumour-favourable milieu (figure 6C). Furthermore, PD-L1 was markedly expressed in dendritic cells in a tumour microenvironment with TE-1 but not with TE-8 (see figure 6B and online supplementary figure S14). Also, the expression of immune inhibitory PD-1, PD-L1 and cytotoxic t-lymphocyte-associated protein 4 (CTLA-4) from discovery and validation cohorts was significantly upregulated in the low-risk group compared with other risk groups (figure 6D).

Figure 6

Therapeutic implications are shown for low-risk oesophageal cancer. (A) CD40 and CD86 expression on imoDCs from TE-1 and TE-8 ESCC cells is shown. The imoDCs incubated with pooled IgG-coated tumour cells and poly(I:C) are used as positive control of the stimulation in dendritic cell. (B) Immunophenotyping of TE-1 ESCC cell lines after administration of mature moDCs and PBMCs is shown. (C) CyTOF single cell analysis of TE-1 and TE-8 cells before and after incubation with mature moDCs and PBMCs is shown. The alteration of PD-L1 expression on the surface of tumour cells (CD45−) responsive to immune stimulation was analysed with viSNE and SPADE software. (*)=new expression of PD-L1 on tumour cells. (D) Expression of PD-1, PD-L1 and CTLA-4 in discovery and validation cohorts is shown. Immune inhibitory signal represents any alteration of PD-1, PD-L1 and CTLA-4. CTLA-4, cytotoxic T-lymphocyte-associated protein 4; CyTOF, time-of-flight mass cytometry; ESCC, oesophageal squamous cell carcinoma; imoDCs, immature monocyte-derived dendritic cells; moDCs, monocyte-derived dendritic cells; PBMCs, peripheral blood mononuclear cells; PD-1, programmed cell death 1; PD-L1, programmed cell death ligand 1 (=CD274); SPADE cyto-spanning tree progression of density normalised events and viSNE visualization of t-distributed stochastic neighbouring embedding algorithm.

Discussion

We developed and validated a RAS model based on the expression of three sncRNAs (miR-223, miR-1269a and nc886). This model can easily and reproducibly quantify the likelihood of recurrence in patients with ESCC who have received surgical resection as primary treatment. This sncRNA signature reflects multiple alterations of driving genes such as TP53, ΔNp63 and PIK3CA. Furthermore, we provided evidence based on cancer genomics for clinical trials of immune checkpoint inhibitors in patients with low-risk ESCC; and targeted agents, such PLK inhibitors, mTOR inhibitors and HDAC inhibitors, in patients with high-risk ESCC. These findings show the potential of sncRNA signatures as robust prognostic and predictive markers for precision medicine based on risk assessment of recurrence.

Previously reported risk prediction models in ESCA have had limited clinical use because, first, signatures derived from heterogeneous patient populations have not been thoroughly validated in independent cohorts.6–8 Second, many models have been based solely on correlative associations between dysregulated markers and prognostic end points without identifying the underlying molecular mechanisms and the biological and clinical implications of dysregulation.4 ,5 ,31–33 Last, most models were generated with data obtained from fresh frozen tissues, which are not readily available in general clinical practice.5–8 ,31 ,33 Our RAS model can overcome these limitations by developing an intuitive risk prediction scoring system with FFPE tissue data. This system has been validated in independent Eastern and Western ESCA cohorts and LUSC cohort comprising 512 patients and uncovers genomic implications that may dictate clinical outcomes associated with the identified markers.

Among three sncRNAs, the role of miR-223 has been extensively analysed since PARP1 was identified as a direct target gene of miR-223 in ESCA. Increased sensitivity to chemotherapy was observed in cells with increased miR-223 and reduced PARP1 expression.34 As for miR-1269a, its high expression was associated with increased risk of relapse and metastasis in colorectal cancer.35 Recently, we reported that nc886 knockdown induced oncogenes V-myc avian myelocytomatosis viral oncogene homolog (MYC) and FBJ murine osteosarcoma viral oncogene homolog (FOS).17 ,36 Combined alterations in these sncRNAs can reflect significant modifications in the tumour microenvironment. Most interestingly, our RAS signature was highly associated with the transcription of TP63, which is critical in the development of normal oesophageal and tracheobronchial epithelia and controls the commitment of early stem cells into basal cell progeny and the maintenance of basal cells. Networking between TP63 and miRNAs is also known in multiple cancers.37 ,38 Taken together, our findings suggest that combined alterations in these sncRNAs may control the differentiation of ΔNp63 and regulate driver genes.

Recently, among patients with previously treated advanced LUSC, survival and response rate were shown to be significantly better with nivolumab, a PD-1 inhibitor, than with docetaxel, but the rate of objective response to nivolumab was only 20%.39 In our study, the low-risk group presented a tumour microenvironment in which immune cells coexisted with tumour cells expressing PD-L1, which suppressed cytotoxic T-cell functions. Thus, we can speculate that there will be a subgroup that responds well to immune checkpoint inhibitors and the low RAS group, having overexpression of many immune regulators, can be a candidate for immunotherapy. In contrast, the alteration of the tumour microenvironment did not affect tumour cells in the high-risk group. This finding highlights that sncRNA signatures have predictive potential for immune checkpoint inhibitors.

Several comprehensive analyses of squamous cell carcinoma have revealed significant alterations in histone-modifying and chromatin-remodelling genes.40–46 HDACs are recruited to active genes to reset chromatin modification states and maintain an adequate level of histone acetylation, after the activation of RNA polymerase II and HATs. Indeed, this strategy of balancing between euchromatin and heterochromatin in active genes could in part account for the mechanism of drug resistance by preventing DNA damage.47 Moreover, HDAC-inhibitor treatment resulted in rapid upregulation of histone acetylation of the PD-L1 gene leading to enhanced and durable gene expression in melanoma.48 In addition, in vivo experiments revealed that the combination therapy of HDAC inhibitor with PD-1 blockade resulted in slower tumour progression and increased survival, when compared with that of control and single-agent experiments. Our results suggest that the combination of immune checkpoint blockade or other targeted drugs with HDAC inhibitors can boost the therapeutic effect on ESCC with less concentrated doses.

In conclusion, our risk prediction model with three sncRNAs from easily accessible tissues can be potentially useful for identifying patients at high risk of recurrence after surgical resection. This study represents a comprehensive characterisation of genomic alterations in squamous cell carcinoma and provides insights into the genetic mechanism of ESCC oncogenesis and supporting evidence for therapeutics. Our findings can facilitate the rational design of future clinical studies that may ultimately lead to the development of effective biomarker-based therapeutic approaches for ESCC.

Accession codes

Data have been deposited into National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO): sncRNA expression microarrays (GSE55857) and mRNA gene expression microarrays (GSE55856).

Acknowledgments

The authors would like to thank Ana María Rodríguez, PhD and Kimberly Macellaro, PhD, members of the Baylor College of Medicine, Michael E DeBakey Department of Surgery Research Core Team, for their editorial assistance during the preparation of this manuscript.

References

View Abstract

Footnotes

  • H-JJ, H-SL and BMB have equally contributed to this article as first authors. YMS, J-SL and H-SL have equally contributed to this article as corresponding authors.

  • Contributors H-JJ, H-SL, YSL, BMB and J-SL conceptualised and planned the study. H-JJ, H-SL, MSK, JML, JSY, KJN, Y-LC, JIZ and YMS contributed to collection of surgical samples and associated clinical information. GKL conducted the pathology assessment. H-JJ, H-SL, YSL and J-SL coordinated the data generation and led to the data analysis. DM and JKB generated the mass cytometry data. LH and HL supported in the generation of sncRNA data from TCGA database. H-JJ, H-SL, Y-YP, BHS and SBK generated the gene expression data. H-JJ, H-SL, K-AY, SBK, JJ and SCK processed, analysed and participated in discussions related to the genomics data. H-JJ, H-SL and JJ conducted the statistical analysis of the clinical data. H-JJ, H-SL, BMB, J-SL, J-LP, S-YK, YSL and DJS participated in discussions, provided critical scientific input, analysis suggestions and logistical support towards the project. H-JJ, H-SL, BMB, Y-SL and J-SL wrote the manuscript.

  • Funding This work was supported by National Cancer Center Korea Research Grant no. 1110260 to H-SL; NIH grants CA150229 to J-SL; Research Scholar Grant, RSG-12-187-01—RMC from the American Cancer Society to YSL; the R&D Program for the Society of the National Research Foundation (NRF) funded by the Ministry of Science, ICT & Future Planning (Grant no.: 2013M3C8A1078501) to YLC; NIH/NCI award number P30CA016672. This project was supported by the Cytometry and Cell Sorting Core at Baylor College of Medicine with funding from the NIH (NIAID P30AI036211, NCI P30CA125123 and NCRR S10RR024574) and the assistance of Joel M. This research was performed in the Flow Cytometry & Cellular Imaging Facility, which is supported in part by the NIH through MD Anderson's Cancer Center Support Grant CA016672.

  • Competing interests None declared.

  • Patient consent Obtained.

  • Ethics approval The study protocol including the use of all human samples with informed patient consent was approved by each institution's ethics committee (three institutes).

  • Provenance and peer review Not commissioned; externally peer reviewed.

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

Linked Articles