Original article
T cell neoepitope discovery in colorectal cancer by high throughput profiling of somatic mutations in expressed genes
  1. Daniele Mennonna1,
  2. Cristina Maccalli2,
  3. Michele C Romano1,
  4. Claudio Garavaglia1,
  5. Filippo Capocefalo2,
  6. Roberta Bordoni3,
  7. Marco Severgnini3,
  8. Gianluca De Bellis3,
  9. John Sidney4,
  10. Alessandro Sette4,
  11. Alessandro Gori5,
  12. Renato Longhi5,
  13. Marco Braga6,
  14. Luca Ghirardelli6,
  15. Ludovica Baldari6,
  16. Elena Orsenigo6,
  17. Luca Albarello7,
  18. Elisabetta Zino8,
  19. Katharina Fleischhauer8,9,
  20. Gina Mazzola10,
  21. Norma Ferrero10,
  22. Antonio Amoroso10,
  23. Giulia Casorati1,
  24. Giorgio Parmiani2,
  25. Paolo Dellabona1
  1. Division of Immunology, Transplantation and Infectious Diseases, San Raffaele Scientific Institute, Milan, Italy
  2. Division of Experimental Oncology, San Raffaele Scientific Institute, Milan, Italy
  3. Institute for Biomedical Technologies, National Research Council, Segrate, Italy
  4. La Jolla Institute for Allergy & Immunology, La Jolla, California, USA
  5. Institute of Molecular Recognition Chemistry, National Research Council, Milan, Italy
  6. Department of Surgery, San Raffaele Scientific Institute, Milan, Italy
  7. Department of Pathology, San Raffaele Scientific Institute, Milan, Italy
  8. Unit of Molecular and Functional Immunogenetics, San Raffaele Scientific Institute, Milan, Italy
  9. Institute for Experimental Cellular Therapy, University Hospital Essen, Essen, Germany
  10. Department of Medical Sciences, Center for Transplantation Biology and Immunogenetics, University of Turin, Turin, Italy
Correspondence to Dr Paolo Dellabona, Division of Immunology, Transplantation and Infectious Diseases, San Raffaele Scientific Institute, Via Olgettina 58, Milano 20132, Italy; dellabona.paolo{at}


Objective Patient-specific (unique) tumour antigens, encoded by somatically mutated cancer genes, generate neoepitopes that are implicated in the induction of tumour-controlling T cell responses. Recent advancements in massive DNA sequencing combined with robust T cell epitope predictions have allowed their systematic identification in several malignancies.

Design We undertook the identification of unique neoepitopes in colorectal cancers (CRCs) by using high-throughput sequencing of cDNAs expressed by standard cancer cell cultures, and by related cancer stem/initiating cell (CSC) cultures, coupled with a reverse immunology approach not requiring human leukocyte antigen (HLA) allele-specific epitope predictions.

Results Several unique mutated antigens of CRC, shared by standard cancer and related CSC cultures, were identified by this strategy. CD8+ and CD4+ T cells, either autologous to the patient or derived from HLA-matched healthy donors, were readily expanded in vitro by peptides spanning different cancer mutations and specifically recognised differentiated cancer cells and CSC cultures, expressing the mutations. Neoepitope-specific CD8+ T cell frequency was also increased in a patient, compared with healthy donors, supporting the occurrence of clonal expansion in vivo.

Conclusions These results provide a proof-of-concept approach for the identification of unique neoepitopes that are immunogenic in patients with CRC and can also target T cells against the most aggressive CSC component.

Significance of this study

What is already known on this subject?

  • It is now well established that T lymphocytes play a critical role in controlling cancer progression.

  • T lymphocytes recognise peptides, called epitopes, derived from tumour associated protein antigens.

  • Epitopes derived from mutated cancer proteins are known to elicit strong antitumour T cell responses that correlate with clinical responses.

  • Recent advances in high throughput DNA sequencing techniques, in combination with the in silico prediction of T cell epitopes, have allowed the massive identification of mutated neoepitopes in melanoma, cholangiocarcinoma and chronic lymphocytic leukaemia (CLL).

What are the new findings?

  • We have implemented a method to identify T cell mutated neoepitopes based on the massive parallel sequencing of expressed genes coupled with an immunology approach not requiring HLA allele-specific epitope predictions.

  • This method allowed the identification of several mutated neoepitopes from colorectal cancer, the second leading cause of cancer death.

  • We provide evidence supporting a spontaneous activation and expansion of the patient's T cells specific for a mutated neoepitope expressed by the autologous tumour.

  • Finally, our study also reveals that colon cancer stem/initiating cells, a subpopulation of cells thought to drive tumour initiation, propagation and metastasis, express the mutated genes and are targeted by the neoepitope-specific T cells.

How might it impact on clinical practice in the foreseeable future?

  • The systematic identification of mutated neoepitopes in colon cancer may provide new prognostic/predictive approaches based on the determination of specific T cell responses in patients with colorectal cancer, and may prompt more efficacious immunotherapy strategies that target T cells against the most aggressive cancer stem/initiating cell component.


Recent clinical results obtained with adoptive T cell therapy or immune checkpoint blockade by monoclonal antibodies (mAbs) provide compelling evidence for spontaneous immunosurveillance and T cell mediated regression of human cancers.1–4 T lymphocytes recognise epitopes derived from the processing of tumour-derived protein antigens and presented by major histocompatibility complex (MHC) molecules displayed on cancer cells.5 Tumour associated antigens are encoded either by non-mutated genes, shared by different tumours, or by genes undergoing somatic mutations in cancer cells.5 Somatically mutated cancer genes generate neoepitopes, unique to each tumour, which can induce tumour rejection in mice and appear to dominate the specificity of T cell responses to autologous mouse or human tumours.6–8 The lack of suitable technologies for massive identification of unique cancer neoepitopes has prevented the systematic analysis of T cell responses specific for such epitopes and their exploitation in cancer immunotherapy. Recent advances in high throughput DNA sequencing have overcome these limits and provide powerful tools for the systematic identification of somatic mutations in cancer genes.9,10 This information can be used to derive patient-specific mutated protein sequences, from which synthetic peptides predicted to bind the patient's MHC molecules can be designed and tested for T cell recognition in large-scale reverse immunology approaches.11 This strategy has recently allowed a systematic definition of T cell responses specific for unique neoepitopes in mouse and human cancers, highlighting their relevant role in the tumour control achieved by active vaccination, adoptive T cell therapy or immune checkpoint blockade.12–20

Colorectal cancer (CRC) is the second leading cause of cancer death and responds poorly to current therapies. Average CRCs carry from 70 to more than 1000 non-synonymous exonic mutations per tumour, depending on whether they are microsatellite stable or unstable.21 About 35 of such mutations recurrently affect expressed genes that are likely driving the oncogenic process (candidate cancer genes, CAN-genes).22–24 T cell infiltration of CRC is a strong positive prognostic parameter,25,26 implying that this cancer undergoes active immunosurveillance. The antigenic targets of CRC infiltrating T cells are not known and it is conceivable that they are formed, at least in part, by unique neoepitopes.

CRCs contain a small subpopulation of cells that display stem-cell like properties driving tumour initiation, propagation and metastasis.27 Cancer stem/initiating cells (CSCs) are considered the critical targets for therapy, because their elimination is expected to completely halt cancer progression. CSCs exhibit immunosuppressive effects that may hamper the induction of T cell responses; however, they can be recognised and eliminated by activated T cells.28

In light of these considerations, two relevant questions are whether T cell recognition of unique epitopes occurs in CRC, and whether these epitopes can also target T cell responses against CSCs. To address these questions, we set up a proof-of-concept platform to systematically identify unique neoepitopes from somatically mutated CAN-genes expressed by CRC cells and by the derived CSC cultures. The tumour-derived cDNAs encoding the 20 most frequently mutated CAN-genes in CRC22 were subjected to high throughput sequencing to identify mutations in the expressed genes. To avoid the need for precise bioinformatic prediction and assay of all the possible mutated epitopes that can potentially bind each tumour HLA allele, we tested the ability of pools of long synthetic peptides, spanning the CAN-gene mutations, to elicit T cell responses that recognise the differentiated cancer cells and the CSCs expressing the targeted mutations. Following this approach, we identified unique immunogenic neoepitopes in CRCs and showed that they can target T cells against the CSC component.

Materials and methods

Establishment of tumour cell cultures

Peripheral blood mononuclear cells (PBMCs) were obtained from patients with CRC or HLA-matched healthy donors (HDs) by standard Ficoll separation (Ficoll-Paque PLUS, GE Healthcare Bio-Science). Differentiated and CSC cell lines were generated from surgical specimens as described in online supplementary methods. Written informed consent, in accordance with the Declaration of Helsinki, was obtained from patients before collecting tumour samples and peripheral blood.

PCR amplification of CAN-gene cDNAs

cDNA synthesised from CRC cell poly(A) RNA was PCR amplified using primers specific for each CAN-gene (see online supplementary table S4). The PCR products were gel purified and equalised on Nanodrop before pooling and sequencing.

Massive parallel cDNA sequencing

Amplified cDNA pools (3 µg) were processed for massive sequencing according to the GS FLX Titanium protocol (454 Life Sciences, Roche, Branford, Connecticut, USA), as detailed in online supplementary methods.

PCR assay

DNA extracted from PBMCs or B lymphoblastoid cell lines (LCLs) obtained from the patients with CRC was PCR amplified using specific primers designed around each autochthonous mutation. PCR products were gel purified and directly sequenced by the Sanger method.

MHC-peptide binding analyses

Quantitative assays to measure the binding of peptides to purified HLA-A*02:01 class I molecules were performed as described previously29 and detailed in online supplementary methods.

Retroviral transduction of mutated and WT SMAD4 minigenes

Two 27 aa long minigenes encoding either the SMAD4V370A mutation expressed by the 1247 CRC, or the corresponding SMAD4V370-WT residue, were cloned in the retroviral vector MSCV-IRES-GFP and transduced into HLA-A*02:01+ human embryonic kidney HEK293t cells, which were selected by cell sorting to express high levels of green fluorescent protein (GFP) (detailed in online supplementary methods).

PCR typing of mutated and WT SMAD4

The indicated tumour cell lines were screened by RT-PCR typing for the expression of either SMAD4V370A or SMAD4R361C mutations, or the corresponding wild type (WT) sequence (see online supplementary methods).

Flow cytometry and CD8+ T cell enrichment

Cancer cells, pretreated with interferon γ (IFNγ) for 48 h, were stained with anti-HLA class I W6/32 and anti-HLA-DR L243 mAbs. T cell lines expanded from patients and HDs were stained with anti-CD3 fluorescein isothiocyanate (FITC), anti-human CD4 phycoerythrin (PE) and anti-human CD8 allophycocyanin (APC) mAbs (Becton Dickinson) and with 4',6-diamidino-2-phenylindole (DAPI), and acquired on a Canto II (Becton Dickinson). Results on viable cells were analysed using FlowJo software (Treestar).

T cell cultures

T cell lines and mixed lymphocyte-tumour cell culture (MLTC) were generated from PBMCs as described28 ,30 and detailed in online supplementary methods.

ELISPOT assays

ELISPOT assays for IFNγ production by unique neoantigen-specific T cells were performed as described28 and detailed in online supplementary methods.

Statistical analysis

Comparisons between two groups were made with the two-tailed parametric Student's t test for unpaired samples; multiple comparisons were made by one-way analysis of variance. Statistics were calculated using GraphPad Prism V.5.0 (GraphPad Software). Differences with a p value <0.05 were considered statistically significant. *p value <0.05; **p value <0.01; ***p value <0.001.
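
As an illustration of the comparison described above, the following minimal Python sketch computes the pooled-variance (Student's) t statistic for two unpaired groups and maps a p value to the asterisk notation used in the figures. It is not the authors' workflow, which relied on GraphPad Prism; the function names are hypothetical, and the p value itself would have to come from a t distribution table or a statistics package.

```python
import math
from statistics import mean, variance


def student_t_unpaired(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two unpaired samples.

    Uses the classical pooled-variance Student's t test, which assumes
    equal variances in the two groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


def significance_stars(p_value):
    """Map a p value to the notation of the methods section (strict '<')."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


# Illustrative numbers only (not data from the study):
t_stat, dof = student_t_unpaired([1, 2, 3], [4, 5, 6])
```

Triplicate ELISPOT wells, as used throughout the figures, give four degrees of freedom in such a two-group comparison.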


Identification of somatic mutations in expressed genes of CRC and CSC cultures by massively parallel tumour cDNA sequencing

We first sought to identify unique antigens in cell lines that were established from primary surgical specimens of patients with CRC (see online supplementary tables S1 and S2).28 Single cell suspensions from cancer specimens were cultured in standard conditions to obtain ‘differentiated’ cancer cells and, when possible, also in serum-free conditions to support the generation of colon spheres displaying CSC characteristics.28 The cDNAs encoding the 20 most frequently mutated CAN-genes in CRC22 were PCR-amplified from eight differentiated CRC cell lines and two parallel CSC cultures and subjected to massively parallel sequencing. We found somatic mutations in 3–5 of the 20 expressed genes in all CRC cells (table 1).

Table 1

Mutated CAN-genes found in eight massively sequenced CRC cell lines

The mutations found in the CRC cDNAs were lacking in the corresponding gene exons present in the DNA obtained from healthy cells (PBMCs or LCLs) of the same patients, confirming that they were somatically acquired (data not shown). cDNAs encoding oncogenes were mutated in about 50% of the obtained sequences, consistent with their dominant functions in the presence of a WT allele, with the exception of KRAS, which was mutated in 100% of the reads in three of seven tumour samples. cDNAs encoding oncosuppressors were mutated in about 100% of the obtained sequences, consistent with the loss of heterozygosity state required for their functional loss. Two exceptions to this finding were the APC and SMAD4 cDNAs expressed by the 1247 CRC/CSC samples, which exhibited four (three nonsense, one missense) and two (one missense, one nonsense) mutations, respectively, suggesting that both alleles of each oncosuppressor gene were expressed and carried mutations that either prevented the expression of the encoded APC protein, or that resulted in a non-functional SMAD4 protein.
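
The reasoning above rests on the fraction of sequencing reads that carry a mutation. A minimal Python sketch of that calculation follows. The cut-off values and category labels are hypothetical choices made for illustration and are not taken from the study; because the reads come from cDNA, the fraction also reflects the relative expression of each allele, not only gene dosage.

```python
def mutant_read_fraction(mutant_reads, total_reads):
    """Fraction of reads at a position that carry the mutated sequence."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


def classify_read_fraction(fraction, mixed_range=(0.3, 0.7), mutant_only_min=0.9):
    """Label a mutant read fraction.

    The thresholds are illustrative assumptions:
    - around 50% of reads mutated suggests a WT allele is still expressed
      (the pattern reported for most oncogenes);
    - close to 100% suggests only the mutated allele is expressed
      (the pattern reported for most oncosuppressors).
    """
    if fraction >= mutant_only_min:
        return "mutant_only"
    if mixed_range[0] <= fraction <= mixed_range[1]:
        return "mutant_plus_wt"
    return "unclassified"


# Illustrative read counts only (not data from the study):
label = classify_read_fraction(mutant_read_fraction(48, 100))
```

A gene carrying two different mutations at intermediate fractions, as described for APC and SMAD4 in the 1247 samples, would fall outside this simple two-way scheme and needs to be interpreted per mutation.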

Five genes (APC, KRAS, TP53, PIK3CA, FBXW7) were recurrently mutated in the majority of samples, whereas the other 15 genes were more rarely mutated, consistent with the published data.22,24 The mutations identified in the APC, KRAS, TP53, PIK3CA, FBXW7 and SMAD4 genes were already described in the catalogue of somatic mutations in cancer (COSMIC) database of cancer gene mutations, with two sets of exceptions: (1) the frameshift mutations APCS139fs*2, V915fs*2 and E1317fs*3 in CRCs 1039, 1076 and 1869, respectively; and (2) the missense mutations PIK3CAR770Q and SMAD4V370A in CRC 1247. All the additional somatic mutations in the 15 more rarely mutated genes were apparently newly identified and unique to the expressing CRC samples. The great majority (30/38, 79%) of somatic mutations found in the CRC samples produced modified amino acid sequences of the encoded proteins, as a result of missense mutations generating a new amino acid residue, or frameshift mutations introducing novel open reading frames at the C-terminal protein sequence. A few nonsense mutations introduced a stop codon in the APC gene (R1114*, R1450*, R2204*) and in one SMAD4 allele (E41*) of the 1247 CRC/CSC samples, and in the APC (E893*) and SMAD2 (G457*) genes of the 21 052 CRC. Finally, each pair of differentiated and CSC cultures from either 1076 or 1247 CRCs harboured the same mutations.

Hence, the sequencing of the 20 most frequently mutated genes in CRC provided four to five somatic mutations per tumour, which were a potential source of unique T cell neoepitopes.

Patients’ CD8+ T cells induced by a mutated SMAD4 peptide recognise autologous cancer cells and CSCs

We collected enough PBMCs from patients 1247 and 1039 to investigate T cell recognition of epitopes derived from the mutated gene products. PBMCs from the two patients were stimulated at least twice in vitro with pools of synthetic peptides consisting, for each mutated protein, of three 15 aa long peptides spanning the mutated residues and overlapping by 11 residues (see online supplementary table S3). This approach is based on the evidence that 15 aa long peptides are naturally processed by antigen presenting cells into epitopes that are presented by MHC class I and II molecules to autologous T cells, without prior knowledge of the exact HLA allele-specific epitope structure.31 CD8+ T cells isolated from the stimulated PBMCs were tested for the recognition of the autologous cancer cells, which expressed HLA-A, B, C and HLA-DR upon IFNγ treatment (figure 1A). In patient 1247, peptide pools covering the PIK3CAR770Q and C10orf127S168L mutations did not elicit CD8+ T cells able to recognise the autologous cancer cells (data not shown). Remarkably, however, we found specific recognition of differentiated and CSC cultures by CD8+ T cells induced with the peptide pool encompassing the SMAD4V370A mutation (figure 1B). In preliminary experiments, we found that CD8+ T cells induced with the SMAD4-mut peptide pool were specific for the SMAD4mutated-1 (SMAD4m-1) peptide. Because patient 1247 expressed HLA-A*02:01 (see online supplementary table S2), we assessed whether the SMAD4m-1 peptide was presented by this HLA allele. The CD8+ T cells recognised the SMAD4m-1 peptide presented by HLA-A*02:01+ T2 cells (figure 1C), but not the other two mutated peptides, or the peptide spanning the WT SMAD4 sequence corresponding to the SMAD4m-1 sequence (figure 1C,D).

To further confirm the specificity of the peptide-induced CD8+ T cells for the SMAD4V370A-containing neoepitope, we transduced HLA-A*02:01+ HEK293t cells with two minigenes encompassing the WT or the SMAD4V370A sequences, respectively (figure 1E). The SMAD4m-1 specific CD8+ T cells specifically recognised HEK293t cells transduced with the SMAD4V370A-mutated but not with the SMAD4 WT-minigene, nor three different HLA-A*02:01+ CRC cell lines that were all negative for the SMAD4V370A mutation, and were either positive (COLO293) or negative (SW480, SW620) for the corresponding SMAD4 WT sequence (figure 1F). This result confirmed that the induced CD8+ T cells were specific for a naturally processed SMAD4V370A-containing neoepitope presented by HLA-A*02:01.

Figure 1

Patient T cells recognise a SMAD4V370A-containing neoepitope presented by autologous cancer cells. (A) HLA class I and HLA-DR expression in 1247 cancer cells following 48 h IFNγ induction. (B–D) IFNγ ELISPOT of CD8+ T cells from patient 1247, stimulated with a peptide pool encompassing the SMAD4V370A mutation, assayed for the recognition of: 1247 cancer stem/initiating cells (CSCs) and differentiated colorectal cancer (CRC) cells (B), of SMAD4 mutated peptides presented by HLA-A*02:01+ T2 cells (C), of SMAD4m-1 and SMAD4-1 WT peptides presented by T2 cells (D), ±anti-class I or HLA-DR mAbs. (E) Percentage of GFP expression by sorted HEK293t cells transduced with retroviral vectors encoding the 1247 SMAD4V370A mutated or the corresponding SMAD4V370 WT minigenes. (F) Upper panel. SMAD4V370A-specific CD8+ T cells assayed by IFNγ ELISPOT for the recognition of HEK293t cells transduced with the SMAD4 minigenes, or of three HLA-A*02:01+ CRC cell lines negative for the SMAD4V370A mutation, ±anti-class I mAb. Lower panel. PCR typing for the expression of the SMAD4V370A mutation, or the corresponding SMAD4V370 WT sequence in the CRC cell lines shown in the upper panel, and in untransduced HEK293t (293t) cells. The 1247 and 1869 CRC cell lines are positive and negative controls for the PCR, respectively. (G) SMAD4m-1 peptide-stimulated CD8+ T cells assayed by IFNγ ELISPOT (right panel) for the specific recognition of the peptide epitopes of different lengths (left panel), presented by T2 cells. All IFNγ ELISPOT data are triplicate mean±SD, subtracted of the background spots produced by T cells alone, and are representative of three independent experiments performed with independently induced CD8+ T cell lines. Only the experiment in panel F was performed twice with the same CD8+ T cell line. *p≤0.05; **p≤0.01; ***p≤0.001; ns, not statistically significant.

To define the minimal HLA-A*02:01-restricted SMAD4V370A-containing neoepitope recognised by the T cells of patient 1247, we searched the public prediction database Immune Epitope Database (IEDB) for progressively shorter epitopes from either SMAD4m-1 or SMAD4-1 WT 15mers that were predicted to bind HLA-A*02:01. Synthetic peptides corresponding to the predicted epitopes were then tested for recognition by CD8+ T cell lines induced with the SMAD4m-1 15mer. T cells recognised the 8, 9 and 10 aa long SMAD4m-1-derived peptides number 6, 19, 21 and 31 (figure 1G), but not the corresponding non-mutated peptides (data not shown), defining the recognised minimal CLGQLSNA mutated epitope. Binding assays performed with the three recognised SMAD4 mutated peptides established a very low binding affinity (>500 nM) for HLA-A*02:01, in the range of the corresponding non-mutated peptides (not shown), suggesting that the antigenicity of the mutated SMAD4 peptide epitope was not due to an increased binding affinity for HLA, compared with the non-mutated epitope.
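
The search space behind this step can be illustrated with a short Python sketch that lists every 8, 9 and 10 aa window of a 15mer containing the mutated residue, and flags measured affinities against the 500 nM value quoted above. This exhaustive enumeration is a simplification swapped in for illustration: the study selected candidates through database predictions, which are not reproduced here. Function names and the example peptide are hypothetical.

```python
def candidate_epitopes(peptide, mut_index, lengths=(8, 9, 10)):
    """List all sub-peptides of the given lengths that include the
    residue at `mut_index` (0-based) of `peptide`."""
    candidates = []
    for n in lengths:
        for start in range(len(peptide) - n + 1):
            if start <= mut_index < start + n:
                candidates.append(peptide[start:start + n])
    return candidates


def is_low_affinity(binding_nm, threshold_nm=500.0):
    """True when the measured value exceeds the 500 nM threshold,
    that is, when binding is weaker than the threshold."""
    return binding_nm > threshold_nm


# Placeholder 15mer (hypothetical), mutated residue at index 7.
toy_15mer = "ACDEFGHIKLMNPQR"
windows = candidate_epitopes(toy_15mer, mut_index=7)
```

For a centrally placed mutation a 15mer yields 21 such windows, and fewer when the mutated residue lies near one end, which is why a short list of synthetic peptides is enough to map the minimal epitope.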

In contrast to patient 1247, CD8+ T cells from patient 1039 that were induced with peptide pools spanning the autologous KRASG12D, TP53Y107D and PIK3CAQ546K CRC mutations did not recognise autologous cancer cells, suggesting that the proteins encoded by these three mutated genes could not generate naturally processed neoepitopes (data not shown).

Together, these findings indicated that the SMAD4V370A somatic mutation expressed by the colon cancer 1247 generated a naturally processed neoepitope recognised by autologous CD8+ T cells on differentiated and CSC cultures.

The mutated SMAD4-1 epitope is immunogenic for autologous CD8+ T cells

To investigate the spontaneous immunogenicity of the SMAD4V370A somatic mutation, we used T cell lines obtained by stimulating PBMCs from patient 1247 with autologous CSCs in MLTC,28 performed by neutralising the immunosuppressive interleukin (IL) 4 produced by CSCs.28 The MLTC contained CD4+ and CD8+ T cells that specifically recognised the autologous CSC cultures (figure 2A). CD8+ T cells, enriched from these MLTCs, were also specifically stimulated by T2 cells loaded with the SMAD4m-1 but not with the SMAD4-1 WT peptide (figure 2B), suggesting that T cell precursors specific for the SMAD4V370A somatic mutation had been naturally expanded by autologous CSCs in the MLTC.

Figure 2

The SMAD4V370A mutation of 1247 colorectal cancer (CRC) is spontaneously immunogenic for autologous CD8+ T cells. (A) Mixed lymphocyte-tumour cell culture (MLTC) from patient 1247, induced by autologous cancer stem/initiating cell (CSC) cultures, assayed by IFNγ ELISPOT for the recognition of the autologous CSCs, ±anti-class I mAb or anti-HLA-DR mAb. (B) CD8+ T cells, enriched from the previous MLTC, assayed by IFNγ ELISPOT for the recognition of the SMAD4m-1 or SMAD4-1 WT peptides presented by T2 cells. (C) Primary CD8+ T cells purified from patient 1247, stimulated twice with the SMAD4m-1 peptide in the presence of autologous PBMCs in two series of 10 wells, each containing 5×10^4 (left panel) or 10^5 (right panel) cells/well for a total of 1.5×10^6 precursors, and assayed by IFNγ ELISPOT for the recognition of the 1247 CRC cells. IFNγ ELISPOT data are represented as triplicate mean±SD, subtracted of the background spots produced by T cells alone, and are representative of three (A) and two (B) independent experiments performed. *p≤0.05; **p≤0.01; ***p≤0.001.

To assess the frequency of CD8+ T cell precursors specific for the SMAD4V370A epitope in the PBMCs of patient 1247 and in two HLA-A*02:01-matched HDs, a total of 1.5×10^6 CD8+ T cells from each individual were distributed in two series of 10 wells, containing 5×10^4 or 10^5 cells/well, respectively (figure 2C,D), and stimulated twice with the SMAD4m-1 peptide in the same wells. The cells contained in each well were then independently assayed for the recognition of differentiated 1247 CRC cells. Tumour-specific CD8+ T cells were detectable in this condition only in the T cell cultures derived from the patient (figure 2C,D), with an estimated SMAD4V370A-specific precursor frequency of about 1 in 2.5×10^5 CD8+ T cells. No specific T cell response could be detected in the cultures established from the HDs, suggesting that, in healthy individuals, SMAD4V370A-reactive T cell precursors were either not expanded or present at a frequency below that determined in patient 1247.
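
The arithmetic behind these figures can be checked with a short Python sketch. It assumes the simplest possible estimate, in which each responding well was seeded by a single specific precursor; the text does not state how the frequency was actually derived, and the number of responding wells computed below is an inference from the reported values, not a reported result.

```python
def total_cells(plates):
    """Total number of seeded cells; `plates` is a list of
    (number_of_wells, cells_per_well) pairs."""
    return sum(wells * cells for wells, cells in plates)


def precursor_frequency(responding_wells, plates):
    """Simplest estimate: one specific precursor per responding well,
    divided by the total number of cells seeded."""
    return responding_wells / total_cells(plates)


# Plating scheme given in the text: 10 wells at 5x10^4 and 10 wells at 10^5 cells.
plating = [(10, 5 * 10**4), (10, 10**5)]
seeded = total_cells(plating)

# Number of responding wells implied by a frequency of 1 in 2.5x10^5.
implied_wells = seeded / 2.5e5
```

The two series add up to the 1.5×10^6 cells stated above, and a frequency of 1 in 2.5×10^5 corresponds to about six responding wells under this single-precursor assumption.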

Figure 3

The 1869 colorectal cancer (CRC) presents neoepitopes from mutated genes to allogeneic CD4+ and CD8+ T cells. (A) CD4+ phenotype of a T cell line from a HLA-DR*β4 01:03 healthy donor (HD) expanded with the 30 aa long APCE1317KfsX4 synthetic peptide. (B) T cells from the HLA-DR*β4 01:03 HD assayed by IFNγ ELISPOT for the recognition of 1869 B lymphoblastoid cell lines (LCLs) loaded with the APCE1317KfsX4 or the WT peptide, ±anti-HLA-DR mAb. Similar results were obtained with CD4+ T cell lines elicited by the APCE1317KfsX4 peptide from HLA-DR*β1 13:01 HDs. (C–D) APCE1317KfsX4-specific CD4+ T cells from either HLA-DR*β4 01:03 or HLA-DR*β1 13:01 donor assayed by IFNγ ELISPOT for the recognition of LCL cells homozygous for either HLA-DR*β4 01:03 (C) or HLA-DR*β1 13:01 (D), or negative for these alleles, loaded with the APCE1317KfsX4 peptide, ±anti-HLA-DR mAb. (F–G) APCE1317KfsX4-specific CD4+ T cells induced from HLA-DR*β4 01:03- or HLA-DR*β1 13:01-matched HDs, assayed by IFNγ ELISPOT for the recognition of 1869 CRC cells, ±anti-HLA-DR mAbs. (H) PCR typing for the expression of the SMAD4R361C mutation, or the corresponding SMAD4 WT sequence, in the HLA-B*35:01+ cancer cell lines used for the recognition assay. The 1869 and 1247 cell lines are positive and negative controls of the PCR, respectively. (I) CD8+ T cells elicited from HLA-B*35:01 HD with the peptide pool encompassing the SMAD4R361C mutation assayed by IFNγ ELISPOT for the recognition of 1869 CRC cells, or of HLA-B*35:01+ and SMAD4R361C-negative kidney cancer cell line MR196 and melanoma cell lines M47, M131 and Mel15765, ±anti-class I mAb. IFNγ ELISPOT data are represented as triplicate mean±SD, subtracted of the background spots produced by T cells alone, and are representative of three independent experiments. *p≤0.05; **p≤0.01; ***p≤0.001; ns, not statistically significant.

Hence, the SMAD4V370A somatic mutation generates a neoepitope that is naturally immunogenic for autologous T cells, resulting in the expansion of specific CD8+ T cell precursors in vivo.

Induction of CD4+ and CD8+ T cells from HLA-matched HDs specific for somatically mutated CRC gene products

We next investigated the recognition of the potential unique neoepitopes derived from the somatically mutated gene products expressed by the 1869 CRC. The 1869 CRC cell lines expressed class I and HLA-DR upon IFNγ pretreatment (not shown) and could be tested for recognition by the peptide-induced T cell lines. We first sought to specifically investigate the CD4+ T cell response against the mutated APCE1317KfsX4 gene product expressed by the 1869 CRC. Because there were not enough autologous T cells, the PBMCs of two HDs sharing the HLA-DR*β4 01:03 and HLA-DR*β1 13:01 alleles, respectively, with the 1869 CRC were stimulated with a 30 aa long peptide (APCmut) incorporating at the C-terminus the three substituted residues encoded by the frameshift mutated APCE1317KfsX4 gene (see online supplementary table S1). Such long peptides, at least in vitro, are taken up by antigen presenting cells contained in PBMCs, processed and presented mainly in class II, selectively expanding CD4+ T cells. The resulting CD4+ T cell lines (figure 3A) from either donor recognised the APC-mut peptide, but not the APC-WT one, loaded on LCL cells from the 1869 patient (figure 3B). Each CD4+ T cell line was also specifically stimulated by LCL cell lines either homozygous for HLA-DR*β4 01:03 (figure 3C) or for HLA-DR*β1 13:01 (figure 3D), but not by LCL cells homozygous for different HLA-DR alleles, confirming their respective HLA-DR restrictions. The two CD4+ T cell lines were also specifically stimulated in an HLA-DR-restricted manner by the 1869 cancer cells (figure 3F,G), indicating the presentation of naturally processed class II neoepitopes containing the APC mutation by CRC cells.

We finally induced T cell lines from a different HD matched for HLA-B*35:01 with the 1869 cancer cells, using a pool of synthetic 15-mer peptides encompassing the SMAD4R361C mutation (see online supplementary table S1). The induced CD8+ T cells specifically recognised the target 1869 CRC cell line, which expresses the SMAD4R361C mutation but not the SMAD4 WT sequence, whereas they did not recognise HLA-B*35:01+ kidney cancer (MR196) and melanoma (M47, M131, Mel15765) cell lines, which were all negative for the SMAD4R361C mutation and positive for the corresponding SMAD4 WT sequence (figure 3H,I). Hence, the induced CD8+ T cells were specific for a naturally processed neoepitope derived from the SMAD4R361C mutation uniquely expressed and presented by the 1869 CRC cells.


Recent publications have highlighted the power of combining next-generation sequencing of cancer DNA with reverse immunology to identify T cell epitopes from unique tumour antigens involved in the control of mouse (melanoma and chemically induced sarcomas) and human (melanoma, cholangiocarcinoma and CLL) cancers.12,18,20 Our study extends those findings in several ways. First, we have focused on CRC, a frequent epithelial cancer and a major cause of cancer death that had never been investigated for the expression of unique tumour antigens by this approach. For a proof of concept, we generated primary tumour cell lines to assess direct tumour recognition by mutated peptide-specific T cells, implying the natural processing and presentation of the somatically mutated epitope. Second, we sequenced the expressed genome (cDNA) from CRC cells, which confirms that the mutated genes are actually expressed by the malignant cells. This approach might well be replaced by advanced RNA sequencing techniques,32 when considering a clinical setting in which primary tumour samples must be directly sequenced to speed up the possible therapeutic application of neoepitope-based vaccines. Third, we elicited tumour-specific CD8+ and CD4+ T cell responses in vitro using small pools of long peptides (≥15 aa) encompassing the somatic mutation, with no need for precise HLA allele-specific epitope prediction and the extensive synthesis and testing of the defined peptides. Finally, and importantly, because a critical issue concerning cancer therapy is the ability to target the CSC component to actually eradicate the tumour,33,34 we were able to show that CD8+ T cells induced with a mutated SMAD4 peptide recognised autologous CSCs expressing the same mutation.

Although we did not sequence the whole expressed genome, the two pairs of CSCs and differentiated CRCs derived from each common surgical sample shared the same somatic cDNA mutations, implying the possibility that T cells directed against the corresponding mutated epitopes might effectively target the tumour initiating compartment in vivo for therapeutic purposes. In this regard, it has been shown that CSCs from CRC are endowed with immunosuppressive mechanisms that inhibit the induction of T cell responses.28,35 However, effector T cells can efficiently recognise CSCs from CRCs, suggesting that, once they are activated by strategies that counteract such suppression in vitro or in vivo, T cells specific for unique tumour antigens might be able to therapeutically target the CSC component in vivo.

On average, we find that 4 of 20 sequenced genes are somatically mutated in CRCs, representing potential T cell antigens. Even though parsimonious, our sequencing approach proved efficacious in identifying antigenic somatic cancer mutations. Of the first three CRCs tested, two were found to process and present somatically mutated epitopes to CD4+ and CD8+ T cells. However, in one CRC sample (1039), the missense mutations found in three genes did not generate antigenic epitopes recognised by autologous CD8+ T cells. It is conceivable that sequencing the whole RNA from each CRC would significantly increase the likelihood of identifying somatically mutated tumour antigens in all patients.

We sequenced CAN-genes that are clearly drivers involved in the oncogenic transformation of colon epithelium.22 These genes and proteins are ideal targets of T cell immunotherapy because they are less likely to be lost by cancer immune escape mutants. Data obtained in mouse and human tumours, however, suggest that tumour-specific T cells recognise neoepitopes derived mostly from passenger rather than driver somatic mutations,5 ,12–18 implying a possible immune-escape mechanism. It is conceivable that a more extensive sequencing of additional expressed genes from our CRC samples would also have identified passenger mutations that generate immunogenic neoepitopes for autologous T cells.

The unique SMAD4V370A epitope identified in CRC 1247 displays a very low binding affinity for HLA-A*02:01, well above the 500 nM that is considered to be the threshold for productive epitope binding to MHC and presentation to cognate T cells. The mutation acquired in the 1247 CRC cells modifies the predicted putative MHC anchor of the epitope, but it does not increase the binding affinity for HLA compared with the WT sequence. Part of the very low binding affinity may be due to the fact that the peptides bear cysteine, which does not behave well in solution, in a position that generally has an appreciable influence on binding capacity. Nevertheless, we cannot readily explain the immunogenicity of the low affinity SMAD4V370A neoepitope. One possibility is that the new anchor residue introduced by the somatic SMAD4V370A mutation might generate a new agretope for HLA-A*02:01, which binds the cognate TCRs with increased C-terminal stability compared with the WT peptide, sufficient to lead to tolerance break and T cell activation. This possibility has been suggested by a recent study that identified 8/10 mutated neoepitopes from two chemically induced mouse sarcomas that displayed extremely low affinity (over 500 nM), yet were strongly immunogenic and able to induce potent T cell-dependent tumour rejection upon immunisation in vivo.17 The mutations found in most of the immunogenic neoepitopes identified in the mouse sarcomas modify their C-terminal anchor residues.17 In support of this hypothesis, we indeed find that the SMAD4V370A epitope is immunogenic in vivo, as suggested by the increased frequency of specific T cell precursors found in the patient, compared with the nearly undetectable frequency of T cell precursors specific for the same epitope found in HLA-A*02:01-matched HD.

Of the two patients in whom we have investigated autologous peripheral T cell responses, one presented expanded circulating T cells specific for the SMAD4V370A tumour mutation and is still alive almost 5 years after surgery. In contrast, the other patient, in whom no T cell responses specific for unique antigens were detected, developed fatal cancer progression 6 months post surgery. The possibility that the T cell response specific for the unique antigens may have contributed to the postsurgery survival warrants future investigations in more patients with CRC, including the analysis of the tumour infiltrating lymphocytes and of the tumour immune microenvironment. Recent clinical results, reporting that mismatch repair-deficient CRCs are strikingly more responsive to anti-PD-1 mAb (pembrolizumab) therapy than mismatch repair-proficient tumours, suggest extending such investigations to patients treated with immune checkpoint blockade.36 This difference, in fact, correlated with a greater mean number of somatic mutations found in mismatch repair-deficient (1782) compared with mismatch repair-proficient (73) CRCs, which resulted in the prediction of 20 times more theoretical neoepitopes available for potential T cell responses in the former tumours.36

Collectively, our study shows a new strategy for the quantitative identification of mutated neoepitopes in CRC. Because the progression of this tumour is critically controlled by the immune system, particularly by the degree and quality of CD8+ and CD4+ T cell infiltration within the tumour tissue,37 ,38 this approach can be easily scaled up to thoroughly characterise the protective immune responses in patients undergoing surgery, as well as to define neoepitopes of tumour-specific antigens for effective cancer vaccines.


The authors thank the members of the San Raffaele programme of Immunology and Bio-Immunotherapy of Cancer for suggestions and criticisms throughout the study.


  • DM and CM contributed equally.

  • GP, GC and PD are senior coauthors

  • Contributors DM, CM, MCR, FC designed and performed experiments, analysed data and wrote the manuscript; RB, MS, GDB performed and analysed deep sequencing; JS, AS performed peptide binding studies and epitope prediction analysis; AG, RL synthesised peptides; MB, LG, LB, EO, LA took care of the clinical cases and pathology; EZ, KF provided HLA-matched healthy donors; GM, NF, AA typed tumour samples and provided HLA-matched healthy donors; GC, GP, PD envisaged the study, supervised experiments and wrote the manuscript.

  • Funding Study supported by Associazione Italiana per la Ricerca sul Cancro grant AIRC-IG11524.

  • Competing interests None declared.

  • Ethics approval Institutional Review Board of San Raffaele Scientific Institute, Milano, Italy (study CAN-GENES 01).

  • Provenance and peer review Not commissioned; externally peer reviewed.
