Objective Colorectal cancer is typically classified into proximal colon, distal colon and rectal cancer. Tumour genetic and epigenetic features differ by tumour location. Considering a possible role of bowel contents (including microbiome) in carcinogenesis, this study hypothesised that tumour molecular features might gradually change along bowel subsites, rather than change abruptly at splenic flexure.
Design Utilising 1443 colorectal cancers in two US nationwide prospective cohort studies, the frequencies of molecular features (CpG island methylator phenotype (CIMP), microsatellite instability (MSI), LINE-1 methylation and BRAF, KRAS and PIK3CA mutations) were examined along bowel subsites (rectum, rectosigmoid junction, sigmoid, descending colon, splenic flexure, transverse colon, hepatic flexure, ascending colon and caecum). The linearity and non-linearity of molecular relations along subsites were statistically tested by multivariate logistic or linear regression analysis.
Results The frequencies of CIMP-high, MSI-high and BRAF mutations gradually increased from the rectum (<2.3%) to ascending colon (36–40%), followed by falls in the caecum (12–22%). By linearity tests, these molecular relations were significantly linear from rectum to ascending colon (p<0.0001), and there was little evidence of non-linearity (p>0.09). Caecal cancers exhibited the highest frequency of KRAS mutations (52% vs 27–35% in other sites; p<0.0001).
Conclusions The frequencies of CIMP-high, MSI-high and BRAF mutations in cancer increased gradually along colorectum subsites from the rectum to ascending colon. These novel data challenge the common conception of discrete molecular features of proximal versus distal colorectal cancers, and have a substantial impact on clinical, translational and epidemiology research, which has typically been performed with the dichotomous classification of proximal versus distal tumours.
- cancer epidemiology
- cancer prevention
- colon cancer
- colorectal cancer
- colorectal pathology
- genomic instability
- non-steroidal anti-inflammatory drugs
- 2,4,6-trinitrobenzene sulphonic acid
Significance of this study
What is already known about this subject?
Colorectal cancer is typically classified into rectal, distal colon and proximal colon cancers.
Proximal colon cancers and distal cancers differ in clinical, pathological and molecular features.
Although it remains uncertain whether colorectal cancer molecular features change abruptly at splenic flexure, some investigators believe that there are distinct molecular features of proximal tumours and distal tumours.
What are the new findings?
The frequencies of CIMP-high, MSI-high and BRAF mutations in colorectal cancer increase gradually (statistically linearly) along the bowel from the rectum to ascending colon, rather than change abruptly at splenic flexure.
Caecal cancers represent a unique subtype characterised by a high frequency of KRAS mutations, and caecal cancers do not follow the linearity trend in terms of the frequencies of CIMP-high, MSI-high and BRAF mutations.
Mean tumour LINE-1 methylation levels show non-linear changes along the bowel subsites, and do not show an abrupt change at splenic flexure.
How might it impact on clinical practice in the foreseeable future?
Over the past decades, most clinical, translational and epidemiological studies have gathered and published colorectal tumour location data as proximal colon versus distal colon (vs rectum). Future studies on colorectal neoplastic diseases should include information on detailed bowel subsites (beyond proximal colon, distal colon and rectum), which will further improve our understanding of the mechanisms of colorectal carcinogenesis.
Over the past decades, clinical, pathological or epidemiological investigations into the large bowel have semi-automatically divided the colorectum into three compartments, namely, the rectum, distal colon and proximal colon.1–3 In 1990, Bufill1 proposed the existence of two distinct genetic categories of colorectal cancers according to tumour location in the proximal or distal segment of the large bowel, divided at splenic flexure. This concept of distinct molecular features of proximal cancer versus distal cancer has repeatedly been discussed.2 3
Colorectal cancers encompass a heterogeneous group of diseases with complex genetic and epigenetic alterations.4 Molecular classification is thus increasingly important for clinical decision making.5 Microsatellite instability (MSI) represents a distinct form of genomic instability.5 6 The CpG island methylator phenotype (CIMP) is a distinct form of epigenomic instability,7–17 which causes most sporadic MSI-high colorectal cancers through epigenetic inactivation of MLH1.18–21 Independent of MSI, CIMP-high colorectal cancer is associated with proximal tumour location, old age of onset, female sex and the BRAF mutation.18 19
Accumulating evidence suggests that proximal colon cancer and distal colon cancer differ in various molecular features including CIMP and MSI.22–25 However, it remains uncertain whether the tumour molecular features change abruptly at splenic flexure. Considering a possible role of bowel contents (including microbiome) in colorectal carcinogenesis,26 we hypothesised that tumour molecular characteristics might change gradually along the large bowel. This hypothesis is consistent with the reported differences between proximal and distal cancers, because a gradual change along the large bowel would also produce a difference when tumours are grouped into only two segments.
To test the hypothesis, we conducted this study utilising a database of 1443 colorectal cancers in two prospective cohort studies. We examined the frequencies of relevant molecular features along the bowel subsites (rectum, rectosigmoid, sigmoid colon, descending colon, splenic flexure, transverse colon, hepatic flexure, ascending colon and caecum), and statistically assessed the linearity and non-linearity of molecular relations along the bowel subsites. Our novel findings of gradual increases of CIMP-high, MSI-high and BRAF mutation from the rectum to ascending colon challenge the common conception of discrete dichotomy of tumour molecular features in proximal colon versus distal colorectum.
Materials and methods
We utilised the database of two independent, prospective cohort studies; the Nurses' Health Study (N=121 701 women followed since 1976) and the Health Professionals Follow-up Study (N=51 529 men followed since 1986).27 28 Every 2 years, participants have been sent follow-up questionnaires to update information on potential risk factors and to identify newly diagnosed cancers in themselves and their first-degree relatives. In addition, we searched the National Death Index for those who died of colorectal cancer. Our study physicians reviewed medical records and obtained information on disease stage and tumour location (rectum, rectosigmoid, sigmoid colon, descending colon, splenic flexure, transverse colon, hepatic flexure, ascending colon and caecum). We collected paraffin-embedded tissue blocks from hospitals where patients underwent tumour resections. We collected diagnostic biopsy specimens for rectal cancer patients who received preoperative treatment, in order to avoid artifacts or bias introduced by treatment. Based on the availability of adequate tissue specimens, 1443 colorectal cancer cases (diagnosed up to 2006) were included (tables 1 and 2). Among our cohort studies, there was no significant difference in demographic features between cases with tissue available and those without available tissue.27 This current study represents a new analysis of tumour molecular features along the detailed bowel subsites on the existing colorectal cancer database that has previously been characterised for CIMP, MSI, LINE-1 methylation and BRAF and KRAS mutations.29 30 Informed consent was obtained from all study subjects. This study was approved by the Harvard School of Public Health and Brigham and Women's Hospital institutional review boards.
Assessment of physical activity
Leisure-time physical activity has been assessed every 2 years. Subjects reported duration of participation (ranging from 0 to 11 h or more per week) on walking (along with usual pace), jogging, running, bicycling, swimming laps, racket sports, other aerobic exercises, lower intensity exercise (yoga, toning, stretching), or other vigorous activities. Each activity on the questionnaire was assigned a metabolic equivalent task (MET) score. One MET is the energy expenditure for sitting quietly. MET scores are defined as the ratio of the metabolic rate associated with specific activities divided by the resting metabolic rate. The values from the individual activities were summed for a total MET-hours per week score.
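The MET-hours calculation described above can be sketched as follows. This is a minimal illustration, not the cohorts' actual scoring code: the per-activity MET values in `MET_SCORES` are hypothetical placeholders (the article does not list the scores it used), and the function names are ours. The 18 MET-hours/week cutoff is the one used for covariate adjustment in the regression models.

```python
# Hypothetical MET scores per activity (assumed for illustration only;
# the article does not report the values assigned on the questionnaire).
MET_SCORES = {
    "walking": 3.0,
    "jogging": 7.0,
    "running": 12.0,
    "bicycling": 7.0,
}


def total_met_hours_per_week(hours_per_week, met_scores=MET_SCORES):
    """Sum (MET score x weekly hours) over all reported activities."""
    return sum(met_scores[activity] * hours
               for activity, hours in hours_per_week.items())


def activity_category(met_hours, cutoff=18.0):
    """Dichotomise total activity at the cutoff used in the regression models."""
    return "<18" if met_hours < cutoff else ">=18"


# Example: 2 h/week of walking plus 1 h/week of jogging.
example_total = total_met_hours_per_week({"walking": 2.0, "jogging": 1.0})
print(example_total, activity_category(example_total))
```

Keeping the scores in a lookup table makes it easy to swap in the real questionnaire values without touching the summation logic.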
Assessment of cigarette smoking and alcohol consumption
Cigarette smoking has been assessed every 2 years in both cohorts. Alcohol consumption was calculated as the sum of the values for three types of beverages: beer, wine and spirits. We assumed an ethanol content of 13.1 g for a 12 oz (about 355 ml) can or bottle of beer, 11.0 g for a 4 oz (about 118 ml) glass of wine and 14.0 g for a standard portion of spirits.
Tissue sections from all colorectal cancer cases were reviewed by a pathologist (SO) unaware of other data. Tumour differentiation was categorised as well–moderate versus poor (>50% vs ≤50% glandular area). The extent of mucinous and signet ring cell components was recorded.
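As a rough sketch of the alcohol calculation, daily ethanol intake can be obtained by weighting servings of each beverage by the ethanol content stated above. The input format (average servings per day) and the function names are assumptions made for illustration; the category boundaries (none, <15, ≥15 g/day) are those used for covariate adjustment in the regression models.

```python
# Ethanol content per serving in grams, as stated in the text.
ETHANOL_G_PER_SERVING = {"beer": 13.1, "wine": 11.0, "spirits": 14.0}


def alcohol_g_per_day(servings_per_day):
    """Sum ethanol (g/day) over beer, wine and spirits.

    servings_per_day maps beverage name -> average servings per day
    (an assumed input format; the questionnaire layout is not described).
    """
    return sum(ETHANOL_G_PER_SERVING[beverage] * servings
               for beverage, servings in servings_per_day.items())


def alcohol_category(g_per_day):
    """Three-level category used for covariate adjustment."""
    if g_per_day == 0:
        return "none"
    return "<15 g/day" if g_per_day < 15 else ">=15 g/day"


# Example: one beer and half a glass of wine per day.
intake = alcohol_g_per_day({"beer": 1, "wine": 0.5})
print(round(intake, 1), alcohol_category(intake))
```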
Sequencing of BRAF, KRAS and PIK3CA, and MSI analysis
DNA was extracted from tumour tissue, and PCR and pyrosequencing targeted for BRAF (codon 600),31 KRAS (codons 12 and 13)32 and PIK3CA (exons 9 and 20) were performed as previously described.33 MSI analysis was performed using 10 microsatellite markers (BAT25, BAT26, BAT40, D2S123, D5S346, D17S250, D18S55, D18S56, D18S67 and D18S487).30 MSI-high was defined as the presence of instability in 30% or more of the markers. MSI-low tumours (1–29% unstable markers) were grouped with microsatellite stable (MSS) tumours (no unstable markers) because the two groups showed similar features.
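The MSI rule above amounts to a simple threshold on the fraction of unstable markers. The sketch below assumes that each marker has already been scored as stable or unstable and that the denominator is the number of markers scored; how markers without an interpretable result were handled is not stated in the article, so that part is an assumption, as are the function and variable names.

```python
MSI_MARKERS = ("BAT25", "BAT26", "BAT40", "D2S123", "D5S346",
               "D17S250", "D18S55", "D18S56", "D18S67", "D18S487")


def classify_msi(marker_unstable):
    """Classify a tumour from per-marker instability calls.

    marker_unstable maps marker name -> True (unstable) / False (stable)
    for the markers that were scored (assumed denominator).
    """
    n_total = len(marker_unstable)
    if n_total == 0:
        raise ValueError("no markers were scored")
    n_unstable = sum(1 for unstable in marker_unstable.values() if unstable)
    # MSI-high: instability in 30% or more of markers (integer comparison
    # avoids floating-point edge cases at exactly 30%).
    if 10 * n_unstable >= 3 * n_total:
        return "MSI-high"
    # MSI-low (1-29%) is grouped with MSS, as in the text.
    return "MSS"


# Example: 3 of 10 markers unstable sits exactly on the threshold.
calls = {marker: (i < 3) for i, marker in enumerate(MSI_MARKERS)}
print(classify_msi(calls))
```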
Methylation analyses for CpG islands and LINE-1
Using real-time PCR (MethyLight)34 on bisulfite-treated DNA,35 we quantified DNA methylation in eight CIMP-specific promoters (CACNA1G, CDKN2A (p16), CRABP1, IGF2, MLH1, NEUROG1, RUNX3 and SOCS1).9 18 36 CIMP-high was defined as the presence of six/eight or more methylated promoters, and CIMP-low/zero as zero/eight to five/eight methylated promoters, according to the previously established criteria.18 36 In order to quantify methylation levels accurately in LINE-1 repetitive elements we utilised pyrosequencing, as previously described.37 38
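The CIMP definition can likewise be written as a count over the eight-promoter panel. This sketch takes the set of promoters already called methylated as input; the per-promoter MethyLight cutoff used to make that call is not given in this article, so it is deliberately left outside the function. The names `CIMP_PANEL` and `classify_cimp` are ours.

```python
CIMP_PANEL = ("CACNA1G", "CDKN2A", "CRABP1", "IGF2",
              "MLH1", "NEUROG1", "RUNX3", "SOCS1")


def classify_cimp(methylated_promoters):
    """Return the CIMP category from the set of methylated panel promoters.

    CIMP-high: 6 or more of 8 promoters methylated.
    CIMP-low/zero: 0 to 5 of 8 promoters methylated.
    Promoters outside the panel are ignored.
    """
    count = sum(1 for promoter in CIMP_PANEL
                if promoter in methylated_promoters)
    return "CIMP-high" if count >= 6 else "CIMP-low/zero"


# Example: six methylated promoters meets the CIMP-high threshold.
print(classify_cimp({"CACNA1G", "CDKN2A", "CRABP1",
                     "IGF2", "MLH1", "NEUROG1"}))
```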
Analysis of gene expression
RNA was extracted and gene expression profiling was performed according to the complementary DNA-mediated annealing, selection, extension and ligation assay (Illumina, San Diego, California, USA), as previously described.39
For all statistical analyses, we used SAS software (V.9.1.3). All p values were two-sided. For categorical data, the χ2 test was performed. One-way analysis of variance was used to compare mean age or mean LINE-1 methylation level across bowel subsites. The Kruskal–Wallis test was used to compare the ABCB1 expression levels across bowel subsites.
To test the linearity and non-linearity of the relationship between tumour location and each molecular feature along bowel subsites, multivariate logistic regression analysis (or linear regression analysis for LINE-1 methylation level) was performed. First, a numerical subsite location variable that represented the average distance (cm) from the anal verge to each subsite was created, utilising recent CT colonography data.40 In the logistic or linear regression model with a tumour molecular variable as the outcome variable, a significant p value by the Wald test on the bowel subsite variable indicated a linear trend of the molecular variable along the bowel subsites, although this test alone could not exclude a non-linear relationship. To test non-linearity of the relationship along the bowel subsites, we used a likelihood ratio test (LRT) comparing the model with squared and/or cubic subsite variables with the model without squared or cubic subsite variables. Given a significant p value by the Wald test (mentioned above), a non-significant LRT p value would support a linear relationship excluding non-linearity, while a significant LRT p value would indicate the presence of non-linearity. All logistic and linear regression models were adjusted for age (continuous), sex, year of diagnosis (continuous), family history of colorectal cancer in any first-degree relative (present vs absent), body mass index (<30 vs ≥30 kg/m2), physical activity (<18 vs ≥18 MET-hours/week), smoking (never vs former/current smokers) and alcohol consumption (no vs <15 vs ≥15 g/day). For cases with missing information in any of the covariates (family history of colorectal cancer (0.9%), body mass index (0.8%), physical activity (5.1%), smoking (1.1%)), we included those cases in the majority category of the given covariate to avoid overfitting. We confirmed that excluding cases with missing information in any of the covariates did not substantially alter the results (data not shown).
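The decision logic of the linearity assessment can be illustrated with a small sketch. It is not the SAS code used in the study: it assumes the two nested models have already been fitted elsewhere and that their maximised log-likelihoods are available, and it computes the LRT p value from the chi-square distribution using closed forms that hold only for 1 or 2 degrees of freedom (one or two added polynomial terms). The 0.05 significance level and all function names are assumptions for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented for df 1 or 2 only")


def lrt_p_value(loglik_linear, loglik_poly, df):
    """LRT comparing the linear-only model with the model that adds
    squared and/or cubic subsite terms (df = number of added terms)."""
    statistic = max(0.0, 2.0 * (loglik_poly - loglik_linear))
    return chi2_sf(statistic, df)


def interpret(wald_p, lrt_p, alpha=0.05):
    """Combine the Wald test on the subsite term with the LRT.

    alpha = 0.05 is an assumed threshold, not one stated in the article.
    """
    if lrt_p < alpha:
        return "non-linear"
    if wald_p < alpha:
        return "linear"
    return "no evidence of association"


# Example with made-up log-likelihoods: adding squared and cubic terms
# improves the log-likelihood by 2.0, giving statistic 4.0 on 2 df.
p = lrt_p_value(-500.0, -498.0, df=2)
print(round(p, 4), interpret(wald_p=0.00001, lrt_p=p))
```

Separating the LRT from the Wald test mirrors the two-step reasoning in the text: the Wald test establishes a trend, and the LRT checks whether higher-order terms add anything beyond it.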
Colorectal cancer molecular features along bowel subsites
To assess the frequencies of various tumour molecular features along the bowel subsites (rectum, rectosigmoid, sigmoid colon, descending colon, splenic flexure, transverse colon, hepatic flexure, ascending colon and caecum), we examined the database of 1443 colorectal cancer cases (excluding appendiceal cancers) in the two prospective cohort studies. Table 3 and supplementary tables 1 and 2 (available online only) show the frequencies of various clinical, pathological or molecular features along the bowel subsites in our subject population. The frequencies of CIMP-high, MSI-high and BRAF mutation gradually increased from the rectum (<2.3%) to the ascending colon (36–40%; figure 1), supporting our hypothesis that these tumour molecular features might change gradually along the large bowel. There was no abrupt change at splenic flexure. Caecal cancers showed lower frequencies of CIMP-high, MSI-high and BRAF mutation (12–22%) than ascending colon cancers. Notably, caecal cancers showed a higher frequency of KRAS mutations (52%) than any other sites (27–35%; p<0.0001).
Although there was no striking pattern of PIK3CA mutation frequency along bowel subsites, it was generally low in the rectum and rectosigmoid (10–11%) and higher proximally (p=0.0016).
With regard to the tumour LINE-1 methylation level (mean±SD), it gradually decreased from the rectum (63.2±8.5) to descending colon (60.2±11.7), and then increased from descending colon to ascending colon (64.7±9.4; p=0.0003). Again, there was no abrupt change at splenic flexure.
There was no significant relationship between bowel subsites and ABCB1 expression level (p=0.19).
Considering the importance of molecular classification based on combined CIMP and MSI status,41 we also examined the frequency of each CIMP/MSI subtype along bowel subsites (figure 2). The frequency of CIMP-high MSI-high tumours increased gradually along the bowel subsites from the rectum to ascending colon, while the frequency of CIMP-low/zero MSS tumours decreased from the rectum to ascending colon. There was no abrupt change at splenic flexure.
Assessment of linearity of tumour location–molecular relationship
We assessed the linearity of tumour location–molecular relationship along the bowel subsites by a multivariate logistic regression model (or linear regression model for LINE-1 methylation) (table 4).
In our multivariate analysis strategy, we could assess whether data presented in table 3 and figure 1 were independent of other variables. We used bowel subsite as a predictor (independent) variable, and a molecular feature as an outcome (dependent) variable. When we assessed the relationship between subsite (rectum to ascending colon) and CIMP, bowel subsite was significantly linearly associated with CIMP-high (p<0.0001). To assess non-linearity, we performed LRT comparing a model with squared and/or cubic subsite variable(s) with a model without squared or cubic variables. As a result, LRT yielded p>0.09, excluding a non-linear relationship and supporting a linear relationship of bowel subsites with CIMP-high.
When we assessed the relationship between subsite (rectum to ascending colon) and MSI (or BRAF mutation) by logistic regression models (table 4), the results were similar to those on the relationship between bowel subsite and CIMP. Bowel subsite of the tumour (from rectum to ascending colon) was significantly linearly associated with MSI-high or BRAF mutation (p<0.0001). In addition, bowel subsite was also linearly associated with PIK3CA mutation (p=0.0034). When assessing non-linearity, LRT comparing a model with squared and/or cubic subsite variable(s) with a model without squared or cubic variables yielded non-significant p values (p>0.19), excluding non-linearity and supporting a linear relationship between subsite and MSI (or BRAF mutation or PIK3CA mutation).
To exclude a potential influence of differential selection bias as a result of preoperative treatment for rectal cancers, we excluded cancers in the rectum and rectosigmoid and repeated the linearity test. Bowel subsite of the tumour (sigmoid colon to ascending colon) was significantly linearly associated with CIMP-high, MSI-high, or BRAF mutation (p<0.0001), but not with PIK3CA mutation (p=0.13), and there was no evidence for non-linearity (LRT p>0.05).
We performed this study to test the hypothesis that molecular features of colorectal cancer change gradually along bowel subsites, rather than change abruptly at splenic flexure. Accumulating evidence suggests that proximal colon cancers differ in clinical, pathological and molecular features from distal cancers.22–25 However, it is still uncertain whether those features change abruptly at splenic flexure. Utilising the tumour database in the two prospective cohort studies, our current study is unique in examining tumour molecular features along the detailed bowel subsites (rectum, rectosigmoid, sigmoid colon, descending colon, splenic flexure, transverse colon, hepatic flexure, ascending colon and caecum). Notably, we found that the frequencies of CIMP-high, MSI-high and BRAF mutation increased (statistically) linearly along the bowel from the rectum to ascending colon. These data support our hypothesis of gradual changes in tumour molecular features along the bowel subsites, rather than abrupt changes at splenic flexure. Importantly, our hypothesis and data are not inconsistent with repeated observations of differences in molecular features (such as CIMP and MSI) between proximal colon cancer and distal colorectal cancer,22–25 so long as molecular features change along the bowel subsites.
Examining molecular changes in colorectal neoplasias is increasingly important for a better understanding of the carcinogenic process.42–44 In past decades, colorectal cancers were typically divided into three compartments: rectum, distal colon (sigmoid to splenic flexure) and proximal colon (transverse colon to caecum) in most clinical, pathological and epidemiological publications.1–3 As a result, our epidemiological, clinical and molecular pathological knowledge on colorectal cancer in detailed bowel subsites is currently deficient. Therefore, our data demonstrating gradual changes in tumour molecular features along the bowel may have considerable implications in clinical, epidemiological and pathological research. We would propose that future studies on colorectal neoplasia should include information on detailed bowel subsites (beyond the proximal colon, distal colon and rectum), which will further improve our understanding of the mechanisms of colorectal carcinogenesis.
Colorectal epithelial cells are constantly in contact with bowel contents, which may play a critical role in cellular transformation and tumour development and progression. Bowel contents (food debris, microbiome and bacterial fermentation products) and their interactions with host cells (epithelial and immune cells) may directly cause cellular molecular changes, or alternatively, may influence tumour progression differentially according to molecular features in preneoplastic or premalignant cells.45 46 In fact, bowel contents gradually change along the bowel subsites, and this fact may explain why tumour molecular features change gradually along the bowel subsites. In support of this hypothesis, studies on synchronous primary colorectal cancers have shown that CIMP-high (or MSI-high or BRAF-mutated) proximal cancer may co-exist with CIMP-negative (or MSS or BRAF wild type) distal cancer,47–50 and another study has shown a gradual gradient of CpG island methylation along normal bowel mucosa.51 Together with these data, our current study supports the role of bowel contents in predisposing colon epithelial cells to certain molecular insults. However, further investigations such as identifying the components of bowel contents or factors participating in the host–bacterial interactions are needed to understand how colorectal cancer develops.
The ATP-binding cassette (ABC) transporters constitute a large family of active transporter molecules, and play a role in the process of absorption. Because of the diverse substrates that can be transported, ABC proteins are found to be expressed in a number of specialised cell types.52 ABCB1 has been known to play a critical role in host–bacterial interactions in the gastrointestinal tract,53 and has been implicated in colorectal cancer development and progression.54 Potocnik et al55 have shown that ABCB1 gene polymorphisms may be associated with MSI-high cancer. Although our current study did not show that bowel subsite was significantly associated with ABCB1 expression in colorectal cancer, ABC transporters may play roles in modifying the risks of colorectal epithelial cells for neoplastic transformation/progression differentially according to the cellular molecular status.
Interestingly, our data indicate that caecal cancers have unique molecular features different from cancers in other subsites. The frequency of the KRAS mutation was highest in caecal cancers among all subsites. In addition, for the relations of bowel subsites with the frequencies of CIMP-high, MSI-high and BRAF mutation, caecal cancers did not follow the trend of an increase from the rectum to ascending colon. Kucherlapati et al56 have shown that loss of Rb1 in the gastrointestinal tract of Apc1638N mice promotes caecal tumour formation. Loss of RB1 (retinoblastoma protein) has been found specifically in caecal cancers.57 Taken together, caecal cancer may arise through somewhat unique carcinogenic mechanisms different from cancers in other subsites.
There are advantages in utilising the database of the two US nationwide prospective cohort studies to study molecular features of colorectal cancer along bowel subsites. Our large database readily enabled us to examine the frequencies of various molecular features in cancers in each bowel subsite with adequate statistical power, and to test linearity of the molecular relations along the bowel subsites while adjusting for patient and clinical characteristics. In addition, cohort participants who developed cancer resided throughout the USA, and thus were more representative of colorectal cancer cases in the general US population than highly selected patients from one or a few academic hospitals. These facts increase the generalisability of our study findings.
One limitation of our study is that a vast majority (94%) of our cohort participants were non-Hispanic Caucasians. It thus remains to be seen whether our findings can be applicable to other racial or ethnic groups. As another limitation, rectal cancer is commonly treated by preoperative radiation, which could cause bias or artifacts. Therefore, we collected pretreatment biopsy materials to overcome this issue. In addition, as a secondary analysis, we excluded rectal and rectosigmoid cancers, and we obtained similar findings of statistically linear increases in the frequencies of CIMP-high, MSI-high and BRAF mutation from the sigmoid colon to ascending colon.
In summary, our data suggest that the frequencies of CIMP-high, MSI-high and BRAF mutation in colorectal cancer do not change abruptly at splenic flexure. Instead, the frequencies of CIMP-high, MSI-high and BRAF mutation appear to increase gradually (statistically linearly) along the bowel from the rectum to ascending colon. In addition, caecal cancers represent a unique subtype characterised by a high frequency of KRAS mutations, and caecal cancers do not follow the linearity trend in terms of CIMP, MSI and BRAF mutation. Our novel data indicate that future studies on colorectal cancers or neoplasias should include information on detailed bowel subsites (beyond the proximal colon, distal colon and rectum), which will further improve our understanding of the mechanisms of colorectal carcinogenesis.
The authors would like to thank the participants and staff of the Nurses' Health Study and the Health Professionals Follow-Up Study, for their valuable contributions as well as the following state cancer registries for their help: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Nebraska, New Hampshire, New Jersey, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, Tennessee, Texas, Virginia, Washington, Wyoming.
Funding This work was supported by US National Institutes of Health grants (P01CA87969 (to SE Hankinson), P01CA55075 (to WC Willett), P50CA127003 (to CSF), R01CA151993 (to SO) and R01CA137178 (to ATC)); the Bennett Family Fund for Targeted Therapies Research; and the Entertainment Industry Foundation through National Colorectal Cancer Research Alliance. TM was supported by a fellowship grant from the Japan Society for the Promotion of Science. The content is solely the responsibility of the authors and does not necessarily represent the official views of NCI or NIH. Funding agencies did not have any role in the design of the study, the collection, analysis, or interpretation of the data, the decision to submit the manuscript for publication or the writing of the manuscript.
Competing interests None.
Ethics approval This study was approved by the Harvard School of Public Health and Brigham and Women's Hospital institutional review boards.
Patient consent Obtained.
Provenance and peer review Not commissioned; externally peer reviewed.