Original article
Race-dependent association of sulfidogenic bacteria with colorectal cancer
  1. Cemal Yazici1,
  2. Patricia G Wolf2,
  3. Hajwa Kim3,
  4. Tzu-Wen L Cross2,
  5. Karin Vermillion2,
  6. Timothy Carroll1,
  7. Gaius J Augustus4,
  8. Ece Mutlu5,
  9. Lisa Tussing-Humphreys6,
  10. Carol Braunschweig7,
  11. Rosa M Xicola8,
  12. Barbara Jung1,
  13. Xavier Llor8,
  14. Nathan A Ellis4,
  15. H Rex Gaskins2,9
  1. 1Division of Gastroenterology and Hepatology, University of Illinois College of Medicine at Chicago, Chicago, Illinois, USA
  2. 2Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
  3. 3Center for Clinical and Translational Science, University of Illinois at Chicago, Chicago, Illinois, USA
  4. 4Department of Cellular and Molecular Medicine, University of Arizona, Tucson, Arizona, USA
  5. 5Division of Digestive Diseases and Nutrition, Rush University Medical Center, Chicago, Illinois, USA
  6. 6Division of Academic Internal Medicine and Geriatrics, University of Illinois College of Medicine at Chicago, Chicago, Illinois, USA
  7. 7Department of Kinesiology and Nutrition, University of Illinois at Chicago, Chicago, Illinois, USA
  8. 8Section of Digestive Diseases, Yale University, New Haven, Connecticut, USA
  9. 9Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
  1. Correspondence to Dr H. Rex Gaskins, University of Illinois at Urbana-Champaign, 1206 W. Gregory Drive, Urbana, IL 61801, USA; hgaskins{at} OR Dr Nathan A Ellis, University of Arizona Cancer Center, 1515 N Campbell Ave, Tucson, AZ 85724, USA; naellis{at}


Objective Colorectal cancer (CRC) incidence is higher in African Americans (AAs) compared with non-Hispanic whites (NHWs). A diet high in animal protein and fat is an environmental risk factor for CRC development. The intestinal microbiota is postulated to modulate the effects of diet in promoting or preventing CRC. Hydrogen sulfide, produced by autochthonous sulfidogenic bacteria, triggers proinflammatory pathways and hyperproliferation, and is genotoxic. We hypothesised that sulfidogenic bacterial abundance in colonic mucosa may be an environmental CRC risk factor that distinguishes AA and NHW.

Design Colonic biopsies from uninvolved or healthy mucosa from CRC cases and tumour-free controls were collected prospectively from five medical centres in Chicago for association studies. Sulfidogenic bacterial abundance in uninvolved colonic mucosa of AA and NHW CRC cases was compared with normal mucosa of AA and NHW controls. In addition, 16S rDNA sequencing was performed in AA cases and controls. Correlations were examined among bacterial targets, race, disease status and dietary intake.

Results AAs harboured a greater abundance of sulfidogenic bacteria compared with NHWs regardless of disease status. Bilophila wadsworthia-specific dsrA was more abundant in AA cases than controls. Linear discriminant analysis of 16S rRNA gene sequences revealed five sulfidogenic genera that were more abundant in AA cases. Fat and protein intake and daily servings of meat were significantly higher in AAs compared with NHWs, and multiple dietary components correlated with a higher abundance of sulfidogenic bacteria.

Conclusions These results implicate sulfidogenic bacteria as a potential environmental risk factor contributing to CRC development in AAs.


Significance of this study

What is already known on this subject?

  • The colonic microbiome of patients with colorectal cancer is different from healthy subjects, but the impact of these differences remains uncertain.

  • A diet high in red meat and animal fat (a so-called Western diet) is associated with greater colorectal cancer risk.

  • African Americans have a significantly higher incidence of colorectal cancer compared with other Americans, possibly a consequence of environmental rather than genetic risk factors. Diet may play an important role in the higher incidence of colorectal cancer in African Americans.

  • African Americans are diagnosed with colorectal cancer at earlier ages and have a higher proportion of proximal colorectal cancers compared with non-Hispanic whites.

  • Hydrogen sulfide, which is produced by sulfidogenic bacteria, triggers proinflammatory and hyperproliferative pathways, and it is genotoxic.

  • People consuming a Western diet have a higher abundance of both primary and toxic secondary bile acids. Taurine-conjugated bile acids may serve as a substrate for the sulfidogenic bacterium Bilophila wadsworthia.

What are the new findings?

  • The abundance of sulfidogenic bacteria is higher in colonic mucosa of African Americans compared with non-Hispanic whites, regardless of disease status.

  • The sulfidogenic bacteria, Bilophila wadsworthia and Pyramidobacter spp., are significantly more abundant in African American colorectal cancer cases compared with African American controls.

  • Abundance of sulfidogenic bacteria correlated positively with the components of a diet high in fat and animal protein, and negatively with the servings of dairy and calcium.

How might it impact on clinical practice in the foreseeable future?

  • This study reflects health disparities and dietary differences in the African American compared with the non-Hispanic white population in Chicago. Practice aimed to make a healthy diet accessible for this population may have a significant effect on long-term health outcomes.

  • Dietary and pharmaceutical interventions focused on reducing sulfidogenic bacteria and secondary bile acids, or supporting fermentative modes of metabolism by sulfidogenic bacteria, may reduce colorectal cancer risk.

  • Measuring sulfidogenic bacterial abundance from biopsies during routine colonoscopy may constitute biomarkers for increased risk of colorectal cancer, particularly in African Americans.

Introduction


Colorectal cancer (CRC) is the third most frequent cancer in the USA with an estimated 134 490 new cases and 49 190 deaths in 2016. African Americans (AAs) have a CRC incidence 10%–30% higher than other races and ethnicities, and even greater disparities in mortality rates.1 AAs are diagnosed at earlier ages and have more proximal CRC compared with non-Hispanic whites (NHWs).2 Although some types of CRC are influenced by genetic risk factors or disease, the majority occur sporadically as a consequence of environmental exposures that promote colonocyte hyperproliferation, loss of apoptotic control and epithelial barrier integrity, and excess production of proinflammatory immunomodulatory factors.3

Significant differences in faecal and colonic microbial composition between healthy subjects and patients with cancer have been identified,4–13 but the impact remains uncertain. Compared with tumour-free controls, CRC subjects have higher levels of faecal hydrogen sulfide (H2S).14 Higher levels of faecal H2S have also been detected in patients with colitis, who have an increased susceptibility to CRC.15–18 Exogenous H2S is genotoxic at levels commonly found in the colon. This bacterially derived gas induces proliferative and proinflammatory pathways and inhibits β-oxidation of butyrate, an anticarcinogenic metabolite and favoured substrate of colonocytes.19–22 Multiple bacterial species produce H2S, including taurine-respiring Bilophila wadsworthia, cysteine-using Fusobacterium nucleatum and sulfate-reducing bacteria (SRB) such as Desulfovibrio spp.23–25 Additionally, higher levels of faecal bile acids have been observed in CRC cases than controls,14 which could be associated with increased abundance of taurocholate-using B. wadsworthia.

Because exposure to H2S could promote CRC development, we hypothesised that sulfidogenic bacteria contribute to a microenvironment conducive to colorectal carcinogenesis through production of H2S. Accordingly, we examined sulfidogenic bacterial abundance in uninvolved mucosa of subjects with CRC compared with tumour-free controls. Of particular interest was whether differences in sulfidogenic bacteria between AAs and NHWs might contribute to racial differences in CRC incidence rates.

Materials and methods

Human subjects

The Chicago Colorectal Cancer Consortium prospectively ascertained incident CRC cases from surgery and endoscopy units and tumour-free controls undergoing routine screening colonoscopy from five Chicago medical centres, including University of Illinois Hospital and Health Sciences System, Jesse Brown Veterans Administration, John H. Stroger Hospital of Cook County, University of Chicago Medicine and Rush University Medical Center, over a 2-year period (2011–2012), as previously described.26 The parent study received approval for human subjects research from the institutional review boards (IRBs) of each participating medical centre; the parent protocol was administered by the IRB at University of Illinois Hospital and Health Sciences System (2010–0168). Colonic tissue biopsies from uninvolved or healthy mucosa were studied from 329 subjects after routine bowel preparation. The cases were persons with adenocarcinoma of the colon or rectum detected during patient workup due to GI symptoms or collected at the time of surgical resection. Control subjects were persons undergoing colonoscopy in whom tumours were not identified. In CRC cases, biopsies were collected approximately 10 cm away from the tumour location. Samples were taken either with endoscopic biopsy or surgical forceps. Included were 97 AA and 56 NHW CRC cases, and 100 AA and 76 NHW controls. 
Persons with a personal history of cancer, IBD, adenomatous or benign polyps, polyposis or tumours with microsatellite instability were excluded.26 Clinical data were extracted from medical records, and epidemiological and demographic information were obtained using subject questionnaires as previously described.26 Factors analysed included age, sex, race; tumour or biopsy site, tumour stage; body weight, body mass index (BMI); education and income level; usage of tobacco products and alcohol; and usage of omega-3 fatty acids and non-steroidal anti-inflammatory drugs (NSAIDs) (table 1 and online supplementary table S1). Significantly different factors were used as covariates for potential confounder adjustment.

Table 1

Clinical and demographic characteristics of African American (AA) and non-Hispanic white (NHW) subjects

Quantitative PCR (qPCR) analysis

Mucosal biopsies were placed into RNAlater and stored at 4°C overnight, after which they were stored at −80°C until extraction. DNA was isolated from mucosal biopsies using commercial DNA extraction kits (Promega, Madison, Wisconsin; Mobio, Carlsbad, California, USA), quantified and diluted to 5 ng/μL. Examination of bacterial target abundances of NHW cases prepared with the two kits yielded similar results, indicating that the DNA extraction method did not introduce bias. Sulfidogenic bacterial abundance was quantified in triplicate with a 7900HT Fast Real-Time PCR System (Applied Biosystems, Foster City, California, USA). Small subunit rRNA genes of SRBs including Desulfobacter spp. (DSB), Desulfobulbus spp. (DBB), Desulfotomaculum spp. (DFM) and Desulfovibrio spp. (DSV) were targeted and validated as previously described.27 In addition, functional genes from the H2S production pathway, pan-dissimilatory sulfite reductase A (dsrA)28 harboured by all SRB29 and B. wadsworthia-specific dsrA (dsrA-BW)22 were quantified.

Statistical analysis

To test differences between groups, Wilcoxon rank-sum test or t-test for continuous variables and χ2 or Fisher's exact test for dichotomous variables were used. Because bacterial targets were not normally distributed, data were log-transformed. For binary outcome analyses (AA vs NHW and case vs control), logistic regression models were used adjusting for selected covariates. In all models, age, sex and biopsy location were included. For sulfidogenic bacterial analysis in AAs, BMI, education level and annual income level were included. In NHWs, BMI, education level and NSAID use were included. Spearman correlation analysis was used to test associations between sulfidogenic bacteria. Statistical tests were conducted using SAS V.9.3 (SAS, Cary, North Carolina, USA).
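As an illustration of the correlation step, the following is a minimal sketch of a Spearman rank correlation between two log-transformed bacterial targets, written with the Python standard library only. It is not the SAS code used in the study. The pseudocount in `log_transform` and all function names are assumptions made for the example, since the handling of zero copy numbers is not described here.

```python
import math


def log_transform(copies, pseudocount=1.0):
    """Log10-transform gene copy numbers; the pseudocount is an assumption."""
    return [math.log10(c + pseudocount) for c in copies]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Spearman's coefficient depends only on ranks, it is unchanged by the log transformation; the transformation matters for the parametric tests and regression models rather than for this step.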

16S rDNA sequence analysis and bioinformatics

Mucosal DNA samples were processed using Zymo Research DNA Clean and Concentrator, quantified using a Qubit fluorimeter (Life Technologies, Carlsbad, California, USA), and the V4 region of 16S rDNA was amplified according to Fluidigm protocols. Products were subjected to a second round of amplification with Illumina linkers and barcodes. Final products were quantified, quality tested with a Fragment Analyzer (Advanced Analytics, Ames, Iowa, USA) and size selected using a 2% agarose E-gel (Life Technologies). Quality of cleaned and extracted product was verified using an Agilent Bioanalyzer (Agilent Technologies, Santa Clara, California, USA), and sequencing was carried out using an Illumina MiSeq (Illumina, San Diego, California, USA).

Sequencing of the V4 region resulted in 4 885 578 raw reads which were demultiplexed, quality filtered to remove low-abundance operational taxonomic units (OTUs) and singletons, and clustered using a 97% similarity threshold through QIIME 1.8.0 using the Greengenes 13_8 reference database, as previously described.30–32 Quality filtering and rarefaction to 5571 (max=67 359) sequences per sample resulted in 2 689 301 reads from 155 samples (61 AA CRC cases and 94 AA controls), with a sample mean of 17 350 sequences and an average base pair length of 252. α-diversity and β-diversity were assessed using the workflow in QIIME 1.8.0. To confirm core diversity findings were not an artefact of sequence rarefaction and to obtain an estimate of robustness, comparable-sized jackknifed sample datasets were created, and β-diversity was reanalysed in QIIME.
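Rarefaction to a fixed depth can be pictured as drawing the same number of reads, without replacement, from every sample. The sketch below illustrates that idea with the Python standard library; it is not QIIME's implementation, and the function name, the seed and the choice to return `None` for samples below the target depth are assumptions made for the example.

```python
import random


def rarefy(otu_counts, depth, seed=0):
    """Subsample one sample's reads without replacement to a fixed depth.

    otu_counts maps an OTU identifier to its read count. Returns None when
    the sample has fewer reads than `depth` (assumed here to mean the
    sample is dropped from downstream analysis).
    """
    total = sum(otu_counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    rarefied = {}
    for otu in rng.sample(reads, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied
```

In the setting described above, `depth` would correspond to 5571 sequences per sample; repeating the draw with different seeds is the intuition behind the jackknifed datasets used to check robustness.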

Linear discriminant analysis (LDA) effect size was calculated with the LEfSe algorithm to identify biomarkers for the differences between CRC cases and controls.33 A secondary analysis was performed to examine the differences between subjects based on tumour location, comparing left-sided cases (n=33) and left-sided controls (n=52) and comparing right-sided cases (n=27) and right-sided controls (n=42); and based on age, comparing cases <50 (n=11) and controls <50 (n=20) and comparing cases <50 (n=11) and cases ≥50 (n=48). No further adjustments for multiple testing were required.33

Analysis of dietary intake

Dietary intake was obtained using the Block Brief 2000 Food Frequency Questionnaire (BBFFQ) from a subset of subjects, including 50 AA and 31 NHW CRC cases and 30 AA and 24 NHW controls.34 The questionnaires were administered within 3 weeks of the diagnostic procedure to reduce recall bias. NutritionQuest (Berkeley, California, USA) processed completed BBFFQs. The Diet and Behavior Shared Resource of the University of Illinois-Chicago Cancer Center prepared a dataset for statistical analysis.

Dietary intake data from outliers were excluded (<600 kcals (n=11) or >3500 kcals (n=1)). Because energy intake is highly correlated with macronutrient and micronutrient intake,35 nutrient or food group was adjusted per 1000 cal. Density variables were created for total fat (g), saturated fat (g), monounsaturated fat (g), polyunsaturated fat (g), protein (g), total carbohydrate (g), total fibre (g), cysteine (mg), calcium (mg), iron (mg), alcohol (g), omega-6 fatty acids (g), and daily food group servings of meat, vegetables, fruit, total grains and dairy. Percent kilocalories from fat, protein and carbohydrate were also examined.
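The energy adjustment described above amounts to a simple rescaling. A minimal sketch, assuming each subject's questionnaire output is available as a dictionary with a daily energy field (the key name `kcal` and the function names are hypothetical):

```python
def plausible_energy(kcal, low=600.0, high=3500.0):
    """True when reported daily energy lies inside the retained range."""
    return low <= kcal <= high


def density_per_1000_kcal(amount, kcal):
    """Express a nutrient amount or serving count per 1000 kcal of intake."""
    return amount * 1000.0 / kcal


def nutrient_densities(record, energy_key="kcal"):
    """Return {nutrient: density} for one subject, or None if excluded."""
    kcal = record[energy_key]
    if not plausible_energy(kcal):
        return None
    return {name: density_per_1000_kcal(value, kcal)
            for name, value in record.items() if name != energy_key}
```

For example, a subject reporting 80 g of total fat on 2000 kcal has a fat density of 40 g per 1000 kcal, while a subject reporting 500 kcal would be excluded as an outlier.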

Differences in intake of nutritional variables between AAs and NHWs were assessed using the Wilcoxon signed-rank test or t-test, as appropriate. To analyse the relationship between selected dietary intake variables (cysteine, total fat and daily servings of dairy) and sulfidogenic bacterial abundance, linear regression models were created using R V.3.3.1 and adjusted for age, sex, race, disease status and BMI (R Foundation for Statistical Computing, Vienna, Austria. Packages: tidyverse v.1.0.0, gridExtra v.2.2.1). Due to the high degree of co-linearity between these variables and other dietary factors of interest, it was necessary to isolate the independent effect of these intake variables. Consequently, we residualised each variable by obtaining the residuals for a model fit with all correlated variables >0.50 as explanatory and used these residuals in our final model.
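The residualisation step can be sketched as follows: select the dietary variables whose correlation with the variable of interest exceeds 0.50, regress the variable of interest on them, and keep the residuals. This is an illustrative standard-library Python version, not the R code used in the study. Using the absolute Pearson correlation for the threshold is an assumption, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_columns(target, others, threshold=0.50):
    """Names of variables whose |r| with the target exceeds the threshold."""
    return [name for name, col in others.items()
            if abs(pearson(target, col)) > threshold]


def ols_residuals(y, predictors):
    """Residuals of y after least-squares regression on predictors.

    predictors is a list of columns; an intercept is added automatically.
    Perfectly collinear predictors are not handled (division by zero).
    """
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    k = len(cols)
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    a = [[sum(cols[i][r] * cols[j][r] for r in range(n)) for j in range(k)]
         + [sum(cols[i][r] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(beta[j] * cols[j][r] for j in range(k)) for r in range(n)]
    return [yi - fi for yi, fi in zip(y, fitted)]
```

The residuals are, by construction, uncorrelated with the variables they were regressed on, which is why they can stand in for the independent contribution of, say, cysteine once correlated dietary factors have been accounted for.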


Results

Race-specific differences in mucosal sulfidogenic bacteria

Comparison of sulfidogenic bacterial targets between races revealed pan-dsrA, a measure of SRB abundance across a range of species, to be 10 times higher in AAs compared with NHWs (p<0.001), irrespective of disease status (figure 1A). Abundance of individual SRB genera was higher in AAs compared with NHWs for all genera except DBB; DBB levels were mostly undetectable in AAs. Similarly, the abundance of B. wadsworthia was 2.5 times higher in AAs compared with NHWs, irrespective of disease status (p<0.001). These differences in abundance remained statistically significant after adjustment for clinical and epidemiological variables that were different between races (see online supplementary table S1).

Figure 1

Differences in mean gene copy numbers of sulfidogenic bacteria comparing colorectal cancer cases and controls in African Americans (AAs) and non-Hispanic whites. A. Scatterplot representations of mean gene copy numbers per nanogram of DNA of sulfidogenic bacterial targets in African Americans compared with non-Hispanic whites with median and upper and lower quartiles indicated. African Americans are represented in green, and non-Hispanic whites are represented in purple. All comparisons had a p<0.001. B. Scatterplot representations of mean gene copy numbers per nanogram of DNA of sulfidogenic bacterial targets in cases and controls in each racial group. African Americans cases are represented in light green and controls in dark green. Non-Hispanic white cases are represented in light purple and controls in dark purple.

Comparison of sulfidogenic bacterial targets between CRC cases and controls within each racial group revealed the mean abundance of pan-dsrA to be 1.8 times higher in AA controls than AA CRC cases (p<0.001) (figure 1B). By contrast, this target was 2.9 times higher in NHW CRC cases than NHW controls (p<0.001). These racial differences remained significant after adjusting for age, sex, biopsy site, BMI, education, income level and NSAID use (table 1).

The taurine-respiring sulfidogenic bacterium B. wadsworthia was 1.9 times more abundant in AA CRC cases than AA controls (p<0.001), and remained significant after adjusting for demographic factors mentioned previously. On the other hand, abundance of B. wadsworthia did not differ between NHW cases and controls.

Consistent with the pan-dsrA results, the abundance of DSB (p<0.001) and DFM (p<0.001) was significantly higher in AA controls than AA CRC cases. On the other hand, abundance of DSB (p=0.018) and DBB (p<0.001) was significantly higher in NHW CRC cases than NHW controls. These data demonstrate that AA and NHW cases and controls display reciprocal differences in SRB abundance, which remained after adjusting for tumour stage. Additionally, Spearman's correlation analysis revealed that SRBs correlated positively with B. wadsworthia in NHWs but not in AAs (see online supplementary table S2).

16S rDNA sequencing analysis in AAs

Due to the disparity in CRC risk in the AA compared with the NHW population, a 16S rRNA gene sequence analysis was conducted to validate sulfidogenic abundance trends observed by our qPCR analysis and to compare case versus control differences in the taxa not tested by qPCR analysis. The overall microbial population was represented by 10 phyla, 70 families and 137 genera (see online supplementary figure S1). Relative abundance differences in phyla between AA CRC cases and AA controls were not statistically significant, and measures of α-diversity were not different. Analysis of UniFrac distance matrices using Monte Carlo permutations (n=999) revealed significant differences in β-diversity between AA CRC cases and AA controls for both unweighted (p=0.01) and weighted (p=0.01) UniFrac distances (see online supplementary figure S2A, B). To ensure results were not due to sequence rarefaction, this analysis was further validated using jackknifed estimates, and similar results were obtained (see online supplementary figure S3). Overall, these data indicate that AA CRC cases displayed differences in taxon composition for both highly abundant and rare taxa compared with AA controls.
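The logic of a Monte Carlo permutation test on a distance matrix can be sketched as below: compute a statistic for the observed case/control labels, then recompute it after repeatedly shuffling the labels. This is a generic illustration in standard-library Python, not the QIIME procedure. The statistic used here (mean between-group distance minus mean within-group distance) is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import random


def separation(dist, labels):
    """Mean between-group distance minus mean within-group distance."""
    within, between = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            (within if labels[i] == labels[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """One-sided p value for group separation under label shuffling."""
    rng = random.Random(seed)
    observed = separation(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if separation(dist, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one permutation, so p is never zero.
    return (hits + 1) / (n_perm + 1)
```

With 999 permutations the smallest attainable p value under this convention is 0.001, which sets the resolution of any p value reported from such a test.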

To determine which bacterial taxa were primary drivers of microbiome differences between AA CRC cases and AA controls, LDA was used to calculate the effect size of different bacterial taxa (figure 2A). LDA analysis revealed the genera Faecalibacterium, Pseudomonas and an unknown genus of Peptostreptococcaceae were strongly associated with AA controls; and the genera Alistipes, Delftia, Erysipelotrichaceae, Micrococcus, Pyramidobacter, Ruminococcus and unidentified genera associated with the Bacteroidales, Christensenellaceae and Mogibacteriaceae families were associated with AA CRC cases. The mean abundance of genera identified by LDA in AA CRC cases and controls is shown in figure 2B. We note that many of the subjects had undetected levels for the bacteria identified by this analysis.

Figure 2

Analysis of bacterial 16S rDNA data identifies differences in bacterial abundances comparing African American colorectal cancer (CRC) cases and controls. A. Differences identified by linear discriminant analysis. Effect sizes were calculated by the LEfSe algorithm and genera associated with effect size differences ≤−log2 and ≥log2 are shown. Genera associated with control samples are in green, and genera associated with CRC are in red. B. Scatterplot representations of the bacterial genera identified by LEfSe analysis in cases and controls. Each point is the logarithm of the proportion of each indicated bacterial genus with median and average indicated. The null measurements are not indicated on the plot. The numbers of nulls for each subject (case/control) by genus are Faecalibacterium 1 case/1 control, Pseudomonas 22/31, Peptostreptococcaceae 58/64, Alistipes 61/94, Delftia 62/79, Mogibacteriaceae 62/94, Micrococcus 62/94, Pyramidobacter 58/92, Christensenellaceae 54/92, Cc_115 Erysipelotrichaceae 60/93, Bacteroidales S24-7 49/81 and Ruminococcus 32/65. C. Scatterplot representations of abundances of three selected sulfidogenic bacteria (Fusobacterium, Desulfovibrio and Bilophila), shown as in B. The numbers of nulls for each subject (case/control) by genus are Fusobacterium 34/55, Bilophila 52/81 and Desulfovibrio 57/83. LDA, linear discriminant analysis.

Analysis of genera-specific differences of sulfidogenic bacteria revealed increased relative abundance of Bilophila and Desulfovibrio in AA CRC cases compared with AA controls, although these differences were not significant after adjusting for false discovery rate (figure 2C). OTUs for DSB, DFM and DBB were not identified. Fusobacterium was the most abundant sulfidogenic bacterium identified in the analysis, but the relative abundance of Fusobacterium was similar between cases and controls. Apparent differences in the relative abundance of Desulfovibrio as estimated by 16S rDNA sequencing likely reflects the greater number of null reads in control subjects.

LDA was used to identify differences in the relative abundance of bacterial genera for right-sided and left-sided colonic biopsies (figure 3A, B). Effect size scores identified Pyramidobacter, Weissella and unknown genera of the Bacteroidales, Mogibacteriaceae and Pseudomonadaceae families as being significantly associated with right-sided AA CRC cases. The genera Faecalibacterium, Erwinia, Klebsiella, Variovorax, Caulobacteraceae and Delftia were associated with right-sided AA controls. Coprobacillus, Alistipes, Christensenellaceae, Cc 115 Erysipelotrichaceae, Porphyromonas, Ruminococcus, Odoribacter and unknown genera belonging to YS2 were significantly associated with left-sided AA CRC cases. The genera Oxalobacteraceae, Erwinia and unknown genera of Peptostreptococcaceae were associated with left-sided AA controls.

Figure 3

Linear discriminant analysis identifies differences between colorectal cancer (CRC) cases and controls. Effect size calculated by the LEfSe algorithm identifies genera strongly associated with differences between CRC cases and controls in right-sided colonic biopsies (A) and in left-sided colonic biopsies (B), as well as differences based on age, comparing cases <50 and controls <50 (C) and comparing cases <50 and cases ≥50 (D). Genera associated with control samples are in green and genera associated with CRC are in red in panels A–C; genera associated with cases <50 are in red and cases ≥50 are in green in panel D. LDA, linear discriminant analysis.

Additionally, LDA was used to identify differences in relative abundance of bacterial species comparing CRC cases and controls <50 years of age (figure 3C). The genera Bilophila, Parabacteroides, Odoribacter and an unknown genus of the Barnesiellaceae family were associated with early-onset CRC, and the genus Acidaminococcus was associated with controls <50 years of age. We also compared CRC cases below 50 and CRC cases 50 and older, revealing two genera (unknown genera of the families Barnesiellaceae and Pseudomonadaceae) associated with cases above the age of 50, and 11 genera (Staphylococcus, Coprobacillus, Lactococcus, Corynebacterium, Cetobacterium, Megamonas, Anaerococcus, Sneathia, Odoribacter, and unknown genera of the Methylobacteriaceae and Mogibacteriaceae families) associated with cases under 50 years of age (figure 3D).

Associations between diet and sulfidogenic bacteria

To examine whether diet differed between AAs and NHWs, dietary intake was analysed with the BBFFQ, focusing on nutrient and dietary factors associated with the typical mixed Western-style diet, namely intake of total fat, animal fat and protein. Overall, dietary fat and protein intake per 1000 kcals was significantly greater in AA compared with NHW subjects; on the other hand, calcium intake and servings of dairy per 1000 kcals were lower (see online supplementary table S3). To examine the effect of dietary intake on sulfidogenic target abundance, these dietary variables were examined using linear regression models (figure 4 and online supplementary figure S4, online supplementary table S4). As observed in the larger dataset (figure 1), the three major sulfidogenic bacteria tested (pan-dsrA, B. wadsworthia and DSV) strongly associated with race, the main variable that explained differences between AAs and NHWs (figure 4). In the models, pan-dsrA abundance also significantly associated with residuals for cysteine (p=0.016), total fat (p=0.011) and daily servings of dairy (p=0.007), as well as male sex (p=0.0032). DSV abundance was not significantly associated with dietary variables, but trends were observed with cysteine and male sex. Similarly, abundance of B. wadsworthia was not significantly associated with dietary variables, but there was a strong association with age (figure 4; p=0.001).

Figure 4

Comparison of selected factors that explain differences in sulfidogenic bacteria in African Americans and non-Hispanic whites. A. Scatterplot representations of residuals of linear models for selected bacterial targets, namely, pan-dsrA, Desulfovibrio spp. and B. wadsworthia. African Americans are represented in green, and non-Hispanic whites are represented in purple. Cases are represented as open circles and controls as closed circles. B. Forest plots of effect size estimates from the linear models. Outcome variables are log-transformed gene copy abundances of pan-dsrA, Desulfovibrio spp. and B. wadsworthia. Points and lines on each plot represent the point estimate and 95% CI, respectively, for each covariate in the model. Estimates reflect predicted change in variable for one unit change in outcome. Positive associations are in blue, negative associations are in red. p value for each covariate is represented on the right, and statistically significant values after Bonferroni correction are in bold. The wide CI seen in the body mass index (BMI), underweight class is explained by the small number of observations in this class.

Discussion


The present results demonstrate that the microbiome of uninvolved mucosa in CRC cases is distinct from tumour-free controls and implicate sulfidogenic bacteria as potential contributors to CRC development. One of the more striking observations was the greater abundance of SRBs and B. wadsworthia in AAs compared with NHWs, irrespective of the disease status. Pan-dsrA was 10-fold and B. wadsworthia dsrA 2.5-fold more abundant in AA colonic mucosa, and overall, analysis of individual sulfidogenic bacterial species confirmed this difference. Second, reciprocal differences in sulfidogenic bacterial abundance were observed in both AA and NHW CRC cases compared with controls. Notably, similar reciprocal differences were observed during diet exchange between rural South African blacks and AAs living in Pittsburgh.14

We demonstrated previously that human colonic mucosa is persistently colonised by sulfidogenic bacteria.36 Constitutive bacterially derived sulfide may have a protective antimicrobial effect, based on the susceptibility of resident microbes to sulfide.37 Sulfide production may become deleterious when the provision of organic sources of sulfur promotes growth of taurine-using and cysteine-using bacteria, increasing sulfide to proinflammatory and genotoxic concentrations. Taurine provided by bacterial deconjugation of taurocholic acid acts as a substrate for anaerobic respiration by B. wadsworthia, generating genotoxic H2S.38 Once deconjugated, free primary bile acids are further metabolised by colonic bacteria to genotoxic and proinflammatory secondary bile acids. Specifically, the secondary bile acid, deoxycholic acid, acts as a tumour promoter, causing membrane perturbations leading to arachidonic acid release. Arachidonic acid is converted by cyclo-oxygenase-2 and lipo-oxygenase to proinflammatory and proangiogenic prostaglandins and reactive oxygen species, damaging DNA and inhibiting DNA repair enzymes.38 Accordingly, these data support the hypothesis that differences in sulfidogenic bacteria between AAs and NHWs may contribute to variations in CRC incidence between these two populations.

Examination of microbial abundance in uninvolved mucosa from AA CRC cases and controls using 16S rDNA analysis underscored a number of prevailing taxonomic trends associated previously with cancer development in NHWs.4–13 For example, in agreement with previous analyses,8–10,12 relative abundance of Mogibacteriaceae, Leptotrichia, Porphyromonas, Erysipelotrichaceae and Ruminococcus was significantly higher in AA CRC cases. Similar to F. nucleatum, Mogibacteriaceae was first isolated from the human oral cavity and associated with oral and extraoral abscesses.39 Likewise, species of the genera Sneathia,40 formerly known as Leptotrichia sanguinegens,41 and Porphyromonas are prevalent in periodontal disease,42 and recently members of the oral microbial community have been associated with CRC.43 Organisms from the family Erysipelotrichaceae have been linked with IBD, are consistently positively associated with high-fat diets and dyslipidemia, and may be modulated by changes in host cholesterol metabolites.44 While some species of the genus Ruminococcus appear to be protective,45,46 a greater abundance of Ruminococcus obeum was observed in the faeces of rats with 1,2-dimethylhydrazine-induced precancerous lesions.47

Several reports note increased abundance of sulfidogenic microbes in CRC-related dysbiosis, but few have connected the production of H2S with CRC development.5–13 Our study is the first to look at differences in sulfidogenic bacteria, specifically in uninvolved mucosa of individuals with CRC compared with controls. LDA revealed several taxa associated with sulfur metabolism as significant indicators of disease, including Mogibacteriaceae, a family previously observed to co-occur with volatile sulfur compound-producing organisms like F. nucleatum,39 and the sulfidogenic genera Lactococcus, Porphyromonas, Odoribacter, Bilophila and Pyramidobacter.23,48–51 Similar to F. nucleatum, species of Lactococcus and Porphyromonas generate sulfide from cysteine metabolism.48,49 The growth of Odoribacter splanchnicus, another species from the Porphyromonadaceae family, is enhanced with the addition of bile, and this bacterium produces an abundance of sulfide.50 While one other study observed higher abundance of Pyramidobacter in proximal colon cancer tissue, the authors did not make the association with sulfide.6 Our study also corroborates previous evidence that the microbial composition of proximal and distal cancers differs and that Pyramidobacter is enriched in proximal cancers.6 Thus, Pyramidobacter spp. could contribute to the propensity for proximal tumours in AAs.2 The present report is the first to demonstrate that B. wadsworthia is associated with CRC.

A meta-analysis of prospective cohorts revealed increased CRC incidence among persons who consume a diet high in red and processed meat.52 Consistently, the present study observed a higher intake of dietary fat and protein in AAs, who are at higher risk of CRC development compared with NHWs. A Western diet is thought to be a major contributor to CRC incidence variance in different countries. For example, rural South African blacks, who consume a diet low in fat and animal protein, have a negligible risk of CRC compared with AAs and American NHWs, who on average consume a diet high in animal products.21 A recent comparison of AAs and rural South African blacks showed significantly higher levels of both primary and secondary bile acids in AAs, as well as reciprocal changes in secondary bile acids and colonic microbiota after diet exchange.21 Additionally, a recent study revealed rapid changes in ‘bile-tolerant’ organisms upon consumption of a diet rich in fat and animal protein, including two genera abundant in this study, namely, Alistipes and Bilophila.53 Notably, species of Cetobacterium, a taxon observed to be significant in AA CRC, are also bile-tolerant.54 Similarly, B. wadsworthia was significantly more abundant in mice fed a diet high in milk fat and increased the severity of symptoms in dextran sodium sulfate (DSS)-induced colitis.55 A high-fat diet, rich in the sulfur-containing amino acids cysteine and taurine, increases bile secretion and shifts the taurine:glycine bile acid ratio towards taurine conjugation.56

A key advantage of the present study is that recruited subjects represent an urban American population of average to low-income persons with predominantly high school or some college education, enabling a more generalisable examination of health and dietary disparities than performed previously.12,54,57,58 Additionally, epithelial cell function is affected to a greater extent by mucosally associated microbes,59–61 as measured herein; functional differences have been observed between mucosal and luminal microbiota;62 and mucosal but not faecal microbiota may discriminate between disease status and health.60

Our study is limited by use of the BBFFQ to measure habitual dietary intake of subjects. It is well established that diet assessment methods that rely on self-report are prone to under-reporting of nutrient intake due to socially desirable responses and recall bias.63,64 The BBFFQ has a reduced food list (only 70 foods/beverages) and thus inherently will result in energy and macronutrient estimates lower than actual intake. With this in mind, nutrient density (per 1000 kcal) variables were created for dietary components of interest; such variables have been suggested to correlate more closely with ‘true’ intake than estimates of absolute intake.64 Given the limitations of self-report diet assessment methods, controlled feeding studies are warranted to best understand the impact of diet composition on bile acid ratios and microbial metabolites. Additional limitations include lack of information on probiotic and antibiotic use by subjects. Use of LEfSe to identify significant genera in the 16S analysis may bias against identification of rare OTUs; however, it is one of the most commonly used analytical methods for biodiscovery. Metabolites and histological biomarkers were not examined in the present study; these variables should be evaluated in future studies.
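The nutrient density adjustment mentioned above can be illustrated with a minimal sketch. It assumes the conventional definition (nutrient amount divided by total energy intake, scaled to 1000 kcal); the function name and the example intakes are hypothetical and are not taken from the study data or its analysis code.

```python
def nutrient_density(nutrient_amount, energy_kcal, per_kcal=1000.0):
    """Return nutrient intake expressed per `per_kcal` of energy.

    Hypothetical helper: assumes density = amount * per_kcal / energy.
    """
    if energy_kcal <= 0:
        raise ValueError("energy_kcal must be positive")
    return nutrient_amount * per_kcal / energy_kcal


# Illustrative values only (not study data): two subjects reporting the
# same absolute fat intake but different total energy intakes.
fat_g = 80.0
density_a = nutrient_density(fat_g, 2000.0)  # g fat per 1000 kcal
density_b = nutrient_density(fat_g, 1600.0)  # g fat per 1000 kcal
```

Expressing intake relative to energy in this way means that uniform under-reporting of all foods would, under this assumption, affect numerator and denominator proportionally, which is why density variables are often considered less sensitive to under-reporting than absolute estimates.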

Our analysis revealed that race was the strongest variable associated with differences in sulfidogenic bacteria in AAs compared with NHWs, and that diet differences had relatively small effects. Consequently, we propose that some other latent variable explains this difference; identifying that latent variable is an important future direction. In the meantime, it is clear that a greater understanding of sulfidogenic bacteria associated with the colonic mucosa, together with the development of novel CRC prediction models using key sulfidogenic bacteria, may provide substantial knowledge for prevention and early detection of CRC, particularly in the AA population.


The Fluidigm and Illumina assays were performed by the Functional Genomics and DNA Sequencing Cores at the Carver Biotechnology Center, University of Illinois Urbana-Champaign. Statistical analysis was performed by the Design and Analysis Core at the Center for Clinical and Translational Science, University of Illinois at Chicago, Chicago, IL.



  • CY and PGW share co-first authorship.

  • Contributors Study concept and design (NAE, HRG, PGW and CY); acquisition of data (KV, PGW and CY); analysis and interpretation of data (GJA, NAE, HRG, LTH, EM, T-WL, PGW and CY); drafting manuscript (NAE, HRG, PGW and CY); critical revision of manuscript (NAE, HRG, LTH, BJ, T-WL, XL, PGW, CY and RMX); statistical analysis (GJA, HK and EM); obtained funding (NAE, LTH, BJ, XL, EM and HRG); administrative, technical or material support (CB, TC, XL, LTH, BJ and RMX); study supervision (XL, RMX and HRG).

  • Funding This work was supported by grants from the National Cancer Institute (U01 CA153060 and P30 CA023074, NAE; R01 CA204808, HRG, EM, LTH; R01 CA141057, BJ) and the American Cancer Society Illinois Division (223187, XL). PGW was supported by a predoctoral fellowship from the Mayo Clinic and University of Illinois Alliance for Technology-Based Healthcare. CY was supported by a training grant from the National Institutes of Health (5T32DK007788-15). GJA was supported by a Cancer Biology Training Grant (T32CA009213).

  • Competing interests None declared.

  • Ethics approval Institutional review boards of each participating medical centre.

  • Provenance and peer review Not commissioned; externally peer reviewed.