Leveraging sequence-based faecal microbial community survey data to identify a composite biomarker for colorectal cancer

Manasi S Shah; Todd Z DeSantis; Thomas Weinmaier; Paul J McMurdie; Julia L Cope; Adam Altrichter; Jose-Miguel Yamal; Emily B Hollister

doi:10.1136/gutjnl-2016-313189

Gut microbiota

Original article

Manasi S Shah¹,²,³,⁵,
Todd Z DeSantis²,
Thomas Weinmaier²,
Paul J McMurdie²,⁴,
Julia L Cope³,⁵,⁶,
Adam Altrichter²,
Jose-Miguel Yamal¹,
Emily B Hollister³,⁵

¹Department of Epidemiology, University of Texas School of Public Health, Houston, Texas, USA
²Bioinformatics, Second Genome Inc, South San Francisco, California, USA
³Department of Pathology, Texas Children's Microbiome Center, Texas Children's Hospital, Houston, Texas, USA
⁴Bioinformatics, Whole Biome Inc, San Francisco, California, USA
⁵Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas, USA
⁶Diversigen, Inc, Houston, Texas, USA

Correspondence to Dr Manasi S Shah, 3450 Central Expressway, Santa Clara, CA 95051, USA; manasishah86{at}gmail.com

Abstract

Objective Colorectal cancer (CRC) is the second leading cause of cancer-associated mortality in the USA. The faecal microbiome may provide non-invasive biomarkers of CRC and indicate transition in the adenoma–carcinoma sequence. Re-analysing raw sequence and metadata from several studies uniformly, we sought to identify a composite and generalisable microbial marker for CRC.

Design Raw 16S rRNA gene sequence data sets from nine studies were processed with two pipelines, (1) QIIME closed reference (QIIME-CR) or (2) a strain-specific method herein termed SS-UP (Strain Select, UPARSE bioinformatics pipeline). A total of 509 samples (79 colorectal adenoma, 195 CRC and 235 controls) were analysed. Differential abundance, meta-analysis random effects regression and machine learning analyses were carried out to determine the consistency and diagnostic capabilities of potential microbial biomarkers.
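As a rough illustration of what a meta-analysis random effects step can look like, the sketch below pools per-study effect sizes with the DerSimonian-Laird estimator. This is an assumption made for illustration: the abstract does not say which random effects estimator was used, and the function name `random_effects_pool` and its inputs (one effect size and one variance per study) are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird estimator.

    Hypothetical helper for illustration only; the source does not
    state which estimator the authors used.
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q quantifies between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

Under this sketch, a taxon whose effect is consistent across studies yields a between-study variance near zero, so the pooled estimate approaches the ordinary inverse-variance average; heterogeneous studies widen the standard error instead.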

Results Definitive taxa, including Parvimonas micra ATCC 33270, Streptococcus anginosus and yet-to-be-cultured members of Proteobacteria, were frequently and significantly increased in stools from patients with CRC compared with controls across studies and had high discriminatory capacity in diagnostic classification. Microbiome-based CRC versus control classification produced an area under the receiver operating characteristic curve (AUROC) of 76.6% in QIIME-CR and 80.3% in SS-UP. Combining clinical and microbiome markers gave a diagnostic AUROC of 83.3% for QIIME-CR and 91.3% for SS-UP.
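To make the AUROC figures concrete, the following minimal sketch computes AUROC directly from its rank interpretation: the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The function name `auroc`, the score values and the 1/0 label coding (1 for CRC, 0 for control) are assumptions for illustration and are not taken from the study.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (case, control) pairs ranked correctly.

    scores: classifier outputs, higher meaning more case-like.
    labels: 1 for a case, 0 for a control (hypothetical coding).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case correctly ranked above control
            elif c == k:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))
```

On this definition, an AUROC of 80.3% would mean that roughly four in five case-control pairs are ordered correctly by the classifier score, while 50% corresponds to chance.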

Conclusions Despite technological differences across studies and methods, key microbial markers emerged as important in classifying CRC cases and could be used in a universal diagnostic for the disease. The choice of bioinformatics pipeline influenced the accuracy of classification. Strain-resolved microbial markers might prove crucial in providing a microbial diagnostic for CRC.

COLORECTAL CANCER
COLORECTAL ADENOMAS
META-ANALYSIS
INTESTINAL BACTERIA

https://doi.org/10.1136/gutjnl-2016-313189

Footnotes

Twitter Follow Manasi Shah @GoingByGut
Contributors MSS: Study design, data collection, sequence processing, statistical analysis and manuscript preparation. TDeS: Study design, data collection, sequence processing, statistical analysis and manuscript preparation. TW: Sequence processing and manuscript preparation. PJMcM: Statistical analysis and manuscript preparation. JLC: Sequence processing and manuscript preparation. AA: Statistical analysis and manuscript preparation. J-MY: Statistical analysis and manuscript preparation. EBH: Study design, data collection, sequence processing, statistical analysis and manuscript preparation.
Competing interests MSS worked as a consultant with Second Genome during the course of work. TDeS, TW, PJMcM and AA were employed by Second Genome during the course of the work and hold stock options.
Provenance and peer review Not commissioned; externally peer reviewed.
