Gut microbiota as non-invasive diagnostic and prognostic biomarkers for natural killer/T-cell lymphoma
  1. Zhuangzhuang Shi1,
  2. Guoru Hu2,
  3. Min W Li2,
  4. Lei Zhang1,3,
  5. Xin Li1,3,
  6. Ling Li1,3,
  7. Xinhua Wang1,3,
  8. Xiaorui Fu1,3,
  9. Zhenchang Sun1,3,
  10. Xudong Zhang1,3,
  11. Li Tian1,3,
  12. Zhaoming Li1,3,4,5,
  13. Wei-Hua Chen2,6,7,
  14. Mingzhi Zhang1,3,4
  1. 1 Department of Oncology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  2. 2 Department of Bioinformatics and Systems Biology, Huazhong University of Science and Technology College of Life Sciences and Technology, Wuhan, Hubei, China
  3. 3 Lymphoma Diagnosis and Treatment Centre of Henan Province, Zhengzhou, Henan, China
  4. 4 State Key Laboratory of Esophageal Cancer Prevention & Treatment and Henan Key Laboratory for Esophageal Cancer Research, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
  5. 5 Academy of Medical Sciences of Zhengzhou University, Zhengzhou University, Zhengzhou, Henan, China
  6. 6 Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong, China
  7. 7 College of Life Science, Henan Normal University, Xinxiang, Henan, China
  1. Correspondence to Professor Mingzhi Zhang, Department of Oncology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; mingzhi_zhang1{at}; Professor Wei-Hua Chen, Department of Bioinformatics and Systems Biology, Huazhong University of Science and Technology College of Life Sciences and Technology, Wuhan, Hubei, China; weihuachen{at}; Professor Zhaoming Li, Department of Oncology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; fcclizm{at}

We read with interest the study by Kartal et al,1 which showed that gut microbiota-derived biomarkers for disease stratification are often shared by subjects across disease cohorts. Here, we confirm their observations with findings from a newly diagnosed natural killer/T-cell lymphoma (NKTCL) cohort, in which the gut biomarkers overlapped significantly with those of multiple disease cohorts and were consistently enriched or depleted in subjects with those diseases. Importantly, many of the shared biomarkers were strongly associated with patient outcomes in our cohort, implying that they may have broad prognostic value across multiple diseases.

The ‘microbiota-gut-lymphoma axis’ represents a fascinating avenue of microbiota-mediated lymphomagenesis and intervention opportunity,2 but the role of the gut microbiota in NKTCL remains unclear. To identify gut microbiota-derived diagnostic biomarkers for NKTCL, we recruited a discovery cohort of 30 treatment-naïve patients and 20 healthy controls (HCs) and a validation cohort of 12 patients and 13 HCs (online supplemental materials and methods). We applied shotgun metagenomic sequencing to their faecal samples, profiled their gut metagenomes using mOTUs2 V.2.5,3 and trained a patient-stratification classifier on all species-level taxonomic features using the LASSO algorithm implemented in SIAMCAT.4 The classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.868 on the discovery cohort and 0.910 on the validation cohort (figure 1A). To increase the sample size for model training, we retrained a LASSO classifier for NKTCL using all samples from both cohorts and obtained an AUROC of 0.813 in cross-validation. Together, these results strongly support the role of the gut microbiota as a diagnostic biomarker for NKTCL.
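
For readers less familiar with the AUROC values reported above, the following is a minimal sketch of how the metric can be computed from classifier scores. It is not the SIAMCAT implementation used in the study; the function name `auroc` and the example scores are hypothetical and serve only as an illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random patient outscores a random control.

    labels: 1 = patient, 0 = healthy control. Tied scores count as half a win.
    This pairwise formulation is an illustrative assumption, not the study's code.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, for illustration only (not study data).
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = auroc(example_labels, example_scores)  # 8 of 9 pairs ranked correctly
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of patients from controls.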

Figure 1

(A) Area under the receiver operating characteristic curve (AUROC) values of the gut microbiota-based classifier of NKTCL on the discovery cohort (threefold, three-times-repeated cross-validation; grey line, the training set), the validation cohort (yellow line, the testing set) and all samples combined (tenfold, ten-times-repeated cross-validation; the ‘all data model’, blue line). (B) External validation of the disease specificity of the NKTCL faecal microbiota model (the ‘all data model’). False positive rates (FPRs) of the unconstrained model (without feature selection) in the 29 external test sets are shown as a bar plot. We defined false-positive predictions as samples wrongly classified as NKTCL by our model. Thus, two FPRs were calculated for each cohort: one for the healthy controls (ie, the proportion of healthy controls wrongly classified as NKTCL) and another for the diseased individuals (ie, the proportion of diseased individuals wrongly classified as NKTCL). We also calculated an overall FPR for all the healthy controls and for each of the diseases. Prediction results from the ‘enrichment-constrained’ model, built by selecting NKTCL-enriched biomarkers as recommended by Kartal et al,1 are shown in online supplemental figure 1E. (C) Marker microbes shared by the NKTCL cohort and the seven other cohorts that had FPRs of ~20% or higher in their diseased subjects in (B); markers were identified using LDA Effect Size (LEfSe) analysis. A red (blue) species name indicates that the species is enriched (depleted) in patients. The Wilcoxon rank sum test was used to compare relative abundances between the patients and HCs of the respective cohorts. Inside the square brackets are the numbers of studies in which the species were also among the top features (robustness >50%) of the corresponding disease-stratification classifiers (online supplemental table S2).
The ‘star’ symbol in front of a species name indicates that the species is significantly associated with patients’ survival in our NKTCL cohort; details can be found in online supplemental figure 1A–D. Inside the parentheses next to the species name is the number of studies in which the corresponding species was identified as a biomarker, that is, with |LDA| ≥ 2. Inside the parentheses after a study name is the total number of species in this figure that were also biomarkers of the study. (D–E) Overall survival (OS) and progression-free survival (PFS) Kaplan-Meier curves for NKTCL patients (n=30). Patients were divided into SRI-high and SRI-low groups according to their Streptococcus parasanguinis–Romboutsia timonensis index (SRI) scores, calculated as the quotient of the relative abundances of the two species; the SRI cut-points (26386550 for OS and 10 776 890 for PFS) were determined by the ‘survminer’ R package V.0.4.98. The log-rank test was used to calculate the p values. (F) Correlations between the SRI score and multiple adverse prognostic factors of NKTCL, including prognostic index for natural killer lymphoma-Epstein-Barr virus (PINK-E; L: low risk, I: intermediate risk, H: high risk), disease stage, lymph node (LN) involvement, response to first-line treatment (R: response, NR: non-response), B symptoms, Eastern Cooperative Oncology Group (ECOG) Performance Status ≥ 2, an increase in plasma Epstein-Barr virus (EBV) DNA level, and Ki67 expression ≥ 60%. The Wilcoxon rank sum test was used to compare continuous variables between groups. (More specific descriptions of these results can be found in online supplemental results.)
ACD, atherosclerotic coronary disease; ADA, American diabetes; BRCA, breast cancer; CD, Crohn’s disease; CRC, colorectal cancer; CTR, controls; DE, German; ES, Spanish; JP, Japan; LD, liver disease; NAFLD, non-alcoholic fatty liver disease; PC, pancreatic cancer; T1D, type 1 diabetes; T2D, type 2 diabetes; UC, ulcerative colitis.

To examine the specificity of the NKTCL gut microbiota-derived signature, we applied the all-sample NKTCL classifier to 29 public gut microbiota cohorts (online supplemental table S1). We observed an overall false positive rate (FPR) of 3.1% in the HCs, but higher FPRs in the patients of several cohorts (figure 1B), especially the pancreatic cancer (Kartal_DE_2022_PC, Kartal_ES_2022_PC, Nagata_JP_2022_PC), Crohn’s disease (He_2017_CD, Franzosa_2018_CD, Forslund_2015_CD) and liver disease (Qin_2014_LD) cohorts. These results imply significant overlaps in the biomarkers between these diseases and NKTCL, which we confirmed using LEfSe analysis5 (figure 1C). Importantly, these biomarkers were consistently enriched or depleted in most cohorts, including the enrichment of the oral-derived taxa Veillonella and Streptococcus in patients, and the enrichment of known beneficial species such as Faecalibacterium prausnitzii, Eubacterium rectale and Bifidobacterium adolescentis in HCs1 6 7 (figure 1C). These findings indicate that our classifier can accurately distinguish NKTCL patients from HCs; nevertheless, because biomarkers are shared with other diseases, combining selected clinical indicators with microbial biomarkers would help to build a disease-specific diagnostic model.
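
The FPR used in this specificity analysis is the proportion of non-NKTCL samples that the model labels as NKTCL, computed separately for the controls and the diseased subjects of each external cohort. A minimal sketch of that calculation follows; the function names, the group labels ‘control’ and ‘disease’, and the example predictions are hypothetical and are not taken from the study.

```python
def false_positive_rate(flags):
    """Fraction of non-NKTCL samples that were (wrongly) predicted as NKTCL."""
    if not flags:
        raise ValueError("group has no samples")
    return sum(1 for flagged in flags if flagged) / len(flags)


def cohort_fprs(predictions):
    """Return one FPR per group within a single external cohort.

    predictions: iterable of (group, predicted_as_nktcl) pairs, where group is
    a hypothetical label such as 'control' or 'disease'.
    """
    by_group = {}
    for group, flagged in predictions:
        by_group.setdefault(group, []).append(bool(flagged))
    return {group: false_positive_rate(flags) for group, flags in by_group.items()}


# Hypothetical predictions for one external cohort (not study data).
example_predictions = [
    ("control", False), ("control", False), ("control", False), ("control", False),
    ("disease", True), ("disease", False), ("disease", False), ("disease", False),
]
example_fprs = cohort_fprs(example_predictions)
```

In this toy cohort none of the four controls and one of the four diseased subjects is flagged, so the two FPRs are 0.0 and 0.25.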

Survival data were available for the NKTCL patients in the discovery cohort. Notably, many of the identified microbiome biomarkers, especially those shared by multiple diseases, significantly predicted the overall survival (OS) and progression-free survival (PFS) of the patients, including Streptococcus parasanguinis, Romboutsia timonensis and Veillonella atypica (online supplemental figure 1A–D). Finally, we created a Streptococcus parasanguinis–Romboutsia timonensis index (SRI), defined as the ratio of the relative abundances of the two species, which showed better prognostic power than any individual species or other combination. Specifically, NKTCL patients with higher SRI scores showed significantly worse OS and PFS than those with lower SRI scores (figure 1D–E). Furthermore, we observed significant associations between a high SRI score and multiple adverse prognostic factors of NKTCL, including PINK-E, stage, lymph node involvement and response to first-line treatment (all p<0.05; figure 1F).
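
The SRI and the high/low grouping can be sketched as below. This is an illustration under stated assumptions, not the study's code: we assume S. parasanguinis is the numerator, we add a hypothetical pseudocount to avoid division by zero, and we treat scores strictly above the cut-point as SRI-high. The function names, abundances and cut-point in the example are invented for illustration.

```python
def sri_score(s_parasanguinis, r_timonensis, pseudocount=1e-9):
    """Ratio of the relative abundances of the two species.

    Assumptions (not confirmed by the source): S. parasanguinis is the
    numerator, and a small pseudocount is added to both terms so that a
    zero abundance of R. timonensis does not cause a division by zero.
    """
    if s_parasanguinis < 0 or r_timonensis < 0:
        raise ValueError("relative abundances must be non-negative")
    if pseudocount <= 0 and r_timonensis == 0:
        raise ZeroDivisionError("zero denominator without a pseudocount")
    return (s_parasanguinis + pseudocount) / (r_timonensis + pseudocount)


def stratify(scores, cutpoint):
    """Label each patient SRI-high (score > cutpoint) or SRI-low (otherwise)."""
    return {
        pid: ("SRI-high" if score > cutpoint else "SRI-low")
        for pid, score in scores.items()
    }


# Hypothetical relative abundances (S. parasanguinis, R. timonensis).
example_abundances = {"patient_a": (0.02, 0.001), "patient_b": (0.001, 0.02)}
sri_example_scores = {
    pid: sri_score(s, r) for pid, (s, r) in example_abundances.items()
}
example_groups = stratify(sri_example_scores, cutpoint=1.0)
```

In the study the cut-points were chosen with the ‘survminer’ R package rather than fixed in advance; the value 1.0 above is only a placeholder.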

Overall, our results lend support for gut microbiota as a potent assistive diagnostic tool for NKTCL. Moreover, the SRI score, based on the shared biomarkers, may have extensive prognostic utility in multiple diseases and deserves further scrutiny (online supplemental discussion).

Ethics statements

Patient consent for publication

Ethics approval

This study was performed in accordance with the Declaration of Helsinki and the rules of good clinical practice, and was approved by the Ethics Review Committee of the First Affiliated Hospital of Zhengzhou University, Zhengzhou, China. Participants gave informed consent to participate in the study before taking part.


We would like to thank all the clinical doctors from the Lymphoma Diagnosis and Treatment Centre of Henan Province for their kind suggestions, and we also thank all the generous participants of this study for their support.



  • ZS and GH are joint first authors.

  • ZS and GH contributed equally.

  • Contributors Study concept and design: MZ, W-HC and ZL. Sample collection: ZS, LZ, XL, XW, LL and XF. Data acquisition: ZS, GH, MWL, ZS, ZL, XZ and LT. Analysis and interpretation of data: W-HC, MZ, ZL, ZS, GH and MWL. Technical and material support: LZ, XL, XW, LL, XF, ZS, ZL, XZ and LT. Drafting of the manuscript: GH and ZS. Revision of the manuscript: W-HC, ZL and MZ. All authors approved the final version of the manuscript.

  • Funding This work was supported by the National Natural Science Foundation of China (81970184; 82170183; U1904139; 82070209; 82070210).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.