Original research
Gestational diabetes is driven by microbiota-induced inflammation months before diagnosis
  1. Yishay Pinto1,
  2. Sigal Frishman2,3,
  3. Sondra Turjeman1,
  4. Adi Eshel1,
  5. Meital Nuriel-Ohayon1,
  6. Oshrit Shrossel4,
  7. Oren Ziv1,
  8. William Walters5,
  9. Julie Parsonnet6,7,
  10. Catherine Ley6,
  11. Elizabeth L Johnson8,
  12. Krithika Kumar8,
  13. Ron Schweitzer9,10,
  14. Soliman Khatib9,10,
  15. Faiga Magzal11,12,
  16. Efrat Muller13,
  17. Snait Tamir11,12,
  18. Kinneret Tenenbaum-Gavish2,
  19. Samuli Rautava14,15,
  20. Seppo Salminen16,
  21. Erika Isolauri14,
  22. Or Yariv17,
  23. Yoav Peled2,17,
  24. Eran Poran17,
  25. Joseph Pardo2,17,
  26. Rony Chen2,
  27. Moshe Hod2,
  28. Elhanan Borenstein13,18,19,
  29. Ruth E Ley5,
  30. Betty Schwartz3,
  31. Yoram Louzoun4,
  32. Eran Hadar2,
  33. Omry Koren1
  1. Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
  2. Helen Schneider Hospital for Women, Rabin Medical Center and Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  3. Institute of Biochemistry, School of Nutritional Sciences Food Science and Nutrition, The School of Nutritional Sciences, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
  4. Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
  5. Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tubingen, Germany
  6. Department of Medicine, Stanford University, Stanford, California, USA
  7. Department of Epidemiology and Population Health, Stanford University, Stanford, California, USA
  8. Division of Nutritional Sciences, Cornell University, Ithaca, New York, USA
  9. Department of Natural Compounds and Analytical Chemistry, Migal-Galilee Research Institute, Kiryat Shmona, Israel
  10. Analytical Chemistry Laboratory, Tel-Hai College, Upper Galilee, Israel
  11. Laboratory of Human Health and Nutrition Sciences, Migal-Galilee Technology Center, Kiryat Shmona, Israel
  12. Nutritional Science Department, Tel Hai College, Upper Galilee, Israel
  13. The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
  14. Department of Pediatrics, University of Turku and Turku University Hospital, Turku, Finland
  15. University of Helsinki & Helsinki University Hospital, New Children’s Hospital, Pediatric Research Center, Helsinki, Finland
  16. Functional Foods Forum, University of Turku, Turku, Finland
  17. Clalit Health Services, Tel Aviv, Israel
  18. Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  19. Santa Fe Institute, Santa Fe, New Mexico, USA
  1. Correspondence to Professor Omry Koren, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel; omry.koren{at}


Objective Gestational diabetes mellitus (GDM) is a condition in which women without diabetes are diagnosed with glucose intolerance during pregnancy, typically in the second or third trimester. Early diagnosis, along with a better understanding of its pathophysiology during the first trimester of pregnancy, may be effective in reducing incidence and associated short-term and long-term morbidities.

Design We comprehensively profiled the gut microbiome, metabolome, inflammatory cytokines, nutrition and clinical records of 394 women during the first trimester of pregnancy, before GDM diagnosis. We then built a model that can predict GDM onset weeks before it is typically diagnosed. Further, we demonstrated the role of the microbiome in disease using faecal microbiota transplant (FMT) of first trimester samples from pregnant women across three unique cohorts.

Results We found elevated levels of proinflammatory cytokines in women who later developed GDM, decreased faecal short-chain fatty acids and altered microbiome. We next confirmed that differences in GDM-associated microbial composition during the first trimester drove inflammation and insulin resistance more than 10 weeks prior to GDM diagnosis using FMT experiments. Following these observations, we used a machine learning approach to predict GDM based on first trimester clinical, microbial and inflammatory markers with high accuracy.

Conclusion GDM onset can be identified in the first trimester of pregnancy, earlier than currently accepted. Furthermore, the gut microbiome appears to play a role in inflammation-induced GDM pathogenesis, with interleukin-6 as a potential contributor to pathogenesis. Potential GDM markers, including microbiota, can serve as targets for early diagnostics and therapeutic intervention leading to prevention.


Data availability statement

Data are available in public, open access repositories. All sequencing data were submitted to EBI (project accession number ERP143097). Metabolomics data were deposited at 10.5281/zenodo.6581068.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • The incidence of gestational diabetes mellitus (GDM) is increasing worldwide.

  • Early prediction of GDM may reduce short-term and long-term complications to the mother and the offspring.

  • At later stages of pregnancy, the gut microbiome of women diagnosed with GDM is different from the microbiome of women without GDM.

  • Insulin resistance has been associated with elevated secretion of proinflammatory cytokines.


  • Gut microbiome, metabolome and inflammatory markers were profiled during the first trimester of pregnancy in 394 women.

  • Significant differences were found in these markers between women who would and would not later develop GDM.

  • The GDM phenotype was transferred to germ-free mice following faecal microbiota transplant from women in their first trimester of pregnancy.

  • Accurate prediction of GDM development was made based on first trimester biomarker profiles and clinical data.

  • This study suggests diagnosis of GDM/GDM risk can be made earlier allowing for earlier management or even complete prevention.


  • Recognition of women at risk of GDM at an early stage of pregnancy, with appropriate risk stratification, may allow specific recommendations for prevention of the disease—currently by lifestyle modification and in the future perhaps by specific pre/pro/postbiotic supplementation.

  • If GDM can be prevented, there would be a major reduction in adverse outcomes of GDM, for the mother and offspring, in both the short term and long term.


Gestational diabetes mellitus (GDM), or development of glucose intolerance during pregnancy in women without diabetes, occurs when the pancreas cannot produce enough insulin to balance insulin-inhibiting effects of placental hormones (viz. oestrogen, cortisol and human placental lactogen).1 Approximately 10% of pregnant women worldwide are diagnosed with GDM. Risk factors include non-white ethnicity, increased maternal age, obesity, family history of diabetes and history of giving birth to large infants. Consequences of GDM include a wide range of obstetrical and metabolic complications for both the mother (eg, pre-eclampsia, type 2 diabetes and cardiovascular diseases) and the neonate (mainly macrosomia and hypoglycaemia).2 Many complications are preventable if GDM is detected and appropriately managed and good glycaemic control is achieved by nutrition, exercise and insulin administration, if necessary, along with heightened monitoring during labour and delivery,3 but earlier detection might allow for complete amelioration of GDM-associated short-term and long-term risks.

The incidence of GDM is increasing worldwide, due primarily to the increase in prevalence of overweight and obesity, advanced maternal age and growth of at-risk populations.4–6 As such, it is important to expand early-prediction efforts towards reducing its negative consequences. To date, few studies have examined biomarkers of GDM in the first trimester (T1).7 8 Additionally, while gut microbial dysbiosis has been associated with diabetes,9 and a recent study has associated gut dysbiosis with GDM in the third trimester (T3),10 few have focused on T1.8 11–15

We sought to identify biomarkers of GDM in T1 of pregnancy. First, we comprehensively profiled the T1 gut microbiome, metabolome and inflammatory cytokine profiles of women who would and would not later be diagnosed with GDM. We then investigated whether the early pregnancy microbiome drove GDM development using germ-free (GF) mice. Finally, we used a machine learning approach to predict GDM based on patient characteristics, T1 microbiome and clinical information, to identify earlier time frames for therapeutic intervention.


Pregnant women

Primary prospective cohort

We enrolled a prospective cohort followed throughout pregnancy (online supplemental figure 1). Upon screening for GDM in the second trimester (T2; screening method described in online supplemental methods), women were retroactively classified as ‘would go on to develop GDM’ and ‘would not go on to develop GDM’. This main prospective cohort included 394 pregnant women aged 18–40 years recruited between gestational ages (weeks+days) 11+0–13+6 at women’s health centres of Clalit HMO (Dan Petach Tikva District, Israel) during the years 2016–2017. Exclusion criteria included: type 1 or type 2 diabetes mellitus diagnosed before pregnancy (all other chronic diseases were documented in the database); in vitro fertilisation or hormonal therapy in the previous 3 months; use of antibiotics in the previous 3 months and multiple gestation. Initially, 400 women were recruited, but 4 did not provide any samples and 2 did not meet study criteria upon further examination of medical records (one with antibiotics use, one with type 2 diabetes; online supplemental figure 1). Thus, 394 women were followed through 27–31 weeks of pregnancy. No women were lost to follow-up because, after initial recruitment, all other data (namely GDM diagnosis) could be obtained from digital medical records. Weight and height were assessed at the time of recruitment, and blood and faecal samples were collected (see online supplemental methods). Dietary consumption (24-hour recall), physical activity (24-hour recall), sleeping hours (3-day recall), stress (validated questionnaire16), employment and education details (at recruitment) were recorded. Other maternal demographics, clinical and obstetrical data including pregnancy follow-up and comorbidities were extracted from medical records.

Secondary cohort

Since GDM incidence in Israel is about 10%,17 a secondary cohort of pregnant women was also recruited. Patients with GDM were enrolled in a cohort study at 24–28 gestational weeks at Rabin Medical Center between the years 2016 and 2017. Exclusion criteria for this cohort were the same as for the main cohort. Medical chart review was performed to identify all demographic and clinical characteristics from T1. Clinical data, but not biological samples, from this secondary cohort are included in the study.

Additional cohorts

In addition to the above cohorts, for faecal microbiota transplant (FMT) experiments in GF models, two additional independent cohorts were included (see online supplemental methods).

Biomarker analysis in the primary cohort

Fasting glucose, liver enzymes and HbA1c were extracted from medical records and serum cytokine and hormone panels performed (online supplemental methods). Bacterial DNA was extracted, amplified (V4 region of the 16S rRNA gene) and sequenced (Illumina MiSeq) from all faecal samples as described in the online supplemental methods. QIIME2 (V.2019.4)18 was used for read pre-processing (pipeline in online supplemental methods). Faecal short-chain fatty acid (SCFA) extraction and untargeted metabolomics methods are also described in the online supplemental methods.

FMT into GF mice

Transplantation experiments were performed using faecal samples from the primary prospective cohort and the two additional cohorts (see online supplemental methods).


To predict GDM, we developed a prediction model using our prospective cohort (identified T1 biological markers and clinical data) as well as clinical data from our secondary cohort. We checked each combination of the following components: (1) cytokines, (2) microbiome, (3) general clinical information and (4) food questionnaires. The accuracy of the prediction was assessed using the area under the curve of the test set, in a 20%/80% test/training set division and a fivefold cross-validation (see online supplemental methods). To examine generalisability of our model, we applied the classifier to an independently published dataset from a Chinese cohort of 98 pairs of pregnant women with and without GDM (matched) who provided a faecal sample in week 10–15 of pregnancy.8 We trained the model on our primary cohort and tested the model performance on the Chinese cohort.
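As an illustration of the evaluation scheme described above, the following minimal Python sketch shows one way to compute the area under the ROC curve and to set up a 20%/80% test/training division with fivefold cross-validation. It is not the pipeline used in this study (see online supplemental methods for that). The function names, the random seed and the choice to run the cross-validation inside the training portion are assumptions made for the example.

```python
import random


def auroc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case receives
    the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and return (train, test) index lists.
    The seed is an arbitrary choice for reproducibility."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def kfold_indices(train_idx, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation
    over the training indices (assumed here to exclude the test set)."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, validate


# Toy example: three of the four (positive, negative) pairs are ranked correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Computing the auROC from pairwise comparisons makes its interpretation explicit: it is the probability that a randomly chosen case is ranked above a randomly chosen control. With imbalanced outcomes such as GDM, a stratified split would be a reasonable refinement of this sketch.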

Statistical analysis

Full statistical methods are presented in the online supplemental methods. Briefly, unless otherwise specified, statistical analysis was done using non-parametric Mann-Whitney U tests followed by false discovery rate (FDR) correction. Mantel’s correlations between study features were performed. Association of microbial features with GDM was done by Spearman’s rank correlations compared with a background distribution followed by a linear model to control for main risk factors. For untargeted metabolomics, the differential abundance of the metabolites between the groups was identified by Student’s t-tests and FDR correction. Microbial features of FMT-recipient mice were associated with GDM using MaAsLin2.19 The MetaCyc pathway abundance in mouse faeces was predicted using PICRUSt2.20
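For readers unfamiliar with FDR correction, the sketch below shows the Benjamini-Hochberg procedure in plain Python. That this particular FDR procedure was the one applied here is an assumption, since the main text only states "FDR correction"; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four independent tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, which is why a feature can be significant at the raw level yet not after correction.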

Data availability

All sequencing data were submitted to European Bioinformatics Institute (EBI) (project accession number ERP143097). Metabolomics data were deposited at 10.5281/zenodo.6581068.21 Ethics statement and patient and public involvement are described in online supplemental methods.


Study design

We prospectively recruited 394 women during T1, 44 (11%) of whom went on to develop GDM, as diagnosed by glucose tolerance test (GTT) during the second trimester of pregnancy. The other 350 women served as the control group, hereafter ‘healthy pregnant women’ (online supplemental figure 1). Of the recruited women (regardless of GDM status), 8 suffered spontaneous abortion, 7 delivered preterm and 11 had gestational hypertension or pre-eclampsia. In addition, 4 had polycystic ovary syndrome and 25 had hypothyroidism. These were not exclusion criteria.

Of the 34 women in the GDM group who had blood work on file before pregnancy, 2 had high HbA1c; none had high glucose. Women diagnosed with GDM exhibited other common risk factors (table 1) such as higher maternal age and pre-pregnancy body mass index (BMI). Following pregnancy (6 weeks–6 months), we also examined HbA1c (or glucose) levels of these women and found one woman with high HbA1c level (out of six who did this blood work) and none with impaired glucose levels (fasting test/75 g oral GTT, out of 22). While beyond the timeline of this T1 study, among women later diagnosed with GDM, dietary consultation/lifestyle change was not sufficient for nine women who therefore received medication to control their GDM.

Table 1

Cohort description

When examining explained variance between parameters measured (microbiome, SCFA, metabolome, cytokines, hormones, diet and lifestyle; figure 1A), using a Mantel test, we found that the T1 gut microbiome significantly explained the variance of most measurements and was most tightly correlated with the faecal metabolomic profile (figure 1B).
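A Mantel test asks whether two distance matrices computed on the same samples are correlated, using permutations of sample labels to obtain a p-value. The following is a minimal sketch under stated assumptions: Pearson correlation, a two-sided test and 999 permutations are choices made for this example and are not taken from the study, and the function names are hypothetical. Squaring the returned statistic gives a 'variance explained' value of the kind shown in figure 1B.

```python
import math
import random


def _upper_triangle(d):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(d)
    return [d[i][j] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (Mantel statistic r, permutation p-value) for two
    symmetric distance matrices over the same samples."""
    n = len(d1)
    x = _upper_triangle(d1)
    observed = _pearson(x, _upper_triangle(d2))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of d2 together (relabel the samples).
        perm = list(range(n))
        rng.shuffle(perm)
        permuted = [[d2[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if abs(_pearson(x, _upper_triangle(permuted))) >= abs(observed):
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (permutations + 1)


# Toy example: distances between five points on a line, compared with themselves.
coords = [0, 1, 3, 7, 12]
d = [[abs(a - b) for b in coords] for a in coords]
r, p = mantel(d, d)
print(r ** 2, p)
```

Permuting rows and columns together, rather than shuffling individual distances, preserves the dependence structure within each matrix, which is the reason an ordinary correlation test is not appropriate for distance data.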

Figure 1

First trimester blood and faecal biomarkers in women later diagnosed with GDM. (A) Sampling strategy and study design. Samples were collected in first trimester (T1). Stool was collected to profile gut microbiome (GDM: n=28, control: n=236), metabolome (n=15 age/BMI-matched pairs) and SCFAs (n=20 age-matched pairs) and to validate results when transplanted into germ-free mice. Blood samples were used to profile cytokines and hormones (GDM: n=35, control: n=78). Lifestyle surveys and medical records were collected from all participants. (B) Variance explained (square of the Mantel statistic) between all pairs of data types (Mantel test). (C) Serum levels of cytokines and hormones for GDM and control women (false discovery rate (FDR)-corrected Mann-Whitney U tests). (D) Concentration of faecal short-chain fatty acids (FDR-corrected Mann-Whitney U tests). Boxplots indicate the median and IQR; whiskers show IQR×1.5. °p<0.1, *p<0.05, **p<0.01, ***p<0.001. BMI, body mass index; FMT, faecal microbiota transplant; GDM, gestational diabetes mellitus; GM-CSF, granulocyte-macrophage colony-stimulating factor; GTT, glucose tolerance test; IFN, interferon; IL, interleukin; ns, not significant; SCFA, short-chain fatty acid; TNF, tumour necrosis factor.

Women with GDM exhibit elevated levels of serum inflammatory cytokines and low levels of SCFAs in T1

Following evidence of elevated inflammatory biomarkers in women diagnosed with GDM,22 we profiled 10 plasma cytokines, chemokines and hormones in both the GDM (n=35) and control (n=78) groups and found elevated levels of proinflammatory cytokines (interleukin (IL)-4, IL-6, IL-8, granulocyte-macrophage colony-stimulating factor and tumour necrosis factor-α) among the GDM group (figure 1C; p<0.05, FDR-corrected Mann-Whitney U tests) but no differences in leptin and insulin. This result was robust when controlling for BMI and age (see the online supplemental methods; p<0.05, linear model with age and BMI as fixed or random effects).

SCFAs, which promote glucose homeostasis and suppress inflammatory response, are another possible early biomarker for GDM. We found a significant reduction of two branched SCFAs (BSCFAs), isovalerate and isobutyrate, in the GDM group (figure 1D; p<0.05, FDR-corrected Mann-Whitney U tests) and a similar trend for valerate (p=0.09).

Gut microbiome is associated with GDM pathogenesis

A number of studies have suggested that the gut microbiome is altered in women with GDM, post-GDM diagnosis. In our study, we did not find differences in T1 gut microbiome α-diversity between women who would and would not develop GDM. Principal coordinate analysis of unweighted UniFrac distances demonstrated that the microbial communities of healthy women and women with GDM trend toward significant differences (figure 2A; p=0.06, permutational multivariate analysis of variance; p=0.23, 0.05, 0.38 for Bray-Curtis, Jaccard and weighted UniFrac, respectively), supported by results of differential abundance analyses (below). Notably, when fitting a linear model to the distance matrix with GDM outcome and the risk factors age and BMI, widely associated with GDM (sequentially using adonis2, see the online supplemental methods), none of the variables were significant.

Figure 2

Differences in faecal microbiome composition in first trimester between women who would and would not develop GDM later. (A) Principal coordinate analysis based on 16S rRNA gene sequence profiling of the microbiome (GDM: n=28, control: n=236) using the unweighted UniFrac dissimilarity metric coloured by GDM/control (left; p=0.06, PERMANOVA); violin plots represent the distribution of GDM/control on each axis; Shannon diversity (top right; R²=0.24 with PCo1) and two phyla that mostly explain the PCo1 and PCo2 variance: Fusobacteria (R²=0.08 with PCo2) and Deferribacteres (R²=0.3 with PCo2). (B) The cladogram represents the microbial features associated with the disease state, while controlling for the main risk factors, BMI and age, at all taxonomic ranks. Spearman’s rank correlation for each association: a positive association (all associations found), implies over-represented features in the healthy control group. Cladogram and bars are coloured by phylum. BMI, body mass index; GDM, gestational diabetes mellitus; Unc., unclassified; PERMANOVA, permutational multivariate analysis of variance.

We next aimed to characterise the specific subset of differentially abundant bacteria: 1 bacterial species was over-represented and 16 bacteria under-represented in the GDM group. When repeating this analysis while controlling for age and BMI, we found 15 under-represented species in the GDM group (figure 2B), only 6 of which intersected with the prior, uncontrolled analysis. Controlling for confounding variables allowed us to distinguish between microbial species associated with main risk factors of GDM and the disease itself. We found a lower abundance of Prevotella in T1 samples of women who would develop GDM, and this result was replicated in mice (below).

Glucose impairment and elevated IL-6 levels of women with GDM were phenocopied to mice by FMT

To examine the causal role of the gut microbiome in the pathogenesis of GDM, faecal samples of age-matched and BMI-matched GDM and control samples from the primary cohort were transplanted to GF female mice (figure 3A). Microbiota characterisation was performed 7 and 21 days post-FMT. The GF mice acquired an average of 42 and 48 taxa from donor samples 7 and 21 days post-transfer, respectively. The recipient mice shared ~60% of their taxa with the donor on day 7 and ~55% on day 21 (online supplemental figure 2). On day 7, the microbial communities were significantly different between GDM-recipient and non-GDM recipient mice (figure 3B). Consistent with our observation in women, P. copri was found to be reduced in GDM-recipient mice (figure 3C). GTTs revealed GDM-recipient mice exhibited impaired glucose tolerance (figure 3D).

Figure 3

Phenotype transfer via first trimester (T1) FMT to germ-free mice. (A) Study design. (B) PCoA using the unweighted UniFrac metric. Mice receiving FMT from women with GDM exhibit different microbial profiles from mice receiving FMT from the control group (p=0.005, PERMANOVA test, n=7 age/BMI-matched FMT donor pairs). (C) Prevotella copri, which was found to be negatively associated with women with GDM, is negatively associated with GDM-transplanted mice as well (p=0.04, linear mixed-effects model). (D) Intraperitoneal glucose tolerance test (ipGTT) revealed impaired glucose sensitivity in mice transplanted with faeces from women with GDM in this study and in the Finnish cohort (insert) (error bars represent ±SEM; *p<0.05 one-tailed Mann-Whitney U test). (E) Serum cytokine level in transplanted mice (*p<0.05 Mann-Whitney U test). Boxplots indicate the median and IQR; whiskers show IQR×1.5. BMI, body mass index; FMT, faecal microbiota transplant; GDM, gestational diabetes mellitus; IL, interleukin; PCoA, principal coordinate analysis; PERMANOVA, permutational multivariate analysis of variance.

Further, the GDM-recipient mice exhibited elevated levels of both IL-6 (in agreement with our findings in women with GDM) and IL-10 relative to the control-recipient mice (figure 3E and online supplemental table 1). No differences were found for insulin or leptin levels (online supplemental table 2). We found further support for the role of gut microbiota in GDM pathogenesis with FMT from two additional GDM cohorts (Finnish and American women) (figure 3D; online supplemental figure 3 and online supplemental tables 3–5; combined p=0.15, 0.022, 0.10, 0.24 for time points 0, 30, 60, 120 min, respectively, Fisher’s method).
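Fisher's method combines independent p-values testing the same hypothesis (here, one per cohort at a given time point) into a single p-value. The sketch below is a minimal illustration of the method and does not reproduce the study's calculation: the per-cohort p-values are not given in the text, so the inputs shown are hypothetical, as is the function name.

```python
import math


def fisher_combined(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper tail has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    so no statistics library is needed.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values from three cohorts at one time point.
print(fisher_combined([0.04, 0.20, 0.11]))
```

The method assumes the combined tests are independent, which is plausible for separately recruited cohorts, and requires every input p-value to be greater than zero.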

Lower levels of short peptides in stool of women with GDM

We next compared stool metabolome profiles of women who would and would not later develop GDM (n=15 age-matched and BMI-matched pairs). First, we found a significant correlation between the microbiome and metabolome of these women (r=0.26, p=0.02; Mantel test). Although we were limited in sample size, manual exploration of the data revealed that many short peptides had differential concentrations (raw p≤0.05) between control women and women with GDM (online supplemental table 6). Following curation of all dipeptides and tripeptides, the vast majority of the peptides (50 out of 52) with significant differential concentrations showed a clear tendency of depletion in women with GDM relative to healthy control women (figure 4A,B). These peptides were enriched with the hydrophobic amino acids tyrosine, phenylalanine and alanine (p=8×10⁻⁴, 0.01, 0.01, respectively, FDR-corrected Fisher’s exact tests; figure 4C). As metabolome profiling in women uncovered important associations with GDM, we decided to use PICRUSt2 to predict metabolic pathways enriched in mice from the faecal microbiota profiles of the mice in our FMT studies. We found 16 differentially abundant metabolic pathways between GDM and control-recipient mice (online supplemental table 7). We observed an enrichment of the mevalonate pathway (PWY-922; online supplemental figure 4), corresponding with evidence of increased IL-6 levels in the GDM group of both our primary cohort and our transplanted mice23 24 and of the heme pathway (online supplemental figure 4), previously implicated in type 2 diabetes.25
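The amino acid enrichment above was assessed with Fisher's exact tests. As a hedged illustration of how such a test and its odds ratio can be computed for one amino acid, the sketch below uses a 2×2 table of peptides (containing the amino acid or not, by differentially abundant or not). The table layout, the one-sided alternative, the function name and the counts are all assumptions for the example and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns (odds ratio, p-value), where the p-value is the hypergeometric
    probability of observing a count of at least `a` in the top-left cell
    given fixed row and column totals.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    upper = min(row1, col1)
    # comb() returns 0 for impossible tables, so no extra bounds check is needed.
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x)
               for x in range(a, upper + 1))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, tail / denom


# Hypothetical counts:
#                        contains amino acid   does not
# differentially abundant        3                 1
# not differentially abundant    1                 3
print(fisher_exact_greater(3, 1, 1, 3))
```

When one such test is run per amino acid, the resulting p-values would then be corrected for multiple testing, as the text describes.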

Figure 4

Analysis of first trimester human faecal metabolomics exhibits lower levels of dipeptides for women with GDM. (A) Volcano plot of all metabolites examined in this study, comparing age/BMI-matched metabolite profiles of women who would and would not later develop GDM (n=20 pairs); peptides are coloured in red. (B) Heatmap of the 52 differentially abundant peptides. Each row denotes a sample (grouped by disease state) and each column denotes a peptide. Z-scores were calculated per column. Peptides (columns) were hierarchically clustered based on Euclidean distances. (C) Amino acid composition of the differentially abundant peptides. Bars (left y-axis) represent odds ratios (OR) for each amino acid, and dots (right y-axis) represent the amino acid count in the differentially abundant peptides. BMI, body mass index; GDM, gestational diabetes mellitus.

Gut microbiome composition improves prediction of GDM early in pregnancy

Finally, we built a machine learning model to predict GDM based on microbiome composition, cytokine profile, medical history and dietary features, all collected during T1 (figure 1). For this aim, we also included T1 clinical data of 66 additional women, recruited retrospectively in later stages of their pregnancy (secondary cohort, see methods). Our XGBoost model predicted GDM with very high accuracy (area under the receiver operating characteristic curve (auROC)=0.83; figure 5). When making predictions based on only a single feature category, we found the highest accuracy when using medical records alone (though still 7% lower than our combined model), in agreement with a recent study.26 Faecal microbiome features resulted in the second highest accuracy (auROC=0.73). Using our two-step method (see online supplemental methods), we improved the odds ratio (OR) from 3.2 to 4, demonstrating the potential for more accurate prediction using the faecal microbiome profile, especially relevant if medical records are incomplete or unavailable. To validate the predictive power of our microbiome model, we used a validation cohort of 98 women who developed GDM with 98 matched healthy controls from a T1 pregnancy study in China.8 We first built a model based on the validation cohort to test the predictive power for this cohort based only on microbiome data (auROC=0.65). Reassuringly, when applying our model (the Israeli cohort learning set) to this cohort, we found a comparable accuracy (auROC=0.6), confirming that despite the striking genetic and lifestyle differences between the cohorts, our findings are, at least partially, generalisable. We also built a model based on 86 nutritional characteristics measured in our cohort, which yielded lower predictive accuracy (auROC=0.64; figure 5) than the other features in this study. Further, no differences were found in dietary habits between women in our primary cohort who would and would not later develop GDM (online supplemental tables 8 and 9) suggesting that differences in food consumption during T1 contribute minimally to GDM pathogenesis.

Figure 5

Highly accurate prediction of future disease onset among pregnant women during their first trimester. Area under the receiver operating characteristic curve (auROC) for each combination of features. Error bars represent ±SD.


GDM biomarkers

Here, using a combination of ‘omics’ tools, we identify biomarkers of GDM onset as early as the first trimester of pregnancy. Women in T1, who later develop GDM, exhibit gut microbiota dysbiosis as well as increased proinflammatory serum cytokines and lower levels of faecal SCFAs. Further, the specific microbial changes in their microbiota are directly associated with GDM phenotype features (insulin resistance and low-grade inflammation) as revealed by FMT into GF animals. Lastly, we demonstrated that microbiota samples from T1 alone can be used to predict GDM onset and that parameters from patient medical records can improve these predictions, providing a robust tool for early prediction of GDM.

In our primary cohort, women with GDM exhibit elevated levels of serum inflammatory cytokines during T1 of pregnancy. Insulin resistance has been associated with elevated secretion of proinflammatory cytokines,27 and indeed several studies demonstrated elevated levels of proinflammatory cytokines during T2 and T3.28–30 These altered cytokine profiles in women several months prior to a GDM diagnosis suggest that inflammation may be associated with the pathogenesis of GDM and can be used to identify its early onset. This is in line with typical GDM and type 2 diabetes symptomatology. Low-grade chronic inflammation is associated with obesity in general and maternal obesity in particular. But here, we controlled for BMI, suggesting that increased pre-GDM-associated inflammation is beyond that associated with obesity or general pregnancy. This is in line with evidence in the literature of inflammation in cases of type 2 diabetes31 32 independent of weight. Further, in pregnancy, low-grade inflammation levels can differ among women independent of BMI, suggesting that some other characteristics like immune–endocrine interactions may also be at play. In our study, we specifically observed higher levels of IL-6 in both women with pre-GDM and GDM-recipient mice. This suggests that the elevated levels of IL-6 are driven by gut microbes. IL-6 was previously described to play a role in the development of both type 1 and type 2 diabetes33 and was proposed as a potential biomarker of gestational diabetes in 16 different studies, mostly in later stages of pregnancy.34 Our findings in T1, both in the focal cohort and in FMT experiments, support inflammation as an early marker of GDM.

Another potential early biomarker for GDM is a decrease in SCFAs, which contribute to the maintenance of glucose homeostasis and suppression of inflammatory response. Hence, SCFAs are thought to play a role in obesity-induced inflammation leading to attenuation of insulin signalling and GDM.35 We found two BSCFAs reduced in stools of women who later developed GDM. BSCFAs are a product of bacterial fermentation of branched amino acids generated from undigested protein reaching the colon. BSCFAs, proposed markers for protein fermentation, were found to improve insulin sensitivity36 37 and reduce inflammation.38 These findings, in line with several studies of later-stage pregnancy39 (but see findings from Pappa et al40), suggest faecal BSCFAs could serve as a potential biomarker for GDM in early stages of pregnancy.

The gut microbiome is associated with GDM months before diagnosis

Several studies have found altered gut microbiome composition in women with GDM; most were based on samples collected post-diagnosis.41 42 Our findings suggest that microbial differences between GDM and control groups, when controlling for confounding variables, exist in T1 and are driven by specific taxa rather than community-wide shifts, leading to subtle differences in composition.

As an illustrative example, P. copri, which is known to play a role in glucose homeostasis43 and has been reported to be more abundant in women diagnosed with GDM,41 44 was found to be under-represented in women with GDM in our primary cohort, after controlling for confounders, and also in GDM-recipient mice. A recent study demonstrated that Prevotella was a marker of positive glucose metabolism.45 Kovatcheva-Datchary et al. even showed, in a clinical trial, that Prevotella protected against Bacteroides-induced glucose intolerance and that improvement in glucose metabolism was associated with increased abundance of Prevotella.46 This improvement in glucose metabolism in the presence of Prevotella was also demonstrated by supplementing mice with P. copri. One possible mechanism, recently proposed in rats, is that P. copri improves glucose homeostasis through farnesoid X receptor signalling and increased bile acid metabolism.47 We chose to discuss P. copri specifically as it was found to have lower abundance in both women with GDM and recipient mice and was previously described to play a role in glucose homeostasis. In this study, we also demonstrated the importance of controlling for risk factors. For example, Akkermansia muciniphila, which is consistently negatively correlated with obesity,48 is prima facie negatively associated with GDM when not controlling for the difference in BMI between the groups.

FMT highlights IL-6 as a potential contributor to GDM

Based on our multicohort FMT experiments using T1, pre-GDM-diagnosis samples, we conclude that gut microbes play a causal role in the development of some of the phenotypes of GDM and that their role is likely universal, as demonstrated by conservation across cohorts. Increased levels of IL-6, both in women who would go on to develop GDM and in transplanted mice that received their microbiota, suggest an important microbiota-related inflammatory mechanism in GDM progression, further supported by functional profile prediction in mice. Two relevant bacterial pathways, the mevalonate pathway and the heme biosynthesis pathway, were elevated in GDM-recipient mice. There is evidence that mevalonate can have negative implications for host inflammation: its presence reduces the effects of statins in decreasing IL-6 and IL-8,23 and a kinase deficiency, which increases free mevalonate, leads to autoinflammation.49 The heme biosynthesis pathway was previously associated with elevated IL-650 51 and with type 2 diabetes.52 We do note, however, that further research is needed to understand whether the bacteria themselves, their excreted metabolites or some other factors control this phenotype. Only then can we uncover specific mechanisms of pathogenesis.

Short peptides association with GDM

We found lower levels of short peptides in T1 stools of women with pre-GDM. These peptides are enriched with the amino acids phenylalanine, alanine and tyrosine. Previously, plasma levels of these three hydrophobic amino acids have also been reported to be significantly associated with diabetes.53 One study found a link between their elevated blood levels and decreased insulin secretion.54 Interestingly, Jiang et al. recently found elevated levels of alanine and tyrosine in maternal blood at 12–16 gestational weeks in women later diagnosed with GDM55; alanine is also used by the liver for gluconeogenesis.56 Increased amino acid levels in the blood may result in lower levels excreted in stool,57 though this requires further study.

Prediction of GDM

We were able to accurately predict future GDM onset in T1, weeks before the complication is typically diagnosed. Our combined model predicts GDM with very high accuracy, and even a microbiota-centric model could predict disease onset in two geographically diverse cohorts. This tool allows for accurate early prediction, early care planning and potential prevention of this disease, improving both maternal and fetal outcomes. The generality of the microbial signal is further supported by phenotype transfer in samples originating from cohorts on three different continents. On the whole, prediction could (and likely should) be improved using local microbiota characteristics, but genus-level differences in the microbiome can be used as general predictors in the absence of local data.


In summary, we found broad and consistent evidence that GDM pathology begins as early as T1 in a large prospective cohort of pregnant women. Additionally, we successfully demonstrated that the precursors of GDM originate in the gut microbiota and that early-onset GDM has a bacterial signature at least partially responsible for the GDM phenotype, evident from phenotype transfer following FMT. Our findings suggest that GDM is induced through heightened inflammation, initiated by microbial dysbiosis. Future research based on our findings can help unravel the underlying mechanisms.

This study has several limitations. Bacterial dysbiosis could be a first response to disease onset rather than a cause. Additionally, the phenotype transfer we observed may be caused by other faecal material including metabolites, eukaryotic microorganisms, human viruses and bacteriophages, though in this case as well, the bacterial biomarkers identified can be relevant for diagnostics. Lastly, throughout this study, we treated the major risk factors of GDM, BMI and age, using either matching or relevant statistical methods. We cannot exclude the effect of other clinical or demographic features on our results and also wish to highlight the potentially important contribution of these two ‘confounding’ risk factors. Despite limitations, addition of microbiome data to a machine learning model improved our ability to predict GDM and can even serve as a standalone snapshot predictor. These results may be of use in the future when exploring preventive measures for GDM.

Data availability statement

Data are available in public, open access repositories. All sequencing data were submitted to EBI (project accession number ERP143097). Metabolomics data were deposited at 10.5281/zenodo.6581068.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by Clalit’s Institutional Review Board (approval no. 0135-15-COM) and Rabin Medical Center Institutional Review Board (approval no. 0263-15-RMC). Participants gave informed consent to participate in the study before taking part.



  • YP and SF are joint first authors.

  • Twitter @SondraTurjeman, @Soliman72227832, @OmryKoren

  • YP and SF contributed equally.

  • Contributors YP, SF and OK conceived the research. YP, SF and STurjeman analysed the data and wrote the manuscript. SF collected and sequenced the samples and performed experiments. YP, STurjeman, CL, EH, YL and OK wrote the manuscript. AE, MN-O and OZ performed experiments. OS and YL developed prediction models. RS, SK, FM and STamir performed metabolomics experiments. WW, JP, CL, ELJ, SR, SS, EI, KK and REL collected samples and performed experiments of the Finnish or STORK cohorts. EM and EB analysed the data. KT-G, OY, YP, EP, JP, RC, MH and EH supervised sample collection for the primary and secondary cohorts. REL, BS, YL, EH and OK analysed data, provided valuable assets and supervised the research. OK is the guarantor of the research.

  • Funding This study was funded by the Israeli Ministry of Innovation, Science & Technology (grant number 3-15521), and the Israeli Ministry of Economy (Kamin grant number 62046). OK is supported by the European Research Council Consolidator grant (grant agreement no. 101001355).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.