Original research
Metagenomics of the faecal virome indicate a cumulative effect of enterovirus and gluten amount on the risk of coeliac disease autoimmunity in genetically at risk children: the TEDDY study
  1. Katri Lindfors1,
  2. Jake Lin1,2,
  3. Hye-Seung Lee3,
  4. Heikki Hyöty1,
  5. Matti Nykter1,
  6. Kalle Kurppa1,4,5,
  7. Edwin Liu6,7,
  8. Sibylle Koletzko8,9,
  9. Marian Rewers10,
  10. William Hagopian11,
  11. Jorma Toppari12,13,
  12. Annette-Gabriele Ziegler14,15,16,
  13. Beena Akolkar17,
  14. Jeffrey P Krischer3,
  15. Joseph F Petrosino18,
  16. Richard E Lloyd18,
  17. Daniel Agardh19
  18. the TEDDY Study Group
    1. 1 Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
    2. 2 Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
    3. 3 Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
    4. 4 Center for Child Health Research, Tampere University and Tampere University Hospital, Tampere, Finland
    5. 5 The University Consortium of Seinäjoki, Seinäjoki, Finland
    6. 6 University of Colorado Denver, Anschutz Medical Campus, Aurora, Colorado, USA
    7. 7 Digestive Health Institute, Children's Hospital Colorado, Aurora, United States
    8. 8 Ludwig-Maximilians-Universitat Munchen, Munchen, Bayern, Germany
    9. 9 Division of Paediatric Gastroenterology and Hepatology, Dr von Hauner Children's Hospital, Munchen, Germany
    10. 10 Barbara Davis Center for Childhood Diabetes, University of Colorado Denver, Denver, Colorado, USA
    11. 11 Pacific Northwest Research Institute, Seattle, Washington, USA
    12. 12 Research Centre for Integrative Physiology and Pharmacology, Institute of Biomedicine, University of Turku, Turku, Finland
    13. 13 Department of Paediatrics, Turku University Hospital, Turku, Finland
    14. 14 Klinikum Rechts der Isar, Technische Universität München, Munchen, Bayern, Germany
    15. 15 Institute of Diabetes Research, Helmholtz Zentrum München, Germany
    16. 16 Forschergruppe Diabetes e.V, Neuherberg, Germany
    17. 17 National Institute of Diabetes and Digestive and Kidney Disease, National Institutes of Health, Bethesda, Maryland, USA
    18. 18 Alkek Center for Metagenomics and Microbiome Research, Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas, USA
    19. 19 The Diabetes and Celiac Disease Unit, Department of Clinical Sciences, Lund University, Lund, Sweden
    1. Correspondence to Katri Lindfors, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; katri.lindfors{at}


    Objective Higher gluten intake, frequent gastrointestinal infections and adenovirus, enterovirus, rotavirus and reovirus have been proposed as environmental triggers for coeliac disease. However, it is not known whether an interaction exists between the ingested gluten amount and viral exposures in the development of coeliac disease. This study investigated whether distinct viral exposures alone or together with gluten increase the risk of coeliac disease autoimmunity (CDA) in genetically predisposed children.

    Design The Environmental Determinants of Diabetes in the Young study prospectively followed children carrying the HLA risk haplotypes DQ2 and/or DQ8 and constructed a nested case–control design. From this design, 83 CDA case–control pairs were identified. Median age of CDA was 31 months. Stool samples collected monthly up to the age of 2 years were analysed for virome composition by Illumina next-generation sequencing followed by comprehensive computational virus profiling.

    Results The cumulative number of stool enteroviral exposures between 1 and 2 years of age was associated with an increased risk for CDA. In addition, there was a significant interaction between cumulative stool enteroviral exposures and gluten consumption. The risk conferred by stool enteroviruses was increased in cases reporting higher gluten intake.

    Conclusions Frequent exposure to enterovirus between 1 and 2 years of age was associated with increased risk of CDA. The increased risk conferred by the interaction between enteroviruses and higher gluten intake indicates a cumulative effect of these factors in the development of CDA.

    • coeliac disease
    • gluten
    • small bowel

    This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

    Significance of this study

    What is already known on this subject?

    • Intake of high amounts of gluten increases the risk of coeliac disease in genetically predisposed children.

    • Gastrointestinal infections have been associated with an increased risk of coeliac disease.

    What are the new findings?

    • The prospective metagenomics screening of the stool virome shows that the cumulative number of stool enteroviral exposures between 1 and 2 years of age is associated with an increased risk for coeliac disease autoimmunity.

    • There is an interaction between cumulative enteroviral exposures between 1 and 2 years of age and cumulative gluten intake by 2 years of age in relation to the risk of coeliac disease autoimmunity.

    • The effect of enteroviruses on the risk of coeliac disease autoimmunity is higher when greater amounts of gluten are consumed.

    How might it impact on clinical practice in the foreseeable future?

    • This study suggests that enteroviral infections early in life are potentiated by high gluten intake, which may trigger the development of coeliac disease autoimmunity in genetically predisposed children.

    • More studies are needed to evaluate potential pathogenetic mechanisms of this interaction, which could offer new opportunities for the development of preventive strategies for coeliac disease.

    Introduction


    The incidence of autoimmune diseases is rising more rapidly than can be explained by genetics, supporting the role of environmental factors in the disease pathogenesis.1 Coeliac disease, a dietary gluten-driven chronic small bowel enteropathy, is characterised by an autoimmune response against tissue transglutaminase (tTG). The main autoantigen in coeliac disease is tTG, which post-translationally deamidates gluten-derived gliadin peptides.2 Coeliac disease autoimmunity (CDA), which refers to the appearance of serum tTG autoantibodies, is indicative of ongoing gluten-induced inflammatory response and may precede small bowel mucosal damage.3 4

    Higher gluten intake increases the risk of coeliac disease.5 6 However, it is not clear why not all genetically predisposed HLA-DQ2 and/or DQ8 positive individuals who eat gluten develop the disease. As a result, the reasons leading to the breakage of oral tolerance to gluten in coeliac disease remain unresolved. The possible role of infections in the development of coeliac disease has been supported by previous observational studies.7–10 These investigations have been extended by other studies pointing to a potential role of viral infections, particularly by adenovirus, enterovirus, rotavirus and reovirus, in the disease pathogenesis.10–15 However, these findings are mainly based on cross-sectional or experimental studies. Thus, prospective studies are warranted to clarify whether a virus or a group of viruses is involved in the aetiology of coeliac disease. More importantly, few studies have investigated a possible interaction between viral infections and gluten intake.

    This study investigated the viral exposures prior to development of CDA from serial stool samples collected from children followed in a prospective birth cohort at genetic risk for both type 1 diabetes (T1D) and coeliac disease. By examining next-generation sequencing (NGS) metagenomics data, we assessed if viruses detected in the stool are associated with CDA and whether this association involved a possible interaction with gluten intake.

    Materials and methods

    The nested case–control (NCC) study design

    Following new-born screening for high-risk HLA-DR-DQ genotypes, The Environmental Determinants of Diabetes in the Young (TEDDY) enrolled 8676 children before 4.5 months of age for a 15-year follow-up study with the main aim of identifying genetic and environmental triggers associated with T1D and coeliac disease.16 Annual screening for CDA started at the age of 2 years by detection of tTG autoantibodies using radiobinding assays, as previously described.17 If a sample was tTG-autoantibody positive, all of the child’s earlier available samples were tested to determine the age of seroconversion. CDA was defined as being positive for tTG autoantibodies in two consecutive samples at least 3 months apart.18

    From this cohort, two nested case–control (NCC) studies were constructed to improve the efficiency of multiple biomarker studies, with one focused on islet autoimmunity (IA) and the other on T1D. Cases and controls were identified as of 31 May 2012, then all available samples meeting the design criteria by that time were processed in the laboratories chosen for each biomarker analysis. Case–control pairs were matched for T1D family history (defined as having a first-degree relative with T1D), gender and clinical site location in the region where the participant was enrolled.19 All children in the 1:1 NCC studies for gut virome analysis that had been screened for CDA were considered for the present study. Each case–control pair included a CDA positive child (‘case’) matched with a child (‘control’) who was CDA-free for at least 6 months from the CDA case’s age of seroconversion.

    In total, 88 CDA case–control pairs were identified. Among those, only the 83 pairs (44 female and 39 male) whose stool virome data were available after the introduction of gluten into their diet were included in the final analysis (figure 1, table 1). The distribution by country was as follows: USA (n=25), Finland (n=13), Germany (n=5) and Sweden (n=40). There were 16 pairs with a family history of T1D. Of the CDA cases, 41 were confirmed for IA positivity and 31 of these developed IA prior to CDA. Of the controls, 11 were IA positive. During the follow-up, 28 of the CDA cases developed coeliac disease. Six CDA cases and three controls had a first-degree relative with coeliac disease. As for rotavirus vaccination status, 15 CDA cases and 20 controls were vaccinated. Other characteristics of the cases and controls are shown in table 1.

    Figure 1

    Flow chart describing the selection of the CDA case–control pairs for the study. CDA, coeliac disease autoimmunity; IA, islet autoantibody; NCC, nested case–control; T1D, type 1 diabetes.

    Table 1

    Demographic data of the 83 nested case and control pairs

    Detection of viral stool sequences

    Stool samples were collected monthly from 3 months until 2 years of age. NGS viral sequences were assayed in serial stool samples using a custom offline version of Vipie.20 The Vipie virus population profiling pipeline includes standard scripts for base quality control, trimming and chimaera detection, and de novo assembled contigs were generated by integrating the local assembly methods SPAdes and Velvet.21 22 These contigs were mapped to the NCBI virus database using BLAST,23 resulting in a sample-based general virus population profile. For the present study, we focused on enterovirus, adenovirus, astrovirus, norovirus, reovirus and rotavirus. To approximate serotype and increase specificity, viral structural capsid-specific remapping of positive NGS samples was performed for enteroviruses and adenoviruses. For this analysis, a MAPQ alignment cut-off of 20, representing a probability of 0.99 or greater that the alignment is correct, was applied. The capsid resource contains GenBank and Tampere Virology Group-selected strains of adenovirus hexon, fibre and penton regions as well as enterovirus strains Coxsackievirus A, Coxsackievirus B and selected Echovirus P1 protein (VP1-4) regions. This study included the 1507 samples processed after the introduction of gluten into the diet.
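
    The relation between the MAPQ cut-off and the stated probability can be illustrated with a short sketch. It assumes the standard Phred-scaled definition of mapping quality (MAPQ = -10 log10 of the probability that the alignment is wrong); the function names are hypothetical and are not part of Vipie or of the study's pipeline.

```python
# Illustrative sketch only: assumes the standard Phred-scaled MAPQ definition.
# Function names are hypothetical and do not come from Vipie.

MAPQ_CUTOFF = 20  # cut-off reported in the text


def mapq_to_correct_probability(mapq: float) -> float:
    """Return the probability that an alignment with this MAPQ is correct."""
    error_probability = 10.0 ** (-mapq / 10.0)
    return 1.0 - error_probability


def passes_cutoff(mapq: float, cutoff: float = MAPQ_CUTOFF) -> bool:
    """Keep an alignment only if its MAPQ is at or above the cut-off."""
    return mapq >= cutoff


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(q, round(mapq_to_correct_probability(q), 4), passes_cutoff(q))
```

    Under this definition, a MAPQ of 20 corresponds to an error probability of 0.01, so alignments retained at the cut-off have at least a 0.99 probability of being placed correctly.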

    Dietary data from food records

    Up to 2 years of age, information on breastfeeding and the timing of the introduction of gluten-containing cereals was collected using validated questionnaires at each clinic visit, which occurred every 3 months.5 Information on gluten consumption was collected at each clinic visit every 3 months up to 1 year of age and biannually thereafter (a 24-hour recall at the 3-month visit and 3-day food records subsequently). The amount of gluten intake was calculated by multiplying the amount of vegetable protein in gluten-containing flours by a factor of 0.8.24 From the 3-day food records, daily consumption (g/day) was obtained as the mean of the 3 days of consumption.
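
    The gluten calculation described above can be sketched as follows. This is a minimal illustration assuming that each food record supplies the grams of vegetable protein from gluten-containing flours per recorded day; the function names and the example values are hypothetical, not study data.

```python
# Illustrative sketch only: names and example values are hypothetical.

GLUTEN_FACTOR = 0.8  # conversion factor stated in the text


def gluten_g(vegetable_protein_g: float) -> float:
    """Convert vegetable protein from gluten-containing flours (g) to gluten (g)."""
    return vegetable_protein_g * GLUTEN_FACTOR


def mean_daily_gluten(protein_g_per_day: list[float]) -> float:
    """Mean daily gluten intake (g/day) over the recorded days of one food record."""
    if not protein_g_per_day:
        raise ValueError("food record contains no days")
    total = sum(gluten_g(p) for p in protein_g_per_day)
    return total / len(protein_g_per_day)


if __name__ == "__main__":
    # Hypothetical 3-day record: grams of vegetable protein per day.
    record = [5.0, 10.0, 15.0]
    print(mean_daily_gluten(record))
```

    Averaging over the recorded days gives one g/day value per visit, which is the quantity the statistical analysis later sums across visits.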

    Statistical analysis

    The case–control pairs identified from the TEDDY NCC design were used to examine whether viral profiles differed by CDA status. Conditional logistic regression was used to compare the cumulative appearance of viral exposures, after adjusting for HLA. Viral exposures were categorised by age: <1 year of life and from 1 to <2 years of life. For pairs in which the case seroconverted prior to 2 years of age, only the samples available prior to the age of seroconversion were included in the analysis. Interaction with cumulative gluten intake on the risk of CDA was examined. The cumulative gluten intake was obtained as the sum of daily consumption (g/day) from all clinic visits by 2 years of age. No participant reported any gluten consumption at the 3-month visit. Additionally, the effects of the enterovirus sequence reads from 1 to <2 years of life were assessed in three groups by total gluten intake: low (<33rd percentile), middle (33rd–66th percentile) and high (>66th percentile), based on the unique individuals included in the analysis. Two-sided p-values are reported. A p-value <0.05 was considered statistically significant. All statistical analyses were performed using SAS V.9.4.
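
    To illustrate how conditional logistic regression uses 1:1 matched pairs, the sketch below fits a single-covariate model by Newton-Raphson. It is not the study's SAS analysis: it omits the HLA adjustment and standard errors, the function name is hypothetical and the example data are invented. For a matched pair, the conditional likelihood depends only on the within-pair difference in exposure between case and control.

```python
import math

# Illustrative sketch only: single covariate, 1:1 matched pairs, no HLA
# adjustment. The function name and the example data are hypothetical.


def fit_conditional_logit_1to1(case_x, control_x, iterations=25):
    """Estimate the log OR per unit of exposure from 1:1 matched pairs.

    The conditional likelihood of one pair is 1 / (1 + exp(-beta * d)),
    where d is the case exposure minus the control exposure.
    """
    diffs = [c - k for c, k in zip(case_x, control_x)]
    beta = 0.0
    for _ in range(iterations):
        grad = 0.0
        hess = 0.0
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-beta * d))
            grad += d * (1.0 - p)           # first derivative of the log-likelihood
            hess -= d * d * p * (1.0 - p)   # second derivative of the log-likelihood
        if hess == 0.0:
            break  # no discordant pairs, so beta is not identifiable
        beta -= grad / hess
    return beta


if __name__ == "__main__":
    # Hypothetical binary exposure: 6 pairs with only the case exposed,
    # 3 pairs with only the control exposed, 5 concordant pairs.
    case = [1] * 6 + [0] * 3 + [1] * 5
    control = [0] * 6 + [1] * 3 + [1] * 5
    beta = fit_conditional_logit_1to1(case, control)
    print(f"OR = {math.exp(beta):.2f}")
```

    With a binary exposure, the estimate reduces to the ratio of discordant pairs (here 6/3), which shows why concordant pairs contribute no information in a matched design.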


    Results

    Association of viral exposures with CDA

    The highest coverage of stool samples (available in 72.9% of the case–control pairs) was at 9 months, after which the number gradually declined; at 24 months, samples were available in 21.7% of the pairs (figure 2). Among the available stool samples at each collection age, the percentage of samples positive for any virus fluctuated between 22% and 50% with no obvious peaks at any collection age (figure 2A). The frequency of enterovirus-positive samples ranged from 0% to 21% from the age of 6 months onwards (figure 2B).

    Figure 2

    Stool samples positive for (A) any of the investigated viruses and (B) enteroviruses by 2 years of age as a percentage of samples available at each collection age. Filled triangles denote cases with CDA and unfilled circles controls. Bars represent the percentage of case–control pairs from whom stool samples were available for analysis at each collection age. CDA, coeliac disease autoimmunity.

    Between the time of first introduction of gluten at a median of 6 months of age and 1 year of age, 63 cases compared with 72 controls had at least one viral exposure (table 2). Enteroviruses and adenoviruses were detected in 17 and 56 cases compared with 19 and 65 controls, respectively. Reoviral exposures were detected in only one case and one control, and rotaviral exposures were detected in only one of the controls. The cumulative number of any viral exposures (OR 0.68, 95% CI 0.49 to 0.94, p=0.02), and specifically of adenoviral exposures (OR 0.69, 95% CI 0.48 to 0.99, p=0.04), was inversely associated with CDA when adjusting for HLA (table 2). None of the other individual viruses were associated with CDA.

    Table 2

    HLA-adjusted OR of cumulative virus detections in stool up to 1 year of age from a matched CDA case and control study

    Between 1 and 2 years of age, 59 cases compared with 58 controls had at least one positive stool sample with any of the selected viruses (table 3). Adenoviruses were detected in 52 cases compared with 55 controls, whereas enteroviruses were detected in 31 cases compared with 16 controls. Reovirus sequences were detected in one case, but none of the controls, while rotavirus sequences were not detected in any subjects. Cumulative number of positive stool samples for any virus was associated with an increased risk for CDA (OR 1.60, 95% CI 1.12 to 2.29, p=0.01). Of the different viruses, enteroviruses conferred the strongest positive association with CDA (OR 2.56, 95% CI 1.19 to 5.51, p=0.02) (table 3). When excluding the enteroviruses, the association of any viruses with the development of CDA was lost (OR 1.35, 95% CI 0.92 to 1.98, p=0.13). The ORs for the enterovirus B group were in the same direction when excluding the 42 pairs where either the case or control developed IA or T1D prior to seroconversion of CDA, although the reduced sample size did not have enough power to show differences between the groups (table 3).
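
    As a reading aid, a reported OR and 95% CI can be checked against the reported p-value with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is an assumption about how the intervals were constructed rather than something stated in the text; the function name is hypothetical.

```python
import math

# Illustrative sketch only: assumes a Wald interval symmetric on the log scale.


def wald_p_from_or_ci(odds_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Recover SE(log OR), the z statistic and a two-sided p-value from an OR and 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p


if __name__ == "__main__":
    # Enterovirus estimate reported in the text: OR 2.56, 95% CI 1.19 to 5.51.
    se, z, p = wald_p_from_or_ci(2.56, 1.19, 5.51)
    print(f"SE(log OR)={se:.3f}, z={z:.2f}, two-sided p={p:.3f}")
```

    Applied to the enterovirus estimate, this approximation gives a p-value that rounds to the reported 0.02.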

    Table 3

    HLA-adjusted OR of cumulative viral detections in stool between 1 and 2 years of age from a matched CDA case and control study

    Effects of feeding habits and viral exposures on the risk of CDA

    The cumulative amount of any viral infections or of enteroviral infections during the period after gluten introduction, while breastfeeding was still ongoing, was not associated with CDA (OR 1.15, 95% CI 0.72 to 1.82, p=0.56 and OR 0.98, 95% CI 0.43 to 2.21, p=0.96, respectively). When restricting the analysis to the period after the end of any breastfeeding and to stool samples collected between 1 and 2 years of age, when the children were exposed to gluten, both the cumulative amount of any virus sequence reads (OR 1.41, 95% CI 1.00 to 2.00, p=0.05) and of enterovirus sequence reads (OR 2.47, 95% CI 1.12 to 5.48, p=0.03) were associated with CDA.

    There was a significant interaction between enteroviruses detected between 1 and 2 years of age and the cumulative gluten intake by 2 years of age in the risk of CDA (p=0.03) (table 4). The risk of CDA was highest among enterovirus-positive children who had the highest cumulative gluten intake by 2 years of age (OR 8.3, 95% CI 1.8 to 37.1), as compared with those consuming middle or low amounts (OR 2.9, 95% CI 1.2 to 7.1 and OR 1.0, 95% CI 0.4 to 2.8, respectively) (figure 3).

    Table 4

    HLA-adjusted interactions of specific viruses or virus serotypes between 1 and 2 years of age with the cumulative amount of ingested gluten up to the age of 2 years

    Figure 3

    Effect of the enteroviral exposures between 1 and 2 years of age and risk of coeliac disease autoimmunity stratified by cumulative gluten consumption up to 2 years of age.

    Discussion


    This study showed that the cumulative number of stool enteroviral exposures between 1 and 2 years of age was associated with CDA. In addition, an interaction between enteroviral exposures and gluten intake was observed. More importantly, the risk of CDA was increased in cases reporting higher intake of gluten. These results indicate that enteroviral exposure augmented by higher gluten intake could act as triggers of coeliac disease in genetically at-risk children.

    Only a few previous studies have reported an association between enteroviruses and coeliac disease, one of which found an increased number of tTG autoantibody positive subjects among individuals with a proven enteroviral infection.25 In addition, enteroviruses have previously been detected in the small bowel mucosa of coeliac disease patients.26 Moreover, previous studies reported that enteroviral infections prior to 1 year of age were not associated with an increased risk of coeliac disease, whereas those occurring after the age of 1 year increased the risk.12 27 Our findings are consistent with both of these previous studies. Additionally, in line with the Norwegian study,12 our results showed that enteroviruses increased the risk of CDA only when restricting the analysis to the period when breastfeeding had ceased, but not while breastfeeding was still ongoing. As breastfeeding does not seem to be associated with the development of coeliac disease,28 29 this finding might indicate that enteroviral infections modulate CDA risk at an age when the child is no longer breastfed.

    This study extends previous investigations by showing an interaction between cumulative enteroviral exposures between 1 and 2 years of age and the cumulative amount of gluten intake by 2 years of age. In addition, enteroviruses were associated with a high risk particularly among children reporting higher gluten intake, indicating that the gluten amount could amplify the effect of enteroviruses in the development of CDA. As comprehensive BLAST searches of the enterovirus contigs compiled from the NGS-derived enterovirus sequence reads did not reveal hits to gluten (data not shown), molecular mimicry likely does not account for the additive risk effect. A distinct reovirus isolate, T1L, has recently been shown to abrogate oral tolerance to gluten by a mechanism involving type 1 interferon-induced activation of gluten-specific inflammatory T cells and inhibition of regulatory T cells.15 30 In addition, the T1L reovirus infection was reported to activate tTG.15 As enteroviruses have been shown to induce type 1 interferons,31 it could be speculated that enteroviruses may promote the development of coeliac disease by a mechanism involving an enterovirus-induced, type 1 interferon-mediated breakage of oral tolerance to gluten coupled with activation of tTG, similar to reovirus. Higher gluten intake will result in more gliadin peptides available for tTG-mediated deamidation, resulting in augmented activation of the immune system and ultimately leading to chronic gut inflammation.

    According to our results, cumulative adenoviral infections after gluten introduction up to 1 year of age were associated with a reduced risk for CDA, suggesting a putatively protective role. Our finding thus contradicts previous results that have pointed to either an inductive role11 32 33 or no role at all.12 34 35 Since none of the previous studies focused on the age window before 1 year of age, our finding of a possible protective effect of adenoviral infections warrants further research. However, if additional studies confirm the protective effect of adenovirus infections prior to 1 year of age, the diverging findings of negative and positive associations before and after 1 year of age raise the idea of a more generalised effective time-window effect. Contributing factors might involve the loss of immune protection from maternal antibodies as well as changes in feeding habits, other than breastfeeding, that occur around 1 year of age.

    Previous studies based on vaccination data or determination of rotavirus-targeted antibodies have suggested that rotaviruses could trigger coeliac disease.10 13 14 36 Other studies applying animal models or circulating reovirus antibodies indicate a role of reoviruses in the disease pathogenesis.15 30 In the present study, rotavirus and reovirus sequence reads were detected in a minority of the stool samples. In the case of rotavirus, this is consistent with previous reports using PCR for virus detection in the stool.37 Both rotavirus and reovirus are rapidly cleared from the stools after an infection.38 Thus, more frequent stool sampling and other methods would be required to detect short-lived or transiently occurring stool viruses in this study.

    A major strength of this study is the design of analysing viral exposures from prospectively collected stool samples from birth until 2 years of age in cases compared with matched controls. Another strength is that our analysis focuses on viral exposures detected using capsid protein mappings prior to development of CDA, which can delineate disease progression and timing on the study outcome. However, the CDA case–control pairs were identified among either IA or T1D NCC pairs, which resulted in a high percentage of IA positive subjects in our cohort. As enterovirus infections are associated with the induction of beta-cell autoimmunity,39 it is possible that the higher frequency of IA positive subjects among the CDA cases may have confounded the results. In addition, the variation in sequence coverage and the number of subject pairs restricted the identification of individual enterovirus types, which limits the analysis to species level. Another potential limitation was not adjusting for having a first-degree family member with coeliac disease. This adjustment was not carried out in order to avoid a selection bias due to more likely antibody screening of family members among the CDA cases.

    In conclusion, the present study found that the cumulative number of stool enteroviral exposures is associated with CDA in children at genetic risk for coeliac disease. In addition, the interaction of enteroviral exposures and higher gluten intake indicates a cumulative effect of these factors in the development of CDA.



    • KL and JL contributed equally.

    • Collaborators Aaron Barbour; Kimberly Bautista; Judith Baxter; Daniel Felipe-Morales; Kimberly Driscoll; Brigitte I Frohnert; Marisa Stahl; Patricia Gesualdo; Michelle Hoffman; Rachel Karban; Jill Norris; Stesha Peacock; Hanan Shorrosh; Andrea Steck; Megan Stern; Erica Villegas; Kathleen Waugh; Olli G Simell; Annika Adamsson; Suvi Ahonen; Mari Åkerlund; Leena Hakola; Anne Hekkala; Henna Holappa; Anni Ikonen; Jorma Ilonen; Sinikka Jäminki; Sanna Jokipuu; Leena Karlsson; Jukka Kero; Miia Kähönen; Mikael Knip; Minna-Liisa Koivikko; Merja Koskinen; Mirva Koreasalo; Jarita Kytölä; Tiina Latva-aho; Maria Lönnrot; Elina Mäntymäki; Markus Mattila; Maija Miettinen; Katja Multasuo; Teija Mykkänen; Tiina Niininen; Sari Niinistö; Mia Nyblom; Sami Oikarinen; Paula Ollikainen; Zhian Othmani; Sirpa Pohjola; Petra Rajala; Jenna Rautanen; Anne Riikonen; Eija Riski; Miia Pekkola; Minna Romo; Satu Ruohonen; Satu Simell; Maija Sjöberg; Aino Stenius; Päivi Tossavainen; Mari Vähä-Mäkilä; Sini Vainionpää; Eeva Varjonen; Riitta Veijola; Irene Viinikangas; Suvi M Virtanen; Jin-Xiong She; Desmond Schatz; Diane Hopkins; Leigh Steed; Jennifer Bryant; Katherine Silvis; Michael Haller; Melissa Gardiner; Richard McIndoe; Ashok Sharma; Stephen W Anderson; Laura Jacobsen; John Marks; Ezio Bonifacio; Anita Gavrisan; Cigdem Gezginci; Anja Heublein; Verena Hoffmann; Sandra Hummel; Andrea Keimer; Annette Knopff; Charlotte Koch; Claudia Ramminger; Roswith Roth; Marlon Scholz; Joanna Stock; Katharina Warncke; Lorena Wendel; Christiane Winkler; Åke Lernmark; Carin Andrén Aronsson; Maria Ask; Rasmus Bennet; Corrado Cilio; Helene Engqvist; Emelie Ericson-Hallström; Annika Fors; Lina Fransson; Thomas Gard; Monika Hansen; Hanna Jisser; Fredrik Johansen; Berglind Jonsdottir; Silvija Jovic; Helena Elding Larsson; Marielle Lindström; Markus Lundgren; Marlena Maziarz; Maria Månsson-Martinez; Maria Markan; Jessica Melin; Zeliha Mestan; Caroline Nilsson; Karin Ottosson; Kobra Rahmati; Anita Ramelius; 
Falastin Salami; Anette Sjöberg; Birgitta Sjöberg; Malin Svensson; Carina Törn; Anne Wallin; Åsa Wimar; Sofie Åberg; Michael Killian; Claire Cowen Crouch; Jennifer Skidmore; Masumeh Chavoshi; Rachel Hervey; Rachel Lyons; Arlene Meyer; Denise Mulenga; Jared Radtke; Matei Romancik; Davey Schmitt; Sarah Zink; Dorothy Becker; Margaret Franciscus; MaryEllen Dalmagro-Elias Smith; Ashi Daftary; Mary Beth Klein; Chrystal Yates; Sarah Austin-Gonzalez; Maryouri Avendano; Sandra Baethke; Rasheedah Brown; Brant Burkhardt; Martha Butterworth; Joanna Clasen; David Cuthbertson; Stephen Dankyi; Christopher Eberhard; Steven Fiske; Jennifer Garmeson; Veena Gowda; Kathleen Heyman; Belinda Hsiao; Christina Karges; Francisco Perez Laras; Qian Li; Shu Liu; Xiang Liu; Kristian Lynch; Colleen Maguire; Jamie Malloy; Cristina McCarthy; Aubrie Merrell; Hemang Parikh; Ryan Quigley; Cassandra Remedios; Chris Shaffer; Laura Smith; Susan Smith; Noah Sulman; Roy Tamura; Dena Tewey; Michael Toth; Ulla Uusitalo; Kendra Vehik; Ponni Vijayakandipan; Keith Wood; Jimin Yang; Michael Abbondondolo; Lori Ballard; David Hadley; Wendy McLeod; Steven Meulemans; Liping Yu; Dongmei Miao; Polly Bingley; Alistair Williams; Kyla Chandler; Olivia Ball; Ilana Kelland; Sian Grace; Masumeh Chavoshi; Jared Radtke; Sarah Zink; Nadim J Ajami; Matthew C Ross; Jacqueline L O’Brien; Diane S Hutchinson; Daniel P Smith; Matthew C Wong; Xianjun Tian; Tulin Ayvaz; Auriole Tamegnon; Nguyen Truong; Hannah Moreno; Lauren Riley; Eduardo Moreno; Tonya Bauch; Lenka Kusic; Ginger Metcalf; Donna Muzny; HarshaVArdhan Doddapaneni; Richard Gibbs; Sandra Ke; Niveen Mulholland; Kasia Bourcier; Thomas Briese; Suzanne Bennett Johnson; Eric Triplett.

    • Contributors Study concept and design: JL, HH, REL, MN, KK, EL, SK, MR, WH, JT, AGZ, BA, JPK, JFP, KL, DA; administrative, technical and material support: WH, JT, BA, JPK; acquisition, analysis and interpretation of the data: JL, HSL, HH, REL, MN, KK, EL, SK, WH, JT, AGZ, BA, JPK, JFP, KL, DA; statistical analysis: JL, HSL, MN, JPK, JFP; drafting of the manuscript: JL, HH, KL, DA; obtaining funding: MR, WH, JT, AGZ, BA, JPK; study supervision: HH, MR, WH, JT, AGZ, BA, JPK, DA. All authors have participated in the critical revision of the manuscript for important intellectual content.

    • Funding The TEDDY Study was funded by grants U01 DK63829, U01 DK63861, U01 DK63821, U01 DK63865, U01 DK63863, U01 DK63836, U01 DK63790, UC4 DK63829, UC4 DK63861, UC4 DK63821, UC4 DK63865, UC4 DK63863, UC4 DK63836, UC4 DK95300, UC4 DK100238, UC4 DK106955, UC4 DK112243, UC4 DK117483, and Contract No. HHSN267200700014C from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institute of Allergy and Infectious Diseases (NIAID), Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institute of Environmental Health Sciences (NIEHS), Centers for Disease Control and Prevention (CDC) and JDRF. This work was supported in part by the NIH/NCATS Clinical and Translational Science Awards to the University of Florida (UL1 TR000064) and the University of Colorado (UL1 TR001082).

    • Competing interests HH is a shareholder and chairman of the board of Vactech Ltd, which develops vaccines against picornaviruses.

    • Patient consent for publication Not required.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Data availability statement De-identified datasets generated and analysed during the current study will be made available by request from the NIDDK Central Repository at The TEDDY metagenomics next-generation sequencing (NGS) data that support the findings of this study will be available by request from NCBI’s database of Genotypes and Phenotypes (dbGaP) with the primary accession code phs001442.