Original article
Geographical patterns of the standing and active human gut microbiome in health and IBD
  1. Ateequr Rehman1,
  2. Philipp Rausch2,3,
  3. Jun Wang2,3,
  4. Jurgita Skieceviciene1,4,
  5. Gediminas Kiudelis5,
  6. Ketan Bhagalia6,
  7. Deepak Amarapurkar6,
  8. Limas Kupcinskas4,5,
  9. Stefan Schreiber1,7,
  10. Philip Rosenstiel1,
  11. John F Baines2,3,
  12. Stephan Ott7
  1. 1Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
  2. 2Max Planck Institute for Evolutionary Biology, Plön, Germany
  3. 3Institute for Experimental Medicine, Christian-Albrechts-University of Kiel, Kiel, Germany
  4. 4Institute for Digestive Research, Medical Academy, Lithuanian University of Health Sciences, Kaunas, Lithuania
  5. 5Department of Gastroenterology, Medical Academy, Lithuanian University of Health Sciences, Kaunas, Lithuania
  6. 6Bombay Hospital and Medical Research Center, Mumbai, India
  7. 7Department of General Internal Medicine, Christian-Albrechts-University of Kiel, University Hospital Schleswig-Holstein, Kiel, Germany
  1. Correspondence to Professor Philip Rosenstiel; Schittenhelmstr. 12, Institut für Klinische Molekular Biologie Kiel D-24105, Germany; p.rosenstiel{at} Professor John F Baines; Arnold-Heller-Str. 3, Haus 17, Institut für Experimentelle Medizin, Kiel D- 24105, Germany; j.baines{at} Dr. Stephan Ott; Arnold-Heller-Str. 3, Haus 6, Klinik für Innere Medizin I Kiel D-24105, Germany;{at}


Objective A global increase of IBD has been reported, especially in countries that previously had low incidence rates. Although knowledge of the human gut microbiome is steadily increasing, limited information is available regarding its variation on a global scale. In the light of the microbial involvement in IBDs, we aimed to (1) identify shared and distinct IBD-associated mucosal microbiota patterns from different geographical regions including Europe (Germany, Lithuania) and South Asia (India) and (2) determine whether profiling based on 16S rRNA transcripts provides additional resolution, both of which may hold important clinical relevance.

Design In this study, we analyse a set of 89 mucosal biopsies sampled from individuals of German, Lithuanian and Indian origins, using bacterial community profiling of a roughly equal number of healthy controls, patients with Crohn's disease and UC from each location, and analyse 16S rDNA and rRNA as proxies for standing and active microbial community structure, respectively.

Results We find pronounced population-specific as well as general disease patterns in the major phyla and patterns of diversity, which differ between the standing and active communities. The geographical origin of samples dominates the patterns of β diversity with locally restricted disease clusters and more pronounced effects in the active microbial communities. However, two genera belonging to the Clostridium leptum subgroup, Faecalibacterium and Papillibacter, display consistent patterns with respect to disease status and may thus serve as reliable ‘microbiomarkers’.

Conclusions These analyses reveal important interactions of patients’ geographical origin and disease in the interpretation of disease-associated changes in microbial communities and highlight the added value of analysing communities on both the 16S rRNA gene (DNA) and transcript (RNA) level.

Significance of this study

What is already known on this subject?

  • IBD impacts microbial community structure in the gut.

  • Microbial communities differ between human populations, potentially driven by variation in genetic polymorphism, lifestyle and environmental conditions.

  • Diversity within and between bacterial communities changes over a lifetime.

What are the new findings?

  • Pathological community patterns observed in IBD are influenced by local, population-specific factors, but also show shared elements between the different cohorts.

  • The active bacterial component (rRNA) shows lower diversity and intercohort variation than the standing diversity (rDNA).

  • The bacterial communities investigated on the RNA level reveal stronger disease-associated patterns.

How might it impact on clinical practice in the foreseeable future?

  • Variation of mucosal microbial communities among human populations constitutes an important factor when considering the microbiome as a target for treatment of IBD.

  • The identification of disease-associated microbial patterns shared between distinct geographical regions might be specifically useful to tailor a set of ‘microbiomarkers’ for a molecular assessment of IBD.


While a steady increase of IBD has been observed over the past decades in North America and Europe, a dramatic rise in incidence rates is observed in countries that have recently adopted a Western industrialised lifestyle, for example, East and South Asia, or states of the former Soviet Union.1,2 Suspected environmental factors include increased levels of hygiene (decrease in antigen contacts), changes in nutritional habits and the industrialisation of food production and preservation.3–6 It can be assumed that all these environmental cues specifically act on the acquisition, composition and stability of the intestinal microbiome. Only recently have international studies begun to explore microbial communities in human populations at different body sites on broader geographical scales.7,8 A common concept in microbial biogeography proposes a world-wide, passive dispersal of bacteria, followed by environmental filtering of bacterial assemblies (ie, ‘everything is everywhere, but the environment selects’).9 Many studies addressed this hypothesis in an environmental context,10–12 but how those globally distinct environmental microbial assemblages are translated into the stable adult intestinal microbiome is still poorly understood. However, it seems obvious that physical distance and variables that correlate with this (eg, temperature), the genetic makeup of the host, diet and other sociocultural habits will play decisive roles. Whatever the exact reasons for geographical differences in human-associated microbial communities, they are likely important factors to be considered for understanding disease aetiologies among human populations. Qin et al13 were among the first to identify a common faecal microbial gene catalogue and to discern communities according to an underlying condition of IBD not restricted to a single human population. More recent in-depth analyses of faecal communities included divergent populations,8 but only recently focused on disease states.14

It can be hypothesised that the microbial communities tightly associated with the intestinal mucosa might be under stronger host control15 and less subject to transient perturbations compared with the luminal microbiota. Although they might have a greater impact on homeostasis, mucosal communities are comparatively understudied. In this study, we investigate the impact of the two major forms of IBD, Crohn's disease (CD) and UC, on the bacterial communities associated with the colonic mucosa in a geographical context. Colonic biopsies were obtained from patients with IBD and controls originating from Germany, Lithuania and India. While most previous studies focused on the 16S rDNA level, we employ bacterial community profiling on the levels of both 16S rDNA and rRNA. This comprehensive approach distinguishes between standing and active microbial communities, which together enables us to explore variation among geographically distinct microbiomes and also to relate these differences to the patterns observed in IBD.

Material and methods

Human samples

Colonic biopsies were taken from the sigmoid region of healthy subjects and patients in clinical remission. The diagnoses of UC and CD were based on standard clinical, endoscopic, radiological and histological criteria. All samples and phenotype information were pseudonymised before the procedure. All individuals agreed to participate by giving informed consent at least 24 h before sampling. Details on age, sex, disease status and medication are provided in table 1. Because mean age differed between population cohorts, age was normalised within each population by subtracting that population's minimum age, so that interpopulation comparisons account for this difference.
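The within-population age normalisation described above can be illustrated with a minimal Python sketch. The function name, the record layout and the example ages are assumptions made for illustration; they are not taken from table 1 or from the authors' scripts.

```python
from collections import defaultdict

def normalise_age_within_population(records):
    """Subtract each population's minimum age from the ages of its members.

    `records` is a list of (population, age) tuples (hypothetical layout).
    Returns a list of (population, normalised_age) tuples in the same order.
    """
    min_age = defaultdict(lambda: float("inf"))
    for population, age in records:
        min_age[population] = min(min_age[population], age)
    return [(population, age - min_age[population]) for population, age in records]

# Hypothetical ages, not study data.
example = [("GER", 25), ("GER", 40), ("IND", 19), ("IND", 31)]
normalised = normalise_age_within_population(example)
```

After this step the youngest subject of every cohort has a normalised age of zero, so a shared age term in a model is not driven by cohort-level offsets in age.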

Table 1

Patient information for each population

Nucleic acids extraction and 16S rRNA pyrosequencing

DNA and RNA were extracted using the Qiagen Allprep DNA/RNA kit as previously described (see online supplementary material).16 RNA was reverse transcribed to cDNA using random hexamers (Qiagen, Hilden, Germany). Nucleic acid extraction and reverse transcription of Indian samples were performed on-site in India. Reverse transcribed cDNA and genomic DNA were freeze dried and transported on dry ice to Germany for further processing. Frozen biopsies sampled in Lithuania were transported on dry ice to be processed in Kiel, Germany. The 16S rRNA gene was amplified from both cDNA (RNA level) and genomic DNA with the 27F-338R primer pair and sequenced as described before.17 Sequences were processed using Mothur V.1.15.0,18 and filtered using stringent quality criteria (see online supplementary methods).

Statistical analysis

α Diversity and β diversity indices (Jaccard and Bray–Curtis (square root transformed)) were calculated in R.19–21 FASTUniFrac was used to calculate the unweighted and normalised weighted UniFrac metrics.22 Statistical analysis of community distances was performed with non-parametric distance-based analysis of variance (ANOVA) using ‘adonis’. This analysis, Mantel correlation, Procrustes analysis and fitting of centroids were carried out with the ‘vegan’ package for R and tested with 10⁵ permutations to assess significance.23,24 Redundancy Analysis (RDA) was carried out on Hellinger-transformed Operational Taxonomic Unit (OTU) tables and tested using a permutative ANOVA approach.25 Comparisons of means (ie, phyla abundances, α diversity) followed a linear model framework using standard model selection procedures (minimising Akaike information criterion (AIC) values without a significant loss of fit) requiring normally distributed residuals. Indicator species analysis was implemented via the R package ‘indicspecies’ with 10⁵ permutations.26 The activity of genera and species was estimated through the rRNA/rDNA ratio, with divisions of and by zero set to zero. Differentially active bacteria were detected by Kruskal–Wallis tests. p Values of the genera/OTU associations (rDNA, rRNA, activity) were adjusted using the Benjamini and Hochberg procedure.
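Two of the steps above lend themselves to a short worked sketch: the rRNA/rDNA activity ratio with its zero-handling rule, and the Benjamini and Hochberg adjustment of p values. The Python below is an illustration under stated assumptions, not the authors' R code; the function names are hypothetical, and the adjustment is written out by hand rather than taken from a statistics library.

```python
def activity_ratio(rrna_count, rdna_count):
    """rRNA/rDNA ratio; divisions of zero and by zero are set to zero."""
    if rrna_count == 0 or rdna_count == 0:
        return 0.0
    return rrna_count / rdna_count

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p values for four taxa.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Setting undefined ratios to zero treats a taxon that is absent from either dataset as inactive, which keeps every taxon in the comparison at the cost of not distinguishing "absent" from "dormant".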


Phylum abundances are influenced by disease status and sampling population

To investigate the influence of IBD on the mucosa-associated microbiota in a broader geographical context, sigmoidal biopsies were obtained from ∼10 each of healthy controls, patients with CD and UC, residing in Germany, Lithuania and India, totalling 89 samples (cohort details in table 1). Pyrosequencing of the V1–V2 region of the 16S rRNA gene was performed on the level of both 16S rDNA and rRNA (reverse transcribed to cDNA, see the Materials and methods section). Normalisation (∼1000 sequences per individual) yielded 88000 rDNA and 86974 rRNA sequences. A single control rDNA sample from Germany and single control and CD rRNA samples from Lithuania (ie, in total three samples) were not included in further analysis due to low sequencing coverage. Species-level OTUs (97% identity OTUs) were clustered using the combined rDNA and rRNA-level datasets, and split accordingly. This resulted in a community coverage of 83.45±5.08% and 89.22±5.54% of species for rDNA and rRNA, respectively (Good's coverage, see online supplementary figure S1).
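As an illustration of the two quantities mentioned above, the following Python sketch subsamples a sample to a fixed sequencing depth and computes Good's coverage as one minus the fraction of reads that belong to singleton OTUs. The OTU names, counts and the seeded random subsampling are assumptions for illustration; the study performed its processing in Mothur, and this sketch does not reproduce that pipeline.

```python
import random
from collections import Counter

def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a dict of OTU counts."""
    reads = [otu for otu, count in otu_counts.items() for _ in range(count)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))

def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total reads)."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    singletons = sum(1 for count in otu_counts.values() if count == 1)
    return 1.0 - singletons / total

# Hypothetical sample with 20 reads, two of which are singletons.
sample = {"OTU_a": 10, "OTU_b": 5, "OTU_c": 1, "OTU_d": 1, "OTU_e": 3}
coverage = goods_coverage(sample)
subsample = rarefy(sample, depth=8)
```

In this toy sample the coverage is 1 − 2/20 = 0.9, meaning that roughly one read in ten came from an OTU seen only once.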

We first analysed phylum abundances in a global manner (ie, across all three populations), whereby complex differences between standing (rDNA) and active (rRNA) communities were observed for most phyla (see online supplementary figures S2 and S3). Overall, Bacteroidetes and Proteobacteria show inverse effects among the active and standing microbial communities and are negatively associated with each other (rDNA: r=−0.527, p=1.33×10⁻⁷, rRNA: r=−0.254, p=0.0176). Bacteroidetes show a significant increase with age in the rDNA samples (figure 1A), whereas the rRNA samples further reveal influences of disease status on the abundance of active Bacteroidetes, mainly by a higher abundance in UC samples across populations (figure 1B, see online supplementary table S1). Proteobacteria abundance decreases with age in the rDNA-based samples (figure 1E), inversely with Bacteroidetes. The abundance of active Proteobacteria in contrast does not decrease with age, but displays a decrease in patients with UC compared with patients with CD and healthy samples, which is also influenced by the subject's gender (figure 1F, see online supplementary table S1). The Firmicutes abundances based on rDNA mainly display differences between healthy controls and patients with UC across populations, especially among German and Lithuanian samples (figure 1C), which is confirmed in separate analyses for each population (see below, see online supplementary table S2). The Firmicutes abundances based on rRNA show significant differences between European (Germany, Lithuania) and Indian samples, as well as between pathologies within and among the sampling cohorts (figure 1D, see online supplementary table S1).

Figure 1

Comparative analysis of mucosa-attached bacterial communities at the phylum level. Plots of phyla abundances based on 16S rDNA (A, C, E) and rRNA (B, D, F) visualise the effects of the best statistical model (Firmicutes, Bacteroidetes, Proteobacteria, error bars represent SD). CD, Crohn's Disease; CON, control; GER, Germany; IND, India; LIT, Lithuania.

Second, we analysed each single population separately. This reveals pronounced differences especially in Firmicutes between disease groups within each population, based on both rRNA and rDNA, although the relative phylum-level patterns between investigated groups are not consistent among populations (see online supplementary table S2). In particular, Firmicutes abundance is the lowest in healthy German samples compared with diseased individuals, while Lithuanian and Indian patients with CD show the lowest Firmicutes abundances. Bacteroidetes, on the other hand, show common patterns of age and pathology in Lithuanian and Indian patients but not in Germans. Bacteroidetes also show a population-independent increase in abundance in the standing and active bacteria among healthy and UC subjects. Proteobacteria also display an increased abundance in CD among Lithuanians and Indians, while no apparent effects were present in German samples (see online supplementary table S2).

In summary, we revealed interesting age-related patterns for both Bacteroidetes and Proteobacteria, while population-specific disease-related patterns are present among the Firmicutes. Furthermore, basing analyses on 16S rRNA in general provided greater resolution in detecting disease and population-specific effects.

Patterns of bacterial diversity within and between individuals are influenced by age, population-specific effects and disease

α Diversity

We focused our analysis on a panel of diversity measures which provide information about the approximate species number,27 entropy and evenness of the community,19 as well as its phylogenetic diversity.20 Interestingly, we find significantly higher diversity in rDNA-based samples and a moderate correlation between the species diversities of the standing and active communities (figure 2A–C).
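For readers less familiar with these measures, the following Python sketch shows how species richness (Chao1) and Shannon entropy, expressed as an effective number of species in the sense of Jost, can be computed from the OTU counts of one sample. It is a minimal illustration with hypothetical function names; the source does not state which Chao1 variant was used, so the classic estimator with the usual fallback for samples without doubletons is an assumption. Phylogenetic diversity is omitted because it requires a phylogenetic tree.

```python
import math

def chao1(counts):
    """Chao1 richness estimate from a list of per-OTU read counts.

    Uses S_obs + F1^2 / (2 * F2); when no doubletons are present it falls
    back to S_obs + F1 * (F1 - 1) / 2 (assumed variant, see lead-in).
    """
    counts = [c for c in counts if c > 0]
    observed = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    if f2 > 0:
        return observed + f1 * f1 / (2.0 * f2)
    return observed + f1 * (f1 - 1) / 2.0

def shannon_entropy(counts):
    """Shannon entropy H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def effective_species(counts):
    """Effective number of species, exp(H)."""
    return math.exp(shannon_entropy(counts))

# Hypothetical sample: 6 observed OTUs, 2 singletons, 2 doubletons.
richness = chao1([5, 3, 1, 1, 2, 2])
```

For a perfectly even community of four OTUs the effective number of species equals four, which is what makes the exponential form easier to interpret than raw entropy.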

Figure 2

Analysis of mucosa-attached bacterial communities identifies a common increase of bacterial diversity with age regardless of diagnosis and geographical origin. Correlation of α diversity metrics based on rDNA and rRNA (Chao1 species richness: r=0.493, p=1.97×10⁻⁶ (A); Shannon H (Jost): r=0.609, p=9.223×10⁻¹¹ (B); phylogenetic diversity: r=0.355, p=0.001 (C)). Species richness according to the best statistical model in rDNA (D) and rRNA (E) derived communities (table 2; for details on Shannon H and phylogenetic diversity, see online supplementary figure S4). CD, Crohn's Disease; CON, control; GER, Germany; IND, India; LIT, Lithuania.

First, we analysed the panel of α diversity indices globally among all samples. Investigating species richness (using Chao1 index), we observe increases of species number with age in the standing microbial community (rDNA), while species richness in the active communities (rRNA) increases with age and shows significantly lower diversity among patients with CD (figure 2D, E, table 2). By applying Shannon entropy,19 which represents the distribution of species in a sample, we also find an increase in diversity with age in the rDNA-based and rRNA-based communities (see online supplementary figure S4A, table 2). Phylogenetic diversity of the standing community is also correlated with a subject's age, but increases only in patients with CD (see online supplementary figure S4C, table 2). The rRNA-based samples display differences between CD and healthy controls, between CD and UC, as well as between European and Indian samples (see online supplementary figures S4B,D, S5 and table S3; also see table 2), with the highest level of species and phylogenetic diversity among healthy individuals.

Table 2

Statistical analyses of α diversity based on species distribution (Shannon H), richness (Chao1) and phylogenetic diversity in DNA-based and RNA-based samples

The increase of community diversity with age can be a sign of community succession, that is, a change in community structure over time, or a lack of colonisation resistance. To further investigate potential confounding effects of disease on those succession patterns, we analysed each disease state and population cohort separately. Interestingly, the strongest signal of succession is present in the active and standing bacterial communities of patients with CD, while in healthy individuals and patients with UC, diversity does not consistently increase with age (see online supplementary figures S5, S6 and supplementary table S3). Thus, in summary, the rRNA-based samples display reduced diversity compared with rDNA-based samples, and at the same time provide more resolution to detect influences of sampling region and disease compared with rDNA. The age-related patterns of increasing species diversity appear to be largely limited to CD and may point towards a reduced colonisation resistance of the disturbed microbial communities in IBD. Further, although systematic differences are present between geographical locations, a consistent pattern with respect to disease status nested within each location, based on rRNA, is that the diversity decreases from healthy individuals, followed by patients with UC and is the lowest in patients with CD.

β Diversity

To further evaluate the contribution of geographical origin, disease status and their interactions, we performed analyses based on the phylogenetic β diversity measure UniFrac (weighted and unweighted), as well as on metrics considering the shared presence (Jaccard) or abundance (Bray–Curtis) of species level OTUs. First, we applied a model including each factor on all β diversity metrics using non-parametric multivariate ANOVA (‘adonis’, see the Materials and methods section). Second, we complemented these analyses with individual pairwise comparisons with respect to only population of origin and disease status (table 3). The analyses show that population is the most influential factor, displaying significant differences especially between European and Indian samples, for all four β diversity measures in rDNA-based and rRNA-based samples (table 3, see online supplementary figure S7). The influence of disease status alone is less apparent, with relatively small differences among the rDNA-based samples (table 3, see online supplementary figure S7A, C, E, G). By contrast, rRNA-based samples reveal significant influences of disease for all β diversity measures, and individual pairwise comparisons between health conditions also uncovered differences between the two pathologies (table 3, see online supplementary figure S7B, D, F, H). Interactions between population and disease were more pronounced than disease alone, displaying significant regional disease-associated communities based on presence/absence and abundance of bacteria in the rDNA-based and rRNA-based samples (see online supplementary figure S7A–D). 
Interestingly, in addition to the greater number of significant influences detected, consistently more variation is explained among the rRNA-based samples (table 3, also see online supplementary figure S7 and table S4), which are correlated with the standing community profiles (rDNA), but differ in the abundance of several genera (see online supplementary figure S8). Changes in community composition with respect to age were also observed, especially in the phylogenetic profile of rDNA-based and rRNA-based samples (table 3, see online supplementary figure S7).
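The two OTU-based distances used here can be sketched in a few lines of Python. The sketch assumes that Bray–Curtis is computed on square-root-transformed counts and that Jaccard is computed on presence/absence, as the descriptions in the text suggest; function names and the example vectors are hypothetical, and UniFrac is not covered because it needs a phylogenetic tree.

```python
import math

def bray_curtis_sqrt(u, v):
    """Bray-Curtis dissimilarity on square-root-transformed abundances."""
    su = [math.sqrt(x) for x in u]
    sv = [math.sqrt(x) for x in v]
    denominator = sum(su) + sum(sv)
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(su, sv)) / denominator

def jaccard_distance(u, v):
    """Jaccard distance on shared presence: 1 - |A and B| / |A or B|."""
    present_u = {i for i, x in enumerate(u) if x > 0}
    present_v = {i for i, x in enumerate(v) if x > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1.0 - len(present_u & present_v) / len(union)

# Hypothetical OTU counts for two samples over the same three OTUs.
sample_a = [4, 0, 9]
sample_b = [1, 1, 9]
bc = bray_curtis_sqrt(sample_a, sample_b)
jd = jaccard_distance(sample_a, sample_b)
```

The square-root transformation dampens the influence of highly abundant OTUs, so the Bray–Curtis value reflects moderately abundant taxa more than it would on raw counts.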

Table 3

β diversity analyses via non-parametric distance-based analysis of variance (adonis) using population, disease condition, their interaction, age and the pairwise comparisons among countries and disease conditions (bold face highlights significant comparisons)

To directly assess the influences of population origin and disease condition, we applied RDA to model these effects on bacterial communities using individual bacterial distributions among those environmental factors. To test whether local disease patterns are present, we included the interaction between disease and population in the RDA model, which indicates the presence of significant local disease effects in the standing and active microbial communities (figure 3A, C). The variation explained by these models is relatively small (rDNA: adjusted R²=0.042; rRNA: adjusted R²=0.102), which stresses the high interpopulation and intrapopulation variability of the microbiome. However, an interesting observation is the dominating influence of population and that common disease effects are observed only in higher, less important dimensions of the ordinations (see online supplementary figures S9 and S10). To explore whether general influences of IBD can be observed, we partialled out the influence of host population beforehand using partial Redundancy Analysis (pRDA). This revealed significant disease clusters in both datasets (rDNA: F(2,83)=1.183, p=0.040, R²=0.026; rRNA: F(2,82)=1.574, p=0.001, R²=0.033). Thus, these results highlight the importance of geographically restricted environmental factors driving microbial community differentiation, leaving a weak but universal disease imprint after correction for sampling population (figure 3B, D). Further analyses on the single populations and disease subsets support the existence of regional disease microbiomes, as similar pathologies differ in their microbial communities between sampling regions, which appear stronger in the active communities (see online supplementary table S4).
As medication, in particular antibiotic use, can influence microbial communities and confound these analyses, we further investigated these variables, but identified only minor effects within and among the different populations (see online supplementary analyses).
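The RDA models were fitted to Hellinger-transformed OTU tables (see the Statistical analysis section). As a small illustration of that preprocessing step only, the following Python sketch applies the transformation, the square root of each OTU's relative abundance within a sample, to a table with samples in rows. The function name and example table are hypothetical, and the ordination itself is not reproduced here.

```python
import math

def hellinger_transform(table):
    """Hellinger-transform an OTU table (rows = samples, columns = OTUs).

    Each count is divided by its sample total and square-rooted, so the
    squared values of every non-empty row sum to one.
    """
    transformed = []
    for row in table:
        total = sum(row)
        if total == 0:
            transformed.append([0.0 for _ in row])  # empty sample stays at zero
        else:
            transformed.append([math.sqrt(x / total) for x in row])
    return transformed

# Hypothetical two-sample, two-OTU table.
transformed = hellinger_transform([[1, 3], [4, 0]])
```

Because every transformed sample has unit length, Euclidean-based methods such as RDA become less sensitive to differences in sequencing depth and to the many zeros typical of OTU tables.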

Figure 3

Identification of population of origin and disease effects on microbial community structures: influence of active (16S rRNA) and standing (16S rDNA) bacteria. Redundancy analysis (RDA) (A and C) and partial RDA (pRDA) (B and D) of DNA-based and RNA-based datasets. An RDA of DNA-based microbial communities (A) reveals strong influence of sampling population and local disease regimes (population: F(2,79)=2.650, R²=0.058, p=0.001; disease: F(2,79)=1.194, R²=0.026, p=0.027; disease by population: F(4,79)=1.163, R²=0.051, p=0.018), while an RDA of the active community (C) shows higher explanatory power and an increased influence of disease condition (population: F(2,78)=4.747, R²=0.099, p=0.001; disease: F(2,78)=1.587, R²=0.033, p=0.001; disease by population: F(4,78)=1.269, R²=0.053, p=0.005). The axes shown in the pRDA are the main axes of variation (rDNA: pRDA 1: F(1,83)=1.198, p=0.073; pRDA 2: F(1,83)=1.167, p=0.083; rRNA: pRDA 1: F(1,82)=1.896, p=0.001; pRDA 2: F(1,82)=1.251, p=0.059). For additional significant dimensions of A and C, see online supplementary figures S9 and S10. CD, Crohn's Disease; CON, control; GER, Germany; IND, India; LIT, Lithuania.

Indicator bacteria analysis shows strong disease by population associations

To identify bacteria that are more frequently present and abundant with respect to population, disease or their interaction, we used indicator species analysis.26 This analysis was performed at the level of consensus genera (classification-based) and species-level OTUs for both the rDNA-based and rRNA-based samples, in addition to a proxy for bacterial activity (rRNA/rDNA). The indicator genera based on rDNA (N=28, see online supplementary table S5) are associated mostly with the sample population and, accordingly, those associated with disease are mostly predictive for pathologies within a certain population (see online supplementary figure S11 and table S5). The same patterns are also present in the active communities (n=24; see online supplementary figure S12 and table S6), with differentiation of some highly abundant genera between Europe and India (ie, Bacteroides, Prevotella, respectively). Active Bacteroides also appear to be significantly more abundant in the European control and UC subjects. As with rDNA, most genera display associations with disease within single populations, while Papillibacter shows preferential occurrence in healthy controls across all populations, representing a potentially active universal indicator of health status (see online supplementary table S6). An analysis of species-level OTUs shows qualitatively similar results (rDNA: n=24, see online supplementary figure S13; rRNA: N=122, online supplementary figures S14 and S15), but offers a more detailed view on certain low abundant taxa, such as the association of Chloroplasts/Cyanobacteria to Indian samples, possibly originating from higher plant intake in this region or higher colonisation with Cyanobacteria (see online supplementary tables S7 and S8). Again, the association of individual bacterial species to disease conditions is rare, and even absent in the rDNA-based analyses (see online supplementary figure S14 and table S7).

Further analysis of bacterial activity (rRNA/rDNA ratio) reveals several taxa with different mean activities (ie, dormant vs active) between pathologies (see online supplementary table S9) and among populations (see online supplementary tables S9, S10 and figure S16). An interesting generalisable finding is that the genera Papillibacter and Anaerococcus appear more active in healthy controls, which further strengthens the role of Papillibacter as a general health indicator. Bacteroides, on the other hand, is more active in patients with UC and Faecalibacterium lies almost dormant in patients with CD (figure 4).

Figure 4

Analysis of bacterial genera that are differentially active between diseases. The plots show the mean ratio of rRNA/rDNA as a proxy of metabolic activity of Anaerococcus (p=0.014; rRNA/rDNA=0.900±0.348 SEM), Papillibacter (p=0.014; rRNA/rDNA=1.513±0.579 SEM), Bacteroides (p=0.014; rRNA/rDNA=2.033±0.893 SEM) and Faecalibacterium (p=0.014; rRNA/rDNA=0.606±0.125 SEM; p values adjusted by false discovery rate (FDR); see also online supplementary table S9). CD, Crohn's Disease.


There is increasing evidence that diet and socioeconomic conditions largely influence the composition of the intestinal microbiome, yet relatively little is known about exact differences among human populations. This is especially true for studies linking bacterial dysbioses to immune-mediated diseases, or more specifically to IBD.28 Most investigations were conducted on individual European or American focus populations,7 ,13 ,29 although these diseases are increasing worldwide.1 The patterns of microbial biogeography in IBD, depicting shared and private dysbiotic events between human populations, may help to understand the role of the microbiome in disease aetiology, and are of high relevance for any diagnostic and/or therapeutic approach targeting the microbiome.

Irrespective of community profiling on the 16S rDNA or rRNA level, the most influential variable throughout the analyses is the population origin of the sample. These differences in composition according to population may arise due to a number of factors. Present-day factors would include the differences in the surrounding source environments between, for example, Europe and the Indian subcontinent, as well as the accompanying cultural (eg, diet) and genetic differences between human hosts. These abiotic and biotic factors, amplified by differences in bacterial dispersal and transmission, are likely to affect diversity on multiple levels and increase the differences between microbial communities between host populations.30 ,31 While the relative importance of those factors may be difficult to disentangle in this study, a recent fine-scale survey of the intestinal microbiota in wild mouse populations indicated a predominant influence of geographical distance, which in this case was stronger than the underlying genetic distance.15

The results of our phylum abundance analysis show congruences with other studies. The increase of Firmicutes in IBD in European samples was previously observed,17 ,32 while others failed to find this pattern or even found the opposite.28 ,33 ,34 The Indian samples, in particular, showed either no or only weak signs of differential phylum abundance and diversity among disease conditions, by contrast with former lower coverage experiments in this ethnic group.35 These conflicting results might originate from the different sampling areas (here exclusively sigmoid colon) and types of samples (biopsies vs stool) in the respective studies. An interesting perspective emerged from the relationship of Bacteroidetes and Proteobacteria, which show age-dependent patterns and a negative correlation across multiple populations, which could be the result of competitive exclusion of Proteobacteria by Bacteroidetes. The exclusion of Proteobacteria was suggested by a study on infant microbiomes36 and seems to be supported in our adult cohort. The capability of Bacteroidetes to digest complex sugars, their ancient symbiotic relationship with their hosts37 ,38 and their central position in the gut microbiome39 might help to sustain their abundance over time. 
Further, Proteobacteria seem to belong to the early colonisers of the mammalian gut and may, therefore, be less competitive than well-adapted late colonisers over the course of succession.36,40 Succession processes are also apparent through the increasing diversity within (α-diversity measures) and between (β-diversity analyses) subjects with age.41 We observed an average decrease in active diversity in Indian patients and in patients with CD (figure 2E, also see online supplementary figure S4B, D), which might indicate less stable communities, although the connection between stability and diversity is not yet fully understood.42 Furthermore, the strongest differences between the sampled populations lie among the diseased individuals, which, together with disease-specific increases in α diversity with age, points towards higher community turnover and decreased colonisation resistance among diseased subjects.43 The standing microbial communities (rDNA-based) show a general increase in species number and evenness with age (figure 2D, also see online supplementary figure S4A), while phylogenetic diversity remains relatively constant over time (see online supplementary figure S4C), which may be a result of the replacement of Proteobacteria by Bacteroidetes, two deep-branching bacterial groups. Likewise, the correlation of β diversity with age could be a product of succession with community turnover over time.44

Contrasting patterns between the active and standing community members are present at the level of phylum abundances as well as α and β diversity. The reduced species richness among the active samples and the stronger differences between populations and pathologies might be the result of limited sequencing coverage, as rRNA of rare or less active members may be outcompeted during sequencing by more active bacteria, thereby reducing the number of observed species. Nevertheless, communities obtained from either active or standing bacteria are correlated, but appear to emphasise different processes and patterns. These differences may be of particular importance in the context of IBD, as active bacteria and their products play a more significant role in inflammation than dormant bacteria.45 Dormant bacteria, on the other hand, can balance community disturbances and maintain diversity in the microbiome as a ‘seed bank’.46
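
The coverage argument above can be illustrated with a toy subsampling experiment: when a few taxa contribute most of the reads, a shallow sample tends to miss the rare ones and the observed species number drops. This is a hedged sketch under assumed values; the community composition, read depths and the function name `observed_richness` are hypothetical and do not come from the study data.

```python
import random

def observed_richness(counts, depth, seed=0):
    """Draw `depth` reads without replacement and count the distinct taxa seen."""
    # Expand the count vector into a pool of reads labelled by taxon index.
    pool = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    rng = random.Random(seed)
    sample = rng.sample(pool, depth)
    return len(set(sample))

# Hypothetical community: two dominant (highly active) taxa and 20 rare ones.
community = [5000, 3000] + [5] * 20
true_richness = sum(1 for c in community if c > 0)

for depth in (10, 100, 1000, sum(community)):
    print(depth, observed_richness(community, depth), "of", true_richness)
```

Sampling every read recovers all taxa by construction, while shallow depths can only reveal as many taxa as reads were drawn, which is the sense in which dominant rRNA templates may mask rare or less active members.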

With respect to the understanding and interpretation of disease-associated microbial patterns, the influence of the study population is of great importance as it overshadows that of disease condition. This, in part, may be one explanation for the often inconsistent findings among studies of patients with IBD.47 However, we identified two interesting exceptions displaying consistency across all populations. The first concerns the activity of Faecalibacteria, which is specifically reduced among patients with CD (figure 4). This adds to a growing list of examples demonstrating a reduction of Faecalibacteria in the context of CD,48,49 indicating it may be a true hallmark of CD-associated communities. A second interesting and not yet described association is the increased prevalence and activity of Papillibacter in healthy subjects compared with both CD and UC across all three populations (figure 4, see online supplementary figure S8). Papillibacter is a relative of Faecalibacteria, and both belong to the Clostridium leptum subgroup,50 whose members are common butyrate producers. This finding further emphasises the importance of short-chain fatty acid producers for enterocyte homeostasis.51 A recent targeted case–control study of the Clostridium leptum subgroup in an independent Indian cohort revealed similar results in the context of IBD,52 and further supports the use of this group as ‘biomarkers’.

By contrast with the low number of ‘universal’ disease indicators, we identified a greater number of taxa displaying disease-by-population associations (see online supplementary tables S5–S8). Another interesting observation, the increased activity and, to some extent, abundance of Bacteroides in patients with UC, suggests a high adaptability and exploitation of the disturbed mucosa in IBD by this genus.53 Also, in the light of recent findings in a large, early onset biopsy cohort for CD, we found several bacteria negatively associated with CD in common (eg, Bacteroides, Blautia, Ruminococcus, Roseburia, Coprococcus, Lachnospiraceae, Faecalibacteria).54 Associations of those bacteria were mainly restricted to European samples, again stressing population-specific differences in microbiome composition and the need for broad sampling. The only shared genus positively associated with CD is Prevotella, which in our study associates with Indian samples and has been reported to associate with non-Western microbiomes.41 A possible concern with these results may lie with differences in diagnostic criteria between study cohorts, which could contribute to the heterogeneity in disease patterns. However, no over-representation of, for example, known pathogenic genera is identified among the taxa specific to IBD in any given location. An exception may be the higher abundance of the genus Campylobacter among the German controls, but this does not argue in favour of differences in diagnosis due to, for example, failure to identify pathogens.

In summary, our study provides several important findings that advance our understanding of the forces shaping diversity of the intestinal microbiota and their relationship with the disease. These include the influence of age and host population on numerous aspects of community composition and structure. We identify both shared and private IBD-related signatures regarding bacterial abundances, activity and community diversity in the investigated cohorts. It is important to note that our observations were made, in part, at the level of actively transcribing community members and highlight the merits of additional 16S rRNA profiling as a promising approach to identify disease-relevant bacterial-derived biomarkers in future studies.


We thank all study participants and Manuela Kramp and Dorina Ölsner for excellent technical assistance.



Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • AR and PR contributed equally.

  • Contributors AR, SS and SO designed the research; AR, PRa, JS, GK, KB, DA, LK, SO and PRo performed the research; PRa, JW and JFB analysed the data; PRa, PRo and JFB wrote the paper.

  • Funding This work was supported by the Deutsche Forschungsgemeinschaft ExC 306 Excellence Cluster ‘Inflammation at Interfaces’ (CL Nucleotide lab, CL CCIM and RA Envirome) and the Broad Medical Research Program (IBD-0248R).

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval University Hospital Schleswig-Holstein Ethics Committee (B231/98 and A154/06); Kaunas Regional Biomedical Research Ethics Committee (P2-84/2003); and Bombay Hospital (Mumbai, Maharashtra State) and Research Center (dated 8th July 2009).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Raw sequence data and related metadata can be accessed at the European Nucleotide Archive (ENA) under the accession number PRJEB6172.
