Original article
Siblings of patients with Crohn’s disease exhibit a biologically relevant dysbiosis in mucosal microbial metacommunities
  1. Charlotte Hedin1,2,
  2. Christopher J van der Gast3,
  3. Geraint B Rogers4,
  4. Leah Cuthbertson3,
  5. Sara McCartney5,
  6. Andrew J Stagg2,
  7. James O Lindsay6,7,
  8. Kevin Whelan1
  1. 1Faculty of Life Sciences & Medicine, Diabetes and Nutritional Sciences Division, King's College London, London, UK
  2. 2Centre for Immunology and Infectious Disease, Blizard Institute, Queen Mary University of London, London, UK
  3. 3NERC Centre for Ecology & Hydrology, Wallingford, Oxfordshire, UK
  4. 4South Australian Health and Medical Research Institute, Infection and Immunity Theme, Flinders University, Adelaide, Australia
  5. 5Centre for Gastroenterology and Nutrition, University College London, London, UK
  6. 6Centre for Digestive Diseases, Blizard Institute, Queen Mary University of London, London, UK
  7. 7Department of Gastroenterology, Barts Health NHS Trust, London, UK
  1. Correspondence to Professor Kevin Whelan, Faculty of Life Sciences & Medicine, Diabetes and Nutritional Sciences Division, King's College London, Franklin Wilkins Building, 150 Stamford Street, London SE1 9NH, UK; kevin.whelan{at}


Objective To determine the existence of mucosal dysbiosis in siblings of patients with Crohn's disease (CD) using 454 pyrosequencing, to comprehensively characterise that dysbiosis and to determine the influence of genotypical and phenotypical factors on it. Siblings of patients with CD have elevated risk of developing CD and display aspects of disease phenotype, including faecal dysbiosis. Whether the mucosal microbiota is disrupted in these at-risk individuals is unknown.

Design Rectal biopsy DNA was extracted from 21 patients with quiescent CD, 17 of their healthy siblings and 19 unrelated healthy controls. Mucosal microbiota was analysed by 16S rRNA gene pyrosequencing and classified into core and rare species. Genotypical risk was determined using Illumina Immuno BeadChip, faecal calprotectin by ELISA and blood T-cell phenotype by flow cytometry.

Results Core microbiota of both patients with CD and healthy siblings was significantly less diverse than that of controls. Metacommunity profiling (Bray–Curtis (SBC) index) showed the sibling core microbial composition to be more similar to CD (SBC=0.70) than to healthy controls, whereas the sibling rare microbiota was more similar to healthy controls (SBC=0.42). Faecalibacterium prausnitzii contributed most to core metacommunity dissimilarity both between siblings and controls, and between patients and controls. Phenotype/genotype markers of CD risk significantly influenced microbiota variation between and within groups, of which genotype had the largest effect.

Conclusions Individuals with elevated CD-risk display mucosal dysbiosis characterised by reduced diversity of core microbiota and lower abundance of F. prausnitzii. This dysbiosis in healthy people at risk of CD implicates microbiological processes in CD pathogenesis.

Significance of this study

What is already known on this subject?

  • Patients with Crohn's disease (CD) have mucosal dysbiosis, including reduced abundance of Faecalibacterium prausnitzii.

  • Low mucosal F. prausnitzii abundance predicts relapse after surgery in patients with CD.

  • Healthy siblings of patients with CD have an increased risk of developing CD and have altered abundance of key species in the gut lumen.

What are the new findings?

  • There is a distinct dysbiosis in the mucosal microbiota of healthy siblings of patients with CD.

  • The sibling dysbiosis comprises a fundamental distortion of microbial community composition, most notably reduced diversity of core microbiota and low abundance of mucosal F. prausnitzii.

  • Mucosal microbiota disruption is not merely a consequence of the inflammation in CD but is present in healthy individuals at risk of CD.

How might it impact on clinical practice in the foreseeable future?

  • Identification of this at-risk dysbiosis signals pathways in CD pathogenesis and raises the possibility of CD risk identification and CD risk intervention.

Introduction


Disruption of gut microbiota (dysbiosis) is an established feature of inflammatory bowel disease (IBD). The dysbiosis in Crohn's disease (CD) has been well described and includes reduced microbial diversity, reduced abundance of Firmicutes, particularly Faecalibacterium prausnitzii, reduced abundance of Bifidobacteria, increased γ-proteobacteria and disturbances in Bacteroides populations.1 The involvement of several CD susceptibility genes in the recognition and handling of bacteria (eg, NOD2, ATG16L1, IRGM) reinforces the position of the gut microbiota at the centre of IBD pathogenesis.2–4

Whether the CD dysbiosis is involved in pathogenesis is uncertain. The dependence on the presence of gut microbiota for the development of inflammation in animal models5 as well as patients with CD,6 and the association between reduced mucosal F. prausnitzii and postoperative relapse,7 imply a pathogenic role. Conversely, the lack of therapeutic benefit of manipulating the microbiota8,9 suggests that dysbiosis in CD may not drive inflammation, but rather is consequent to established disease, reflecting, for example, the differential survival of various species in an inflamed environment. Moreover, attempts to identify aspects of the CD dysbiosis that were present at disease initiation, which therefore potentially have a role in pathogenesis, may be obfuscated by both the mature disease phenotype of the patients studied and the effect of the medical, surgical and patient-initiated attempts to treat and control symptoms.

Siblings of patients with CD have a relative risk (RR) of developing CD of up to 35 times that of the general population.10 This risk is partly genetic, but is also driven by non-genetic factors, many of which they share with their CD-affected sibling.10,11 Several of these proposed non-genetic risk factors, such as mode of delivery, breast feeding, maternal inoculum, home environment and weaning diet,12 potentially impact gut microbial acquisition and development. It follows that any aspect of the CD dysbiosis that is also present in a healthy sibling cannot be a consequence of established disease, and may instead be implicated in processes driving CD pathogenesis.12

Attempts have been made to determine whether aspects of the CD phenotype are present in patients’ unaffected relatives. These have assessed dysbiosis13 and other features of the CD phenotype such as raised faecal calprotectin (FC), increased intestinal permeability (IP) and the presence of antimicrobial antibodies.12 Using PCR probes selected to detect dominant species that comprise the dysbiosis in CD, we have previously indicated that a faecal dysbiosis exists in healthy siblings of patients with CD characterised by reduced faecal Firmicutes, including F. prausnitzii.14 Moreover, we previously demonstrated in siblings that a combination of luminal dysbiosis, raised FC, reduced abundance of circulating naïve T-cells, disturbances in their expression of gut-homing β7 integrin and at-risk genotype could be combined to create a multidimensional risk phenotype, which significantly distinguished healthy siblings of patients with CD from healthy, unrelated controls.14

It has been speculated that mucosal microbiota is of greater significance in CD pathogenesis than luminal microbiota, given its closer spatial relationship to the gut immune system. Yet, studies comparing mucosal microbiota in patients with CD, their families and healthy controls are rare due to the invasiveness of procedures required to obtain mucosal samples from otherwise healthy individuals. However, the potential rewards of obtaining such samples have been amplified by recent advances in the analysis of large, diverse and complex microbial communities. Pyrosequencing technology and metacommunity profiling enable greater sampling depth, permitting detection of both dominant microbial community members and low-abundance (rare) taxa.15,16 The capacity to characterise core and rare microbial communities separately may reveal microbial features associated with disease not otherwise readily apparent. Furthermore, 16S rRNA gene pyrosequencing and other next-generation technologies have demonstrated that microbial diversity can be orders of magnitude higher than previously appreciated.17 Measuring diversity may be significant as healthy gut microbiota exhibit high diversity compared with microbial populations in other human body habitats.18 Moreover, gut microbial diversity is consistently described as reduced both in CD1 and other human diseases including obesity,18,19 colorectal cancer20 and eczema,21 and in addition, has been linked with smoking.22

Therefore, we sought to use 454 pyrosequencing and metacommunity analysis to comprehensively characterise the structure and composition of the mucosal microbial community in an at-risk group of CD siblings compared with patients with CD and healthy controls.

Materials and methods

Patients with inactive CD (CD Activity Index <150 and C-reactive protein (CRP) ≤5 mg/L) and their healthy siblings (both aged 16–35 years) were recruited from clinics at Barts Health NHS Trust and University College Hospitals NHS Foundation Trust (London, UK). Patients required a confirmed diagnosis of CD for >3 months. All healthy siblings who volunteered and did not meet exclusion criteria (detailed in online supplementary table S1) were included, to limit bias in the selection of siblings with specific characteristics. Healthy controls were recruited by email sent to staff and students at King's College London (London, UK), during the same period. Participants were informed that involvement in the study did not constitute screening for disease and that detection of clinical disease in any sibling or control would lead to exclusion from the study. Only participants consenting to rectoscopy and providing analysable biopsies were included. All participants provided written, informed consent.

At screening, demographics, medical and drug exposure history, physical examination, CRP, inclusion and exclusion criteria were assessed. Instructions regarding avoidance of prebiotics/probiotics for 4 weeks (to prevent impact on microbiota), non-steroidal anti-inflammatory drugs for 1 week and alcohol for 24 h before the study (to prevent impact on IP) were provided. Blood samples were taken for routine haematology/biochemistry, T-cell analysis and genotyping. Participants completed a 5 h urine collection for measurement of IP and underwent flexible rectoscopy without bowel cleansing. Biopsies from non-inflamed rectum were snap frozen, and stored at −80°C before processing for histological and microbiological analyses. Stool was obtained and stored at −20°C before processing for FC quantification.

Faecal calprotectin

FC extraction and ELISA analysis (Calpro AS, Lysaker, Norway) were carried out according to manufacturer's instructions using duplicate appropriately diluted samples. FC concentration (µg/g) was determined relative to standard curves.

Peripheral blood T-cell flow cytometry

Whole blood, collected in lithium-heparin Vacutainer tubes (BD Bioscience), was stored at room temperature for ≤4 h before labelling with fluorescently conjugated monoclonal antibodies to detect CD3 T-cells, naïve (CD45RA+) and memory (CD45RA−) subsets of CD4 and CD8 T-cells. Integrin α4β7 expression was assessed by labelling with anti-β7 (see online supplementary methods for antibodies used). Data were acquired using an LSRII 4-colour flow cytometer (BD Bioscience) and collected using fluorescence-activated cell sorting Diva software V.4.1.2 (BD Bioscience) using Flow-Count fluorospheres (Beckman Coulter) for absolute quantitation. Colour compensation was performed offline using Winlist V.6.0 (Verity Software House).

Genotyping


Human DNA was extracted from whole blood using the phenol chloroform-isoamyl alcohol method. Genotyping was performed using the Illumina Infinium Immunochip.2 ,23 To increase detection of NOD2 mutations and capture the enhanced risk of NOD2 compound heterozygosity, three NOD2 mutations (rs2066845/G908R, rs2066844/R702W and rs5743293/3020insC) were individually assessed. Cumulative genotype RR (GRR) for each participant was therefore calculated across 72 CD-risk loci. A population distribution model of CD risk was generated using the REGENT R program24 and previously published ORs.2 Participants were categorised into reduced, average, elevated or high-genotype risk with reference to this model.25

Intestinal permeability

IP was measured using lactulose–rhamnose tests as described previously.14

Gut mucosal microbiota

Biopsy DNA extraction was carried out using a phenol/chloroform-based method, as described previously.26 A detailed extraction protocol is provided in the online supplementary methods. DNA extracts were quantified using the Picodrop Microlitre Spectrophotometer (GRI, Braintree, UK). Negative controls (sterile water) were included in the DNA extraction and PCR amplification steps.

Bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP) was performed as described previously using Gray28F 5′-TTTGATCNTGGCTCAG-3′ and Gray519r 5′-GTNTTACNGCGGCKGCTG-3′.27 Detailed protocols for 16S rRNA gene sequencing and sequence data processing are provided in the online supplementary methods.

To assign bacterial identities to 16S rRNA gene sequences, sequence data were de-noised, assembled into operational taxonomic unit (OTU) clusters at 97% identity, and queried using a distributed .NET algorithm that uses BLASTn+ (KrakenBLAST) against a database of high-quality 16S rRNA gene bacterial sequences. Using a .NET and C# analysis pipeline, the resulting BLASTn+ outputs were compiled, data reduction analysis performed and sequence identity classification carried out, as described previously.28

Statistical analyses

Bacterial species within each metacommunity were partitioned into core and rare groups using a modification of a previously described method.15 Three complementary measurements of diversity were used to compare microbial diversity between samples, as described previously: species richness (S*, the total number of species), Shannon–Wiener (H′, a metric accounting for both number and relative abundance of species) and Simpson's (1-D, a measure of the probability that two sequence reads randomly selected from a sample will belong to different species).15,26 To avoid potential bias due to varying numbers of sequence reads per sample, all measures were calculated using randomised re-sampling to a uniform number of sequence reads per sample.26 Mean diversity measures were calculated by re-sampling the reads from each specimen to the lowest number of sequence reads among all specimens for 1000 iterations. Diversity analysis was performed in R.29 Two sample t tests, regression analysis, coefficients of determination (r2), residuals and significance (p) were calculated using Minitab software (V.16, Minitab, University Park, Pennsylvania, USA). Canonical correspondence analysis (CCA), analysis of similarity (ANOSIM) and similarity of percentages (SIMPER) analysis were performed using the PAST (Palaeontological Statistics, V.3.01) program, available from the University of Oslo website run by Øyvind Hammer. The Bray–Curtis quantitative index of similarity was used as the underpinning community similarity measure for CCA, ANOSIM and SIMPER tests.
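The re-sampling approach to diversity described above can be illustrated with a minimal Python sketch. This is not the R code used in the study; the function names, the natural-log base for Shannon–Wiener and the 1 − Σp² form of Simpson's index are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def diversity_indices(counts):
    """Return (richness S*, Shannon-Wiener H', Simpson 1-D) for per-species read counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    richness = len(counts)
    shannon = -sum(p * math.log(p) for p in props)  # natural log is an assumption
    simpson = 1.0 - sum(p * p for p in props)       # 1 - sum(p^2) form is an assumption
    return richness, shannon, simpson


def rarefied_diversity(counts, depth, iterations=1000, seed=0):
    """Mean indices after repeatedly drawing `depth` reads without replacement."""
    rng = random.Random(seed)
    # Expand the counts into one label per read so that reads can be subsampled.
    reads = [species for species, c in enumerate(counts) for _ in range(c)]
    sums = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        subsample = Counter(rng.sample(reads, depth))
        for i, value in enumerate(diversity_indices(list(subsample.values()))):
            sums[i] += value
    return tuple(s / iterations for s in sums)
```

In this sketch, `depth` would be set to the lowest read count among all specimens, so that every sample is compared at the same sequencing effort.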


Results

Demographic and disease characteristics of the 21 patients with quiescent CD, 17 of their healthy siblings and 19 unrelated healthy controls that were included are summarised in table 1. At the time of the study, only one patient was cohabiting with one of the included siblings. GRR, FC, faecal Firmicute abundance and circulating T-cell characteristics were all significantly different in both patients with CD and healthy siblings compared with healthy controls, as published previously14 and summarised in table 1.

Table 1

Summary of demographic variables in patients, siblings and controls as well as clinical characteristics in patients

A total of 180 696 bacterial sequence reads (mean per sample 3235±SD 205), identifying 160 genera and 351 distinct OTUs classified to species level (see online supplementary table S2), were generated from all samples combined. The numbers of bacterial sequence reads per sample were similar among the three cohorts (mean±SD): CD, 3296±258 (n=21); siblings, 3190±423 (n=17); and healthy controls, 3210±393 (n=19).

Species abundance was directly correlated with distribution

We have previously established that the categorisation of human microbiota into core and rare species revealed important aspects of metacommunity species-abundance distributions that would be neglected without such a distinction.15 A coherent metacommunity could be expected to exhibit a direct relationship between prevalence and abundance of individual species within the constituent communities. Consistent with this prediction, the abundance of species in each study group significantly correlated with the number of individual sample communities those species occupied (CD (r2=0.62, F1,227=366.9, p<0.0001); siblings (r2=0.71, F1,259=590.1, p<0.0001); and healthy controls (r2=0.68, F1,258=552.6, p<0.0001)) (figure 1).
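As a rough illustration of the statistics reported here, the following Python sketch fits a simple least-squares line to occupancy and abundance and returns r2 together with the F statistic on (1, n−2) degrees of freedom, which matches the F1,227 reported for the 229 species in the CD metacommunity. The function name is hypothetical, and whether abundance was transformed (for example, log-scaled) before regression is not stated in the text, so no transformation is applied.

```python
def occupancy_abundance_fit(occupancy, abundance):
    """Least-squares fit of abundance on occupancy.

    Returns (slope, intercept, r2, F), where F has (1, n - 2) degrees of freedom.
    r2 and F are unchanged if the two variables are swapped.
    """
    n = len(occupancy)
    mean_x = sum(occupancy) / n
    mean_y = sum(abundance) / n
    sxx = sum((x - mean_x) ** 2 for x in occupancy)
    syy = sum((y - mean_y) ** 2 for y in abundance)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(occupancy, abundance))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    # For simple linear regression, F = r2 / (1 - r2) * (n - 2).
    f_stat = r2 / (1.0 - r2) * (n - 2)
    return slope, intercept, r2, f_stat
```

The sketch assumes the points do not fall on a perfect line, since r2=1 would make the F statistic undefined.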

Figure 1

The distribution and abundance of bacterial species within microbiota samples within the (A) Crohn's disease (CD), (B) siblings and (C) healthy control cohort metacommunities. The number of mucosal samples that each bacterial taxon was observed to occupy is plotted against its mean abundance across all samples ((A) n=21, r2=0.62, F1,227=366.9, p<0.0001; (B) n=17, r2=0.71, F1,259=590.1, p<0.0001; and (C) n=19, r2=0.68, F1,258=552.6, p<0.0001). Core species were defined as those that fell within the upper quartile (dashed lines), and rare species defined as those that did not.

In patients with CD a lower proportion of the mucosal microbiota were core species

Individual species in each cohort metacommunity were then classified as core or rare based on their falling within or outside the upper quartile of subject occupancy, respectively (figure 1). Of the 229 species that comprised the CD metacommunity, only 7 were core and 222 were rare species. The healthy siblings metacommunity (261 species) comprised 18 core and 243 rare species, and the healthy controls metacommunity (260 species) comprised 25 core and 235 rare species. In addition, the core species within each cohort metacommunity accounted for 44.7%±4.8% (CD), 67.6%±5.5% (healthy siblings) and 67.4%±4.6% (healthy controls) of the mean (±SD) relative abundance. The mean relative abundances in the CD core microbiota were significantly lower than those in the healthy siblings and healthy controls (p<0.0001 in both cases), but were not different between the siblings and healthy controls (p=0.907).
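The core/rare classification can be sketched in Python as follows. The sketch reads "upper quartile of subject occupancy" as a species being detected in more than 75% of the samples in a cohort; that reading, the function name and the input layout are assumptions, and the modified method actually used is the one cited in the statistical analyses.

```python
def partition_core_rare(samples, threshold=0.75):
    """Split species into core and rare by the fraction of samples they occupy.

    samples: list of dicts, one per subject, mapping species name -> read count.
    A species is treated as core if it occurs in more than `threshold` of samples
    (assumed interpretation of the upper quartile of occupancy).
    """
    n = len(samples)
    occupancy = {}
    for sample in samples:
        for species, count in sample.items():
            if count > 0:
                occupancy[species] = occupancy.get(species, 0) + 1
    core = sorted(s for s, k in occupancy.items() if k / n > threshold)
    rare = sorted(s for s, k in occupancy.items() if k / n <= threshold)
    return core, rare
```

With four toy samples, a species present in all four is core, whereas one present in exactly three (75%) falls on the boundary and is classed as rare under this strict threshold.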

Microbial diversity was lower in both siblings and patients compared with controls

The mean microbial diversity of subject communities for each cohort was compared using three indices of diversity (figure 2). Diversity was compared between the three cohorts for the whole microbiota, as well as core and rare species groups (figure 2). These analyses revealed the siblings’ whole and core microbiota to be significantly more diverse than those of the CD cohort, but the sibling core microbiota was significantly less diverse than the healthy core microbiota. No significant difference in diversity of the whole microbiota was observed between the siblings and healthy cohorts, emphasising the advantage of analysing core and rare populations separately. In addition, the CD rare microbiota was significantly less diverse than the rare microbiota of the other two cohorts, which in turn were not significantly different from each other. All of these observations were supported by all three measures of diversity in each instance (figure 2).

Figure 2

Diversity of whole, core and rare microbiota within the Crohn's disease (CD, black columns), siblings (grey), and healthy control (white) cohorts. Three indices of diversity are given: species richness (S*), Simpson's index of diversity (1-D) and Shannon–Wiener index of diversity (H′). Error bars represent the SD of the mean (CD n=21, siblings n=17, and healthy n=19). Asterisks denote significant differences in comparisons of diversity at the p<0.05 level determined by two sample t tests.

Interestingly, within the CD population, diversity of the whole microbiota was lower in the nine patients with an ileocaecal resection/right hemicolectomy compared with the 11 patients without these operations (as shown by Richness p<0.0001; Shannon–Wiener p=0.046; but not Simpson's p=0.768). This was largely driven by lower diversity of rare taxa (as shown by Richness p<0.0001; Shannon–Wiener p=0.019; but not Simpson's p=0.159) rather than core taxa (Richness p=0.523; Simpson's p=0.612; Shannon–Wiener p=0.824).

Significant divergence in whole and core microbial composition between patients with CD and healthy controls, but not between patients with CD and healthy siblings

The distribution of the microbiota within the three cohorts was determined by direct ordination using Bray–Curtis similarity measures. Using ANOSIM tests, the CD and healthy whole and core microbiota were significantly divergent from each other. However, the whole and core microbiota of siblings were not significantly divergent from those of either the CD or healthy control cohorts (figure 3). In all instances, rare microbiota was significantly divergent between cohorts, including between siblings and healthy controls.

Figure 3

Analysis of similarities (ANOSIM) of whole, core and rare microbiota between subject cohorts. The ANOSIM test statistic (R) and probability (P) that two compared groups are significantly different at the p<0.05 level are given (* denotes p<0.001 and **p<0.0001). ANOSIM R and p values were generated using the Bray–Curtis measure of similarity. R scales from +1 to −1. +1 indicates that all the most similar samples are within the same groups. R=0 occurs if the high and low similarities are perfectly mixed and bear no relationship to the group. A value of −1 indicates that the most similar samples are all outside of the groups.
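The scaling of R described in this legend follows from how the statistic is built from ranked pairwise dissimilarities. The Python sketch below shows the standard rank-based form of ANOSIM R on Bray–Curtis dissimilarities; it is an illustration under that assumption rather than the PAST implementation, the function names are hypothetical, and the permutation test that yields the p value is omitted.

```python
from itertools import combinations


def bray_curtis_similarity(a, b):
    """Quantitative Bray-Curtis similarity between two non-empty abundance vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / (sum(a) + sum(b))


def anosim_r(samples, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(samples)), 2))
    dissim = [1.0 - bray_curtis_similarity(samples[i], samples[j]) for i, j in pairs]
    # Rank dissimilarities from 1 (most similar pair); tied values share the mean rank.
    order = sorted(range(len(dissim)), key=lambda k: dissim[k])
    ranks = [0.0] * len(dissim)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and dissim[order[end + 1]] == dissim[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)  # total number of sample pairs
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2.0)
```

When every within-group pair is more similar than every between-group pair, the between-group ranks are all larger than the within-group ranks and R reaches +1.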

Lower F. prausnitzii made the greatest contribution to the dissimilarity in microbiota between both healthy siblings and healthy controls and between CD and healthy controls

Given the involvement of core species in differences of relative abundance, diversity and microbiota composition, the contribution of individual taxa to the dissimilarity between core microbiota was assessed by SIMPER analyses (table 2). Both F. prausnitzii and Escherichia fergusonii contributed the most to the dissimilarity between all cohorts. As a proportion of core species, F. prausnitzii had a higher relative abundance in the healthy controls (30.9%) than both the CD (22.4%) and siblings (24.2%). Conversely, E. fergusonii was more abundant in the CD cohort (21.4%) than in siblings (9.7%) and healthy controls (4.1%).
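To make the idea of a per-species contribution to dissimilarity concrete, the following Python sketch decomposes the Bray–Curtis dissimilarity between every pair of samples drawn from two groups into per-species terms and averages them, which is the usual basis of SIMPER. It is an illustration only, not the PAST routine; the function name is hypothetical and any values passed to it here are toy numbers, not study data.

```python
def simper(group_a, group_b, species):
    """Rank species by their mean contribution to between-group Bray-Curtis dissimilarity.

    group_a, group_b: lists of abundance vectors (one per sample, same species order).
    Returns a list of (species, mean contribution, per cent of total), largest first.
    Assumes the two groups are not identical, so the total dissimilarity is non-zero.
    """
    totals = [0.0] * len(species)
    n_pairs = 0
    for a in group_a:
        for b in group_b:
            denom = sum(a) + sum(b)
            # Each species contributes |x - y| / (sum of both samples) to the dissimilarity.
            for i, (x, y) in enumerate(zip(a, b)):
                totals[i] += abs(x - y) / denom
            n_pairs += 1
    mean_contrib = [t / n_pairs for t in totals]
    overall = sum(mean_contrib)
    ranked = sorted(zip(species, mean_contrib), key=lambda item: item[1], reverse=True)
    return [(name, c, 100.0 * c / overall) for name, c in ranked]
```

The per-species terms sum to the overall mean Bray–Curtis dissimilarity between the two groups, so the percentages indicate how much of that dissimilarity each species accounts for.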

Table 2

Similarity of percentages (SIMPER) analysis of microbial community dissimilarity (Bray–Curtis) between core species groups for (A) CD and siblings, (B) healthy controls and siblings and (C) CD and healthy cohorts

Genotypical and phenotypical features associated with CD and CD risk significantly explained microbiota variation

CCA was used to relate the variability in the distribution of microbiota between cohorts to clinical and demographic variables (table 3 and figure 4). Variables that significantly explained variation in mucosal microbiota were determined with forward selection (999 Monte Carlo permutations; p<0.05) and used in CCA. Based on the direct ordination approach, the microbiota between cohorts was significantly influenced by factors listed in table 3. The same analytical approach was used to assess the extent to which variance in the microbiota distribution within cohorts could be accounted for by variation in measures of clinical and demographic factors (table 3). GRR was the most significant factor in explaining variance between the three cohorts and within each cohort. FC was also significant in explaining variance between cohorts, particularly in the core microbiota. However, in the within-group analyses, FC was significant in explaining microbial variance in patients and siblings but not in controls. Blood T-cell factors explained a higher proportion of variance in siblings and controls than in patients. Conversely, age was significantly associated with variance in controls but not in patients or siblings.

Table 3

Canonical correspondence analyses for determination of per cent variation in the whole, core and rare microbiota between and within the three subject cohorts by clinical variables significant at the p<0.05 level*

Figure 4

Canonical correspondence biplots for (A) whole, (B) core and (C) rare microbiota. Red crosses represent microbiota samples from the Crohn's disease (CD) cohort, yellow filled triangles the siblings cohort and green diamonds the healthy cohort. In each instance, the 95% concentration ellipses are given for the CD (red), siblings (yellow) and healthy (green) cohort microbiota. Biplot lines for clinical variables that significantly accounted for variation within the microbiota at the p<0.05 level (see table 3) show the direction of increase for each variable, and the length of each line indicates the degree of correlation with the ordination axes. Canonical correspondence analysis (CCA) field labels: Calprotectin; Gender; ‘% Memory T-cells’: proportion of blood T-cells with memory phenotype (%); ‘CD4+ T-cells’: blood concentration of naïve CD4+ T-cells (cells/mL); ‘β7 integrin’: proportion of CD4 naïve T-cells expressing β7 integrin (%); ‘GRR’: genotype relative risk. Cumulative GRR for each participant was calculated across 72 CD-risk loci (detected using the Illumina Infinium Immunochip), and participants were categorised into reduced, average, elevated or high genotype risk with reference to a population distribution model of CD risk. Percentage of community variation explained by each axis is given in parentheses.

Discussion


This is the first study to detail the mucosal microbiota of clinically and genetically well-characterised healthy siblings of patients with CD, and to compare them with both their CD-affected siblings and healthy controls. Moreover, this study is unique in uncovering interactions of mucosal microbiota with genotype and features of the CD-risk phenotype. This manuscript is a significant advance on the preliminary account of the multidimensional risk phenotype as described previously, which centred on quantitative PCR (qPCR) sampling of faecal microbiota.14 The current study focuses on the mucosal microbiota and employs next-generation sequencing and advanced statistical analysis to reveal the complexity of the metacommunities in healthy siblings of patients with CD. The core mucosal microbiota in siblings was characterised by lower diversity compared with controls, and lower abundance of F. prausnitzii made the greatest contribution to the dissimilarity between these two groups. Genetic CD risk explained the highest proportion of microbial variance both between all three groups, and within the patient and sibling groups. These findings are unlikely to be confounded by current cohabitation as only one patient cohabited with one sibling.

Although related healthy individuals are known to harbour similar gut microbiota,19 the similarity in the microbiota between patients with CD and their healthy siblings is of considerable pathogenic relevance. Previous studies have shown that when one sibling has CD, familial microbial similarity is disrupted, even in disease-discordant monozygotic twins.30 Thus, microbial features that are similar between affected and unaffected siblings, but that are not present in low CD-risk healthy individuals, may be part of the CD-risk phenotype and therefore pertinent to CD pathogenesis. In order to discern these features associated with familial risk, a comparison with healthy, unrelated individuals is essential.

The validity of the data presented is supported by the correlation between species-abundance and distribution, which is consonant with a coherent metacommunity structure and is similar to distributions described in other ecological communities.15 This feature of community structure facilitated delineation of core species which are abundant and persistent, and allowed resolution of features of the mucosal microbiota without obfuscation from rare microbiota which may be highly variable, transient and scarce. A significantly higher proportion of the microbiota in patients with CD belonged to the rare group compared with healthy siblings and healthy controls. As described below, this is at least in part attributable to loss of principal members of the core group, most notably Firmicutes.

Reduced microbial diversity is an almost universally reported feature of mucosal CD dysbiosis.1 The current study reveals that core microbiota diversity is also lost in siblings of patients with CD, indicating that this may be a fundamental step in CD pathogenesis. Diversity may be an indicator of the health of human microbial communities, as it is reduced in a variety of disorders.18–21 Lower diversity may be associated with incomplete occupation of ecological niches, resulting in reduced resistance to pathogen colonisation; additionally, a more restricted gut metagenome contains a narrower array of genes, which may result in the loss of key functions.

Lower diversity indicates altered mucosal microbial composition, and microbial composition in patients with CD and healthy controls was significantly distinct. In contrast, the composition of the whole and core microbiota in healthy siblings was not significantly different from that of either patients with CD or healthy controls, indicating that from a microbial metacommunity perspective, siblings lie somewhere between patients and controls. The greater variability in the composition of the microbiota in at-risk siblings (illustrated by the larger 95% concentration ellipse in figure 4B) probably reflects the range of CD risk contained within this group, with siblings with higher CD risk lying closer to or within the CD region. In addition, diversity was lower in the whole and rare microbiota in patients with ileocaecal resection/right hemicolectomy, potentially explained by differences in disease phenotype, or the absence of the ileocaecal valve that would otherwise constitute a barrier between small and large intestinal microbiota.

Consonant with previous work highlighting the importance of F. prausnitzii in CD dysbiosis,7,12,14 F. prausnitzii made the greatest contribution to the dissimilarity between patients with CD and healthy control microbiota. The prominence of F. prausnitzii has biological significance as it is the only microbial factor shown to be predictive of the natural history of CD,7 and response to treatment.31 Strikingly, F. prausnitzii was also the biggest contributor to the dissimilarity of the core mucosal microbiota between healthy siblings and healthy controls, establishing that mucosal F. prausnitzii correlates with the natural history of CD and is a key feature of the at-risk phenotype. Taken together, these findings strongly support the hypothesis that depletion of F. prausnitzii is part of CD pathogenesis rather than consequent to established CD. Several mechanisms exist whereby F. prausnitzii and other Firmicutes may contribute to gut health, including the production of short-chain fatty acids (SCFAs),32,33 SCFA-independent, NFκB-mediated effects,7 and production of longer-chain fatty acids such as conjugated linoleic acid.34

The pathogenic role of reduced F. prausnitzii in CD has been questioned by a study describing increased mucosal F. prausnitzii in newly diagnosed paediatric IBD.35 However, whether increased abundance of F. prausnitzii is a distinctive feature of paediatric-onset IBD, with low F. prausnitzii being associated with later-onset CD, or whether the abundance of F. prausnitzii may bloom in childhood and then critically decline in those at risk of CD, may only be determined by longitudinal studies.

Other species contributing to the dissimilarity in the core mucosal microbiota between patients with CD and healthy controls were congruent with species identified previously as characterising the CD dysbiosis, including a greater abundance of most Proteobacteria such as E. fergusonii and E. coli. Similar species contributed to the dissimilarity between siblings and controls. However, the presence of E. coli was specific to CD mucosa, and may therefore be a feature of established CD rather than pathogenic. Features of the inflamed gut, such as increased activity of nitric oxide synthases36 or a reduction in faecal butyrate producers, which will result in a rise in pH, potentially favour the survival of organisms that are inhibited at acidic pH, such as E. coli.37

GRR was the factor associated most strongly with the variation in the microbiota in both the between-group analysis and the analysis within each of the three groups. Although the proportion of variation in mucosal microbiota explained by GRR was small, it was nevertheless significant. The combination of loci used to estimate GRR in the current study does not include more recently detected risk loci and can be expected to account for a limited proportion of the genetic risk.38 Therefore, these data will tend to underestimate the effect of genotype. Furthermore, since other factors known to affect gut microbiota, such as diet, were not controlled, this signal of the interaction between genotype and the mucosal microbiota is striking.

The direction of the vector in figure 3 illustrates that FC contributed to the axis separating patients from the other two groups in the whole, core and rare microbiota, implying that microbial composition in CD is partly associated with the degree of inflammation. This would support the hypothesis that CD-specific elements of the dysbiosis may be consequent to intestinal inflammation, through mechanisms such as the enhanced survival of E. coli in an inflamed environment as proposed above.

When each group was considered separately, the effect of each factor in different groups could be compared. Several factors were significant in all groups (GRR, gender, proportion of CD4+ naïve T-cells expressing β7 integrin). Other factors were group-specific: FC and blood naïve CD4+ T-cell concentration were significant only in patients and siblings, whereas age was significant only in controls. Disease phenotype was significant in explaining microbial variation within the CD group, as would be predicted from previous studies.30 However, we have also demonstrated that for healthy siblings, disease site in their affected relative was significantly associated with the variation in their own microbiota. This would suggest that specific risk phenotypes are associated with different disease phenotypes.

Overall these factors accounted for a higher proportion of the variance in the microbial composition in siblings, compared with controls or patients, indicating that this multidimensional risk phenotype is specific, and that in low CD-risk individuals the microbial composition is associated with other factors, such as age. Furthermore, it would appear that in CD the influence of factors associated with the original risk phenotype is obfuscated by established CD and its surgical and medical management.


Healthy siblings of patients with CD, who themselves have elevated risk of CD, have a dysbiosis of the core mucosal microbiota characterised by reduced diversity and loss of Firmicutes, notably F. prausnitzii. Genotype determines a proportion of the at-risk mucosal microbial phenotype. Notwithstanding the limited extent to which known loci account for the observed CD risk,39 it is also clear that the sibling risk goes beyond genotype and that non-genetic factors within families contribute to the development of an at-risk microbiota. How and why patients and their siblings acquire the microbiota that marks out this risk is not known. However, knowledge of the at-risk microbial phenotype illuminates possible pathways in CD pathogenesis and raises the prospect of intervention to impact human health and influence disease risk.


  • CH and CJvdG contributed equally.

  • Contributors CH: study concept and design; obtained funding; recruitment of participants and acquisition of data; analysis and interpretation of data; statistical analysis; drafting of the manuscript; critical revision of the manuscript for important intellectual content. CJvdG: analysis and interpretation of data; statistical analysis; drafting of the manuscript; critical revision of the manuscript for important intellectual content. GBR: DNA extraction; analysis and interpretation of data; statistical analysis; critical revision of the manuscript for important intellectual content. LC: analysis and interpretation of data; statistical analysis. SM: assistance with recruitment of participants; critical revision of the manuscript for important intellectual content. AJS, JOL and KW: study concept and design; obtained funding; analysis and interpretation of data; critical revision of the manuscript for important intellectual content; study supervision.

  • Funding CH was supported by a Fellowship from the charity Core. CJvdG and LC were supported by the UK Natural Environment Research Council (grant number NE/H019456/1).

  • Competing interests None.

  • Ethics approval Bromley Local Research Ethics Committee (reference 07/H0805/46).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The sequence data reported in this paper have been deposited in the NCBI Short Read Archive database (Accession number SRP045959).
