Advances in inflammatory bowel disease pathogenesis: linking host genetics and the microbiome
  1. Dan Knights (1,2,3)
  2. Kara G Lassen (1,2)
  3. Ramnik J Xavier (1,2,4)
  1. Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  2. Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, Massachusetts, USA
  3. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA
  4. Gastrointestinal Unit, Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
  1. Correspondence to Dr Ramnik J Xavier, Center for Computational and Integrative Biology, Massachusetts General Hospital, 185 Cambridge Street, Simches 7222, Boston, MA 02114, USA; xavier{at}


Studies of the genetics underlying inflammatory bowel diseases have increased our understanding of the pathways involved in both ulcerative colitis and Crohn's disease and focused attention on the role of the microbiome in these diseases. Full understanding of pathogenesis will require a comprehensive grasp of the delicate homeostasis between gut bacteria and the human host. In this review, we present current evidence of microbiome–gene interactions in the context of other known risk factors and mechanisms, and describe the next steps necessary to pair genetic variant and microbiome sequencing data from patient cohorts. We discuss the concept of dysbiosis, proposing that the functional composition of the gut microbiome may provide a more consistent definition of dysbiosis and may more readily provide evidence of genome–microbiome interactions in future exploratory studies.



Family history is a well-known risk factor for developing inflammatory bowel diseases (IBD), a group of diseases that includes Crohn's disease (CD) and ulcerative colitis (UC). As such, the risk of developing IBD has long been recognised to have a genetic contribution. This concept has advanced considerably over the past decade as genetic studies have identified numerous loci involved in IBD susceptibility. These studies have identified key cellular pathways in IBD and enhanced our understanding of how these pathways might contribute to disease (figure 1). However, these studies have also made clear that susceptibility alleles are not sufficient on their own to trigger disease and that other genetic and non-genetic risk factors play a role in pathogenesis. Mounting evidence has indicated that among these potential factors the diversity and composition of the gut microbiota, which includes gut resident symbiotic microorganisms, are major environmental factors influencing gut homeostasis. In this review, we provide a brief overview of the recent advances that have shaped our understanding of the complex interplay of the gut microbiome and genetic susceptibility to IBD.

Figure 1

Genes and pathways in inflammatory bowel disease (IBD). More than 160 loci have been associated with susceptibility to IBD. A selected list of candidate genes is shown (left) along with the cellular pathways in which these genes are thought to function. These pathways likely interact in cellular networks (centre) which, when perturbed, contribute to the clinical phenotypes of IBD (right).

Untangling pathways from genetics

More than a decade ago, nucleotide oligomerisation domain 2 (NOD2) was identified as the first susceptibility gene for CD.1 ,2 In the last 5 years, population-based genome-wide association studies (GWAS), followed by subsequent meta-analysis of GWAS and Immunochip data, have greatly expanded the number of IBD-associated loci to more than 160.3

These types of genetic studies have reinforced the importance of genes and pathways previously known to be involved in IBD pathogenesis, such as barrier function, the role of T cell subsets and cytokine–cytokine receptor signalling. In addition, these studies have helped uncover unanticipated new genes and pathways, including autophagy, regulation of interleukin 23 (IL-23) signalling and others (table 1). Recent studies have also highlighted the importance of host defence pathways, specifically those involved in the handling of mycobacteria, as important in balancing inflammatory responses as discussed further below.3 IBD GWAS have also demonstrated a significant degree of overlap between loci for UC and CD, as well as a high degree of overlap between susceptibility genes for IBD and for other complex diseases. Given this overlap, it is likely that similar pathways underlie IBD and that small differences result in diverse phenotypic presentations of UC and CD. Potentially the same polymorphism can have different cell- or tissue type-specific effects. Also, additional sequencing studies in carefully phenotyped patients may identify additional variants within the same loci that result in different phenotypes. Many genetic factors underlying early onset and adult IBD also appear to be the same, although it remains unknown whether the genes/genomic regions are entirely the same regardless of the age of onset. In patients 0–2 years of age, clinical presentation of IBD is atypical, with CD cases affecting the colon and UC patients often presenting with pancolitis. The great degree of overlap between IBD and other autoimmune diseases can largely be explained by the involvement of pathways such as antigen presentation and pathways that contribute to intestinal mucosal homeostasis. However, the identification of pathways that involve Th17 cells, CARD9, NOD2, reactive oxygen species and IL-1β suggests that IBD can also be defined as an autoinflammatory disease. 
The diagnostic value of these genetic associations with disease remains unclear. Although some studies have begun to find clinical associations, these approaches need to be confirmed in large well-phenotyped cohorts.

Table 1

Pathways identified by IBD GWAS

Deciphering phenotype from pathways

Although IBD-associated susceptibility genes and pathways offer great promise for researchers, limiting our investigations to the identification of these risk factors is likely to lead to an incomplete understanding of pathogenesis. Given that IBD-associated genetic variants are present in many individuals who do not develop disease, as well as the suggestion that classic loss-of-function variants play only a minor role in pathogenesis, a full explanation of disease complexity will require significantly more knowledge.4 ,5 For example, it will be imperative to understand the cell types involved in disease initiation and progression, how IBD pathways are regulated in these cell types, how a single pathway can exert different biological phenotypes in discrete cell and tissue types and how multiple disease variants interact.

Many of the pathways in IBD are known to have heterogeneous effects when activated in different cell types, and these cellular outcomes may be compounded to affect disease. For example, in epithelial cells, autophagy pathways play a key role in bacterial clearance; however, in macrophages, the same autophagy genes affect the ability of cells to secrete IL-1β, a key inflammatory mediator involved in host defence. Therefore, the same pathway can affect disease pathogenesis through different actions in discrete cell types. Furthermore, IL-1β can act through both innate lymphoid cells and CD4 T cells to stimulate IL-17 secretion and chronic intestinal inflammation,6 demonstrating that the same cytokine can act on multiple arms of the immune system to promote inflammation.

The IL-23–Th17 pathway is a key IBD pathway with well-characterised roles in microbial defence and intestinal immune homeostasis, and a number of genes within this pathway have been found within risk loci. In addition to its effects on classical inflammatory monocytes, myeloid cells and stromal cells, this pathway is highly influenced by environment. In recent studies of murine models of IL-23-dependent colitis, inflammatory cytokines have been shown to affect haematopoietic stem and progenitor cells, resulting in the overproduction and accumulation of granulocyte-monocyte progenitor cells in the intestine that are sufficient to aggravate the colitis phenotype.7 Additional studies will help define the heterogeneous spectrum of cellular phenotypes associated with each IBD pathway and contribute to our understanding of IBD pathogenesis.

Decoding non-coding DNA

To date, the bulk of the research on IBD genetics has focused on the impact of mutations in coding regions of genes and on the placement of these genes into discrete pathways. While these studies have been valuable in helping to characterise IBD susceptibility genes, approximately 70% of known IBD susceptibility loci are not coding variants. Understanding the implication of this finding has been aided by recent results published by the Encyclopedia of DNA Elements (ENCODE) consortium. This consortium has provided an unprecedented view of the genome that goes beyond cataloguing human sequence variation and begins to provide an integrated view of the role of functional elements in gene regulation.8 Importantly, the ENCODE project found that more than 80% of the genome is involved in at least one biochemical process, although this number is likely overestimated.9

Data from the ENCODE project and similar studies could help characterise largely unstudied IBD susceptibility elements and reveal important insights into key genes and pathways. To address how these non-coding DNA elements might have an impact on disease, the ENCODE project evaluated thousands of reported single-nucleotide polymorphisms (SNPs) curated in the National Human Genome Research Institute GWAS catalogue, finding that the majority of disease-associated SNPs reside in ENCODE-defined regulatory regions.8 ,10 This observation is consistent with data suggesting that most disease-associated SNPs reside in regions that affect gene expression.11 ,12 For IBD, 64 of the associated SNPs are in linkage disequilibrium with variants that are known to regulate gene expression.3 In T helper cells, CD-associated variants are sensitive to DNase, a phenotype generally associated with cis-regulatory modules such as promoters and enhancers.11 This finding suggests that fine mapping studies of the regulatory landscape surrounding IBD SNPs in specific cell types and tissues could yield important phenotypic insights into pathway heterogeneity.

Gene regulation goes far beyond the presence of promoters or enhancers. Non-coding risk variants could act at the level of DNA, through modification of transcription factor binding sites or epigenetic modification of regulatory regions that control the expression level of a given gene, as well as at the RNA level, through long intergenic non-coding RNAs or microRNAs (miRNAs). Mounting data suggest clear roles for epigenetic modifications in maintaining immune homeostasis. For example, studies have revealed that altering the metabolic rate of intracellular transmethylation reactions is sufficient to ameliorate autoimmunity in mouse models of lupus.13 However, the precise role of non-coding RNAs in IBD pathogenesis is still being defined. Non-coding RNAs are not transcribed into protein products, although they can interact with chromatin regulators to adjust the expression of other genes in cis or, more commonly, in trans.14 Treatment of cells with pro-inflammatory molecules such as muramyl dipeptide, an activator of the NOD2 pathway, causes upregulation of a specific group of miRNAs, implying that these miRNAs could help fine-tune the inflammatory response and act synergistically with IBD variants in these inflammatory pathways.15 Recent studies have also catalogued changes in miRNA expression in IBD and suggested a role for miRNAs in IBD pathogenesis (reviewed in16).

Demystifying dysbiosis

Much of the genetic susceptibility data from IBD studies suggest impaired handling of bacteria as well as an improper immune response to potential pathogens. Consistent with this hypothesis, dramatic shifts in the gut microbiota have been associated with IBD. These include alterations in the relative abundances of approximately a dozen bacterial taxa, as well as a decrease in the diversity of the community.17 ,18 It remains largely unknown whether the severity of gut dysbiosis is the cause of, or the response to, the severity of the disease. Although certain opportunistic pathogens such as Enterobacteriaceae have increased relative abundance in IBD patients19 and in mouse models of intestinal inflammation,20 in most cases causal connections remain elusive, and the possibility remains that alterations in the abundance of gut commensal bacteria play a role in IBD pathogenesis. Plausible causal mechanisms have been proposed for certain taxa, such as the noted decrease in the genera Roseburia and Phascolarctobacterium associated with CD. Based on studies of related taxa, these bacteria are expected to produce butyrate21 and propionate,22 respectively; Roseburia is also expected to increase production of T regulatory cells.23 A reduction in the relative abundance of these members could therefore cause a decrease in anti-inflammatory agents. In the cases of alterations of other common gut commensals such as Ruminococcaceae and Leuconostocaceae, the direction of causality remains unclear.
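The decrease in community diversity described above is commonly quantified with a summary index. The following is a minimal sketch, assuming the Shannon index as the measure (the cited studies may have used other metrics) and using hypothetical read counts chosen only for illustration; the function and variable names are not taken from any study.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Convert raw counts to relative abundances, skipping absent taxa.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical read counts per taxon (illustrative only, not study data).
even_community = [25, 25, 25, 25]    # four taxa, evenly represented
skewed_community = [85, 5, 5, 5]     # same four taxa, one dominant

h_even = shannon_diversity(even_community)
h_skewed = shannon_diversity(skewed_community)
print(f"even: {h_even:.3f}, skewed: {h_skewed:.3f}")
```

In this toy case both communities contain the same four taxa, yet the community dominated by one taxon scores lower, which is the kind of shift a reduced-diversity finding reflects.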

A severe imbalance in the composition of the gut microbiome is often referred to as dysbiosis, but the term is poorly defined. A balance of healthy gut commensal bacteria is required for suppression of pathogenic infections,24 with increasing evidence that restoration of normal commensals via transplant is more effective at fighting Clostridium difficile infection than antibiotics.25 Transplants are especially relevant for IBD patients, where recurrent C difficile infections increase morbidity and mortality and are increasing in prevalence.26 Given the recent findings of high variability of relative abundances of constituent taxa both between healthy individuals and within a single healthy individual over time,27 ,28 we may continue to find dysbiosis challenging to define in terms of taxonomic or phylogenetic composition. It has been proposed that human gut microbial communities may be partitioned into three discrete clusters.29 If this were true, it would greatly simplify the definition of dysbiosis and could have important implications for disease diagnosis and treatment, but subsequent analysis of a substantially larger population revealed that gut microbial community composition follows a relatively smooth distribution across the global human population, with primary variation largely driven by continuous gradients of dominant taxa.30 In contrast, the functional repertoire of the gut microbiota appears to be relatively stable both within and between individuals.27 Changes in functional composition have been observed in subjects with IBD, including enrichment of genes in sulfur-metabolism pathways, and a decrease in butanoate and propanoate metabolism specifically in subjects with ileal CD. 
Bacterial proteases, from both pathogens and commensals, have also been implicated in intestinal inflammation.31 Given its relative stability, the functional composition of the gut microbiome may provide a more consistent definition of dysbiosis and may more readily provide evidence of genome–microbiome interactions in future exploratory studies.
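The contrast between variable taxonomic composition and stable functional composition can be illustrated by comparing two subjects with a dissimilarity measure. The sketch below assumes Bray-Curtis dissimilarity as that measure, which is a common choice in microbial ecology but is not stated in the text above. All counts are hypothetical and were constructed to show the pattern, not derived from any cohort.

```python
def relative_abundance(counts):
    """Scale raw counts so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two equal-length profiles (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(p, q))
    denominator = sum(a + b for a, b in zip(p, q))
    return numerator / denominator

# Hypothetical taxon counts for two healthy subjects (same taxa, same order).
taxa_subject_a = [60, 30, 10, 0]
taxa_subject_b = [10, 20, 30, 40]

# Hypothetical gene-family (functional) counts for the same two subjects.
func_subject_a = [40, 35, 25]
func_subject_b = [38, 36, 26]

taxonomic = bray_curtis(relative_abundance(taxa_subject_a),
                        relative_abundance(taxa_subject_b))
functional = bray_curtis(relative_abundance(func_subject_a),
                         relative_abundance(func_subject_b))
print(f"taxonomic dissimilarity:  {taxonomic:.2f}")
print(f"functional dissimilarity: {functional:.2f}")
```

If healthy subjects routinely differ this much at the taxonomic level, a taxonomic threshold for dysbiosis is hard to set, whereas a departure from a tightly clustered functional profile would be easier to detect.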

Altered immune response to bacterial products

IBD-associated genes in host cells indicate altered response to gut microbiota as a primary determinant of disease risk and a likely mechanism for the disease. A number of host biological functions related to protection from and management of gut bacteria are susceptible to deleterious genetic mutations in constituent genes (figure 2). These include NOD2, which stimulates the immune system to respond to the presence of certain bacteria-produced peptidoglycans. Several NOD2 mutations are known to be pathogenic in CD.3 ,32 Although NOD2-deficient mice are more susceptible to infection by specific bacterial pathogens,33 the extent to which NOD2 deficiencies alter the host immune response to gut commensal bacteria is not known. The IL-23 receptor (IL23R) also plays an important role in response to pathogens34 and mutations of IL23R are associated with increased IBD risk.3 Elevated levels of IL-23 have been found in the epithelial mucosal barrier in subjects with IBD, further indicating the role of IL-23 in the chronic inflammatory response to luminal bacteria.

Figure 2

Interaction network of host genetics, the gut microbiome and diet in overview (A) and in detail (B). Chronic inflammation in the intestinal epithelium has been associated with increased production of Th17 cells, impaired innate immune response, decreased mucosal barrier, impaired autophagy and a decrease in antimicrobial agents. There is a complex network of potential interactions, in some cases involving feedback, among impaired host immune functions, diet, and the taxonomic and functional dysbiosis of the gut microbiome. For example, deleterious mutations in NOD2, GPR35, ATG16L1 or IRGM may lead to impaired immune response to commensal bacteria, and subsequently to taxonomic dysbiosis, an imbalance in the taxonomic composition of the gut microbiota; taxonomic dysbiosis may cause metabolic dysbiosis, an imbalance in the metabolic capabilities of the gut microbiome; metabolic dysbiosis may include increased biosynthesis of tryptophan; increased tryptophan is expected to lead to decreased antimicrobial activity through several pathways (see text); and impaired antimicrobial activity may lead to further taxonomic and metabolic dysbiosis. A similar feedback system may be proposed for the physical integrity of the epithelial barrier: impaired innate immune response and increased production of Th17 cells may lead to decreased integrity of the mucosal barrier; altered or impaired mucus production due to MUC19 deficiency may compound this effect; and subsequent invasion of pathobionts, or opportunistic pathogens, may increase inflammation, leading to further breakdown of the epithelial barrier.

The association of genes SLC22A5, GPR35 and GPR65 with IBD pathogenesis suggests an impaired immune response to bacteria-derived ligands and metabolites.35 Although it is known that gut commensals can produce pathogen-associated molecular patterns similar to pathogenic species, in healthy subjects toll-like receptors in dendritic cells respond selectively to pathogenic bacteria while largely ignoring gut commensals.36 Due to the high degree of selectivity required by this task, even subtle defects in microbial product sensing might be expected to contribute to a chronic inflammatory response. Bacterial proteases are also expected to play a role in intestinal inflammation, including those proteases from commensals. The gut commensal Enterococcus faecalis produces gelatinase, a metalloproteinase that disrupts the epithelial barrier and increases inflammation in mice.31 This disruption occurs only when the host has genetic susceptibility to inflammation, for example, via IL10 or NOD2 deficiency, thereby associating genetic risk of IBD with increased sensitivity to by-products of commensal bacteria.

The role of dietary nutrients and metabolites

In addition to short-chain fatty acids, a number of other metabolites are expected to be involved in host–microbiome interactions. Tryptophan provides an important intermediate metabolite for the action of aryl hydrocarbon receptor (AHR) in suppressing immune responses in dendritic cells. Specifically, AHR induces indoleamine 2,3-dioxygenase (IDO), which catabolises tryptophan into kynurenine (Kyn).37 A deficiency in AHR causes reduced Kyn and subsequently increased production of pro-inflammatory Th17 cells. In a similar case, a synthase acting to reduce tryptophan levels, tryptophan hydroxylase-1, is required for immune suppression in several inflammation models.38 Direct interaction of tryptophan levels with the microbiome has been demonstrated through the tryptophan-induced production of antimicrobial peptides via the mammalian target of rapamycin pathway.39 Recent findings implicate cytokine IL-27 in maintaining epithelial barrier protection against normal intestinal bacteria, and IL-27 has been associated through GWAS with increased risk of CD.3 Transcriptional analysis indicates that IL-27 activates several members of the signal transducers and activators of transcription family (STAT1, STAT3, STAT6), and depends on STAT1 to activate IDO.40 In this study, IDO was found to exert antibacterial effects on luminal bacteria specifically by depletion of tryptophan independent of the presence of Kyn. Thus, modulation of gut tryptophan levels by diet or by microbial biosynthesis is likely to have differential effects in individuals with altered or impaired function in AHR, IL-17-IL-23 or IL-27 pathways.

Long-term dietary habits are associated with the overall structure of the gut microbiome in humans41 and with expansion of a particular pathobiont, Bilophila wadsworthia, in mice.42 Increased levels of taurocholic acid induced by consumption of certain saturated fats cause an increase in organic sulfur in hepatic bile. This leads to the subsequent blooming of sulfite-reducing B wadsworthia, which then leads to higher rates of colitis in IL10-deficient mice. Together, these interactions provide a possible mechanism by which shifts in dietary nutrients could have inflammation-inducing interactions with the gut microbiota in individuals with certain genetic mutations (figure 2).

Genetic risk of infection

IBD genetic risk loci have substantial overlap with risk loci for primary immunodeficiencies, especially with susceptibility to mycobacterium infection.3 IBD risk loci potentially involved in IL-17 production and therefore in response to bacterial infection are enriched for balancing selection, often seen in genes related to antiparasite immunity. Given this and other recent evidence of genetic predisposition to acquisition of certain pathogens or parasites in humans, we must consider the possibility of such effects in IBD pathogenesis. There is suggestive evidence that 13 genetic loci are associated with colonisation by periodontal pathogens,43 although the mechanism of action is not understood. Similarly, mutations in two genetic loci increase susceptibility to severe forms of falciparum malaria.44 Here the loci, found through GWAS, led directly to likely disease mechanisms when coupled with a thorough understanding of the complex stages of infection. There are a number of plausible mechanisms for a parallel effect in IBD. For example, impaired autophagy of invasive bacteria is a plausible point for genome–microbe interaction in which defences against otherwise commensal bacteria allow them to become pathogenic. Mutations in loci containing immunity-related GTPase family M (IRGM) and autophagy-related 16-like 1 (ATG16L1) are both found to increase CD risk (OR 1.3 and 1.2, respectively).3 NOD2, one of the strongest GWAS associations with CD risk, induces autophagy in dendritic cells and requires proper function of ATG16L1.45
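To put odds ratios of this size in perspective, the sketch below converts an odds ratio into an absolute risk for a carrier. The baseline risk of 0.5% is a hypothetical placeholder, not a figure from this review, and the function name is illustrative.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Absolute risk for carriers, given non-carrier risk and an odds ratio."""
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1 - baseline_risk)
    carrier_odds = baseline_odds * odds_ratio
    return carrier_odds / (1 + carrier_odds)

# ASSUMPTION: hypothetical risk in non-carriers, for illustration only.
baseline = 0.005

# Odds ratios quoted in the text for the IRGM and ATG16L1 loci.
risk_irgm = risk_from_odds_ratio(baseline, 1.3)
risk_atg16l1 = risk_from_odds_ratio(baseline, 1.2)

print(f"baseline: {baseline:.4%}")
print(f"IRGM carrier (OR 1.3): {risk_irgm:.4%}")
print(f"ATG16L1 carrier (OR 1.2): {risk_atg16l1:.4%}")
```

Under this assumed baseline, carrier risk remains below 1%, which is consistent with the earlier observation that susceptibility alleles are not sufficient on their own to trigger disease.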

Altered mucosal protection and bacterial invasion

The epithelial mucosal barrier that normally helps isolate the lamina propria from luminal bacteria is reduced in patients with IBD.46 The possible association of MUC19 with IBD risk3 could indicate an additional mechanism for host–microbe interaction. It is known that certain bacterial species such as Akkermansia muciniphila and Enterorhabdus mucosicola degrade mucus and can thrive on the mucus layer.47 Therefore, inherited alterations in mucosal composition or production have the potential to alter the composition of the luminal bacteria, especially those most proximal to the host epithelial cells (figure 2).

Multistage triggers for chronic inflammation

Given the large number of genetic loci associated both with IBD risk and with host–microbiome interactions, we may wish to consider pathogenic models that include multiple stages of disease development (figure 2). This implies bi-directional causality among altered host immune function and altered bacterial community functions, features, or by-products. A number of simple mechanisms may be considered that involve multiple stages of triggers. For example, a host genetic variant in NOD2 or IL23R may lead to elevated inflammatory response to the presence of a pathogen. This excessive response may damage the epithelial barrier, leading to colonisation by an opportunistic pathogen or an imbalance in normal gut commensal bacteria; increased exposure to these bacteria may cause further inflammation, leading to a chronic state of dysregulation. Genetic variants causing impaired mucosal barrier production may accelerate the process. Another such mechanism begins with impaired host sensing of bacterial by-products and metabolites via defects in SLC22A5, GPR35 or GPR65, again leading to dysbiosis and eventually a chronic inflammation state. In contrast, a healthy immune system would respond appropriately to transient infections by opportunistic pathogens without entering the overinflamed state, thereby avoiding the development of chronic dysbiosis.

Bi-directional causality with multistage triggers is supported by mouse experiments in which disease phenotypes can be caused by a genetic mutation and transplant of the microbiota from the mutant mouse to a wild-type mouse. This has been observed for colitis in mice deficient in NOD-like receptor family, pyrin domain containing 6 (NLRP6)48 and in malnutrition-related intestinal inflammation in ACE2 knockout mice.39 In the former case, mutant mice had altered faecal microbiota and increased susceptibility to chemically induced colitis; wild-type mice acquired the increased susceptibility after transplant. In the latter case, a detailed mechanism was determined involving impaired production of antimicrobial peptides via the mammalian target of rapamycin pathway. Mice with the knockout developed inflammation, whereas wild-type mice developed the same inflammation after transplant of the ‘inflamed’ microbiome. In closer relation to IBD, NOD2 deficiency has been found to induce colitis-causing dysbiosis in mice, with phenotype conferred by either genetic inheritance or inheritance through maternally transmitted microbiota.49 Host–microbiome feedback in chronic inflammation is further supported by evidence that immune response to transient infection can lead to long-term adaptive immune response to gut commensals.50 During mucosal infection by Toxoplasma gondii, some T cells differentiate into memory cells specific to gut commensals. Because increased CD4 T cell activation is associated with IBD,3 ,51 ,52 it is possible that improper sensing of commensal bacteria leads to chronic inflammation, but only after exposure to a bacterial infection.

Pursuing microbiome-wide association studies–GWAS

Each of the above models requires both genetic predisposition to IBD and exposure to certain types of bacteria or bacterial products. Although the details of such interactions are largely suppositional, given the large environmental component of disease risk in IBD, the strong associations of genetic risk loci with response to microbial symbionts, and the associations of a number of bacterial taxa with IBD, genome–microbiome interaction is a likely candidate for further study. No such connections have been demonstrated to date in a diseased cohort. Discovery of such interactions is complicated by a number of factors including the multiple stages of pathogenesis, the large number of interactions to be tested, and high intersubject and intrasubject variability in the gut microbiota. Depending on the strength of the true associations, these limitations may be overcome by careful treatment of cohort selection and data analysis. The feedback-based models described above involve cascades of several trigger events that lead to eventual establishment of chronic inflammation. Therefore, the ideal analysis would involve observation of the gut microbiota longitudinally before and after disease presentation. It is likely that exploration of interactions between host genetics and the functional, rather than taxonomic, composition of the gut microbiome will provide both stronger association signals and more direct insights into the mechanisms of the disease.
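The burden imposed by the large number of interactions to be tested can be made concrete with a short calculation. The sketch below assumes a Bonferroni correction, which is one conservative option and is not prescribed by this review. The locus count of 160 comes from the text, whereas the number of microbiome features is a hypothetical placeholder.

```python
def bonferroni_threshold(n_loci, n_features, alpha=0.05):
    """Return (number of pairwise tests, per-test significance threshold)."""
    if n_loci <= 0 or n_features <= 0:
        raise ValueError("n_loci and n_features must be positive")
    # One test for every locus paired with every microbiome feature.
    n_tests = n_loci * n_features
    return n_tests, alpha / n_tests

n_loci = 160       # IBD-associated loci, as reported in the text
n_features = 100   # ASSUMPTION: hypothetical count of taxa or functional modules

n_tests, threshold = bonferroni_threshold(n_loci, n_features)
print(f"{n_tests} tests, per-test threshold {threshold:.3e}")
```

Because the threshold shrinks in proportion to the number of features, restricting the analysis to a smaller set of stable features is one way that careful study design could preserve statistical power.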

Key messages

  • More than 160 genetic loci have been associated with susceptibility to inflammatory bowel disease (IBD). These genetic findings have led to the identification of several known and novel pathways that are involved in IBD, but understanding the cell types involved in these pathways remains an important unresolved goal.

  • IBD-associated genes in host cells indicate that altered responses to gut microbiota may be a primary determinant of disease risk and a likely mechanism for the disease.

  • The diversity and composition of the gut microbiota are major environmental factors influencing gut homeostasis. A severe imbalance in the composition of the gut microbiome, often referred to as dysbiosis, has been associated with IBD.

  • The concept of dysbiosis remains poorly defined. Describing dysbiosis in terms of taxonomic or phylogenetic composition is likely to remain challenging due to high intraindividual and interindividual variation. In contrast, defining dysbiosis in terms of the relatively stable functional composition of the gut microbiome may be a more promising approach.

  • Particular dietary nutrients and metabolites likely interact with host genetics to influence host–microbiome interactions and thereby contribute to inflammation.

  • Microbiome interactions with host genetics may be best understood as a bi-directional relationship between altered host immune function and altered bacterial community functions, features or by-products.


We thank Natalia Nedelsky for editorial assistance. This work was supported by funding from the Crohn's and Colitis Foundation of America Genetics Initiative as well as NIH grants DK097485 and DK062432 to RJX.



  • DK and KGL contributed equally.

  • Contributors DK, KGL and RJX wrote the review.

  • Competing interests None.

  • Provenance and peer review Commissioned; externally peer reviewed.