
Family studies in Crohn's disease: new horizons in understanding disease pathogenesis, risk and prevention
Charlotte R Hedin (1,2), Andrew J Stagg (2), Kevin Whelan (1), James O Lindsay (3)

  1. King's College London, School of Medicine, Diabetes and Nutritional Sciences Division, London, UK
  2. Centre for Immunology and Infectious Disease, Blizard Institute of Cell and Molecular Science, Queen Mary University of London, London, UK
  3. Digestive Disease Clinical Academic Unit, Barts and the London NHS Trust, London, UK

Correspondence to Dr James O Lindsay, Barts and the London NHS Trust, Endoscopy Unit, The Royal London Hospital, Whitechapel, London E1 1BB, UK; james.lindsay{at}


Crohn's disease (CD) is an incurable intestinal disorder in which the loss of immune tolerance to the commensal gut microbiota leads to chronic inflammation. The reason this occurs in specific individuals is unclear; however, a genetic predisposition is fundamental and relatives of patients with CD are at significant risk of developing the disease. Knowledge relating to the genetic loci that predispose to CD is accumulating, which raises the possibility of disease prediction and prevention in susceptible populations. However, the genetic basis of CD is complex and genotyping alone is likely to be insufficient to predict disease risk accurately. Specific physiological abnormalities associated with CD, such as increased intestinal permeability and raised faecal calprotectin, are also abnormal in some relatives of patients with CD. The combination of genotypic factors and biomarkers of risk makes the development of models of disease prediction a realistic possibility. Furthermore, enhanced understanding of the genotype and phenotype of the at-risk state in relatives of patients with CD allows the earliest stages in the pathogenesis of CD to be investigated and may allow intervention to prevent disease onset. This article reviews current knowledge of the at-risk phenotype in relatives of patients with CD and focuses on the implications for the design of future studies.

  • Crohn's disease
  • family
  • siblings
  • risk
  • primary prevention
  • gut immunology
  • intestinal microbiology
  • intestinal permeability
  • stool markers


Crohn's disease (CD) is a chronic intestinal inflammatory disorder that results in significant morbidity, impaired quality of life, loss of earnings and significant social and health cost burdens.1 2 Its incidence is increasing in specific populations,3 and it predominantly affects young people, with up to 25% of patients diagnosed before their 18th birthday. Current theories of pathogenesis suggest that the intestinal inflammation is due to an abnormal and prolonged T cell-mediated immune response directed against the commensal gut microbiota that occurs in genetically susceptible individuals after as yet undefined environmental insults.4

CD is a complex polygenic disease, and having a sibling or parent with CD is a significant risk factor for its development. The siblings of patients with CD have a relative risk (RR) of developing CD of up to 35 times the background population risk,5 and offspring where both parents have CD have even greater risk; ∼36% are likely to develop the disease.6 Twin studies have indicated that heritability (the proportion of the total phenotypic variance in a population due to genetic variation)7 of CD is high, with monozygotic twin disease concordance up to 50%.8 Disease location is also highly concordant between monozygotic twins,9 and there is a cumulative effect of carrying risk variants in multiple genes which increases the incidence and severity of disease.10 Although a recent re-analysis of an early twin cohort (which added twin pairs not previously included) has suggested lower monozygotic and dizygotic concordance rates of 27% and 2%, respectively,11 studies of the families of patients with CD have been central to uncovering the genetic basis of CD. Studying the families of patients with CD also allows comparisons of behaviour and environmental exposure between patients and unaffected relatives that enable the effect of influences such as childhood environment, birth order,12 smoking,13 breastfeeding14 and diet to be elucidated. Furthermore, healthy but genetically predisposed relatives may manifest biomarkers that reflect genetic risk, environmental exposures or incipient disease. There may be advantages in exploring these biomarkers in siblings rather than patients: in patients, documented abnormalities in immunological, microbiological or physiological characteristics are somewhat removed from the initial disturbances that occurred at disease onset, due to the progression of the disease as well as its pharmacological and surgical management. 
In contrast, the study of unaffected relatives allows the role of these factors to be resolved in the absence of the confounding effect of significant disease and may provide a window into aspects of early disease pathogenesis. In addition, an accurate description of the ‘at-risk’ state for the siblings and offspring of patients with CD, combining genetic factors and biomarkers of risk, gives potential for disease prediction and prevention. Longitudinal surveys in families who are enriched for both genetic and environmental risk factors create a more manageable cohort to investigate disease prediction. The complexity of cumulative subtle genetic effects may be difficult to capture, and predictive models of chronic disease are likely to require a combination of genotype and biomarkers. In this context, studies of the families of patients with CD take on heightened importance. This article evaluates current knowledge of the at-risk state in relatives of patients with CD. The potential of family studies to inform the design of predictive models of CD risk and to elucidate the early stages of CD pathogenesis is explored and the implications for disease prevention are discussed.
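For reference, the two epidemiological quantities used in this section, heritability and the sibling relative risk, can be written compactly as follows. The symbols are conventional notation introduced here for illustration and are not taken from the article: \sigma_G^2 denotes the genetic variance, \sigma_P^2 the total phenotypic variance, and the probabilities are lifetime risks of CD.

```latex
\[
H^2 = \frac{\sigma_G^2}{\sigma_P^2},
\qquad
\mathrm{RR}_{\text{sibling}}
  = \frac{P(\text{CD} \mid \text{sibling affected})}{P(\text{CD in the background population})}
\]
```

Under these definitions, the sibling RR of up to 35 quoted above means that the risk in a sibling is up to 35 times the background population risk; it does not by itself give the absolute risk.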


Genome-wide association studies (GWAS) have flooded the literature with newly discovered genetic loci that predispose to CD.15 16 However, the presence of inflammatory bowel disease (IBD)-associated genes in many unaffected individuals17 and the failure of the ∼71 genetic susceptibility loci identified thus far to explain more than a quarter of the genetic risk of CD18 highlight the complexity of the genetic basis of IBD. It is likely that most of the larger-effect variants associated with CD have already been described. Therefore, disease-risk alleles identified in future GWAS are likely to have a much smaller impact, with each accounting for a diminishing proportion of the heritability of CD. In addition, the predictive value of genetic models has thus far been disappointing. A measure of the discriminatory power of a risk model can be obtained using the area under the curve (AUC) of a receiver operating characteristic curve. In a model including 30 of the earliest identified CD-predisposing loci, the AUC for prediction of disease was estimated to be ∼73%. However, in a theoretical model encompassing the 142 susceptibility loci estimated to exist, the resultant increase in AUC (to 79%) was modest.19 Thus, increments in the proportion of heritability explained do not necessarily increase the power of prediction, and it may be that common variants identified by GWAS will neither be sensitive and specific enough to predict disease risk, nor account for the known heritability.
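To make the AUC measure concrete, the following is a minimal sketch of how it can be computed for any risk score. It uses the standard equivalence between the AUC and the probability that a randomly chosen case has a higher score than a randomly chosen control. The function name `roc_auc` and the scores are hypothetical and purely illustrative; they are not data from any of the studies cited here.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the probability that a randomly chosen case scores higher
    than a randomly chosen control; tied scores count as half a win.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores (for example, a count of risk alleles carried).
# These values are invented for illustration only.
cases = [5, 7, 6, 8, 4]
controls = [3, 5, 4, 6, 2]

auc = roc_auc(cases, controls)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to a score that discriminates no better than chance and 1.0 to perfect separation, which is why an increase from ∼73% to 79% is a modest gain in discrimination.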

Several explanations for the ‘missing heritability’ exist,20 including multiple independent effects at GWAS loci and epistatic, epigenetic or parent-of-origin effects. In addition, some variants may reside in largely unexplored regions of the genome (such as areas of deletion, duplication and inversion) which are poorly captured by existing arrays,20 and variants with small effects are often below the threshold for detection in most GWAS.20 The contribution of small-effect variants may be significant since models using the full data from GWAS to estimate the distribution of effect sizes for complex traits imply a large number, possibly thousands, of loci with very small effect sizes.21 Finally, genotype–environment interactions, such as that described between Atg16L1 and murine norovirus in a mouse model of CD,22 are likely to be important.23 Furthermore, recent estimates of the heritability apparent in epidemiological studies of CD are lower and thus the proportion that remains to be explained may be less than previously assumed.11

Future genetic studies in families will be an important tool in disentangling the nuances of several potential explanations for the missing heritability and may enable the cumulative impact of rare variants on phenotype to be examined.20 Furthermore, prediction of relatively uncommon diseases, such as CD, at a population level will be challenging but may be feasible in at-risk populations of relatives. However, the diminishing returns from larger GWAS imply that a significant proportion of the heritability in CD will not be captured and, therefore, in order to predict disease, combinations of genotype and biomarkers of risk will need to be pursued.

The dimensions of the at-risk phenotype

Intestinal permeability (IP)

Patients with CD have increased IP,24 both in active disease25 and in unaffected areas of intestine,26 implying that increased IP may not simply be a consequence of localised, overt inflammation. Animal models such as knockouts of Muc2, the main intestinal secreted mucin gene,27 and the SAMP1/YitFc model, in which spontaneous ileitis is preceded by increased IP, suggest a role for increased IP in pathogenesis.28 In addition, GWAS have identified CD risk loci in the MUC19 and MUC1 genes,18 29 which code for protein components of the intestinal mucous layer and provide barrier protection for the epithelium.

Increased IP has also been reported in the healthy relatives of patients with CD.30 This finding has been disputed,31 although the studies that report normal IP in unaffected CD relatives have compared mean or median IP scores between groups, which will conceal heterogeneity within the CD relatives group.32 Indeed, given the distribution of predisposing genetic variants between relatives and the potentially inconsistent exposure to environmental precipitants, only a proportion of CD relatives would be expected to manifest an at-risk phenotype, such as increased IP. Accordingly, a re-analysis of data from one of the studies that was initially reported as negative confirmed a subset of healthy relatives (but not controls) with increased IP.32 Consequently, subsequent studies have reported the proportion of relatives with increased IP, rather than merely the mean or median of the group. Across these studies, between 20% and 54% of CD relatives have IP above the normal range.33–37
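The statistical point above, that a comparison of group means can hide a subgroup with abnormal values, can be illustrated with a short sketch. All numbers below are invented and were deliberately constructed so that the two group means coincide; the permeability values, the upper limit of normal and the helper `proportion_above` are hypothetical and do not come from any cited study.

```python
from statistics import mean


def proportion_above(values, upper_limit):
    """Fraction of individuals whose value exceeds the upper limit of normal."""
    return sum(v > upper_limit for v in values) / len(values)


# Hypothetical permeability index values (arbitrary units), invented for
# illustration and constructed so that both groups have the same mean.
controls = [0.020, 0.022, 0.018, 0.025, 0.021,
            0.019, 0.023, 0.020, 0.024, 0.018]
relatives = [0.012, 0.014, 0.013, 0.015, 0.012,
             0.014, 0.013, 0.040, 0.042, 0.035]

UPPER_LIMIT = 0.030  # hypothetical upper limit of the normal range

print(f"mean, controls:  {mean(controls):.3f}")
print(f"mean, relatives: {mean(relatives):.3f}")
print(f"above limit, controls:  {proportion_above(controls, UPPER_LIMIT):.0%}")
print(f"above limit, relatives: {proportion_above(relatives, UPPER_LIMIT):.0%}")
```

In this constructed example the means are identical, yet three of the ten relatives and none of the controls exceed the limit, which is the pattern that reporting the proportion above the normal range is designed to reveal.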

There may be a genetic basis to the increased IP in CD relatives as it is more commonly found in multiplex families (with more than one member affected by CD) than in families of sporadic cases of CD.37 In addition, CD relatives have been shown to have increased IP regardless of cohabitation with the CD index case.36 Furthermore, increased IP in CD relatives is associated with NOD2 mutations.36 37 It has therefore been proposed that increased IP is a pathogenic mechanism and risk marker for CD, and indeed one case report describes increased IP prior to the onset of CD in one individual.38 In contrast, some authors have reported increased IP in (genetically unrelated) spouses,34 and have suggested that increased IP may be a product of a shared environment. However, the proportion of spouses with increased IP in these studies is lower than that of genetically related family members and is not a universally consistent observation.36 37 Thus, combinations of genetic and environmental factors may be required to induce altered IP. Indeed, some CD relatives have been shown to have an exaggerated increase in IP in response to aspirin despite having normal IP at baseline.34 Thus, differing exposure to such environmental triggers could explain some of the variation in the rate of increased IP in CD relatives between studies. Potentially relevant environmental factors that alter IP include dietary components such as soluble plantain,39 calcium and oligosaccharides.40 41 High fructo-oligosaccharide intake has been associated with ileal CD,42 and prebiotic use in patients with CD is low.43 Furthermore, given the reported protective effect of breastfeeding for early-onset CD,14 it is noteworthy that IP is reduced in breastfed compared with formula-fed neonates.44 Families of patients with CD who are predisposed to increased IP may be the ideal population to detect and test the dietary factors that contribute to or prevent CD onset. 
However, for factors such as breastfeeding, the relative influence of genetic and environmental factors may be difficult to discern given that siblings are likely to share both.

In conclusion, the presence of increased IP in a subgroup of CD relatives implies that either constitutively increased IP or an abnormal permeability response to environmental triggers is a pathogenic pathway in CD, rather than a consequence of the inflammatory process. However, it is not possible to distinguish whether increased IP is a primary cause of CD or whether other primary events trigger subclinical inflammation that breaches the intestinal barrier.

Calprotectin and innate immunity

Calprotectin is a calcium-binding S100A8/S100A9 heterodimeric protein expressed mainly by granulocytes.45 Faecal calprotectin concentration is elevated as a consequence of leucocyte recruitment to the intestine and is therefore a marker for local innate immune activity. Raised calprotectin reflects disease activity in CD,46 and it can be used clinically to identify individuals whose symptoms are likely to be caused by IBD.47 Despite the high specificity of raised faecal calprotectin (generally regarded as ≥50 μg/g of stool) for detecting organic intestinal pathology,47 it has also been detected in the healthy relatives of patients with CD. A study assessing faecal calprotectin in patients with quiescent CD and their families found that 88% of patients, 49% of the relatives, but only 13% of spouses had faecal calprotectin over the 95th percentile for normal controls.48 Furthermore, raised calprotectin in relatives was associated with greater genetic proximity to the CD-affected proband; thus mean faecal calprotectin concentrations in full siblings were higher than in half-siblings, both of which were higher than in controls. This finding was reproduced in 135 first-degree relatives of patients with CD, 23% of whom had raised faecal calprotectin.49 This latter study also demonstrated an association between raised calprotectin and increased IP in a third of patients. In contrast, a small study of paediatric patients with CD and their families reported normal calprotectin concentrations in siblings of patients with CD.50 However, the younger age of siblings in this study suggests that sufficient accumulation of environmental exposures may not have occurred, preventing the manifestation of the at-risk phenotype.

These results imply that intestinal inflammation is present in some healthy relatives of patients with CD. However, although raised faecal calprotectin predicts disease relapse in patients with CD,51 there are, as yet, no studies with the longitudinal design required to determine whether raised calprotectin can predict CD onset. Furthermore, data on the immunological factors that may generate or perpetuate subclinical intestinal inflammation in relatives are lacking, although reduced chemotaxis of neutrophils in CD relatives has been demonstrated.52

Serum antibodies and adaptive immunity

Serum antibodies to specific microbes are associated with CD, including those directed towards mannan antigens of Saccharomyces cerevisiae (ASCA), Escherichia coli outer membrane porin C (OmpC), Pseudomonas fluorescens-related sequence I2, bacterial flagellin (Cbir1) and several carbohydrate antigens associated with bacteria and yeasts.53 ASCA positivity in patients with CD correlates with disease phenotype,54 is stable over time35 and is associated with altered immune function including increased responsiveness to T cell stimulation.55 ASCA do not vary with disease activity or treatment,56 suggesting that they may be a marker of disease phenotype rather than a correlate of inflammation. Furthermore, a retrospective analysis of blood samples taken from army recruits who had subsequently developed CD showed that ASCA were present in 10 (31%) individuals prior to the onset of disease.57 Therefore, ASCA positivity appears to be an early, possibly genetically determined marker of CD susceptibility.

The genetic determination of antimicrobial antibodies in CD is also suggested by their presence in unaffected relatives of patients with CD. Approximately 16–34% of healthy relatives of patients with CD have elevated ASCA35 58–60 which are rarely found in healthy controls and spouses. There is also evidence of aggregation of antimicrobial antibody positivity within families.60–62 In addition, ASCA have been associated with CD-predisposing genetic variants such as NOD2 mutations in both patients63 and CD relatives.64 However, some controversy exists as other authors have not found this genetic association in patients65 and some groups have reported that ASCA positivity occurs in relatives regardless of the ASCA status of the proband.58 59 66 In addition, in a study of monozygotic twins, ASCA concordance was high in twins concordant for CD but low when twins were discordant for CD, implying a role for environmental factors.66 The lack of ASCA positivity in spouses of patients with CD35 implies that such environmental factors may produce serum antibody responses only in genetically predisposed individuals or that they are a product of early life environmental exposures. ASCA could therefore be a useful marker for prediction of CD onset.57 However, in the study of army recruits, the shorter the interval between blood sampling and diagnosis, the higher the likelihood of ASCA being detected, implying that ASCA positivity may develop immediately prior to disease onset. Therefore, monitoring of ASCA levels over time may be required to detect risk adequately.

In addition to ASCA, antibodies to several other microbial antigens and to autoantigens have also been demonstrated in CD, some of which are also increased in CD relatives, including anti-OmpC,62 pancreatic autoantibodies,67 antinuclear antibodies,68 antibodies to epithelial antigens69 and goblet cell antibodies.70 However, none of these correlates as strongly with CD as ASCA, and consequently their utility in screening relatives is likely to be limited.

Finally, other features of adaptive immunity that are disturbed in patients with CD are also abnormal in their relatives. Thus, the ability to induce systemic antigen-specific tolerance by feeding with an antigen (ie, to induce oral tolerance) is impaired in CD,71 as well as in CD relatives. This effect is manifest as increased T cell proliferation and cytokine production following antigen re-challenge in the skin.72 In addition, CD relatives have been reported to demonstrate increased interleukin 2 (IL-2), IL-6 and IL-8 in intestinal biopsy samples,73 raised serum C-reactive protein (CRP) and haematological perturbations similar to those seen in patients with CD.74 However, immunological abnormalities known to occur in CD, such as changes in peripheral T cell homing behaviour75 76 or changes in T helper 17 (Th17) cell cytokines,77 have not been investigated in relatives and thus the mechanisms of the subclinical inflammation remain obscure. In addition, the factors in unaffected relatives that limit the intestinal inflammation to a level that does not cause clinical disease are unexplored.


The gut microbiota plays a crucial role in the pathogenesis of CD and many animal models of intestinal inflammation.78 Patients with CD exhibit altered gut microbiota (dysbiosis) compared with healthy controls.79 However, defining a specific CD-associated dysbiosis has been challenging, partly due to large interindividual variation in both healthy people and patients with CD.80 Despite this, some specific features of the CD-associated microbiota have been elucidated,4 including decreased gut microbial diversity, alterations in specific mucosal bacteria such as increased E coli and bacteroides, alterations in luminal bacteria such as increased E coli, and decreased faecal clostridia. Lower levels of Faecalibacterium prausnitzii have been found in patients with active compared with quiescent disease or controls,81 and low mucosal F prausnitzii concentrations following surgically induced remission predict subsequent endoscopic disease recurrence.82 Although the concept of an intestinal dysbiosis in patients with CD is well established, it is not clear whether disease develops in the context of a normal microbiota which then develops features of dysbiosis (dysbiosis as a consequence of CD), or whether disease develops in genetically susceptible individuals with a pre-existing dysbiosis (dysbiosis as a cause). Studies of the gut microbiota in unaffected relatives of patients with CD may answer this question, although few have been performed to date.83 84

The rationale for imputing a dysbiosis in CD relatives is provided by evidence of host genetic influence over the gut microbiota. The similarity in gut microbiota is greater in monozygotic than in dizygotic twins,85 a finding that persists when twins have lived apart for many years.86 This suggests that the gut microbiota in healthy humans is either genetically programmed, or defined by shared early environment factors. However, in diseased individuals the established microbiota is destabilised. In studies of twins concordant and discordant for CD, the microbiota of patients is more similar to that of other patients than to that of their unaffected twin.87 This tendency of an individual to deviate from their programmed microbiota under disease conditions may not be specific to CD, and a similar pattern has been shown in obesity.88 Nevertheless, components of the CD-associated dysbiosis are thought to be present in some relatives, including decreased Clostridium innocuum and Enterococcus.83 Decreased faecal bifidobacteria have also been demonstrated in both patients with CD and their relatives in association with possession of NOD2 mutations, suggesting that CD-associated genes may influence gut microbiota regardless of host phenotype.83 Recently, a unique dysbiosis present in CD relatives that distinguished them from healthy, unrelated controls but was also distinct from the microbiota of the affected proband has been described.84 CD relatives had higher concentrations of Ruminococcus torques, which has enhanced mucin degradation capacity. In contrast, patients with CD had lower F prausnitzii which is a major producer of butyrate, a short chain fatty acid with immunoregulatory properties. This study reinforces the concept of the CD relative as a window into aspects of early CD pathogenesis, which may no longer be apparent in patients with established disease. 
Research in families, particularly longitudinal studies, is ideally placed to disentangle primary pathogenic and secondary changes in the gut microbiota in CD.

A multidimensional at-risk phenotype

It is clear that the at-risk phenotype in some CD relatives is multidimensional, encompassing factors such as alterations in gut microbiota, IP and immune function (figure 1). In the most simplistic model it would be predicted that abnormalities in these factors in unaffected relatives should match those in their affected relative. However, in reality the relationship between these factors appears to be more complex and in most studies described above there are patient–relative pairs who are discordant for aspects of the at-risk phenotype. Given the complexity of the genetic determination of CD and the known interaction with environmental risk factors, this pattern is unsurprising. First, the degree of genetic relatedness of full siblings is on average ∼50%, but detailed analysis of similarity between sibling genomes reveals that it can range from 37% to 62%.89 Secondly, if factors such as exposure to aspirin can modify the expression of increased IP in relatives,34 then variations in other environmental exposures including those from the diet could account for discordance within families. In addition, most studies have relied upon measurements made at a single time point, precluding detection of relatives with an abnormal but fluctuating phenotype. Furthermore, phenotypic abnormalities in patients with CD may be altered by drug and surgical treatments, and abnormalities detected in a mature inflammatory reaction may not reflect the situation that was present at disease onset.

Figure 1

Relatives of patients with Crohn's disease (CD) are an intermediate group between low-risk healthy individuals and patients with CD. Relatives are enriched for CD risk alleles and manifest biomarkers of this at-risk state. The genetic basis of CD is defined by several larger- and moderate-effect alleles and many, perhaps even thousands, of small-effect variants. Relatives of patients with CD are therefore likely to possess high numbers of predisposing variants. The cumulative effect of these variants in relatives is insufficient to produce the CD phenotype, but biomarkers of this at-risk state such as increased intestinal permeability, raised faecal calprotectin and formation of anti-Saccharomyces cerevisiae antibodies are also present in a subgroup. New data suggest that a dysbiosis exists that is unique to relatives and distinct from that seen in patients with CD. However, there is little knowledge about other factors that are characteristic of CD such as altered T cell responses. Furthermore, apart from smoking, the specific environmental factors that interact with this at-risk state are ill defined, although food components, drugs and infective agents are suitable candidates.

Examining the various dimensions of the at-risk phenotype within an individual may reveal the interactions that occur during disease pathogenesis, and will be essential in constructing predictive models that would probably require a combination of biomarkers and genotyping to define risk accurately. Some studies have attempted to establish the relationship between different risk biomarkers. For example, patients with CD and their relatives with increased IP also have a higher proportion of CD45RO+ (memory) B cells, whereas relatives with normal IP do not show this immunophenotype. This may suggest a link between antigen exposure, via a compromised intestinal barrier, and immune cell activation.33 IP has also been shown to correlate with faecal49 and intestinal lavage fluid calprotectin concentrations90 in CD relatives, and it could be inferred that a dysfunctional intestinal barrier is due to increased innate immune activity in the gut. However, increased IP in CD relatives has been shown not to co-segregate with loss of oral immune tolerance,72 pancreatic autoantibodies67 or ASCA positivity.35 The latter observation would suggest that ASCA do not reflect a generalised increase in exposure to microbial antigens due to increased IP, and this is in keeping with the lack of antibodies to other luminal antigens in ASCA-positive patients.61 Overall, there are few studies that have attempted to correlate the various dimensions of the at-risk phenotype in CD relatives, but larger studies may reveal pathogenetically important relationships.

Potentially the most valuable contribution of family studies in CD is the prospect of disease prediction and prevention. However, to date this has not been realised; a small 7 year longitudinal study attempting to determine the predictive use of serum antibodies in relatives of patients with CD found only two new cases of IBD, both in antibody-negative relatives.91 In a larger study, the combination of the number of relatives affected by IBD and the number of serum antimicrobial antibodies for each relative was used to attempt to predict disease onset.92 However, this combination of risk factors was insufficient to predict the four new cases of IBD that occurred in this 54 month longitudinal follow-up study.

Many family studies are limited by the inclusion of unaffected parents of patients with CD who are beyond the peak age of CD incidence, which will reduce the power to detect prediagnosis biomarkers of IBD susceptibility. Inclusion of parents in studies seeking to define the at-risk phenotype is also inadvisable because factors including faecal calprotectin93 and gut microbiota94 vary with age in healthy populations. In addition, comparison of individuals separated by a generation will also be confounded by changes in the prevailing environmental conditions in the intervening period. Confining analyses of phenotypes of relatives to siblings would circumvent these issues and, furthermore, increases the likelihood of inclusion of patients with CD prediagnosis. However, the highest risk group is offspring where both parents have CD,6 and longitudinal studies should be designed to capture this group in order to optimise the testing of predictive models.


The study of the phenotype of CD relatives, particularly siblings and offspring, has the potential not only to provide unique insights into CD pathogenesis but also to facilitate the development of methods of disease prediction. The successful mining of the human genome for CD risk loci has brought closer the possibility of disease prediction and prevention, and studies of the families of patients with CD are, therefore, the next logical step. However, genotyping in isolation is likely to have limited predictive value and, therefore, there is a compelling need to define a multidimensional at-risk phenotype to allow disease prediction and to determine the value of preventive interventions. Observations in patients with established CD may be dominated by features of mature inflammation, which are not specific to CD, whereas in an unaffected, at-risk individual, pathogenic mechanisms may be more clearly resolved. Future studies should target the highest risk groups (siblings and offspring below or around the peak age of onset of CD) and aim to examine a wide variety of dimensions of the at-risk phenotype within the same individual, particularly focusing on microbial and immunological risk markers and dietary factors that have so far not been examined. Adequately powered longitudinal studies combined with periodic measurements in at-risk individuals are essential to allow rigorous testing of models of prediction as well as the detection of fluctuating risk markers in order to facilitate the realisation of the full potential of studies of the families of patients with CD. This may then allow strategies designed to prevent the onset of CD in the ‘at-risk’ population to be conducted.


The assistance of Professor David van Heel in reviewing the manuscript is gratefully acknowledged.




  • Funding CRH is a clinical research fellow funded by Core.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
