Main

IBD is a spectrum of chronic relapsing inflammatory disorders affecting the gastrointestinal tract that can be classified into Crohn disease and ulcerative colitis. The identification of CARD15 (refs. 1–3) and several loci associated with susceptibility to IBD in independent linkage studies4 documents the polygenic etiology of IBD. We previously identified a locus in the pericentromeric region of chromosome 10 that was associated with susceptibility to IBD in a genome-wide linkage scan involving 282 families of European descent5. Fine mapping at an average distance of 5 cM using an additional 11 microsatellite markers in an extended linkage cohort (111 additional families including 422 affected sibling pairs) confirmed initial linkage findings and identified a two-peak linkage curve extending from D10S547 to D10S192 (multipoint lod score = 2.07, P = 0.0033 at D10S548) and a second peak at D10S201 (multipoint lod score = 1.6) for Crohn disease (Fig. 1). We used a hierarchical linkage disequilibrium (LD) study to search for the causal variant(s) in the 40-cM interval (Fig. 1). Transmission disequilibrium testing (TDT) of trios randomly drawn from each family showed a significant single-point association with Crohn disease at D10S201 (P < 0.01), located in the second linkage peak on 10q22–10q23. We finely mapped the underlying 5-Mb region at an average distance of 75–120 kb using 37 single-nucleotide polymorphisms (SNPs) selected from the TSC allele frequency project in 457 independent trios with IBD (Supplementary Table 1 online). The marker TSC0376484 (rs1344966) in this panel was significantly associated with Crohn disease (χ2 = 9.00, P = 0.002) and more strongly with IBD (Crohn disease and ulcerative colitis, χ2 = 11.65, P = 0.0006; Fig. 1).

Figure 1: Genetic variants of DLG5 are associated with IBD.
The experimental steps from linkage mapping on the long arm of chromosome 10 to identification of genetic variants in DLG5 associated with IBD, Crohn disease (CD) and ulcerative colitis (UC) are shown. Annotation of the most important microsatellite and SNP markers that led to the identification of DLG5 as a susceptibility gene for IBD and Crohn disease is also shown. The markers with strongest association with the IBD phenotype are indicated with bold vertical tick marks. The corresponding association results are presented in detail in Tables 1 and 2 and Figure 3.

TSC0376484 is located near two genes of possible (patho)physiological relevance to chronic intestinal inflammation: KCNMA1, encoding a calcium-activated potassium channel6, and DLG5, a member of the membrane-associated guanylate kinase gene family, which is important in the maintenance of epithelial cell integrity7. To narrow the association signal genetically to a single candidate, we used LD mapping and genotyped selected publicly available SNPs from each gene in the 457 trios with IBD. The association signal was confined to DLG5, and none of the markers in KCNMA1 was significantly associated with IBD (Supplementary Table 2 online).

We sequenced coding exons 2–32 and the exon-intron boundaries of DLG5 in 47 individuals with IBD and identified or verified 33 SNPs (Supplementary Tables 3 and 4 online). We then tested all these SNPs for disease association. Association with the IBD phenotype was strongest (Table 1), with 18 markers in DLG5 showing significant association. Separate analysis of the Crohn disease and ulcerative colitis subgroups showed a strong association in the Crohn disease subgroup, which is in accordance with the original linkage observation5. The weaker signal in the ulcerative colitis subgroup may be due to reduced power in a small sample size. Because the combined group had the strongest association, we suggest that the signal in DLG5 reflects a factor associated with general susceptibility to IBD rather than to Crohn disease only.

Table 1 Summary of TDT results in German trios for single-point association with IBD, Crohn disease and ulcerative colitis

Pairwise LD measures (D′) indicated strong LD across the entire gene, defining a single haplotype block of 85 kb and D′ values >0.8, except at TSC0000361 (located on the neighboring LD segment). We observed a sharp decline in LD at the boundaries of the haplotype block, differentiating DLG5 from the neighboring genes KCNMA1 and RPC155 (Fig. 2). Analysis of the extended DLG5 haplotype identified four common haplotypes (Fig. 3), with haplotype A tagged by eight SNPs (haplotype-tagging SNPs or htSNPs) of equivalent genetic information content. Haplotype A was significantly undertransmitted to individuals with IBD and Crohn disease, whereas haplotype D, uniquely tagged by the coding variant 113A, was significantly overtransmitted to individuals with both IBD (χ2 = 8.08, P = 0.004) and Crohn disease (χ2 = 4.15, P = 0.04; Fig. 3).
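The pairwise LD measure used above can be illustrated with a minimal sketch of Lewontin's D′ for two biallelic markers. The function name and the haplotype and allele frequencies in the usage note are hypothetical and are not taken from the study data; the sketch also assumes that phased haplotype frequencies are already available, which in trios would first have to be estimated.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return |D'| for two biallelic markers.

    p_ab -- frequency of the haplotype carrying allele A at marker 1
            and allele B at marker 2
    p_a  -- frequency of allele A at marker 1
    p_b  -- frequency of allele B at marker 2
    """
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b
    # D is scaled by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic marker carries no LD information
    return abs(d / d_max)


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to illustrate the D' > 0.8 criterion.
    value = d_prime(p_ab=0.28, p_a=0.30, p_b=0.40)
    print(f"D' = {value:.3f}, strong LD: {value > 0.8}")
```

With these illustrative frequencies the marker pair would fall inside the same block under the D′ > 0.8 criterion; a pair at linkage equilibrium (p_ab equal to p_a × p_b) gives a value of zero.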

Figure 2: LD across DLG5 and flanking genomic region.
D′ values for pairwise LD between each marker (red = D′ > 0.8) are represented. Top of figure shows spacing of markers across the genomic region. All markers in DLG5 (large red block) are in strong LD. LD drops off sharply (white blocks) at the boundaries of DLG5, indicating that the neighboring genes are located on different (independent) genomic segments.

Figure 3: Multilocus haplotype results.
(a) The four most common haplotypes for DLG5. The boxed alleles refer to the htSNPs. Haplotype D carries the risk-conferring htSNP rs1248696 (113G→A) and is significantly overtransmitted in trios with IBD and Crohn disease (TDT results shown in b). Haplotype A is tagged by eight htSNPs that carry equivalent genetic information and is significantly undertransmitted in trios with IBD and Crohn disease (TDT results shown in c).

To corroborate our initial association finding, we genotyped the DLG5 htSNPs in an independent sample consisting of trios with IBD who had not yet been analyzed (n = 485; Supplementary Table 1 online). The htSNP DLG5_e26 in haplotype A was undertransmitted to the individuals with IBD (transmitted:untransmitted (T:U) ratio of 165:214), replicating the observed association (P = 0.006 in a one-tailed test), and rs1058198 had a T:U ratio of 196:237 (P = 0.024). 113A was overtransmitted in both IBD (T:U 90:73, P = 0.09) and Crohn disease (T:U 58:43, P = 0.065) but the distortion was not statistically significant. This can be explained by the smaller proportion of trios with Crohn disease in our replication sample and the reduced power in replication situations. We therefore tested the associated markers in a second independent sample (538 Crohn disease cases and 548 controls) using the case-control study design to estimate the attributable risk in a diverse population of European descent. The 113A variant was significantly associated with the IBD phenotype (P = 0.001, odds ratio (OR) = 1.62) as was rs2289310 (4136C→A, resulting in the amino acid substitution P1371Q; P = 0.01, OR = 1.51). DLG5_e26, tagging haplotype A, was significantly associated with IBD (Table 2), providing a second independent replication of association. The combined P values for the repeated, independent associations with IBD (n = 2) were 0.029 for 113A and 0.0007 for DLG5_e26, and those for the repeated, independent associations with Crohn disease (n = 3) were P = 0.001 for 113A and P = 0.0004 for DLG5_e26.
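The T:U ratios reported above can be related to their P values with a minimal sketch of the standard single-marker TDT statistic, (T − U)²/(T + U), referred to a χ2 distribution with one degree of freedom. This is an illustration only and not the GENEHUNTER implementation used in the study; the names `TdtResult` and `tdt` are hypothetical, and the one-tailed P value is taken here as half the two-sided value, which assumes the distortion lies in the direction predicted by the initial sample.

```python
import math
from typing import NamedTuple


class TdtResult(NamedTuple):
    chi2: float
    p_two_sided: float
    p_one_tailed: float


def tdt(transmitted: int, untransmitted: int) -> TdtResult:
    """McNemar-type TDT statistic from counts of heterozygous-parent transmissions."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 d.f.: erfc(sqrt(x / 2)).
    p_two = math.erfc(math.sqrt(chi2 / 2.0))
    # Assumption: the observed direction matches the prior hypothesis.
    return TdtResult(chi2, p_two, p_two / 2.0)


if __name__ == "__main__":
    for label, t, u in [("DLG5_e26", 165, 214), ("rs1058198", 196, 237), ("113A, IBD", 90, 73)]:
        r = tdt(t, u)
        print(f"{label}: chi2 = {r.chi2:.2f}, one-tailed P = {r.p_one_tailed:.3f}")
```

Applied to the T:U counts quoted in the text, this calculation gives one-tailed P values close to the reported 0.006, 0.024 and 0.09.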

Table 2 Association of DLG5 SNPs with IBD in an independent case-control sample

Because the 4136A risk allele is not included on the common haplotypes carrying 113A, but instead on a rare haplotype (frequency <1%), we calculated the global differences in genotype combinations for 113A and 4136A to estimate the risk for homozygosity or compound heterozygosity. This analysis identified a significant difference in genotype frequencies (global χ2 = 13.61, P = 0.0029) in individuals with IBD compared with healthy controls. The OR was 1.74 (95% confidence interval = 1.31–2.32) for individuals carrying at least two risk alleles (113A and/or 4136A), suggesting that the overall clinical impact of rare single coding mutations such as 4136A on the IBD phenotype is limited. Our disease model that links 113A and 4136A to the positional signal detected in DLG5 is further supported by the identification of rare, coding, 'private' variants (resulting in the amino acid substitutions S121G, E514Q, R957H and P979L; frequency <0.5%) through systematic sequence analysis in 47 individuals.
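An odds ratio with a 95% confidence interval of the kind quoted above can be obtained from a 2 × 2 table of carriers and non-carriers among cases and controls. The sketch below uses the Woolf (log-scale) approximation, which is an assumption made for illustration; the text does not state which interval method was used. The function name and the counts in the usage note are hypothetical and are not the study's genotype counts.

```python
import math
from typing import NamedTuple


class OddsRatio(NamedTuple):
    estimate: float
    ci_low: float
    ci_high: float


def odds_ratio_ci(case_carrier: int, case_noncarrier: int,
                  ctrl_carrier: int, ctrl_noncarrier: int,
                  z: float = 1.96) -> OddsRatio:
    """Odds ratio with a Woolf (logit) confidence interval; z = 1.96 gives 95%."""
    estimate = (case_carrier * ctrl_noncarrier) / (case_noncarrier * ctrl_carrier)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log = math.sqrt(1 / case_carrier + 1 / case_noncarrier
                       + 1 / ctrl_carrier + 1 / ctrl_noncarrier)
    log_or = math.log(estimate)
    return OddsRatio(estimate,
                     math.exp(log_or - z * se_log),
                     math.exp(log_or + z * se_log))


if __name__ == "__main__":
    # Hypothetical counts of individuals carrying at least two risk alleles.
    result = odds_ratio_ci(120, 418, 78, 470)
    print(f"OR = {result.estimate:.2f} "
          f"(95% CI {result.ci_low:.2f}-{result.ci_high:.2f})")
```

An interval whose lower bound exceeds 1, as in the result reported in the text, indicates that carriers are over-represented among cases at the chosen confidence level.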

We were interested in the hypothetical impact of the associated variants, R30Q and P1371Q, on the function of the DLG5 protein. DLG5 has been implicated in regulating cell growth and maintaining cell shape and polarity8. A recent study9 suggested an epithelial function for DLG5 as a binding partner of vinexin at sites of cell-cell contact, and our preliminary results on expression of DLG5 mRNA in a variety of tissues confirm the presence of the transcript in the colon, the intestine and isolated intestinal epithelial cells (Supplementary Fig. 1 online). It is therefore conceivable that DLG5 has a role in maintaining epithelial structure and that genetic variants in DLG5 interfere with epithelial barrier function in the colon.

DLG5 contains one DUF622 domain, four PDZ domains and one SH3 domain followed by one guanylate kinase domain (Fig. 1)10,11. All these domains are assumed to be involved in protein-protein interactions, supporting the idea that DLG5 is a multifunctional adapter and scaffold protein. We carried out in silico analysis of the potential structural and functional implications of the variants R30Q and P1371Q (Supplementary Fig. 1 online). The results of this analysis suggested that both variants probably impair the scaffolding functions of DLG5 (Supplementary Methods and Supplementary Fig. 1 online).

Finally, we examined potential locus-locus interactions between variants of DLG5 and variants of CARD15, the first susceptibility gene identified for Crohn disease1,2,3. Genetic susceptibility to Crohn disease is mainly conferred by three polymorphisms that induce structural changes in the leucine-rich repeats of CARD15 (R702W, G908R and 3020insC). The allele frequencies of these SNPs range from 4% to 14% in the cohorts with Crohn disease examined to date, and 30–40% of individuals with Crohn disease are heterozygous for at least one of the variants, compared with 10% of control subjects1,2,3. We examined interactions between DLG5 and CARD15 by stratifying trios in two groups according to the genotype of the affected child. In trios with IBD and Crohn disease, haplotype A (represented by DLG5_e26) was undertransmitted in the groups carrying both the risk-associated and non-risk-associated variants of CARD15 (Table 3), which suggests that haplotype A reflects genetic variation that acts independently of CARD15 variants. In trios with Crohn disease, we observed significantly greater transmission of 113A in individuals carrying the risk-associated versus non-risk-associated variants of CARD15 (Table 3). This suggests that the 113A variant is of particular relevance in individuals with Crohn disease and, further, that an interaction may exist between the risk-associated haplotype of DLG5 and the risk-associated variants of CARD15.

Table 3 Association of DLG5 haplotypes and interaction with risk-associated CARD15 alleles

We found replicated association between genetic variations in DLG5 and the risk of developing IBD. The risk-associated DLG5 haplotype D is uniquely distinguished by the 113A variant and is suggested to be causative, as are rare, private SNPs. The conferred risk is moderate, which is in agreement with a polygenic disease model. Genetic interaction studies suggest interactions between CARD15 variants and 113A in DLG5, but these studies are not yet conclusive and will require large, consolidated efforts by several groups to achieve appropriate statistical power. Future studies in diverse and very large samples are needed to evaluate the population relevance of variants in DLG5 in this chromosomal region. Functional studies need to define the molecular properties of DLG5 variants and their influence on the clinical presentation of IBD.

Methods

Study samples.

Individuals with IBD were recruited by the clinical group through the Charité University Hospital (Berlin, Germany) and at the Department of Internal Medicine I, University Hospital Kiel, Germany. Diagnosis of IBD and subsequent classification into Crohn disease or ulcerative colitis was determined by standard diagnostic criteria12,13 and has been described previously3,5,13. All individuals were of European descent. We carried out LD mapping in trios consisting of father, mother and child affected with IBD, in which one parent or neither parent was affected with Crohn disease or ulcerative colitis. These trios were identified for LD mapping and have been described3. For our confirmatory cohort, we extracted trios randomly from the multicase families used in our previous linkage studies5 and supplemented this group with 92 additional trios recruited for this purpose. For case-control association, we compared 538 additional, independent individuals with IBD (singletons) with age- and sex-matched volunteers from the Kiel University blood donation program. All study participants gave informed, written consent. The recruitment protocols and study procedures were approved by the ethics committees of the Charité University Hospital, Berlin, Germany, and the Schleswig-Holstein University Hospital, Campus Kiel, Germany, respectively.

Microsatellite typing.

In the first stage of microsatellite LD mapping, we genotyped 11 microsatellite markers (D10S547, D10S548, D10S211, D10S611, D10S213, D10S1780, D10S220, D10S1790, D10S609, D10S201 and D10S2470) in 393 families with IBD (422 affected sibling pairs).

Information on primer sequence, allele size range, suggested amplification conditions and genetic position can be obtained from the Genethon and Marshfield databases (see URLs). Genotypes were generated at the University of California Los Angeles using PCR and fluorescence-labeled primers on an ABI 377 sequencer.

SNP discovery in DLG5.

To identify all crucial SNPs in the coding sequence of DLG5, as well as exon-intron boundaries and the promoter region, we sequenced 47 individuals with IBD using an ABI 3700 automated sequencer as previously described3. The primers and probes for 33 SNPs discovered or verified by resequencing, and the sequences of the new SNPs, are given in Supplementary Tables 3 and 4 online.

SNP genotyping.

We selected SNP markers for the initial fine mapping experiment based on information available from the public databases. For analysis of DLG5, we used SNPs generated or verified in-house. We generated SNP genotypes using the TaqMan allelic discrimination method as previously described3. TaqMan assays were from ABI.

Statistical analysis.

We tested each marker for Hardy-Weinberg equilibrium in the control populations using a χ2 test and then carried out genetic analyses at several levels. To confirm the association with Crohn disease, we first subjected each marker to single-locus tests for linkage and transmission disequilibrium testing (TDT) analysis followed by haplotype analysis as implemented in GENEHUNTER (version 2.1; ref. 14). To assess significance of the TDT results for each marker, we did permutation tests using the same genotype data, as described previously15,16. In 100,000 permutations of the entire data set of 28 analyzed markers for DLG5, a single χ2 value greater than 9.91 was observed 4,635 times (empirical P = 0.046), and 874 simulations had two markers with a χ2 value greater than 14.5 (empirical P = 0.0087). We calculated pairwise LD between each marker pair and between haplotype blocks as described15,16. For case-control analysis, we calculated χ2 values using Fisher's exact test; we calculated genotype-based ORs using Fisher's contingency tables and tested association similarly. We calculated combined P values for determining the overall significance of the observed independent association findings as outlined17.
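The empirical P values above are the fraction of permuted data sets in which the stated χ2 threshold is exceeded. The following is a minimal sketch of one common permutation scheme for the TDT, in which the transmitted and untransmitted alleles of each heterozygous parent are swapped jointly across all markers so that LD between markers is preserved. Whether this matches the procedure of refs. 15 and 16 is an assumption; the data encoding (+1 transmitted, −1 untransmitted, 0 uninformative), the function names and the toy data are all hypothetical.

```python
import random
from typing import List, Optional


def tdt_chi2_per_marker(parents: List[List[int]]) -> List[float]:
    """TDT chi-square per marker; each parent is a list of +1 / -1 / 0 codes."""
    stats = []
    for m in range(len(parents[0])):
        diff = sum(p[m] for p in parents)              # T - U
        informative = sum(abs(p[m]) for p in parents)  # T + U
        stats.append(diff * diff / informative if informative else 0.0)
    return stats


def empirical_p(parents: List[List[int]], threshold: float,
                n_permutations: int, seed: Optional[int] = 0) -> float:
    """Fraction of permutations in which any marker exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_permutations):
        permuted = []
        for p in parents:
            sign = rng.choice((1, -1))  # one flip per parent, applied to all markers
            permuted.append([sign * x for x in p])
        if max(tdt_chi2_per_marker(permuted)) > threshold:
            hits += 1
    return hits / n_permutations


if __name__ == "__main__":
    toy = [[1, 1], [1, -1], [-1, 0], [1, 1]]  # four parents, two markers
    print(tdt_chi2_per_marker(toy))
    print(empirical_p(toy, threshold=3.0, n_permutations=1000))
    # Arithmetic behind the values reported in the text.
    print(round(4635 / 100_000, 3), round(874 / 100_000, 4))
```

The final line reproduces only the arithmetic of the reported values, 4,635 and 874 exceedances out of 100,000 permutations.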

Exon 1 identification.

Because a BLAST analysis of the sequence from exon 1 as described10 showed this sequence to be derived from human mitochondrial DNA, we concluded that this sequence probably arose as an artefact of RACE amplification. We used sequence from exon 2 instead to identify expressed-sequence tags from porcine and bovine genomes containing unique 5′ sequences (EMBL IDs BI402246, BM484383 and BI847653). These have high similarity and could be identified within the human contig containing DLG5. This new exon of at least 300 nucleotides in the 5′ untranslated region is located 57 kb upstream of exon 2 of DLG5.

URLs.

The Marshfield database is available at http://research.marshfieldclinic.org/genetics. The Genethon database is available at http://www.genethon.fr. The National Center for Biotechnology Information's SNP database is available at http://www.ncbi.nlm.nih.gov/SNP. The SNP Consortium website is available at http://snp.cshl.org. The National Genome Research Network is available at http://www.ngfn.de.

Note: Supplementary information is available on the Nature Genetics website.