Background and aims: Genome wide scans in inflammatory bowel disease (IBD) have indicated various susceptibility regions with replication of 16cen (IBD1), 12q (IBD2), 6p (IBD3), 14q11 (IBD4), and 3p21. As no linkage was previously found to the IBD susceptibility regions on chromosomes 3, 7, 12, and 16 in Flemish IBD families, a genome wide scan was performed to detect other susceptibility regions in this population.
Methods: A cohort of 149 IBD affected relative pairs, all recruited from the Northern Flemish part of Belgium, were genotyped using microsatellite markers at 12 cM intervals, and analysed by Genehunter non-parametric linkage software. All families were further genotyped for the three main Crohn’s disease associated variants in the NOD2/CARD15 gene.
Results: Nominal evidence for linkage was observed on chromosomes 1 (D1S197: multipoint non-parametric linkage (NPL) score 2.57, p = 0.004; and at D1S305-D1S252: NPL 2.97, p = 0.001), 4q (D4S406: NPL 1.95, p = 0.03), 6q16 (D6S314: NPL 2.44, p = 0.007), 10p12 (D10S197: NPL 2.05, p = 0.02), 11q22 (D11S35-D11S927: NPL 1.95, p = 0.02), 14q11-12 (D14S80: NPL 2.41, p = 0.008), 20p12 (D20S192: NPL 2.7, p = 0.003), and Xq (DXS990: NPL 1.70, p = 0.04). A total of 51.4% of patients carried at least one NOD2/CARD15 variant. Furthermore, epistasis was observed between susceptibility regions 6q/10p and 20p/10p.
Conclusion: Genome scanning in a Flemish IBD population found nominal evidence for linkage on 1p, 4q, 10p12, and 14q11, overlapping with other genome scan results, with linkage on 14q11-12 supporting the IBD4 locus. The results further show that epistasis is contributing to the complex model of IBD and indicate that population heterogeneity is not to be underestimated. Finally, NOD2/CARD15 is clearly implicated in the Flemish IBD population.
- IBD, inflammatory bowel disease
- CD, Crohn’s disease
- UC, ulcerative colitis
- ASCA, anti-Saccharomyces cerevisiae antibodies
- PCR, polymerase chain reaction
- SNP, single nucleotide polymorphism
- NPL, non-parametric linkage
- inflammatory bowel disease
- genome scan
The inflammatory bowel diseases (IBD) Crohn’s disease (CD) and ulcerative colitis (UC) are chronic disorders of the gastrointestinal tract with a prevalence estimated at 2/1000.1,2 Although CD and UC are grouped under the unifying term IBD, clinical as well as pathogenic differences exist. The two diseases are also characterised by different serological antibody expression: CD is associated with ASCA (anti-Saccharomyces cerevisiae antibodies)3,4 whereas UC is associated with the occurrence of perinuclear antineutrophil cytoplasmic antibodies.5,6
Family studies,7–9 ethnic differences,10 and twin studies11–13 have underlined the contribution of an important genetic susceptibility in the pathogenesis of IBD. From these studies, a recurrence risk for siblings (lambda s (λs)) of 15 has been calculated for IBD overall (25 for CD and 10 for UC).7–9
The model of inheritance, which appears to best explain the data, is that CD and UC are a group of polygenic diseases sharing some, but not all, susceptibility genes: a model of genetic heterogeneity. This model can explain the clinical variability in disease presentation—a specific disease phenotype may result from the interaction between environmental factors with specific polymorphisms in a number of susceptibility loci. Apart from clinical differences, serological antibodies, genetic markers, and knockout mice models also support this heterogeneity concept.
Whereas the first two genome wide scans in IBD14,15 showed linkage on chromosome 16 and on chromosomes 3, 7, and 12, respectively, subsequent genome wide scans have shown at least nominal evidence for linkage on various chromosomes.16–20 A number of the implicated regions have been replicated: 16cen (the IBD1 locus),14,21–28 12q (the IBD2 locus),15,29–31 6p (the IBD3 locus),15,20,32 14q11–12 (the IBD4 locus),18,19 and recently 3p21.15,33
Successful identification of IBD1 as NOD2/CARD15,34,35 a gene involved in recognition of microbial lipopolysaccharides and nuclear factor κB activation, showed that this gene is only the first of many more genes underlying IBD susceptibility, and therefore greatly encouraged investigators in the field to continue the search for other genes.
Several explanations can account for the variation in results seen among the genome searches: genetic heterogeneity among populations, different phenotypes studied, insufficient power to detect linkage, or ethnic differences. Some genome wide searches have included a large group of Ashkenazi Jewish patients16,18 who are known to be genetically different from Caucasian patients and who are highly predisposed to the development of IBD. Others have restricted their genome wide analyses to only CD families.18,19
We, along with others, could not find evidence for linkage on IBD susceptibility regions located on chromosomes 3, 7, 12, and 16 in a smaller cohort of Belgian IBD affected sibling pairs.36,37 We were able to exclude regions on chromosomes 3, 7, and 12 in our Flemish sample with a LOD score of less than −2 for a locus with λs = 2. Additional IBD affected relative pairs have been collected for this report and a genome scan has been performed to identify IBD susceptibility regions in this Belgian population.
SUBJECTS AND METHODS
Eighty nine IBD affected families, living in the Flemish part of Belgium, were recruited. Place of birth was obtained from all patients, parents, and all four grandparents. In this context, it needs to be stressed that our study population has a very homogeneous ancestry, as 97% of all families were born and living in the northern Flemish part of Belgium for at least three generations, and their four grandparents had Flemish surnames. Only three families included members with a different origin. There were no Jewish patients in our dataset (table 1).
All families had two or more members affected. Seventy six families were pure CD affected. In 13 families, some relatives had CD (n = 18) while others had UC (n = 13) (mixed families). There were no pure UC families. All patients were seen at the gastroenterology unit of the University Hospital Gasthuisberg Leuven, Belgium. Confirmation of a diagnosis of CD or UC was obtained from review of the clinical, radiological, endoscopic, and histological data, according to well established criteria.38 In total, 217 patients were IBD affected: 204 (94%) were diagnosed with CD and 13 (6%) with UC. There was one affected mother of a CD affected sibling pair in whom differential diagnosis between CD and UC was impossible and who was diagnosed as indeterminate colitis. This family was analysed as mixed. In total, 149 relative pairs were genotyped. Among these, 125 were affected sibling pairs (106 CD only, 19 mixed CD-UC) and 24 were second degree relative pairs (four aunt/nephew; three aunt/niece; three uncle/nephew; five uncle/niece; nine second cousin pairs). Of the total study cohort, chromosomes 3, 7, 12, and 16 had previously been studied in 56 families.37 After obtaining informed consent, blood samples for DNA extraction were collected. DNA of both parents was available in 46% and of one parent in 20% of families. If one or both parents were absent, one or more unaffected siblings were genotyped to infer the missing parental genotype. Ethics agreement for the study was given by the Catholic University of Leuven.
DNA was extracted from whole venous blood using a salting out procedure and amplified by polymerase chain reaction (PCR) using 323 microsatellite markers spaced over the genome (Applied Biosystems ABI Prism Linkage Mapping Set 1, PE Belgium SA/NV). Average spacing between the markers was 12.6 cM and was calculated from the Généthon human linkage map.39 Before amplification, conditions were optimised for each primer across a range of annealing temperatures (55–60°C) and MgCl2 concentrations (1.5–2.5 mmol). PCR reactions were done in 15 µl volumes containing 5 µl of DNA (10 ng/µl), 0.04 µl Taq polymerase (5 U/µl), 0.6 µl of each primer (66 ng/µl), 1.5 µl 10× buffer, 0.375 µl dNTPs (concentration 100 µMol), and MgCl2 in sterile H2O. Amplified fragments were detected using 6% acrylamide gels electrophoresed on 373A DNA sequencers (Applied Biosystems ABI). Semi-automated DNA fragment sizing was carried out using the Genescan software 3.1. Genotyping was done using Genotyper 2.1 (ABI).
In a second phase, all families were genotyped for the three identified single nucleotide polymorphisms (SNPs) in the IBD1 gene, NOD2/CARD15, and results obtained in this cohort of patients were compared with a control group consisting of 157 healthy hospital workers (94 females and 63 males).34,35 The missense mutation R702W (SNP8, GenBank accession No G67950) was genotyped by an allele specific PCR procedure with primers: 5′-ATC TGA GAA GGC CCT GCT CC-3′ (wild-type, forward), 5′-ATC TGA GAA GGC CCT GCT CT-3′ (mutated, forward), and 5′-CCC ACA CTT AGC CTT GAT G-3′ (reverse) (annealing temperature 58°C), followed by detection on a 2% ethidium bromide stained agarose gel. The missense mutation G908R (SNP12, GenBank accession No G67951) was genotyped by means of PCR-restriction fragment length polymorphism, using PCR-primers 5′-CCCAGCTCCTCCCTCTTC-3′ and 5′-AAGTCTGTAATGTAAAGCCAC-3′ (annealing temperature 55°C), followed by digestion by Hha1 (Gibco BRL, Merelbeke, Belgium) and agarose gel electrophoresis, resulting in a band of 380 bp (wild-type) or two bands of 138 and 242 bp (variant). For the frameshift mutation 1007fs (SNP13, GenBank accession No G67955), we refer to the method described by Ogura and colleagues.35
The Genehunter 1.2 program analysed the proportion of alleles shared identity-by-descent at a certain locus.40 The observed proportion of alleles shared was compared with the expected, and an excess of sharing of 2 versus 1 or 0 alleles was taken as evidence for linkage. Genehunter provides a non-parametric linkage (NPL) score from which the significance level (p value) is derived. Allele frequencies for each marker were calculated from the unrelated parents in the samples. The statistical probability of linkage (two point and multipoint) was calculated for IBD overall, for CD and mixed families individually, as well as for families expressing ASCA. With the current sample size, we could detect a locus with λs⩾2.5 at an NPL of 1.63 (p = 0.05) with a power of >80%.41 Epistatic interactions between susceptibility regions were tested in two ways. Firstly, we analysed each of the linkage regions, conditioned on the presence (positive NPL scores for that region) or absence (negative NPL scores) of linkage on each of the other loci with the Genehunter program.16 Thus, in an analysis for chromosome A conditioned on the results for chromosome B, the criterion for assigning a family to the “linked to chromosome B subset” was a positive NPL score only at those markers within the observed chromosome B linkage peak. Similarly, epistasis between the NOD2/CARD15 gene and other susceptibility loci was tested by reanalysing each of the linkage regions using only those families with at least one variant, or only those with no variant, in the NOD2/CARD15 gene. Since each of the seven identified loci was stratified in these analyses based on the results of the other six loci, p values were corrected by multiplying them by 42. Both corrected and uncorrected p values are given. To further assess the significance of our results, we also reanalysed the data in a second step using the Genehunter-Twolocus software. This program performs non-parametric linkage analysis using two marker maps. Allele sharing is evaluated simultaneously at two disease loci. The position of the first disease locus on one chromosome is held fixed at one particular site on the first marker map (at the site of the observed linkage with the Genehunter program). The position of the second disease locus is then varied along a second marker map.42,43
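The single-locus scoring and the multiple-testing correction described above can be illustrated with a minimal sketch. It assumes that an NPL score behaves approximately like a standard normal deviate under the null hypothesis, which is an approximation used here for illustration only; Genehunter derives its own significance levels, so the values need not match the reported p values exactly. The function names and the example p value are hypothetical.

```python
import math

def npl_to_p(npl):
    """One-sided upper-tail p value, ASSUMING the NPL score is ~N(0, 1) under no linkage."""
    return 0.5 * math.erfc(npl / math.sqrt(2.0))

def bonferroni(p, n_tests):
    """Multiply a p value by the number of tests, capped at 1."""
    return min(1.0, p * n_tests)

# Seven loci, each stratified on the other six, gives 7 * 6 conditional analyses.
N_LOCI = 7
N_TESTS = N_LOCI * (N_LOCI - 1)

if __name__ == "__main__":
    # Detection threshold quoted in the text: NPL 1.63 corresponds to p of about 0.05.
    print(f"NPL 1.63 -> p ~ {npl_to_p(1.63):.3f}")
    print(f"number of conditional tests: {N_TESTS}")
    # Hypothetical uncorrected p value, for illustration only.
    example_p = 0.001
    print(f"corrected p for {example_p}: {bonferroni(example_p, N_TESTS):.3f}")
```

Under this correction an uncorrected p value has to fall below roughly 0.05/42 to remain below 0.05 after correction.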
RESULTS
The markers showing nominal evidence for linkage on two point analysis are listed in table 2. Strongest linkage on two point analysis was seen at D1S252 (156.7 cM; NPL 2.61; p = 0.004) and D20S192 (18.5 cM; NPL 2.60; p = 0.004). On chromosome 14, three consecutive markers on 14q11 showed nominally significant NPL scores and p values in the total cohort of IBD: D14S50 (12.4 cM; NPL 2.34; p = 0.009), D14S80 (26.5 cM; NPL 2.15; p = 0.016), and D14S49 (36.7 cM; NPL 2.44; p = 0.007) (table 2).
The results of multipoint non-parametric analysis for all chromosomes as well as for the CD and mixed subgroups are shown in figs 1–3. The multipoint curve for chromosome 1 showed two peaks: around D1S197 (78.9 cM; max NPL 2.57; p = 0.004), overlapping with the region found by Cho and colleagues,16 and around D1S305 and D1S252 (156.7 cM; max NPL = 2.97; p = 0.001) (fig 1). On chromosome 4 at D4S406 (112 cM), a multipoint NPL in the CD subgroup of 2.21 (p = 0.013) was seen, overlapping partially with the region found by Hampe and colleagues17 (fig 1). On chromosome 6, maximum multipoint linkage was obtained around D6S314 (137 cM) on 6q16 (NPL = 2.44; p = 0.007) (fig 1). Chromosome 10 showed linkage around D10S197 (50.5 cM; NPL = 2.05; p = 0.02), again partially overlapping with the region found by Hampe and colleagues17 (fig 2). On 11q22, linkage was observed over 15 cM between D11S35 and D11S927 (104 cM; NPL 1.95, p = 0.02). Linkage was observed on 14q11–12 over 30 cM and was maximal around D14S80 (26.5 cM; NPL = 2.41; p = 0.008) (fig 2). This region coincides with the reported IBD4 locus (Ma and colleagues18; Duerr and colleagues19). On chromosome 20, a peak of linkage was observed around D20S192 (18.5 cM; NPL = 2.71; p = 0.003) (fig 3). Earlier findings of linkage on Xq around DXS990 were confirmed (101 cM; NPL = 1.70; p = 0.04).42
On chromosomes 8 (NPL = 2.12; p = 0.018), 9 (NPL = 2.77; p = 0.003), 16 (NPL = 2.25; p = 0.013), and 17 (NPL = 2.40; p = 0.008), significant NPL scores were seen only in mixed families (figs 1, 2). However, as this subgroup consisted of only 13 families and 20 affected relative pairs, no definitive conclusions should be drawn from these results. We noted that the region on chromosome 17 overlapped with the findings of the genome scan by Ma and colleagues18 although the latter genome scan only consisted of CD families. Our linkage curve on 17 for the CD only pairs, on the contrary, did not show significant linkage in this region.
Subsequently, all families were genotyped for the three main associated variants in the NOD2/CARD15 gene. Overall, 51.4% of CD patients carried at least one variant within NOD2/CARD15 compared with 31 cases (19.7%) in the control group (n = 157). Allele frequencies for Arg702Trp (16.7%), Gly908Arg (3.8%), and fs1007insC (12.7%) differed significantly from those observed in controls (6.4%, 1.9%, 2.3%, respectively; all p<0.01). There were 6.4% homozygotes and 7.3% compound heterozygotes observed in patients. Among the controls, two cases were found to be compound heterozygous.
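The carrier frequencies above can be turned into a rough effect size. The sketch below computes an odds ratio for carrying at least one NOD2/CARD15 variant from the two proportions given in the text (51.4% of CD patients, 31 of 157 controls). The paper does not report this odds ratio; it is shown only as an illustration of the arithmetic, it ignores the relatedness of patients within families, and the function names are hypothetical.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)

def odds_ratio(p_cases, p_controls):
    """Ratio of the odds of carriage in cases to the odds in controls."""
    return odds(p_cases) / odds(p_controls)

# Figures taken from the text.
carrier_cd = 0.514          # 51.4% of CD patients carry at least one variant
carrier_controls = 31 / 157  # 31 of 157 controls, about 19.7%

if __name__ == "__main__":
    print(f"control carrier frequency: {carrier_controls:.3f}")
    print(f"illustrative odds ratio: {odds_ratio(carrier_cd, carrier_controls):.2f}")
```

Because affected relatives are not independent observations, a figure obtained this way should be read as descriptive rather than as a formal association estimate.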
ASCA was determined in all patients and relatives. For CD, 126/204 (62%) CD patients and 29/138 (21%) unaffected relatives were ASCA positive. Families expressing ASCA (71/89) were reanalysed for all chromosomes as a separate subgroup to see whether expression of this antibody is linked to certain chromosomal regions (figs 1–3). ASCA+ multipoint curves overall corresponded well with the curves of the CD subgroup and no single region associated with ASCA expression could be identified. This linkage curve correspondence reflects the close association between ASCA and the CD phenotype.
To test epistasis, all families were reanalysed for chromosomes 1, 4, 6, 10, 11, 20, and X, conditioned on the presence or absence of linkage on each of the other regions of linkage. Interaction between loci was observed at chromosomes 6 and 20, both when conditioned for linkage on chromosome 10 (fig 4): an NPL score of 4.30 was reached on 6q when analysing only those families with positive NPL scores on chromosome 10 (uncorrected p<0.0001, corrected p = 0.0004). Also, an NPL score of 3.74 was observed on chromosome 20 when analysing only those families with positive NPL scores on chromosome 10 (uncorrected p<0.0001, corrected p = 0.0004). The results obtained from the Genehunter-Twolocus output pointed in the same direction: nominally significant NPL scores were obtained for 6q when conditioned on chromosome 10 (NPL 2.47; p = 0.006 uncorrected, p = NS corrected at 128 cM distance) and on chromosome 20 when conditioned on chromosome 10 (NPL 2.35; p = 0.009 uncorrected and NS corrected at 15 cM). Given the earlier reported epistasis between 1p and IBD1,16 the region harbouring the NOD2/CARD15 gene, those families carrying at least one NOD2/CARD15 variant were reanalysed for regions on chromosomes 1, 4, 6, 10, 11, 20, and X. No evidence of epistatic interactions was seen.
DISCUSSION
Given the absence of linkage on chromosome 16, and even exclusion on chromosomes 3, 7, and 12 in a smaller Belgian dataset of IBD families,37 a genome wide search in a larger Belgian IBD population was performed to see if other linkages could be identified. Lander and Kruglyak have proposed a classification with thresholds of linkage for genome wide scans.44 Although none of the identified regions in our genome scan meet the Lander and Kruglyak criteria for significant (Lod >3.6, p = 2×10−5) or suggestive linkage (Lod>2.2, p = 7×10−4), several findings are noteworthy and deserve attention. Firstly, four of the susceptibility regions found in this genome scan coincided with regions found by other investigators. Intriguing is the fact that two of these regions—namely, on chromosomes 4 and 10—overlapped with findings from the European collaborative study.17 This study consisted of 353 affected sibling pairs originating from the UK, the Netherlands, and Germany mainly. The migration waves that took place in Europe between 8000 and 5000 BC consisted of a German-Austrian wave, of which the Flemish population (Northern Belgium) was the most Southern branch. The French and Walloons (Southern Belgium) descend from the Mediterranean wave. This reflects in the language differences between the (Northern) Germanic and (Southern) Romanic parts of Europe. Furthermore, within the Germanic populations, two subclusters can be distinguished based on phylogenetic tree analysis: a subcluster which is made up of Dutch, Danish, and English people and a subcluster of Austrian, Swiss, German, and Flemish people.45,46 The overlapping findings of our genome scan with Hampe et al’s fit into the model of common ancestry between these populations. However, the exact origin and distribution of the study population in the study of Hampe and colleagues17 are not entirely clear and it would be interesting to know whether the observed linkages were detected in all subpopulations. 
In this context, it needs to be stressed that our study population had a very homogeneous ancestry, as 97% of all families were born and living in the northern Flemish part of Belgium for at least three generations, and their four grandparents had Flemish surnames.
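The Lander and Kruglyak thresholds quoted earlier in this discussion pair a LOD score with a pointwise p value. A minimal sketch of that correspondence follows, assuming the standard large-sample relation in which 2 ln(10) × LOD is treated as a chi-square statistic with one degree of freedom and the test is one-sided. This assumption is not stated in the text and the function name is hypothetical.

```python
import math

def lod_to_pointwise_p(lod):
    """One-sided pointwise p value for a LOD score.

    ASSUMPTION: 2 * ln(10) * LOD follows a chi-square distribution with 1 df
    under the null, and the tail is halved because linkage is a one-sided test.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # The upper tail of a chi-square(1 df) variable at x equals erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))

if __name__ == "__main__":
    for label, lod in (("significant", 3.6), ("suggestive", 2.2)):
        print(f"{label}: LOD {lod} -> p ~ {lod_to_pointwise_p(lod):.1e}")
```

Under this assumption the two thresholds come out close to the 2×10−5 and 7×10−4 values cited. The smallest single-locus multipoint p value reported in this scan (p = 0.001 on chromosome 1) lies above the suggestive threshold, in line with the statement that no region met either criterion.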
Again, no linkage was observed for IBD1 in our population after having almost doubled the number of affected pairs for the present genome scan. However, the first underlying gene indicating susceptibility to CD was identified as NOD2/CARD15 and all families were genotyped for the three main associated variants within this gene. Overall, 51.4% of CD patients carried at least one variant within NOD2/CARD15; 6.4% were homozygotes and 7.3% compound heterozygotes. Although previously, and also in this genome scan, we did not observe linkage to the IBD1 region, the underlying NOD2/CARD15 gene is indeed implicated in the Flemish CD population. This finding emphasises once more the difficulty in replicating linkages in complex diseases. Replication is often only possible with large collaborative approaches and hence more studies, such as the combined study on IBD1 by the International Genetics IBD Consortium,28 are warranted to analyse other underlying genes in IBD.
Earlier findings of linkage on Xq around DXS990 were also observed in this genome scan and further fine mapping within this region is ongoing.47 Finally, new linkages were observed in this genome scan—namely, on chromosomes 1q, 6q, and 20p12.
In this study, families expressing ASCA were analysed as a separate subgroup, independent of CD and mixed families. ASCA have been detected in a higher proportion of unaffected first degree relatives of CD patients, in contrast with healthy spouses,4,46 and intrafamilial aggregation of ASCA has been demonstrated,48,49 which indicates that ASCA expression is genetically modulated. However, in this genome scan, we could not identify a specific locus associated with ASCA expression. In contrast, the multipoint linkage curves of the ASCA+ subgroup coincided very well with those of the CD subgroup, further indicating the close relationship between this serological marker and the clinical CD phenotype. Similarly, mixed families were analysed as a separate subgroup as we demonstrated that serological antibody expression in mixed families differs from pure CD or UC families, being less stringently related to the phenotype.4 There was linkage observed on chromosomes 8, 9, 16, 17, and 20 with only these mixed families. However, given the small size of this subgroup, no definitive conclusions should be drawn from these results, and whether these mixed families represent a different entity needs to be established.
We further tested epistasis between the observed linkage regions and found interactions with chromosomes 6 and 20, both when conditioned for the identified susceptibility region on chromosome 10. It is believed that epistasis results from interactions among loci that contribute to the same biochemical or developmental pathways. All complex disorders probably include gene-gene interactions, at least to some extent.50,51 However, there is no agreement on the best way to look for locus-locus interactions and also how to correct for multiple comparisons. We therefore acknowledge that our findings remain exploratory until confirmed. Identification of epistasis however can help to lead us towards common gene pathways. Indeed, all observed regions of linkage harbour a number of interesting candidate genes (table 3).
In conclusion, this genome wide scan in a Belgian population of IBD affected families supports the existence of IBD4 on 14q11, has shown additional evidence for the existence of other susceptibility loci, and has revealed new regions, all of which harbour a number of interesting candidate genes. The results further indicate that epistasis and gene-gene interactions are also present in IBD and that population heterogeneity is not to be underestimated.
This work was supported by a grant from the Funds for Scientific Research (FWO), Brussels, Belgium (to SV) and a grant from A Lazarri.
The authors would like to thank the IBD patients, their families, and the National Crohn’s and Colitis Foundation (CCV) for their collaboration. We would also like to thank the following physicians for their contribution of families and clinical data: Dr F Baert, Dr S Bourgeois, Dr L Van De Mierop, Dr F D’Heygere, Dr P Schurmans, Dr G D’Haens, Dr P Potvin, Dr G Deboever, Dr P Maes, Dr L Terriere, Dr Bertrand, Dr B De Schepper, Dr F Van De Mierop, Dr M Cabooter, Dr M Simoens, Dr Van Outryve, Dr L Vandeputte, Dr Mokabbem, Dr J Wyndaele, Dr R Milo, Dr N Büscher, Dr P Van Der Spek, Dr G Van Assche, Dr D Mendez de Leon, Dr R Kums, Dr D Walgraeve, Dr P Pelckmans, Dr M Ferrante, Dr J Callens, and Dr D Sprengers.