Introduction

Crohn's disease (CD) and ulcerative colitis (UC), the two major forms of inflammatory bowel disease (IBD), display a variety of clinical manifestations and have a multifactorial genetic component. Linkage studies have identified at least eight IBD susceptibility loci on different chromosomes (OMIM #266600), some possibly involved in both CD and UC, and some in one disease only.1 The aetiology of CD and UC is still unclear. Clinical and experimental evidence, however, points to the involvement of a deregulated natural and immune response to intestinal antigens in their pathogenesis.2 The IBD1 locus on chromosome 16 is a major susceptibility locus for CD,3 and the only one in which a responsible gene, initially named NOD2 and later CARD15, has been identified.4,5 It encodes a cytoplasmic protein, mainly expressed in myelomonocytic and dendritic cells, which is known to regulate inflammation and apoptosis through activation of the NF-κB pathway.6 The CARD15 protein is also expressed in the intestinal epithelium, where it exerts an antibacterial role.7 In Caucasian populations, three recurrent CARD15 variants, namely two missense mutations, R702W and G908R, and the C-terminal 3020insC frameshift mutation leading to L1007fs, have shown a significant association with CD, but not with UC. A previous genetic analysis of Italian patients was confined to the study of L1007fs in CD.8 We have now investigated the association of all variants with both CD and UC.

Patients and methods

In all, 276 unrelated patients with an established diagnosis of CD (184) or of UC (92) were retrospectively enrolled in three Torino gastroenterology centres; 17% of the CD patients and 12.5% of the UC patients were from multiplex families. Clinical features were recorded on a standard form according to the Vienna criteria for CD:9 age at onset (A1 <40, A2 >40 years), disease behaviour (B1 nonstricturing, nonpenetrating, B2 stricturing, B3 penetrating), location (L1 terminal ileum, L2 colon, L3 ileocolon, L4 upper GI), together with severity, extraintestinal manifestations (rheumatological, dermatological, ocular, liver and biliary, amyloidosis), type of onset and known risk factors such as smoking (yes for current smokers at the time of the study, ex for smokers in the past) and familiarity. The clinical features considered for UC were age at onset, behaviour and extent of the disease, presence of extraintestinal manifestations, familiarity and smoking habits. Diagnoses and all clinical data were validated by the referring clinicians, who were not aware of the CARD15 genotype results. DNA samples from 177 medical students were used as controls. Informed consent was obtained from each of the participants.

Molecular analysis

The entire sequence of exons 8 and 11 and the relevant segment of the large exon 4 of the CARD15 gene were amplified by PCR using the following primers designed on the published genomic sequence (GenBank CARD15 gene accession number: AJ303140):

A total of 100 ng of genomic DNA was amplified in a final volume of 25 μl, containing the buffer supplied by the manufacturer, 2 mM MgCl2, 10 pmol of each primer, 50 nmol of each dNTP and 1 U AmpliTaq polymerase (Applera). An initial denaturation of 5 min was followed by 30 cycles of 30 s at 94°C, 30 s annealing (at 61°C for exon 4, 60°C for exon 8 and 55°C for exon 11) and 30 s elongation at 72°C, followed by a final extension of 7 min at 72°C. The size and amount of the products were checked by agarose gel electrophoresis. The three variants were detected by restriction digestion: exon 4 with MspI for R702W (NCBI SNP CLUSTER ID: rs2066844), exon 8 with HhaI for G908R (rs2066845) and exon 11 with NlaIV for L1007fs (rs2066847). R702W homozygotes and heterozygotes were discriminated by separation of the small restriction products from exon 4 (wt allele: 329+66+54 bp, mutated allele: 329+120 bp) by 6% polyacrylamide gel electrophoresis in nondenaturing conditions.
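The fragment sizes quoted for the exon 4 MspI digest imply a simple reading rule: the 66 and 54 bp bands mark an allele that is cut (wt), while the 120 bp band marks an allele that has lost the site (mutated). The sketch below encodes only that rule. It is an illustration, not the authors' procedure: the function name is hypothetical, and it assumes idealized band sizes that match the quoted values exactly.

```python
# Band sizes (bp) for the exon 4 MspI digest, as quoted in the text.
CONSTANT_BAND = 329          # present for both alleles
WT_BANDS = {66, 54}          # produced only when the MspI site is cut
MUTATED_BAND = 120           # produced only when the MspI site is lost


def call_r702w_genotype(observed_bands):
    """Return 'wt/wt', 'M/wt' or 'M/M' from the band sizes seen on the gel.

    Hypothetical helper: assumes band sizes are reported exactly as in the text.
    """
    bands = set(observed_bands)
    if CONSTANT_BAND not in bands:
        raise ValueError("329 bp band missing: PCR or digestion probably failed")
    has_wt = WT_BANDS <= bands          # both small fragments are visible
    has_mutated = MUTATED_BAND in bands
    if has_wt and has_mutated:
        return "M/wt"
    if has_mutated:
        return "M/M"
    if has_wt:
        return "wt/wt"
    raise ValueError("no diagnostic bands observed")


print(call_r702w_genotype([329, 120, 66, 54]))
```

A heterozygote shows all four bands, which is why the small products have to be resolved on polyacrylamide rather than agarose.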

Statistical analysis

The association of the three CARD15 variants with disease was estimated as odds ratios (OR) with 95% confidence intervals (CI) through univariate logistic analysis. The genotype frequencies observed in patients and controls were compared with Hardy–Weinberg expectations by using the χ2 statistic. Phenotype–genotype correlations were first analysed by univariate logistic regression and then by multivariate analysis, adjusting the effect of CARD15 positivity according to the status of smoking, familiarity and sex. For all calculations, we used the S-Plus2000 software (Insightful, Inc, USA).
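For a single yes/no predictor such as variant positivity, the OR from a univariate logistic model should coincide with the cross-product ratio of the 2×2 table, and the Wald interval with the usual log-OR interval. A minimal sketch of that calculation follows. It is not the S-Plus analysis used here; the function name is hypothetical and the counts in the example are invented purely for illustration, not taken from the tables.

```python
import math


def odds_ratio_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    Uses the standard error of log(OR), sqrt(1/a + 1/b + 1/c + 1/d).
    All four cells must be non-zero.
    """
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    se_log_or = math.sqrt(
        1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT from this study): 40 of 100 cases and
# 20 of 100 controls carry at least one variant.
or_, lo, hi = odds_ratio_ci(40, 60, 20, 80)
print(f"OR {or_:.2f}, CI {lo:.2f}-{hi:.2f}")
```

An association is called significant at the 5% level when the interval excludes 1, which is how the OR and CI values in the Results can be read.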

Results

Genotypes of the three recurrent variants in CD and UC patients and controls are reported in Table 1. We first evaluated the association with positivity for each variant by grouping homozygous and heterozygous genotypes (Table 2): G908R and L1007fs were significantly more frequent in CD than in controls and UC. The frequency of R702W was also increased, but not significantly, in both CD and UC; even the conditional comparison of this variant in patient and control subgroups negative for the two other variants only marginally strengthened the association with CD (OR 1.66, CI 0.9–3.0), and not that with UC (OR 1.69, CI 0.8–3.1). In accordance with other studies, we then grouped all three variants into a single ‘mutated’ M allele, whose gene frequency was 19.6% in CD, 14.1% in UC and 9.6% in controls (Table 3). Again, positivity for this allele was significantly associated with CD only (Table 2). The association with CD was stronger for the M/M genotype (OR 13.9), and weaker but still significant for the M/wt genotype (OR 1.7, Table 4). Moreover, the M/M, M/wt and wt/wt genotype frequencies closely approached Hardy–Weinberg expectations among controls, while there was a significant excess of M/M in CD. To assess the mutation–phenotype correlations, we first considered the frequency of positivity for CARD15 variants in CD and UC patients with different clinical features. In UC patients, univariate analysis disclosed no significant association between CARD15 variants and any of the clinical features. In CD, the estimated chance of finding at least one genetic variant was significantly increased among patients with a stricturing behaviour, and showed a marginally significant increase among those with ileal location (Table 5, third column).
The same contingency tables were then used to consider each clinical feature (age at onset, behaviour, location, extraintestinal manifestations, type of onset) as an outcome, with sex, familiarity, positivity for a CARD15 variant and smoking as explanatory variables (Table 5, sixth column). After adjusting CARD15 positivity for smoking, familiarity and sex, multivariate analysis resulted in higher risk estimates for stricturing behaviour and ileal location, with OR 2.76 and 3.0, respectively; the risk of penetrating behaviour also became significant with OR 2.59.
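The comparison with Hardy–Weinberg expectations mentioned above can be sketched from the quoted M allele frequencies alone. The code below is an illustration under stated assumptions, not the original computation: the function names are hypothetical, the allele frequencies and sample sizes are those given in the text, and the observed genotype counts (Table 1) are deliberately not reproduced, so only expected counts are printed.

```python
def hw_expected_counts(q, n):
    """Expected genotype counts under Hardy-Weinberg for allele frequency q."""
    p = 1.0 - q
    return {"M/M": n * q * q, "M/wt": 2 * n * p * q, "wt/wt": n * p * p}


def hw_chi2(observed):
    """Chi-square statistic (1 degree of freedom) against Hardy-Weinberg.

    `observed` maps 'M/M', 'M/wt' and 'wt/wt' to counts; the allele
    frequency is estimated from those same counts.
    """
    n = sum(observed.values())
    q = (2 * observed["M/M"] + observed["M/wt"]) / (2 * n)
    expected = hw_expected_counts(q, n)
    return sum(
        (observed[g] - expected[g]) ** 2 / expected[g] for g in expected
    )


# Allele frequencies and sample sizes as quoted in the text.
for label, q, n in [("CD", 0.196, 184), ("controls", 0.096, 177)]:
    expected = hw_expected_counts(q, n)
    print(label, {g: round(c, 1) for g, c in expected.items()})
```

With these figures, roughly 7 M/M genotypes would be expected among the 184 CD patients and fewer than 2 among the 177 controls; an observed M/M count well above the expected value is what an excess of M/M in CD means.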

Table 1 Observed CARD15 genotypes
Table 2 Association of the recurrent variants with CD and UC
Table 3 Gene frequencies of the recurrent variants
Table 4 CARD15 genotype associations and Hardy–Weinberg disequilibrium in CD
Table 5 Genotype–phenotype correlations

The statistical significance of most of these correlations vanished when M/M and M/wt genotypes were considered, and only held for stricturing behaviour (not shown).

Discussion

This study of the association of the three recurrent CARD15 variants with IBD in Italians found significant associations for L1007fs and G908R with CD only. The frequency of R702W was slightly but not significantly increased in both CD and UC patients vs controls. Since a previous study in Italy only considered L1007fs,8 these findings provide the first confirmation in Italian patients of the CARD15/CD association reported in other Caucasian populations (Table 6). The overall fraction of CD patients with at least one variant was 32.5%, lower than in Quebec10 and western Europe,11 but similar to that in Germany and the UK,12 and higher than in Ireland13 and Crete.14 In most studies, the L1007fs variant displayed the strongest association, while those of G908R and R702W were weaker and only significant in some studies. This may result from the limited number of patients (ranging from 55 to 688) or controls (62−409), as well as from true ethnic differences. As noted by Hirschhorn et al15 and Colhoun et al,16 replication of weaker associations, even if true, is difficult, and requires larger samples. In this respect, the case of CARD15 variants in CD is a robust exception if compared with all previous IBD association studies, which considered several polymorphisms, for example, in the TNFα promoter, IL-1 and other cytokine genes and receptors, and led to highly variable findings.1 An extreme example of an ethnic difference is the repeatedly reported absence of the three variants in both CD patients and the general population of Japan17,18,19 and Korea.20 Since, in these studies, CARD15 has been extensively resequenced in CD patients, any role of IBD1 is unlikely to depend on other coding sequence variants. Other IBD susceptibility loci are probably responsible for CD susceptibility in those Eastern populations.
In Ashkenazi Jews, a population with high incidence of CD, the CARD15 gene is involved, but with a peculiar allelic spectrum encompassing G908R and L1007fs, but not R702W.21 Interestingly, an additional, population-restricted risk haplotype was present in a substantial fraction of these patients, but no pathogenic mutation(s) could be identified.21 In the survey conducted by Lesage et al11 on 453 European CD patients, the fraction of positive patients increased from 41.5 to 49.4%, thanks to an extensive search for rarer variants in the entire coding sequence in addition to the three recurrent ones. Similarly, to fully estimate the contribution of CARD15 to CD susceptibility, it would also be necessary to characterize rare or population-restricted variants in Italians. In other populations, stronger associations have been reported for homozygotes and compound heterozygotes (OR values of 20–40) than for simple heterozygotes (OR 2–4).1 Our study is in agreement with such a gene-dosage effect, although at lower levels (OR 13.9 and 1.7, respectively). An excess of patients carrying two CARD15 variant alleles also resulted from comparison with Hardy–Weinberg expectations, a kind of analysis that may overcome limitations originating from the rarity of some genotypes in the control group. The excess of homozygotes and compound heterozygotes suggests a multiplicative rather than additive action of the two variants.22 Hardy–Weinberg disequilibrium is expected to involve, together with the true disease-susceptibility gene, other neutral nearby loci, if their alleles happen to be associated by linkage disequilibrium. Thus, the different association strength of the three recurrent CARD15 variants may depend on a different linkage disequilibrium with an unknown nearby mutation, which would be stronger for L1007fs and G908R and weaker for R702W.
This hypothesis would easily accommodate the finding of IBD1 risk haplotypes with no identified mutation, in both Ashkenazi Jews21 and the German population.23 However, this hypothesis is now made unlikely by growing evidence of a defective function of the mutated protein in response to pathogen-associated molecular patterns. Ogura et al24 first demonstrated that CARD15 transduces NF-κB activation signals in transfected epithelial cells in response to lipopolysaccharide, and specifically, as shown later, to its peptidoglycan contaminant.6,25,26 Muramyl-dipeptide has been identified as the bacterial component whose response is specifically mediated by CARD15.27 These studies demonstrated a defective transducing activity of the three CD-associated variants. CARD15 genotypes with a more severe defect would confer a greater susceptibility, and show stronger disease association, than those with only partially defective function, whose associated risk is indeed weaker.

Table 6 Association of CD with recurrent CARD15/NOD2 variants in different populations

The finding of a lower but significant risk of individuals carrying a single variant (heterozygous genotype) could be ascribed to a halved CARD15 dose in their cells.

Alternatively, as postulated by Inohara et al,27 even patients typed as heterozygotes may actually lack the CARD15 function because of a second undetected variant on the other allele.

The analysis of mutation–phenotype correlations revealed an increased chance of mutation positivity in patients with stricturing behaviour, and a weaker association with ileal location, in agreement with most previous studies.1 However, since ileal location and stricturing behaviour are intertwined and associated with age at onset, familiarity and surgery,28,29,30 these associations are probably influenced by several confounding effects. In a prospective study by a Belgian group,31 early development of stricturing or penetrating behaviour was influenced by disease location, clinical activity of the disease, and smoking habit, but not by the CARD15 genotype. In our series, the proportion of familial and sporadic cases and the proportion of patients with smoking habits were largely similar in the groups of CD patients with or without a CARD15 variant. However, when multivariate analysis was performed with ileal location and stricturing and penetrating behaviour as outcomes, and CARD15 positivity as an explanatory variable, we found a significant association after adjusting for smoking, familiarity and sex. In agreement with other studies, no significant association involved other features, such as type of onset or extraintestinal manifestations.

The three recurrent CARD15 variants are more similar in terms of high prevalence and low conferred risks to gene polymorphisms associated with common multifactorial diseases, than to mutations responsible for Mendelian diseases. As usual in other common diseases, the chances of developing CD are determined by complex interactions among various involved genes and environmental factors. Identification of such genetic and environmental factors could help to better understand and control the development of inflammatory lesions. Our findings in the Italian population confirm the CARD15 genotype as an explanatory variable to predict the pattern of disease presentation and progression.