Genetic association between EPHX1 and Crohn’s disease: population stratification, genotyping error, or random chance?
A P Cuthbert1, S A Fisher1, C M Lewis1, C G Mathew1, J Sanderson2, A Forbes3

1 Division of Genetics and Development, Guy’s, King’s, and St Thomas’ School of Medicine, King’s College London, Guy’s Hospital, London SE1 9RT, UK
2 Department of Gastroenterology, St Thomas’ Hospital, London SE1 7EH, UK
3 St Mark’s Hospital, Northwick Park, Watford Rd, Harrow, Middlesex HA1 3UJ, UK

Correspondence to: Professor C G Mathew, Department of Medical and Molecular Genetics, GKT School of Medicine, 8th floor Guy’s Tower, London SE1 9RT, UK

We read with interest the article by de Jong and colleagues (Gut 2003;52:547–51) reporting studies of genetic associations between DNA polymorphisms in xenobiotic metabolising genes and Crohn’s disease (CD). The authors employed a case control study design to test seven polymorphisms in five candidate genes for disease association. They found evidence for a significant association between CD and a single nucleotide polymorphism (SNP), Tyr113His (348T>C), in the microsomal epoxide hydrolase 1 gene (EPHX1). Homozygosity for the T (Tyr 113) allele was significantly more frequent in cases than in healthy controls (χ² = 23.7, p<0.0001, odds ratio 2.9). The observed frequency of the T allele in controls was 41%, which is outside the range of frequencies (58–94%) reported in other control populations (reviewed in de Jong et al). Its frequency in CD cases was 67%. In view of the strength of the reported association, we sought to replicate this observation. We genotyped the Tyr113His SNP (ref SNP ID rs1051740) in 307 independent sporadically ascertained cases of CD and 344 ethnically matched healthy control subjects.1 This compares with 151 cases and 149 controls typed by de Jong et al. Our study design provided 80% power to detect a significant difference (p<0.05) in allele frequency of ⩾7.5% between cases and controls, compared with the difference of 26% observed in the published study. Our power calculations were based on an observed minor (C) allele frequency of 30.2% in our control cohort, the common (T) allele frequency being 69.8%.
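The power figure quoted above can be approximated with a standard two-proportion calculation. The sketch below is illustrative only: the letter does not say which method was used, so the normal approximation, the function names, and the choice to model the difference as an increase in C allele frequency in cases are all assumptions. The sample sizes (307 cases, 344 controls, two alleles per subject) and the control C allele frequency of 30.2% come from the text.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_two_proportions(p_control, p_case, n_control_alleles, n_case_alleles,
                          z_alpha=1.959964):
    """Approximate power of a two-sided test (alpha = 0.05 by default) for a
    difference in allele frequency, using the normal approximation.
    This is an assumed method, not necessarily the one used in the letter."""
    se = math.sqrt(p_control * (1 - p_control) / n_control_alleles
                   + p_case * (1 - p_case) / n_case_alleles)
    z = abs(p_case - p_control) / se
    # The negligible probability of rejecting in the wrong direction is ignored.
    return normal_cdf(z - z_alpha)

# Figures from the text: each subject contributes two alleles.
n_case_alleles = 2 * 307
n_control_alleles = 2 * 344
p_control_c = 0.302            # observed C allele frequency in controls
p_case_c = p_control_c + 0.075 # assumed direction of the 7.5% difference

power = power_two_proportions(p_control_c, p_case_c,
                              n_control_alleles, n_case_alleles)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation gives a power close to the 80% stated in the text; a decrease of 7.5% in cases would give a somewhat higher value.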

We used TaqMan chemistry (Applied Biosystems) to genotype DNA from cases and controls with an Applied Biosystems 7700 Sequence Detection System. Preoptimised primers and fluorescent probes were obtained from Applied Biosystems (SNP assay ID C_14938_1). All cases and controls were previously genotyped for three confirmed disease susceptibility alleles for CD (R702W, G908R, L1007fs) in CARD15,2–4 permitting stratification of data by CARD15 mutational status to identify potential gene-gene interactions.5 Allele and genotype frequencies were compared between cases and controls using a χ² test for difference in proportions. Likewise, a χ² test was used to assess Hardy-Weinberg equilibrium (HWE) across genotypes.
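The two χ² tests described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the study: the function names are hypothetical, the genotype counts are made up for demonstration and are not the data in table 1, and both tests are assumed to use one degree of freedom (a 2x2 table of allele counts, and a biallelic HWE test with the allele frequency estimated from the sample).

```python
import math

def chi2_sf_1df(x):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def allele_counts(n_tt, n_tc, n_cc):
    """Return (T allele count, C allele count) from genotype counts."""
    return 2 * n_tt + n_tc, 2 * n_cc + n_tc

def allele_chi2(case_genotypes, control_genotypes):
    """Pearson chi-square on the 2x2 table of allele counts (cases v controls)."""
    a, b = allele_counts(*case_genotypes)
    c, d = allele_counts(*control_genotypes)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_sf_1df(chi2)

def hwe_chi2(n_tt, n_tc, n_cc):
    """Chi-square goodness of fit of genotype counts to Hardy-Weinberg proportions."""
    n = n_tt + n_tc + n_cc
    p = (2 * n_tt + n_tc) / (2 * n)   # estimated T allele frequency
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_tt, n_tc, n_cc)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)

# Hypothetical genotype counts (TT, TC, CC), for illustration only.
cases = (100, 80, 20)
controls = (110, 90, 20)

print("allele test (chi2, p):", allele_chi2(cases, controls))
print("HWE in controls (chi2, p):", hwe_chi2(*controls))
```

A small p value from `hwe_chi2` in controls would flag the kind of deviation discussed later in this letter; the sketch does not guard against zero expected counts, which would need handling for rare alleles.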

We found no significant differences in allele or genotype frequencies between cases and controls (table 1). Stratification of the data by CARD15 mutation status showed no significant differences in Tyr113His allele frequencies in CD cases with none, one, or two CARD15 mutations. Genotypes in our cases and controls were in HWE (p>0.5).

Table 1

 Allele and genotype frequencies between cases and controls

Case control based studies of genetic association assume that differences in allele frequencies relate directly to the phenotype under investigation, and that no unobserved confounding factors exist that could account for the apparent association. While having greater power than family based studies to detect associations through linkage disequilibrium mapping, case control analysis is susceptible to type I errors (false positives).6 One of the most commonly cited explanations for non-replication of genetic associations is population stratification, arising through population admixture and variability in disease frequencies between and within the component subpopulations. However, relatively few instances of this have been clearly established. Stratification may be identified and potentially controlled for by incorporating anonymous genetic markers into the study design.7,8 However, the efficacy of this approach depends on the level of stratification present, and the difference in SNP frequency and disease prevalence in the normal and affected populations. We noted that in de Jong et al the distribution of genotypes in controls for SNP Tyr113His was not in HWE (χ² = 5.67, p = 0.017). This may have generated a type I error in their analysis. A degree of population admixture in their control cohort could account for the deviation from HWE and give rise to the observed association between Crohn’s disease and the T allele, which is normally the common allele (as we observed). Alternative explanations are genotyping error and random chance. We examined the genotype distribution for the seven SNPs tested by de Jong et al and found that, in addition to Tyr113His, the Ile462Val (1506A/G) SNP in CYP1A1 was not in HWE (χ² = 7.87, p = 0.005). A recent review of published association studies by Xu and colleagues9 found that 12% of SNPs tested were inconsistent with HWE in control subjects.
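The p values quoted above for the two HWE deviations can be checked directly from the reported χ² statistics. The short sketch below assumes one degree of freedom, which is the usual choice for a biallelic HWE test but is not stated explicitly in the text; the helper name is hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Chi-square values as reported in the text for HWE in the de Jong et al controls.
reported = {
    "Tyr113His (EPHX1)": 5.67,
    "Ile462Val (CYP1A1)": 7.87,
}

for snp, chi2 in reported.items():
    print(f"{snp}: chi2 = {chi2}, p = {chi2_sf_1df(chi2):.3f}")
```

With one degree of freedom these statistics correspond to p values of about 0.017 and 0.005, matching the figures given in the text.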

Our findings highlight the value of testing genetic association data for the expected (Hardy-Weinberg) genotype distribution, and of rigorous replication of genetic associations with adequate statistical power.

