BACKGROUND Susceptibility to coeliac disease is genetically determined by possession of specific HLA DQ alleles, acting in concert with one or more non-HLA linked genes. The pattern of familial risk is most parsimonious with a multiplicative model for the interaction between these two classes of genes. Haplotype sharing probabilities across the HLA region in affected sibling pairs suggest that genes within the MHC complex contribute no more than 40% of the sibling familial risk of coeliac disease, making the non-HLA linked gene (or genes) the stronger determinant of coeliac disease susceptibility. Attempts to localise these non-HLA linked genes have been carried out using both linkage and association tests.
AIMS To review the evidence for the involvement of non-HLA linked genes in coeliac disease, and to compare the relative merits of linkage and transmission disequilibrium tests (TDT) to detect the non-HLA linked gene (or genes) contributing to the development of coeliac disease.
METHODS Under a range of genetic models the number of affected sibling pairs needed to detect linkage was compared with the number of families required to show a relation between marker and disease, adopting the TDT strategy.
RESULTS AND CONCLUSIONS Power calculations show that, if there is a single major non-HLA linked susceptibility locus, a non-parametric linkage approach may well prove effective. However, if there are a number of non-HLA susceptibility genes, each with small effect, the sample size necessary for linkage studies will be prohibitive and a systematic search for allelic association should be a more effective strategy.
- coeliac disease
- non-HLA linked genes
- TDT test
Gluten sensitivity, or coeliac disease, is due to T cell sensitisation and results in a range of mucosal abnormalities that may lead to malabsorption.1 Studies of small bowel biopsy specimens from first degree relatives of coeliac patients provide compelling evidence that genetic factors influence susceptibility to the disease.2 Reported recurrence risks vary between 5% and 20%, with most being around 10%.2 This variation probably results not only from genetic and environmental heterogeneity among populations, but also from differing diagnostic criteria between studies. Further support for an inherited predisposition to develop coeliac disease comes from twin studies. The concordance rate of coeliac disease in monozygotic twins is around 70%.2 Incomplete concordance between monozygotic twin pairs suggests that additional factors might be involved, although not all the twin pairs studied had proven monozygosity, and some of the twin pairs had insufficient long term follow up to preclude the future development of coeliac disease.
Coeliac disease shows a strong HLA association. The DQ2 molecule encoded by the alleles DQB1*0201 and DQA1*0501 is possessed by 95% of coeliac patients compared with 20–30% of controls.3 This DQ(A1*0501,B1*0201) heterodimer can be encoded in cis or in trans configuration. The possibility that the DQ2 molecule conferring susceptibility to coeliac disease might be unique has been excluded by showing that the DQB1*0201 and DQA1*0501 alleles do not show any disease specific sequences in patients.3 The difference in concordance rates between monozygotic twins and HLA identical siblings (70% versus 30%) clearly implicates the involvement of non-HLA genes in the genetic predisposition to coeliac disease.2 3
The precise mode of inheritance of coeliac disease is unknown. Using family data, Pena and colleagues4 proposed that a prerequisite for developing coeliac disease was homozygosity at an HLA unlinked locus. This proposal has been supported by some, but not all studies.5 6 It is clear, however, that the stronger determinant of disease susceptibility is the non-HLA linked component. Detecting the non-HLA linked gene or genes can be undertaken by a number of different strategies. The purpose of this article is to compare the relative power of linkage versus the transmission disequilibrium test (TDT) for identifying a non-HLA linked coeliac predisposition gene.
Linkage using affected sibling pairs
Localisation of disease genes using the classical linkage approach is based on the demonstration of cosegregation of markers in families with the disease. The essential prerequisite for utilising this approach to detect disease genes is that the model of inheritance of the disease can be specified with some degree of certainty. Unfortunately many diseases like coeliac disease display a complex pattern of inheritance indicative of the interaction of a number of distinct susceptibility genes. To circumvent the requirement for a specified model of inheritance a number of non-parametric methods have been developed. These are based on determining which regions of the genome are identical by descent (IBD) in affected relatives. The most common paradigm of this approach utilises affected sibling pairs (ASPs) and is based on comparing the IBD allele sharing at a given marker with the expectation under the null hypothesis that no deleterious gene is present. A marker close to a susceptibility gene would be expected to display excess IBD sharing, with ASPs sharing both alleles IBD more frequently than sharing neither allele. The allele sharing probabilities of ASPs depend on the contribution any gene makes to the genetic variation of the trait.
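The comparison of observed IBD sharing with its null expectation can be illustrated with a small calculation. The sketch below uses the so-called mean test, which is one common way of testing for excess sharing and is not necessarily the statistic used in the studies discussed here. The function name and the counts of sibling pairs are hypothetical and chosen only for illustration.

```python
import math

def mean_ibd_test(n0, n1, n2):
    """One-sided 'mean test' for excess IBD sharing in affected sibling pairs.

    n0, n1, n2 are hypothetical counts of pairs sharing 0, 1, or 2 alleles IBD.
    Returns (mean proportion of alleles shared, z statistic, one-sided p value).
    """
    n = n0 + n1 + n2
    if n == 0:
        raise ValueError("at least one sibling pair is required")
    # Each pair shares a proportion 0, 0.5, or 1 of its alleles IBD.
    mean_sharing = (0.5 * n1 + n2) / n
    # Under the null (0.25, 0.5, 0.25) the per-pair proportion has mean 0.5
    # and variance 0.125, so the sample mean has variance 1 / (8n).
    z = (mean_sharing - 0.5) / math.sqrt(1.0 / (8.0 * n))
    # Upper-tail probability of the standard normal distribution.
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    return mean_sharing, z, p_one_sided

# Illustrative counts only (not data from any study).
share, z, p = mean_ibd_test(20, 50, 30)
print(f"mean sharing = {share:.3f}, z = {z:.3f}, one-sided p = {p:.4f}")
```

With these made-up counts the mean sharing is 0.55 against a null value of 0.5. A z of about 1.41 is far short of the thresholds used in genome wide searches, which is why small excesses in sharing demand large numbers of pairs.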
This is generally measured in terms of the relative risk (λ) which is the risk to relatives of affected probands compared with the population risk. The probability that ASPs share 0, 1, or 2 alleles IBD (Z0, Z1, and Z2 respectively) is given by7: Z0 = 0.25/λs, Z1 = 0.5λo/λs, and Z2 = 0.25λmz/λs, where λs, λo, and λmz are the sibling, parent-offspring, and monozygotic relative risks respectively. If a marker is unlinked to the disease, these allele sharing probabilities will be equal to those expected under the null hypothesis (0.25, 0.5, and 0.25). Figure 1 shows the effect of sibling relative risk on the probability that ASPs will share two IBD alleles. As the sibling relative risk increases, the probability that ASPs will share two alleles at the disease gene locus increases. These formulae hold true irrespective of the mode of inheritance at the disease locus, the number of alleles and their frequencies, penetrance, and population prevalence.7 The only requirement is that recombination between the marker and the disease gene is negligible. Recombination between the marker and disease gene leads to a reduction in allele sharing between ASPs. Therefore, for a given sibling relative risk an increase in recombination will lead to a reduced deviation of the marker sharing from its null expectation. This has the consequence that larger numbers of ASPs are required to detect linkage (fig 1).
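As a rough numerical illustration of how sharing probabilities depend on relative risk, the sketch below evaluates the standard expressions Z0 = 0.25/λs, Z1 = 0.5λo/λs, and Z2 = 0.25λmz/λs at a fully linked marker. Two assumptions are made that are not taken from the text: where λmz is not supplied it is derived from the single locus identity λmz = 4λs − 2λo − 1, and the examples set λo equal to λs (no dominance variance). The function name is hypothetical.

```python
def ibd_sharing_probabilities(lambda_s, lambda_o, lambda_mz=None):
    """Probabilities that an affected sibling pair shares 0, 1, or 2 alleles IBD.

    Assumes no recombination between marker and disease locus.
    If lambda_mz is omitted it is derived from the single locus identity
    lambda_mz = 4*lambda_s - 2*lambda_o - 1 (an assumption of this sketch).
    """
    if lambda_s <= 0:
        raise ValueError("lambda_s must be positive")
    if lambda_mz is None:
        lambda_mz = 4.0 * lambda_s - 2.0 * lambda_o - 1.0
    z0 = 0.25 / lambda_s
    z1 = 0.5 * lambda_o / lambda_s
    z2 = 0.25 * lambda_mz / lambda_s
    return z0, z1, z2

# Illustrative values of the sibling relative risk, with lambda_o = lambda_s.
for lam in (1.0, 2.0, 6.0):
    z0, z1, z2 = ibd_sharing_probabilities(lam, lam)
    print(f"lambda_s = {lam:>4}: Z0 = {z0:.3f}, Z1 = {z1:.3f}, Z2 = {z2:.3f}")
```

A relative risk of 1 reproduces the null values (0.25, 0.5, 0.25). Under these assumptions the probability of sharing two alleles rises with the sibling relative risk while the probability of sharing none falls.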
Transmission disequilibrium test
An alternative strategy for identifying the location of a gene conferring susceptibility to coeliac disease is by allelic association based on showing over representation of a specific allele in affected individuals. The simplest means of undertaking this is to compare the frequency in affected individuals (cases) with the frequency in the general population (controls). Marker alleles that are positively associated with the disease are analogous to risk factors in epidemiology. A major problem inherent in this approach is that spurious associations can arise as a result of population stratification. One method of overcoming the problem of hidden population stratification is to use family based controls. The most common approach introduced by Spielman and Ewens8 is the TDT, which is based on the McNemar test for matched pair data. It considers only parents whose transmitted and non-transmitted alleles are different (heterozygous parents) and assesses the evidence for preferential transmission of one allele over the other (table 1). One additional attractive feature of the TDT is that it is also a test of linkage and not merely of linkage disequilibrium, as only linkage disequilibrium can distort the distribution of marker genotypes among parents of affected individuals. In addition to the analysis of parent-child trios, the TDT approach can be extended to family data. Provided that the sampling scheme is confined to nuclear families such as parent affected sibpairs (TDT-ASP) the test will still be valid for both association and linkage.
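The McNemar form of the TDT can be written in a few lines. The sketch below is illustrative only: the function name and the transmission counts are hypothetical, and the p value uses the asymptotic chi-square distribution with one degree of freedom rather than an exact test.

```python
import math

def tdt(transmitted, not_transmitted):
    """McNemar-type TDT statistic for one marker allele.

    transmitted:     heterozygous parents who passed the allele of interest
                     to an affected child
    not_transmitted: heterozygous parents who passed the other allele
    Returns (chi-square statistic with 1 df, asymptotic p value).
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no heterozygous parents to test")
    chi2 = (transmitted - not_transmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Illustrative counts only: 60 transmissions against 40 non-transmissions.
chi2, p = tdt(60, 40)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Homozygous parents do not enter the calculation at all, which is why the number of informative families depends on marker heterozygosity as well as on the number of trios collected.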
To compare the sample sizes required to detect a predisposition gene using linkage and association methods requires consideration of the level of acceptable type I (false positive) and type II (false negative) error rates (denoted by α and β respectively). The probability (power) that a test will correctly detect a deleterious locus is given by 1−β. Most linkage searches are based on approximately 250–400 markers at a density of 10–20 cM throughout the genome. Lander and Kruglyak have proposed that a Lod score of 3.4 be used to define the appropriate level of significance (equating to α of 2.2 × 10⁻⁵) for genome wide searches using ASPs.9 Studies using the TDT method can be based on a candidate gene approach or can be on a genome wide basis. Risch and Merikangas10 proposed that 5 × 10⁻⁸ be adopted as a critical value for α in these genome wide tests. This is based on a Bonferroni correction to account for testing of five biallelic markers in 100 000 genes.10 11
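The arithmetic behind the genome wide threshold for association can be sketched as follows. Two assumptions are made here that the text does not state explicitly: the family wise error rate is taken as 0.05, and each of the two alleles at every biallelic marker is counted as a separate test. Under those assumptions the Bonferroni correction gives the 5 × 10⁻⁸ value quoted above. All names are hypothetical.

```python
def bonferroni_alpha(family_wise_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_wise_alpha / n_tests

n_genes = 100_000           # number of genes, from the text
markers_per_gene = 5        # biallelic markers per gene, from the text
alleles_per_marker = 2      # assumption: each allele counted as one test
family_wise_alpha = 0.05    # assumption: conventional overall error rate

n_tests = n_genes * markers_per_gene * alleles_per_marker
alpha = bonferroni_alpha(family_wise_alpha, n_tests)
print(f"{n_tests} tests, per-test alpha = {alpha:.1e}")
```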
The sample size required to detect any disease gene depends on its frequency and associated genotypic risk. Both these parameters are unknown for the non-HLA linked component of coeliac disease. However, the sibling relative risk can be derived from estimated overall and HLA linked sibling relative risks. Estimates of coeliac disease prevalence based on disease presenting symptomatically vary considerably in different areas of Europe and the USA.12-16 However, population studies based on antibody screening have all shown that the prevalence of coeliac disease is approximately 1 in 200.17-20 Based on this population prevalence rate and a 10% recurrence risk, the sibling relative risk of coeliac disease will be 20. The sibling relative risk of coeliac disease contributed by the HLA linked genes is approximately 3.2 21 22 Hence the non-HLA linked gene or genes will be the stronger determinant of coeliac disease compared with those linked to HLA. The actual relative risk will depend on the mode of interaction between the HLA linked and unlinked genes. These genes could interact either additively (the penetrance of the disease is represented by the sum of the penetrances contributed by two or more loci) or multiplicatively (the penetrance of the disease is the product of the penetrances contributed by two or more loci). The familial risks seen in siblings and monozygotic twins are most parsimonious with a multiplicative model. Based on this model the relative risk associated with the non-HLA linked genes will be approximately 6.0, accounting for approximately 60% of the familial risk of coeliac disease.
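The arithmetic in this paragraph can be checked with a short calculation. The sketch below takes the prevalence, recurrence risk, and non-HLA relative risk of 6.0 from the text. It then derives the HLA linked relative risk implied by a multiplicative model and, as one possible reading of the 60% figure, the share of familial risk on a logarithmic scale. That interpretation is an assumption of this sketch, as are the function names.

```python
import math

def sibling_relative_risk(recurrence_risk, prevalence):
    """Risk to siblings of affected probands relative to the population risk."""
    return recurrence_risk / prevalence

def log_scale_share(lambda_component, lambda_total):
    """Share of total familial risk attributable to one component when
    components multiply, measured on a log scale (an assumption of this sketch)."""
    return math.log(lambda_component) / math.log(lambda_total)

prevalence = 1 / 200       # from antibody screening studies cited in the text
recurrence_risk = 0.10     # typical sibling recurrence risk cited in the text
lambda_non_hla = 6.0       # non-HLA relative risk stated in the text

lambda_s = sibling_relative_risk(recurrence_risk, prevalence)
# Under a multiplicative model, lambda_s = lambda_hla * lambda_non_hla.
lambda_hla_implied = lambda_s / lambda_non_hla
non_hla_share = log_scale_share(lambda_non_hla, lambda_s)

print(f"overall sibling relative risk = {lambda_s:.1f}")
print(f"implied HLA linked relative risk = {lambda_hla_implied:.2f}")
print(f"non-HLA share on log scale = {non_hla_share:.2f}")
```

On this reading, an overall relative risk of 20 and a non-HLA relative risk of 6 imply an HLA linked relative risk of a little over 3, and a non-HLA share of about 0.6, in line with the figures quoted in the text.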
Table 2 shows the sample sizes necessary to detect a multiplicatively acting predisposition gene with 80% power using ASP linkage analysis and TDT approaches. A comparison of the relative power was made using the formulae derived by Camp.11 Power calculations for ASPs assume a highly polymorphic marker tightly linked to the disease locus. Estimates are based on gene frequencies of 0.01, 0.1, 0.2, and 0.5, and risk ratios of 1.5 to 16. These provide sibling relative risks compatible with the range of risk associated with an HLA unlinked susceptibility gene predisposing to coeliac disease.
The number of TDT trios is smaller in all cases than that necessary for linkage analysis using ASPs. However, table 2 shows that relatively small numbers of ASPs are required to detect a predisposition gene when it is less common and the sibling relative risk is high. Under this scenario, linkage analysis based on ASPs will offer the most efficient strategy for detecting a disease gene, particularly given the limited amount of genotyping required compared with adopting a TDT approach. As the sibling relative risk becomes smaller the deviation of allelic sharing in ASPs from the null expectation becomes smaller (fig 1). As a consequence of this the number of siblings required to detect linkage increases dramatically. Once the disease allele is relatively common and its risk ratio is small the number of ASPs required will be prohibitively large. Table 2 also shows the sample sizes required to detect a susceptibility gene using ASPs for TDT analysis (TDT-ASPs). In all cases the number of TDT-ASPs required to detect a disease gene is smaller than the number of TDT singletons, in many cases by a factor of four.
The sibling relative risk associated with the non-HLA linked genes in coeliac disease is comparatively large; however, the relative power of linkage and association detection methods will depend on whether this risk can be ascribed to a single locus or a number of different genes acting in concert. If the sibling risk is conferred by a single locus a linkage search will clearly be the best strategy. However, if susceptibility to coeliac disease is controlled by a number of non-HLA linked genes, each having a small effect, then a TDT based strategy should prove more effective.
Two genome wide linkage searches of coeliac disease families have been reported.23 24 The study reported by Zhong and colleagues23 was based on 15 Irish nuclear families containing 45 affected sibships. Linkage of coeliac disease to five new chromosome regions outside HLA was detected: 6p23 (telomeric to HLA), 7q31.3, 11p11, 15q26, and 22cen. These regions were evaluated in 28 northern European families reported by Houlston et al.25 No significant evidence of linkage was found except for chromosome 15. The other genome wide study which has been undertaken, reported by Greco et al,24 was based on an analysis of 108 Italian ASPs. In addition to HLA there was some evidence for linkage to 5qter and 11qter. The presence of a predisposition gene conferring a sibling relative risk greater than 1.8 in the regions of 15q26 and 11p11, and a risk of 1.6 in the regions 7q31 and 22cen, was excluded. It is possible that the discrepancies in findings between studies are due to differences in the contribution of genetic factors to coeliac disease in the different populations analysed. Alternatively, the significant findings reported by Zhong and colleagues23 may reflect in part the structure of the families studied. Specifically, 31 of the affected sibships belonged to three families. The inclusion of large sibships in which no typing information is available has been shown to lead to artificially inflated support for specific regions.
It is clearly far too early to dismiss linkage as a strategy for detecting coeliac disease susceptibility genes on the basis of the published data. However, if further linkage searches do prove to be ineffective, then an association study may offer a more attractive strategy for locating novel predisposition genes. At present these studies will be restricted to the evaluation of candidate genes for which there is a priori evidence to support their role as susceptibility genes. Three classes of genes involved in the pathogenesis of coeliac disease include those involved in antigen presentation, antigen recognition, and antigen modification. Adopting a candidate gene approach will obviously require smaller sample sizes, and have more power to show involvement of individual loci than to detect loci on a genome wide basis. The limitation of genome wide association and TDT studies is not, as Risch and Merikangas pointed out, a statistical one,10 but relies on the availability of markers at sufficient density. Marker reduction can only be achieved in the presence of strong disequilibrium, as less than maximum linkage disequilibrium between a marker and disease gene leads to a greatly increased sample size requirement.26 The outlook for genome wide studies may not be so negative however, since as Risch and Merikangas indicated, both the apo-E region conferring susceptibility to Alzheimer’s disease and the insulin VNTR region on chromosome 11p associated with diabetes, display strong linkage disequilibrium.27
The information derived from the human genome project coupled with the availability of microarray technology will, in the relatively near future, allow genome wide association studies to be undertaken. If no genes conferring susceptibility to coeliac disease are identified prior to the introduction of this technology, then their detection will ultimately only be limited by the availability of sufficient numbers of affected individuals and controls necessary for the identification of genes conferring small risks.
We are grateful to the two anonymous reviewers whose comments improved this paper. S Bevan is in receipt of a Fellowship from the Coeliac Society.
- Abbreviations used in this paper:
- ASP: affected sibling pair
- IBD: identical by descent
- TDT: transmission disequilibrium test