Gut 50:624-628 doi:10.1136/gut.50.5.624
  • Small intestine

The first large population based twin study of coeliac disease

  1. L Greco1,
  2. R Romino1,
  3. I Coto1,
  4. N Di Cosmo1,
  5. S Percopo1,
  6. M Maglio1,
  7. F Paparo1,
  8. V Gasperi1,
  9. M G Limongelli1,
  10. R Cotichini2,
  11. C D'Agate3,
  12. N Tinto4,
  13. L Sacchetti4,
  14. R Tosi5,
  15. M A Stazi2
  1. 1Department of Paediatrics, University of Naples Federico II, Naples, Italy and European Laboratory of Food Induced Disease (ELFID), Naples, Italy
  2. 2Istituto Superiore di Sanità, Rome, Italy
  3. 3Italian Coeliac Society, Southern Italy Regions, Italy
  4. 4Department of Biochemistry and Medical Biotechnologies, University of Naples Federico II, Naples, Italy
  5. 5Cellular Biology Institute, CNR, Rome, Italy
  Correspondence to: Professor L Greco, Università di Napoli Federico II, Dipartimento di Pediatria, Edificio 11, Via S Pansini 5, 80131 Napoli, Italy
  • Accepted 7 August 2001


Background and aims: The genetic load in coeliac disease has hitherto been inferred from case series or anecdotally referred twin pairs. We have evaluated the genetic component in coeliac disease by estimating the concordance rate for the disease among twin pairs in a large population based study.

Methods: The Italian Twin Registry was matched with the membership lists of a patient support group. Forty seven twin pairs were recruited and screened for antiendomysial (EMA) and antihuman-tissue transglutaminase (anti-tTG) antibodies; zygosity was verified by DNA fingerprinting and twins were typed for HLA class II DRB1 and DQB1 molecules.

Results: Concordance rates for coeliac disease differ significantly between monozygotic (MZ) (0.86 probandwise and 0.75 pairwise) and dizygotic (DZ) (0.20 probandwise and 0.11 pairwise) twins. This is the highest concordance so far reported for a multifactorial disease. A logistic regression model, adjusted for age, sex, number of shared HLA haplotypes, and zygosity, showed that genotypes DQA1*0501/DQB1*0201 and DQA1*0301/DQB1*0302 (encoding for heterodimers DQ2 and DQ8, respectively) conferred to the non-index twin a risk of contracting the disease of 3.3 and 1.4, respectively. The risk of being concordant for coeliac disease estimated for the non-index twin of MZ pairs was 17 (95% confidence interval 2.1–134), independent of the DQ at risk genotype.

Conclusion: This study provides substantial evidence for a very strong genetic component in coeliac disease, which is only partially due to the HLA region.

Coeliac disease (CD) is a permanent gluten intolerance associated with specific HLA class II haplotypes: DRB1*03 and DRB1*05/07, which are both linked to DQA1*0501/DQB1*0201 (DQ2), and DRB1*04, which is linked to DQA1*0301/DQB1*0302 (DQ8).1 Disease concordance rate studies in monozygotic (MZ) twin pairs, who are genetically identical, versus dizygotic (DZ) twin pairs, who are no more genetically similar than other siblings but who share a common environmental background, are a powerful means of assessing the weight of genetic and environmental factors in diseases. The concordance rate of DZ twins versus that of non-twin siblings, who are genetically as similar but share less of their environment, provides an estimate of the effect of a shared environment. However, the value of twin studies depends on careful methodology and particularly on the method used to recruit twin pairs. Essentially, recruitment of twins can be volunteer based or population based. Volunteer twin recruitment is subject to biases that affect estimates of heritability in a largely unpredictable way. In general, volunteer twin recruitment has been reported to overselect MZ and female twins.2 The end result is usually an overestimation of the genetic component. Bias is minimal in population based studies because individuals are identified as twins in a National Twin Registry and then assessed independently for their illness.3

Although a high concordance rate for CD in twins is not a new concept, earlier data were based on case list studies. Indeed, there is no fully informative population based study that includes DZ twins from a clinical series. Of the 13 twin studies of CD published in the last 25 years,4–16 eleven concern only one twin pair and two concern two and three pairs, respectively; none provides information on DZ twins. Polanco and colleagues8 reported a 70% concordance rate among 21 MZ pairs collected from 17 centres: all were clinical cases, and the discordant individuals were not investigated. DZ pairs were almost absent from the study.

We have conducted a population based twin study of CD in Italy because a National Twin Register has become available.17 The register contains about 1 600 000 potential twins, identified by their “fiscal code” (that is, an alphanumerical code assigned to each Italian individual at birth), and is the largest in the world.17 In addition, we were able to cross link the twin registry with the membership lists of the Italian patient support group, Associazione Italiana Celiachia (AIC), to identify individuals affected by CD. It has been estimated that about 50% of diagnosed individuals are included in the AIC registry.18 It is noteworthy that although the twinning rate in Italy dropped from 12.6/1000 pregnancies in 1955 to 9.6/1000 in 1983,19 DZ to MZ pregnancies have remained at a ratio of 2:1.20

The aims of this study were to evaluate: (1) the concordance rate for CD in MZ and DZ twin pairs; and (2) the independent contribution of specific HLA class II haplotypes to CD in order to determine the global genetic load. Five of six twins with dermatitis herpetiformis were found to be concordant for the disease, with mixed phenotypes.21



We matched the AIC membership lists of the five regions of Southern Italy (6048 cases) with the National Twin Registry. This registry was constructed from a database of “fiscal codes” that identify an individual's surname, date of birth, place of birth, and place of residence, and includes nearly 1 600 000 potential twins alive on 31 December 1996.17 Matching of the files produces four levels of probability (based on the aforementioned variables) of being a twin pair. Each pair resulting from the matching was contacted by us to verify twinship. To date, 58 twin pairs have been identified, and 47 entered our study. The verified twin pairs were visited and checked with regard to their health status, symptoms, and associated diseases. The diagnostic criteria of all probands were verified according to the ESPGAN revised criteria.22 The “index case” was the chronologically first diagnosed twin in the family.

DNA extraction

Peripheral blood samples were collected from identified individuals by venepuncture using EDTA as anticoagulant. A serum sample was also collected. Genomic DNA was isolated from peripheral blood lymphocytes using the salting out procedure.23 Purified DNA was quantified by spectrophotometry at 260 nm.

Serological studies

Twins were screened for antiendomysial (EMA) and antihuman-tissue transglutaminase (anti-tTG) antibodies. Coeliac disease specific IgA autoantibodies to endomysium were detected and semiquantified by indirect immunofluorescence on sections of umbilical cord, according to the method described by Sacchetti and colleagues.24 IgA anti-tTG antibodies were identified using an enzyme linked immunosorbent assay (ELISA).25,26 Dilutions of a positive reference serum, converted to concentrations of arbitrary ELISA units (EU/ml), were used to construct a standard curve. The dosage of total IgA with monoclonal monospecific antibodies and nephelometric procedures (BNA-Dade Behring) did not reveal any IgA deficient individuals.

Zygosity checking

The zygosity of twin pairs of the same sex was verified by DNA typing of nine short tandem repeats localised on nine different chromosomes27 with the AmpFlSTR Profiler Plus Kit (PE Applied Biosystems, Foster City, California, USA).28 The polymerase chain reaction products were then analysed by capillary electrophoresis on the ABI Prism 310 apparatus (PE Applied Biosystems).

HLA typing

Each individual was typed for HLA class II DRB1 and DQB1 molecules. A Dynal AllSet+ SSP DR low resolution kit and a Dynal AllSet+ SSP DQ low resolution kit (Dynal Oxoid, Cologno, Milan, Italy, 1999) were used for typing. Results were obtained after 2% agarose gel electrophoresis.

Intestinal biopsy

Clinically discordant twins with a positive EMA and tTG antibody test were offered a small bowel biopsy. The small bowel biopsy was performed while on a gluten containing diet and sampled with a paediatric or adult Watson capsule or with forceps during upper gastrointestinal endoscopy. Formalin fixed haematoxylin-eosin stained biopsy specimens were analysed under light microscopy. CD3+ and γδ lymphocytes were counted after immunohistochemical staining.29

Concordance rate

CD concordance was assessed separately for MZ and DZ twin pairs using pairwise (Pr) and probandwise (Cc) concordance rates. Pr is a descriptive statistic and simply gives the proportion of affected pairs that are concordant for the disease; it was estimated as described by Emery30 and MacGregor31: Pr=C/(C+D), where C is the number of concordant pairs and D is the number of discordant pairs. Cc is the proportion of affected individuals among the co-twins of previously ascertained index cases and was estimated as Cc=2C/(2C+D): this estimates the probability that a twin in a pair is affected given that his/her co-twin is affected and is informative of the recurrence risk of disease associated with relatedness of the pair.
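The two formulas above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the pair counts are the ones reported in the Results of this paper (15 concordant and 5 discordant MZ pairs; 3 concordant and 24 discordant DZ pairs).

```python
def pairwise_concordance(concordant: int, discordant: int) -> float:
    """Pairwise rate, Pr = C / (C + D)."""
    return concordant / (concordant + discordant)


def probandwise_concordance(concordant: int, discordant: int) -> float:
    """Probandwise rate, Cc = 2C / (2C + D)."""
    return 2 * concordant / (2 * concordant + discordant)


# (concordant pairs, discordant pairs) as reported in the Results section.
PAIR_COUNTS = {"MZ": (15, 5), "DZ": (3, 24)}


def concordance_table(counts=PAIR_COUNTS):
    """Return both rates for each zygosity group."""
    return {
        zygosity: {
            "pairwise": pairwise_concordance(c, d),
            "probandwise": probandwise_concordance(c, d),
        }
        for zygosity, (c, d) in counts.items()
    }


if __name__ == "__main__":
    for zygosity, rates in concordance_table().items():
        print(f"{zygosity}: pairwise={rates['pairwise']:.3f}, "
              f"probandwise={rates['probandwise']:.3f}")
```

With these counts the sketch gives 0.75 pairwise and about 0.857 probandwise for MZ pairs, and about 0.11 pairwise and 0.20 probandwise for DZ pairs, in agreement with the rates quoted in the abstract.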

Multivariate analysis

A logistic regression analysis was used to evaluate the independent effect of HLA and other variables on the risk for CD concordance. The sampling unit was the pair of twins and concordance for CD was the outcome variable. DZ twins sharing 0 or 1 alleles at the critical HLA region served as the reference category, and the risk of the non-index twin of being affected was estimated for DZ pairs with an identical HLA pattern and for MZ pairs. The other variables included in the model were the actual HLA genotype of the non-index twin, her/his sex, and age at CD. With this model we were able to estimate the independent risk of being concordant for CD due to zygosity, number of shared HLA haplotypes, age, and sex.


The study protocol was approved by the ethics committees of the University of Naples Federico II and of the AIC. Information on the study was given to probands and healthy family members. All subjects enrolled in the study granted permission to search hospital records, and consented to the serological screening and blood sampling for HLA typing. Matching of the Italian National Registry with the AIC patient support group to identify individual patients was performed in accordance with the Italian privacy law on data confidentiality.


Ascertainment and zygosity

Among the 6048 (63% females) AIC members born before 31 December 1996, 75 potential twins were identified after cross linkage with the Italian Twin Registry.

Fifty eight pairs of twins were confirmed, giving a ratio of 1.9 twins/100 individuals.20 Blood samples were available for 47 pairs. It is important to note that there was no selection bias on the grounds of symptoms or clinical evidence as they were recruited on the basis of date of birth, name, and birth place.

These 47 pairs entered the study and were examined for zygosity using specific markers: three of the 22 presumed MZ pairs showed different genetic profiles and were reclassified as DZ pairs. One of the 25 presumed DZ pairs was reclassified as an MZ pair. Hence 20 pairs (19+1) were classified as MZ (six male pairs, 14 female pairs) and 27 pairs (24+3) as DZ (eight male pairs, seven female pairs, 12 opposite sex). The MZ/DZ same sex/DZ opposite sex ratio was 1.3:1.0:0.8, which is very similar to, and not statistically different from, the expected population model of 1:1:1 (χ2=0.14, p=0.7).20 This suggests that there was no ascertainment bias in our sample, which can therefore be considered a true population based sample.

Disease status

In 13 of the 47 pairs identified, both twins had a clear diagnosis of CD. Five individuals, who were considered negative for CD before our study, were positive at EMA-anti-tTG screening; all underwent biopsy and a flat mucosa was observed in all cases.

In the probands, the most frequent symptoms of CD were diarrhoea (23/47 cases), weight loss (17/47), vomiting (27/47), and abdominal distension (12/47). All symptoms subsided on a gluten free diet.

Concordance rate

As shown in tables 1 and 2, 15/20 MZ twin pairs were disease concordant (85.7% probandwise, 75.0% pairwise): 12 pairs were known to have CD before this study and three MZ pairs were found to be concordant during our screening because of positive EMA and anti-tTG associated with a flat mucosa. Four of the five discordant twins underwent biopsy and there was no sign of mucosal damage or intestinal activation (evaluated with immunohistochemical markers of CD3 and γδ T lymphocytes); they were also negative for anti-tTG antibodies. These four discordant pairs were 5, 8, 11, and 22 years old. The fifth discordant twin (a 40 year old woman who was anti-tTG negative) refused biopsy.

Table 1

Sample description of the 47 twin pairs analysed

Table 2

Patient characteristics of the 47 twin pairs analysed

Three of the 27 DZ pairs were concordant (20% probandwise, 11% pairwise). One of these was known before the study; two were identified by screening and biopsy.

Of the five MZ discordant pairs, four were female (80%); 10 of the 15 concordant pairs (67%) were female.

One of the three DZ concordant pairs was female, and two were of the opposite sex. Six of the DZ discordant pairs were female, 10 were of the opposite sex, and in 7/10 of these opposite sex pairs the proband was female (tables 1, 2).

HLA stratified concordance

Table 3 shows the concordance rate in MZ twins according to HLA pattern. Sixteen of the 20 MZ pairs showed the DQA1*0501/DQB1*0201 genotype (DQ2, 12 concordant and four discordant), one concordant pair showed the DQA1*0301/DQB1*0302 genotype (DQ8) which is linked to DRB1*04, and three were DQ2 and DQ8 negative (two concordant and one discordant). These three pairs had an unusual HLA pattern; two expressed DRB1*07/07 and one expressed DRB1*07/08. They have been typed for the gene DRB4 and all three pairs show the DRB4*01 allele so that all of these individuals are Dw53+, which is strongly implicated in the pathogenesis of CD.32

Table 3

HLA patterns in monozygotic twin pairs

As shown in table 4, one of the five DZ pairs with an identical HLA was concordant; 4/5 (80%) were discordant (one of the latter showed the DRB4*01 allele and was therefore Dw53+). Two of the 13 (15.4%) DZ twin pairs with an at risk HLA haplotype, but different in molecular terms from the proband, were concordant. None of the eight DZ twins without an at risk HLA genotype was concordant with the proband. It is noteworthy that there was no difference in concordance rate between the DZ pairs that had the same HLA (IBS status=2) and the DZ pairs not sharing the same HLA (IBS status=1 or 0) but with an at risk haplotype (Fisher's exact test, p=0.64).

Table 4

HLA pattern in dizygotic twin pairs

Multivariate analysis

Multivariate analysis showed an increased, although not always statistically significant, risk of CD concordance in twin pairs for all variables considered (table 5). Female sex conferred a 30% excess risk in the non-index twin. Compared with children aged ≤10 years, the risk of being concordant was 1.6 for the 11–30 age group and 2.2 for pairs older than 30 years. The DQA1*0501/DQB1*0201 genotype conferred to the non-index twin a risk of CD of 3.3 (95% confidence interval (CI) 0.4–30.0) versus 1.4 for the DQA1*0301/DQB1*0302 genotype. After adjusting for the DQ at risk genotype, DZ pairs sharing the haplotype had an additional 1.4 risk of being concordant for CD whereas the same risk estimated for MZ pairs was 17.0 (95% CI 2.1–134.0).
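As a rough plausibility check on the intervals quoted above, the sketch below assumes that the reported 95% confidence intervals are Wald intervals, symmetric on the log odds scale, as is usual for logistic regression output. That is an assumption, since the paper does not state how the intervals were computed. Under it, the geometric mean of the two limits should lie close to the reported odds ratio, and the standard error of the log odds ratio can be recovered from the interval width. The function name is hypothetical.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def log_scale_summary(lower: float, upper: float):
    """Recover (implied odds ratio, SE of log OR) from a 95% CI.

    Assumes a Wald interval: exp(beta - z*se) to exp(beta + z*se).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lower + log_upper) / 2)  # geometric mean of limits
    standard_error = (log_upper - log_lower) / (2 * Z_95)
    return implied_or, standard_error


if __name__ == "__main__":
    # Intervals as reported in the text: MZ pairs and the DQ2 genotype.
    for label, (lo, hi) in {"MZ": (2.1, 134.0), "DQ2": (0.4, 30.0)}.items():
        implied_or, se = log_scale_summary(lo, hi)
        print(f"{label}: implied OR={implied_or:.2f}, SE(log OR)={se:.2f}")
```

The implied values (roughly 16.8 for MZ pairs and 3.5 for DQ2) sit near the reported 17.0 and 3.3; the small gaps are what one would expect from rounding of the published limits. The wide intervals reflect standard errors above 1 on the log scale, which is unsurprising with 47 pairs.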

Table 5

Odds ratios (OR) for coeliac disease concordance among non-index twins, estimated by logistic regression


Coeliac disease has a very strong genetic component: the ratio of the risk of first degree relatives (13%)33,34 to the population risk (0.5%)35 is 26. The CD concordance rate in twins has long been cited as proof of the “genetic load” of the disease. However, concordance rates were based only on case series that rarely included DZ twins.

We have conducted the first population based study in a large area—that is, the south of Italy, which constitutes 40% of the Italian population. To verify that there was no ascertainment bias in the study, we compared the “recruited” twinning rate (1:104) with the expected rate (1:103)20 and also the MZ:DZ of the same sex:DZ of the opposite sex ratio (1.3:1.0:0.8) with the expected ratio (1:1:1). The close agreement between the observed and expected ratios suggests that this study was not affected by a reporting and/or selection bias.
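The ascertainment figures in this paragraph follow from counts given in the Results: 6048 AIC members, 58 confirmed twin pairs, and 20 MZ, 15 DZ same sex, and 12 DZ opposite sex pairs among the 47 typed pairs. The short sketch below reproduces the arithmetic; the function names are illustrative, and reading "1:104" as one twin pair per 104 members is our interpretation.

```python
AIC_MEMBERS = 6048      # AIC members in the five southern regions
CONFIRMED_PAIRS = 58    # twin pairs confirmed after registry matching
TYPED_PAIRS = {"MZ": 20, "DZ same sex": 15, "DZ opposite sex": 12}


def members_per_pair(members: int, pairs: int) -> float:
    """Members per confirmed twin pair (the '1:N' twinning rate)."""
    return members / pairs


def twins_per_100(members: int, pairs: int) -> float:
    """Individual twins per 100 members (two twins per pair)."""
    return 100 * 2 * pairs / members


def ratio_relative_to(counts: dict, reference: str) -> dict:
    """Scale each count so that the reference category equals 1.0."""
    base = counts[reference]
    return {name: count / base for name, count in counts.items()}


if __name__ == "__main__":
    print(f"1:{members_per_pair(AIC_MEMBERS, CONFIRMED_PAIRS):.0f} pairs per member")
    print(f"{twins_per_100(AIC_MEMBERS, CONFIRMED_PAIRS):.1f} twins per 100 members")
    for name, value in ratio_relative_to(TYPED_PAIRS, "DZ same sex").items():
        print(f"{name}: {value:.1f}")
```

The results (about 1:104, 1.9 twins per 100 members, and a 1.3:1.0:0.8 ratio) match the values quoted in the text.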

Unlike earlier screening methods, the anti-tTG test was very efficient in identifying silent CD cases.25,26 In fact, anti-tTG screening revealed five clinically silent concordant pairs. However, this test does not predict who may become positive in the future. Therefore, data for individuals less than 20 years of age should be regarded with caution as some of these discordant pairs may change status in the future.

Three of the five MZ discordant pairs were children. One adult MZ pair was discordant at both screening and biopsy. In the other adult discordant pair, the non-index twin was negative for the tTG screening test but refused biopsy. We must await the results of follow up examinations to establish the definitive rate of concordance.2 Our MZ concordance rate of 75.0% probably errs on the low side, and is likely to increase during the follow up of discordant twins. Identification of three asymptomatic cases improved the estimate of concordance, which is quite high for a multifactorial disease. In fact, the concordance rate in MZ twin pairs is 13% for insulin dependent diabetes mellitus,36 12.3% for rheumatoid arthritis,37 26.7% for multiple sclerosis,38 11.1% for systemic lupus erythematosus,39 16% for ulcerative colitis,40 and 20% for Crohn's disease.40

The conclusion of this study does not differ greatly from that of earlier non-population based studies. The MZ versus DZ concordance rate provides a reliable estimate of the size of the genetic component in a disease whereas the DZ twins versus ordinary siblings concordance rate gives an estimate of the effect of a shared environment.41 In the case of CD, the DZ twins:siblings ratio appeared to be close to 1 as the incidence of the disease in siblings was about 11.1%, which is very close to that observed in DZ twin pairs (11%). Therefore, our data may suggest that a shared environment (gluten antigen aside) has little or no effect on the concordance of DZ twins reared together.

As expected, the risk of being concordant increased with age and with female sex of the non-index twin. Interestingly, all of the MZ pairs not presenting DQA1*0501/DQB1*0201 or DQA1*0301/DQB1*0302 had a DRB1*07 allele. The pairs in this group were: DRB1*07/07 (CD concordant), DRB1*07/08 (CD concordant), and DRB1*07/07 (CD discordant).

The concordance rate did not differ significantly between DZ twins with the same HLA pattern and DZ twins with different HLA patterns but with the at risk HLA haplotype. Therefore, it is not inconceivable that the HLA molecules that trigger the immune reaction to gliadin peptides do not differ in relation to the CD associated HLA genotype.

These data suggest that environmental factors, apart from gluten, have little or no effect on the pathogenesis of CD. It is generally agreed that HLA contributes to the onset of CD. However, the MZ/DZ concordance rate ratio (0.857/0.20=4.28) and the logistic regression analysis showing that, after adjustment for the HLA at risk haplotype the risk of MZ pairs to be concordant was 17, suggest that other genes are involved in the pathogenesis of the disease. A long term follow up may reveal that the discordance of MZ twins is due to epigenetic factors.

Although genes other than HLA may be implicated in CD, large genome screening studies have failed to identify a gene that exerts a major effect. Consequently, the large genetic load of CD identified in this study is probably produced not by a missing or “altered” gene but by a series of genetic characteristics which individually exert little effect but which collectively characterise a large gluten intolerant tribe that is spread throughout the gluten consuming world.42


This study would not have been possible without the enthusiastic help and participation of the Italian Coeliac Society. The authors are extremely grateful to Marco Salvetti and Giovanni Ristori for their precious work on establishing the Italian Twin Registry. The study was supported in part by TELETHON (grant No. E0552), “genetics of celiac disease”. It also received financial support from the Commission of the European Communities, specific RTD programme “Quality of Life and Management of Living Resources”, QLRT,-CT-1999-00037, “Evaluation of the prevalence of celiac disease and its genetic components in the European population” (it does not necessarily reflect its view and in no way anticipates the Commission's future policy in this area), and from the Regione Campania (P.O.P. Action 5.4.2 Funds 1997), the Italian Ministry of Health (MINSAN) (D.L. 502/92, finalised research 1998), and MURST-PRIN-COFIN-2000 “Twin studies in coeliac disease”. We are indebted to Jean Ann Gilder for revising and editing the text.