Changing genes; losing lactase
R J Grand1, R K Montgomery1, D K Chitkara1, J N Hirschhorn2

1 Divisions of Gastroenterology and Nutrition, Children’s Hospital Boston, and Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA
2 Department of Genetics, Children’s Hospital Boston, and Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA

Correspondence to: Dr R J Grand, Harvard Medical School, Division of Gastroenterology and Nutrition, The Children’s Hospital, 300 Longwood Avenue, Hunnewell G, Boston, MA 02115, USA

Transcriptional regulation of the lactase-phlorizin hydrolase (LPH) gene by polymorphisms is associated with persistence of high levels of intestinal lactase activity or non-persistence

Lactase-phlorizin hydrolase (LPH), an intestinal microvillus membrane enzyme that hydrolyses lactose, is a critical enzyme for neonatal nutrition. The developmental pattern of lactase expression in the human fetus is distinct from that of similar digestive enzymes. Before week 24 of gestation, intestinal lactase activity is low. It then begins to increase, and during the third trimester lactase activity increases markedly until levels in term neonates are at or above those of infants aged 2–11 months.1 Lactase exhibits a characteristic proximal to distal pattern of expression in the small intestine; enzyme activity is greatest in the mid-jejunum, with decreasing activity both proximally and distally, resulting in minimal activity in the proximal duodenum and the terminal ileum.2

In most human populations, lactase activity decreases during mid-childhood (about five years of age), resulting in low levels from that age onwards. This pattern is similar to that seen in all other mammals examined, with a reduction in intestinal lactase activity at weaning to a fraction of that found in the suckling newborn. In striking contrast, a minority of the human population, especially people of Northern European extraction and a few other racial groups, retain high levels of activity throughout adult life.3 Persistence of elevated lactase activity is thought to be a relatively recent human evolutionary development, arising within the last 10 000 years, coincident with the development of dairying.4 A small number of subjects with lactase non-persistence have been demonstrated to have an abnormality in the intracellular processing of newly synthesised LPH protein, indicating post-transcriptional control of non-persistence.5 However, it is now clear that in humans, as in all mammals studied, the primary mechanism of both the persistence and non-persistence phenotypes is regulation of gene transcription.6–8 Considerable effort has been devoted to the elucidation of the molecular mechanisms involved in the transcriptional regulation responsible for these two human phenotypes.

The gene for human LPH, located on chromosome 2q21, comprises 17 exons and covers approximately 49 kb, giving rise to a messenger RNA (mRNA) of slightly more than 6 kb. From initiation codon to stop codon, human LPH mRNA encodes 1927 amino acids forming the complete translation product.9 Sequence comparisons indicate that the coding region is comprised of four homologous parts, leading to the suggestion that the gene is the product of two duplication events during evolution.10 The nascent protein is heavily glycosylated so that the final translation product is about 220 kDa (fig 1). This high molecular mass glycoprotein undergoes intracellular cleavage, dividing regions I and II from regions III and IV. The protein consisting of regions III and IV contains the two active sites and is inserted into the microvillus membrane of the enterocyte as a mature enzyme of approximately 160 kDa.11 The proximal portion encompassing regions I and II has no enzymatic activity, but has been shown to function in correct folding of the enzyme.12 Initial analyses of the gene identified several single base polymorphisms (SNPs) within both the coding region and the 5′ flanking region. None was considered to have functional significance.9

Subsequent analysis has led to the identification of additional SNPs and several other features unique to the human gene that may be of relevance to the mechanisms of LPH persistence/non-persistence. The first 100 bp of the proximal promoters of the mammals analysed to date (rat, mouse, pig, and human) are virtually identical and appear to be similarly regulated.13–15 Studies in transgenic animals have indicated that approximately 1 kb of the 5′ flanking sequence in the pig, and 2 kb of the 5′ flanking sequence in the rat, are sufficient to direct appropriate tissue, cell, and villus expression, as well as the developmental decline at weaning.16–18 Comparable studies in humans have not been carried out. In contrast with the other mammals analysed, the 5′ flanking region of the human LPH gene contains five inserted stretches of repetitive DNA (two Alu sequences of approximately 300 bp each and three other short repetitive sequences), making a direct comparison of the more distal regulatory region to those of other mammals difficult. Whether or not these inserted repetitive DNA segments affect LPH expression is currently unknown. Furthermore, exon 17 of MCM6, a cell cycle regulatory gene, ends 3.5 kb from the start site of the human LPH gene.19 The transcriptional start site of the MCM6 gene lies approximately 39 kb 5′ of the LPH transcriptional start site. The two genes are close together but the available evidence indicates that their regulation is independent. Two polymorphisms associated with LPH non-persistence, originally identified by Enattah and colleagues20 and examined further here, lie within introns 13 and 9 of the MCM6 gene (fig 2).

The report by Enattah and colleagues20 mapped the DNA changes responsible for lactase persistence (or its converse, adult-type hypolactasia) to a region 13–22 kb upstream of the LPH gene. Using traditional linkage analysis, they first narrowed the region to approximately 3.4 million bases between genetic markers named D2S114 and D2S2385. They then hypothesised that the allele causing lactase persistence arose once in the recent past on a particular chromosome. In this scenario, recombination events in the subsequent history of the population would separate the persistence allele from alleles in other parts of the chromosome, but in the immediate vicinity the persistence alleles would still be inherited together (in linkage disequilibrium) with nearby alleles from the ancestral chromosome. Thus recombination events can be used to narrow the region of interest.
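As an illustration of what "inherited together (in linkage disequilibrium)" means quantitatively, the following minimal sketch computes three standard measures (D, D′, and r²) from counts of two-marker haplotypes. The function name `ld_stats` and the counts in the usage lines are hypothetical and are not data from the study; the sketch only shows how complete linkage disequilibrium differs from independence.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) for two biallelic markers.

    Arguments are counts of the four haplotypes (A/a at marker 1,
    B/b at marker 2). All counts here are hypothetical inputs.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n          # frequency of allele A
    p_B = (n_AB + n_aB) / n          # frequency of allele B
    D = p_AB - p_A * p_B             # deviation from independence
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r_squared


# Hypothetical example: allele A is only ever seen with allele B.
complete = ld_stats(30, 0, 0, 70)
# Hypothetical example: the two markers are statistically independent.
independent = ld_stats(25, 25, 25, 25)
```

In the first hypothetical case D′ and r² are both at their maximum, which is the pattern expected near a recently arisen allele; recombination over time pushes the values towards those of the second case.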

To identify this signal of linkage disequilibrium, they typed several additional markers within the critical 3.4 Mb region. They identified a 47 kb region containing LPH and upstream sequences in which all individuals with lactase persistence carried the same alleles. The localisation was based in part on data from two chromosomes that differ from the ancestral chromosome at only a single marker. These two chromosomes could therefore be derived from a recent mutation at that one marker rather than from a recombination event, meaning that localisation to the 47 kb region might be premature. Nevertheless, they re-sequenced the entire 47 kb region, including the entire LPH gene, and identified only one variant (a C>T SNP, located 13910 bases upstream of LPH) that was perfectly associated with lactase persistence. All 99 individuals with low lactase activity were homozygous for a C at this SNP whereas all 137 individuals with lactase persistence carried either C/T or T/T. A similar but not quite perfect association was found with a G>A SNP at −22018. No other variants were as tightly associated with lactase persistence as were these two SNPs. Interestingly, other haplotypes had previously been associated with lactase persistence and non-persistence.21
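To give a sense of how strong a "perfect" association in 236 individuals is, the counts quoted above (99 non-persistent individuals, all C/C; 137 persistent individuals, all C/T or T/T) can be arranged as a 2×2 table and evaluated with a one-sided Fisher exact test. This is a sketch for illustration only: the choice of test and the function name `fisher_one_sided` are assumptions made here, not the analysis reported by Enattah and colleagues.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    Returns P(top-left cell >= a) under the hypergeometric null
    (row and column totals fixed, genotype independent of phenotype).
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Rows: non-persistent, persistent. Columns: C/C, carries T (C/T or T/T).
# Counts are those quoted in the text for the -13910 SNP.
p_value = fisher_one_sided(99, 0, 0, 137)
print(f"one-sided Fisher probability: {p_value:.3g}")
```

A vanishingly small probability of this kind shows only that the association is very unlikely to be due to chance; it cannot by itself distinguish a causal variant from one in complete linkage disequilibrium with it.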

A report in this issue of Gut by the same group22 extends these studies by testing whether these SNPs are associated with decreased LPH mRNA levels [see page 647]. As expected, higher levels of LPH mRNA and lactase activity were found in intestinal biopsy samples of subjects whose DNA contained a T at the −13910 SNP and an A at the −22018 SNP. This correlation is perhaps not surprising given the previous very tight correlation between these alleles and lactase persistence20 and the tight correlation between lactase persistence and high lactase mRNA levels, as first reported in 1992.6,7 However, they also used a clever technique whereby SNPs in the coding region were used to distinguish the transcripts synthesised from the two LPH alleles in an individual heterozygous for one of these coding SNPs. By using allele specific reverse transcription-polymerase chain reaction directed at the coding SNPs, they were able to quantify not only the total levels of LPH mRNA but also the relative levels of expression from the two different transcripts. By this method, they showed that LPH mRNA transcripts are less abundant when synthesised from a chromosome carrying the C at −13910 and G at −22018 than from chromosomes carrying a T and an A at these two sites. Thus, in this population, levels of LPH mRNA were confirmed to be correlated with these SNP patterns.
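The logic of the allele specific comparison can be sketched in a few lines. Because both transcripts are measured in the same cells of the same heterozygous individual, any trans acting influence is shared, and an unequal split points to a cis acting difference between the two chromosomes. The function name `allelic_fractions` and the signal values below are hypothetical and are not measurements from the study.

```python
def allelic_fractions(signal_allele_1, signal_allele_2):
    """Return the fraction of total transcript signal from each allele.

    Inputs are allele specific RT-PCR signals (arbitrary units) for the
    two coding-SNP alleles in one heterozygous individual.
    """
    total = signal_allele_1 + signal_allele_2
    if total <= 0:
        raise ValueError("total signal must be positive")
    return signal_allele_1 / total, signal_allele_2 / total


# Hypothetical signals: transcript from the chromosome carrying T at -13910
# versus transcript from the chromosome carrying C at -13910.
frac_t, frac_c = allelic_fractions(3.0, 1.0)
print(f"T chromosome: {frac_t:.0%}, C chromosome: {frac_c:.0%}")
```

A split close to 50:50 would suggest no cis acting difference between the two chromosomes, whereas a marked skew of the kind in this made-up example is what a cis acting regulatory variant would be expected to produce.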

The reported perfect correlation between T at −13910 and lactase persistence is extremely suggestive, and this paper indicates that lactase persistence is largely or completely explained by a cis acting effect on mRNA levels that is due to either the −13910 SNP or an SNP in perfect linkage disequilibrium with this SNP. In this study, all of the chromosomes carrying a T at −13910 also had an A at −22018. It would be of interest to apply this same technique to individuals in whom these two alleles had been separated (for example, a C at −13910 but an A at −22018) to determine which of these SNPs was more tightly correlated with the cis acting effect on mRNA levels. One could also imagine using this assay in vitro to try to determine whether the −13910 SNP is truly causative or whether a more distant genetic variant might be responsible for the persistence of lactase activity into adulthood.

Except for rare cases of congenital lactase deficiency, reported to be due to a separate gene,23 every human infant has high levels of LPH expression. If the polymorphisms regulate LPH expression, it is unclear how to account for both universal elevated expression in infants and the later development of lactase persistence/non-persistence in different individuals. While the correlation of the polymorphisms with the LPH phenotypes presented here is excellent, it does not demonstrate causation. The discussion indicates that the polymorphisms are both located within repetitive DNA sequences. As the Alu sequences are unique to primates, this is consistent with a mechanism for LPH persistence unique to humans. However, post-weaning LPH non-persistence is common to all mammals. It is unclear how both pre- and post-weaning human patterns can be accounted for by one or both of these polymorphisms.

In contrast with the previous publication, in which it was suggested that the polymorphisms altered a transcription factor binding site, no mechanism is presented here. The discussion implies that the two SNPs may identify LPH enhancers. Experiments to test this hypothesis should be straightforward to carry out. At present, it remains unclear whether the polymorphisms directly affect expression of LPH or are simply markers for LPH persistence or non-persistence.

Figure 1

Model of the molecular forms of lactase-phlorizin hydrolase during synthesis and processing in the human villus enterocyte. The early changes in apparent molecular size are due to glycosylation, as indicated in the diagram. Note that the two active sites are located in domains III and IV. The subsequently removed domains I and II are important for correct folding of the nascent protein. Although not indicated on this drawing, the enzyme forms a homodimer during processing. The final N terminal cleavage of a small segment is depicted by the elimination of the terminal loop in the microvillus form of the enzyme.

Figure 2

Schematic representation of the region between the two highlighted genetic markers that is associated with adult-type hypolactasia (Enattah and colleagues20). Vertical bars within the MCM6 gene represent the exons, the widths indicating relative size. The two associated single base polymorphisms (SNPs) lie within the indicated introns of the MCM6 gene which is located 5′ to the LCT (lactase-phlorizin hydrolase) gene on human chromosome 2. The arrows indicate the direction of gene transcription (the centromeric region of the chromosome is located to the left). The scale bar at the bottom of the figure indicates the size of the DNA region occupied by the MCM6 gene. The numbers at introns 13 and 9 indicate the positions of the SNPs relative to the transcriptional start site of lactase (redrawn from Enattah and colleagues20).
