

Large scaled analysis of hepatitis B virus (HBV) DNA integration in HBV related hepatocellular carcinomas
  1. Y Murakami1,*,
  2. K Saigo2,
  3. H Takashima3,
  4. M Minami3,
  5. T Okanoue3,
  6. C Bréchot4,
  7. P Paterlini-Bréchot4
  1. 1Department of Gastroenterology, Fukui National Hospital, Fukui, Japan, and Department of Gastroenterology, Kyoto Prefectural University of Medicine, Kyoto, Japan
  2. 2Second Department of Surgery, Chiba University, Chiba, Japan
  3. 3Department of Gastroenterology, Kyoto Prefectural University of Medicine, Kyoto, Japan
  4. 4Institut Pasteur/INSERM U-370, Paris, France
  1. Correspondence to:
    Dr Y Murakami
    Department of Gastroenterology, Fukui National Hospital, 33-1 Sakuragaoka, Tsuruga, Fukui 914-0195, Japan;


Background and aims: Hepatitis B virus (HBV) DNA integration into or close to cellular genes is frequently detected in HBV positive hepatocellular carcinomas (HCC). We have previously shown that viral integration can lead to aberrant target gene transcription. In this study, we attempted to investigate common pathways to hepatocarcinogenesis.

Methods: By using a modified Alu-polymerase chain reaction approach, we analysed 50 HCCs along with 10 previously published cases.

Results: Sixty eight cellular flanking sequences (seven repetitive or unidentified sequences, 42 cellular genes, and 19 sequences potentially coding for unknown proteins) were obtained. Fifteen cancer related genes and 25 cellular genes were identified. HBV integration recurrently targeted the human telomerase reverse transcriptase gene (three cases) and genes belonging to distinct pathways: calcium signalling related genes, 60s ribosomal protein encoding genes, and platelet derived growth factor and mixed lineage leukaemia encoding genes. Two tumour suppressor genes and five genes involved in the control of apoptosis were also found at the integration site. The viral insertion site was distributed over all chromosomes except 13, X, and Y.

Conclusions: In 61/68 (89.7%) cases, HBV DNA was integrated into cellular genes potentially providing cell growth advantage. Identification of recurrent viral integration sites into genes of the same family allows recognition of common cell signalling pathways activated in hepatocarcinogenesis.

  • HCC, hepatocellular carcinoma
  • HBV, hepatitis B virus
  • hTERT, human telomerase reverse transcriptase
  • WHV, Woodchuck hepatitis virus
  • HPV, human papillomavirus
  • PCR, polymerase chain reaction
  • PDGF, platelet derived growth factor
  • MLL, mixed lineage leukaemia
  • APCL, adenomatous polyposis coli-like
  • MGMT, O6-methylguanine DNA methyltransferase
  • hepatocellular carcinoma
  • hepatitis B virus
  • DNA integration
  • human telomerase reverse transcriptase


Viral insertional mutagenesis is a powerful tool for identifying cancer related genes. In fact, cellular genes have been successfully identified using this method in animal models.1 In the majority of hepatocellular carcinomas (HCCs) developed in chronic hepatitis B virus (HBV) carriers, the viral genome is clonally integrated into the host chromosomal DNA.2–4 In some cases where detailed molecular studies have been performed, integration of the virus into the host genome was found to lead to novel fusion transcripts and/or local genomic instability, resulting in secondary deletions, rearrangements, duplications, or inversions of the host and/or viral genomic sequences.5–8 It is also generally accepted that the HBV genome has, in itself, some oncogenic activities. HBV encodes HBX and preS2/S truncated proteins which may have transforming activity2 and may interfere with DNA repair.9 Chronic infection by woodchuck hepatitis virus (WHV) was used as an animal model of chronic liver disease followed by liver carcinogenesis. In humans, the lifetime risk of cancer for a HBV chronically infected person is approximately 10–25% while in woodchucks the lifetime risk is approximately 100%.10 The human papillomavirus 18 (HPV) also recurrently integrates in the vicinity of the c-myc gene and the relationship between HPV18 integration and development of cervical cancer has been demonstrated.11 In almost all cases examined so far, integration of WHV activates transcription at the N-myc2 locus. In the few cases where N-myc2 was not transcribed, WHV integration was found at the c-myc locus or N-myc1 locus.12,13 Accordingly, many researchers have tried in the past to investigate the relationship between viral integration and hepatocarcinogenesis; however, further technological advances were needed to isolate the viral/cellular junctions on a large scale and elucidate this issue. 
We have previously identified 21 cellular sequences as targets of HBV-DNA integration from 18 HCC using the HBV X sequence as a genetic tag.14,15 Our group and others have reported that HBV DNA is preferentially integrated into or in the vicinity of the human telomerase reverse transcriptase (hTERT) gene in HCCs.15–18 In this report, we have performed a large scale analysis of cellular genes targeted by HBV-DNA integration to identify cancer related genes and recurrent sites of viral integration.


Study population

The study population consisted of 50 cases of HBV positive HCCs obtained by surgical resection. Ten additional previously published viral-host junctions were taken into account for a better interpretation of results. Pathological analyses of adjacent non-tumorous tissues showed minimal liver change in four cases, chronic hepatitis in 15 cases, and liver cirrhosis in 27 cases; for 14 cases no information was available (table 1). The study protocol conformed to the ethical guidelines of the Declaration of Helsinki (1975). All patients or their relatives provided written informed consent, and the ethics committee of the Kyoto Prefectural University of Medicine approved all aspects of the study.

Table 1

 Clinical background of this study

Sample preparation

DNA was extracted from liver tissues using the “G’NOME DNA isolation kit” (BIO 101, Joshua Way, California, USA), according to the manufacturer’s instructions. All samples were stored at −80°C and carefully handled to avoid contamination by other nucleic acids.

Detection of viral-host junctions

A polymerase chain reaction (PCR) based technique (Alu-PCR) was employed using specific primers to human Alu sequences and to HBV sequences to efficiently amplify viral-host junctions, as previously described.25,26 We also used additional primer sets; a schematic view of this PCR strategy is illustrated in fig 1 and the primers are given in supplementary table 1 (supplementary table 1 can be viewed on the Gut website). Amplified PCR products were analysed by electrophoresis on a 1.0% agarose gel and transferred to a Hybond-N+ nylon membrane (Amersham-Pharmacia, Buckinghamshire, UK). To prepare a full length HBV probe, the total HBV genome was amplified according to the method of Günther and colleagues,27 and the HBV specific bands were detected by hybridisation with a DIG labelled probe (Roche, Mannheim, Germany).

Figure 1

 (A) Localisation on the hepatitis B virus (HBV) genome of the primers used in this study. The open circle represents the total HBV sequence; nucleotide 3200/1, DR1, and DR2 indicate numbering from the hypothetical EcoRI site (HBV subtype adw and accession number V00866 were used as reference sequences),28 direct repeat 1, and direct repeat 2, respectively. Arrows represent HB1 primers (HBV preS2/S (uPreS2), HBV X (pUTP), and HBV core region (uPre31), respectively). (B) Schematic protocol for amplification of the viral-host junction. Technical details of the procedure have been previously reported.15,16 (1) Small arrows represent primers. Primers represented as broken arrows are synthesised with dUTP instead of dTTP and can be denatured by uracil DNA glycosylase (UDG) treatment. The Alu specific primer has a 5′ tag sequence (Alu+tag). (2) After the tag introducing polymerase chain reaction (PCR), the PCR product comprises the HBV sequence, the cellular flanking sequence, and a newly synthesised sequence complementary to the tag (cTag), which is indicated by a rectangle. (3) The primers synthesised with dUTP are denatured by UDG and further amplification is performed with HB2 and the tag primer; a second PCR is then carried out using HB3 and the Alu tag primer.

Direct sequencing

The amplified viral-host junctions were analysed by sequencing using the dideoxy chain termination method. DNA was purified with an Easy trap kit (Takara, Otsu, Japan) and subjected to sequencing using the Prism Taq DyeDeoxy Terminator cycle sequencing kit (Applied Biosystems Inc., Foster City, California, USA), according to the manufacturer’s instructions. The sequencing products were precipitated with ethanol and then analysed by electrophoresis with a 377 Prism DNA sequencer (Applied Biosystems Inc.).
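Once a junction has been sequenced, the read has to be separated into its viral part and its cellular flanking part before the flank can be compared with the human genome. The following is only a minimal sketch of that idea, not the procedure used in this study. It assumes that the read starts in viral sequence (because it was primed from HBV), that the viral part matches the reference exactly, and that the reference genome is circular. The function name `split_junction`, the `min_viral` threshold, and the toy sequences are hypothetical.

```python
def split_junction(read, viral_ref, min_viral=8):
    """Split a junction read into (viral_part, cellular_flank).

    Hypothetical helper: assumes the read begins in viral sequence and
    that the viral part matches the reference without mismatches.
    Returns None when fewer than `min_viral` leading bases match.
    """
    read = read.upper()
    viral_ref = viral_ref.upper()
    # The HBV genome is circular, so append the start of the reference
    # to allow matches that run across the arbitrary linear origin.
    circular = viral_ref + viral_ref[: len(read)]
    best = 0
    for k in range(min_viral, len(read) + 1):
        if read[:k] in circular:
            best = k  # the first k bases are still viral
        else:
            break  # a longer prefix cannot match once a shorter one fails
    if best == 0:
        return None
    # Note: flank bases that happen to continue the viral match are
    # assigned to the viral part, so the breakpoint is only approximate.
    return read[:best], read[best:]


# Toy sequences for illustration only (not real HBV or human sequence).
toy_ref = "ATGGCTGCTAGGCTGTGCTGCCAACTGGATCCT"
toy_flank = "ACGATTACAGGA"
toy_read = toy_ref[10:25] + toy_flank
print(split_junction(toy_read, toy_ref))
```

Exact prefix matching is the simplest possible choice; a real analysis would need to tolerate sequencing errors and viral sequence variation, for example by using an alignment tool.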


We utilised a modified Alu-PCR method using primers specific to HBV-X and to HBV-preS2/S and core regions. Collectively, 68 viral-host junctions from 60 specimens were obtained. Fifty eight of 68 viral-host junctions were detected by Alu-PCR and 10 junctions which had already been published were detected by conventional methods. Fifty six of 58 viral-host junctions were obtained with the HBV-X primer, one was obtained with the HBV-S primer, and one was obtained with the HBV core primer. The short viral and cellular sequences at the 68 junctions are shown in the right part of supplementary fig 1 (supplementary fig 1 can be viewed on the Gut website). The cellular flanking sequences were assessed using the BLAST search system. Seven cellular flanking sequences were repetitive or unidentified sequences and 61 were cellular gene sequences, including 17 sequences potentially encoding unknown proteins (table 2 and supplementary table 2; supplementary table 2 can be viewed on the Gut website). These cellular sequences were divided into three groups (table 2): (1) genes already known to be involved in carcinogenesis: 15 genes (17 cases); (2) genes already known and/or fully characterised but not previously known to be involved in carcinogenesis: 25 genes; and (3) unknown open reading frames or genes belonging to a known gene family but not functionally characterised: 19 cases, eight of which had characters or significant motifs similar to those described in other cellular genes.
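The counts in this paragraph can be cross-checked with a short tally. The sketch below uses only the figures given in the text for the three groups and assumes, for illustration, that the 25 genes in group 2 correspond to 25 junctions; the variable names are hypothetical.

```python
# Counts as reported in the text above.
junctions_total = 68
by_method = {"Alu-PCR": 58, "previously published": 10}
by_primer = {"HBV-X": 56, "HBV-S": 1, "HBV core": 1}
repetitive_or_unidentified = 7
groups = {
    "known cancer related genes (cases)": 17,
    # Assumption: one junction per gene in this group.
    "characterised genes, not previously cancer related": 25,
    "unknown ORFs or uncharacterised family members": 19,
}

# Detection methods and primers should each add up to their totals.
assert sum(by_method.values()) == junctions_total
assert sum(by_primer.values()) == by_method["Alu-PCR"]

# Junctions in cellular genes plus repetitive/unidentified flanks.
in_genes = sum(groups.values())
assert in_genes + repetitive_or_unidentified == junctions_total

# Share of junctions located in cellular gene sequences.
pct_in_genes = round(100 * in_genes / junctions_total, 1)
print(in_genes, pct_in_genes)  # prints: 61 89.7
```

The resulting 61/68 (89.7%) matches the proportion of junctions in cellular gene sequences stated in the abstract.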

Table 2

 Target genes searched by viral insertion

Orientation of the HBV genome ORF was either the same or opposite to the cellular gene orientation, and HBV DNA integration was found either in the vicinity of or into the cellular gene (supplementary fig 1; supplementary fig 1 can be viewed on the Gut website). Recurrent HBV integration was found in the hTERT gene (three cases). We also found two recurrent target pathways: calcium signalling related genes (four genes: sarco/endoplasmic reticulum calcium ATPase 1 (SERCA1), inositol triphosphate receptor type 1 (IP3R1), inositol 1,4,5-triphosphate receptor type 2 (ITPR2), and SPARC related modular calcium binding 1 (SMOC1)), and 60s ribosomal protein-like encoding genes (three genes: L7a, L14, and L17). Moreover, in two cases, HBV was integrated into genes belonging to the same gene family: platelet derived growth factor family (PDGF) (two genes: PDGF receptor beta and PDGF beta), and mixed lineage leukaemia family (MLL) (two genes: MLL2 and MLL4). Additionally, five apoptosis associated genes (hTERT, thyroid hormone associated protein 150 alpha (TRAP 150α), scavenger receptor class A member 3 (SCARA3), mitogen activated protein kinase 1 (MAPK1), and BCL2-like 2 (BCL2L2)) were targeted by HBV. Our preliminary data had shown that TRAP 150α is involved in the control of apoptosis (Murakami Y et al, in preparation). Two tumour suppressor genes were also found at the HBV integration site: adenomatous polyposis coli-like (APCL) and O6-methylguanine DNA methyltransferase (MGMT). Concerning viral integrations into the hTERT gene, orientation of the HBV DNA was opposite to that of the hTERT gene in all three cases: HBV being integrated 10.8 kb upstream to the promoter, 0.3 kb downstream, and 16 kb downstream to the gene, respectively (fig 2). In all of the other genes, HBV DNA orientation and position into or close to the open reading frame were variable. The chromosomal localisation of viral integration was distributed on all chromosomes except chromosome 13, X, and Y (fig 3).
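The notion of a recurrent target can be illustrated by grouping integration events by gene or gene family and keeping the groups hit more than once. This is a minimal sketch using only genes named in this paragraph; the family labels, the treatment of APCL and MGMT as single-member groups, and the function name `recurrent_families` are assumptions made for illustration.

```python
from collections import Counter

# One (gene, family) entry per integration event named in the text.
hits = [
    ("hTERT", "hTERT"), ("hTERT", "hTERT"), ("hTERT", "hTERT"),
    ("SERCA1", "calcium signalling"), ("IP3R1", "calcium signalling"),
    ("ITPR2", "calcium signalling"), ("SMOC1", "calcium signalling"),
    ("L7a", "60s ribosomal protein"), ("L14", "60s ribosomal protein"),
    ("L17", "60s ribosomal protein"),
    ("PDGF receptor beta", "PDGF"), ("PDGF beta", "PDGF"),
    ("MLL2", "MLL"), ("MLL4", "MLL"),
    # Assumption: single hits are labelled with their own gene name.
    ("APCL", "APCL"), ("MGMT", "MGMT"),
]


def recurrent_families(events, min_hits=2):
    """Return families targeted at least `min_hits` times, with counts."""
    counts = Counter(family for _gene, family in events)
    return {family: n for family, n in counts.items() if n >= min_hits}


for family, n in recurrent_families(hits).items():
    print(family, n)
```

With these inputs the five recurrent groups are hTERT (3), calcium signalling (4), 60s ribosomal protein (3), PDGF (2), and MLL (2), while APCL and MGMT are filtered out as single events.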

Figure 2

 Hepatitis B virus (HBV) DNA integration in the human telomerase reverse transcriptase (hTERT) gene. The shaded box is the hTERT gene, the bold arrows indicate the position and orientation of the inserted viral sequence, and the thin arrows indicate the orientation of the hTERT gene. The distance between the ATG or stop codon and the HBV integration site is indicated (kb). The letters on the top of the bold arrow indicate the hepatocellular carcinoma code, as reported in table 1.

Figure 3

 Distribution of the viral integration sites on the chromosome map. Red spots indicate viral integration sites. The names and chromosomal localisations of the genes are also indicated.


Our study demonstrates that a large scale analysis of HBV DNA integration sites in liver cancer enables identification of several cancer related genes and pathways. Most of the viral integration sites are located in the vicinity of cellular genes or inside the coding sequences. Such a configuration may lead to activation of proto-oncogene expression or inactivation of tumour suppressor genes as a result of viral insertion. The inserted viral sequences are sometimes located immediately upstream or downstream of the target gene. In some cases, integration is quite distant from the target genes but this is not inconsistent with an impact of HBV DNA on expression of the target gene, as shown for the win locus targeted by WHV integration in woodchuck liver cancers.29 In addition, viral integration and human chromosomal translocations can disrupt gene expression over hundreds of kilobases. Therefore, we cannot exclude candidate target genes solely because of the distance between the integration site and the target gene.1,30 A recent report on 14 cases of HBV-DNA integration showed that most cellular flanking sequences are repetitive sequences; however, identification of longer sequences flanking the HBV-DNA integration site would have allowed identification of a higher number of unique sequences, which are needed to accurately locate the HBV integration site in the human genome.31

Viral integration into the cellular genome has been thought to occur randomly2 but our study and other laboratory studies have shown that hTERT is a recurrent site of viral integration.15,18 Our present study, by identifying five gene families recurrently targeted by the viral genome, supports the view of recurrent targeting of genes involved in cell signalling and growth control: hTERT, PDGF receptor, calcium signalling related genes, MLL, and 60s ribosomal protein genes. hTERT gene expression has been found to be increased in HCCs.32 RNA expression of MLL was found to be amplified in several tumours, including HCC.33–37 Recently, Saigo et al found that a HBX-MLL fusion protein was dominantly expressed in three HCC tissues and chromosomal translocation was also observed in these cases (personal communication). A distinct group of genes includes PDGF-B, APCL, and MGMT, having a p53 dependent tumorigenic effect.38,39 Moreover, the PDGF-B/PDGF-R system has a critical function for pericyte recruitment to tumour vessels.37 APCL may be involved in the p53/Bcl2 linked pathway of cell cycle progression and cell death.40 MGMT hypermethylation has been involved in pharmacoepigenomics: methylated tumours are more sensitive to the killing effects of alkylating drugs used in chemotherapy.41 Another gene group includes calcium signalling genes. A previous study led to the discovery of new truncated and hybrid HBx-SERCA1 proteins involved in the control of apoptosis.42 Calcium homeostasis is also modified by HBV-X protein, which acts on calcium extrusion mechanisms, playing an important role in the control of HBX related apoptosis.43 We found that HBV targets ribosomal protein L7a, L14, and L17 genes. 
In vivo, constitutive expression of L7 has been found to induce cell cycle arrest.44 The trk-2h oncogene, derived from the human breast cancer cell line MDA-MB231, encodes a 44 kDa phosphoprotein exhibiting tyrosine protein kinase activity, and its N terminal 41 amino acids are derived from the N terminus of the human ribosomal large subunit protein L7a.45 Overexpression of the ribosomal protein L36a has been associated with tumour cell proliferation in HCCs.46

Our group has recently shown that viral integration and cellular DNA genetic rearrangements were observed in patients with acute hepatitis.26 These genetic changes have also been observed in the liver of patients with HBV chronic hepatitis (not during the latent phase).47 Acquired transforming activity of mutated cellular and viral proteins, including chimeric HBV cellular proteins, encoded following HBV integration have already been reported in previous studies. (a) In the viral genome, both the 3′ truncated HBV X and preS2/S gene products may exhibit oncogenic activity. Most new open reading frames created in HCCs by the HBV DNA integration derive from the 3′ truncated HBV-X sequence fused in frame to the cellular flanking sequence. These viral onco-proteins act by transactivating genes that regulate cellular growth. While the whole HBx protein may induce apoptosis through mitochondrial dysfunction and caspase activation43 and suppress the transforming activity of ras and myc oncogenes, C terminally truncated HBx proteins have been shown to promote the transforming activity of ras and myc oncogenes.48 The HBV genome includes enhancer elements capable of activating heterologous promoters in a position and orientation independent manner.49 Moreover, a mechanism of viral enhancer insertion has been reported in many examples of retroviral integration,50,51 as well as in WHV insertional mutagenesis.52 (b) In the cellular genome, hyperexpression and aberrant transcription of hTERT,16 truncated transcription of hMCM8,53 and hybrid viral/cellular transcription of SERCA1, FR7, and cyclin A have been previously reported14,42,54 and, in some analysed cases,55 shown to have transforming activity. The different patterns of HBV integration are all susceptible to modified expression of the target cellular genes. These mutagenic effects related to viral insertion may play an important role in liver carcinogenesis.

Taken together, our results suggest the following hypothesis of tumour development. Firstly, viral integration occurring during the acute phase of infection induces genetic changes in the target cellular gene. Secondly, the oncogenic activity of the cellular and viral genes modified by the viral integration may provide the cells harbouring the HBV DNA integration with a selective growth advantage (over viral propagation) during the chronic phase of the infection. Increasing accumulation of genetic changes during liver cell proliferation may finally lead to hepatocarcinogenesis.

In conclusion, we have found that all of the 61 genes identified at the HBV DNA integration site may be involved in the control of cell proliferation and/or survival and are therefore likely to play a role in the development of liver cancer. HBV insertional tagging provides a new tool for identifying human cancer related genes. This study has shown the high prevalence of HBV integration in genes involved in cell signalling. Our results have taken advantage of the recent progress achieved in the field of human genome sequencing. We propose the view that viral insertion induces the first genetic change in liver tumorigenesis and that genes targeted by viral integration may play an important role in hepatocarcinogenesis.


The authors are grateful to Professor Takashi Inamoto, Department of Gastroenterological Surgery, Kyoto University, and Professor Hisakazu Yamagishi, Department of Gastroenterological Surgery, Kyoto Prefectural University of Medicine, for providing the liver samples.


  • The table and figures are available as downloadable PDFs (printer friendly files).


    Files in this Data Supplement:

    • [view PDF] - Supplementary Table 1: Sequence of primers.
    • [view PDF] - Supplementary Figure 1: Left part - schematic drawing of HBV DNA integration in cellular genes. Right part - The sequence of the 20 first cellular nucleotides at the integration site.


  • * Present address: Laboratory of Human Tumour Viruses, Department of Viral Oncology, Institute for Viral Research, Kyoto University, Kyoto, Japan. Kawahara-cho, Shogoin, Sakyo-ku, Kyoto 606-8507, Japan.

  • Conflict of interest: None declared.
