Article Text

Download PDFPDF

Original research
Meta-analysis of genome-wide association studies and functional assays decipher susceptibility genes for gastric cancer in Chinese populations
  1. Caiwang Yan1,2,3,
  2. Meng Zhu1,2,
  3. Yanbing Ding4,
  4. Ming Yang5,
  5. Mengyun Wang6,7,
  6. Gang Li8,
  7. Chuanli Ren9,
  8. Tongtong Huang1,
  9. Wenjun Yang10,
  10. Bangshun He11,
  11. Meilin Wang2,12,
  12. Fei Yu1,
  13. Jinchen Wang1,
  14. Ruoxin Zhang6,7,
  15. Tianpei Wang1,
  16. Jing Ni1,
  17. Jiaping Chen1,2,
  18. Yue Jiang1,2,
  19. Juncheng Dai1,2,
  20. Erbao Zhang1,2,
  21. Hongxia Ma1,2,
  22. Yanong Wang13,
  23. Dazhi Xu13,
  24. Shukui Wang2,11,
  25. Yun Chen2,14,
  26. Zekuan Xu2,15,
  27. Jianwei Zhou2,16,
  28. Guozhong Ji2,17,
  29. Zhaoming Wang18,
  30. Zhengdong Zhang2,12,
  31. Zhibin Hu1,2,3,
  32. Qingyi Wei6,19,20,
  33. Hongbing Shen1,2,
  34. Guangfu Jin1,2,3
  1. 1 Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
  2. 2 Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Medicine, Nanjing Medical University, Nanjing, China
  3. 3 State Key Laboratory of Reproductive Medicine, China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
  4. 4 Department of Gastroenterology, The Affiliated Hospital of Yangzhou University, Yangzhou, China
  5. 5 Shandong Provincial Key Laboratory of Radiation Oncology, Cancer Research Center, Shandong Cancer Hospital Affiliated to Shandong University, Shandong Academy of Medical Sciences, Jinan, China
  6. 6 Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China
  7. 7 Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
  8. 8 Department of General Surgery, Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
  9. 9 Department of Laboratory Medicine, Clinical Medical College of Yangzhou University, Yangzhou, China
  10. 10 Key Laboratory of Fertility Preservation and Maintenance, The General Hospital, Ningxia Medical University, Yinchuan, China
  11. 11 General Clinical Research Center, Nanjing First Hospital, Nanjing Medical University, Nangjing, China
  12. 12 Key Lab of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
  13. 13 Department of Gastric Cancer and Soft Tissue Sarcomas, Fudan University Shanghai Cancer Center, Shanghai, China
  14. 14 Department of Immunology, Nanjing Medical University, Nanjing, China
  15. 15 Department of General Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
  16. 16 Department of Molecular Cell Biology and Toxicology, School of Public Health, Nanjing Medical University, Nanjing, China
  17. 17 Institute of Digestive Endoscopy and Medical Center for Digestive Diseases, Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
  18. 18 Department of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
  19. 19 Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA
  20. 20 Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  1. Correspondence to Dr Guangfu Jin, Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China; guangfujin{at}njmu.edu.cn

Abstract

Objective Although a subset of genetic loci have been associated with gastric cancer (GC) risk, the underlying mechanisms are largely unknown. We aimed to identify new susceptibility genes and elucidate their mechanisms in GC development.

Design We conducted a meta-analysis of four genome-wide association studies (GWASs) encompassing 3771 cases and 5426 controls. After targeted sequencing and functional annotation, we performed in vitro and in vivo experiments to confirm the functions of genetic variants and candidate genes. Moreover, we selected 33 promising variants for two-stage replication in 7035 cases and 8323 controls from other five studies.

Results The meta-analysis of GWASs identified three loci at 1q22, 5p13.1 and 10q23.33 associated with GC risk at p<5×10 8 and replicated seven known loci at p<0.05. At 5p13.1, the risk rs59133000[C] allele enhanced the binding affinity of NF-κB1 (nuclear factor kappa B subunit 1) to the promoter of PRKAA1, resulting in a reduced promoter activity and lower expression. The knockout of PRKAA1 promoted both GC cell proliferation and xenograft tumour growth in nude mice. At 10q23.33, the rs3781266[C] and rs3740365[T] risk alleles in complete linkage disequilibrium disrupted and created, respectively, the binding motifs of POU2F1 and PAX3, resulting in an increased enhancer activity and expression of NOC3L, while the NOC3L knockdown suppressed GC cell growth. Moreover, two new loci at 3q11.2 (OR=1.21, p=4.56×10 9) and 4q28.1 (OR=1.14, p=3.33×10 11) were associated with GC risk.

Conclusion We identified 12 loci to be associated with GC risk in Chinese populations and deciphered the mechanisms of PRKAA1 at 5p13.1 and NOC3L at 10q23.33 in gastric tumourigenesis.

  • stomach cancer
  • genetics
  • mutation
  • gene regulation

Statistics from Altmetric.com

Significance of this study

What is already known on this subject?

  • A subset of gastric cancer (GC) susceptibility loci have been identified by genome-wide association studies.

  • Up to now, few loci have been experimentally confirmed for their biological mechanisms underlying the observed genetic associations.

What are the new findings?

  • For the first time, we characterise the mechanisms underlying the association of loci of 5p13.1 and 10q23.33 with GC risk at the regulatory and functional levels.

  • At the 5p13.1 locus, the C allele of rs59133000 reduces the promoter activity by enhancing the binding affinity of transcriptional repressor NF-κB1 (nuclear factor kappa B subunit 1), leading to decreased expression of PRKAA1, weakening the suppressive effect of PRKAA1 on tumourigenesis and eventually resulting in an increased risk of GC.

  • At the 10q23.33 locus, the risk haplotype rs3781266[C]–rs3740365[T] can switch the binding motifs of POU2F1 and PAX3 in an enhancer element, resulting in increased enhancer activity and expression of NOC3L and ultimately promoting GC tumourigenesis.

  • Here, we report 12 GC risk-related loci, including two novel ones of rs7624041 at 3q11.2 and rs10029005 at 4q28.1.

  • Our pathway analysis highlights an important role of host immunity, especially the cellular response to interleukin 1, in GC susceptibility.

Significance of this study

How might it impact on clinical practice in the foreseeable future?

  • Our results expand current understanding of genetic basis for GC risk and provide new deeper insights into the biology and mechanisms of susceptibility to GC.

  • The identified risk variants may serve as the basis for risk assessment in the clinical setting.

Introduction

Gastric cancer (GC) is the fifth most common cancer and the third leading cause of cancer death worldwide, and more than 40% of new cases and deaths each year occur in China.1 The heritability of GC has been estimated to be 22% in Nordic countries.2 Familial aggregation of GC occurs in approximately 10% of the cases, due to heritable pathogenic mutations in the high-penetrance genes, including CDH1, CTNNA1, BRCA2, STK11, PALB2, BRCA1 and RAD51C.3–5 However, the vast majorities of GC cases are sporadic, without definitive germline mutations. A subset of common variants or susceptibility loci have been identified by genome-wide association studies (GWASs),6–13 including 1q22, 3q13.31, 5p13.1, 5q14.3, 6p21.1, 6p22.1, 8q24.3, 9q34.2, 10q23.33, 12q24.11–12 and 20q11.21. In addition, we and other groups have discovered low-frequency variants in SPOCD1 13 and ATM 14 with a higher penetrance for GC. Following the discovery of susceptibility loci in association studies, MUC1 at 1q229 and PSCA at 8q24.315 16 have been experimentally confirmed as susceptibility genes for GC. However, the remaining loci await functional studies to further understand biological mechanisms underlying their observed genetic associations.

To comprehensively characterise susceptibility genes for GC, we performed a meta-analysis of four GWASs, with replication in other five studies, with a total of 10 806 cases and 13 749 controls from Chinese populations; the present study is thus the largest genetic study of GC in China to identify new genetic variants associated with GC risk. Through targeted sequencing, functional annotations and both in vitro and in vivo experiments, we deciphered the mechanisms by which the genetic variants at 5p13.1 and 10q23.33 influence gastric tumourigenesis.

Materials and methods

Genome-wide association studies

Four GWASs were included in the initial meta-analysis: one new GWAS (Onco-GWAS) and three published GWASs (NJ-GWAS, BJ-GWAS and SX-GWAS).7 8 11 The characteristics of the participants are summarised in online supplementary table 1. The cases and cancer-free controls of each study were all ethnic Chinese from the same geographical area. All four studies used Illumina or Affymetrix chips for genotyping. We performed quality control procedures on genotyping data using the same protocol for all four GWAS datasets (online supplementary figure 1). The study details and quality control procedures are presented in online supplementary text. After quality control on GWAS datasets, cases and controls were roughly matched at the genetic level for all four studies (online supplementary figure 2). All participants provided a written informed consent. All studies were approved by the relevant institutional review boards.

Supplemental material

Meta-analysis of GWASs

To facilitate the meta-analysis of four GWAS datasets, imputation was conducted separately for each GWAS dataset (details provided in online supplementary text and imputation quality shown in online supplementary figure 3). Single-variant association analysis was performed separately for each GWAS dataset by the use of SNPTEST software (V.2.5).17 Significant principal components of ancestry and demographic characteristics were included as covariates in the multivariate logistic regression model in estimating ORs and 95% CIs for the genetic variants. An inverse variance-weighted fixed effects model was used in the meta-analysis as implemented in the GWAMA software (V.2.1).18 Quantile-quantile (Q-Q) plots and Manhattan plots were created for individual studies or the meta-analysis using R software (V.3.4.3) (online supplementary figures 4 and 5). Regional plots were generated by using LocusZoom,19 and the Bonferroni correction was used to control multiple testing in each region (0.05/number of variants tested). The details for targeted sequencing and functional annotations are provided in the online supplementary text.

Functional experiments

The activities of regulatory elements controlling potentially functional variants at 5p13.1 and 10q23.33 were experimentally validated by using luciferase reporter assays. Transcription factor binding site analysis was performed with the JASPAR 201820 and enhancer element locator algorithm.21 The binding of transcription factors was confirmed by electrophoretic mobility shift assay (EMSA)- and chromatin immunoprecipitation (ChIP)-based assays. RNA interference, plasmid overexpression, lentivirus-mediated knockdown and CRISPR/Cas9-mediated knockout, as well as cell viability, proliferation and 5-ethynyl-2’-deoxyuridine (EdU) incorporation assays, were used to evaluate the biological functions of PRKAA1 and NOC3L in GC cells. The methods used for CRISPR/Cas9-mediated knockout of PRKAA1 and lentivirus-mediated knockdown of NOC3L in the BGC823 cell line are shown in online supplementary figures 6 and 7. Cells (3.0×106) were injected subcutaneously into the bilateral armpit of 5- to 6-week-old male BALB/c nude mice to assess the growth of xenograft tumours after PRKAA1 knockout or NOC3L knockdown. Protocols for these experimental procedures are available in the online supplementary text.

Replication studies

To identify new loci associated with GC risk, we performed a two-stage replication with detailed information summarised in online supplementary table 2. In the first stage (replication I), we genotyped 33 promising loci in 1710 GC cases and 1802 controls by using the iPLEX Sequenom MassARRAY platform (Sequenom, San Diego, California, USA) (online supplementary table 3). In the second stage (replication II), we further genotyped five variants that showed consistent associations in replication I (p<0.05) in 5325 cases and 6521 controls from Jiangsu (replication II-JS), Shanghai (replication II-SH), Shandong (replication II-SD) and Ningxia (replication II-NX) by using a TaqMan allelic discrimination assay with an ABI 7900 system (Applied Biosystems, Carlsbad, California, USA). Logistic regression analysis was applied to estimate the ORs and 95% CIs for each replication study independently, and all results from replication studies and GWAS datasets were combined with a fixed-effects meta-analysis.

Gene-based and pathway analyses

We performed gene-based and pathway analyses on the basis of single-variant p values from the meta-analysis as implemented in MAGMA (Multi-marker Analysis of GenoMic Annotation).22 A total of 4622 curated gene sets representing known biological and metabolic pathways were derived from the KEGG (Kyoto Encyclopedia of Genes and Genomes)23 and GO (Gene Ontology)24 databases, which were catalogued by and obtained from MSigDB V.6.20.25 The details are provided in the online supplementary text.

Results

Meta-analysis of GC GWAS datasets and targeted sequencing of chromosomes 1q22, 5p13.1 and 10q23.33

An overview of the analysis strategy is presented in online supplementary figure 1. After quality control, 3771 GC cases and 5426 controls were retained in four GWAS datasets of Chinese ancestry (online supplementary figure 1), and more than 6 million variants were included in the genetic association analysis. Q-Q plots did not show evidence of a substantial inflation rate (λ=1.00−1.07, online supplementary figure 4). Manhattan plots from individual GWAS and meta-analysis are shown in online supplementary figure 5 and Figure 1A, respectively. Genetic variants of three regions were found to be associated with GC risk at the genome-wide association level (p<5×10−8), including 1q22 (lead variant rs760077: p=2.09×10−12), 5p13.1 (rs6897169: p=5.48×10−12) and 10q23.33 (rs10509671: p=2.51×10−12) (table 1). Moreover, an additional seven known loci for GC susceptibility, including 3q13.31, 5q14.3, 6p21.1, 6p22.1, 8q24.3, 9q34.2 and 20q11.21 were replicated in the meta-analysis with p values less than 0.05 (online supplementary table 4).

Table 1

Gastric cancer susceptibility loci at 1q22, 5p13.1 and 10q23.33 with a p value less than 5×10−8 in the meta-analysis of GWASs (3771 cases and 5426 controls) and two new loci identified in replication studies (7035 cases and 8323 controls)

The regional association results showed that the most strongly risk-associated variant rs760077 at 1q22 was previously reported, and rs6897169 at 5p13.1 and rs10509671 at 10q23.33 were highly correlated with previously reported variants (r2=0.96 and 1.00 for rs10074991 and rs3781264, respectively) (online (supplementary figure 8). Conditional logistical regression analysis and targeted sequencing of 1q22, 5p13.1 and 10q23.33 showed that there were no secondary signals with independent effects from the top hits (online supplementary figures 8 and 9).

A promoter variant of PRKAA1 at 5p13.1 modulates its tumour-suppressing role and modifies GC risk

To delineate the functional variants mapped to 5p13.1, we evaluated potential functions of 59 variants in strong linkage disequilibrium (LD) (r2>0.6) with the lead variant rs6897169 of this region (online supplementary table 5), all of which are in non-coding regions. Expression quantitative trait loci (eQTL) analysis showed the strongest correlation between rs6897169 genotypes and PRKAA1 expression levels in normal stomach tissues (p=7.2 × 10 4) (online supplementary table 6; figure 1B). We further focused on variants that are located within the annotated stomach-related cis-regulatory elements. The three most strongly supported variants (ie, rs3805495, rs5913300 and rs10065570) are located at the centre of DNase I hypersensitivity site (DHS) peaks as well as within regions harbouring promoter or enhancer histone marks (eg, H3K4me1, H3K4me3 and H3K27ac) in stomach tissues and transcription factor binding sites (figure 1C). Given the considerable evidence for allelic differences in PRKAA1 expression levels, we measured the effect of rs5913300 alleles on promoter activity and rs3805495 as well as rs10065570 alleles on the enhancer activity by luciferase reporter assays. The results revealed that the C allele of rs59133000, which was associated with an increased GC risk (OR=1.24, 95% CI: 1.16 to 1.32, p=1.17×10 11, table 1) and correlated with decreased expression of PRKAA1, consistently reduced the promoter activity of the reporter gene, as compared with the T allele (figure 2A). However, no allelic differences in the enhancer activity were observed for rs3805495 and rs10065570 (online supplementary figure 10). ChIP assays further confirmed H3K4me3 at the rs59133000 site (figure 2B). These results suggested that the C allele of rs59133000 might reduce the promoter activity and result in decreased expression of PRKAA1.

Figure 1

Manhattan plot for meta-analysis of gastric cancer genome-wide association studies and functional annotations on the locus of 5p13.1. (A) The associations (−log10 (p) values, y-axis) are plotted against genomic position (y-axis by chromosome and the chromosomal position); the variants at 1q22, 5p13.1 and 10q23.33 with p<5×10−8 are plotted in red. (B) The risk allele C in the lead variant rs6897169 at 5p13.1 was correlated with decreased expressions of PRKAA1 in normal stomach tissues. (C) ChIP-Seq data show enrichments of histone marks, DHSs and transcription factors in the sites of three variants (rs3805495, rs5913300 and rs10065570) in strong linkage disequilibrium with the lead variant rs6897169 (r2=1.00, 1.00 and 0.76, respectively) (see also online supplementary table 5). ChIP, chromatin immunoprecipitation; DHS, DNase I hypersensitivity site; UW, University of Washington.

Figure 2

The variant rs59133000 at 5p13.1 regulates transcriptional activity of PRKAA1 by modulating the binding affinity of nuclear factor kappa B subunit 1 (NF-κB1) to the promoter. (A) Allele-specific constructs containing the putative promoter sequence flanking rs59133000 were cloned into the pGL3-basic luciferase reporter vector and transfected into the GES1, BGC823 and SGC7901 cell lines. The results of luciferase activity were normalised to those of pGL3 basic (n=6). (B) DNA electropherogram of the PCR fragment generated using primers flanking the variant rs59133000 following ChIP (left); based on ChIP followed by quantitative PCR (ChIP-qPCR) against H3K4me3, enrichment was quantified relative to the amount of input DNA (right, n=3), and antibody directed against IgG was used as a negative control. (C) EMSA with biotin-labelled oligonucleotides containing the rs59133000[C] or rs59133000[T] allele and nuclear extracts from BGC823 cells. Lanes 1 and 6 show the mobility of the labelled oligonucleotides without nuclear extracts; lanes 2 and 7 show the mobility of the labelled oligonucleotides with nuclear extracts in the absence of the competitor oligonucleotide and lanes 3–5 and 8–10 show the mobility of the labelled oligonucleotides with nuclear extracts in the presence of unlabelled rs59133000[C] or rs59133000[T] competitors. The arrow indicates a DNA–protein complex. (D) In silico predicted preferential binding of NF-κB1 to the risk allele C of rs59133000 (left); after stimulated by tumour necrosis factor α (100 ng/mL) for 3 hours, the ChIP enrichment of NF-κB1 as determined by ChIP-qPCR (middle, n=2) and Sanger sequencing of the PCR amplifications (right). (E) Supershift assays using the NF-κB1 antibody in BGC823 cells. IgG was used as a negative control. In (A), (B) and (D), the data are presented as the means±SDs, and the p values were calculated using Student’s t-test. *p<0.05, **p<0.01 and ***p<0.001. ChIP, chromatin immunoprecipitation.

Subsequently, we sought to clarify whether the base change of rs59133000 might influence the preference for transcription factors. The EMSA assay showed that the C allele had a stronger binding band than the T allele (figure 2C). The competitive EMSA assay indicated that a 100-fold excess of unlabeled probe containing the C allele could completely abolish binding with the T allele, whereas the unlabeled T allele could not abolish binding with the C allele (figure 2C). Further in silico analysis revealed that rs59133000 was mapped within the binding motif of the transcription factor nuclear factor kappa B subunit 1 (NF-κB1) (figure 2D, left; online supplementary table 7). Using NCI-N87 cells heterozygous for rs59133000, we found the preferential binding of NF-κB1 to the risk allele C through ChIP followed by quantitative PCR (ChIP-qPCR) and Sanger sequencing of the PCR amplifications (figure 2D, online supplementary figure 11). NF-κB1 antibody supershift assays further revealed that NFKB1 is an allele-preferential binder of rs59133000 (figure 2E). Inhibiting the expression of NF-κB1 with specific small interfering RNA (siRNA) duplexes resulted in an increased expression of PRKAA1 (online supplementary figure 12A). After stimulation by tumour necrosis factor α (TNF-α), activation of NF-κB pathway first resulted in upregulation of PRKAA1 expression, while treatment for 3 hours, the expression of PRKAA1 gradually decreased with the increased concentration of TNF-α (online supplementary figure 12B). Taken together, these results suggest that the C allele of rs59133000 may enhance the binding affinity of NF-κB1 to the promoter of PRKAA1 as a transcriptional repressor.

By using the expression data from TCGA, we found that PRKAA1 was overexpressed in GC tumours (online supplementary figure 13A). We further measured PRKAA1 expression in gastric normal tissues or precancerous lesions and found that PRKAA1 expression was significantly reduced in intestinal metaplasia and dysplasia tissues as compared with superficial gastritis or atrophic gastritis (online supplementary figure 13B). To further explore biological significance of PRKAA1 in gastric carcinogenesis, we knocked down PRKAA1 by using siRNAs in BGC823 and SGC7901 cells that expressed relatively high levels of PRKAA1 (online supplementary figure 14), and we found that PRKAA1 knockdown promoted cellular malignant phenotypes in both the BGC823 and SGC7901 cell lines (online supplementary figure 15). Subsequently, by establishing a PRKAA1 knockout BGC823 cell line by using the CRISPR/Cas9 system (supplementary figure 6), we found that the deletion of PRKAA1 significantly increased proliferation (figure 3A,B) and clonogenicity (figure 3C) in these cell lines. Then, by rescuing the expression of PRKAA1 in the established PRKAA1 knockout BGC823 cell line, we found that the cell proliferation and clonogenicity decreased (figure 3D–F). Furthermore, by performing xenograft tumour assays with PRKAA1 knockout BGC823 cell lines, we found that PRKAA1 knockout reduced the growth of xenograft tumours in nude mice (figure 4). Taken together, these results suggest that PRKAA1 functions as a tumour suppressor in GC development.

Figure 3

PRKAA1 knockout promotes gastric malignant cellular phenotypes. (A) Proliferation curves before (WT) or after CRISPR/Cas9-mediated knockout (KO1 and KO2) of PRKAA1 in BGC823 cells. (B) Colony formation assay with BGC823 cells with PRKAA1 knockout (KO1 and KO2) and negative control cells (WT). The numbers of colonies were counted and are presented in the histogram. (C) EdU proliferation analysis before (WT) or after knockout of PRKAA1 (KO1 and KO2) in BGC823 cells. (D–F) Recovery of PRKAA1 expression in PRKAA1 knockout BGC823 cells reduced cell proliferation. The data are presented as the means±SDs, and data analysis was conducted using Student’s t test. *p<0.05, **p<0.01, ***p<0.001. DAPI, 4′,6-diamidino-2-phenylindole; WT, wild type.

Figure 4

PRKAA1 knockout promotes the tumourigenicity of gastric cancer cells in nude mice. In vivo xenograft tumour formation assays were performed using PRKAA1 knockout (KO1 and KO2), and negative control (WT) cells (3.0×106) subcutaneously injected into the bilateral armpit (left: WT; right: PRKAA1 KO1 or KO2) of 5- to 6-week-old male BALB/c nude mice. Tumour growth was measured twice weekly. On day 21, mice were sacrificed and tumours were photographed. (A, D) Mean volume of xenograft tumours in the WT or PRKAA1 knockout (KO1 and KO2) groups. The data are presented as the means±SEMs. (B, E) Photographs of excised tumour tissues from mice (top, tumours from the WT group; bottom, tumours from the PRKAA1 knockout group). (C, F) Average weight of harvested tumours in the WT and PRKAA1 knockout groups. The data are presented as the means±SDs. The p values were calculated using Student’s t-test. *p<0.05, **p<0.01, ***p<0.001. WT, wild type.

A risk haplotype at 10q23.33 activates an enhancer of NOC3L that promotes the GC tumourigenesis

To determine potentially functional variants at 10q23.33, we performed functional annotations for 78 variants that were in strong LD (r2>0.6) with the lead variant rs10509671 (online supplementary table 8). Three missense variants located in the exons of PLCE1 and NOC3L were predicted to be tolerant or benign (online supplementary table 9). Subsequent eQTL analysis indicated a strong correlation between the genotypes of the lead variant rs10509671 and NOC3L expression levels (p=8.4×10−8) (Figure 5A; online supplementary table 10). Then, we focused on two non-coding regions: region 1 for three variants (ie, rs11187842, rs3781266 and rs3740365 at a distance of 728 bp in perfect LD, pairwise r2=1.00) and region 2 for two variants (ie, rs7903902 and rs12220125 at a distance of 782 bp, r2=0.91), which were marked by the H3K4me1 and H3K27ac histone modifications and by the DHS in stomach tissues (figure 5B). Further luciferase assays indicated that compared with the wild-type haplotype in region 1, the variant haplotype of rs11187842[A]–rs3781266[C]–rs3740365[T], which was highly correlated with the risk allele rs10509671[C] and significantly associated with an increased GC risk (table 1), produced a higher enhancer activity of the reporter gene (figure 5C). However, there were no differences in enhancer activities between the wild-type and variant haplotypes in region 2 (figure 5D). In region 1, the ChIP assays also confirmed the H3K4me1 and H3K27ac modifications to the enhancer segment (online supplementary figure 16). Taken together, these results indicate that the risk haplotype of rs11187842[A]–rs3781266[C]–rs3740365[T] may elevate the enhancer activity at 10q23.33.

Figure 5

The haplotype at 10q23.33 regulates enhancer activity and expression of NOC3L by modulating the binding affinity of POU2F1 and PAX3. (A) The risk allele C of the lead variant rs10509671 was correlated with increased expressions of NOC3L in normal stomach tissues. (B) ChIP-Seq data show enrichment of histone marks, DHSs and transcription factors in two regions that include variants (region 1: rs11187842, rs3781266 and rs3740365; region 2: rs7903902 and rs12220125) in strong LD with the lead variant rs10509671 (R2 >0.6) (see also online supplementary table 8). (C–D) Allele-specific constructs containing the two putative enhancer regions (region 1 and region 2) were cloned into the pGL3-promoter luciferase reporter vector and transfected into BGC823 and SGC7901 cells. (C) Luciferase activity of the 900 bp DNA segment in the region 1 haplotype. Three variants within region 1, which were perfectly correlated with each other (pairwise r2=1.00), were mutated to modify their common alleles (G, T and A) to the risk alleles (A, C and T, respectively). (D) Luciferase activity of the 800 bp DNA segment in the region 2 haplotype. The haplotype contained two variants in strong LD: rs7903902 and rs12220125 (r2=0.91). The luciferase activity values were normalised to that of pGL3-promoter. The data are presented as the means±SDs, n=6. (E) EMSA with biotin-labelled oligonucleotides containing the rs11187842, rs3781266 and rs3740365 alleles in region 1 and nuclear extracts from BGC823 cells. Lane 1 shows the mobility of the labelled oligonucleotides without nuclear extracts; lanes 2–4, 7, 10 and 13 show the mobility of the labelled oligonucleotides with nuclear extracts in the absence of competitor oligonucleotides; and lanes 5–6, 8–9, 11–12 and 14–15 show the mobility of the labelled oligonucleotides with nuclear extracts in the presence of the associated unlabelled competitors. The arrow indicates a DNA–protein complex for rs3781266 and rs3740365. See also online supplementary figure 17. (F) The risk allele C of rs3781266 disrupts the POU2F1-binding motif. Left: Predicted preferential binding of POU2F1 to the non-risk allele T of rs3781266. Right: ChIP enrichment of POU2F1 as determined by ChIP-qPCR (n=3). (G) The risk allele T of rs3740365 creates a Pax3-binding motif. Left: Predicted preferential binding of Pax3 to the risk allele T of rs3740365. Right: ChIP enrichment of Pax3 as determined by ChIP-qPCR (n=3). In (C), (D), (F) and (G), the data are presented as the means±SDs, and the p values were calculated using Student’s t-test. *p<0.05, **p<0.01. ChIP, chromatin immunoprecipitation; DHS, DNase I hypersensitivity site; n.s., not significant; UW, University of Washington; WT, wild type.

Next, we performed EMSA assays that showed stronger binding bands for the labelled probes containing either rs3781266[T] or rs3740365[T] allele, but probes for different alleles of rs11187842 did not exhibit differential bands (figure 5E). Competitive EMSA assays confirmed stronger binding affinities for the rs3781266[T] and rs3740365[T] alleles (figure 5E, online supplementary figure 17). These two variants were in silico predicted to reside within binding motifs of multiple transcriptional factors in region 1 (online supplementary table 11). Further EMSAs using competitor DNA targeted to predicted transcription factors suggested that POU2F1 bound to the rs3781266 site, while PAX3 bound to the rs3740365 site (online supplementary figure 18A). The binding band was effectively competed away with an unlabeled intact sequence but not by oligonucleotides with a mutated POU2F1 or PAX3-binding motif (online supplementary figure 18B). Additional ChIP assays verified that POU2F1 and PAX3 in vivo bound to the DNA sequences containing rs3781266 and rs3740365, respectively (figure 5F,G; online supplementary figure 19). Taken together, these results suggest that the risk haplotype of rs3781266[C]–rs3740365[T] may disrupt the motif for POU2F1 binding but create the motif for PAX3 binding to the enhancer element at 10q23.33.

The expression data from TCGA showed that NOC3L was significantly overexpressed in GC tumours (online supplementary figure 20A). Subsequently, we knocked down NOC3L in BGC823 and SGC7901 cells, which expressed relatively high levels of NOC3L (online supplementary figure 20B) by RNA interference and observed a reduction in malignant cellular phenotypes, including behaviours such as cell proliferation and colony formation (online supplementary figure 21). We failed to establish an NOC3L knockout cell line by using the CRISPR/Cas9 system possibly due to the previously reported lethality26; thus, we performed the short hairpin RNA (shRNA)-mediated knockdown of NOC3L in BGC823 cells (online supplementary figure 7). NOC3L knockdown cells exhibited significantly lower cell proliferation than the wild-type counterparts (figure 6A,C), and the phenotype was restored after rescuing the expression of NOC3L (online supplementary figure 22). Furthermore, by performing xenograft tumour assays, we found that NOC3L knockdown suppressed the growth of xenograft tumours in nude mice (figure 6D,E). Collectively, these results suggest that NOC3L may promote GC development.

Figure 6

NOC3L knockdown suppresses gastric cancer cell growth. (A) Proliferation curve of NOC3L knockdown (shNOC3L #1 and #2) or negative control (shCtrl) BGC823 cells. (B) Colony formation assay of NOC3L knockdown or negative control BGC823 cells. The numbers of colonies were counted and were presented in the histogram. (C) EdU proliferation analysis indicating the effect of NOC3L on the growth of NOC3L knockdown or negative control BGC823 cells. (D, E) In vivo xenograft tumour formation assays were performed using NOC3L knockdown (shNOC3L #1 and #2) and negative control (shCtrl) cells (3.0×106) subcutaneously injected into the bilateral armpit (left: shCtrl; right: shNOC3L #1 or #2) of 5- to 6-week-old male BALB/c nude mice. Tumour growth was measured twice weekly. On day 21, mice were sacrificed, and excised tumours were photographed in the shCtrl or NOC3L knockdown groups. Mean volume of xenograft tumours is presented as the means±SEMs. Average weight of harvested tumours is presented as the means±SDs. The p values were calculated using Student’s t-test. *p<0.05, **p<0.01, ***p<0.001.

Population replications identify two new loci at 3q11.2 and 4q28.1 associated with GC risk

To discover additional loci related to GC risk, we performed a two-stage replication study to assess 33 promising variants not previously reported. In the stage of replication I, we identified five variants consistently associated with GC risk at p<0.05 in 1710 GC cases and 1802 controls (online supplementary table 3). In the stage of replication II, we further genotyped these five variants in an additional 5325 GC cases and 6521 controls and confirmed the associations between two variants (rs7624041 and rs10029005) and GC risk (table 1). Finally, we found that rs7624041 at 3q11.2 (OR=1.21, 95% CI: 1.13 to 1.28, p=4.56×10 9) and rs10029005 at 4q28.1 (OR=1.14, 95% CI: 1.10 to 1.19, p=3.33×10 11) were significantly associated with GC risk, without heterogeneity among studies (table 1 and online supplementary figure 23).

Gene-based and pathway analysis identify pathways related to immune response contributing to GC susceptibility

We identified 1379 genes to be associated with GC risk at a p value of <0.05 (online supplementary figures 24 and 34) genes passing correction for multiple testing (pFDR <0.05) (online supplementary table 12). Among these genes, 28 were located in the regions 1q22, 5p13.1 and 10q23.33, and the other six new genes were DAGLB, GFRA1, OR10G2, IRGC, YDJC and CCDC116.

In the pathway analysis, 222 pathways satisfied p<0.05, of which 5 reached the significance level of p<1×10 3, that is, cellular response to interleukin 1 (p=4.34×10 6), response to platelet-derived growth factor (p=3.88×10 4), immune response (p=4.57×10−4), positive regulation of T-cell-mediated immunity (p=7.64×10 4) and Toll-like receptor signalling pathway (p=8.28×10 4) (online supplementary table 13). After permutation-based multiple testing correction, the cellular response to interleukin 1 pathway remained significant (p=0.022). These results highlight the importance of pathways related to the immune response in GC susceptibility.

Discussion

In 2011, our group first reported chromosome 5p13.1 as a susceptibility locus for GC in Chinese populations.8 Since then, several studies have replicated this association in populations of different ethnicities, including Chinese,10 Korean27 and European14 populations. However, the functional mechanism of this locus remains unknown to date. In the present study, we found that the risk C allele of rs59133000 may reduce the promoter activity by enhancing the binding affinity of NF-κB1 and also lower the expression of PRKAA1. NF-κB1 (ie, p50), together with the other NF-κB family members (RelA/p65, RelB, c-Rel and NF-κB2/p52), forms homodimer and heterodimer that regulate many genes involved in inflammatory responses.28 Among them, the commonly studied heterodimer p50:p65 can induce gene expression and provide a rapid molecular switch for responses to pathogens or inflammatory stimuli.29 However, p50 homodimers can repress inflammation by competing with activating NF-κB dimers and preventing them from binding to the sites on the promoters of target genes.30 Moreover, the NF-κB signalling is one of the important triggers in inflammation-induced carcinogenesis, including that in GC.31 Therefore, we presented an allele-specific regulation of PRKAA1 mediated by NF-κB1 binding to the promoter, which may have accounted for the observed differences in GC risk among individuals.

PRKAA1 encodes AMP-activated protein kinase (AMPK) subunit 1, AMPKα1, which is required for the formation of the AMPK heterotrimeric serine/threonine kinase complex with the β- and γ-subunits.32 AMPK is a central regulator of the cellular metabolism and energy homeostasis, but the role of AMPKα1 in tumour development has not been extensively investigated. In the present study, the knockout and recovery of PRKAA1 in GC cells and xenograft tumour assays in nude mice demonstrated the role of PRKAA1 in suppressing GC cell growth. Recently, it was also shown that phosphorylation and activation of AMPKα1 suppresses gastric tumourigenesis through the LKB1-AMPK-PGC1α axis.33 Notably, the use of metformin, the activator of AMPK, could reduce the risk of GC,34 but these results remain controversial in observational studies.35 In the present study, we found that the C allele of rs59133000, by lowering PRKAA1 expression (naturally mimicking the opposite condition to metformin use), was associated with an increased GC risk, providing a genetic evidence supporting the protective effect of metformin use in the absence of confounding bias. Therefore, our results not only elucidated the mechanism by which genetic variants at 5p13.1 modify GC risk but also pinpointed the chemopreventive role of metformin in GC risk at the genetic level, suggesting that metformin has clinical implications and translational value to benefit populations at high risk of GC.

The 10q23.33 locus was initially found to be associated with the risk of both GC and oesophageal squamous cell carcinoma (ESCC) by two independent GWASs,7 36 which was replicated in the subsequent studies.8 37 38 PLCE1 was previously considered an ESCC susceptibility gene accounting for the signal of 10q23.33, although its reported functional roles were conflicting in recent studies.39–41 However, the biological mechanisms of 10q23.33 in GC development have not been investigated. Of particular interest, the lead variant rs10509671 at 10q23.33 was not correlated with PLCE1 expression in normal stomach tissues (online supplementary table 10) but instead was strongly correlated to the expression of NOC3L. NOC3L encodes an NOC3-like DNA replication regulator, also known as a factor for adipocyte differentiation 24 (fad24). In previous studies, fad24 was reported to be a positive regulator of adipogenesis by controlling DNA replication and promoting adipocyte differentiation.42 43 However, biological function of NOC3L in cancer remains unknown. The present study, for the first time, demonstrated that NOC3L could promote GC cell growth. It has been reported that genetic variants at 10q23.33 may have a stronger effect on gastric cardia cancer than on non-cardia cancer.44 Obesity and high BMI have been considered risk factors for gastric cardia cancer via adipokine pathophysiology and systemic inflammation mechanism.45 Notably, adiponectin, which is the most abundant adipokine negatively correlated with BMI, can activate the AMPK signalling but inhibit the NF-κB signalling, resulting in a negative effect on carcinogenesis.46 Therefore, we speculate that NOC3L may promote gastric tumourigenesis through promoting adipogenesis and increasing systemic inflammation. Moreover, the 10q23.33 locus was found to switch the binding of POU2F1 and PAX3 to a regulatory element by base changes at two variant sites (rs3781266 and rs3740365, respectively), which, however, are in complete LD. On the other hand, the mutant haplotype of rs3781266[C]–rs3740365[T] can upregulate NOC3L expression to promote GC cell growth, leading to an increased GC risk. Taken together, our findings suggest a model in which two variants at 10q23.33 may synergistically regulate the oncogenic role of NOC3L in GC development via binding affinity switching of the transcription factors POU2F1 and PAX3.

Our gene-based analysis showed that the majority of the significant genes (28/34) map to 1q22, 5p13.1 and 10q23.33. This finding again strengthens the dominant effects of these three loci on GC susceptibility. Of particular interest, the pathway analysis revealed that GC susceptibility genes were enriched in a series of pathways related to the immune response, especially the cellular response to the interleukin 1 pathway. This enrichment is consistent with the established link between chronic inflammation and GC development, because Helicobacter pylori infection causes chronic gastritis by promoting proinflammatory cytokine release and achlorhydria and follows a stepwise cascade of events from metaplasia to dysplasia before malignancy.47 Therefore, the associations between genetic variants in inflammation-related genes and GC risk were extensively examined via a candidate gene approach previously.38 48 The most well-known example is polymorphisms in the interleukin-1 gene cluster,49 including IL1B-31 or IL1B-511 and IL1RN, which exhibit a crucial role in the response to H. pylori infection and gastric neoplastic progression. Our pathway analysis results again highlighted the critical role of the host immunity in determining GC risk.

In summary, the present study reported 12 GC risk-related loci, including two novel ones. Overall, these loci can explain about 1.25% of the phenotypic variance in GC risk, whereas three loci of 1q22, 5p13.1 and 10q23.33 accounted for 0.76%. Importantly, for the first time, we characterised the mechanisms underlying the role the risk loci of 5p13.1 and 10q23.33 play in GC at the regulatory and functional levels. At the 5p13.1 locus, the C allele of rs59133000 reduces the promoter activity by enhancing the binding affinity of the transcriptional repressor NF-κB1, leading to downregulated expression of PRKAA1, weakening the suppressive effect of PRKAA1 on tumourigenesis and eventually resulting in an increased risk of GC. At the 10q23.33 locus, the risk haplotype rs3781266[C]–rs3740365[T] can switch the binding motifs of POU2F1 and PAX3 in an enhancer element, resulting in an increased enhancer activity and upregulated expression of NOC3L and ultimately promoting the tumourigenesis of GC. Moreover, the results of our pathway analysis also highlight an important role of host immunity, especially the cellular response to interleukin 1, in GC susceptibility. Therefore, the present study expands our current understanding of the genetic basis of GC risk and provides new evidence for genes and biological pathways involved in the tumourigenesis of GC.

Acknowledgments

The authors thank all study participants, researchers and clinicians who contributed to this study.

References

Footnotes

  • CY, MZ, YD, MY, MW, GL and CR are joint first authors.

  • Correction notice This article has been corrected since it published Online First. An author note has been added.

  • Contributors GuaJ, HS, QW and ZH designed the study and edited the manuscript; CY and MZ performed statistical analysis, conducted experiments and wrote the manuscript; TH, FY, JW, TW, JN, JC and YJ performed the genotyping and experiments; YD, MY, MenW, GL, CR, WY, BH and MeiW, RZ, YW and DX contributed to sample collection; EZ, JD, HM, SW, YC, ZX, JZ, GuoJ, ZW and ZZ reviewed data and provided critical comments or suggestions; GuaJ had primary responsibility for final content.

  • Funding This work was supported by grants from the National Major Research and Development Programme (2016YFC1302703); National Natural Science Foundation of China (81872702, 81521004 and 81573228); 333 High-Level Talents Cultivation Project of Jiangsu Province (BRA2018057); Jiangsu Outstanding Youth Fund (BK20160095); Project funded by China Postdoctoral Science Foundation (2019TQ0157).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Institutional Review Board of Nanjing Medical University. Animal care and handling procedures were performed in accordance with the National Institutes of Health’s Guide for the Care and Use of Laboratory Animals and were approved by the Committee on the Ethics of Animal Experiments of Nanjing Medical University (Nanjing, China).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information.

  • Author note Dr Qingyi Wei is a visiting professor at Fudan University.

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.