Abstract
Objective Metabolic biomarkers are expected to decode the phenotype of gastric cancer (GC) and lead to high-performance blood tests towards GC diagnosis and prognosis. We attempted to develop diagnostic and prognostic models for GC based on plasma metabolic information.
Design We conducted a large-scale, multicentre study comprising 1944 participants from 7 centres in retrospective cohort and 264 participants in prospective cohort. Discovery and verification phases of diagnostic and prognostic models were conducted in retrospective cohort through machine learning and Cox regression of plasma metabolic fingerprints (PMFs) obtained by nanoparticle-enhanced laser desorption/ionisation-mass spectrometry (NPELDI-MS). Furthermore, the developed diagnostic model was validated in prospective cohort by both NPELDI-MS and ultra-performance liquid chromatography-MS (UPLC-MS).
Results We demonstrated the high throughput, desirable reproducibility and limited centre-specific effects of PMFs obtained through NPELDI-MS. In retrospective cohort, we achieved diagnostic performance with areas under curves (AUCs) of 0.862–0.988 in the discovery (n=1157 from 5 centres) and independent external verification dataset (n=787 from another 2 centres), through five machine learning algorithms applied to PMFs, including neural network, ridge regression, lasso regression, support vector machine and random forest. Further, a metabolic panel consisting of 21 metabolites was constructed and identified for GC diagnosis with AUCs of 0.921–0.971 and 0.907–0.940 in the discovery and verification dataset, respectively. In the prospective study (n=264 from lead centre), both NPELDI-MS and UPLC-MS were applied to detect and validate the metabolic panel, and the diagnostic AUCs were 0.855–0.918 and 0.856–0.916, respectively. Moreover, we constructed a prognosis scoring system for GC in retrospective cohort, which can effectively predict the survival of GC patients.
Conclusion We developed and validated diagnostic and prognostic models for GC, which also contribute to advanced metabolic analysis towards diseases, including but not limited to GC.
- GASTRIC CANCER
Data availability statement
Data are available on reasonable request. Additional data (beyond those included in the main text and Supplementary Information) are available from the corresponding author upon request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Early detection of gastric cancer (GC) can help patients receive timely treatment and improve their prognosis.
Small metabolites, as downstream molecular biomarkers, can directly reveal the phenotype of cancer and may lead to high-performance blood tests.
Nanoparticle-enhanced laser desorption/ionisation mass spectrometry (NPELDI-MS) can selectively recognise and capture metabolites in solid phase, which is promising for effective plasma metabolic fingerprinting as alternative to liquid/gas phase chromatography.
WHAT THIS STUDY ADDS
We launched the world’s largest study to date to determine the value of metabolites as biomarkers for GC diagnosis and prognosis, through the analysis of GC-specific plasma metabolic fingerprints (PMFs) obtained by NPELDI-MS.
The constructed models based on PMFs exhibited excellent diagnostic and prognostic performance for GC in both retrospective and prospective cohorts.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
The metabolic biomarker panel constructed can be applied for the diagnosis and prognosis of GC and subsequently contribute to GC management.
Introduction
Gastric cancer (GC) is the fifth most common malignant tumour and the fourth-leading cause of cancer-related death globally, resulting in ~2 million cases of new GC patients and ~0.8 million deaths related to GC per year, according to the latest data from International Agency for Research on Cancer.1 Importantly, early diagnosis for timely treatment can efficiently reduce the high mortality of GC. The 5-year overall survival (OS) can exceed 90% for early GC patients,2 compared with advanced GC patients with local or distant metastasis whose 5-year OS can decline to 5%–30%.3 However, current diagnosis of GC relies on sophisticated gastroscopy and experienced doctors, which limits large-scale application, especially in areas with limited resources. As a result, even in the USA, more than 60% of GC patients have local or distant metastases at the time of diagnosis.3 Accordingly, emerging blood tests based on biomarkers are urgently needed and promise early diagnosis of GC at large scale, owing to their simple operation and easy interpretation.4
Selection of cancer biomarkers is the core of blood tests for diagnostic purpose, considering both cellular biomarkers (eg, circulating tumour cells and cancer exosomes) and molecular biomarkers (nucleic acids, proteins and metabolites).5–8 While there have been advancements in cellular biomarkers, many widely used biomarkers in clinical settings are molecular biomarkers known for their stability.9 Further, compared with the upstream molecular biomarkers such as nucleic acids and proteins, small metabolites (molecular weight (MW) <1000 Da) as downstream molecular biomarkers directly reveal the phenotype of cancer.10 11 Notably, the available metabolites as GC biomarkers are still very limited, on small preliminary cohorts (69–600).12–23 Therefore, panels of metabolic biomarkers are in demand, which should be constructed on a designer GC cohort, with multifunctions in both diagnosis and prognosis.
Currently, mass spectrometry (MS) is the primary tool in high-throughput detection and profiling of small metabolites with identification capability, by measuring the mass-to-charge ratio (m/z) of metabolites and their fragments at high resolution of ~ppm level.24 However, traditional MS analysis of biosamples usually calls for liquid/gas phase chromatography for enrichment and purification of metabolites to enhance specificity and sensitivity, affecting the analytical speed and capacity.24 25 In this regard, nanoparticle-enhanced laser desorption/ionisation-MS (NPELDI-MS) relies on nanoparticles of defined structures for selective recognition and trapping of metabolites in solid phase, as a promising alternative to liquid/gas phase chromatography.26–28
To validate the value of blood metabolites in GC diagnosis and prognosis, we launched a large-scale, multicentre, retrospective study with prospective validation and employed NPELDI-MS to record the world’s largest cohort of GC-specific plasma metabolic fingerprints (PMFs) to date. The aim of this study was to establish both diagnostic and prognostic models for GC based on PMFs and to construct metabolic panels with desirable performance towards broad clinical applications.
Patients and methods
Participant cohorts
We conducted a large-scale, multicentre study, which comprised 1944 participants from 7 centres in retrospective cohort and 264 participants in prospective cohort. The medical records of all participants were reviewed. Baseline clinicopathological data, including age, sex, body mass index, smoking history, drinking history and family history, were collected before gastroscopy. Information on peripheral blood tumour markers, including carcinoembryonic antigen (CEA), carbohydrate antigen 242 (CA242), carbohydrate antigen 724 (CA724), alpha fetoprotein (AFP), carbohydrate antigen 125 (CA125) and carbohydrate antigen 199 (CA199), was also collected. The protein biomarkers were analysed according to the clinical instructions (CA242: No.20163400974, Snibe; CA724: No.11776258, Cobas; AFP: No.3P36/07P90, Alinity; CEA: No.00937450, Siemens; CA125: No.2K45/08P49, Alinity; CA199: No.10491244, Siemens).
Retrospective cohort
In this study, a multicentre, retrospective cohort that recruited a total of 1944 participants, including 962 GC patients and 982 non-GC participants, was built across China from 1 November 2007 to 31 August 2019. Specifically, the discovery dataset included samples from five centres (Zhejiang Provincial Hospital of Chinese Medicine, Sichuan Cancer Hospital, the People’s Hospital of Fenghua, Tiantai Chinese Medicine Hospital, and the People’s Hospital of Xinchang) in two provinces, while the independent external verification dataset was constructed by samples from another two centres (Zhejiang Cancer Hospital and the First People’s Hospital of Daishan).
For GC patients, the inclusion criteria included: (1) GC patients confirmed by pathological examination and with D2 lymphadenectomy, (2) patients with plasma samples prior to surgery, (3) patients who had not received any tumour treatment before surgery and (4) patients with complete clinical data and follow-up. The exclusion criteria included: (1) patients without plasma samples prior to surgery, (2) patients who had received any tumour treatment before surgery and (3) patients with incomplete medical records or loss of follow-up. Pathological type, tumour size, tumour location, grade of differentiation, nerve invasion and vascular tumour thrombus were also retrieved from medical records. According to the eighth American Joint Committee on Cancer TNM staging system, pN stage, pM stage and pTNM stage were also retrieved.
For non-GC participants, the plasma samples were collected during routine physical examination. The health status of all participants was confirmed by reviewing medical examination results and consulting medical history. The exclusion criteria of non-GC participants included: (1) participants with a history of cancer-related treatment, (2) participants without plasma samples prior to endoscopy and (3) participants using drugs that affect the composition of peripheral blood, including adrenocortical hormone and glucocorticoid.
Prospective cohort
The prospective cohort was constructed in the Zhejiang Cancer Hospital from 1 January 2021 to 31 December 2021 and consisted of 264 subjects.
For GC patients, the inclusion criterion was GC patients confirmed by gastroscopy and pathology, and the exclusion criteria included: (1) patients who had received any antitumour treatment, (2) patients with a history of any other malignancies and (3) patients using drugs that affect peripheral blood components. For non-GC participants, the inclusion criterion was participants confirmed by gastroscopy and pathology; and the exclusion criteria included: (1) participants with history of any other malignancies and (2) participants using drugs that affect peripheral blood components, including adrenocortical hormone and glucocorticoid. Written informed consent was obtained from all participants. We collected peripheral plasma and clinical information from all GC and non-GC participants who met the inclusion and exclusion criteria.
Sample collection and processing
A peripheral blood sample (5 mL per person) was collected from each participant in a fasted state before any gastroscopy or surgery. The collected samples were transported to the laboratory for processing within 1 hour at 4℃. All plasma samples were obtained by centrifugation at 1800 g at 4℃ for 10 min. After centrifugation, the plasma in the top layer was collected and stored at −80℃. The time from blood draw through centrifugation to ultimate storage at −80℃ was within 1 hour. EDTA was used as anticoagulant. All centres followed the same procedure of sample collection and processing (SOP #GB/T 38576-2020).
MS detection
NPELDI-MS analysis
Before NPELDI MS detection, the ferric nanoparticles were prepared and characterised and the plasma samples were pretreated.
To prepare ferric nanoparticles, a solvothermal approach was employed.29 30 FeCl3·6H2O, trisodium citrate dihydrate and anhydrous sodium acetate were dissolved in ethylene glycol solution step by step. The solution was sonicated at room temperature for 45 min, then transferred to a Teflon-lined stainless-steel autoclave and heated at 200°C for 10 hours. After cooling to room temperature, the product was collected and washed with ethanol and deionised water before drying for 24 hours at 60°C. Finally, the ferric nanoparticles were suspended in deionised water at a 1.0 mg/mL concentration for further detection. Scanning electron microscopy images were recorded on the S-4800 (Hitachi, Japan), before which the nanoparticle suspension was printed on aluminium foil. Transmission electron microscopy and elemental mapping images were collected using the JEOL 2100F (JEOL, Japan) by depositing nanoparticle suspension onto a copper grid (mesh size=200). The optical extinction spectrum was obtained using UV 3600 (Shimadzu, Japan) by depositing nanoparticles on a glass slide. The optical microscopy images of matrix-analyte co-crystallisation were collected by a three-dimensional Surface Profiler VK-X3000 (Keyence, Japan). The digital images of the MS chip were photographed with an iPhone 13 smartphone (Apple, USA). The organic matrices used for comparison included α-cyano-4-hydroxycinnamic acid, 2,5-dihydroxybenzoic acid and 2,6-dihydroxyacetophenone.
For plasma sample pretreatment, 50 µL of plasma was mixed with 50 µL of organic solvents (methanol/acetonitrile, v/v=1:1), and shaken for 15 min on an orbital shaker. The sample was centrifuged at 9168 g for 15 min, then the sediment was removed, and the supernatant was retained for further analysis.
NPELDI MS experiment of samples was conducted on the Bruker systems Autoflex (Time of Flight—MS) and Solarix 7.0T (Fourier Transform—Ion Cyclotron Resonance—MS, FT-ICR-MS) equipped with Nd:YAG lasers (with a wavelength of 355 nm). For detection, 1 µL of analyte solution (standard samples and plasma samples) was dropped on the plate and dried in air at room temperature, followed by spotting 1 µL of matrix slurry (ferric nanoparticles or organic matrices) and dried for further analysis. The positive ion mode was applied in all NPELDI MS experiments. Mass measurement was calibrated by standard small molecules in the mass range of 100–400 Da, including valine (Val, [Val+Na]+ at m/z 140.05), lysine (Lys, [Lys+Na]+ at m/z 169.05), glucose (Glc, [Glc+Na]+ at m/z 203.05) and sucrose (Suc, [Suc+Na]+ at m/z 365.26). The instrument parameters were optimised with the laser pulse frequency of 1 kHz, the number of laser shots for each analysis of 2000, the delay time of 200 ns and the acceleration voltage of 20 kV. For biomarker identification, the accurate MS was conducted on the FT-ICR-MS. Further, the related molecular formula was confirmed in the Bruker Compass Data Analysis software (V.5.0) according to the accurate MS (<10 ppm). The Human Metabolome Database (HMDB) was employed for searching the candidate metabolite.
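The <10 ppm criterion for formula assignment can be made concrete with a short calculation. The following is a minimal sketch, not the Bruker software's procedure: the function names are hypothetical, the theoretical value 203.0526 is the commonly quoted monoisotopic m/z of [Glc+Na]+, and the measured value is invented purely for illustration.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the absolute mass error is below the tolerance (default 10 ppm)."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < tol_ppm


# Illustrative values only: a hypothetical measurement of [Glc+Na]+.
theoretical = 203.0526
measured = 203.0535
print(round(ppm_error(measured, theoretical), 2), within_tolerance(measured, theoretical))
```

A candidate formula would be retained only if its theoretical m/z falls within the tolerance of the measured accurate mass.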
Data preprocessing of the raw mass spectra generated by NPELDI MS was performed using home-built code before further machine learning by MATLAB (version R2020a, The MathWorks, USA), including data binning, spectrum smoothing, baseline correction, peak detection and peak filtration.
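The five preprocessing steps named above can be illustrated with a simplified pipeline. This is a sketch under stated assumptions, not the home-built MATLAB code: the moving-average smoother, rolling-minimum baseline, local-maximum peak picker and all parameter values are stand-ins chosen for brevity, and every function name is hypothetical.

```python
def bin_spectrum(mz, intensity, lo, hi, width):
    """Data binning: sum intensities into fixed-width m/z bins covering [lo, hi)."""
    n_bins = int(round((hi - lo) / width))
    binned = [0.0] * n_bins
    for m, i in zip(mz, intensity):
        if lo <= m < hi:
            idx = min(int((m - lo) / width), n_bins - 1)
            binned[idx] += i
    return binned


def smooth(y, half_window=1):
    """Spectrum smoothing: centred moving average, truncated at the edges."""
    out = []
    for k in range(len(y)):
        window = y[max(0, k - half_window): k + half_window + 1]
        out.append(sum(window) / len(window))
    return out


def subtract_baseline(y, half_window=10):
    """Baseline correction: subtract the rolling minimum around each point."""
    out = []
    for k in range(len(y)):
        window = y[max(0, k - half_window): k + half_window + 1]
        out.append(y[k] - min(window))
    return out


def detect_peaks(y, min_height):
    """Peak detection and filtration: local maxima at or above min_height."""
    return [k for k in range(1, len(y) - 1)
            if y[k] > y[k - 1] and y[k] >= y[k + 1] and y[k] >= min_height]


def preprocess(mz, intensity, lo, hi, width, min_height):
    """Run all steps and return (bin-centre m/z, corrected intensity) per peak."""
    y = subtract_baseline(smooth(bin_spectrum(mz, intensity, lo, hi, width)))
    return [(lo + (k + 0.5) * width, y[k]) for k in detect_peaks(y, min_height)]
```

Applying `preprocess` to every spectrum with the same bin grid yields aligned feature vectors, which is what allows spectra from different samples to be stacked into a single data matrix.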
Ultra-performance liquid chromatography-MS analysis
For plasma sample pretreatment, 50 µL of plasma was mixed with 200 µL of organic solvents (methanol/acetonitrile, v/v=1:1), kept at −20°C for at least 2 hours, and centrifuged at 13 202 g for 20 min. The supernatant was dried in a vacuum centrifuge for at least 3 hours. The dry extracts were dissolved in 100 µL of solvents (methanol/water, v/v=3:7), vortexed for 3 min, and centrifuged at 13 202 g and 4°C for 15 min to remove insoluble debris. The supernatant was transferred to vials for further analysis. The quality control samples (QCs) were prepared by pooling 140 randomly picked samples from 70 GC patients and 70 non-GC participants. The pooled sample was then split into three aliquots that were stored at −80°C for further analysis.
Chromatographic analysis was performed on a Vanquish UHPLC/Q Exactive plus (Thermo Scientific, USA) with an ACQUITY UPLC HSS T3 column (100×2.1 mm, 1.7 µm, Waters), and mobile phase A (ultrapure water with 0.1% formic acid) and B (acetonitrile with 0.1% formic acid) were included. The elution gradient was as follows: mobile phase A/B 99/1~0/100 (v/v) in 12 min, hold in 0/100 (v/v) for 12–13 min. The injection was 1 µL at a flow rate of 0.4 mL/min with column temperature at 40°C. MS data were collected on a Q Exactive plus Mass spectrometer (Thermofisher), with HESI source in both positive and negative modes. The scan mode was set as DDA mode with full scan range of 67–1000 Da, full scan resolution of 70 000. The spray voltage was set as 3.2 kV for positive mode and 2.8 kV for negative mode, the capillary temperature was 320°C and the s-lens radio frequency (RF) level was 50 V.
The raw data were processed by TraceFinder (Thermo Scientific, USA) to generate peak table. The targeted screening method with m/z mass tolerance of 5.00 ppm and retention time window override of 15 s was applied. For peak detection parameters, the ICIS algorithm was set to detect highest peak, with smoothing of 7, area noise factor of 5, peak noise factor of 10, baseline window of 40, min peak height (S/N) of 3, noise method of repetitive, min peak width of 3, multiplet resolution of 10, area tail extension of 5 and area scan window of 0.
Model construction and statistical analysis
For diagnosis model construction, machine learning of the PMFs was conducted on a data mining toolbox, Orange (V.3.25.0, the Bioinformatics Lab at University of Ljubljana, Slovenia).31 Stratified fivefold cross-validation was employed to train neural network (NN), ridge regression (RR), lasso regression (LR), support vector machine (SVM) and random forest (RF) models in the discovery dataset. For validation, all the machine learning models were tested in the independent external verification dataset to assess overfitting. The diagnosis was treated as a binary classification, and all included machine learning models output a probability from 0 to 1 that a participant had cancer.
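Stratified cross-validation keeps the GC to non-GC ratio roughly constant across folds. The sketch below shows one way to assign samples to stratified folds in plain Python; it is not the Orange toolbox's implementation, and the function name, the round-robin assignment and the seed are assumptions made for illustration. The class sizes in the usage lines are the discovery dataset counts reported in this study.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Return k lists of sample indices, each preserving the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Discovery dataset: 528 GC patients (label 1) and 629 non-GC participants (label 0).
labels = [1] * 528 + [0] * 629
folds = stratified_kfold(labels)
for held_out in folds:
    held_out_set = set(held_out)
    train = [i for i in range(len(labels)) if i not in held_out_set]
    # A model would be fitted on `train` and scored on `held_out` here.
```

Each fold serves once as the held-out set while the remaining four folds are used for training.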
For prognosis model construction, in the discovery dataset, we first performed the univariate Cox analysis of all the 300 m/z signals in PMFs to determine their prognostic values and remove excessive noise, and 41 m/z signals with p<0.2 were retained for further analysis. Then, a multivariate Cox regression was performed, and 4 m/z signals with p<0.05 were selected and applied to generate a linear regression predictor, plasma metabolic prognosis (PMP) score. Typically, a p<0.2 was set as the cut-off to determine prognostic signals in univariate analysis, while a p<0.05 was set in multivariate analysis, consistent with previous reports.32–34 For validation, the PMP score of each sample in both the discovery dataset and independent external verification dataset was calculated, and all the samples were labelled as high score or low score by the median score in the discovery dataset. Further, Kaplan-Meier survival analysis and log rank test were conducted to evaluate the efficiency of the PMP score. All the analyses for prognosis were performed in R (V.4.1.1) with the packages ‘survival’ and ‘timeROC’.
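The scoring and labelling step can be sketched as follows, assuming the PMP score is the usual Cox linear predictor (the sum of coefficient times signal intensity). The coefficients, signal names and scores are invented for illustration, and the choice to label scores strictly above the cut-off as high is an assumption, since the handling of ties is not stated.

```python
from statistics import median


def pmp_score(intensities, coefficients):
    """Linear predictor: sum of Cox coefficient x intensity over selected signals."""
    return sum(coefficients[mz] * intensities[mz] for mz in coefficients)


def label_by_cutoff(scores, cutoff):
    """Label each score 'high' if strictly above the cut-off, else 'low'."""
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients for two selected signals (the study selected four).
coefficients = {"mz_a": 0.5, "mz_b": -0.2}
sample = {"mz_a": 2.0, "mz_b": 5.0, "mz_unused": 9.9}
score = pmp_score(sample, coefficients)

# The cut-off is fixed on the discovery dataset and reused for verification.
discovery_scores = [0.1, 0.4, 0.9]
cutoff = median(discovery_scores)
verification_labels = label_by_cutoff([0.3, 0.5], cutoff)
```

Fixing the cut-off on the discovery dataset, rather than recomputing it per dataset, keeps the verification labels independent of the verification data.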
The cosine similarity score between two spectra was calculated in MATLAB with the function ‘squareform’ according to the often-used cosine correlation method, using the data matrix of PMFs without any transformation or scaling.35 The statistical difference between two receiver operating characteristic (ROC) curves was tested based on DeLong’s method in MATLAB with the function ‘DeLongUI-master’.36 The analysis of similarities (ANOSIM) was conducted in R (V.4.1.1) with the package of ‘anosim’. The unsupervised clustering analyses, including t-distributed stochastic neighbour embedding (t-SNE), principal component analysis (PCA), and uniform manifold approximation and projection (UMAP), were conducted in Python (V.3.8) using the packages ‘scikit-learn’ and ‘umap’. Other statistical analysis in this work was performed in SPSS software to calculate the p value for statistical demonstration. The significance level was set at 0.05 for all tests.
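For reference, the cosine similarity between two fingerprints is the dot product of the intensity vectors divided by the product of their Euclidean norms. A minimal Python sketch of that definition follows (the study itself used MATLAB; the function name and the toy vectors are illustrative).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Two toy fingerprints that differ only by overall intensity scale.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Because the score depends only on the direction of the vectors, two spectra that differ by a constant intensity factor score close to 1, which makes the measure insensitive to overall signal strength.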
Results
Study design and baseline information
As shown in figure 1A, nearly 30 000 subjects were considered for inclusion in this study. According to the well-defined criteria, 1944 participants were finally enrolled in retrospective cohort (962 GC patients with complete follow-up information and 982 non-GC participants) and 264 participants (125 GC patients and 139 non-GC participants) were enrolled in prospective cohort. All GC patients were confirmed by pathological examination, and all non-GC participants were confirmed by medical examination results and consulting medical history. Specifically, the discovery dataset contained 1157 participants (including 528 GC patients and 629 non-GC participants) from 5 centres, while 787 participants (including 434 GC patients and 353 non-GC participants) from the other 2 independent centres were defined as the independent external verification dataset. As shown in tables 1–3, there were no significant differences in age, sex, smoking history and drinking history between GC patients and non-GC participants in the discovery, independent external verification and prospective validation datasets. For more detailed clinical information, please refer to online supplemental tables S1–S6.
Then, microarrayed NPELDI-MS was employed to construct PMFs. A trace of pretreated plasma (0.5 µL of each sample) was analysed under laser to acquire raw mass spectra, and PMFs were constructed by data preprocessing (figure 1B). Besides, ultra-performance liquid chromatography (UPLC)-MS was applied in prospective cohort for validation. Moreover, analysis of PMFs was conducted for diagnosis and prognosis of GC. Specifically, the feature selection and model building were performed for GC diagnosis with the aid of machine learning, and univariate and multivariate Cox regression were executed successively for prognostic model construction (figure 1C).
Construction of gastric cancer-associated PMFs
The alteration of metabolism in patient’s blood caused by metabolic reprogramming of cancer cells has been widely reviewed in previous studies,13 26 revealing that metabolomic information in blood may reflect the existence of insidious tumours and potential prognostic effect of cancer patients. Therefore, we constructed a gastric cancer-associated plasma metabolic database by NPELDI-MS, showing the advantages of high-throughput, desirable reproducibility and limited centre-specific effects.
To achieve high throughput, NPELDI-MS featured little sample pretreatment and fast analytical speed, affording an overall experimental time <2 min per sample. Ferric nanoparticles (figure 2A) synthesised by a modified solvothermal method were selected as the matrix for NPELDI-MS, displaying strong photon absorption in the ultraviolet-visible region (online supplemental figure S1A). Due to the designed nanoscale surface roughness, the nanoparticles could directly trap small molecules on chip and thus enrich metabolites for solid-phase detection, allowing little sample pretreatment (~1.5 min per sample) before NPELDI-MS detection. The trapping of metabolites by nanoparticles could be validated by the elemental mapping analysis of the nanoparticle-glucose hybrids, which showed a significantly higher carbon signal on the nanoparticles as compared with the background (p<0.001, figure 2B). Importantly, fast analytical speed of 2 s per sample (with 2000 laser shots at a pulse frequency of 1000 Hz) and throughput of 384 samples per chip could be achieved by NPELDI-MS microarray automatically (figure 2C). Specifically, the detection process of 1944 plasma samples could be done within 70 hours and ~124 000 data points were recorded for each sample. In the typical MS results of plasma samples from GC group and non-GC group, strong m/z signals were obtained in the low mass range of 100–400 Da, accounting for ~95.0% of total ion counts in the range of 100–1000 Da (figure 2D, online supplemental figure S2). Further, the PMFs of all samples in retrospective cohort were constructed by extracting 300 m/z signals from raw MS results through data preprocessing, which served as the basis for further feature selection and model building for diagnosis and prognosis (figure 2E, online supplemental table S7).
To determine the detection reproducibility, QC processes were developed both during and after NPELDI-MS. To monitor the stability of NPELDI-MS over the long-term sample test, standard samples containing four metabolites (valine (Val), lysine (Lys), glucose (Glc) and sucrose (Suc)) were applied as QC samples and tested at regular intervals during the whole sample test procedure. As a result, the coefficients of variation (CVs) of intensities of these metabolites were 13.04%–19.41%, featuring desirable system variation of NPELDI-MS for metabolic profiling towards large-scale clinical applications (online supplemental table S8). Specifically, the satisfactory reproducibility of NPELDI-MS could be attributed to the homogeneous morphology of nanoparticle-analyte cocrystallisation compared with organic matrices (online supplemental figure S1B–C). Afterwards, the cosine correlation algorithm was introduced to evaluate the similarity between metabolic fingerprints within the same group and characterise the detection quality. Consequently, over 95% of samples showed similarity scores over 0.9 in both GC group and non-GC group, demonstrating the high quality of PMFs and their reliability for further diagnostic and prognostic analysis (figure 2F).
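The coefficient of variation reported for the QC standards is the standard deviation of repeated intensity measurements divided by their mean, expressed as a percentage. A minimal sketch, assuming the sample standard deviation is used (the text does not say whether the sample or population form was applied) and with invented intensity values:

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m * 100.0


# Hypothetical repeated intensity readings of one QC metabolite.
qc_intensities = [90.0, 100.0, 110.0]
print(coefficient_of_variation(qc_intensities))
```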
Notably, all the plasma samples in retrospective cohort were collected from seven centres and the centre-specific effects of PMFs were characterised. We performed three typical unsupervised clustering algorithms on the PMFs, including t-SNE, PCA and UMAP, to ascertain whether there were any significant centre-specific effects. As the visualisation map showed, the samples did not cluster to an appreciable degree by centre in t-SNE or the other clustering analyses (figure 2G–I), which was consistent with the result of ANOSIM (p=0.633, online supplemental figure S3) and suggested that observed patterns of PMFs are not artefacts of the methods used by each centre. Importantly, the PMFs also displayed a certain degree of overlap between GC group and non-GC group in the above data visualisations, indicating the necessity of introducing advanced algorithms to interpret data for enhanced characterisation performance (online supplemental figure S4).
Machine learning in PMFs towards GC diagnosis
We conducted machine learning to determine the value of PMFs (consisting of 300 signals as listed in online supplemental table S7) as a diagnostic tool to distinguish GC group from non-GC group in retrospective cohort. The discovery dataset contained 1157 participants (including 528 GC patients and 629 non-GC participants) from 5 centres, while 787 participants (including 434 GC patients and 353 non-GC participants) from other 2 independent centres were defined as the independent external verification dataset to confirm the generalisation ability of the models (online supplemental table S9).
Specifically, five commonly used machine learning algorithms, including NN, RR, LR, SVM and RF, were applied to interpret PMFs towards diagnostic application. The fivefold cross-validation method was used for model training. Consequently, the scores, calculated from the probability of each sample being diagnosed as GC patients in machine learning models, were significantly higher (p<0.05) in GC group (average score of 0.638–0.961) than non-GC group (average score of 0.035–0.557) in each fold of discovery dataset and independent external verification dataset, reflecting the distinction of metabolic information between GC compartments and non-GC compartments and suggesting potential diagnostic value of PMFs (figure 3A–B, online supplemental table S10).
Specifically, the built machine learning models based on the PMFs achieved desirable sensitivities of 76.9%–94.7%, specificities of 73.6%–96.3%, accuracies of 81.4%–95.6% and areas under the curve (AUCs) of 0.909–0.988 in the discovery dataset (figure 3C, online supplemental table S11). Importantly, consistent results (sensitivities of 71.0%–91.5%, specificities of 69.4%–92.9%, accuracies of 77.3%–92.1% and AUCs of 0.862–0.979) could be obtained in the independent external verification dataset (figure 3D). Further, precision-recall curve (PRC) analysis was performed on the diagnostic performances of machine learning models based on the PMFs, and the areas under PRC for all models in both discovery and independent external verification dataset were higher than 0.874, which agreed with the ROC analysis and indicated that PMFs-based metabolomic information exhibited excellent performance for GC diagnosis (figure 3E,F).
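For readers less familiar with these metrics, the sketch below computes sensitivity, specificity and accuracy from predicted probabilities at a fixed threshold, and the AUC as the probability that a randomly chosen GC sample scores higher than a randomly chosen non-GC sample (ties counted as one half). The 0.5 threshold, the function names and the toy data are assumptions; the operating threshold used in the study is not stated here.

```python
def confusion_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity and accuracy at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two GC samples (1) and two non-GC samples (0).
y_true = [1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.1]
metrics = confusion_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
```

Unlike sensitivity and specificity, the AUC does not depend on the chosen threshold, which is why it is convenient for comparing models across datasets.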
Identification of the metabolic biomarker panel
In order to enhance the feasibility of NPELDI-MS for GC diagnosis and facilitate further pathway analysis, we constructed a metabolic panel from the data matrix of PMFs in retrospective cohort.
The stepwise statistical screening criteria were applied to reduce the number of biomarkers and thus construct the metabolic panel. First, the coefficients of signals in LR model, LR scores, were applied as filter for feature selection and the threshold was optimised as 0.4 according to the AUCs in both discovery and verification dataset (online supplemental figure S5). Subsequently, the differential comparison of PMFs was conducted by comparing GC compartments with non-GC compartments in the discovery dataset, and features with p<0.05 were retained. Meanwhile, the mean intensities of signals were taken into consideration to enable detection and quantification, and high-abundance features (mean intensity >100) were selected (figure 4A). Further, a metabolic panel consisting of 21 metabolites was identified in selected features, including formic acid (FA), γ-butyrolactone (GBL), alanine (Ala), acetic acid (Ac), succinic acid (SA), acetoacetic acid (AcAc), glycolic acid (GA), creatinine (Cre), 2-aminoacrylic acid (2-AA), pyruvic acid (PA), lysine (Lys), valine (Val), succinylacetone (SucAce), 4-acetamidobutanoic acid (4-AABA), glutamine (Glu), norcotinine (Nor), ornithine (Orn), pyridoxamine (Pyr), urocanic acid (UA), O-phosphothreonine (OPT) and syringol sulfate (SS), based on Fourier transform ion cyclotron resonance MS (FT-ICR MS) and HMDB (online supplemental table S12). Specifically, 11 of them (FA, GBL, Ac, SA, GA, PA, Lys, SucAce, Pyr, UA and SS) were upregulated (p<0.05), while the other 10 metabolites (Ala, AcAc, Cre, 2-AA, Val, 4-AABA, Glu, Nor, Orn and OPT) were downregulated (p<0.05) in GC group compared with non-GC group (figure 4B).
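The three screening criteria can be expressed as a single filter over per-signal summary statistics. The sketch below is illustrative only: the record layout and feature values are hypothetical, and applying the 0.4 threshold to the absolute value of the LR coefficient is an assumption, since the sign convention is not specified in the text.

```python
def screen_features(features, coef_min=0.4, p_max=0.05, intensity_min=100.0):
    """Keep signals passing the LR score, significance and abundance filters."""
    selected = []
    for f in features:
        passes_lr = abs(f["lr_coef"]) > coef_min          # assumed: absolute value
        passes_p = f["p_value"] < p_max                   # GC vs non-GC comparison
        passes_abundance = f["mean_intensity"] > intensity_min
        if passes_lr and passes_p and passes_abundance:
            selected.append(f["mz"])
    return selected


# Hypothetical summary records for four m/z signals.
features = [
    {"mz": "signal_1", "lr_coef": 0.62, "p_value": 0.001, "mean_intensity": 850.0},
    {"mz": "signal_2", "lr_coef": 0.10, "p_value": 0.001, "mean_intensity": 900.0},
    {"mz": "signal_3", "lr_coef": -0.55, "p_value": 0.200, "mean_intensity": 400.0},
    {"mz": "signal_4", "lr_coef": -0.48, "p_value": 0.010, "mean_intensity": 60.0},
]
panel_candidates = screen_features(features)
```

In this toy input only the first signal passes all three filters; the others fail on the LR score, the p value and the mean intensity, respectively.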
Particularly, we performed pathway analysis in MetaboAnalyst (http://www.metaboanalyst.ca/) to reveal the biological relevance of these 21 metabolites and determine molecular mechanisms of GC. As a result, there were seven metabolic pathways associated with GC with the pathway impact >0.1 and hit number (the number of matched metabolites in the pathway) ≥1, including (1) synthesis and degradation of ketone bodies, (2) histidine metabolism, (3) pyruvate metabolism, (4) glycolysis/gluconeogenesis, (5) arginine and proline metabolism, (6) alanine, aspartate and glutamate metabolism and (7) butanoate metabolism (figure 4C, online supplemental table S13).
Consequently, the machine learning models (NN, LR, RR, RF and SVM) built by the metabolic panel exhibited desirable diagnostic efficiency. Sensitivities of 78.2%–89.6%, specificities of 80.6%–93.3%, accuracies of 81.2%–91.6% and AUCs of 0.921–0.971 were achieved in discovery dataset (figure 4D), while corresponding results could also be obtained in independent external verification dataset with sensitivities of 78.8%–90.1%, specificities of 75.6%–89.2%, accuracies of 82.6%–87.5% and AUCs of 0.907–0.940 (figure 4E, online supplemental table S14).
To demonstrate the clinical benefit of established metabolic biomarker panel for early diagnosis of GC, we trained and validated machine learning models when considering GC patients in stage I and stages I–II only. Considering the imbalanced sample size between GC group and non-GC group if only GC patients in stage I or stages I-II were included, propensity score matching based on gender and age was conducted to select samples with match tolerance of 0.02. Specifically, five machine learning algorithms, including NN, RR, LR, SVM and RF, were applied. The fivefold cross-validation method was used for model training in discovery dataset. As a result, when considering patients in stages I–II only, the AUCs of machine learning models achieved 0.942–0.978 in the discovery dataset and 0.871–0.926 in the independent external verification dataset (online supplemental figure S6A,B, online supplemental table S15). When considering patients in stage I only, the machine learning models demonstrated AUCs of 0.915–0.956 in the discovery dataset and 0.902–0.929 in the independent external verification dataset (online supplemental figure S6C,D, online supplemental table S16).
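The matching step can be sketched as follows. This is a simplified illustration, assuming that propensity scores have already been estimated from gender and age (for example by logistic regression) and that matching is 1:1, greedy and without replacement; the text specifies only the match tolerance of 0.02, so the matching ratio, the algorithm and all identifiers and scores here are assumptions.

```python
def match_with_caliper(case_scores, control_scores, tolerance=0.02):
    """Greedy 1:1 nearest-neighbour matching on propensity score within a tolerance.

    case_scores / control_scores: dict mapping sample ID -> propensity score.
    Returns a list of (case_id, control_id) pairs; unmatched samples are dropped.
    """
    available = dict(control_scores)  # controls not yet used
    pairs = []
    for case_id, case_ps in sorted(case_scores.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score scale.
        best_id = min(available, key=lambda cid: abs(available[cid] - case_ps))
        if abs(available[best_id] - case_ps) <= tolerance:
            pairs.append((case_id, best_id))
            del available[best_id]  # matching without replacement
    return pairs


# Invented propensity scores, not study data.
cases = {"GC01": 0.30, "GC02": 0.52, "GC03": 0.80}
controls = {"N01": 0.31, "N02": 0.51, "N03": 0.55, "N04": 0.70}

pairs = match_with_caliper(cases, controls)
print(pairs)
```

In this toy example GC03 remains unmatched because no control lies within 0.02 of its score, which is how a tolerance trades sample size for balance between groups.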
Performance of Met-NN compared with traditional blood-based biomarker tests
The machine learning models (NN, LR, RR, RF and SVM) built by the metabolic panel outperformed the traditional blood-based biomarker tests (p<0.05, by DeLong test). Specifically, the Met-NN (the NN model built by the metabolic panel) exhibited the best diagnostic AUCs (0.971 in the discovery dataset and 0.940 in the independent external verification dataset, figure 5A–B, online supplemental table S14), as compared with CA242, CA724, AFP, CEA, CA125 and CA199 (0.487–0.702 in the discovery dataset and 0.443–0.723 in the independent external verification dataset, online supplemental table S17). The distribution of the traditional blood-based biomarkers in the discovery and independent external verification datasets is shown in online supplemental table S7.
In addition, the detection rates of Met-NN in the cancer group (81.6%–96.4%) were higher than those of traditional blood-based biomarkers (0.10%–39.4%) at specificities of 80%, 90% and 98% (figure 5C), while the false detection rates (1.75%–35.1%) were lower than those of traditional biomarkers (47.8%–100.0%) at sensitivities of 80%, 90% and 98% (figure 5D). As for the comparison between the metabolic biomarker panel and the traditional blood-based biomarker panel, we further built machine learning models considering all the protein biomarkers together, showing AUCs of 0.432–0.759 in the discovery dataset and 0.364–0.686 in the independent external verification dataset (online supplemental figure S8, online supplemental table S18). In comparison, the established metabolic biomarker panel exhibited significantly higher diagnostic performance than the traditional blood-based biomarker panel (p<0.05).
Notably, to demonstrate the universal diagnostic performance, we calculated the detection sensitivity of Met-NN in different GC populations. The overall sensitivity of Met-NN was 89.6% in the discovery dataset and 86.2% in the independent external verification dataset. For gender and age, the sensitivity of Met-NN in female patients (91.47%) was higher than that in male patients (86.79%), and increased with age from 85.75% to 91.41% (figure 5E–F, online supplemental table S19). Considering tumour progression, including size, grade, metastasis and stage, the Met-NN displayed satisfactory sensitivity over 85.43% in all groups, suggesting a wide application scenario for Met-NN (figure 5G–J, online supplemental table S19). Particularly, for early stage GC patients, the sensitivity reached 87.25% for stage I and 90.31% for stage II, suggesting the general applicability of the metabolic panel for early stage GC detection. Additionally, the clinical significance of the Met-NN was estimated by decision curve analysis (DCA). According to the DCA curve, the Met-NN exhibited positive net benefit for threshold probabilities above 7% compared with intervention for all patients or none of the patients (online supplemental figure S9). These findings suggest that Met-NN might offer more clinical benefit for all populations with superior diagnostic performance (p<0.05, by DeLong test), as compared with clinically available blood-based biomarker tests.
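A detection rate at a fixed specificity can be obtained by choosing the score cut-off from the non-GC group and then counting the GC samples above it. The sketch below illustrates this under stated assumptions: the scores are invented, the cut-off is taken as an empirical quantile of the control scores, and a sample is called positive when its score is strictly above the cut-off. The study does not describe its exact procedure, so this is one plausible reading rather than the authors' implementation.

```python
import math


def sensitivity_at_specificity(case_scores, control_scores, target_specificity):
    """Detection rate in cases when the cut-off keeps the target fraction of controls negative."""
    ordered = sorted(control_scores)
    n = len(ordered)
    # Number of controls that must fall at or below the cut-off; round() guards
    # against floating-point noise in the product before taking the ceiling.
    k = math.ceil(round(target_specificity * n, 9))
    threshold = ordered[k - 1] if k > 0 else float("-inf")
    detected = sum(1 for s in case_scores if s > threshold)
    return detected / len(case_scores), threshold


# Invented model scores, not study data.
control_scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.70]
case_scores = [0.15, 0.45, 0.50, 0.60, 0.65, 0.75, 0.80, 0.85, 0.90, 0.95]

rate80, cut80 = sensitivity_at_specificity(case_scores, control_scores, 0.80)
rate90, cut90 = sensitivity_at_specificity(case_scores, control_scores, 0.90)
print(rate80, cut80, rate90, cut90)
```

Raising the required specificity moves the cut-off upwards, so the detection rate can only stay the same or fall, which is the trade-off summarised in figure 5C.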
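The net benefit used in decision curve analysis is conventionally defined as TP/N − FP/N × pt/(1 − pt), where pt is the threshold probability, and is compared with the strategies of intervening in all patients or in none (net benefit 0). A minimal sketch with invented labels and predicted probabilities follows; it illustrates the standard definition and is not the code used to produce online supplemental figure S9.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of intervening when the predicted probability is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - fp / n * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening in every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented example data, not study data (1 = GC, 0 = non-GC).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05]

nb_model = net_benefit(labels, probabilities, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
nb_none = 0.0  # intervening in nobody yields no benefit and no harm
print(nb_model, nb_all, nb_none)
```

A model is considered clinically useful at a given threshold probability when its net benefit exceeds both reference strategies, as is the case in this toy example.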
Besides, because NNs are not inherently interpretable, a clear formula listing individual metabolites with individually assigned weights cannot be given.11 To address this problem, we trained another logistic regression model for GC diagnosis based on the established metabolic biomarker panel. The constructed equation takes the standard logistic regression form
$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{21} \beta_i x_i,$$
where p is the prediction probability of GC, $x_i$ is the intensity of each metabolite and $\beta_i$ is its weight (online supplemental table S20; the symbols $\beta_0$, $\beta_i$ and $x_i$ are generic notation assumed here). As a result, the logistic regression model achieved an AUC of 0.965 (95% CI 0.955 to 0.974) in the discovery dataset and 0.927 (95% CI 0.908 to 0.946) in the independent external verification dataset for the diagnosis of GC (online supplemental figure S10).
Prospective validation of the biomarker panel via NPELDI-MS and UPLC-MS
The constructed metabolic biomarker panel was validated in a prospective validation cohort of the lead centre based on both NPELDI-MS and UPLC-MS. A total of 264 subjects, including 125 GC patients and 139 non-GC participants, underwent endoscopy, and had plasma collected for metabolic biomarker detection and traditional blood-based biomarker tests (including CA242, CA724, AFP, CEA, CA125 and CA199).
Based on NPELDI-MS platform as described in the retrospective cohort, the PMFs of the prospective validation cohort were obtained, and the metabolic biomarker panel was extracted for validation. Consequently, the biomarker panel distinguished GC patients from non-GC participants with AUCs of 0.855–0.918 with the prespecified diagnostic machine learning model (figure 6A, online supplemental table S21). Among the described five models (NN, LR, RR, RF and SVM), the Met-NN afforded the highest AUC of 0.918 (95% CI 0.879 to 0.956), which also showed significant superiority to traditional blood-based biomarker tests with AUCs of 0.328–0.752 (p<0.05, by DeLong test, figure 6B, online supplemental table S22).
Notably, to further validate the metabolic panel, we applied UPLC-MS, the mainstream method for metabolic profiling of human biofluids currently, to detect the biomarker panel in the prospective cohort, and six metabolites, including Ala, 4-AABA, Glu, Orn, Cre and Val, could be robustly detected with CVs<30% in QC samples (online supplemental table S23). Through the fivefold cross-validation method, five machine learning models (NN, LR, RR, RF and SVM) were trained to validate the diagnostic efficiency of the metabolic panel, showing desirable AUCs of 0.856–0.916 (figure 6C, online supplemental table S24). Specifically, the scores generated by machine learning models were also significantly higher in the GC group than in the non-GC group (p<0.05, figure 6D).
In general, the constructed metabolic biomarker panel exhibited high diagnostic efficiency (with sensitivity of 83.20%–88.80%, specificity of 85.61%–88.49% and accuracy of 84.47%–88.64%), which outperformed traditional blood-based biomarker tests (with specificity over 75%, but sensitivity lower than 15% and accuracy lower than 50%, figure 6E, online supplemental table S25). Specifically, the thresholds of protein biomarkers were set according to clinical instruction, with 20 U/mL for CA242, 6.9 U/mL for CA724, 8.1 ng/mL for AFP, 5 ng/mL for CEA, 35 U/mL for CA125 and 37 U/mL for CA199. Notably, whether the 21-metabolite panel or the 6-metabolite panel is applied in future will depend on the analytical platform. These results further validated the desirable diagnostic value of the proposed metabolic biomarker panel and its advantages compared with traditional blood-based biomarker tests.
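The clinical thresholds listed above can be applied to a set of measurements as in the following sketch. The cut-off values and units are those given in the text; the function name, the example measurements and the rule that a marker is positive only when strictly above its cut-off are assumptions made for illustration.

```python
# Cut-offs as stated in the text: marker -> (threshold, unit).
CLINICAL_CUTOFFS = {
    "CA242": (20.0, "U/mL"),
    "CA724": (6.9, "U/mL"),
    "AFP": (8.1, "ng/mL"),
    "CEA": (5.0, "ng/mL"),
    "CA125": (35.0, "U/mL"),
    "CA199": (37.0, "U/mL"),
}


def positive_markers(measurements):
    """Return the markers whose measured value exceeds the clinical cut-off."""
    flagged = []
    for name, value in measurements.items():
        cutoff, _unit = CLINICAL_CUTOFFS[name]
        if value > cutoff:  # assumption: values equal to the cut-off count as negative
            flagged.append(name)
    return sorted(flagged)


# Invented measurements for one hypothetical participant.
patient = {"CA242": 12.0, "CA724": 9.4, "AFP": 3.2, "CEA": 5.0, "CA125": 18.0, "CA199": 41.5}
flags = positive_markers(patient)
print(flags)
```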
Construction of the prognostic model for GC survival prediction
Moreover, we investigated the potential of PMFs for prediction of prognosis in GC. Patients who were followed up for more than 3 months and had complete survival information were included in further analysis. Similar to the diagnostic analysis, the discovery dataset (512 observations with 131 deaths from 2 centres) and independent external verification dataset (423 observations with 127 deaths from one centre) were set in the retrospective cohort, with a median follow-up time of 50 months (ranging from 3 to 148 months).
In the discovery dataset, univariate Cox regression (UniCox) was implemented to reduce the dimensionality and select metabolic features related to the OS of GC (online supplemental table S26). Then, a multivariate Cox regression (MultiCox) model was built to construct a PMP scoring system with a four-metabolite panel (figure 7A). These four metabolites could be identified as acrylamide, glycine, proline and Val based on FT-ICR MS and HMDB (online supplemental table S27). Subsequently, all the patients were divided into two groups, a high-risk group and a low-risk group, according to the median PMP score in the discovery dataset. Specifically, the high-risk group (PMP score >−2.09) had 256 observations with 79 events in the discovery dataset and 206 observations with 70 events in the independent external verification dataset, while the low-risk group (PMP score <−2.09) had 256 observations with 52 events in the discovery dataset and 237 observations with 77 events in the independent external verification dataset (figure 7B).
Further, Kaplan-Meier curves were generated to evaluate the prognostic prediction efficiency of the PMP score. Consequently, the median survival time in the low-score group was significantly longer than that in the high-score group (p=0.0089 in the discovery dataset and p=0.0191 in the independent external verification dataset, by log-rank test, figure 7C–D), illustrating the desirable prognostic efficiency of the PMP score. Specifically, time-dependent ROC curve analysis was performed to evaluate and compare the predictive ability of the PMP score with that of the conventional TNM staging system.37 As a result, the PMP score displayed comparable performance with the TNM staging system in predicting 5–9 year OS of GC patients (p>0.05, by DeLong test, online supplemental table S28).
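A risk score of this kind is commonly computed as the linear predictor of the multivariate Cox model, that is, the sum of each metabolite intensity multiplied by its regression coefficient, followed by a split at the median score of the discovery dataset. The sketch below illustrates that idea only. The coefficients and intensities are hypothetical placeholders (the fitted values are not given in the text), and treating the PMP score as the Cox linear predictor is an assumption.

```python
from statistics import median

# Hypothetical coefficients for the four-metabolite panel; NOT the fitted values.
HYPOTHETICAL_COEFFICIENTS = {
    "acrylamide": 0.8,
    "glycine": -1.2,
    "proline": -0.9,
    "valine": -0.6,
}


def pmp_score(intensities, coefficients=HYPOTHETICAL_COEFFICIENTS):
    """Linear combination of metabolite intensities and Cox coefficients."""
    return sum(coefficients[name] * intensities[name] for name in coefficients)


def assign_risk_groups(scores, cutoff):
    """Scores above the cut-off are labelled high-risk, the rest low-risk."""
    return {pid: ("high-risk" if s > cutoff else "low-risk") for pid, s in scores.items()}


# Invented, normalised intensities for four hypothetical patients.
discovery = {
    "P1": {"acrylamide": 1.0, "glycine": 1.0, "proline": 1.0, "valine": 1.0},
    "P2": {"acrylamide": 0.5, "glycine": 1.5, "proline": 1.0, "valine": 1.0},
    "P3": {"acrylamide": 2.0, "glycine": 1.0, "proline": 1.0, "valine": 2.0},
    "P4": {"acrylamide": 1.0, "glycine": 2.0, "proline": 1.0, "valine": 0.5},
}

scores = {pid: pmp_score(x) for pid, x in discovery.items()}
cutoff = median(scores.values())  # in the study, the discovery median was -2.09
groups = assign_risk_groups(scores, cutoff)
print(scores, cutoff, groups)
```

The cut-off derived from the discovery data would then be applied unchanged to the verification dataset, which is why the two risk groups there need not be of equal size.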
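The Kaplan-Meier estimate underlying such curves multiplies, at each event time, the previous survival probability by the fraction of at-risk patients who survive that time. A minimal sketch with invented follow-up data is given here; it shows the standard product-limit estimator and the reading of a median survival time from it, and does not reproduce the study data or the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit estimator.

    times: follow-up time per patient (e.g. months).
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Invented follow-up data for six hypothetical patients.
times = [6, 12, 12, 20, 30, 48]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
print(curve, median_survival(curve))
```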
Discussion
With millions of new cases and deaths annually, GC remains a globally important disease, which highlights the need for early diagnosis and predictive prognosis of GC to improve cancer control and clinical practice. Although gastroscopy coupled with histopathology is the gold standard for screening and diagnosis of GC, it has the major disadvantages of an invasive nature and poor population compliance (down to 33.5%).38 In contrast, the blood test is expected to have better population compliance (higher than 90%)39 due to increased accessibility and minimal invasiveness.
Specifically, several categories of blood-based molecular biomarkers have been highlighted in diagnostics and therapeutic intervention for GC, ranging from the genomic and proteomic to the metabolomic level. For genetic biomarkers, numerous nucleic acids circulating in the blood have been reported as diagnostic biomarkers for GC.40–43 Particularly, the most extensive study (comprising 5248 subjects) for evaluation of circulating nucleic acids was launched by So et al, in which a serum microRNA biomarker panel was developed and validated for GC detection with an AUC of 0.93.41 For proteomic biomarkers as upstream molecules in the pathway, the clinical applications of cancer-related antigens (including CEA, AFP, CA199 and CA125)44 45 have been widely studied, but are still limited by their low sensitivity and accuracy (down to ~30%). In parallel, metabolic biomarkers, as the downstream products, are expected to provide real-time phenotype information of biosystems and to be developed as highly efficient targets for therapeutic intervention.46 47 Metabolic biomarkers have shown promising proof-of-concept results for GC diagnosis in previous studies, but the evidence remains largely inconclusive owing to traditional analytical tools and preliminary study design.12–20
For analytical tools, traditionally, detection of metabolites mainly relied on UPLC-MS and nuclear magnetic resonance spectroscopy, which typically require hours of sample pretreatment or submillilitre of sample consumption to overcome the sample complexity and enrich the target metabolites.24 48 In contrast, NPELDI-MS featured little sample pretreatment (~min) due to the enhanced detection sensitivity through desirable photo-thermal properties and in-situ preconcentration via nanoscale surface roughness of nanoparticles.49 50 Besides, the fast analytical speed of 2 s per sample coupled with throughput of 384 samples per chip can be achieved by microarray design, demonstrating the application potency of discovering metabolic biomarkers in large-scale cohort.
Study design plays a fundamental role in the development of a high-performance biomarker panel and thus provides precise insights into disease phenotype, regarding sufficient cohort size, advanced data interpretation methods and multiple phases of biomarker development.51 For cohort size, previous works on metabolic analysis of GC were mainly carried out with small-scale single-centre cohorts (n=69–600).12–16 In contrast, a total of 2208 subjects (including n=1944 in the retrospective cohort and n=264 in the prospective cohort) were enrolled in this study with well-defined inclusion and exclusion criteria, and their PMFs were recorded by NPELDI-MS, which combined the advantages of high throughput, desirable reproducibility and limited centre-specific effects required for analysis of a large-scale multicentre cohort. Subsequently, with machine learning as the advanced data interpretation method for feature selection and model building, the metabolic panel was constructed in the discovery dataset (with AUCs of 0.921–0.971) and robustly verified in the independent external verification dataset (with AUCs of 0.907–0.940). Moreover, in addition to independent external verification in the retrospective cohort, the validation phase of the metabolic biomarker panel was designed and conducted in a prospective cohort by both NPELDI-MS and traditional UPLC-MS, which afforded consistent AUCs of 0.855–0.918 and 0.856–0.916, respectively. Particularly, through the above study design, we further demonstrated that the diagnostic performance of our metabolic panel was significantly higher than that of traditional blood-based protein biomarkers (including CA242, CA724, AFP, CEA, CA125 and CA199, p<0.05, by DeLong test). Notably, the universal diagnostic performance of Met-NN was also confirmed by the high detection sensitivity (85.43%–100.00%) in different GC populations.
These findings strongly suggested that our metabolic panel can robustly serve for GC diagnosis and is promising for universal and large-scale applications, due to the large cohort size, advanced machine learning method and three-phase design.
Furthermore, pathway analysis has revealed that the identified biomarker panel is associated with several biological pathways related to tumour progression, including the synthesis and degradation of ketone bodies, histidine metabolism, pyruvate metabolism, glycolysis/gluconeogenesis, arginine and proline metabolism, alanine, aspartate and glutamate metabolism, and butanoate metabolism. For instance, Ala, as a non-essential amino acid, has been reported to be associated with tumour progression and metastasis through synthesising glucose and maintaining metabolic homoeostasis in the body.52 Glu has a pleiotropic effect on tumour cell function, including macromolecular synthesis, energy production, mTOR activation, maintenance of reactive oxygen species balance and antitumour acidic microenvironment.53 Ornithine can remove ammonia by stimulating the urea cycle and glutamine synthesis, and thus inhibit the growth of metastatic colorectal cancer and increase the sensitivity of the tumour to immunotherapy.54 Creatine, as an important metabolic regulatory factor, can conserve biological energy to enhance antitumour immunity of CD8 T cells.55 Valine, an essential amino acid for tumour cell growth and metabolism, has been reported to be involved in protein synthesis in tumour cells and to promote tumour cell growth, while an inhibitor of its catabolism has been found to control the progression of advanced colorectal cancers.56 These findings suggest that regulating tumour metabolism may be a potential therapeutic strategy for GC, and further functional verification studies need to be conducted to reveal the underlying biological mechanisms of these metabolites, which may provide novel metabolic intervention targets for GC therapy. Notably, the metabolites in the diagnostic marker panel were not overly specific to GC, but some of them were associated with several biological pathways related to tumour progression as reported.
We postulate the diagnostic specificity would be guaranteed with the metabolic biomarker panel rather than single biomarker by measuring a more comprehensive representation of human phenotype.57 In response to this, a multicentre study with larger sample cohort is currently planned, and we will design the non-GC group with a sufficient number of patients with other types of cancers or benign diseases. This will enable us to validate the specificity of the established metabolic biomarker panel for large-scale screening applications.
Moreover, developing an effective prognostic prediction model to identify high-mortality-risk GC patients is critical for improving clinical treatment decisions. Specifically, increasing numbers of studies have demonstrated the value of molecular biomarkers for the prognosis of GC, but these are mainly limited to upstream molecules, including nucleic acids (eg, circular RNAs and long non-coding RNAs)58–60 and proteins (eg, CA199 and CA724).61 62 In parallel, metabolic biomarkers, as the downstream molecules, which are promising to provide more timely feedback on the human body than genomics and proteomics, have been demonstrated to have prognostic value in cancer but have rarely been reported for GC.26 63 In this work, a PMP score was constructed with a four-metabolite panel by implementing Cox regression in the PMF dataset, showing desirable prognostic efficiency (p=0.0089 in the discovery dataset and p=0.0191 in the independent external verification dataset, by log-rank test) and comparable performance with the TNM staging system in predicting 5–9 year OS of GC patients (p>0.05). In particular, the TNM staging system, as the main prognostic predictor, is a comprehensive combination of imaging and tissue biopsy results, while the NPELDI-MS-based PMP score only requires a blood test, indicating its potential to serve as a point-of-care tool in clinical application and thus improve cancer management.
Nevertheless, there are still some limitations in this study. First, it should be noted that this study solely focuses on the Chinese population, and further research is required to establish its relevance and generalisability to other populations from diverse geographical locations and racial backgrounds. Second, the clear biological significance of the metabolites is still unknown and needs to be fully elucidated. Further exploration of the relationship between metabolites and GC development may indirectly provide metabolic intervention targets for GC therapy. Third, our non-GC cohort only includes healthy controls, atrophic gastritis patients and superficial gastritis patients, and more lesions, such as intestinal metaplasia and high-grade intraepithelial neoplasia, should be enrolled in further studies to enhance the clinical significance. Fourth, hyphenated techniques enhance specificity and confidence in detection and annotation in metabolic analysis, while further advanced instrumental approaches such as multidimensional chromatography, MSn and ion mobility can further improve these capabilities. For NPELDI-MS, a high-resolution mass spectrometer system (FT-ICR MS) and designer nanoparticles were essential to perform and annotate the metabolic fingerprinting, which may be engineered for point-of-care tests.
Conclusions
In general, we validated the high performance of the plasma metabolic panel for GC diagnosis and prognosis by launching a large-scale, multicentre study and employing NPELDI-MS to record the GC-specific PMFs. Currently, we are conducting a large-scale, prospective clinical trial to further validate the performance of these models. We envisage that these diagnostic and prognostic models have the potential to transform the screening of GC patients and subsequently contribute to GC management.
Data availability statement
Data are available on reasonable request. Additional data (beyond those included in the main text and Supplementary Information) are available from the corresponding author upon request.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and the retrospective study was approved by the centralised ethics board used by seven participating centres, including Zhejiang Provincial Hospital of Chinese Medicine, Sichuan Cancer Hospital, the People’s Hospital of Fenghua, Tiantai Chinese Medicine Hospital, the People’s Hospital of Xinchang, Zhejiang Cancer Hospital and the First People’s Hospital of Daishan (IRB-2019-113), and the prospective study was approved by the ethics board used by Zhejiang Cancer Hospital (IRB-2020-61). Participants gave informed consent to participate in the study before taking part.
Acknowledgments
We sincerely thank our colleagues at the hospitals participating in this study for their help with sample collection. We appreciate the great technical support from the Centre of Zhejiang Cancer Hospital in the follow-up of gastric cancer patients.
References
Footnotes
ZX, YH and CH are joint first authors.
ZX, YH and CH contributed equally.
Contributors XC, KQ and LY conceived the study and acquired the funding. ZX, YH and CH carried out clinical research, collected clinical samples, analysed clinical data and wrote the article. LD, YD, JQ, GC, HL, PZ, WH, XW, MX, PW, CH, LY, YZ, JX, JC and QW participated in clinical sample collection. WL, RW, SY, JW, JC and JZ contributed to the data analysis and material characterisation. All authors have read and approved the final manuscript.
Funding This study was supported by National Key R&D Program of China (2021YFA0910100), National Natural Science Foundation of China (82074245, 81973634, 82204828, and 81971771), Medical-Engineering Joint Funds of Shanghai Jiao Tong University (YG2021ZD09, YG2022QN107, YG2023ZD08), Zhejiang Provincial Research Center for Upper Gastrointestinal Tract Cancer (JBZX-202006), Natural Science Foundation of Zhejiang Province (HDMY22H160008), Chinese Postdoctoral Science Foundation (2022M713203), Shanghai Institutions of Higher Learning (2021-01-07-00-02-E00083), Innovative Research Team of High-Level Local Universities in Shanghai (SHSMU-ZDCX20210700), Innovation Research Plan by the Shanghai Municipal Education Commission (ZXWF082101) and National Research Center for Translational Medicine Shanghai (TMSK-2021-124, NRCTM(SH)-2021-06).
Competing interests The authors declare competing financial interests. The authors have filed patents for both the technology and the use of the technology to detect biosamples.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.