IDDF2024-ABS-0448 Predicting stomach cancer survival in patients with type 2 diabetes: a comparative analysis of machine learning models
  1. Junjie Huang1,
  2. Zhaojun Li1,
  3. Junjie Hang2,
  4. Yu Li1,
  5. Qi Dou1,
  6. Ziwei Huang3,
  7. Claire Chenwen Zhong1,
  8. Jinqiu Yuan4,
  9. Martin CS Wong1
  1. 1The Chinese University of Hong Kong, Hong Kong
  2. 2Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, China
  3. 3Boston University, United States
  4. 4The Seventh Affiliated Hospital, Sun Yat-sen University, China

Abstract

Background Stomach cancer is a major global health concern and the third leading cause of cancer-related deaths worldwide. Recent studies suggest that patients with diabetes mellitus (DM) have a higher risk of developing stomach cancer. This study aimed to identify the predictors of stomach cancer survival among patients with type 2 diabetes and to compare the performance of different machine learning models in predicting survival.

Methods The study included 1,912 type 2 diabetes patients with stomach cancer, retrieved from the Hong Kong Hospital Authority Data Collaboration Laboratory (HADCL) from 2000 to 2020. Twenty-two variables, including demographic, clinical, and behavioral factors, were analyzed. Five machine learning models were built and compared: Cox Proportional-Hazards (CoxPH), Gradient Boosting Survival Tree (GBST), Survival Decision Tree, Lasso Penalized Cox Regression, and Random Survival Forest (RSF). Concordance index (c-index) and time-dependent Area Under the Curve (AUC) were used to evaluate the model performance.
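As an illustration of the main evaluation metric named in the Methods, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' code: the function name `harrell_c_index` and the toy inputs are hypothetical, and the sketch simplifies by skipping pairs with tied survival times. A pair of patients is comparable when the one with the shorter follow-up time had an observed event; the pair is concordant when that patient was also assigned the higher predicted risk.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's c-index for right-censored data (simplified sketch).

    times       -- follow-up time for each patient
    events      -- True if death was observed, False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the "earlier event" of a pair.
            continue
        for j in range(n):
            # Comparable pair: patient i died strictly before time j.
            # Pairs with tied times are ignored in this sketch.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study).
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c_index = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the c-index values reported in the Results should be read.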

Results Cox regression analysis indicated that older age at diagnosis (HR=1.06, p<.05), longer duration of diabetes (HR=1.04, p<.05), and higher body mass index (BMI) (HR=1.03, p<.05) were the most significant predictors of poorer stomach cancer survival among diabetes patients. The highest c-index for the testing set was found in the GBST model at 0.690, followed by the RSF model at 0.687, Lasso penalized Cox regression at 0.682, the CoxPH model at 0.668, and the Survival Decision Tree at 0.649. In the time-dependent AUC, the GBST model achieved the highest mean AUC, indicating its superior performance. The Shapley Additive Explanations (SHAP) plot for the GBST model showed that the patients’ survival outcomes were most strongly influenced by age at diagnosis, duration of diabetes, high-density lipoprotein cholesterol (HDL-C), fasting glucose, and BMI.
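To give a sense of scale for the hazard ratios above, the sketch below compounds a per-unit hazard ratio over several units. It rests on an assumption the abstract does not state: that each HR refers to a one-unit increase in the predictor (for example, one year of age at diagnosis). Under a Cox model the log-hazard is linear in the covariate, so the hazard ratio for a difference of k units is the per-unit HR raised to the power k. The helper name `scale_hazard_ratio` is hypothetical.

```python
def scale_hazard_ratio(hr_per_unit: float, units: float) -> float:
    """Hazard ratio for a covariate difference of `units`.

    Assumes hr_per_unit is the Cox hazard ratio for a one-unit
    increase (an assumption; the abstract does not give the units).
    """
    if hr_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return hr_per_unit ** units


# HR = 1.06 for age at diagnosis, assumed to be per year of age.
hr_ten_years_older = scale_hazard_ratio(1.06, 10)
```

Under that assumption, a patient diagnosed ten years older would have roughly 1.79 times the hazard, other covariates being equal.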

Conclusions Among the five models compared, the GBST model showed the best discrimination in predicting stomach cancer survival among patients with type 2 diabetes, with the highest testing-set c-index (0.690) and the highest mean time-dependent AUC. These findings suggest that the GBST model has the potential to improve treatment efficiency by identifying the factors associated with stomach cancer survival in this patient population more precisely.
