Article Text
Abstract
Background Stomach cancer is a major global health concern and the third leading cause of cancer-related deaths worldwide. Recent studies suggest that patients with diabetes mellitus (DM) have a higher risk of developing stomach cancer. This study aimed to identify the predictors of stomach cancer survival among patients with type 2 diabetes and to compare the performance of different machine learning models in predicting survival.
Methods The study included 1,912 patients with type 2 diabetes and stomach cancer, whose records from 2000 to 2020 were retrieved from the Hong Kong Hospital Authority Data Collaboration Laboratory (HADCL). Twenty-two variables, including demographic, clinical, and behavioral factors, were analyzed. Five machine learning models were built and compared: Cox Proportional-Hazards (CoxPH), Gradient Boosting Survival Tree (GBST), Survival Decision Tree, Lasso Penalized Cox Regression, and Random Survival Forest (RSF). Model performance was evaluated using the concordance index (c-index) and the time-dependent Area Under the Curve (AUC).
Results Cox regression analysis indicated that older age at diagnosis (HR=1.06, p<.05), longer duration of diabetes (HR=1.04, p<.05), and higher body mass index (BMI) (HR=1.03, p<.05) were the most significant predictors of poorer stomach cancer survival among diabetes patients. On the testing set, the GBST model had the highest c-index at 0.690, followed by the RSF model at 0.687, Lasso penalized Cox regression at 0.682, the CoxPH model at 0.668, and the Survival Decision Tree at 0.649. The GBST model also achieved the highest mean time-dependent AUC. The Shapley Additive Explanations (SHAP) plot for the GBST model showed that patients’ survival outcomes were most strongly influenced by age at diagnosis, duration of diabetes, high-density lipoprotein cholesterol (HDL-C), fasting glucose, and BMI.
Conclusions The GBST model demonstrated the best predictive performance, with the highest c-index and mean time-dependent AUC, in predicting stomach cancer survival among patients with type 2 diabetes. These findings suggest that the GBST model has the potential to improve treatment efficiency by identifying risk factors for stomach cancer survival more precisely in this patient population.
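
The c-index used to rank the models above can be illustrated with a minimal sketch of Harrell's concordance index. This is not the study's code: the function name `harrell_c_index` and the toy data are hypothetical, pairs with tied survival times are skipped for simplicity, and a higher risk score is assumed to mean a shorter expected survival.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up times
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; higher is assumed to mean higher risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): skip tied times. A pair is also
        # not comparable if the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.8, 0.1]
print(harrell_c_index(times, events, risk_scores))  # 0.8 (4 of 5 comparable pairs)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported test-set values between 0.649 and 0.690 indicate moderate discrimination with small differences between models.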