Objective Tumour pathology contains rich information, including tissue structure and cell morphology, that reflects disease progression and patient survival. However, phenotypic information is subtle and complex, making the discovery of prognostic indicators from pathological images challenging.
Design An interpretable, weakly supervised deep learning framework incorporating prior knowledge was proposed to analyse hepatocellular carcinoma (HCC) and explore new prognostic phenotypes on pathological whole-slide images (WSIs) from the Zhongshan cohort of 1125 HCC patients (2451 WSIs) and The Cancer Genome Atlas (TCGA) cohort of 320 HCC patients (320 WSIs). A ‘tumour risk score (TRS)’ was established to evaluate patient outcomes, and risk activation mapping (RAM) was then applied to visualise the pathological phenotypes of TRS. The multi-omics data of TCGA HCC were used to assess the potential pathogenesis underlying TRS.
Results Survival analysis revealed that TRS was an independent prognosticator in both the Zhongshan cohort (p<0.0001) and TCGA cohort (p=0.0003). The predictive ability of TRS was superior to and independent of clinical staging systems, and TRS could evenly stratify patients into up to five groups with significantly different prognoses. Notably, sinusoidal capillarisation, prominent nucleoli and karyotheca, the nucleus/cytoplasm ratio and infiltrating inflammatory cells were identified as the main underlying features of TRS. The multi-omics data of TCGA HCC hint at the relevance of TRS to tumour immune infiltration and genetic alterations such as the FAT3 and RYR2 mutations.
Conclusion Our deep learning framework is an effective and labour-saving method for decoding pathological images, providing a valuable means for HCC risk stratification and precise patient treatment.
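The even stratification reported in Results can be illustrated with a simple rank-based split. The following is a minimal sketch, not the authors' implementation: the function name `stratify_evenly`, the example scores and the choice of a rank-based rule are all assumptions made for illustration, and no real TRS values are used.

```python
from typing import List, Sequence


def stratify_evenly(scores: Sequence[float], n_groups: int = 5) -> List[int]:
    """Assign each patient a group index 0..n_groups-1 by rank of risk score.

    Hypothetical helper: groups are as equal in size as possible, and
    group 0 holds the lowest scores.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    n = len(scores)
    # Rank patients by score (stable sort, so ties keep their original order).
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 evenly onto 0..n_groups-1.
        groups[idx] = rank * n_groups // n
    return groups


# Made-up risk scores for ten patients (illustrative values only).
example_scores = [0.91, 0.12, 0.55, 0.33, 0.78, 0.05, 0.64, 0.47, 0.86, 0.21]
example_groups = stratify_evenly(example_scores, n_groups=5)
```

Each patient's group index could then serve as the stratum label in a survival comparison. Whether the study used rank-based cut-offs or another rule is not stated in the abstract.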
Data availability statement
The data from Zhongshan Hospital that support the findings of this study are available upon reasonable request from the corresponding author (QG). These data are not publicly available because they contain protected patient privacy information. The TCGA data set used for external validation is publicly available at the TCGA portal (https://portal.gdc.cancer.gov). We provide a manifest linking to the sample IDs considered in the study (at https://github.com/wangxiaodong1021/HCC_Prognostic). We also provide annotated files of TCGA tumour regions (at https://github.com/wangxiaodong1021/HCC_Prognostic). Code availability: All code related to this method was written in Python. Custom code for the image extraction, preprocessing pipeline, deep-learning model builder, data provider and experimenter driver is available (at https://github.com/wangxiaodong1021/HCC_Prognostic).