Objective Clinical diagnosis and approval of new medications for non-alcoholic steatohepatitis (NASH) require invasive liver biopsies. The aim of our study was to identify non-invasive biomarkers of NASH and/or liver fibrosis.
Design This multicentre study includes 250 patients (discovery cohort, n=100 subjects (Bariatric Surgery Versus Non-alcoholic Steato-hepatitis - BRAVES trial); validation cohort, n=150 (Liquid Biopsy for NASH and Liver Fibrosis - LIBRA trial)) with histologically proven non-alcoholic fatty liver (NAFL) or NASH with or without fibrosis. Proteomics was performed in monocytes and hepatic stellate cells (HSCs) with iTRAQ-nano-liquid chromatography-tandem mass spectrometry (LC-MS/MS), while flow cytometry measured perilipin-2 (PLIN2) and RAB14 in peripheral blood CD14+CD16− monocytes. Neural network classifiers were used to predict presence/absence of NASH and NASH stages. Logistic bootstrap-based regression was used to measure the accuracy of predicting liver fibrosis.
Results The algorithm for NASH using PLIN2 mean fluorescence intensity (MFI) combined with waist circumference, triglyceride, alanine aminotransferase (ALT) and presence/absence of diabetes as covariates had an accuracy of 93% in the discovery cohort and of 92% in the validation cohort. Sensitivity and specificity were 95% and 90% in the discovery cohort and 88% and 100% in the validation cohort, respectively.
The area under the receiver operating characteristic (AUROC) for NAS level prediction ranged from 83.7% (CI 75.6% to 91.8%) in the discovery cohort to 97.8% (CI 95.8% to 99.8%) in the validation cohort.
The algorithm including RAB14 MFI, age, waist circumference, high-density lipoprotein cholesterol, plasma glucose and ALT levels as covariates to predict the presence of liver fibrosis yielded an AUROC of 95.9% (CI 87.9% to 100%) in the discovery cohort and 99.3% (CI 98.1% to 100%) in the validation cohort, respectively. Accuracy was 99.25%, sensitivity 100% and specificity 95.8% in the discovery cohort and 97.6%, 99% and 89.6% in the validation cohort. This novel biomarker was superior to currently used FIB4, non-alcoholic fatty liver disease fibrosis score and aspartate aminotransferase (AST)-to-platelet ratio and was comparable to ultrasound two-dimensional shear wave elastography.
Conclusions The proposed novel liquid biopsy is accurate, sensitive and specific in diagnosing the presence and severity of NASH or liver fibrosis and is more reliable than currently used biomarkers.
- NONALCOHOLIC STEATOHEPATITIS
- HEPATIC FIBROSIS
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
The diagnosis of non-alcoholic steatohepatitis (NASH) currently relies on invasive liver biopsy. There is therefore an urgent need to find non-invasive biomarkers for NASH diagnosis, disease progression and intervention response monitoring. However, until now, no specific biomarker has been officially endorsed by the Food and Drug Administration and European Medicines Agency.
WHAT THIS STUDY ADDS
We identified two monocyte proteins, PLIN2 and RAB14, which are able to predict the presence and severity of NASH and liver fibrosis, respectively.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
The biomarkers we identified are sensitive and specific in diagnosing the presence and severity of NASH and/or liver fibrosis and are more reliable than currently used biomarkers. A liquid biopsy is, therefore, feasible in making diagnosis of NASH and/or liver fibrosis. Sensitive and specific biomarkers can help in identifying patients eligible for NASH pharmacotherapy or surgery in clinical trials and treatment efficacy monitoring.
The approval of drugs for non-alcoholic steatohepatitis (NASH) by the US Food and Drug Administration and the European Medicines Agency requires histological improvement of inflammation without worsening of fibrosis, or NASH resolution and fibrosis improvement.1 Although histology remains the gold standard, it is limited by intraobserver and interobserver variability and requires an invasive liver biopsy.
A large number of patients (65%–73%) enrolled in clinical trials who underwent liver biopsy do not meet the eligibility criteria.2 3 Hence, prebiopsy strategies targeting the right candidates and reducing the number of screen failures are necessary. Indeed, the identification of appropriate biomarkers would increase patients’ enrolment in clinical trials, accelerating the development of therapeutic interventions for NASH.
Unfortunately, plasma biomarkers for the diagnosis of NASH have low sensitivity, ranging from 62% to 66%, and specificity, between 78% and 82%.4–17 Notably, none of the available biomarkers is able to predict the severity of NASH and, thus, NASH and fibrosis staging.
Moreover, it has been shown that scores of NASH and liver fibrosis greatly rely on body mass index (BMI) as a predictor variable and thus show a poor performance in obesity and morbid obesity with increase in false positives.18
Hence, we sought to identify a biomarker and algorithm able to predict not only the presence of NASH but also its severity.
We previously demonstrated that ectopic fat deposition in hepatocytes in non-alcoholic fatty liver (NAFL) and NASH correlates with ectopic fat accumulated in blood monocytes as lipid droplets (LDs).19 20 Hepatic macrophages include not only resident Kupffer cells (KCs) but also monocyte-derived macrophages (MoMFs). It has been suggested that, as NASH progresses, KCs are replaced by MoMFs.21 These MoMFs can be reprogrammed or repolarised in the liver toward a proinflammatory and pathological phenotype, acquiring a foamy appearance. From the liver, they can re-enter the circulatory stream, as demonstrated in other diseases,22 in a dynamic transition.
NAFLs/NASHs are often associated with liver fibrosis, representing the main determinant of mortality in NASH.7 8 Liver fibrosis derives from the accumulation of extracellular matrix proteins. Inflammation activates KCs, releasing proinflammatory cytokines, including transforming growth factor beta-1. In turn, these activate transdifferentiation of hepatic stellate cells (HSCs) into myofibroblasts, the major source of extracellular matrix.9 23 24
Our study tested perilipin-2 (PLIN2) levels in circulating monocytes as a predictor of histological NASH. Secondarily, we tested RAB14 levels in circulating monocytes as a predictor of liver fibrosis.
The discovery cohort consisted of 100 consecutive subjects aged 46.9±10.5 years (67% women), screened during the enrolment of BRAVES (ClinicalTrials.gov identifier: NCT03524365), a multicentre randomised controlled trial (RCT) in which 288 subjects with histologically proven NASH with or without liver fibrosis were randomised (1:1:1) in three intervention arms (intensive lifestyle modification and medical treatment, Roux-en-Y gastric bypass (RYGB) or sleeve gastrectomy). Thirty-nine of the 100 subjects had screen failure after liver biopsy showing NAFL rather than NASH, and therefore they were excluded from the BRAVES trial.
The metabolic surgery cohort consisted of 50 different subjects enrolled in the BRAVES RCT studied at baseline and at 1 year after RYGB.
None of the subjects had secondary causes or a history of alcohol excess.
Enrolment inclusion criteria were liver ultrasound showing steatosis, non-alcoholic fatty liver disease (NAFLD) fibrosis score of >0.676, BMI of ≥30 and <50 kg/m2 (amendment 1 July 2019, previously BMI ≤40 kg/m2), age 25–65 years, both sexes, with informed consent signed.
Enrolment exclusion criteria were (1) regular and/or excessive alcohol intake (>20 g alcohol/day for women and >30 g alcohol/day for men); (2) clinical evidence of NAFLD secondary to iatrogenic GI or immunodeficiency (HIV infection) diseases; (3) clinical evidence of non-NAFLD hepatic diseases, including hepatitis B or C, or haemochromatosis; (4) Wilson’s disease; (5) glycogenosis; (6) alpha-1 antitrypsin deficiency; (7) autoimmune hepatitis; (8) cholestatic liver disease; (9) presence of relevant cardiovascular, GI or respiratory diseases, or any hormonal disorder; (10) clinical evidence of decompensated liver disease (Child-Pugh score >7 points); (11) ongoing narcotics abuse; (12) relevant systemic diseases; and (13) pregnancy.
The validation cohort included 150 subjects (LIBRA study, ClinicalTrials.gov identifier: NCT04677101) aged 43.4±11.9 years, with 56% women. LIBRA, a multicentre cohort study, enrolled 100 individuals with histologically proven NASH with or without liver fibrosis and 50 individuals who underwent elective cholecystectomy and whose histology showed either NAFL or no histological lesions.
Inclusion criteria for the 100 subjects were NASH documented by liver biopsy and no evidence of another form of liver disease in subjects with a BMI of ≥30 and ≤55 kg/m2. Fifty subjects, who underwent laparoscopic elective cholecystectomy but were otherwise in healthy conditions, aged 25–65 years, including both sexes, with informed consent signed, had an apparently normal liver.
Exclusion criteria were coronary event or procedure (myocardial infarction, unstable angina, coronary artery bypass and surgery or coronary angioplasty) in the previous 6 months; liver cirrhosis; end-stage renal failure; participation in any other concurrent therapeutic clinical trial; any other life-threatening, non-cardiac disease; pregnancy; inability to give informed consent; substantial alcohol consumption (>20 g/day for women or >30 g/day for men); Wilson’s disease; lipodystrophy; parenteral nutrition; and interfering medications (eg, amiodarone, methotrexate, tamoxifen and corticosteroids).
The authors had full access to the data and take responsibility for the completeness and accuracy of the data and integrity of their analysis.
Liver biopsy and histology
In subjects with obesity, percutaneous liver biopsies were performed under ultrasonography with 16-gauge biopsy needles. Needle liver biopsies (also 16-gauge biopsy needles) were obtained during laparoscopic cholecystectomy. All liver biopsies had a length of >15 mm and contained ≥11 portal areas.27 NASH was diagnosed histologically in the presence of steatosis, lobular inflammation and hepatocyte ballooning with or without perisinusoidal fibrosis, and NASH activity was graded according to the value of Non-alcoholic Fatty Liver Disease Activity Score (NAS). NAS was calculated by adding the severity scores for steatosis, lobular inflammation and ballooning with a range from 0 to 8.28 A NAS=3, resulting from the sum of steatosis=1, lobular inflammation=1 and hepatocyte ballooning=1, was the minimum value to make a diagnosis of NASH.29
The SAF scoring system separately assesses the grade of steatosis S histologically from S0 to S3, the activity grade A from A0 to A4 by addition of grades of ballooning and lobular inflammation, each graded from 0 to 2, and the stage of fibrosis F, from F0 to F4 according to NASH clinical research network (CRN) staging system. A single expert pathologist (FMV) read the digitised slides in a blinded manner according to CRN criteria. A second independent pathologist (JCM) then re-read the histological scores on the digitised images.
All patients underwent two-dimensional shear wave elastography, performed with MyLab V.9 platform ultrasound system (Esaote, Genova, Italy) using a convex broadband abdominal probe C1-8 MHz. Liver stiffness was measured obtaining four valid measurements in each patient considering both the median values in kilopascal and the ratio between IQR and the median value (M) (directly provided by the software) for the analysis. This technique has been shown to be effective in differentiating significant fibrosis ≥F2 from mild or absent fibrosis in a large series of patients with compensated chronic liver disease without comorbidities potentially affecting liver stiffness measurement.32 The cut-off thresholds used to stage fibrosis are reported in Garcovich et al.32
Online supplemental materials report the methods for anthropometric measures, dual X-ray absorptiometry, blood sample analyses, proteomics, human primary hepatocyte isolation, human primary HSC isolation, cell viability and cell purity after isolation, isolation of peripheral blood mononuclear cells, immunofluorescence, flow cytometry and validation parameters, as well as more detailed statistics.
Sample size was calculated based on a restrictive hypothesis of area under the receiver operating characteristic (AUROC)=0.70 for the new diagnostic test in discriminating subjects with and without NASH. We used a one-sided test: H0: AUROC=0.5, vs H1: AUROC >0.5, power=90%, alpha=0.025. The ratio between cases and controls was set at 0.6, and the total number of subjects enrolled in the discovery cohort was calculated as 84. Considering an attrition rate of 15%, we estimated a final sample size of 100 subjects (40 with NASH and 60 without) in the discovery cohort. The validation cohort was 3/2 of the discovery cohort, therefore 150 patients were enrolled. The software used was easyROC.33
The main outcome was the prediction of NASH diagnosis using a score derived from a neural network (NN) classifier including PLIN2 mean fluorescence intensity (MFI) in monocytes and ALT, presence/absence of diabetes, triglycerides and waist circumference as covariates. These covariates, with proven univariate model significance, represent hepatic function and metabolic and lipid profiles.
For SAF-A level prediction, see online supplemental tables 1 and 2.
Variables not normally distributed were log-transformed prior to analyses. Missing data were not replaced by imputation. In our machine-learning analysis, we combined NN-based probabilistic classification with resampling/bootstrapping. In this way, we calculated the number of hidden nodes and the analysis accuracy. The importance of variables was calculated using the Olden method.34 Model discrimination was measured by the AUROC. AUROC CIs were computed by bootstrapping procedure.
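As an illustration of how an AUROC and a bootstrap CI can be obtained, the following is a minimal Python sketch. It is not the authors' R code: the percentile method, the number of replicates, the seed and the function names `auroc` and `bootstrap_auroc_ci` are all assumptions made for this example.

```python
import random

def auroc(labels, scores):
    """Mann-Whitney estimate of the AUROC: the probability that a positive
    case scores higher than a negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Return (estimate, lower, upper) using a percentile bootstrap
    (assumed here; the source does not state which bootstrap CI was used)."""
    estimate = auroc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    replicates = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            replicates.append(auroc(ys, [scores[i] for i in idx]))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return estimate, lower, upper
```

Calling `bootstrap_auroc_ci(labels, scores)` with binary labels (1 = NASH, 0 = no NASH) and model scores returns the point estimate together with the lower and upper percentile bounds.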
The total NAS score, computed as the sum of scores for steatosis, lobular inflammation and ballooning, originally ranging from 0 to 8 was split into three levels: NAS level=0 for total NAS score of <3, NAS level=1 for total NAS score=3 and NAS level=2 for total NAS score of ≥4. A NN classifier analysis was used to predict NASH severity based on NAS scores. We calculated the confusion matrix, accuracy and receiver operating characteristic (ROC) curves, with the respective areas under the curve (AUCs), for each level of NAS.
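The three-level recoding described above can be written as a short function. This is a minimal Python sketch; the function name `nas_level` is hypothetical and chosen only for illustration.

```python
def nas_level(total_nas):
    """Map a total NAS score (0 to 8) to the three levels used for
    severity prediction: 0 for NAS < 3, 1 for NAS = 3, 2 for NAS >= 4."""
    if not 0 <= total_nas <= 8:
        raise ValueError("total NAS score must lie between 0 and 8")
    if total_nas < 3:
        return 0
    if total_nas == 3:
        return 1
    return 2
```

For example, a biopsy scored steatosis=1, lobular inflammation=1 and ballooning=1 has a total NAS of 3 and falls in NAS level 1.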
RAB14 (MFI) was tested as a predictor of liver fibrosis, diagnosed by SAF-F (presence: SAF-F ≥1, absence: SAF-F=0), in a multivariate logistic stepwise regression model including relevant covariates. In the discovery cohort, only 3% of patients were free of fibrosis according to SAF-F, which made the dataset unsuitable for model development. We therefore randomly split the whole dataset into a discovery and a validation set to obtain a balanced number of patients with and without fibrosis, using the createDataPartition function of the R caret package, which allows a random split of the sample while preserving the overall class distribution of the data.
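The idea behind a class-preserving random split can be sketched in a few lines of Python. This is only an analogue of what caret's createDataPartition does, not a reimplementation of it. The function name `stratified_split`, the 50% split fraction and the toy labels are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.5, seed=0):
    """Randomly split sample indices into two sets while keeping the
    proportion of each class (for example, fibrosis present/absent)
    approximately equal in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    first, second = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random order within each class
        cut = round(train_fraction * len(indices))
        first.extend(indices[:cut])
        second.extend(indices[cut:])
    return sorted(first), sorted(second)

# Toy labels (hypothetical): 80 subjects with fibrosis, 20 without.
toy_labels = [1] * 80 + [0] * 20
discovery_idx, validation_idx = stratified_split(toy_labels)
```

Because sampling is done within each class, both halves of the toy example keep the 80% prevalence of the full sample.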
SAF-F was then recoded as a three-level variable: SAF-F_level=0 if SAF-F=0, SAF-F_level=1 if SAF-F=1 and SAF-F_level=2 if SAF-F is ≥2.
RAB14 was then used to predict SAF-F levels in a multinomial model. A model including elastography instead of RAB14 was also used to compare the predictive capacity of the two predictors. The quantitative variables were log-transformed prior to the analyses. AUROC assessed the discrimination ability of the model. The Youden criterion and the ‘closest top-left’ methods were employed to determine the best threshold. When Youden’s J statistic is used, the optimal cut-off is the threshold that maximises the distance from the diagonal line or, in other words, that maximises the sum of the sensitivity and specificity. The closest top-left method instead determines the optimal threshold as the point closest to the top-left corner of the plot, where sensitivity and specificity are both perfect; specifically, it minimises the quantity (1 − sensitivity)² + (1 − specificity)². The two criteria may or may not lead to the same cut-off point, but while the Youden criterion reflects the intention of maximising overall correct classification rates, the closest top-left criterion mathematically involves a quadratic term of non-immediate interpretation from a clinical point of view.35
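The two threshold criteria can be compared side by side in a minimal Python sketch. The function names and the toy data are hypothetical. A case is called positive when its score is at or above the threshold, which is an assumed convention.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for every distinct score,
    classifying a case as positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points

def youden_threshold(points):
    """Threshold maximising J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]

def closest_topleft_threshold(points):
    """Threshold minimising (1 - sensitivity)^2 + (1 - specificity)^2."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)[0]

# Toy data (hypothetical): 6 negatives and 4 positives.
toy_labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9]
toy_points = roc_points(toy_labels, toy_scores)
```

On this toy sample both criteria select the same cut-off (0.55). With other data the two functions can return different thresholds.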
AUROCs of the new score and classical indices of fibrosis (NAFLD fibrosis score, FIB4 and AST-to-platelet ratio) were compared according to DeLong et al.36
Continuous variables are reported as mean and SD, while categorical variables are reported as numbers and percentages. A p value of <0.05 was considered statistically significant. The analyses were conducted in R.37
Proteomics and in vitro studies
To identify a possible biomarker of liver fibrosis, we performed proteomics in monocytes and HSCs obtained from liver biopsies of 5 subjects (2 men and 3 women) with NASH and liver fibrosis and in five subjects with negative histology for NASH and liver fibrosis from the LIBRA trial. Age was 45.80±6.18 in subjects with NASH and 40.60±2.41 years (p=0.118) in those without NASH; BMI was 39.21±6.22 and 36.20±4.76 kg/m2 (p=0.415) in subjects with and without NASH, respectively.
Two of the subjects in each group had type 2 diabetes (T2D) and hypertension and were treated with metformin and Dipeptidyl Peptidase-IV (DPP-IV) inhibitors for diabetes and ACE inhibitors and beta blockers for hypertension.
Using p<0.0001 and q<0.0001 as statistical significance thresholds, the proteomic analysis identified nine proteins differentially expressed in both HSCs and monocytes. Ras-related protein Rab-18 (RAB18), annexin A6 (ANXA6) and Ras-related protein Rab-14 (RAB14) were downregulated, while disintegrin and metalloproteinase domain-containing proteins 8 and 9 (ADAM8 and ADAM9), Ras-related protein Rab-25 (RAB25), galectin-1 and 12 (LGALS1 and LGALS12) and profilin-1 (PFN1) were upregulated in the presence of liver fibrosis (see online supplemental figure 1). Among the proteins screened by proteomics, RAB14 was the most modified in the presence of fibrosis and was our first choice as a possible biomarker for the prediction of liver fibrosis.
To confirm the proteomic analysis, we have assessed RAB14 expression by flow cytometry in both monocytes and HSCs of 20 subjects with NASH and liver fibrosis and 20 subjects without NASH and liver fibrosis, spanning from normal liver to NAFL (figure 1A–E). A linear regression analysis confirmed a high correlation of RAB14 in monocytes and HSCs (R2=0.73, p<0.0001) (figure 1F).
To assess PLIN2 as a possible diagnostic biomarker for NASH, we performed flow cytometry in monocytes and hepatocytes of the same subjects (figure 1G–K). Indeed, PLIN2 is a major protein coating LDs, and PLIN2 liver-specific knockout alleviates diet-induced hepatic steatosis and inflammation in mice.38 Using flow cytometry analysis, we found a high correlation (R2=0.85, p<0.0001) of PLIN2 expression in monocytes and hepatocytes (figure 1L).
The subjects included in the RAB14 and PLIN2 analyses were 10 men and 10 women in each group. The mean age was 46.05±2.03 years in subjects with NASH and 43.80±1.79 years (p=0.411) in those without NASH; BMI was 38.49±1.23 and 35.97±0.87 kg/m2 (p=0.103) in subjects with and without NASH, respectively. Ten subjects in each group had T2D and hypertension and were treated with metformin, DPP-IV inhibitors and SGLT2 inhibitors for diabetes and with ACE inhibitors and beta-blockers for hypertension.
Discovery and validation cohorts
Table 1 reports the characteristics of the patients enrolled in the discovery, validation and global cohorts as well as in the NASH and NAFL groups. Overall, 250 subjects, aged 44.8±11.5 years, of which 60% were women, were studied. While the gender distribution in the two datasets was not different, the subjects in the discovery cohort were older (p=0.014). Diabetes prevalence was 35% in the discovery cohort and 25% in the validation cohort, respectively (p=0.13). The prevalence of hypertension was 56% in the discovery cohort and 60% in the validation cohort (p=0.968), and that of hyperlipidaemia was 45% in the discovery cohort and 49% in the validation cohort (p=0.964). The type and frequency of use of antidiabetic, antihypertensive and antihyperlipidaemic medications are reported in online supplemental table 3.
NASH was not associated with the cohorts: 61% vs 67% (p=0.43). Therefore, we can exclude dataset biases that could potentially affect supervised machine learning, since the primary aim of our study was to identify and validate biomarkers of NASH.
In contrast, the prevalence of liver fibrosis was different in the discovery (97%) and validation cohorts (66.7%) (p<0.001). Therefore, we randomly split the whole database into a discovery and a validation set to obtain a balanced number of patients with and without fibrosis when a model was built to predict fibrosis, but preserving the overall class distribution of the data.
Table 1 shows that the two cohorts differed in anthropometric characteristics. While plasma glucose levels did not differ between the two samples, plasma insulin and HOMA-IR showed borderline statistical significance (p=0.036 and p=0.043, respectively). On histological examination, 20% of all participants had liver steatosis <5%, no inflammation and no ballooning; 15.6% (NAS 3, ie, NAFL) had liver steatosis >5% with or without inflammation, or liver steatosis >5% with or without ballooning; and 64.4% had NASH (NAS≥3). Liver fibrosis was observed in 78.8% of participants.
The re-reading of the histological scores on digitised images by a second independent pathologist (JC-M) yielded 85% agreement with the centralised reading (FMV). The results reported derive from the agreement of the two pathologists.
The mean PLIN2 levels in the subjects with NAFL were 1.72±0.40 vs 4.58±1.70 MFI in the group with NASH (p<0.0001, Mann-Whitney U test).
The NN analysis for the prediction of presence/absence of NASH produced an accuracy of 93% in the discovery and of 92% in the validation cohort; the AUCs were 97.8% (CI 95% to 100%) and 97.6% (CI 95% to 100%), respectively. Sensitivity and specificity were 95% and 90% in the discovery cohort and 88% and 100% in the validation cohort, respectively. All the subjects in the validation cohort without histological NASH were correctly predicted as having a NAS score of <3. Eight per cent of individuals with NASH, who were misclassified as being without NASH, had a NAS score of 3. The Olden algorithm identified PLIN2 in monocytes as the most important variable in classifying subjects with and without NASH, followed by presence of diabetes and ALT levels. Figure 2A shows the NN composition; the AUROC curves for both cohorts are reported in figure 2B.
The model including only PLIN2 in monocytes as predictor produced an accuracy of 93% in the discovery cohort, with a sensitivity and specificity of 98.4% and 84.6%, respectively. Values in the validation cohort were 90%, 85% and 100%, respectively.
We also used an NN analysis (figure 3A) to predict the stages of NASH. Also in this case, the Olden algorithm identified monocyte PLIN2 as the most important variable in classifying subjects according to NAS levels. The classification had an accuracy of 85% in the discovery and 85.2% in the validation cohort. Twenty-one subjects without histologically proven NASH were correctly classified in the NAS level=0 class (ie, NAS score <3). Figure 3B shows the ROC curves for each NAS level in the validation cohort. The AUROC ranges from 83.7% (CI 75.6% to 91.8%) to 97.8% (CI 95.8% to 99.8%). The average levels of PLIN2 in monocytes in the three classes of NAS are depicted in figure 4.
Two Excel files implementing the estimated networks to facilitate NASH diagnosis and NAS level prediction are provided in the online supplemental material.
In 2012, Bedossa et al30 described the SAF scoring system, which includes steatosis, activity and fibrosis and was proposed to aid in distinguishing between NAFL and NASH in subjects with morbid obesity.
SAF-A score is the activity part of the SAF scoring system that incorporates scores for ballooning and inflammation. Although NAS continues to be the most used score for the histologic diagnosis of NASH, a decrease of 2 points of SAF-A or more has been adopted as primary endpoint in some RCTs.39
Online supplemental figure 2 shows the monocyte levels of PLIN2 at different degrees of NASH severity, according to SAF-A.
Both the Youden and the closest top-left criteria led to a threshold of 0.38, according to which the accuracy of the algorithm was 98.4% in the discovery cohort with a sensitivity of 100% and a specificity of 93%; accuracy, sensitivity and specificity were 96%, 100% and 82% when the same threshold was applied to the validation sample. The diagnostic ability of the model is shown in online supplemental figure 3, where the ROC curve is reported along with the identified threshold, AUROC, sensitivity and specificity.
The model predicted SAF-A levels with an accuracy of 89% in the discovery and 84% in the validation samples (online supplemental table 4).
Liver fibrosis prediction
RAB14 was used to predict liver fibrosis with a logistic model that also included waist circumference, age, plasma glucose, high-density lipoprotein (HDL) cholesterol and ALT. The predictors were those variables that were significant in a univariate analysis (online supplemental table 5) and that each represented a particular physiological aspect: metabolic, lipidic or hepatic. The AUROC with its CI, calculated using bootstrap replicates, was 95.9% (CI: 87.6% to 100%) in the discovery sample. Accuracy, sensitivity and specificity were 99.2%, 100% and 95.8%, respectively, when the Youden criterion was adopted as classification factor with a threshold of 0.55. In the validation sample, AUROC was 99.3% (CI 98.1% to 100%); accuracy was 97.6%; sensitivity was 99% and specificity was 89.6%.
When RAB14 was used as the only variable in the model, accuracy, sensitivity and specificity were 86.4%, 96.0% and 45.8%, respectively, in the discovery cohort. In the validation cohort, they were 82.4%, 96.9% and 34.5%, respectively. In both cohorts, half of subjects without fibrosis were erroneously predicted as being with fibrosis (13/24 and 19/29); however, the diagnosis of fibrosis was correctly predicted (97/101 and 93/96). The use of glycaemia as a covariate increases the specificity to 87.5% and to 86.2% in the discovery and validation cohorts, respectively.
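The percentages above follow directly from the reported counts. The Python sketch below rebuilds the two confusion matrices, taking false negatives as 101−97 and 96−93 and true negatives as 24−13 and 29−19. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# RAB14-only model, counts derived from the fractions reported in the text.
discovery = diagnostic_metrics(tp=97, fn=4, tn=11, fp=13)
validation = diagnostic_metrics(tp=93, fn=3, tn=10, fp=19)
```

Rounded to one decimal place, these reproduce the reported 86.4%, 96.0% and 45.8% for the discovery cohort and 82.4%, 96.9% and 34.5% for the validation cohort.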
Figure 5 shows the AUROC of the model containing RAB14 (figure 5A) for the prediction of presence/absence of liver fibrosis in the discovery dataset and the RAB14 monocyte levels at SAF-F=0 (absence of fibrosis) and SAF-F ≥1 (presence of fibrosis) (figure 5B).
The model accuracy in predicting fibrosis severity (SAF-F levels) was 69.6% in both the discovery and validation samples.
When the fibrosis stage was recoded as a three-level variable taking values of 0 if SAF-F ≤1, 1 if SAF-F=2 and 2 if SAF-F=3, the two models, one including RAB14 and the other including elastography, produced the following accuracies: 67.2% and 63.2% in the discovery and validation cohorts, respectively, for RAB14, and 65.6% in both the discovery and validation cohorts for elastography. For the model including RAB14, the largest group of misclassified individuals was those with SAF-F=2 who were classified in the lower level (SAF-F ≤1). The same occurred for elastography.
When RAB14 was replaced with the variable of elastography liver stiffness in the same algorithm to predict presence/absence of fibrosis, AUROC was 95.9% (CI 87.5% to 100%), accuracy=98.4%, sensitivity=99.01% and specificity=95.8% in the discovery dataset. AUROC=99.2% (CI 98% to 100%), accuracy=96%, sensitivity=98% and specificity=91.5% were demonstrated in the validation dataset when the same Youden threshold of 0.51 was used.
No differences were found between the two models in terms of AUROCs in either the discovery or the validation datasets (p=0.48 and p=0.34, respectively).
Figure 5 shows the AUROC of the model containing elastography (kPa) (figure 5C) for the prediction of presence/absence of liver fibrosis in the discovery dataset and the RAB14 monocyte levels at SAF-F=0 and SAF-F ≥1 (figure 5D).
When the elastography variable was used instead of RAB14 in the multinomial model predicting SAF-F level, the accuracy was 68% in the discovery sample and 68.8% in the validation sample. RAB14 performance in diagnosing the severity of liver fibrosis was comparable to that of elastography.
RAB14 versus FIB4, AST-to-platelet ratio and NAFLD fibrosis score
The RAB14 algorithm was compared with the predictive capacity of FIB4, NAFLD fibrosis score and AST-to-platelet ratio index to diagnose liver fibrosis in the validation cohort. The highest AUROC value was obtained with the new algorithm (99.3%, CI 98.1% to 100%), which was significantly higher than the AUROCs obtained with the other indices: AUROC NAFLD fibrosis score=85.2% (CI 77% to 92.3%) (p=0.0002), AUROC FIB4=62.2% (CI 49.8% to 74.6%) (p<0.0001) and AUROC AST-to-platelet ratio=61.8% (CI 51.3% to 72.6%) (p<0.0001).
Online supplemental figure 4 reports the ROC curves for each of the aforementioned indices.
Therefore, the algorithm containing RAB14 outperformed currently used biomarkers of liver fibrosis.
NASH and liver fibrosis prediction in subjects with diabetes and/or obesity
We wanted to verify how well the new biomarkers predicted NASH and fibrosis in subjects with diabetes and/or obesity. To this end, we used two stepwise regression models: one testing the dependence of PLIN2 on BMI, presence or absence of NASH, and presence or absence of diabetes, and the other testing the dependence of RAB14 on BMI, presence or absence of liver fibrosis, and presence or absence of diabetes. NASH (β=2.27, p<0.0001) and BMI (β=0.04, p=0.006) were the only predictors of PLIN2. A further model including all the variables used to predict NASH was tested in a stepwise regression procedure. The variables entered into the model were therefore, in addition to BMI and diabetes, triglycerides, ALT and waist circumference. The only significant predictors in the final model were NASH (β=1.80, p<0.0001), BMI (β=0.037, p=0.005), triglycerides (β=0.003, p<0.046) and ALT (β=0.025, p<0.0001).
When only BMI and diabetes were considered as predictors of RAB14, it depended only on the presence of liver fibrosis (β=−5.66, p<0.0001), while BMI and diabetes were not significant predictors and were excluded from the final model by the stepwise selection procedure. When age, ALT, HDL, waist circumference and plasma glucose were also entered into the model, the only predictors of RAB14 were age (β=0.13, p<0.0001) and fibrosis (β=−7.64, p<0.0001).
Patients were then divided into two BMI classes, subjects without or with class 2 obesity, BMI of <35 and BMI of ≥35. The performance of the two algorithms, the one including PLIN2 and the other one including RAB14, were evaluated in terms of accuracy. PLIN2 had an accuracy of 81% for predicting NAS level in the subgroup of 178 subjects with severe obesity and an accuracy of 87.7% in the subgroup of 73 patients suffering from diabetes. Sixty-six subjects had both obesity and diabetes; in this subgroup, the accuracy of our algorithm was 87.9% (online supplemental figure 5). In the subgroup of subjects with BMI of <35, the accuracy was 95.8%.
When the algorithm with RAB14 was used, its accuracy in predicting liver fibrosis (presence/absence) was 95.8% in the subgroup of subjects with obesity, 100% in the subgroup with diabetes, and 100% in the subgroup of patients with both obesity and diabetes.
Accuracies were 62%, 68.5% and 68.2% when predicting fibrosis stages, respectively.
Patients were then divided into two BMI classes, subjects without or with obesity (BMI of <30 and BMI of ≥30), and the performance of the two new algorithms was evaluated in terms of accuracy (online supplemental figure 5). The algorithm with PLIN2 predicted the NAS levels with an accuracy of 81.2% in the subgroup of 196 subjects with obesity and with an accuracy of 87.7% in 73 patients with diabetes. All patients with diabetes had a BMI of ≥30; therefore, the accuracy was the same as in those patients with obesity. Accuracy in the subsample of 53 patients with BMI of <30 was 100%. The accuracy of the algorithm with RAB14 in predicting SAF-F levels was 62.9% in patients with obesity and 68.5% in those with diabetes. Using the liver stiffness elastography variable instead of RAB14 yielded accuracies of 61% and 63% in patients with obesity and/or diabetes, respectively. The misclassified patients with obesity were mostly subjects with SAF-F equal to 1 who were predicted as having a more severe condition (SAF-F_level=2) or, vice versa, subjects whose fibrosis severity was underestimated: they were assigned to the SAF-F_level=1 class when they instead had an SAF-F ≥2 (in total 35% for RAB14 and 37.6% for elastography).
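The subgroup accuracies reported above can be illustrated with a minimal sketch. It assumes each patient record holds a BMI, an observed histological level and a predicted level; the field names, the helper functions `bmi_class` and `subgroup_accuracy`, and the toy records are hypothetical and are not taken from the study data.

```python
from collections import defaultdict


def bmi_class(bmi, cutoff=30.0):
    """Label a patient by BMI class; the cutoff (30 or 35) is a parameter."""
    return "obesity" if bmi >= cutoff else "no_obesity"


def subgroup_accuracy(records, group_key):
    """Return {group: fraction of records whose predicted level equals the observed one}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["predicted"] == rec["observed"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records (not study data): observed vs predicted level.
patients = [
    {"bmi": 27.0, "observed": 1, "predicted": 1},
    {"bmi": 29.5, "observed": 0, "predicted": 0},
    {"bmi": 33.0, "observed": 1, "predicted": 2},  # severity overestimated
    {"bmi": 36.0, "observed": 2, "predicted": 2},
    {"bmi": 41.0, "observed": 2, "predicted": 1},  # severity underestimated
    {"bmi": 38.0, "observed": 1, "predicted": 1},
]
for p in patients:
    p["bmi_class"] = bmi_class(p["bmi"])

acc = subgroup_accuracy(patients, "bmi_class")
print(acc)
```

In this toy example, the two misclassified records fall in the obesity subgroup, one overestimated and one underestimated, which mirrors the kind of adjacent-level error described in the text.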
NASH and fibrosis prediction before and after metabolic surgery
Fifty patients with histologically proven NASH who underwent RYGB were included in this study (BRAVES RCT). Ultrasound-guided needle liver biopsy was performed at 1 year after surgery. The patients lost an average of 37 kg, corresponding to 28.8% weight loss. Anthropometry, plasma glucose and insulin, HOMA-IR, lipid profile, blood pressure and liver enzyme levels are reported in table 2; all variables were significantly improved after metabolic surgery. NASH was fully reversed in 74% of participants according to the values of NAS; however, while the severity of liver fibrosis improved, the fibrosis did not disappear. Indeed, the prevalence of SAF-F1 increased from 46% to 70%, while that of SAF-F2 halved from 46% to 22%, and SAF-F3 decreased from 8% to only 2%.
We evaluated the performance of the two new algorithms in this external cohort.
The accuracies were 75.5% and 87.5% before and after surgery, respectively, for NAS level prediction. The accuracy in predicting SAF-F levels before surgery was 67.3%. Since bariatric surgery causes changes in metabolism and affects several mechanisms, the model developed on a population in which 78.8% of individuals had moderate to severe obesity could be inappropriate for predictions in a population of individuals who underwent metabolic surgery. The model, including the same variables, was therefore fitted on the surgery population, and the accuracy was 81.2% for SAF-F level prediction using RAB14. Online supplemental table 6 reports the coefficients of the model for the surgery cohort. When the TE variable was used in place of RAB14, accuracy was 73.5%. No significant differences were observed between the predictivity of the RAB14 and elastography algorithms.
Data in the group of subjects who underwent metabolic surgery are graphed in online supplemental figure 6.
We show that a liquid biopsy using circulating monocytes can accurately predict the presence and severity of NASH as well as liver fibrosis in subjects without other causes of chronic liver disease or steatosis.
An algorithm containing PLIN2, as measured in peripheral blood monocytes, had an accuracy between 92% and 93%, sensitivity of 88%–95% and specificity of 90%–100% for the diagnosis of NASH. Similarly, an algorithm with RAB14 in circulating monocytes had an accuracy between 97.6% and 99.2%, sensitivity of 90%–98% and specificity of 87%–93% for the diagnosis of liver fibrosis. Unlike other algorithms in the literature,11–18 ours was able to discriminate among various stages of NASH severity.
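For readers less familiar with these performance measures, the following minimal sketch shows how accuracy, sensitivity and specificity follow from a binary confusion matrix. The function `diagnostic_metrics` and the toy labels are hypothetical illustrations, with histology assumed as the reference standard (1 = disease present, 0 = absent); they are not the study's data or code.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for binary labels (1 = present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of histology-positive cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of histology-negative cases correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy labels: 4 biopsy-positive and 6 biopsy-negative subjects.
histology = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = diagnostic_metrics(histology, predicted)
print(metrics)
```

Because sensitivity and specificity are computed on different subsets (histology-positive and histology-negative subjects, respectively), overall accuracy depends on disease prevalence in the cohort, whereas the other two measures do not.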
The algorithm with RAB14 outperformed currently used biomarkers of liver fibrosis, such as FIB4, NAFLD fibrosis score or AST-to-platelet ratio, and gave results comparable to those of elastography. Both the PLIN2 and RAB14 algorithms accurately diagnosed significant liver fibrosis (≥F2) in association with NASH severity (NAS ≥4), a rapidly worsening condition which represents the target for therapeutic RCTs. Finally, the new algorithms predicted well the histological improvement of NASH and liver fibrosis after metabolic surgery, and thus they can be used not only as diagnostic but also as monitoring biomarkers. However, although the severity of liver fibrosis declined after metabolic surgery, the prevalence of SAF-F1 increased from 46% to 70%. This can explain why our algorithm including RAB14, as well as that with elastography, had a reduced performance.
Our algorithm for NASH diagnosis and staging did not include BMI in order to avoid introducing a bias inherent to body weight. In fact, NASH is also present in subjects with normal weight, as shown, for instance, in the GOASIA registry, where the prevalence of NAFLD in subjects with a BMI <25 kg/m² ranged from 7.6% to 25.6% and a substantial proportion of these subjects (50.5%) had biopsy-proven NASH.40 In a meta-analysis, 39% of subjects with NAFLD and normal weight or overweight had NASH; 29.2% had stage ≥2 liver fibrosis; and 3.2% had cirrhosis.41
Indeed, our algorithm also showed good sensitivity and specificity in the validation cohort, where the BMI was lower than in the discovery cohort, with a range of 23.1–42.43 kg/m² vs 38.53–46.03 kg/m².
We used the presence/absence of diabetes as a covariate in the NN analysis because of the high prevalence of NASH among patients with T2D. In fact, in a meta-analysis, the estimated prevalence of NASH among individuals with T2D was 37.3% (95% CI 24.7% to 50.0%) and that of advanced liver fibrosis was 17.0% (95% CI 7.2% to 34.8%).42
Fibrosis is often associated with NASH and has important implications for clinical outcomes; therefore, an effective treatment for NASH must, at least, prevent liver fibrosis progression.
In our study, the performance of RAB14 algorithm in diagnosing liver fibrosis presence and stages was comparable with that of ultrasound two-dimensional shear wave elastography.
Evaluation of liver stiffness with elastographic techniques is expensive, operator-dependent and machine-dependent, and may not be feasible in patients with severe obesity when the skin–liver capsule distance is higher than 5 cm or in patients with narrow intercostal spaces.43
Measuring PLIN2 and RAB14 in monocytes is inexpensive and scalable, with up to 800 samples that can be analysed in a single day. Usually, immunophenotyping is performed in fresh blood or in polymorphonuclear cells the same day or within 24 hours of collection. We demonstrated that cryopreserved blood is comparable to fresh blood for monocyte flow-cytometry studies, making it possible to postpone and centralise analyses.
PLIN2 and RAB14 may permit diagnosis of NASH and/or liver fibrosis with a simple blood test. Our biomarkers can be used in community and population studies, permitting investigation of the real prevalence of NASH and liver fibrosis. Moreover, since they require only blood sampling, they are potentially valuable tools for population-based and prevention studies in children.
Strengths of our multicentre study include that liver histology was available in all subjects. A limitation is that only Caucasian subjects were enrolled, thus limiting the generalisability of our results to other ethnicities, although no differences between ethnicities are expected. Another limitation of our study is the different prevalence of liver fibrosis in the discovery (97%) and validation (67%) cohorts (p<0.001). To cope with this, we randomly split the whole database into a discovery and a validation set to obtain a balanced number of patients with and without fibrosis when a model was built to predict fibrosis while preserving the overall class distribution of the data.
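One common way to obtain a random split that preserves the overall class distribution is a stratified split, sketched below. This is an illustrative assumption about the general technique, not the authors' actual procedure; the function `stratified_split`, the validation fraction, the seed and the toy labels are all hypothetical.

```python
import random


def stratified_split(labels, validation_fraction=0.5, seed=0):
    """Randomly split indices so each class keeps the same proportion in both sets."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    discovery, validation = [], []
    for idxs in by_class.values():
        shuffled = idxs[:]
        rng.shuffle(shuffled)  # random assignment within each class
        n_val = round(len(shuffled) * validation_fraction)
        validation.extend(shuffled[:n_val])
        discovery.extend(shuffled[n_val:])
    return sorted(discovery), sorted(validation)


# Hypothetical toy labels: 80 subjects with fibrosis (1) and 20 without (0).
labels = [1] * 80 + [0] * 20
disc, val = stratified_split(labels, validation_fraction=0.4, seed=42)

prev_disc = sum(labels[i] for i in disc) / len(disc)
prev_val = sum(labels[i] for i in val) / len(val)
print(len(disc), len(val), prev_disc, prev_val)
```

Splitting within each class separately keeps the fibrosis prevalence equal in the two sets (up to rounding), which avoids the imbalance that arises when two cohorts with different prevalences are used directly.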
In conclusion, new liquid biopsy tests that use peripheral blood monocyte PLIN2 and RAB14 as biomarkers were reliable in diagnosing NASH and/or liver fibrosis. PLIN2 and RAB14 have the potential to replace invasive liver biopsy-based histology for the diagnosis and management of NASH and liver fibrosis.
Due to the epidemic nature of metabolic liver diseases, rapid and cost-effective tests for the diagnosis of NASH and liver fibrosis can permit the study of their prevalence in the general population and the monitoring of the effects of lifestyle, surgical and pharmacological interventions.
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
Patient consent for publication
This study involves human participants and was approved by Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy (ID 2162). The study protocols were approved by the ethical committees of the Università Cattolica del Sacro Cuore, Università La Sapienza and San Camillo Hospital, all in Rome, Italy. All participants provided written informed consent at the time of enrolment and a further written consent before metabolic surgery.
We would like to thank Mrs. Anna Caprodossi, an invaluable technician.
GA and SP are joint first authors.
Contributors GM, GA, SP, CWlR, SRB and FR designed the study. SP and FP did the statistics. GA did the analyses. LC-G, GC, MR, LC, PLM, OV, MG, JRC-M and MFR carried out the study. MP, LR and MG performed the liver biopsies and hepatological follow-up. FMV, pathologist, read all liver biopsies. JRC-M reread the biopsies. GM, GA, SP, LC-G, CWlR, SRB and FR wrote the first draft. GM and SP are guarantors of the data. All authors actively contributed to the definitive version.
Funding Microbesomics: effect of gut microbiome on 'obesitypes' in human subjects (PRIN 2017 n. 2017FM74HK_004), Elucidating Pathways of Steatohepatitis (EPoS) (EPOS Horizon 2020 n. MIN-EPO-17-013), Stratification of Obese Phenotypes to Optimize Future Obesity Therapy (SOPHIA IMI n. 875534). Metadeq Inc. GM and SB acknowledge support from the Transcampus Initiative.
Competing interests GM reports consulting fees from Novo Nordisk, Fractyl Inc and Recor Inc; she is also current scientific advisor and consultant of Metadeq Limited, and current advisor and consultant of Keyron Limited, GHP Scientific Limited, and Jemyll Limited. FR reports receiving research grants from Ethicon and Medtronic; consulting fees from Novo Nordisk, Ethicon and Medtronic; serving on scientific advisory boards for GI Dynamics; and is former director and current stock option holder of Metadeq Limited and former director and current advisor of Keyron Limited and GHP Scientific Limited. CWlR reports grants from the Irish Research Council, Science Foundation Ireland, Anabio and the Health Research Board; serves on advisory boards of Novo Nordisk, Herbalife, GI Dynamics, Eli Lilly, Johnson & Johnson, Sanofi Aventis, AstraZeneca, Janssen, Bristol-Myers Squibb, Glia and Boehringer Ingelheim. CWlR is a member of the Irish Society for Nutrition and Metabolism outside the area of work commented on here, and is the chief medical officer and director of the Medical Device Division of Keyron since January 2011; both of these are unremunerated positions. CWlR is also current director, shareholder and stock option holder of Metadeq Limited, current director of GHP Scientific Limited, and was a previous investor in Keyron, which develops endoscopically implantable medical devices intended to mimic the surgical procedures of sleeve gastrectomy and gastric bypass. The product has only been tested in rodents and none of Keyron’s products are currently licensed. They do not have any contracts with other companies to put their products into clinical practice. No patients have been included in any of Keyron’s studies and they are not listed on the stock market. He continues to provide scientific advice to Keyron for no remuneration. All other authors declare no competing interests.
Patient and public involvement Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.