
Original research
Quantifying and monitoring fibrosis in non-alcoholic fatty liver disease using dual-photon microscopy
  1. Yan Wang1,
  2. Grace Lai-Hung Wong2,3,
  3. Fang-Ping He4,
  4. Jian Sun1,
  5. Anthony Wing-Hung Chan5,
  6. Jinlian Yang1,
  7. Sally She-Ting Shu2,3,
  8. Xieer Liang1,
  9. Yee Kit Tse2,3,
  10. Xiao-Tang Fan6,
  11. Jinlin Hou1,
  12. Henry Lik-Yuen Chan2,3,
  13. Vincent Wai-Sun Wong2,3
  1. 1 State Key Laboratory of Organ Failure Research, Guangdong Provincial Research Center for Liver Fibrosis, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
  2. 2 Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China
  3. 3 State Key Laboratory of Digestive Disease, The Chinese University of Hong Kong, Hong Kong, China
  4. 4 Department of Infectious Diseases, The Eighth Affiliated Hospital of Sun Yat-Sen University, Shenzhen, China
  5. 5 Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong, China
  6. 6 Department of Hepatology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
  1. Correspondence to Dr Vincent Wai-Sun Wong, Department of Medicine and Therapeutics, Chinese University of Hong Kong, Hong Kong, Hong Kong; wongv{at}cuhk.edu.hk

Abstract

Objective Fibrosis stage is strongly associated with liver-related outcomes and is a key surrogate endpoint in drug trials for non-alcoholic steatohepatitis. Dual-photon microscopy allows automated quantification of fibrosis-related parameters (q-FPs) and may facilitate large-scale histological studies. We aim to validate the performance of q-FPs in a large histological cohort.

Design 344 patients with non-alcoholic fatty liver disease (NAFLD) underwent 428 liver biopsies (240 had paired transient elastography examination). Fibrosis stage was scored using the NASH Clinical Research Network system, and q-FPs were measured by dual-photon microscopy using unstained slides. Patients were randomly assigned to the training and validation cohorts to test the performance of individual q-FPs and derive optimal cut-offs.

Results Over 25 q-FPs had area under the receiver-operating characteristics curves >0.90 for different fibrosis stages. Among them, the perimeter of collagen fibres and number of long collagen fibres had the highest accuracy. At the best cut-offs, the two q-FPs had 88.3%–96.2% sensitivity and 78.1%–91.1% specificity for different fibrosis stages in the validation cohort. q-FPs and histological scoring had nearly identical correlations with liver stiffness measurement, suggesting that the accuracy of q-FPs approached that of histological assessment. Among patients with paired liver biopsies, changes in the same q-FPs were associated with changes in fibrosis stage. At a median follow-up of 5.6 years, baseline q-FPs predicted liver-related events.

Conclusion q-FPs are highly accurate in the assessment of fibrosis in NAFLD patients. This automated platform can be used in future studies as an objective and reliable evaluation of histological fibrosis.

  • non-alcoholic steatohepatitis
  • liver fibrosis
  • cirrhosis
  • liver biopsy

Significance of this study

What is already known on this subject?

  • Liver fibrosis is strongly associated with adverse outcomes in non-alcoholic fatty liver disease.

  • Dual-photon microscopy of unstained slides can be used to evaluate features of collagen fibrils.

What are the new findings?

  • Automated quantification of fibrosis-related parameters by dual-photon microscopy has high accuracy in diagnosing fibrosis and cirrhosis.

  • Fibrosis-related parameters and histological scoring by a pathologist had a similar degree of correlation with liver stiffness measurement by transient elastography, suggesting the two had similar accuracies for the detection of liver fibrosis.

  • The lack of increase and decrease in fibrosis-related parameters had high negative predictive values in excluding fibrosis progression and regression, respectively.

  • Vessel-bridging features at baseline predict fibrosis progression during follow-up assessment.

  • Patients with increased fibrosis-related parameters at baseline had a higher risk of developing hepatocellular carcinoma and cirrhotic complications during follow-up.

How might it impact on clinical practice in the foreseeable future?

  • Accurate and automated assessment of liver fibrosis by dual-photon microscopy can facilitate clinical studies involving liver histology and allow robust comparison across studies.

Introduction

Non-alcoholic fatty liver disease (NAFLD) is currently the most common chronic liver disease, affecting a quarter of the global adult population.1 In the USA, NAFLD and its more active form—non-alcoholic steatohepatitis (NASH)—have already become the second leading indication for liver transplantation and the third most common cause of hepatocellular carcinoma.2 3 Because of the magnitude of the problem, there has been concerted effort in improving the management of NAFLD/NASH. Several drugs have entered phase 3 development and may be available for clinical use in the next 1 to 2 years.4 Along the same line, a number of biochemical and imaging biomarkers are being evaluated as non-invasive tests of NASH and liver fibrosis to facilitate prognostication, selection of patients for treatment and monitoring treatment response.5

One unique feature in NAFLD research is the heavy reliance on liver histology. A new drug for NASH may be conditionally registered if it can lead to resolution of NASH without worsening of fibrosis or improvement in fibrosis without worsening of NASH in pivotal trials using serial liver biopsies.4 Likewise, liver histology is the reference standard in the development of most non-invasive biomarkers. Among the histological features of NAFLD, fibrosis stage has the strongest association with liver-related morbidity and mortality.6 7 However, histological scoring is labour-intensive and time-consuming. It may also suffer from intraobserver and interobserver variability.8 Therefore, there has been much interest in developing automated digital imaging analysis to ensure robust comparison across studies.

In recent years, dual-photon microscopy has been used to quantify fibrosis-related parameters (q-FPs) with high accuracy in patients with chronic hepatitis B and NAFLD.9 10 The system identifies different morphological features of collagen fibrils, which correlate well with fibrosis stages. However, previous studies were limited by small sample sizes. As a result, it is unclear which q-FP correlates best with fibrosis stages and what cut-off values clinicians and researchers can apply. In the current study, we provide a complete evaluation of q-FPs in a large cohort of patients with biopsy-proven NAFLD and test the use of q-FPs to monitor and predict fibrosis progression.

Patients and methods

Study population

This was a retrospective study of a longitudinal cohort of patients who underwent liver biopsy at the Prince of Wales Hospital, Hong Kong. We included adult patients aged 18 years or above with biopsy-proven NAFLD. We excluded patients with excessive alcohol consumption (>30 g/day in men and >20 g/day in women); secondary fatty liver (eg, use of systemic steroids, methotrexate or tamoxifen); positive hepatitis B surface antigen or anti-hepatitis C virus antibody; clinical, radiological or histological evidence of other liver diseases; and malignancies in the past 5 years. The study protocol was approved by the Joint Chinese University of Hong Kong-New Territories East Cluster Clinical Research Ethics Committee. All patients provided informed written consent for liver biopsy and subsequent prospective follow-up.11

Clinical assessment

We performed clinical assessment within 1 week before liver biopsy. We recorded the medical history and measured the body weight, body height and waist circumference. Waist circumference was measured at a level midway between the lower rib margin and iliac crest with the tape all around the body in the horizontal position. Body mass index (BMI) was calculated as body weight (kg) divided by height (m) squared. Venous blood was taken after overnight fasting for at least 8 hours for liver biochemistry, plasma glucose, haemoglobin A1c and lipids.

Liver biopsy

Liver histology served as the reference standard of this study. We performed percutaneous liver biopsy using 16G Temno needles. One experienced pathologist (AWHC) evaluated and scored all biopsy samples using the NASH Clinical Research Network system.12 In particular, a five-point fibrosis staging system was used, with F0=no fibrosis, F1=perisinusoidal or periportal fibrosis, F2=perisinusoidal and portal/periportal fibrosis, F3=bridging fibrosis and F4=cirrhosis. To evaluate the reliability of histological scoring, we previously compared the blinded scores by our pathologist against experts from Europe, Australia and Cuba.7 We showed high interobserver agreement for fibrosis stage (κ 0.80–1.00) and steatosis grade (κ 0.71–0.85) and moderate agreement for lobular inflammation (κ 0.44–0.63) and hepatocyte ballooning (κ 0.53–0.75).
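For illustration, interobserver agreement of the kind reported above can be summarised with Cohen's κ. The sketch below assumes the unweighted form and uses made-up stage assignments; the text does not state whether a weighted κ was used, and the function and variable names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same samples."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of samples given the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical fibrosis stages (F0-F4) assigned by two pathologists.
local = [0, 1, 2, 3, 4, 2]
external = [0, 1, 2, 3, 4, 3]
print(round(cohen_kappa(local, external), 2))
```

For an ordinal scale such as fibrosis stage, a weighted κ that penalises two-stage disagreements more than one-stage disagreements may be preferred.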

Dual-photon microscopy

We performed dual-photon microscopy on an unstained 4 μm-thick paraffin-embedded liver biopsy section. Details of the procedure were described previously.9 10 In brief, the whole section of each slide was imaged with a 20× objective and 512×512 pixels per tile, and the images were assessed with computerised image analysis by two investigators (JY and YW). The image analysis software measured collagen fibril characteristics in operator-defined segmentation regions including (1) the entire liver section, (2) perisinusoidal, (3) vessel and (4) vessel bridges. The system then generated a batch of q-FPs by measuring the textural features of collagen fibres within these regions. Out of the batch, q-FPs with correlation coefficients >0.8 in intraobserver and interobserver agreement tests were included in the subsequent analysis. Technical procedures for the image segmentation and the profile of the q-FP set are described in online supplementary table 1. The experimental workflow from imaging to q-FP assessment is illustrated in figure 1.

Figure 1

Experimental workflow for quantitative fibrosis parameter analysis.

Our team has developed a standardised operation protocol through a series of experiments in animal models and clinical cohorts.9 10 13 14 This includes the imaging and image analysis procedures described in this manuscript. By using the standardised protocol, imaging detection reproducibility and image data-to-pathology validity were confirmed and maintained. For a biopsy section of 15–20 mm², the standardised operation protocol normally takes about 2 hours. The cost of this technique is mainly related to the microscope; there are no additional costs required for imaging and analysis. Several commercial multiphoton microscopes suitable for imaging collagen fibrils are currently available. To ensure reliable testing, centres need to build an image analysis algorithm, a standardised laboratory workflow and comprehensive clinical validation through multidisciplinary efforts.

Vibration-controlled transient elastography

A typical liver biopsy sample represents around 1:50 000 of the liver volume. Because of sampling variability, liver histology is an imperfect reference standard.15 Even if q-FPs are more accurate than histological scoring, a study using liver histology as the reference would not be able to demonstrate it. Hence, we used liver stiffness measurement as a second reference standard in this study. We performed liver stiffness measurement by vibration-controlled transient elastography (FibroScan, Echosens, Paris, France) with both the M and XL probes, in the fasting state and on the same day as the clinical assessment, as described previously.16–18 An examination was considered reliable if we obtained ≥10 valid measurements and the interquartile range-to-median ratio was ≤0.3. All operators were blinded to the diagnosis and clinical data and had performed at least 50 examinations before the beginning of the study. We used the XL probe results in the analysis if the M probe results were unreliable. When both probes yielded reliable results, we used the M probe results if the patient’s BMI was <30 kg/m² and the XL probe results if it was ≥30 kg/m². According to our recent study, the two probes generated similar liver stiffness values for patients with the same fibrosis stage when used according to the appropriate BMI.18 This was also in keeping with the automated probe selection tool in newer FibroScan models.
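As an illustration only, the reliability criterion and probe selection rule described above can be sketched in Python. The `Exam` container and function names are hypothetical. Using the M probe result when only the M probe is reliable is an assumption, since the text does not cover that case explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exam:
    """One transient elastography examination (hypothetical container)."""
    valid_measurements: int
    median_kpa: float
    iqr_kpa: float


def bmi(weight_kg: float, height_m: float) -> float:
    # Body weight (kg) divided by height (m) squared, as defined in the text.
    return weight_kg / height_m ** 2


def is_reliable(exam: Exam) -> bool:
    # Reliable if >=10 valid measurements and IQR-to-median ratio <= 0.3.
    if exam.median_kpa <= 0:
        return False
    return exam.valid_measurements >= 10 and exam.iqr_kpa / exam.median_kpa <= 0.3


def select_stiffness(m: Optional[Exam], xl: Optional[Exam], bmi_value: float) -> Optional[float]:
    """Return the liver stiffness (kPa) used for analysis, or None if neither probe is reliable."""
    m_ok = m is not None and is_reliable(m)
    xl_ok = xl is not None and is_reliable(xl)
    if m_ok and xl_ok:
        # Both reliable: M probe below 30 kg/m2, XL probe at or above it.
        return m.median_kpa if bmi_value < 30 else xl.median_kpa
    if xl_ok:
        return xl.median_kpa  # M probe unreliable: fall back to XL
    if m_ok:
        return m.median_kpa   # assumption: use M when it is the only reliable result
    return None


m_exam = Exam(valid_measurements=10, median_kpa=6.0, iqr_kpa=1.5)
xl_exam = Exam(valid_measurements=10, median_kpa=5.5, iqr_kpa=1.0)
print(select_stiffness(m_exam, xl_exam, bmi(95.0, 1.70)))
```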

Statistical analysis

We performed statistical analysis using IBM SPSS Statistics V.25 (IBM, Armonk, New York, USA) and R software V.3.5.2 (R Development Core Team, Vienna, Austria). Continuous variables were expressed as mean±SD or median (IQR) and compared using the unpaired t-test or Mann-Whitney U test as appropriate. Categorical variables were presented as n (%) and compared using the χ² test or Fisher’s exact test as appropriate. We evaluated the overall accuracy of each q-FP using area under the receiver-operating characteristics curve (AUROC) analysis. For the best performing q-FPs, we determined the best cut-offs for F2–F4, F3–F4 and F4 fibrosis based on the Youden index and calculated the corresponding sensitivities, specificities, positive and negative predictive values, and positive and negative likelihood ratios. Support vector machine models were also used to investigate the diagnostic performance for discriminating F2–F4, F3–F4 and F4 fibrosis using all q-FPs.19 Correlation among q-FPs and between histological parameters and liver stiffness was tested using Spearman’s test. In addition, the incidence of liver-related events (hepatocellular carcinoma, cirrhotic complications and liver-related deaths) was compared between patients with low and high q-FPs using the Cox proportional hazards model.
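The AUROC and Youden index steps can be sketched as follows. This is a minimal standard-library illustration, not the SPSS or R procedure used in the study. The function names and the toy data are hypothetical, and a value at or above the cut-off is assumed to count as test positive.

```python
from typing import Dict, Sequence


def auroc(values: Sequence[float], diseased: Sequence[bool]) -> float:
    """Probability that a diseased case scores higher than a non-diseased case (ties count 0.5)."""
    pos = [v for v, d in zip(values, diseased) if d]
    neg = [v for v, d in zip(values, diseased) if not d]
    if not pos or not neg:
        raise ValueError("need at least one diseased and one non-diseased case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num: float, den: float) -> float:
    # Undefined ratios (zero denominator) are reported as NaN.
    return num / den if den else float("nan")


def metrics_at(values: Sequence[float], diseased: Sequence[bool], cutoff: float) -> Dict[str, float]:
    """Diagnostic metrics when value >= cutoff is called positive."""
    pairs = list(zip(values, diseased))
    tp = sum(1 for v, d in pairs if d and v >= cutoff)
    fn = sum(1 for v, d in pairs if d and v < cutoff)
    fp = sum(1 for v, d in pairs if not d and v >= cutoff)
    tn = sum(1 for v, d in pairs if not d and v < cutoff)
    sens, spec = _ratio(tp, tp + fn), _ratio(tn, tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "lr_positive": _ratio(sens, 1 - spec),
        "lr_negative": _ratio(1 - sens, spec),
    }


def youden_cutoff(values: Sequence[float], diseased: Sequence[bool]) -> float:
    """Observed value maximising sensitivity + specificity - 1 (both classes must be present)."""
    def youden(c: float) -> float:
        m = metrics_at(values, diseased, c)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(values)), key=youden)


# Toy data: hypothetical q-FP values and reference-standard labels.
qfp = [1.0, 2.0, 2.5, 3.2, 3.5, 4.0, 5.0, 6.0]
has_f2_f4 = [False, False, False, False, True, False, True, True]
best = youden_cutoff(qfp, has_f2_f4)
print(auroc(qfp, has_f2_f4), best, metrics_at(qfp, has_f2_f4, best))
```

In practice the cut-off is derived in the training cohort only and then applied unchanged to the validation cohort.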

Since this study involves derivation and validation of q-FP cut-offs, we randomly assigned patients to the training cohort and validation cohort in a 3:2 ratio. With a sample size of 344 patients, we could determine the AUROC with a 95% CI of ±0.03 to 0.10 in both cohorts. We further validated the findings in another cohort of 46 patients with biopsy-proven NAFLD from the First Affiliated Hospital of Xinjiang Medical University (Urumqi, China). All statistical tests were two-sided. Statistical significance was defined as p<0.05.
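A random 3:2 assignment at the patient level can be sketched as below. The seed, the shuffling method and the function name are assumptions for illustration; the text does not describe how the randomisation was implemented.

```python
import random
from typing import Iterable, List, Tuple


def split_cohort(patient_ids: Iterable[int], train_fraction: float = 0.6,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patients and split them into training and validation cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (hypothetical) so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 344 patients in a 3:2 ratio gives 206 training and 138 validation patients.
training, validation = split_cohort(range(1, 345))
print(len(training), len(validation))
```

Splitting by patient rather than by biopsy keeps paired biopsies from the same patient in the same cohort.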

Results

From November 2003 to November 2017, we performed 1452 liver biopsies for patients with different liver diseases, of whom 344 patients had biopsy-proven NAFLD and fulfilled the inclusion and exclusion criteria. Two hundred and six patients were randomly assigned to the training cohort and 138 to the validation cohort (table 1). The patient characteristics were similar in both cohorts with a mean age of 51 years and 58% being males. Diabetes and hypertension were present in 60% and 55% of patients, respectively. One hundred and forty-eight (43%) patients had F2–F4 fibrosis. The median length of liver biopsy samples was 21 mm (IQR 16–25 mm), containing a median of 8 (6–10) portal tracts.

Table 1

Clinical characteristics of NAFLD patients

Diagnostic accuracy of q-FPs in the training cohort

Online supplementary table 2 lists the AUROC of all 70 q-FPs by dual-photon microscopy in the training cohort. Among them, 29 q-FPs had AUROC >0.90 for F2–F4 fibrosis, suggesting excellent diagnostic accuracies (table 2). The top 28 q-FPs on the same list also had AUROC >0.90 for F3–F4 and F4 fibrosis. The selected parameters included the total collagen content as well as features of collagen fibres such as their perimeter, length, width, area and numbers, either globally or at the perivessel area. The top q-FPs were highly correlated with each other (Spearman correlation coefficients >0.90 for the top candidates; online supplementary table 3), suggesting a high degree of collinearity.

Table 2

AUROC of the 29 best performing fibrosis-related parameters in the training and validation cohorts

The top two q-FPs, perimeter of collagen fibres (StrPerimeter) and number of long collagen fibres (NoLongStr), increased with fibrosis stage in a stepwise manner (figure 2). At a cut-off of 3.01, StrPerimeter had sensitivity, specificity, and positive and negative predictive values of 93.2%, 87.3%, 84.5% and 94.5% for F2–F4 fibrosis, respectively (table 3). The corresponding values were 87.1%, 87.5%, 75.0% and 94.0% for F3–F4 fibrosis at a cut-off of 3.77 and 93.9%, 90.8%, 66.0% and 98.7% for F4 fibrosis at a cut-off of 5.14, respectively.

Figure 2

Quantitation of fibrosis-related parameters by fibrosis stage in the (A, C) training cohort (n=206) and (B, D) validation cohort (n=138). Circles and asterisks indicate outliers and extreme values, respectively.

Table 3

Diagnostic performance of StrPerimeter and NoLongStr in the training and validation cohorts

At a cut-off of 0.0046, NoLongStr had sensitivity, specificity, and positive and negative predictive values of 90.9%, 88.1%, 85.1% and 92.9% for F2–F4 fibrosis, respectively (table 3). The corresponding values were 87.1%, 88.2%, 76.1% and 94.1% for F3–F4 fibrosis at a cut-off of 0.0056 and 93.9%, 91.3%, 67.4% and 98.8% for F4 fibrosis at a cut-off of 0.0079, respectively.
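To show how the derived cut-offs might be applied to a new measurement, the following sketch encodes the values reported above. The dictionary layout and function name are hypothetical. Treating a value at or above the cut-off as positive is an assumption that matches the "≥" convention used in the prognostic analysis.

```python
from typing import Dict

# Training-cohort cut-offs as reported in the text (table 3).
CUTOFFS: Dict[str, Dict[str, float]] = {
    "StrPerimeter": {"F2-F4": 3.01, "F3-F4": 3.77, "F4": 5.14},
    "NoLongStr": {"F2-F4": 0.0046, "F3-F4": 0.0056, "F4": 0.0079},
}


def flag_fibrosis(parameter: str, value: float) -> Dict[str, bool]:
    """For each fibrosis target, report whether the q-FP value reaches the cut-off."""
    return {target: value >= cutoff for target, cutoff in CUTOFFS[parameter].items()}


# Hypothetical measurement: above the F3-F4 cut-off but below the F4 cut-off.
print(flag_fibrosis("StrPerimeter", 4.0))
```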

Combining both StrPerimeter and NoLongStr resulted in modest improvement in diagnostic accuracy for F2–F4 fibrosis but no improvement for F3–F4 or F4 fibrosis. The inclusion of all 70 q-FPs using support vector machine model likewise resulted in marginal improvement in the diagnostic accuracy for different fibrosis stages (online supplementary table 4).

Diagnostic accuracy of q-FPs in the validation cohorts

Among the 29 q-FPs selected in the training cohort, the top 27 also had AUROC >0.90 for F2–F4 fibrosis in the Hong Kong validation cohort (table 2). All 29 q-FPs including StrPerimeter and NoLongStr had AUROC >0.90 for F3–F4 and F4 fibrosis. StrPerimeter and NoLongStr both increased with fibrosis stage in a stepwise manner (figure 2).

Using the cut-offs derived in the training cohort, StrPerimeter had sensitivity, specificity, and positive and negative predictive values of 91.7%, 78.2%, 76.4% and 92.4% for F2–F4 fibrosis, respectively (table 3). The corresponding values were 95.2%, 82.3%, 70.2% and 97.5% for F3–F4 fibrosis and 96.2%, 91.1%, 71.4% and 99.0% for F4 fibrosis, respectively.

Using the cut-offs derived in the training cohort, NoLongStr had sensitivity, specificity, and positive and negative predictive values of 88.3%, 78.2%, 75.7% and 89.7% for F2–F4 fibrosis, respectively (table 3). The corresponding values were 92.9%, 78.1%, 65.0% and 96.2% for F3–F4 fibrosis and 92.3%, 90.2%, 68.6% and 98.1% for F4 fibrosis, respectively.

Combining q-FPs did not improve the diagnostic accuracy for any fibrosis stage. Likewise, the inclusion of all 70 q-FPs using support vector machine model resulted in marginal improvement in the diagnostic accuracy for different fibrosis stages (online supplementary table 4).

Online supplementary table 5 shows the clinical characteristics of 46 patients with biopsy-proven NAFLD in the second validation cohort from Urumqi. Compared with the Hong Kong cohort, the Urumqi cohort was younger and had a lower metabolic burden. Fewer patients in the Urumqi cohort had type 2 diabetes and hypertension. They had lower BMI, steatosis grade and fibrosis stage. Nonetheless, the q-FPs StrPerimeter and NoLongStr again showed excellent accuracies in diagnosing F2–F4 and F3–F4 fibrosis, with AUROC of 0.925–0.970 (online supplementary table 6). At the cut-offs derived in the training cohort, both q-FPs had 100% sensitivity and negative predictive value in excluding F2–F4 and F3–F4 fibrosis. The low positive predictive value for F3–F4 fibrosis was due to the small number of patients with advanced fibrosis in this cohort.

Correlations with liver stiffness measurement

Two hundred and forty patients had reliable liver stiffness measurement. StrPerimeter had good correlation with fibrosis stages in both the training and validation cohorts (Spearman’s rho 0.813 and 0.852, respectively; figure 3). StrPerimeter also had moderate correlation with liver stiffness measurement (Spearman’s rho 0.716 in the training cohort and 0.682 in the validation cohort). For comparison, fibrosis stage had a similar degree of correlation with liver stiffness measurement (Spearman’s rho 0.715 in the training cohort and 0.675 in the validation cohort).

Figure 3

Correlation between (A, B) StrPerimeter and histological fibrosis stage, (C, D) StrPerimeter and liver stiffness by transient elastography and (E, F) liver stiffness and histological fibrosis stage. Panels A, C and E represent the training cohort, whereas panels B, D and F represent the validation cohort. Data best-fit line by least squares method (dashed line) and Spearman’s rho correlation coefficient are indicated.
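Spearman's rho, as used for these correlations, is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a standard-library illustration with made-up data; the function names are hypothetical and it is not the statistical software used in the study.

```python
from typing import List, Sequence


def average_ranks(xs: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    if len(x) != len(y):
        raise ValueError("inputs must have equal length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical fibrosis stages (with ties) and q-FP values.
stage = [0, 1, 1, 2, 3, 4]
qfp_value = [1.2, 2.0, 2.5, 3.1, 4.0, 6.0]
print(round(spearman_rho(stage, qfp_value), 3))
```

Because fibrosis stage takes only five values, ties are unavoidable, which is why the average-rank handling matters here.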

Using q-FPs to assess fibrosis change

Ninety-seven patients had two liver biopsies at a median interval of 39 (22–50) months. Twenty of 84 patients with F0–F3 disease at baseline had fibrosis progression of one stage or more. Fourteen of 63 patients with F1–F4 disease at baseline had fibrosis regression of one stage or more (online supplementary table 7). There was no significant difference in the baseline characteristics of patients who did and did not have fibrosis progression and regression.

Similar to the baseline liver biopsies, the top q-FPs had excellent accuracy in diagnosing F2–F4, F3–F4 and F4 fibrosis on the follow-up biopsy samples (online supplementary table 8). Again, StrPerimeter and NoLongStr were among the top performing q-FPs.

Table 4 summarises the use of q-FPs to demonstrate fibrosis changes at the two liver biopsies. Depending on the percentage reduction, StrPerimeter had 42%–67% sensitivity and 56%–86% specificity in detecting fibrosis regression. The corresponding sensitivity and specificity for NoLongStr were 33%–42% and 56%–86%, respectively. Furthermore, an increase in StrPerimeter had 55%–75% sensitivity and 55%–72% specificity in detecting fibrosis progression. The corresponding sensitivity and specificity for NoLongStr were 55%–75% and 50%–75%, respectively.

Table 4

Operating characteristics of relative change in StrPerimeter and NoLongStr to predict ≥1-stage reduction or increase in fibrosis between baseline and follow-up
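The relative change between paired biopsies can be computed as sketched below. The function names and the example threshold are hypothetical; the specific percentage thresholds evaluated in table 4 are not reproduced here. Strict inequalities are an assumption, chosen so that a 0% threshold flags any increase or any decrease.

```python
def percent_change(baseline: float, follow_up: float) -> float:
    """Relative change in a q-FP from baseline to follow-up, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (follow_up - baseline) / baseline * 100.0


def suggests_progression(baseline: float, follow_up: float, threshold_pct: float = 0.0) -> bool:
    # Positive when the q-FP rose by more than the threshold.
    return percent_change(baseline, follow_up) > threshold_pct


def suggests_regression(baseline: float, follow_up: float, threshold_pct: float = 0.0) -> bool:
    # Positive when the q-FP fell by more than the threshold.
    return percent_change(baseline, follow_up) < -threshold_pct


# Hypothetical paired StrPerimeter values with a hypothetical 20% threshold.
print(percent_change(4.0, 5.0), suggests_progression(4.0, 5.0, threshold_pct=20.0))
```

A larger threshold trades sensitivity for specificity, which is the pattern table 4 summarises.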

We further explored the possibility of using baseline q-FPs to predict fibrosis change. As shown in online supplementary table 9, the AUROC of baseline q-FPs in predicting fibrosis progression ranged from 0.447 to 0.678. The top features discriminating between progressors and non-progressors include the area, width and length of vessel-bridging or aggregated vessel-bridging collagen fibres; and the number of long and thin vessel-bridging or aggregated vessel-bridging collagen fibres. In contrast, 21 baseline q-FPs had AUROC >0.70 in predicting fibrosis regression (online supplementary table 10). The top q-FPs for fibrosis regression include the number of thin perivessel collagen fibres, solidity of collagen fibres and area of vessel-bridging collagen fibres.

Prognostic significance of q-FPs

During a median follow-up of 5.6 years (IQR 4.6–9.9), 12 patients developed liver-related events (3 had hepatocellular carcinoma and 9 had cirrhotic complications; 2 died of liver causes). Liver-related events occurred in 11 of 129 (8.5%) patients with StrPerimeter ≥3.77 and 1 of 215 (0.5%) patients with StrPerimeter <3.77 (p=0.004) and 11 of 131 (8.4%) patients with NoLongStr ≥0.0056 and 1 of 213 (0.5%) patients with NoLongStr <0.0056 (p<0.001) (figure 4).

Figure 4

Incidence of liver-related events in patients stratified by (A) StrPerimeter and (B) NoLongStr. The stratification was based on the best cut-offs for the diagnosis of F3–F4 disease as shown in table 3.
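As a simple arithmetic check, the crude event proportions quoted above follow directly from the reported counts. The sketch below only reproduces those percentages. It does not account for follow-up time and is not the Cox model used for the comparison; the names are hypothetical.

```python
def percent(events: int, total: int) -> float:
    """Crude proportion of patients with an event, in percent, to one decimal place."""
    return round(100.0 * events / total, 1)


# Counts as reported in the text, stratified by the F3-F4 cut-offs.
groups = {
    "StrPerimeter >= 3.77": (11, 129),
    "StrPerimeter < 3.77": (1, 215),
    "NoLongStr >= 0.0056": (11, 131),
    "NoLongStr < 0.0056": (1, 213),
}
for label, (events, total) in groups.items():
    print(label, percent(events, total))
```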

Factors associated with discrepant results

Among patients with F0–F2 disease, high StrPerimeter and NoLongStr were associated with higher aspartate aminotransferase level, liver stiffness measurement by transient elastography, hepatocyte ballooning and F2 (instead of F0–F1) fibrosis (online supplementary table 11). Among patients with F3–F4 disease, low StrPerimeter and NoLongStr were associated with F3 instead of F4 fibrosis.

Discussion

In this large longitudinal cohort of patients with biopsy-proven NAFLD, q-FPs by dual-photon microscopy showed high accuracy in determining the fibrosis stage. The q-FPs StrPerimeter and NoLongStr had consistently high accuracy in the training cohort, validation cohort and follow-up liver biopsy samples. We further validated the findings in a separate validation cohort with a lower metabolic burden and less severe histology. The validated cut-offs facilitate the application of q-FPs as an objective measurement of liver fibrosis in future studies. In contrast, none of the baseline q-FPs had sufficient accuracy in predicting fibrosis progression over time.

Previous longitudinal studies have consistently shown a close association between fibrosis stage and liver-related morbidity and mortality.6 7 20 Accordingly, fibrosis improvement and prevention of progression to cirrhosis are among the most important endpoints in NASH clinical trials.4 It is therefore important to score fibrosis stage accurately. In this large histological study, we confirmed the high accuracy of q-FPs in staging fibrosis in NAFLD patients. Extending previous works, we further provided a complete evaluation of 70 q-FPs and derived and validated cut-offs of two top performing q-FPs. StrPerimeter and NoLongStr demonstrated consistent and robust diagnostic performance across fibrosis stages and subgroups. Compared with our previous work on a small histological cohort from the USA,10 the current study has a more balanced sex distribution and included patients with a lower average BMI, the latter being typical for Asian NAFLD patients.11 21 However, both cohorts included a sizeable proportion of patients with significant fibrosis. The number, eccentricity and length of collagen fibres were identified in both studies to be discriminative across fibrosis stages.

One difficulty of evaluating fibrosis tests is the lack of a real gold standard. Evaluation of liver histology is limited by intraobserver and interobserver variability. In a previous modelling study, even if the sensitivity and specificity of liver biopsy are 90%, the AUROC of a perfect test would only be 0.90.22 Hence, we performed the ‘fair umpire’ test by comparing the correlation of q-FPs and histological fibrosis staging against liver stiffness measurement by vibration-controlled transient elastography. The concept of a ‘fair umpire’ test is based on a third test that is reasonably discriminating for the disease of interest and mechanistically unrelated to the comparator tests.23 When these criteria are fulfilled, a more accurate test should demonstrate a stronger correlation with the ‘umpire’ test. In this study, both histological fibrosis staging and q-FPs had similar correlation with liver stiffness, suggesting that the accuracy of q-FPs is actually similar to that by histological evaluation.
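The ceiling imposed by an imperfect reference can be illustrated with a short calculation. This sketch assumes a dichotomous perfect test that always reports the true disease state, a disease prevalence of 50% and a reference with 90% sensitivity and specificity. The cited modelling study may have used a different set-up, and the function name is hypothetical.

```python
def apparent_auroc_of_perfect_test(ref_sens: float, ref_spec: float, prevalence: float) -> float:
    """AUROC of a perfect binary test when scored against an imperfect reference."""
    # Joint probabilities of (true state, reference result).
    diseased_ref_pos = prevalence * ref_sens
    diseased_ref_neg = prevalence * (1 - ref_sens)
    healthy_ref_neg = (1 - prevalence) * ref_spec
    healthy_ref_pos = (1 - prevalence) * (1 - ref_spec)
    # The perfect test equals the true state, so against the reference:
    apparent_sens = diseased_ref_pos / (diseased_ref_pos + healthy_ref_pos)
    apparent_spec = healthy_ref_neg / (healthy_ref_neg + diseased_ref_neg)
    # For a binary test, AUROC is the mean of sensitivity and specificity.
    return (apparent_sens + apparent_spec) / 2


print(round(apparent_auroc_of_perfect_test(0.90, 0.90, 0.50), 2))
```

Under these assumptions the apparent AUROC is 0.90 even though the test itself makes no errors; the exact ceiling shifts with prevalence.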

We also included a subgroup of patients with paired liver biopsies over time. In clinical trials, it is important to evaluate fibrosis change between biopsies accurately. In this study, the diagnostic performance of q-FPs was similar for baseline and follow-up liver biopsies. As expected, a bigger change in q-FPs over time was associated with a higher specificity in detecting changes in fibrosis stage (table 4). In particular, the negative predictive value of a 0% change in the q-FPs StrPerimeter and NoLongStr was 75%–88% for both fibrosis progression and regression. In other words, a patient without any increase in q-FPs is unlikely to have an increase in fibrosis stage and vice versa.

Our findings can be readily applied in future clinical studies. In a typical phase 3 study, a central pathologist has to score thousands of biopsy samples.4 The availability of an objective and accurate automated scoring method would facilitate and speed up the work of pathologists. Compared with computed morphometry,24 dual-photon microscopy is equally objective and can assess different features of collagen fibres (online supplementary table 1). The latter also does not require staining. In fact, a recent phase 2 study on the use of a thyroid hormone receptor-beta agonist in NASH has used second harmonic generation microscopy to demonstrate fibrosis improvement in the treatment group.25 Our study provides the evidence base for the application of this new technology.

Moreover, we explored the use of baseline q-FPs to predict fibrosis progression. A provocative study of 71 patients with chronic hepatitis B undergoing paired liver biopsies after entecavir treatment suggests that some histological features can predict fibrosis progression and regression.26 In particular, patients with thick/broad/loose/pale septa with inflammation are predominantly progressive, whereas those with delicate/thin/dense/splitting septa are predominantly regressive. In our study, although the individual q-FPs had modest accuracy in predicting fibrosis progression, it is interesting to note that the most discriminating features were invariably characteristics of vessel-bridging collagen fibres. Nonetheless, focusing on features of fibrous tissue alone is unlikely to be sufficient because fibrosis change is largely driven by NASH activity.27 Future studies should develop techniques to capture the other features of NASH.

While this study focuses on the use of q-FPs to diagnose fibrosis and cirrhosis, it should be noted that the range of q-FPs can extend well beyond the cirrhotic cut-offs. Among patients with F4 disease in this study, the median StrPerimeter was 8.38 (IQR 6.36–12.94; maximum value 33.16), and the median NoLongStr was 0.0127 (IQR 0.009–0.0175; maximum value 0.0475). The values in the upper end may have additional prognostic significance. In our study, patients with high q-FPs had increased incidence of liver-related events (figure 4). Indeed, the Laennec staging system, which further divides F4 into three groups, has been shown to predict liver-related events.28

Our study has the strengths of a large sample size, inclusion of paired biopsy samples and histological evaluation by a single experienced pathologist. It also has a few limitations. First, liver biopsy is not a perfect gold standard. We therefore performed correlation analysis against liver stiffness measurement and showed that the q-FPs probably had similar accuracy to histological staging. Second, because this is a real-life cohort, patients underwent follow-up liver biopsies at variable intervals. Because the clinical conditions could change over time, baseline q-FPs would be less accurate in predicting fibrosis change if the interval between two biopsies was long. Third, all patients were ethnic Chinese. Although we have also tested the technique in another American cohort, further validation studies are welcomed.10 Finally, there is a small possibility that the characteristics of collagen fibres may be different between treated and untreated patients. Therefore, future clinical trials should confirm the diagnostic accuracy of q-FPs during NASH treatment.

In conclusion, q-FPs by dual-photon microscopy are highly accurate in the assessment of fibrosis stage in NAFLD patients. This automated platform can be used in future clinical trials and observational studies as an objective and reliable evaluation of histological fibrosis. Features of vessel-bridging collagen fibres at baseline predict future fibrosis progression with modest accuracy. Future studies should refine the technique to evaluate the other histological features of NASH.

References

Footnotes

  • Contributors YW and VW-SW designed the study. GL-HW, SS and VW-SW (HK cohort), and FH and X-TF (Urumqi cohort) collected the clinical data. YW and JY performed dual-photon microscopy. AWHC evaluated and scored all liver biopsy samples. YKT and VW-SW performed statistical analysis. YW, YKT and VW-SW drafted the manuscript. JS, XL, JH and HLYC provided administrative support. All authors contributed to and approved the final version of the manuscript.

  • Funding This study was partly supported by the National Natural Science Foundation of China (81670522) to YW and the Guangzhou Science and Technology Plan Project (201607020019) to JH.

  • Competing interests VW-SW has served as an advisor/consultant for AbbVie, Allergan, Echosens, Gilead Sciences, Janssen, Novo Nordisk, Perspectum Diagnostics, Pfizer and Terns; and a speaker for Bristol-Myers Squibb, Echosens, Gilead Sciences and Merck. GL-HW and HLYC have served as speakers for Echosens. The other authors report no conflicts of interest.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available upon reasonable request.
