Objective Liver biopsy is still needed for fibrosis staging in many patients with non-alcoholic fatty liver disease. The aims of this study were to evaluate the individual diagnostic performance of liver stiffness measurement by vibration controlled transient elastography (LSM-VCTE), Fibrosis-4 Index (FIB-4) and NAFLD (non-alcoholic fatty liver disease) Fibrosis Score (NFS) and to derive diagnostic strategies that could reduce the need for liver biopsies.
Design An individual patient data meta-analysis of studies evaluating LSM-VCTE against liver histology was conducted. FIB-4 and NFS were computed where possible. Sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) were calculated. Biomarkers were assessed individually and in sequential combinations.
Results Data were included from 37 primary studies (n=5735; 45% women; median age: 54 years; median body mass index: 30 kg/m2; 33% had type 2 diabetes; 30% had advanced fibrosis). AUROCs of individual LSM-VCTE, FIB-4 and NFS for advanced fibrosis were 0.85, 0.76 and 0.73. Sequential combination of FIB-4 cut-offs (<1.3; ≥2.67) followed by LSM-VCTE cut-offs (<8.0; ≥10.0 kPa) to rule-in or rule-out advanced fibrosis had sensitivity and specificity (95% CI) of 66% (63–68) and 86% (84–87) with 33% needing a biopsy to establish a final diagnosis. FIB-4 cut-offs (<1.3; ≥3.48) followed by LSM cut-offs (<8.0; ≥20.0 kPa) to rule out advanced fibrosis or rule in cirrhosis had a sensitivity of 38% (37–39) and specificity of 90% (89–91) with 19% needing biopsy.
Conclusion Sequential combinations of markers with a lower cut-off to rule-out advanced fibrosis and a higher cut-off to rule-in cirrhosis can reduce the need for liver biopsies.
- hepatic fibrosis
- fatty liver
- clinical decision making
Data availability statement
Anonymised individual patient data are available upon reasonable request and with the agreement of the authors of original studies.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Significance of this study
What is already known on this subject?
Patients with non-alcoholic fatty liver disease (NAFLD) and advanced fibrosis (F3–4) are at risk of disease progression and adverse clinical outcomes.
Non-invasive tests with predefined cut-offs are used as screening biomarkers to identify those at low risk of advanced fibrosis who can be safely managed in primary care.
Liver biopsy is still needed in secondary care to further identify those with cirrhosis who would benefit from surveillance for hepatocellular cancer and screening for oesophageal varices.
What are the new findings?
Existing non-invasive test cut-offs are validated for use as screening biomarkers to rule out advanced fibrosis in a study group of 5735 patients.
The sequential combination of Fibrosis-4 Index (FIB-4) (<1.3; ≥2.67) and liver stiffness measurement by vibration controlled transient elastography (LSM-VCTE) (<8.0 kPa; ≥10.0 kPa), which is increasingly used in routine practice, has a false negative rate of 9% for advanced fibrosis.
The diagnostic performance of LSM-VCTE for advanced fibrosis is influenced by biopsy quality, body mass index and presence of type 2 diabetes.
An algorithm that applies FIB-4 and LSM-VCTE sequentially, with lower cut-offs to rule out advanced fibrosis (FIB-4 <1.3; LSM-VCTE <8.0 kPa) and upper cut-offs to rule in cirrhosis without liver biopsy at a specificity of 95% (FIB-4 ≥3.48; LSM-VCTE ≥20.0 kPa) or 98% (FIB-4 ≥4.63; LSM-VCTE ≥28.0 kPa), can reduce the need for liver biopsies from 33% to 19% or 24%, respectively.
How might it impact on clinical practice in the foreseeable future?
The non-invasive test cut-offs for the diagnosis of cirrhosis can be incorporated into clinical practice as they have been validated in a large group of patients.
Application of these cut-offs can lead to a decrease in the need for liver biopsies in secondary care.
Non-alcoholic fatty liver disease (NAFLD) is the hepatic manifestation of the metabolic syndrome with high prevalence worldwide.1 Most patients remain asymptomatic for long periods of time (years/decades) with slowly progressive disease, but a minority2 progress to cirrhosis, liver failure and hepatocellular carcinoma (HCC).
NAFLD comprises several histological features ranging from simple steatosis to steatosis with lobular inflammation and ballooned hepatocytes (steatohepatitis), both of which can be accompanied by varying degrees of fibrosis. The currently accepted reference standard for diagnosing NAFLD is liver biopsy as its diagnostic features are based on histology.3 Liver biopsy, however, is invasive, carries a risk of complications,4 and is limited by sampling variability5 and high observer-dependent variability in pathological reporting.6 7
NAFLD is often diagnosed after incidental findings of elevated liver transaminases on blood tests, or liver steatosis or cirrhosis on imaging. One challenge clinicians face is to identify which of these patients are at high risk of progression or clinical outcomes, as they would benefit from specialist follow-up. There is now substantial evidence showing that those with at least advanced fibrosis (F3–4) are at higher risk of liver-related events in later life.8–10
A large body of evidence also exists on how non-invasive tests (NITs) could be used to risk-stratify patients for the presence of advanced fibrosis. These approaches usually involve sequential application of two NITs, with the first tier of a simple, inexpensive, serum-based test performed in the community (eg, Fibrosis-4 Index (FIB-4) or NAFLD Fibrosis Score (NFS)), followed by a second tier of liver stiffness measurement (LSM) (eg, vibration controlled transient elastography: VCTE), or a proprietary serum-based test (eg, enhanced liver fibrosis test; ELF). A lower and an upper threshold are usually used in each tier of testing to rule out (those with a NIT result less than the lower threshold) or rule in (those with a NIT result more than the upper threshold) patients at high risk of advanced fibrosis. Patients with indeterminate results in both tiers of testing would need a liver biopsy for risk stratification. The main value of these approaches lies in their high negative predictive value to rule out patients with low risk of advanced fibrosis who can be safely managed in primary care.
Despite the increasing evidence to support these approaches, some aspects of their application require further clarifications. First, there is no consensus on which NIT thresholds to use for this purpose. For example, FIB-4 upper cut-offs of 3.2511 and 2.6712 have been described, while other investigators omit the FIB-4 upper cut-off altogether.13 There is also some uncertainty about the performance of NITs in specific patient subgroups, such as those with diabetes or obesity. Furthermore, for patients who are ruled in as being at high risk of advanced fibrosis (F3–4), liver biopsy is often needed to identify those with cirrhosis who would need surveillance for HCC.14 Developing approaches that can minimise the need for liver biopsy in secondary care is therefore an area of unmet need.
To address these problems, we conducted an individual patient data meta-analysis (IPDMA) with three main aims: (1) to evaluate the performance of LSM-VCTE and compare it to the performance of FIB-4 and NFS as screening tests to rule out advanced fibrosis; (2) to evaluate NIT combination strategies to minimise the number of cases that would need a liver biopsy in secondary care; (3) to explore factors that influence diagnostic accuracy.
This IPDMA was reported in accordance with the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses-IPD Statement15 and was registered as PROSPERO CRD42019157661.
Criteria for considering studies for the IPD meta-analysis
Studies reporting data on adults (≥18 years) with NAFLD and paired liver histology and LSM-VCTE were eligible. When studies reported study groups of participants with unselected aetiologies, only IPD of those with NAFLD were sought.
The index test of main interest was LSM-VCTE performed with FibroScan (Echosens, France). Results for serum-based biomarkers (NFS,16 FIB-4,17 aspartate aminotransferase (AST) to alanine aminotransferase (ALT) ratio18 and AST-to-platelet ratio index (APRI)19) were also computed where data were available. Online supplemental table 1 summarises the definition of NITs considered in this IPDMA.
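For orientation, the serum-based scores named above can be sketched in a few lines. The definitions actually used in this IPDMA are those in online supplemental table 1, which is not reproduced here; the formulas below are the commonly published ones and are shown on that assumption. All function and parameter names are illustrative.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """Fibrosis-4 Index: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def ast_alt_ratio(ast_u_l, alt_u_l):
    """AST-to-ALT ratio."""
    return ast_u_l / alt_u_l


def apri(ast_u_l, ast_upper_limit_normal_u_l, platelets_10e9_l):
    """AST-to-platelet ratio index: (AST / upper limit of normal) x 100 / platelets."""
    return (ast_u_l / ast_upper_limit_normal_u_l) * 100.0 / platelets_10e9_l


def nafld_fibrosis_score(age_years, bmi_kg_m2, hyperglycaemia, ast_u_l,
                         alt_u_l, platelets_10e9_l, albumin_g_dl):
    """NAFLD Fibrosis Score (commonly published coefficients).

    hyperglycaemia: True if impaired fasting glucose or diabetes is present.
    """
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if hyperglycaemia else 0)
            + 0.99 * ast_alt_ratio(ast_u_l, alt_u_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)
```

Units matter: platelets are in 10^9/L, aminotransferases in U/L and albumin in g/dL; values recorded in other units would need converting before use.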
Universally accepted cut-offs for diagnosing different groups of fibrosis stages do not exist (several suggested cut-offs are presented in online supplemental table 2). For LSM-VCTE, <7.9 kPa and ≥9.6 kPa are the most commonly used cut-offs for ruling out and ruling in advanced fibrosis, respectively.20
Only studies reporting histological classification of liver fibrosis based on the non-alcoholic steatohepatitis Clinical Research Network (NASH CRN) staging system were considered.21
Advanced fibrosis (F3–4) and cirrhosis (F4) were the target conditions of interest. To fulfil the aims of the study, cut-offs were selected to rule out or rule in advanced fibrosis, and to rule out advanced fibrosis or rule in cirrhosis.
All study designs were considered if they were reporting on patients with NAFLD undergoing both liver biopsy and LSM-VCTE within 6 months. No language restrictions were applied.
Authors of eligible studies were contacted by email and reminders were sent if a response was not received within 2 weeks. Only data from studies that received ethical approval were used. Additional ethical approval was not sought for the meta-analysis as only anonymised data were provided.
Range checks of measurement values provided for individual patients were carried out and authors were asked to provide clarifications where necessary. Missing data were queried until received or confirmed as unavailable. Missing data were handled in the analysis by pairwise deletion.
LSM-VCTE measurements with a median stiffness ≥7.1 kPa and an IQR-to-median LSM ratio >30% were considered unreliable.22 These were included in the main analysis and were later compared with reliable measurements in a subgroup analysis, to assess whether they can be used to diagnose advanced fibrosis.
Authors were provided with a template table of required data (online supplemental table 3) and were asked to deduplicate data where possible. We also checked for duplicate entries and removed any that were identified.
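The reliability rule stated above can be written as a small predicate. This is a minimal sketch: the function name is hypothetical, and the handling of a ratio of exactly 0.30 is an assumption (it is treated as reliable here, following the ">30%" wording, whereas the subgroup definition in the statistical methods uses "≥0.30").

```python
def lsm_is_reliable(median_kpa, iqr_kpa):
    """Return True if an LSM-VCTE examination counts as reliable.

    Unreliable: median stiffness >= 7.1 kPa AND IQR/median > 0.30.
    Any examination with median < 7.1 kPa is treated as reliable.
    """
    if median_kpa < 7.1:
        return True
    return (iqr_kpa / median_kpa) <= 0.30
```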
Quality and bias assessment
The quality of studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2).23
The original data sets were merged, a study identification variable was added, and descriptive statistical analysis of the data sets was conducted. Dichotomous variables are displayed as percentages. Continuous variables are reported as means with SD, or medians with IQRs according to the distribution of the data.
Analyses were done per protocol, as we did not have information on failed LSM-VCTE. To express the diagnostic performance of NITs, non-parametric, empirical receiver operating characteristic (ROC) curves were constructed for the target conditions of interest. Diagnostic performance was expressed as the area under the ROC curve (AUROC) with 95% CI, based on DeLong’s method. AUROCs were compared using DeLong’s test statistic.
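As an illustration of what the empirical AUROC measures, it equals the probability that a randomly chosen patient with the target condition has a higher test value than a randomly chosen patient without it, with ties counted as one half. The sketch below computes that quantity directly; it does not implement DeLong's variance estimate or test, and the function name is hypothetical.

```python
def empirical_auroc(values_with_condition, values_without_condition):
    """Empirical (non-parametric) AUROC via pairwise comparison.

    Each (with, without) pair scores 1 if the patient with the condition
    has the higher value, 0.5 for a tie and 0 otherwise.
    """
    score = 0.0
    for d in values_with_condition:
        for h in values_without_condition:
            if d > h:
                score += 1.0
            elif d == h:
                score += 0.5
    return score / (len(values_with_condition) * len(values_without_condition))
```

The double loop is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulations give the same value more efficiently.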
Thresholds to maximise the Youden index (ie, sensitivity+specificity−1), for 90% sensitivity, and for 90% specificity were reported. The diagnostic performance of previously published cut-offs was also evaluated. Sequential combinations of serum biomarkers and LSM-VCTE were evaluated, by computing sensitivity, specificity and proportions of misclassified and indeterminate patients.
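The Youden-based threshold search can be sketched as follows, assuming a test is called positive when its value is at or above the threshold and that every observed value is tried as a candidate. Function names are illustrative and ties in the index are resolved in favour of the lowest threshold, which is an arbitrary choice.

```python
def sens_spec_at(threshold, values, has_condition):
    """Sensitivity and specificity when 'value >= threshold' is test-positive."""
    tp = fn = tn = fp = 0
    for value, diseased in zip(values, has_condition):
        positive = value >= threshold
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(values, has_condition):
    """Return (threshold, J, sensitivity, specificity) maximising
    J = sensitivity + specificity - 1 over all observed values."""
    best = None
    for t in sorted(set(values)):
        sens, spec = sens_spec_at(t, values, has_condition)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best
```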
Positive and negative predictive values (PPV and NPV) were estimated for prevalences within the range of those reported in the original studies. The number of false positive and false negative results for 100 theoretical cases was also reported.
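Predictive values at a chosen prevalence follow from sensitivity and specificity by Bayes' theorem. A minimal sketch is given below; the function names are hypothetical, and any inputs used with it should be treated as illustrative rather than as results of this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence of the target condition."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


def errors_per_100(sensitivity, specificity, prevalence):
    """Expected (false positives, false negatives) among 100 theoretical cases."""
    false_pos = 100 * (1 - specificity) * (1 - prevalence)
    false_neg = 100 * (1 - sensitivity) * prevalence
    return false_pos, false_neg
```

For example, `predictive_values(0.80, 0.90, 0.30)` uses made-up inputs to show how a test with fixed sensitivity and specificity yields different predictive values as prevalence changes across the range being examined.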
The main analysis was conducted to maximise data for each NIT. For a valid comparison of the performance of NITs, a separate analysis was conducted in the subgroup of patients where all three of VCTE, FIB-4 and NFS were available in each participant.
To fulfil the aim of developing testing strategies that reduce the number of patients in need of a liver biopsy, lower cut-offs for ruling out advanced fibrosis and upper cut-offs for ruling in cirrhosis were used. The rationale for this approach is illustrated in online supplemental figure 1. The upper cut-offs for identifying cirrhosis were chosen at 95% and 98% specificity in a derivation set and tested in a validation set. Derivation and validation sets were obtained by random sampling from the IPD study group in a 3:2 ratio. These upper cut-offs were combined with lower cut-offs from the literature for ruling out advanced fibrosis and the algorithm was tested in the whole IPD study group. For ease of reference, we also examined the cut-offs of 8 kPa and 10 kPa (corresponding to the most common VCTE cut-offs in the literature of 7.9 kPa and 9.6 kPa rounded to the nearest integer) and also rounded our cirrhosis cut-offs to the nearest integer to facilitate application in clinical practice.
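The derivation step can be sketched as a random 3:2 split followed by a search for the lowest cut-off that reaches a target specificity among patients without the target condition. This is an illustration under stated assumptions: a test is positive when its value is at or above the cut-off, candidate cut-offs are the observed values, and the seed and function names are hypothetical rather than taken from the study.

```python
import random


def split_3_to_2(records, seed=0):
    """Randomly split records into derivation and validation sets (3:2)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_derivation = round(len(shuffled) * 3 / 5)
    return shuffled[:n_derivation], shuffled[n_derivation:]


def cutoff_for_specificity(values_without_condition, target_specificity):
    """Lowest observed value t such that the share of patients without the
    condition who test negative (value < t) is at least the target.

    Returns None if no observed value reaches the target.
    """
    n = len(values_without_condition)
    for t in sorted(set(values_without_condition)):
        specificity = sum(v < t for v in values_without_condition) / n
        if specificity >= target_specificity:
            return t
    return None
```

A cut-off found this way in the derivation set would then be applied unchanged to the validation set, and its specificity recomputed there, to check that it holds.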
Only test-positive and test-negative patients were included in the calculation of diagnostic performance indices, and patients in the indeterminate group were excluded from calculations.
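The effect of this rule on the reported indices can be made explicit in code. In the sketch below, sensitivity and specificity are computed only from patients with a positive or negative call, while the indeterminate proportion uses the whole group as its denominator; the labels and function name are illustrative.

```python
def diagnostic_indices(calls, has_condition):
    """Sensitivity and specificity excluding indeterminate calls.

    calls: iterable of 'positive', 'negative' or 'indeterminate'.
    has_condition: iterable of booleans from the reference standard.
    """
    tp = fp = tn = fn = indeterminate = total = 0
    for call, diseased in zip(calls, has_condition):
        total += 1
        if call == "indeterminate":
            indeterminate += 1  # counted, but left out of sens/spec
        elif call == "positive":
            if diseased:
                tp += 1
            else:
                fp += 1
        else:
            if diseased:
                fn += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "indeterminate_proportion": indeterminate / total,
    }
```

Because indeterminate patients leave both numerator and denominator, a strategy can show high sensitivity and specificity while still leaving many patients without a result, which is why the two kinds of figure are reported side by side.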
Subgroup analysis was performed according to biopsy length (<20 mm, ≥20 mm), number of portal tracts in biopsy samples (<11, ≥11), biopsy quality (intermediate: 10 mm ≤length <20 mm; high: length ≥20 mm and ≥11 tracts), age (four quartiles), sex, body mass index (BMI; BMI <25 kg/m2, 25 kg/m2≤ BMI <30 kg/m2, BMI ≥30 kg/m2), presence of type 2 diabetes mellitus (T2DM), continent of provenance (Europe, Asia), probes used (M, XL), reliability criteria for LSM-VCTE (reliable (median LSM <7.1 kPa, or median LSM ≥7.1 kPa and IQR/median LSM <0.30) vs unreliable (median LSM ≥7.1 kPa and IQR/median LSM ≥0.30);22 or reliable (IQR/median LSM <0.30) vs unreliable (IQR/median LSM ≥0.30)), and aminotransferase levels (ALT or AST <40, 40≤ ALT or AST <100, ALT or AST ≥100; or ALT <40 and AST <40 vs ALT ≥40 or AST ≥40).
All statistical analyses were performed using R (V.1.2.1335, R Foundation for Statistical Computing, Vienna, Austria) with the pROC package24 25; 95% CIs were calculated using 500 stratified bootstrap replicates using the boot package.26 27
VCTE probe types
The analysis to account for probe type is described in the online supplemental materials.
Patient and public involvement
Patients and the public were not involved in the conduct of this study as there was no direct patient participation in the study.
Search process and data collection
Ten thousand three hundred ninety-two articles were identified in a search performed for a larger systematic review evaluating the diagnostic performance of LSM-VCTE and other index tests for the staging of fibrosis and diagnosis of NASH in adult patients with NAFLD. After removing duplicates, and screening titles, abstracts, and full texts, 59 studies examining VCTE were identified. The authors of 37 studies shared useable data (figure 1). Authors of more than one study supplied data in a single dataset and, overall, we received 30 data sets including data from 6571 patients. After removing duplicates (n=628) and patients with missing biopsy (n=14) or LSM-VCTE (n=194) data, the final dataset consisted of 5735 unique patients.
Study and population characteristics
The characteristics of the 30 data sets are summarised in table 1. Studies were conducted in Europe (67%), Asia (40%) and Australia (3%). Data availability is shown in online supplemental table 3. FIB-4 and NFS were determined in 5393 (94%) and 3248 (57%) cases, respectively. Median age was 54 years, 2570 (45%) patients were women, 33% had diabetes and 43% had BMI ≥30 kg/m2. Overall, 30% had advanced fibrosis and 11% had cirrhosis. Details of the IPD study group are included in table 2, and online supplemental tables 4 and 5.
The methodological quality of the studies assessed with the QUADAS-2 tool is summarised in online supplemental figures 2 and 3. Only one study had low risk of bias or low applicability concerns in all QUADAS-2 domains.28 The flow and timing domain was judged to have high or unclear risk of bias in 65% of studies, as these either excluded technical failures from their final diagnostic performance analysis or did not report them.
Validating the diagnostic performance of LSM by VCTE and serum-based tests for detecting advanced fibrosis
LSM-VCTE, FIB-4, NFS, APRI and AST/ALT had corresponding AUROCs of 0.85, 0.76, 0.73, 0.70, 0.64 for identifying advanced fibrosis (table 3), and 0.90, 0.80, 0.78, 0.72, 0.69 for the identification of cirrhosis (online supplemental table 6). LSM-VCTE performed significantly better (p<10⁻¹⁵) in detecting both advanced fibrosis and cirrhosis than all serum-based tests. This relationship was preserved when performing a head-to-head comparison of LSM-VCTE, FIB-4 and NFS in the same group of patients (online supplemental tables 7 and 8).
When considering cut-offs from the literature, we evaluated lower and higher cut-offs separately. For any given test, as would be expected, low thresholds yielded higher sensitivity and high thresholds were associated with higher specificity (online supplemental table 9). Indicative PPV and NPV are also provided for the range of prevalences (5%–50%) reported in the primary studies (online supplemental tables 10–14).
APRI and AST/ALT ratio had only modest diagnostic performance for advanced fibrosis (AUROC ≤0.70, table 3), and were therefore not considered further.
None of the thresholds regarded in isolation resulted in both a high sensitivity (≥80%) and high specificity (≥80%) (figure 2, table 3, online supplemental tables 9 and 15, and online supplemental figure 4). Therefore, we explored the use of a lower and an upper cut-off. LSM-VCTE literature cut-offs performed well in only two cases (<7.1 kPa and ≥14.1 kPa: 83% sensitivity, 90% specificity; and <7.9 kPa and ≥9.6 kPa: 84% sensitivity, 78% specificity), while for other LSM-VCTE, NFS and FIB-4 thresholds a high specificity was observed (FIB-4: 91% for <1.3 and ≥2.67, 95% for <1.3, ≥3.25) but sensitivity was <60% (table 4). In addition, the proportion of indeterminate cases was >30% for serum-based NITs. Threshold pairs derived from the IPD study group did not reduce the proportion of misclassified and indeterminate patients seen with literature-based threshold pairs (table 4).
We further evaluated the performance of LSM-VCTE, FIB-4 and NFS to diagnose advanced fibrosis in sequential combinations of serum-based NITs and LSM-VCTE. When selecting threshold combinations for FIB-4 and NFS available in the literature (<1.3 & ≥2.67, <1.3 & ≥3.25 for FIB-4; <−1.455 & ≥0.676 for NFS) and pairing them with the best threshold pair for LSM-VCTE (<7.9 kPa & ≥9.6 kPa, identified as the one with highest sensitivity and lowest indeterminate proportion), the proportion of patients in the indeterminate group was 5%. While both the FIB-4+LSM VCTE and NFS+LSM VCTE sequential combinations had specificity >80%, their sensitivity was ≤80% (table 5). A better sensitivity was reached by using thresholds derived from the IPD study group (<0.88 & ≥2.31 for FIB-4; <−2.55 & ≥0.28 for NFS), but the proportion of indeterminate cases was near 20% in those cases and the proportion of patients needing LSM-VCTE was also larger than when using literature cut-offs (table 5).
Algorithms to minimise the need for liver biopsy
In the derivation set, the cut-offs for 95% and 98% specificity for the diagnosis of cirrhosis were respectively 20.4 kPa and 27.6 kPa for LSM-VCTE, 3.48 and 4.63 for FIB-4 and 1.01 and 1.57 for NFS. These cut-offs performed similarly in the validation set (online supplemental tables 16 and 17).
Algorithms combining FIB-4 (lower cut-off of 1.3 as described in the literature and upper cut-offs of 3.48 and 4.63 as described above) and LSM by VCTE (lower cut-off rounded to 8.0 kPa and upper cut-offs rounded to 20.0 kPa and 28.0 kPa, as described above) were then compared with the traditional way of applying these tests, also with rounded cut-offs for LSM by VCTE (8 kPa and 10 kPa) (figure 3). This approach increased the proportion of patients requiring an LSM (from 34% to 40% and 44%) but decreased the proportion of patients needing liver biopsy (from 33% to 19% and 24% when using the 95% and 98% specificity cut-offs, respectively) (online supplemental table 18 and figure 3).
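The two-tier pathway can be sketched as a single routing function. The defaults below are the 95% specificity cut-offs quoted in the text (FIB-4 <1.3 and ≥3.48; LSM-VCTE <8.0 kPa and ≥20.0 kPa); the function name and outcome labels are hypothetical, and the routing order (FIB-4 first, LSM-VCTE only for an indeterminate FIB-4) is an assumption based on the description of sequential testing rather than a transcription of figure 3.

```python
def sequential_call(fib4_value, lsm_kpa=None,
                    fib4_low=1.3, fib4_high=3.48,
                    lsm_low=8.0, lsm_high=20.0):
    """Route one patient through FIB-4 followed by LSM-VCTE.

    Returns one of:
      'advanced fibrosis ruled out', 'cirrhosis ruled in',
      'LSM-VCTE needed' (FIB-4 indeterminate, no LSM available yet),
      'indeterminate' (both tiers indeterminate).
    """
    # Tier 1: serum-based test
    if fib4_value < fib4_low:
        return "advanced fibrosis ruled out"
    if fib4_value >= fib4_high:
        return "cirrhosis ruled in"
    # Tier 2: liver stiffness, only for indeterminate FIB-4
    if lsm_kpa is None:
        return "LSM-VCTE needed"
    if lsm_kpa < lsm_low:
        return "advanced fibrosis ruled out"
    if lsm_kpa >= lsm_high:
        return "cirrhosis ruled in"
    return "indeterminate"
```

Passing `fib4_high=4.63, lsm_high=28.0` gives the 98% specificity variant, and `fib4_high=2.67, lsm_high=10.0` reproduces the cut-offs of the traditional approach, in which the upper cut-offs rule in advanced fibrosis rather than cirrhosis.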
Subgroup and sensitivity analyses
In subgroup analysis for the diagnosis of advanced fibrosis (online supplemental table 19), NITs performed better in patients with lower BMI (AUROCs LSM-VCTE: 0.91, p<0.005; FIB-4: 0.81, p<0.001; NFS: 0.76, p<0.025), without T2DM (LSM-VCTE: 0.87, p<10⁻⁶; FIB-4: 0.77, p<0.01), and with biopsies shorter than 20 mm (LSM-VCTE: 0.87, p<0.005; FIB-4: 0.80, p<0.001; NFS: 0.79, p<0.05), or with fewer than 11 portal tracts (LSM-VCTE: 0.86, p=0.01; FIB-4: 0.79, p=0.04; NFS: 0.78, p<0.005). For NFS, diagnostic performance was also lower in patients in the youngest age quartile (<43 years, AUROC: 0.58, p<0.001) and in women (AUROC: 0.71, p=0.03), while continent of provenance did not have a significant effect for any NIT. In patients with normal levels of ALT (ALT<40) FIB-4 performed worse (AUROC: 0.73) than in patients with ALT≥40 and ALT<100 (AUROC: 0.77, p<0.01). NFS performed better in patients with AST<40 (AUROC: 0.76) than in patients with AST≥100 (AUROC: 0.65, p<0.01). FIB-4 performed better in patients with at least one abnormal aminotransferase measurement (AUROC: 0.72, p=0.014). For cirrhosis, the trends were similar, except that LSM by VCTE performed better in the youngest age group (AUROC: 0.97, p<10⁻⁴) and NIT diagnostic performance was independent of aminotransferase levels (online supplemental table 20).
The diagnostic performance of LSM-VCTE was significantly lower in patients with unreliable LSMs (p<10⁻⁸; both for advanced fibrosis and cirrhosis) when applying the Boursier criteria,22 but not when only considering IQR/median LSM <0.30. The proportion of unreliable results was 12% both in the advanced fibrosis and cirrhosis groups (online supplemental table 21).
There was no difference in the diagnostic performance of LSM-VCTE between the M and XL probes in the subgroup of patients who had undergone LSM by both probes (online supplemental table 22).
In a sensitivity analysis of patients with LSM matched to BMI (only M probe measurements if BMI <30 kg/m2 and only XL probe measurements if BMI ≥30 kg/m2), there was no significant difference between the diagnostic performance of LSM-VCTE when comparing to the entire IPD study group (online supplemental table 23).
Through an extensive collaboration network with authors of primary studies we were able to collect the largest dataset of its kind ever to be reported on. This includes a diverse set of study groups from Europe, Asia, and Australia, 30% of whom had advanced fibrosis. We believe that our findings are therefore relevant for patients typical of secondary care in these territories and may be applied in the development of new strategies or in the consolidation of existing practices in evaluating patients for referral to secondary care.
A few studies evaluated the diagnostic performance of LSM-VCTE and other NITs, but most report on fewer than 500 patients. One similarly large study reported on patients screened for inclusion in clinical trials, where the prevalence of advanced fibrosis was 71%,29 making it difficult to generalise its findings to routine practice or to compare its results with ours. A smaller study of 1073 patients with NAFLD, of whom 29% had advanced fibrosis,30 examined the diagnostic performance of LSM by VCTE. The authors of that study reported AUROC and specificity values similar to our findings; however, they reported higher sensitivity. Other smaller studies reported similar prevalence of advanced fibrosis and similar AUROCs for LSM-VCTE.31–34
Overall, the diagnostic performance of LSM-VCTE for advanced fibrosis was good (AUROC=0.85), while that of FIB-4 and NFS in the same group was moderate (AUROC=0.76 for FIB-4, AUROC=0.73 for NFS). None of the studied NITs had both sufficiently high sensitivity and specificity (≥80%) when used with single cut-offs. Diagnostic performance was higher for detecting cirrhosis, as reported in previous studies.31 35 36 LSM-VCTE had the highest sensitivity and specificity, both in the case of a single cut-off (9.1 kPa obtained by maximising the Youden index; 77% and 78%) and for two cut-offs (<7.4 kPa & ≥12.1 kPa; 84% and 87%). Of the LSM-VCTE cut-off pairs tested, <7.1 kPa and ≥14.1 kPa, first published by Eddowes et al.,31 performed well for advanced fibrosis, with sensitivity of 83% and specificity of 90%, but with 39% of patients ending up with an indeterminate result, similar to the 41% of indeterminate patients reported in the original paper.31
LSM-VCTE thresholds identified in our study group (<9.1 kPa; <7.4 kPa & ≥12.1 kPa) were similar to thresholds reported in the literature (<9.9 kPa; <7.1 kPa & ≥14.1 kPa, <7.9 kPa & ≥9.6 kPa). However, thresholds for FIB-4 (<1.44; <0.88 & ≥2.31) and NFS (<−1.39; <−2.55 & ≥0.28) defined in our IPD study group spanned a wider range than those reported in the literature (<1.3 & ≥2.67 or <1.3 & ≥3.25 for FIB-4; <−1.455 & ≥0.676 for NFS).
Our findings are in line with the existing literature suggesting that sequential combinations of NITs increase sensitivity and specificity.29 Additionally, we have found NFS+LSM VCTE and FIB-4+LSM VCTE combinations to have similar sensitivity and specificity as recently reported by Boursier et al.37 Such combined testing strategies can reduce the number of indeterminate cases and reduce the costs associated with liver biopsies.
Furthermore, we propose an approach that could minimise the need for liver biopsies further, by using upper cut-offs with 95% and 98% specificity for the identification of cirrhosis. The rationale for this approach is explained in the online supplemental discussion. When using the 95% specificity cut-off, the proportion of patients needing liver biopsy decreases from 33% to 19% (figure 3). However, in this approach, 345 of 656 patients ‘ruled-in’ as having cirrhosis do not have histologically diagnosed cirrhosis. While this may seem like a high proportion of patients with false positive results, this must be interpreted in the light of two factors. First, the limitations of liver biopsy could mean that these patients are falsely classified as not having cirrhosis histologically. Furthermore, patients without cirrhosis on histology and with high NIT values could have equivalent risks as patients with cirrhosis on histology. For example, it is known from the hepatitis C literature38 that patients without cirrhosis on liver biopsy but with a high FIB-4 (>3.25) still had a significant risk of developing HCC after hepatitis C treatment, demonstrating that NITs can have added benefit beyond the histological diagnosis of cirrhosis alone. The rate of false positive results for cirrhosis can be decreased by choosing cut-offs with higher specificity, but this will come at the expense of doing more biopsies. Despite this encouraging result, this is an area where more information is needed, particularly longitudinal data comparing the prognostic value of LSM-VCTE and other NITs against histology, and ultimately, the cost effectiveness of the various cut-offs would need to be evaluated.
Surprisingly, subgroup analyses showed that the diagnostic accuracy of NITs was better in cases with poor biopsy quality. This finding is difficult to explain but a similar observation was reported previously in a large group of patients screened for clinical trials.29 The use of local biopsy reports as reference standard and the well-known observer-dependent variability of biopsy interpretation, even among expert pathologists,7 are factors that may have contributed to our finding. Spectrum bias was excluded as a source of this finding due to a near-identical proportion of patients in both the advanced fibrosis and cirrhosis group having short biopsies (online supplemental table 5).
Subgroup analysis showed better diagnostic performance of NITs in patients with lower BMI,39 40 and patients without diabetes, in keeping with other studies.41 42 This effect is likely to be primarily driven by BMI as there is thought to be a causal association between BMI and T2DM. NIT performance was impacted by age, with all NITs performing worse in the youngest quartile of our study group for advanced fibrosis, but the trend was reversed for cirrhosis where NITs performed better in those younger than 43 years of age. The age dependence of FIB-4 and NFS is expected, as age is one of the parameters included in the algorithms, and has indeed been previously described.13 43 It is, however, difficult to explain why performance of NITs is better in the younger age group for the diagnosis of cirrhosis.
Our study has several strengths, including the large size of the IPD study group and its composition, with a prevalence of advanced fibrosis of 30%, which makes it relevant to routine practice. Furthermore, the proportion of unreliable VCTE measurements in our study was 12%, in keeping with the literature.22 However, we acknowledge some limitations. We did not have any data from the USA and very few studies from Australia, so the results may not be globally applicable, due to differences in BMI across study populations. In addition, due to the nature of our study, we had to use the locally provided histology results, possibly introducing bias. Furthermore, we covered a large chronological period, during which LSM-VCTE application underwent significant changes, initially with the introduction of the XL probe, followed by the advice to measure skin-to-capsule distance (SCD) and the introduction of the Automatic Probe Selection tool. There was therefore some heterogeneity in the performance of LSM-VCTE, with early studies using only the M probe to assess all patients, while only a subset of studies assessed SCD to guide probe selection. Furthermore, one third of the included studies were carried out in France, as the technology used for LSM by VCTE originates from there. Lastly, our data confirm that LSM-VCTE had superior accuracy to serum-based tests, and this is independent of probe type, sex, ALT, AST, and participants’ continent of origin. There was, however, some dependence on the presence of T2DM and BMI and, for the detection of cirrhosis, on age. We did not check for subgroup-specific cut-offs, but these should be explored in future studies.
Our study examined some of the most widely available NITs. While it cannot be considered exhaustive, it can be regarded as the benchmark against which newer NITs can be tested. This is particularly important as new tests are continuously being developed (FibroTest-FibroSURE, ActiTest,44 ELF45). Furthermore, newer tests are also needed for patients with ‘at risk’ NASH (NASH+F2–3) who would be candidates for clinical trials or treatments, once approved therapies become available (FAST score,46 NIS4,47 cTAG48).
In conclusion, our study provides further validation of the sequential combination of FIB-4 and LSM-VCTE to rule out advanced fibrosis in patients with NAFLD, who can then be managed in primary care. We have shown how the use of upper cut-offs to rule in cirrhosis, in combination with lower cut-offs to rule out advanced fibrosis, can reduce the number of patients who would need to undergo liver biopsy.
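As an illustration only, the sequential strategy summarised above can be sketched in a few lines of Python. The cut-off values are those reported in the abstract. The function name, the outcome labels and the exact routing are assumptions made for this sketch: it assumes that patients with an indeterminate FIB-4 proceed to LSM-VCTE and that patients with an indeterminate LSM-VCTE are referred for biopsy. It is not the study's analysis code and is not intended for clinical use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cutoffs:
    """Lower (rule-out) and upper (rule-in) thresholds for each test."""
    fib4_low: float
    fib4_high: float
    lsm_low: float   # kPa
    lsm_high: float  # kPa


# Cut-offs quoted in the abstract; the constant names are hypothetical.
ADVANCED_FIBROSIS = Cutoffs(fib4_low=1.3, fib4_high=2.67, lsm_low=8.0, lsm_high=10.0)
CIRRHOSIS_RULE_IN = Cutoffs(fib4_low=1.3, fib4_high=3.48, lsm_low=8.0, lsm_high=20.0)


def classify(fib4: float,
             lsm_kpa: Optional[float] = None,
             cutoffs: Cutoffs = ADVANCED_FIBROSIS) -> str:
    """Apply FIB-4 first, then LSM-VCTE only if FIB-4 is indeterminate.

    Returns one of "rule_out", "rule_in", "needs_lsm" or "biopsy".
    The routing of indeterminate results is an assumption of this sketch.
    """
    # Step 1: FIB-4 alone decides when it falls outside the grey zone.
    if fib4 < cutoffs.fib4_low:
        return "rule_out"
    if fib4 >= cutoffs.fib4_high:
        return "rule_in"

    # Step 2: an indeterminate FIB-4 requires an LSM-VCTE measurement.
    if lsm_kpa is None:
        return "needs_lsm"
    if lsm_kpa < cutoffs.lsm_low:
        return "rule_out"
    if lsm_kpa >= cutoffs.lsm_high:
        return "rule_in"

    # Step 3: both tests indeterminate, so histology is still required.
    return "biopsy"
```

With the higher rule-in thresholds (`CIRRHOSIS_RULE_IN`), the same function rules in cirrhosis rather than advanced fibrosis, while the rule-out thresholds are unchanged.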
Data availability statement
Anonymised individual patient data are available upon reasonable request and with the agreement of the authors of original studies.
Patient consent for publication
Correction notice This article has been corrected since it published Online First. The funding statement and table 1 have been updated and an author has been added.
Collaborators The LITMUS Investigators: Quentin Anstee, Ann Daly, Katherine Johnson, Olivier Govaere, Simon Cockell, Dina Tiniakos, Pierre Bedossa, Fiona Oakley, Heather Cordell, Chris Day, Kristy Wonders (Newcastle University); Patrick Bossuyt, Hadi Zafarmand, Yasaman Vali, Jenny Lee (AMC Amsterdam); Vlad Ratziu, Karine Clement, Raluca Pais (Hôpital Pitié Salpêtrière, Assistance Publique-Hôpitaux de Paris, and Institute of Cardiometabolism and Nutrition, Paris, France); Detlef Schuppan, Jörn Schattenberg (University Medical Center Mainz); Toni Vidal-Puig, Michele Vacca, Sergio Rodrigues-Cuenca, Mike Allison, Ioannis Kamzolas, Evangelia Petsalaki (University of Cambridge); Matej Oresic, Tuulia Hyötyläinen, Aiden McGlinchey (Örebro University); Jose M Mato, Oscar Millet (Center for Cooperative Research in Biosciences); Jean-François Dufour, Annalisa Berzigotti (University of Bern); Michael Pavlides, Stephen Harrison, Stefan Neubauer, Jeremy Cobbold, Ferenc Mozes, Salma Akhtar (University of Oxford); Rajarshi Banerjee, Matt Kelly, Elizabeth Shumbayawonda, Andrea Dennis, Charlotte Erpicum, Micheala Graham (Perspectum); Manuel Romero-Gómez, Emilio Gómez-González, Javier Ampuero, Javier Castell, Rocío Gallego-Durán, Isabel Fernández, Rocío Montero-Vallejo (Servicio Andaluz de Salud, Seville); Morten Karsdal, Elisabeth Erhardtsen, Daniel Rasmussen, Diana Julie Leeming, Mette Juul Fisker, Antonia Sinisi, Kishwar Musa (Nordic Bioscience); Fay Betsou, Estelle Sandt, Manuela Tonini (Integrated Biobank of Luxembourg); Elisabetta Bugianesi, Chiara Rosso, Angelo Armandi, Fabio Marra (UNIFI), Amalia Gastaldelli (CNR), Gianluca Svegliati (UNIPM) (University of Torino); Jérôme Boursier (University Hospital of Angers); Sven Francque, Luisa Vonghia (Antwerp University Hospital); Mattias Ekstedt, Stergios Kechagias (Linköping University); Hannele Yki-Jarvinen, Kimmu Porthan (University of Helsinki); Saskia van Mil (UMC Utrecht); George Papatheodoridis (National & Kapodistrian University of Athens); Helena Cortez-Pinto (Faculdade de Medicina de Lisboa); Luca Valenti (Università degli Studi di Milano); Salvatore Petta (Università degli Studi di Palermo); Luca Miele (Università Cattolica del Sacro Cuore); Andreas Geier (University Hospital Würzburg); Christian Trautwein (RWTH Aachen University Hospital); Guru Aithal (University of Nottingham); Paul Hockings (Antaros Medical); Philip Newsome (University Hospitals Birmingham NHS Foundation Trust); David Wenn (iXscient); Cecília Maria Pereira Rodrigues (University of Lisbon); Pierre Chaumat, Rémy Hanf (Genfit); Aldo Trylesinski (Intercept Pharma); Pablo Ortiz (OWL); Kevin Duffin (Eli Lilly); Julia Brosnan, Theresa Tuthill, Euan McLeod (Pfizer); Judith Ertle, Ramy Younes (Boehringer-Ingelheim); Rachel Ostroff, Leigh Alexander (Somalogic); Mette Skalshøi Kjær (Novo Nordisk); Lars Friis Mikkelsen (Ellegaard Göttingen Minipigs); Maria-Magdalena Balp, Clifford Brass, Lori Jennings, Miljen Martic, Juergen Loeffler (Novartis Pharma AG); Guido Hanauer (Takeda Development Centre Europe Ltd); Sudha Shankar (AstraZeneca); Céline Fournier (Echosens); Kay Pepin, Richard Ehman (Resoundant); Joel Myers (Bristol-Myers Squibb); Gideon Ho (HistoIndex); Richard Torstenson (Allergan); Rob Myers (Gilead); Lynda Doward (RTI-HS).
Contributors FEM, EAS, ANAJ, MT, JB, AG, TT, JMB, QMA, SN, SAH, PMB, and MP contributed to the planning and design of the study. MT, JB, CF, KS, RES, EB, RY, SG, MLP, SP, TS, TO, SM, WKC, PJE, PNN, VWSW, VL, JGF, FS, JFC, YS, AO, JMS, CL, WK, MSL, JW, TK, YY, GPA, NP, CC, SA, HG, GO, AN, MY, MZ, and NB collected and provided individual patient data. FEM, JAL, PMB, and MP performed statistical analyses and data interpretation. FEM and MP wrote the first draft of the manuscript. All coauthors have approved the final version of the manuscript.
Funding This individual patient data meta-analysis is being conducted as part of the imaging study in the LITMUS (Liver Investigation: Testing Marker Utility in Steatohepatitis) study. The LITMUS study is a large multicentre study aiming to evaluate biomarkers on Non-Alcoholic Fatty Liver Disease. The LITMUS study is funded by the Innovative Medicines Initiative 2 (IMI2) Joint Undertaking under Grant Agreement 777377. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA.
Competing interests AG reports personal fees from AbbVie, personal fees from Alexion, personal fees from Bayer, personal fees from BMS, personal fees from CSL Behring, personal fees from Gilead, grants and personal fees from Intercept, personal fees from Ipsen, personal fees from Merz, grants and personal fees from Novartis, personal fees from Pfizer, personal fees from Sanofi-Aventis, personal fees from Sequana, grants and personal fees from Falk, personal fees from MSD, during the conduct of the study. ANAJ reports other from Perspectum, during the conduct of the study. CF reports other from Echosens, during the conduct of the study. JFC reports personal fees from AstraZeneca, personal fees from Novo Nordisk, personal fees from Intercept, personal fees from Alnylam, during the conduct of the study. JB reports other from Pfizer, during the conduct of the study. JMS reports personal fees from BMS, personal fees from Boehringer Ingelheim, personal fees from Echosens, personal fees from Genfit, personal fees from Gilead Sciences, personal fees from Intercept Pharmaceuticals, personal fees from Madrigal, personal fees from Novartis, personal fees from Pfizer, personal fees from Roche, personal fees from Sanofi, personal fees from Falk Foundation, personal fees from MSD, grants from Gilead Sciences, during the conduct of the study. JW reports grants from Echosens, during the conduct of the study. Dr. Pavlides reports other from Perspectum, during the conduct of the study. 
MT reports personal fees from Bristol-Myers Squibb, personal fees from Falk Foundation, personal fees from Gilead, personal fees from Intercept, personal fees from Merck Sharp & Dohme, personal fees from Albireo, personal fees from Boehringer Ingelheim, personal fees from BiomX, personal fees from Falk Pharma GmbH, personal fees from GENFIT, personal fees from Jannsen, personal fees from Novartis, personal fees from Phenex, personal fees from Regulus and Shire, grants from AbbVie, grants from Falk, grants from Gilead, grants from Intercept, grants from Albireo, grants from CymaBay, grants from Merck Sharp & Dohme, grants from Takeda, during the conduct of the study; in addition, MT has a patent norUDCA issued. PNN reports personal fees from Bristol-Myers Squibb, personal fees from Gilead, personal fees from Boehringer Ingelheim, personal fees from Pfizer, personal fees from Novo Nordisk, personal fees from Poxel, grants from Pharmaxis, grants from Boehringer Ingelheim, grants from Echosens, grants from Novo Nordisk, during the conduct of the study. 
QMA reports grants from AbbVie, grants and personal fees from Allergan/Tobira, grants from AstraZeneca, grants from GlaxoSmithKline, grants from Glympse Bio, grants and personal fees from Novartis Pharma, grants and personal fees from Pfizer, grants from Vertex, personal fees from Abbott Laboratories, personal fees from Acuitas Medical, personal fees from Blade, personal fees from BNN Cardio, personal fees from Cirius, personal fees from CymaBay, personal fees from EcoR1, personal fees from Eli Lilly, personal fees from Galmed, personal fees from Genfit, personal fees from Gilead, personal fees from Grunthal, personal fees from HistoIndex, personal fees from Indalo, personal fees from Imperial Innovations, personal fees from Intercept Pharma Europe, personal fees from Inventiva, personal fees from IQVIA, personal fees from Janssen, personal fees from Kenes, personal fees from Madrigal, personal fees from MedImmune, personal fees from Metacrine, personal fees from NewGene, personal fees from NGMBio, personal fees from North Sea Therapeutics, personal fees from Novo Nordisk, personal fees from Poxel, personal fees from ProSciento, personal fees from Raptor Pharma, personal fees from Servier, personal fees from Viking Therapeutics, personal fees from Abbott Laboratories, personal fees from BMS, personal fees from Clinical Care Options, personal fees from Falk, personal fees from Fishawack, personal fees from Integritas Communications, personal fees from MedScape, other from IMI2 LITMUS consortium, during the conduct of the study. 
SH reports grants and personal fees from Akero, grants and personal fees from Axcella, grants and personal fees from Cirius, grants and personal fees from CiVi Biopharma, grants and personal fees from CymaBay, grants and personal fees from Galectin, grants from Galmed, grants and personal fees from Genfit, grants and personal fees from Gilead Sciences, grants and personal fees from Hepion Pharmaceuticals, grants and personal fees from Hightide Therapeutics, grants and personal fees from Intercept, grants and personal fees from Madrigal, grants and personal fees from Metacrine, grants and personal fees from NGM Bio, grants and personal fees from North Sea Therapeutics, grants and personal fees from Novartis, grants and personal fees from Novo Nordisk, grants and personal fees from Poxel, grants and personal fees from Sagimet, grants and personal fees from Viking, personal fees from Altimmune, personal fees from Alentis, personal fees from Arrowhead, personal fees from Canfite, personal fees from Echosens, personal fees from Enyo, personal fees from Fibronostics, personal fees from Foresite Labs, personal fees from Fortress Biotech, personal fees from HistoIndex, personal fees from Kowa, personal fees from Prometic, personal fees from Ridgeline, personal fees from Terns, during the conduct of the study. SM reports personal fees from Echosens, during the conduct of the study. SN reports other from Perspectum, during the conduct of the study. SP reports personal fees from AbbVie, personal fees from Gilead, personal fees from Intercept, personal fees from Pfizer, during the conduct of the study. TK reports grants from Echosens, during the conduct of the study. TT reports other from Pfizer, during the conduct of the study. 
VdL reports personal fees from Bristol-Myers Squibb, personal fees from Gilead Sciences, personal fees from AbbVie, personal fees from Pfizer, personal fees from Echosens, personal fees from Intercept Pharmaceuticals, personal fees from MSD, personal fees from Myr-Pharma, personal fees from Supersonic Imagine, personal fees from Tillotts, during the conduct of the study. Dr. Wong reports personal fees from AbbVie, personal fees from 3V-BIO, personal fees from Allergan, personal fees from Boehringer Ingelheim, personal fees from Center for Outcomes Research in Liver Diseases, grants and personal fees from Gilead, personal fees from Intercept, personal fees from Echosens, personal fees from Hanmi Pharmaceutical, personal fees from Novartis, personal fees from Pfizer, personal fees from Merck, personal fees from Novo Nordisk, personal fees from Perspectum, personal fees from ProSciento, personal fees from Sagimet Biosciences, personal fees from TARGET PharmaSolutions, personal fees from Terns, personal fees from BMS, during the conduct of the study. WK reports personal fees from Samil, personal fees from Boehringer Ingelheim, personal fees from Ildong, personal fees from LG Chemistry, personal fees from Gilead Sciences, personal fees from HK inno.N, personal fees from GreenCross, personal fees from Bukwang, personal fees from Standigm, personal fees from PharmaKing, personal fees from KOBIOLABS, personal fees from Eisai, personal fees from Zydus, personal fees from Novo Nordisk, grants from Gilead, grants from Ildong, grants from GreenCross, grants from Bukwang, grants from PharmaKing, grants from Roche, grants from Galmed, grants from Novartis, grants from Pfizer, grants from Springbank, grants from Altimmune, grants from MSD, grants from BMS, grants from Dicerna, grants from Enyo, grants from Hitachi-Aloka, other from KOBIOLABS, other from Lepidyne, during the conduct of the study. 
YY reports grants from Biocodes, grants and personal fees from Gilead Sciences, personal fees from Bilim Pharmaceuticals, personal fees from Pharmactive Pharmaceutical, personal fees from Sanovel Pharmaceuticals, personal fees from Galmed, personal fees from Zydus, personal fees from Novo Nordisk, during the conduct of the study. MP, ANAJ and SN are shareholders of Perspectum, Oxford, UK. CF is employed by Echosens, France. MT received speaker fees from Bristol-Myers Squibb (BMS), Falk Foundation, Gilead, Intercept and Merck Sharp & Dohme (MSD); advisory board fees from Albireo, Boehringer Ingelheim, BiomX, Falk Pharma GmbH, GENFIT, Gilead, Intercept, Jannsen, MSD, Novartis, Phenex, Regulus and Shire; travel grants from AbbVie, Falk, Gilead, and Intercept; and research grants from Albireo, CymaBay, Falk, Gilead, Intercept, MSD, and Takeda. He is also coinventor of patents on the medical use of norUDCA filed by the Medical University of Graz. SP was speaker and/or Advisor for AbbVie, Gilead, Intercept and Pfizer. PNN received grant and research support from Pharmaxis, Boehringer Ingelheim, Echosens and Novo Nordisk and consulting fees from BMS, Boehringer Ingelheim, Gilead, Novo Nordisk, Pfizer, and Poxel on behalf of the University of Birmingham. VL reports consultancy for AbbVie, BMS, Echosens, Gilead Sciences, Intercept Pharmaceuticals, MSD, Myr-Pharma, Pfizer, Supersonic Imagine and Tillotts. SM received honorarium fees from Echosens. JFC received consultancy, advisory board, and speaker fees from Astra Zeneca, NovoNordisk, Intercept and Alnylam. JMS reports consultancy for BMS, Boehringer Ingelheim, Echosens, Genfit, Gilead Sciences, Intercept Pharmaceuticals, Madrigal, Novartis, Pfizer, Roche, Sanofi; received research funding from Gilead Sciences and was on the speakers bureau for Falk Foundation MSD Sharp & Dohme GmbH. 
WK has served as a speaker and consultant of Gilead, Boehringer-Ingelheim, Samil, Ildong, LG Chemistry, HK inno.N, GreenCross, Bukwang, Standigm, PharmaKing, KOBIOLABS, Eisai, Zydus, and Novonordisk, received grants from Gilead, Ildong, GreenCross, Bukwang, Pharmaking, Roche, Galmed, Novartis, Pfizer, Springbank, Altimmune, MSD, BMS, Dicerna, Enyo, and Hitachi-Aloka, and owns stocks in KOBIOLABS and Lepidyne. TK and JW received unrestricted research grants from Echosens, Paris France. TK participated in a clinical advisory board meeting. YY received research grant from Biocodex, Gilead Sciences, speaker fees for Gilead Sciences, Bilim Pharmaceuticals, Pharmactive Pharmaceutical, Sanovel Pharmaceuticals, and served as advisory board member for Galmed, Zydus, NovoNordisk. AG reports consultancy for AbbVie, Alexion, Bayer, BMS, CSL Behring, Gilead, Intercept, Ipsen, Merz, Novartis, Pfizer, Sanofi-Aventis, Sequana; received research funding from Intercept, Falk, Novartis and was on the speakers bureau for AbbVie, Alexion, BMS, CSL Behring, Falk Foundation, Gilead, Intercept, MSD, Merz, Novartis, Sequana. VWSW has served as a consultant or advisory board member for 3V-BIO, AbbVie, Allergan, Boehringer Ingelheim, Center for Outcomes Research in Liver Diseases, Echosens, Gilead Sciences, Hanmi Pharmaceutical, Intercept, Merck, Novartis, Novo Nordisk, Perspectum Diagnostics, Pfizer, ProSciento, Sagimet Biosciences, TARGET PharmaSolutions, and Terns; and a speaker for AbbVie, Bristol-Myers Squibb, Echosens, and Gilead Sciences. He has also received a research grant from Gilead Sciences for fatty liver research. 
QMA is coordinator of the IMI2 LITMUS consortium and he reports research grant funding from Abbvie, Allergan/Tobira, AstraZeneca, GlaxoSmithKline, Glympse Bio, Novartis Pharma AG, Pfizer Ltd., Vertex; consultancy on behalf of Newcastle University for Abbott Laboratories, Acuitas Medical, Allergan/Tobira, Blade, BNN Cardio, Cirius, CymaBay, EcoR1, E3Bio, Eli Lilly & Company Ltd., Galmed, Genfit SA, Gilead, Grunthal, HistoIndex, Indalo, Imperial Innovations, Intercept Pharma Europe Ltd., Inventiva, IQVIA, Janssen, Kenes, Madrigal, MedImmune, Metacrine, NewGene, NGMBio, North Sea Therapeutics, Novartis, Novo Nordisk A/S, Pfizer Ltd., Poxel, ProSciento, Raptor Pharma, Servier, Viking Therapeutics; and speaker fees from Abbott Laboratories, Allergan/Tobira, BMS, Clinical Care Options, Falk, Fishawack, Genfit SA, Gilead, Integritas Communications, MedScape. SAH has research grants from Akero, Axcella, Cirius, CiVi Biopharma, Cymabay, Galectin, Galmed, Genfit, Gilead Sciences, Hepion Pharmaceuticals, Hightide Therapeutics, Intercept, Madrigal, Metacrine, NGM Bio, Northsea Therapeutics, Novartis, Novo Nordisk, Poxel, Sagimet, Viking. He has received consulting fees from Akero, Altimmune, Alentis, Arrowhead, Axcella, Canfite, Cirius, CiVi, Cymabay, Echosens, Enyo, Fibronostics, Foresite Labs, Fortress Biotech, Galectin, Genfit, Gilead Sciences, Hepion, HIghtide, HistoIndex, Intercept, Kowa, Madrigal, Metacrine, NGM, Northsea, Novartis, Novo Nordisk, Poxel, Prometic, Ridgeline, Sagimet, Terns, and Viking.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.