Objective Despite improvements in imaging, serum CA19-9 and pathological evaluation, differentiating between benign and malignant bile duct strictures remains a diagnostic conundrum. Recent developments in next-generation sequencing (NGS) have opened new opportunities for early detection and management of cancers but, to date, have not been rigorously applied to biliary specimens.
Design We prospectively evaluated a 28-gene NGS panel (BiliSeq) using endoscopic retrograde cholangiopancreatography-obtained biliary specimens from patients with bile duct strictures. The diagnostic performance of serum CA19-9, pathological evaluation and BiliSeq was assessed on 252 patients (57 in the training cohort and 195 in the validation cohort) with 346 biliary specimens.
Results The sensitivity and specificity of BiliSeq for malignant strictures were 73% and 100%, respectively. In comparison, an elevated serum CA19-9 and pathological evaluation had sensitivities of 76% and 48%, and specificities of 69% and 99%, respectively. The combination of BiliSeq and pathological evaluation increased the sensitivity to 83% and maintained a specificity of 99%. BiliSeq improved the sensitivity of pathological evaluation for malignancy from 35% to 77% for biliary brushings and from 52% to 83% for biliary biopsies. Among patients with primary sclerosing cholangitis (PSC), BiliSeq had an 83% sensitivity as compared with pathological evaluation with an 8% sensitivity. Therapeutically relevant genomic alterations were identified in 20 (8%) patients. Two patients with ERBB2-amplified cholangiocarcinoma received a trastuzumab-based regimen and had measurable clinicoradiographic response.
Conclusions The combination of BiliSeq and pathological evaluation of biliary specimens increased the detection of malignant strictures, particularly in patients with PSC. Additionally, BiliSeq identified alterations that may stratify patients for specific anticancer therapies.
- bile duct
- pancreatic cancer
- precision medicine
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Significance of this study
What is already known on this subject?
Despite a multidisciplinary approach, distinguishing between benign and malignant bile duct strictures can be challenging.
Whole exome and whole genome sequencing studies have defined the genetic landscape of neoplasms arising and secondarily involving the bile duct system.
A test that could detect key genomic alterations in malignant neoplasms involving the bile duct system may be of diagnostic utility to endoscopic retrograde cholangiopancreatography (ERCP)-obtained biliary specimens.
What are the new findings?
We have developed and validated a highly sensitive, targeted next-generation sequencing (NGS) assay (BiliSeq) within a Clinical Laboratory Improvement Amendments-certified and College of American Pathologists-accredited laboratory for 28 genes that are commonly mutated, amplified and/or deleted in malignant neoplasms involving the bile duct system.
A large prospective analysis of ERCP-obtained biliary specimens revealed BiliSeq improved the sensitivity of pathological evaluation for both biliary brushing and biliary biopsy specimens.
Among patients with primary sclerosing cholangitis, BiliSeq was superior to serum CA19-9 and pathological evaluation in detecting at least high-grade biliary dysplasia.
BiliSeq identified potentially actionable genomic alterations with demonstrable chemotherapeutic response for a subset of patients.
How might it impact on clinical practice in the foreseeable future?
These results highlight the diagnostic applicability of NGS-based assays to ERCP-obtained biliary specimens in early detection and management of patients with malignant bile duct strictures.
Strictures involving the bile duct represent a heterogeneous group of benign and malignant conditions.1 Benign bile duct strictures may be due to primary sclerosing cholangitis (PSC), IgG4-related sclerosing cholangitis, iatrogenic injury, infection and other less common aetiologies. Malignant causes include carcinomas involving the pancreatobiliary ducts, ampulla of Vater, liver and, rarely, metastatic cancers. However, differentiating between benign and malignant bile duct strictures can be challenging and relies on a multidisciplinary approach of clinical examination, biochemical testing (eg, serum CA19-9), radiographic imaging, endoscopic procedures and pathological evaluation with ancillary studies.2 This distinction is more arduous considering the aetiology of some benign strictures can be a predisposing factor for malignancy. For example, PSC is associated with a 400-fold increased risk of cholangiocarcinoma as compared with the general population.
Endoscopic retrograde cholangiopancreatography (ERCP) plays a key role in the evaluation of a bile duct stricture by defining the morphometric imaging features and extent of disease involvement.3 However, there are no pathognomonic features seen during fluoroscopy that can differentiate benign from malignant. Furthermore, direct cholangioscopic grading is inaccurate and suffers from poor interobserver agreement.4 5 During ERCP, bile duct brushings and forceps biopsies are the methods of choice for pathological confirmation; however, the sensitivity of detecting malignancy can range between 8% and 67%.4 6–11 To improve the detection of malignant strictures, adjunctive techniques were developed and include digital image analysis, KRAS mutational testing and fluorescence in situ hybridisation.12–18 However, the sensitivities of these assays range between 14% and 60%. Considering the low sensitivity of pathological evaluation and available ancillary studies, it is not surprising that patients often undergo multiple ERCP procedures for diagnostic purposes. As a result, therapeutic decisions for malignant strictures are often delayed by several weeks to months, which places patients at risk for disease progression. Conversely, a misdiagnosis of a benign bile duct stricture can result in unnecessary surgical resection. It is reported that up to 15% of patients undergoing surgery for a suspected malignant stricture have benign disease.19 20 This is alarming given these surgical resections are associated with relatively high morbidity and mortality.21–24 Thus, more innovative preoperative approaches are needed to improve the stratification of patients with benign and malignant strictures.
Over the past several years, dramatic progress has been made in understanding the genomic landscape of neoplasms arising or secondarily involving the bile duct system. Whole exome and whole genome sequencing studies have reported recurrent genomic alterations in a relatively small number of oncogenes and tumour suppressor genes, such as KRAS, TP53, CDKN2A and SMAD4.25–34 Moreover, a subset of these genomic alterations, such as ATM and ERBB2, may confer susceptibility to specific anticancer therapies. In parallel, the introduction of novel molecular diagnostics has opened new opportunities for the study of preoperative specimens.35 36 Bile duct specimens are often characterised by small quantities of diagnostic material and composed of heterogeneous cell populations, which may obscure or mimic a malignant process. It is therefore crucial for a molecular assay to be highly sensitive to detect small proportions of mutated cells within these specimens. Next-generation sequencing (NGS) combines a high analytical sensitivity with multigene analysis and thus represents an attractive option for the assessment of bile duct strictures.
In this study, we developed a highly sensitive, targeted NGS assay, known as BiliSeq, for 28 genes that are commonly mutated, amplified and/or deleted in malignant neoplasms involving the bile duct system.25–34 37 This test was performed within a Clinical Laboratory Improvement Amendments (CLIA)-certified and College of American Pathologists (CAP)-accredited clinical laboratory using ERCP-obtained bile duct brushings and/or biopsies. Rather than extracting DNA from cytological smears/slides, alcohol fixative (eg, CytoLyt) or formalin-fixed paraffin-embedded (FFPE) tissue, which may decrease overall yields and quality, a dedicated bile duct brushing and/or biopsy was submitted directly for BiliSeq testing after routine specimens were obtained for pathological evaluation. Our objectives were to prospectively evaluate BiliSeq testing on a large cohort of patients to: (1) determine its accuracy in the detection of malignant strictures, (2) compare its performance as an adjunct to other diagnostic modalities and (3) evaluate the ramifications on patient management when genomic alterations are detected in bile duct specimens.
Materials and methods
A training cohort of biliary specimens from 57 patients was prospectively collected by standard ERCP at the University of Pittsburgh Medical Center (UPMC) from patients with a bile duct stricture between September 2014 and April 2016 and funded by the University of Pittsburgh Department of Pathology. On completion of the training cohort, a validation cohort of 195 patients was prospectively accrued between September 2016 and August 2018 and partially funded by the Institute for Precision Medicine at the University of Pittsburgh. In total, 163 biliary brushings and 172 biliary biopsies were collected for pathological evaluation, and 160 brushings and 135 biopsies were collected for BiliSeq testing. Bile duct specimens were submitted to the UPMC Anatomic Pathology laboratory for routine cytopathological and pathological evaluation and the UPMC Molecular & Genomic Pathology (MGP) Laboratory for BiliSeq testing within 24–48 hours of the ERCP procedure. Medical records were reviewed to document patient demographics, clinical presentation, ERCP findings, concurrent endoscopic ultrasound (EUS) and EUS-fine needle aspiration (FNA), serum CA19-9 levels and pathological diagnoses of corresponding brushings and/or biopsies. A separate analysis was performed for repeat ERCP procedures that yielded 24 biliary brushings and 29 biliary biopsies for pathological evaluation and 24 brushings and 27 biopsies for BiliSeq testing. The aetiology of the stricture was based on diagnostic pathology (n=145) from surgically resected specimens or biopsy specimens (either autopsy, open or laparoscopic surgical resection/biopsy, percutaneous needle or ERCP-obtained biliary biopsy) and/or clinical criteria (n=75) after ≥12 months of follow-up. As the reported specificity of biliary brushings is <100%, the findings from pathological evaluation of brushing specimens were not considered as diagnostic pathology. 
Furthermore, a negative ERCP-obtained biopsy was not considered supportive of a benign aetiology. A stricture was designated as a benign cholangiopathy if a repeat imaging or ERCP at ≥12 months follow-up documented the resolution or stability of prior ductal abnormalities. In contrast, a clinical diagnosis of malignancy was defined by the combination of a mass on radiographic imaging in the absence of acute cholangiopathy, and clinical or radiographic progression after ≥12 months of follow-up, or death clinically and radiographically determined to be due to a malignancy involving the bile duct. For surgical resection and biopsy specimens, diagnoses were rendered on the basis of standard histomorphological criteria.38 Diagnostic and clinical follow-up information including the method of follow-up for each patient is described in detail within online supplementary table 1.
Samples were obtained during the ERCP procedure and included cytological brushings, forceps biopsies or both. For cytological brushes, samples were collected with multiple to-and-fro motions at the site of the stricture. The brush used for cytopathological examination was placed into 15 mL of ThinPrep CytoLyt solution (Hologic, Marlborough, Massachusetts, USA) and transferred for specimen processing within 24 hours. A separate brush was used for BiliSeq testing and placed into a collection vial containing a detergent-based DNA lysis buffer. The collection vial was then transported to the UPMC MGP laboratory within 24 hours and processed accordingly. A similar approach was used for forceps biopsies with separate biopsies submitted for pathological examination and BiliSeq testing. No minimum specimen cellularity was required for testing. All specimens, whether a brushing or a biopsy, were processed in the same way.
NGS was performed prospectively as part of clinical care and within a 10-day turnaround in the UPMC MGP laboratory, which is both CLIA-certified and CAP-accredited. Genomic DNA was isolated similarly for both brushing and biopsy specimens using the MagNA Pure LC Total Nucleic Acid Isolation Kit (Roche, Indianapolis, Indiana, USA) on Compact MagNA Pure (Roche). In cases where both a brushing and a biopsy specimen were submitted, the DNA from both specimens was combined for subsequent analysis. Extracted DNA was quantified on the Qubit 2.0 Fluorometer using the dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Amplification-based targeted NGS was performed with primers for genomic regions of interest (https://mgp.upmc.com/Applications/mgp/Content/OncoSeq_Hotspots.pdf) within the following genes: AKT1, ALK, ATM, BRAF, CDKN2A, CTNNB1, EGFR, ERBB2, ERBB4, FGFR1, FGFR2, FGFR3, GNA11, GNAQ, GNAS, HRAS, IDH1, IDH2, KIT, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, SMAD4, TP53 and VHL as previously described.39 Amplicons were barcoded, purified and ligated with specific adapters. DNA quantity and quality check were performed using the 2200 TapeStation (Agilent Technologies, Santa Clara, California, USA). The Ion One Touch 2 and One Touch ES were used to prepare and enrich templates and enable testing via Ion Sphere Particles on a semiconductor chip. Massively parallel sequencing was carried out on an Ion Proton System according to the manufacturer’s instructions (Thermo Fisher Scientific) and analysed with the Torrent Suite Software V.3.4.2 for point mutations, small insertions/deletions and copy number alterations. 
Each variant was prioritised according to the 2017 Association for Molecular Pathology (AMP)/American Society of Clinical Oncology (ASCO)/College of American Pathologists (CAP) joint consensus guidelines for interpretation of sequence variants in cancer using a tier-based system.40 Tier I, Tier II and Tier III variants were reported; however, only Tier I and Tier II variants were used for subsequent analysis. The presence of a genomic alteration that qualified as a Tier I or Tier II variant was considered to be a ‘positive’ BiliSeq result, while a ‘negative’ BiliSeq result was determined to be the absence of a Tier I or Tier II variant. The limit of detection of the assay was at 3% mutant allele frequency (AF). The minimum depth of coverage for testing was 500×. For each mutation identified, an AF was calculated based on the number of reads of the mutant allele versus the wild-type allele and reported as a percentage.35 Copy number variation analysis was performed as previously described.41 A gene amplification was defined by the presence of ≥6 copies of a variant as previously described and validated using fluorescence in situ hybridisation (FISH) analysis.41 42
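The reporting rules above can be sketched in a few lines of Python. This is an illustration only, not the laboratory's pipeline: the `Variant` class and all function names are hypothetical, the AF formula (mutant reads divided by mutant plus wild-type reads) is an assumed reading of the description, and depth is approximated as the sum of those two read counts. Only the three thresholds (3% AF, 500× depth, ≥6 copies) and the Tier I/II rule come from the text.

```python
from dataclasses import dataclass

LIMIT_OF_DETECTION_AF = 3.0   # percent mutant allele frequency (from the text)
MIN_DEPTH = 500               # minimum depth of coverage (from the text)
AMPLIFICATION_MIN_COPIES = 6  # copies needed to call an amplification (from the text)


@dataclass
class Variant:
    """Hypothetical record for one called variant."""
    gene: str
    tier: int            # 1, 2 or 3 in the AMP/ASCO/CAP tier system
    mutant_reads: int
    wildtype_reads: int


def allele_frequency(v: Variant) -> float:
    """Mutant reads as a percentage of mutant plus wild-type reads (assumed formula)."""
    depth = v.mutant_reads + v.wildtype_reads
    return 100.0 * v.mutant_reads / depth if depth else 0.0


def passes_thresholds(v: Variant) -> bool:
    """Assumption: depth is approximated by mutant plus wild-type reads."""
    depth = v.mutant_reads + v.wildtype_reads
    return depth >= MIN_DEPTH and allele_frequency(v) >= LIMIT_OF_DETECTION_AF


def is_amplified(copy_number: float) -> bool:
    """A gain is called an amplification at six or more copies."""
    return copy_number >= AMPLIFICATION_MIN_COPIES


def biliseq_result(variants) -> str:
    """'positive' if any Tier I or Tier II variant passes the thresholds."""
    for v in variants:
        if v.tier in (1, 2) and passes_thresholds(v):
            return "positive"
    return "negative"
```

Under these assumptions, a Tier I variant with 50 mutant and 950 wild-type reads (5% AF at 1000× depth) gives a positive result, whereas a Tier III variant at any AF does not.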
Classification of targetable genomic alterations
Clinically relevant genomic alterations were defined as alterations that are targetable by anticancer drugs currently available on the market or in registered clinical trials. Genomic alterations that are potentially targetable with reported kinase inhibitors (in the setting of wild-type KRAS, NRAS and HRAS genomic alterations) include: ALK, BRAF, EGFR, ERBB2, FGFR1, FGFR2, FGFR3 and MET.43–46 Genomic alterations within ATM may be targeted with platinum-based regimens and poly-(ADP-ribose) polymerase inhibition.47 Also, pharmacological inhibitors highly specific to individual IDH1 and IDH2 mutations have been developed.48 49
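The classification described above amounts to a small lookup with one condition on RAS status. The sketch below is illustrative only: the function name and the label strings are hypothetical, and the RAS wild-type condition is applied at the level of the whole specimen, which is an assumption. The gene lists are taken from the text.

```python
# Genes listed in the text as potentially targetable with kinase inhibitors,
# provided KRAS, NRAS and HRAS are wild-type.
KINASE_TARGETS = {"ALK", "BRAF", "EGFR", "ERBB2", "FGFR1", "FGFR2", "FGFR3", "MET"}
RAS_GENES = {"KRAS", "NRAS", "HRAS"}


def therapeutic_classes(altered_genes):
    """Map each altered gene to the drug class named in the text, if any.

    Assumption: a RAS alteration anywhere in the specimen removes the
    kinase-inhibitor label for every kinase target in that specimen.
    """
    altered = set(altered_genes)
    ras_wild_type = not (altered & RAS_GENES)
    classes = {}
    for gene in sorted(altered):
        if gene in KINASE_TARGETS and ras_wild_type:
            classes[gene] = "kinase inhibitor"
        elif gene == "ATM":
            classes[gene] = "platinum-based regimen or PARP inhibitor"
        elif gene in ("IDH1", "IDH2"):
            classes[gene] = f"{gene}-specific inhibitor"
    return classes
```

For example, `therapeutic_classes(["ERBB2", "TP53"])` labels only ERBB2, while `therapeutic_classes(["MET", "KRAS"])` returns an empty mapping because the RAS condition is not met.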
The sample size estimate/power calculation for the training cohort used within this study was based on a 60% prevalence of malignancy among bile duct strictures at our institution and corresponding 40% sensitivity for detecting malignancy using pathological evaluation of ERCP-obtained specimens. Therefore, a sample size of at least 50 patients would yield a >80% power with an alpha of 0.05. Differences in mutational status were assessed using Fisher’s exact test for dichotomous variables. For the training cohort, receiver operating characteristic (ROC) curves were generated to compare biomarker performances and to establish cut-off levels, using the Youden index. Final cut-offs were selected to maximise the area under the curve (AUC) as calculated by the trapezoidal method. However, biological rationale and the simplicity of the model were also considered. Sensitivity and specificity were calculated using standard 2×2 contingency tables for cases with confirmed diagnostic pathology. Unless otherwise stated, repeat testing was not used to calculate sensitivity and specificity for individual biomarkers. All statistical analyses were performed using the SPSS Statistical software, V.25 (IBM, Armonk, New York, USA) and statistical significance was defined as a p value of <0.05.
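The two calculations named above, sensitivity and specificity from a 2×2 table and cut-off selection by the Youden index (J = sensitivity + specificity − 1), can be sketched as follows. The analyses in this study were run in SPSS; this Python version is an independent illustration, the function names are hypothetical, and the marker values in the usage note are invented for demonstration and are not study data.

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def youden_cutoff(values, is_malignant):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A case is called positive when its value is >= cutoff. Every observed
    value is tried as a candidate cutoff; ties keep the lowest cutoff.
    Requires at least one malignant and one benign case.
    """
    pairs = list(zip(values, is_malignant))
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, m in pairs if m and v >= cutoff)
        fn = sum(1 for v, m in pairs if m and v < cutoff)
        tn = sum(1 for v, m in pairs if not m and v < cutoff)
        fp = sum(1 for v, m in pairs if not m and v >= cutoff)
        sensitivity, specificity = sens_spec(tp, fn, tn, fp)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

With the invented values `[5, 12, 30, 40, 60, 200, 900]` and labels `[False, False, False, True, False, True, True]`, the function selects a cut-off of 40, where sensitivity is 3/3 and specificity is 3/4, giving J = 0.75.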
Performance of serum CA19-9, pathological evaluation and BiliSeq testing within training and validation cohorts
A training cohort of 57 patients who underwent ERCP for a bile duct stricture was prospectively accrued for biomarker analysis and is summarised in table 1 and online supplementary table 1. Serum CA19-9 studies, bile duct brushings and/or biopsies for pathological evaluation and BiliSeq testing were performed for 56 of 57 (98%) patients with only one patient that did not have a CA19-9 measurement at the time of clinical presentation. Considering the lack of a consensus CA19-9 cut-off as a marker of malignant strictures, an exploratory study of CA19-9 levels was performed (online supplementary figure 1). By ROC curve analysis, serum CA19-9 could distinguish malignant from benign strictures with an AUC of 0.727. At a threshold of ≥44 U/mL chosen using Youden’s index, an elevated CA19-9 yielded a sensitivity of 62% and specificity of 77% for detecting malignancy (table 2). Specimens obtained during ERCP for pathological evaluation included biliary brushings for 22 (39%) patients, biliary biopsies for 22 (39%) patients and both biliary brushings and biopsies for 13 (22%) patients. A pathological diagnosis of at least suspicious for adenocarcinoma was associated with a sensitivity and specificity of 46% and 100%, respectively, for malignancy.
A paired but separate biliary brushing and/or biopsy specimen was also submitted for BiliSeq testing. However, for the 13 patients who had both brushing and biopsy specimens for pathological evaluation, only two patients had paired brushing and biopsy specimens obtained for BiliSeq testing. For the remaining 11 patients, a corresponding brushing specimen, but not biopsy specimen, was submitted for BiliSeq testing. In total, 59 bile duct specimens from 57 patients were evaluated by BiliSeq. Genomic alterations were only detected in malignant strictures and not present in cases with benign cholangiopathy (figure 1). The sensitivity and specificity of a genomic alteration identified by BiliSeq were 63% and 100%, respectively.
A validation cohort was prospectively collected and consisted of 195 patients with follow-up for 163 (84%) patients (table 1 and online supplementary table 1). A serum CA19-9 was measured at the time of presentation for 144 of 163 (88%) patients. Specimens submitted for pathological evaluation included biliary brushings for 49 (30%) patients, biliary biopsies for 57 (35%) patients and both biliary brushings and biopsies for 57 (35%) patients. A paired but separate brushing and/or biopsy specimen was also submitted for BiliSeq testing. For the 57 patients who had both brushing and biopsy specimens for pathological evaluation, 27 (47%) also had separate brushing and biopsy specimens submitted for BiliSeq testing. Of the remaining 30 patients, corresponding brushings or biopsies were obtained for 24 patients and 6 patients, respectively, for BiliSeq testing. In total, 190 bile duct specimens from 163 patients were submitted for BiliSeq testing.
Based on a cut-off of ≥44 U/mL, serum CA19-9 within the validation cohort had a sensitivity of 81% and specificity of 65% for at least high-grade biliary dysplasia (table 2). Similar to the training cohort, the performance of pathological evaluation and BiliSeq testing was associated with sensitivities and specificities of 49% and 98%, and 76% and 100%, respectively. An overall analysis of both training and validation cohorts that included 220 patients with follow-up revealed sensitivities and specificities for serum CA19-9, pathological evaluation and BiliSeq testing of 76% and 69%, 48% and 99%, and 73% and 100%, respectively. The combination of pathological evaluation and BiliSeq testing increased the sensitivity to 83% and maintained a high specificity of 99%. The addition of serum CA19-9 to pathological evaluation and BiliSeq testing further improved the sensitivity to 95%, but the specificity decreased to 68%.
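The pattern in these combined results, sensitivity rising and specificity falling as tests are added, is what an "either test positive" rule produces. The sketch below assumes that rule, which the text does not state explicitly; the function names are hypothetical and the per-patient results in the usage note are invented and are not study data.

```python
def combine_either_positive(*test_results):
    """Per-patient OR across tests: positive if any single test is positive."""
    return [any(results) for results in zip(*test_results)]


def sensitivity(predicted, truth):
    """Fraction of truly malignant cases that were called positive."""
    calls = [p for p, t in zip(predicted, truth) if t]
    return sum(calls) / len(calls)


def specificity(predicted, truth):
    """Fraction of truly benign cases that were called negative."""
    calls = [p for p, t in zip(predicted, truth) if not t]
    return sum(1 for p in calls if not p) / len(calls)


# Invented example: four malignant and four benign strictures.
truth     = [True, True, True, True, False, False, False, False]
pathology = [True, False, False, False, False, False, False, False]
ngs       = [False, True, True, False, False, False, False, False]
ca19_9    = [False, False, True, True, False, True, False, False]

path_ngs = combine_either_positive(pathology, ngs)
all_three = combine_either_positive(pathology, ngs, ca19_9)
```

In this invented example, pathology plus NGS reaches a sensitivity of 0.75 with a specificity of 1.0; adding the third, less specific marker raises sensitivity to 1.0 but lowers specificity to 0.75. Under an OR rule, every false positive from any component test carries through to the combination.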
A subanalysis of bile duct specimen subtypes, history of PSC and repeat ERCP with biomarker testing
A total of 132 biliary brushings and 114 biliary biopsies had paired pathological evaluation and BiliSeq testing (table 3). The sensitivities and specificities for pathological evaluation of biliary brushings and biliary biopsies were 35% and 98%, and 52% and 100%, respectively. In comparison, BiliSeq testing of biliary brushings and biliary biopsies was associated with higher sensitivities of 71% and 75%, respectively, and specificities of 100%. Among 114 biliary biopsies, 28 specimens were obtained by digital cholangioscopy, while the remaining specimens were acquired through cholangiography. No statistically significant differences in sensitivity and specificity were identified between cholangioscopic-guided and cholangiographic-guided biliary biopsies (online supplementary table 2). Matching biliary brushing and biliary biopsy specimens for both pathological evaluation and BiliSeq testing were submitted for 29 patients. Pathological evaluation of brushings and biopsies, and BiliSeq testing of brushings and biopsies for these cases had sensitivities of 26%, 39%, 70% and 78%, respectively, and specificities of 100%.
EUS-FNA may improve the diagnosis of malignant distal bile duct strictures. In total, 141 patients with follow-up presented with a distal bile duct stricture and a concurrent EUS was performed for 86 patients and FNA for 61 patients. Among the 61 patients with concurrent ERCP and EUS-FNA, the sensitivities and specificities of detecting a malignant distal bile duct stricture based on pathological evaluation of ERCP-obtained biliary specimens and EUS-FNA specimens were 59% and 95%, and 58% and 100%, respectively. In comparison, BiliSeq testing yielded a sensitivity of 76% and specificity of 100%. Furthermore, combining pathological evaluation of ERCP-obtained specimens and BiliSeq testing was associated with a 90% sensitivity and 95% specificity for malignancy. The sensitivity increased to 93% with the addition of EUS-FNA evaluation and the specificity remained at 95%.
Thirty-seven of 220 (17%) patients had a documented history of PSC. This cohort consisted of 12 bile duct strictures with at least high-grade biliary dysplasia and 25 biliary strictures with benign cholangiopathy. As per the American Association for the Study of Liver Diseases (AASLD) guidelines, a serum CA19-9 of >129 U/mL is a marker of suspected cholangiocarcinoma within this high-risk population.50 Therefore, based on a cut-off of 129 U/mL, serum CA19-9 was associated with a 67% sensitivity and 84% specificity for at least high-grade biliary dysplasia. Pathological evaluation had a sensitivity of only 8% and specificity of 100%. In contrast, the sensitivity and specificity of BiliSeq testing were 83% and 100%, respectively. Combining serum CA19-9 with BiliSeq testing increased the sensitivity to 92% but decreased the specificity to 84%. The addition of pathological evaluation to BiliSeq testing did not improve the sensitivity of BiliSeq alone.
A repeat ERCP with biliary brushings and/or biopsies for pathological evaluation and BiliSeq testing was performed for 20 of 220 (9%) patients (online supplementary table 3). This procedure was repeated once for 16 patients, twice for two patients and thrice for two patients. On the basis of follow-up, 15 patients had a malignant stricture and five patients had a benign stricture. Among the 15 patients with a malignant stricture, 8 patients had genomic alterations detected by BiliSeq on initial and repeat bile duct specimens. However, corresponding pathological evaluation of initial biliary brushings and/or biopsies was negative for malignancy. In fact, a preoperative diagnosis of at least suspicious for malignancy was never rendered for 5 of 8 malignant strictures. For the remaining seven patients, repeat biomarker testing was positive for malignancy by pathological evaluation and detectable genomic alterations (‘positive’) by BiliSeq testing for two patients; negative by pathology, but positive by BiliSeq for two patients; and both diagnostic modalities were negative for one patient. Inclusion of repeat ERCPs marginally increased the sensitivities and specificities of pathological evaluation, BiliSeq testing and the combination of both assays to 51% and 99%, 75% and 100%, and 87% and 99%, respectively.
Comparative BiliSeq testing of paired ERCP-obtained bile duct specimens and diagnostic pathology specimens
Diagnostic pathological material was available for 145 of 220 (66%) patients and consisted of 134 malignancies and 11 benign cholangiopathies. BiliSeq testing was performed on the diagnostic pathology specimen for 54 of 134 (40%) malignancies (online supplementary table 4). Genomic alterations were detected in all 54 cases that also included 12 patients with ERCP-obtained bile duct specimens that were negative for genomic alterations by BiliSeq. Among the remaining 42 tumours, the genomic profile was different for four cases that included additional genomic alterations than those detected by BiliSeq using ERCP-obtained bile duct specimens.
Therapeutically relevant genomic alterations and treatment data
On the basis of previously reported actionable targets and predictive markers of chemotherapeutic response, BiliSeq testing of ERCP-obtained bile duct specimens identified therapeutically relevant genomic alterations for 20 of 150 (13%) cancers (online supplementary figure 2). Alterations in genes involved in the receptor tyrosine kinase (RTK)/Ras/Mitogen-activated protein (MAP) kinase signalling pathway, which included ERBB2 (n=6), MET (n=4), ALK (n=2), FGFR2 (n=3) and FGFR3 (n=2), were detected in the majority of cases (n=13, 65%). Other potentially actionable alterations were found in ATM (n=2), IDH1 (n=1) and IDH2 (n=2).
Among the ERBB2-altered malignancies, two patients, whose adenocarcinoma harboured a copy number gain in ERBB2, received targeted chemotherapy to HER2/neu instead of standard chemotherapy alone. The first patient (online supplementary table 1, patient 157) is an elderly woman presenting with jaundice, worsening abdominal pain for a week, elevated liver function tests, a serum CA19-9 of 55 U/mL and a CT scan that showed gallstones and intrahepatic ductal dilatation. An ERCP demonstrated an intrahepatic bile duct stricture that was brushed for both pathological evaluation and BiliSeq testing. Pathological evaluation of the biliary brushings revealed highly atypical epithelial cells, suspicious for adenocarcinoma. However, BiliSeq testing detected an ERBB2 copy number gain and a TP53 mutation (p.C242fs*5 and c.723delC). Within 2 weeks, a subsequent triphasic CT scan revealed a 2.1 cm metastatic liver lesion in segment 4B/5 (figure 2A), and her serum CA19-9 was 422.4 U/mL. A diagnosis of malignancy was confirmed after repeat ERCP with both bile duct brushings and biopsy as an invasive moderately-differentiated adenocarcinoma. Clinically considered to be an intrahepatic cholangiocarcinoma, the patient was started on gemcitabine, cisplatin and trastuzumab. Within 3 months of treatment, her serum CA19-9 decreased to 14 U/mL and the metastatic liver lesion decreased to 0.6 cm. Furthermore, a follow-up ERCP noted an improvement in the patient’s bile duct stricture (online supplementary figure 3). One year after diagnosis, the patient is currently alive and well. Her serum CA19-9 continues to be within normal limits, and the metastatic lesion measures 0.3 cm in greatest dimension (figure 2B).
The second patient (online supplementary table 1, patient 166) is a middle-aged man who presented with jaundice, upper abdominal pain, elevated liver function tests, an elevated serum CA19-9 of 2383 U/mL, thickening of his perihilar bile duct on CT and a hilar stricture on MRCP. A Bismuth IIIa stricture was identified on ERCP and bile duct brushings and biopsy were positive for adenocarcinoma. The patient’s corresponding BiliSeq testing revealed a copy number gain in ERBB2 and copy number losses in TP53 and CDKN2A. The patient was referred for a liver transplantation, but on further workup was found to have a 2.6 cm metastatic lesion within segment 7/8 (figure 2C). At a serum CA19-9 of 7959 U/mL, the patient started a regimen of gemcitabine, cisplatin and trastuzumab for a clinical perihilar cholangiocarcinoma. After 3 months, the patient’s serum CA19-9 normalised to 9.4 U/mL. Eight months after diagnosis, the patient is currently alive and well. His serum CA19-9 continues to be within normal limits and the metastatic lesion decreased to 0.7 cm in greatest dimension (figure 2D).
Despite improvements in cross-sectional imaging, serum studies, endoscopic technologies and pathological evaluation, the distinction between benign and malignant bile duct strictures remains a diagnostic conundrum. In the current study, we developed and independently clinically validated a targeted NGS assay for the evaluation of bile duct strictures. We found our targeted NGS assay, BiliSeq, had a higher sensitivity (73% vs 48%) and a higher specificity (100% vs 99%) than pathological evaluation alone for detecting at least high-grade biliary dysplasia involving the bile duct. Moreover, the combination of both tests (BiliSeq and pathological evaluation) had a higher sensitivity than either test alone (83%), while maintaining a high specificity (99%).
To date, a few studies have investigated the application of targeted NGS to bile duct strictures. Recently, Bankov et al published the results of an NGS assay of similar design to BiliSeq using a retrospective cohort of 16 patients with indeterminate bile duct biopsies and cholangiocarcinoma on follow-up.51 The authors did not evaluate biliary brushing specimens and DNA obtained for NGS was extracted from FFPE tissue blocks. The authors acknowledged that due to the low input and low quality of DNA from FFPE tissue, a DNA enrichment technique was required for subsequent analysis. While a benign control group was not used for comparison, the sensitivity of their assay was 81%, which is comparable with BiliSeq. A more comprehensive assessment of NGS testing on bile duct specimens was reported by Dudley et al.6 The authors evaluated a cohort of 73 bile duct and 8 main pancreatic duct brushing specimens. Contrary to Bankov et al, their study cohort did not include bile duct biopsies. Furthermore, rather than obtaining a dedicated brushing for DNA isolation, DNA was extracted from CytoLyt-preserved specimens. As alcohol fixation may result in DNA degradation, it is not unexpected that 8 of 73 (11%) brushing specimens failed NGS testing. Within our study, none of the biliary brushing and/or biopsy specimens failed BiliSeq testing. Regardless, among the remaining 65 biliary brushing specimens, Dudley et al reported a sensitivity of 68% and a specificity of 97%, which was superior to pathological evaluation. The authors did, however, identify one false-positive result. For this case, the duration of follow-up after NGS testing was unclear and, per the authors, the possibility of dysplasia within the bile duct system could not be excluded.
Although BiliSeq did not identify any false-positive cases, there were 37 (25%) false-negative cases after accounting for repeat testing. To determine the cause of the false-negative NGS results, we performed BiliSeq testing on available diagnostic pathological material from 12 of these 37 malignancies. A genomic alteration in at least one of the 28 genes within our molecular panel was detected. Considering that the collection of bile duct specimens is technically difficult and sampling failures are known to occur, inadequate sampling of the stricture is a possible explanation. Alternatively, low specimen tumour cellularity may also be a factor. The lower limit of detection for our NGS assay is approximately 3% of mutant alleles or, for a heterozygous mutation, such as in KRAS, 6% tumour cellularity. Therefore, if the fraction of tumour cells was lower than 6%, then BiliSeq would be unable to detect a pathogenic genomic alteration. While this finding could be viewed as a technical constraint, controlling the sensitivity of BiliSeq may be beneficial, especially for specific genes. Previous studies using surgical resection material have demonstrated KRAS mutations in cases of low-grade biliary dysplasia and benign cholangiopathy.52 Therefore, increasing the sensitivity of BiliSeq for KRAS mutations may result in a decrease in specificity for at least high-grade biliary dysplasia. It is also important to emphasise that a subset of malignant neoplasms involving the bile duct may harbour genomic alterations that are not assayed by our test. For example, mutations in chromatin remodelling genes (eg, ARID1A) and FGFR2 fusion genes are present in up to 20% of intrahepatic cholangiocarcinomas.26 30 32–34 Thus, the sensitivity of BiliSeq could potentially be improved with target amplification/enrichment techniques and an expanded gene panel.
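The relationship between the 3% mutant allele threshold and the 6% tumour cellularity figure can be written out as a short calculation. The sketch below rests on simplifying assumptions that are ours, not stated in the study: every tumour cell carries the mutation, the locus is diploid in both tumour and normal cells, and there is no copy number change. The function and parameter names are hypothetical.

```python
def expected_mutant_allele_fraction(tumour_cellularity: float,
                                    mutant_copies: int = 1,
                                    ploidy: int = 2) -> float:
    """Expected fraction of mutant alleles in a mixed tumour/normal specimen.

    Assumes (simplification) that all tumour cells carry `mutant_copies`
    mutated copies out of `ploidy` total copies at the locus, and that
    normal cells have the same ploidy with no mutated copies.
    """
    if not 0.0 <= tumour_cellularity <= 1.0:
        raise ValueError("tumour_cellularity must be between 0 and 1")
    return tumour_cellularity * mutant_copies / ploidy


def minimum_tumour_cellularity(allele_fraction_limit: float,
                               mutant_copies: int = 1,
                               ploidy: int = 2) -> float:
    """Smallest tumour cell fraction whose mutation reaches the detection limit."""
    if not 0.0 < allele_fraction_limit <= 1.0:
        raise ValueError("allele_fraction_limit must be in (0, 1]")
    return allele_fraction_limit * ploidy / mutant_copies


# Heterozygous mutation (1 of 2 copies) with a 3% mutant allele detection limit.
required = minimum_tumour_cellularity(0.03)
print(f"Minimum tumour cellularity: {required:.0%}")
```

Under these assumptions a heterozygous mutation contributes one mutant allele per two alleles in each tumour cell, so the required tumour cellularity is twice the allele fraction limit. A specimen with 4% tumour cells would yield an expected mutant allele fraction of 2%, below the threshold.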
Another intriguing, but preliminary, observation from our study is the striking difference in sensitivity between BiliSeq and pathological evaluation for at least high-grade biliary dysplasia among patients with PSC. PSC is an autoimmune condition and is associated with a lifetime incidence of cholangiocarcinoma that ranges from 6% to 36%.53–56 The risk of cholangiocarcinoma is highest within the first 2 years of a PSC diagnosis, which suggests a possible window of opportunity for early detection.55 57 However, a small cholangiocarcinoma in a background of an inflammatory stricture is particularly difficult to sample and challenging to diagnose by pathological evaluation. Although the number of patients with PSC within our cohort was relatively small, the sensitivity of pathological evaluation was only 8%. In comparison, BiliSeq achieved an 83% sensitivity for at least high-grade biliary dysplasia, which provides compelling evidence to further explore its potential to improve clinical outcomes for this high-risk patient population.
Besides NGS testing, several adjunctive techniques have been developed clinically to evaluate bile duct strictures, such as digital image analysis, single gene sequencing and multicolour FISH for chromosomal polysomy. Among these ancillary studies, multicolour FISH has been adopted by many academic institutions for over a decade.15 Analogous to BiliSeq, the premise of multicolour FISH is the finding that dysplastic and malignant neoplasms involving the bile duct are often characterised by a high frequency of numerical chromosomal abnormalities. However, this test typically achieves sensitivities of only 35%–60% with minimal technological improvement in the past few years.13 15 18 Multicolour FISH is expensive, labour intensive, prone to subjective interpretation errors and requires significant technical expertise. In fact, the interpretation of multicolour FISH requires an experienced pathologist, since a paired morphological assessment of the specimen is needed to ensure that the epithelial cells of interest are being evaluated for chromosomal abnormalities. While we did not perform a formal comparison between BiliSeq and multicolour FISH, Dudley et al6 found their NGS assay had a higher sensitivity than multicolour FISH. This is not surprising as NGS testing detects copy number alterations and single nucleotide variants in multiple genes. Furthermore, with decreasing costs and the ability to batch multiple specimens in a single run, NGS testing has become widely available to both academic and non-academic institutions. Thus, NGS testing may be preferable to multicolour FISH in the evaluation of bile duct specimens. Demonstrating the clinical feasibility of applying BiliSeq to ERCP-obtained biliary specimens also lays the foundation for further applications of NGS testing to non-invasive specimen types, such as bile and serum (eg, circulating tumour DNA and exosomes).58–61
In addition to early detection, a major impetus for widespread adoption of NGS technologies has been solid tumour genotyping. Malignant neoplasms arising in or secondarily involving the bile duct system are typically recalcitrant, and current standard-of-care chemotherapeutic regimens are associated with limited efficacy. On relapse or progression after first-line or second-line chemotherapy, comprehensive genomic profiling for a subset of tumours has identified alterations that are potentially targetable or predictive markers of response to specific therapy.26 28 For example, ERBB2 amplification has been reported in intrahepatic cholangiocarcinoma, perihilar/distal cholangiocarcinoma and gallbladder adenocarcinoma with a prevalence of 5%, 17% and 19%, respectively.28 62 Among eight patients with ERBB2-amplified or HER2/neu-overexpressed gallbladder adenocarcinoma, Javle et al46 reported three had stable disease, four had partial responses and one had a complete response to targeted therapy (trastuzumab or trastuzumab with pertuzumab). Of note, five of the eight patients received first-line chemotherapy or chemoradiation prior to treatment with targeted therapy. Within our study, BiliSeq testing identified two patients with ERBB2 amplification: one intrahepatic cholangiocarcinoma and one perihilar cholangiocarcinoma. Both patients received trastuzumab in conjunction with standard first-line chemotherapy (gemcitabine and cisplatin), both exhibited measurable radiographic response and normalisation of serum CA19-9, and both are currently alive and well. While we recognise the benefits of standard chemotherapy, it seems reasonable to assume that the inclusion of trastuzumab within the chemotherapeutic regimen contributed to the significant and prolonged treatment effects observed in both patients. It should also be stressed that the identification of a targetable alteration at the time of diagnosis facilitated a personalised approach to treatment.
Moreover, ERBB2 was not the only genomic alteration detected by BiliSeq to potentially confer susceptibility to anticancer therapy.
We acknowledge that there are several limitations to this study. Although a large number of bile duct specimens were analysed, follow-up diagnostic pathology was available for only 66% of patients. However, the majority of patients with malignant neoplasms involving the bile duct system have unresectable disease at presentation or are poor surgical candidates. In addition, to ensure sufficient DNA for NGS testing, BiliSeq testing was performed on a bile duct specimen separate from the one used for pathological evaluation. This protocol may explain the incongruent results between pathological evaluation and BiliSeq testing for a subset of cases. Furthermore, this study does not address the optimal gene panel or the optimal approach to integrating NGS-based molecular testing into the evaluation of a bile duct stricture. To minimise the need for repeat sampling, we routinely perform BiliSeq on all patients with potentially malignant strictures at the time of initial ERCP. A positive BiliSeq result hastens the referral process to appropriate surgical and/or oncological subspecialists. A negative BiliSeq result provides additional reassurance to a negative pathological evaluation. Because none of these diagnostic modalities achieves a sensitivity of >95%, few patients can be fully dismissed from surveillance based on a negative BiliSeq result. A similar approach has been adopted by investigators using multicolour FISH.
It is also important to recognise that recent advancements in cholangioscopy and increasing use of EUS-FNA may potentially improve the detection of malignant strictures.63 64 While a minority of patients within this study were evaluated by cholangioscopy, we found similar diagnostic performance characteristics between cholangioscopy and cholangiography.65 The application of EUS-FNA to bile duct strictures, especially distal strictures, is reported to enhance tissue acquisition over traditional ERCP methods.66 However, patients in our study cohort evaluated using both techniques did not show any statistically significant differences in sensitivity. In fact, BiliSeq yielded a higher sensitivity than ERCP or EUS-FNA alone and, in combination with either method, increased the sensitivity of detecting a distal malignant stricture to at least 90%. Lastly, our definition of targetable genomic alterations is based on an aggregation of numerous studies. The true clinical utility of any given drug for a particular genomic alteration is unknown until a prospective clinical trial is performed. However, we were able to demonstrate at least anecdotal evidence of clinical benefit by pairing genomic alterations, such as ERBB2 amplification, with specific anticancer therapy (eg, trastuzumab).
In summary, we report the largest prospective study to examine the role of NGS-based molecular (BiliSeq) testing in the evaluation of bile duct strictures. Our results support the clinical utility of combining BiliSeq with standard pathological evaluation of bile duct specimens. Additionally, BiliSeq has the potential to identify targetable genomic alterations and, therefore, stratify patients for specific chemotherapeutic regimens. Future studies are, however, required to explore the integration of BiliSeq and similar NGS-based assays into current surveillance and management guidelines for patients with bile duct strictures.
The authors would like to thank Mrs Kate Smith for outstanding administrative assistance. This study was supported in part by the Institute for Precision Medicine at the University of Pittsburgh, the UPMC Hillman Cancer Center, Shear Family Foundation, the Pittsburgh Liver Research Center at the University of Pittsburgh and the University of Pittsburgh Medical Center, and the Sky Foundation (to ADS).
Contributors Study concept and design: ADS, MNN and AS. Acquisition of data: all authors. Analysis and interpretation of data: ADS, MNN and AS. Drafting of the manuscript: ADS and AS.
Funding This study was funded by the University of Pittsburgh Institute for Precision Medicine.
Competing interests ADS has received an honorarium from Foundation Medicine, Inc.
Ethics approval Study approval was obtained from the University of Pittsburgh Institutional Review Board (IRB# PRO17030748).
Provenance and peer review Not commissioned; externally peer reviewed.
Patient consent for publication Obtained.