Objective: Health administrative databases can be used to track chronic diseases. The aim of this study was to validate a case ascertainment definition of paediatric-onset inflammatory bowel disease (IBD) using administrative data and describe its epidemiology in Ontario, Canada.
Methods: A population-based clinical database of patients with IBD aged <15 years was used to define cases, and patient information was linked to health administrative data to compare the accuracy of various patterns of healthcare use. The most accurate algorithm was validated with chart data of children aged <18 years from 12 medical practices. Administrative data from the period 1991–2008 were used to describe the incidence and prevalence of IBD in Ontario children. Changes in incidence were tested using Poisson regression.
Results: Accurate identification of children with IBD required four physician contacts or two hospitalisations (with International Classification of Disease (ICD) codes for IBD) within 3 years if they underwent colonoscopy and seven contacts or three hospitalisations within 3 years in those without colonoscopy (children <12 years old, sensitivity 90.5%, specificity >99.9%; children <15 years old, sensitivity 89.6%, specificity >99.9%; children <18 years old, sensitivity 91.1%, specificity 99.5%). Age- and sex-standardised prevalence per 100 000 population of paediatric IBD has increased from 42.1 (in 1994) to 56.3 (in 2005). Incidence per 100 000 has increased from 9.5 (in 1994) to 11.4 (in 2005). Statistically significant increases in incidence were noted in 0–4 year olds (5.0%/year, p = 0.03) and 5–9 year olds (7.6%/year, p<0.0001), but not in 10–14 or 15–17 year olds.
Conclusion: Ontario has one of the highest rates of childhood-onset IBD in the world, and there is an accelerated increase in incidence in younger children.
Inflammatory bowel disease (IBD) is an important childhood chronic disease, with 20–30% of patients presenting under 20 years old.1 International data on trends in incidence and prevalence of childhood-onset IBD are conflicting. Some jurisdictions report increased rates of paediatric Crohn’s disease (CD) (but not ulcerative colitis (UC)),2 3 4 while others report increased rates of UC but not CD,5 or stable incidences.6 7 The incidence of CD in Canadian provinces studied to date is amongst the highest reported worldwide (13.4 per 100 000 in all age groups),8 although there is little known about paediatric IBD. There are no data on IBD in Ontario, Canada’s most populous province with 38% (12.2 million people) of the national population.9
Canada’s single-payer health system, in which all legal residents have universal access to all healthcare services, represents a unique opportunity to capture all cases of IBD within a large jurisdiction. Ontario’s health administrative databases are a large repository of all healthcare encounters for every legal resident, and these data have been used to develop surveillance programs for multiple chronic diseases including diabetes10 and asthma.11 These population-based cohorts have been used to assess epidemiology, health services use and outcomes.12 13 14 Critical to the accuracy of such data, however, is the rigorous validation of the best combination of health administrative data codes (known as a diagnostic algorithm) which most accurately define true disease.
Health administrative data have been used to describe epidemiological trends among primarily adult populations of patients with IBD in one Canadian study8 as well as among privately insured American patients.15 16 All three studies used algorithms for assessment of disease predominantly validated in adults.15 16 17 Administrative data algorithms have been reported to have differing accuracies across age groups,18 and differing healthcare patterns in children necessitate validation of a paediatric-specific algorithm.19
Our goal was to develop and validate a diagnostic algorithm using health administrative data to identify individuals with childhood-onset IBD and then to use the algorithm to estimate the incidence and prevalence of paediatric IBD in Ontario.
Administrative data sources
We used the health administrative databases available at the Institute for Clinical Evaluative Sciences (ICES; Toronto, Ontario, Canada). This study was approved by the research ethics boards of the Hospital for Sick Children (SickKids), Sunnybrook Health Sciences Centre and all institutions involved in the validation study. The databases used in this study included: hospital discharge abstract data mandatorily collected from all hospitals and reported to the Canadian Institute for Health Information, billing claims for all physician services provided from the Ontario Health Insurance Plan, and the Registered Persons Database (demographic data including region of residence). Hospital data prior to 2002 and all physician claims have diagnoses associated using codes from the International Classification of Disease (ICD)-9.20 Hospitalisations after 2002 used ICD-10 codes.21
Algorithm development sample
We used the IBD clinical database from SickKids to identify patients with childhood-onset IBD in Toronto. The database, created in 1980 to track prospectively all cases of IBD seen at SickKids, contained an estimated 90% of patients aged 0–15 years old diagnosed with IBD and residing in the census metropolitan area of Toronto in fiscal years 1991–1995.22 The database used Access 2003 (Microsoft Corporation, Redmond, Washington, USA). Prior to 1996, all paediatric gastroenterologists in Toronto practised at SickKids, and an earlier survey of adult gastroenterologists suggested that <10% of children with IBD <15 years old were independently managed by them,22 and the database is therefore considered population based for Toronto children <15 years old diagnosed from 1991 to 1995. SickKids’ patient information was linked to the ICES administrative data by health card number. Patients were excluded if they did not reside within Toronto for the entire period of 1991–2000. The remaining population of that age residing within Toronto from 1991 to 1995 was assumed not to have IBD and was used as the negative reference standard.
We determined the diagnostic accuracy (sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV)) of various combinations of physician office and procedure billings and hospital records using the diagnosis codes for CD (ICD-9: 555.x, ICD-10: K50.x) or UC (ICD-9: 556.x, ICD-10: K51.x). The 95% CIs were calculated according to the efficient-score method corrected for continuity.23 We determined whether sigmoidoscopy or colonoscopy prior to diagnosis improved the accuracy of the algorithm. The final algorithm was selected and agreed upon by a committee of five experts in the fields of paediatric and adult gastroenterology, health services research, epidemiology and biostatistics (EIB, AG, AMG, LR and TT). The committee decided on the algorithm with the highest possible PPV (to minimise the false-positive rate) while maximising sensitivity over the shortest possible duration to achieve accurate diagnosis.
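As a rough illustration of the accuracy measures and the interval method named above, the following Python sketch computes sensitivity, specificity, PPV and NPV from a 2×2 table, together with a continuity-corrected score (Wilson) interval for a single proportion. The function names are hypothetical. The count of 164 of 183 is an assumption, chosen only because it rounds to the 89.6% sensitivity quoted in the abstract; it is not taken from the study data.

```python
from math import sqrt
from statistics import NormalDist


def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def score_interval_cc(successes, n, level=0.95):
    """Continuity-corrected score (Wilson) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / n
    q = 1 - p
    denom = 2 * (n + z * z)
    lower = (2 * n * p + z * z - 1
             - z * sqrt(z * z - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    upper = (2 * n * p + z * z + 1
             + z * sqrt(z * z + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    # Clamp the limits to the unit interval at the boundaries.
    lower = 0.0 if successes == 0 else max(0.0, lower)
    upper = 1.0 if successes == n else min(1.0, upper)
    return lower, upper


# Assumed count: 164/183 rounds to 89.6%; the true count is not stated here.
low, high = score_interval_cc(164, 183)
print(f"{164 / 183:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With these assumed counts the sketch should give an interval close to the one reported for sensitivity in the algorithm development sample, which suggests (but does not confirm) that it matches the method used.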
To validate the algorithm for patients <18 years old from other regions of Ontario and those treated in a variety of practice settings, 31 practices across the province (3 academic paediatric gastroenterology, 3 community-based paediatric gastroenterology, 18 community-based adult gastroenterology, 3 adult surgery, 4 consultant paediatrics) were contacted to determine whether their providers diagnosed patients aged <18 years with IBD (to ensure accuracy of the algorithm for older age groups) from 2001 to 2005 (to ensure accuracy of the algorithm for patients diagnosed in a later time period). Sites approached included three tertiary care paediatric hospitals, and other sites were randomly selected using a provincial directory of practices. The final choices of sites to contact were based on geographic and practice diversity to ensure representation from northern, central, southern and eastern Ontario, with 3–4 practices from each region randomly selected from the directory. In participating centres, all available charts of patients diagnosed with IBD from 2001 to 2005 were reviewed to ensure accurate diagnosis based on published criteria,24 using clinical, histological and radiological information. For every IBD chart, two charts of patients without IBD were randomly selected and reviewed to act as the negative reference standard. Preference in chart review was given to patients without IBD who underwent colonoscopy because they were more likely to be misclassified as having IBD in the administrative data.
Chart extractions were conducted by two IBD specialists (EIB and DRM) and two experienced IBD research assistants. The research assistants were trained by the principal investigator (EIB). Ten charts from each practice were blindly reviewed by both to ensure consistency of diagnosis. In the cases of both assistants, there was 100% agreement with the investigator on the diagnoses. Clinical information was linked to health administrative data by health card number. Using chart information as the reference standard, we determined the parameters of diagnostic accuracy of the previously developed algorithm (sensitivity, specificity, positive likelihood ratio (LR+) and negative likelihood ratio (LR−)).
We determined whether patients had CD or UC using the latest clinical, histological and radiological information obtained from each chart. A diagnosis of CD, UC or IBD type undefined (IBD-U) was assigned by the data extractor based on published guidelines.25 Patients with IBD-U were excluded. Chart-based diagnoses were compared with ICD codes, and an algorithm was developed by assessing the combination of most recent ICD codes best able to distinguish CD from UC. The number of health services records achieving the highest possible area under the receiver operating characteristic curve (AUROC) while minimising unclassified patients was chosen, and the point of best cut-off was determined.26 Patients who could not be classified as CD or UC by our algorithm were labelled “unclassifiable”.
Estimates of incidence and prevalence
The Ontario Crohn’s and Colitis Cohort (OCCC) was created by using the validated algorithm to identify all children (6 months to 18 years) living in Ontario with IBD from 1994 to 2005, using data from the period 1991–2008. The date of incidence was assigned as the date of the first health services contact with a diagnosis of IBD within the cluster of healthcare utilisation qualifying the patient as having IBD. A 3-year look-back period (with no diagnoses of IBD at the time of physician contact or hospitalisation) was used to ensure that cases were truly incident. This period was based on the expert opinion of clinicians on the committee, who considered it unlikely that a child with IBD would be lost to follow-up for >3 years. Patients with previous diagnoses of IBD that were not part of the diagnostic cluster of the algorithm were considered prevalent, but not incident, cases.
The sex- and age-adjusted annual prevalence and incidence rates per 100 000 population were determined for 1994–2005, with corresponding 95% CIs based on gamma distribution.27 We used the Canadian censuses from 1991, 1996, 2001 and 2006 to calculate annual intercensal population estimates of children aged <18 years.28 Using multivariable Poisson regression, we modelled the relationship between year of diagnosis (as the main predictor variable) and changes in prevalence and incidence over time, controlling for age group and sex. Due to a significant interaction between age group and year of incidence, we stratified the regression by age group. All statistical analysis was conducted using SAS version 9.1.3 (SAS Institute, Cary, North Carolina, USA).
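The adjustment step can be sketched in Python as direct standardisation, in which stratum-specific rates are weighted by a fixed standard population. This is only a minimal sketch: the text does not state which standard population was used, the function names are hypothetical, and the strata below are invented for illustration. The gamma-based confidence intervals and the Poisson regression are not reproduced.

```python
def directly_standardised_rate(strata, per=100_000):
    """Weight stratum-specific rates by a standard population.

    strata: list of (cases, population, standard_population) tuples,
    one per age-sex stratum.
    """
    total_weight = sum(std for _, _, std in strata)
    weighted = sum(std * cases / pop for cases, pop, std in strata)
    return per * weighted / total_weight


def crude_rate(strata, per=100_000):
    """Unadjusted rate: all cases over the whole population."""
    cases = sum(c for c, _, _ in strata)
    population = sum(p for _, p, _ in strata)
    return per * cases / population


# Hypothetical strata, invented for illustration only.
example = [
    (10, 100_000, 1_000),  # e.g. a younger stratum
    (20, 100_000, 3_000),  # e.g. an older stratum
]
print(crude_rate(example), directly_standardised_rate(example))
```

Because the standard population here gives more weight to the stratum with the higher rate, the standardised rate in this toy example is higher than the crude rate. Holding the weights fixed across years is what makes rates from different years comparable.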
Algorithm development—study population
Within the SickKids IBD database, 183 Toronto children were identified as having IBD diagnosed between 1991 and 1995 and acted as the positive reference standard. Between 1991 and 1995, 936 514 children under 15 years old resided in Toronto and acted as the negative reference standard.
Algorithm development—diagnostic properties
The most accurate algorithm consisted of two steps, based on whether sigmoidoscopy or colonoscopy was performed (see table 1). Those that did not undergo endoscopy required more stringent criteria for accurate ascertainment. If a patient underwent endoscopy, four physician contacts or two hospitalisations with a CD or UC diagnosis within 3 years were required for accurate diagnosis. If a patient never underwent endoscopy, seven physician contacts or three hospitalisations were required. This two-step algorithm accurately predicted a true IBD diagnosis (sensitivity 89.6% (95% CI 84.0% to 93.5%), specificity >99.9% (95% CI 99.9% to 100%), PPV 59.2% (95% CI 53.1% to 65.0%), NPV >99.9% (95% CI 99.9% to 100%)). For patients <12 years old, the algorithm accurately predicted IBD with higher PPV (PPV 76.0% (95% CI 68.9% to 82.0%)). As some jurisdictions may not have endoscopic procedure data, we also determined the best single-step algorithm, ignoring whether a patient underwent diagnostic endoscopy (see table 2). A single healthcare encounter for IBD (either physician contact, procedure or hospital admission) was a poor predictor of the positive diagnosis of IBD (PPV 7.9% (95% CI 6.8% to 9.0%)).
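The two-step rule can be sketched in Python as follows. This is an illustration rather than the study's implementation: the function names are hypothetical, "within 3 years" is approximated as 3 × 365 days, and physician contacts and hospitalisations are assumed to be counted separately, which the text does not spell out. The inputs are assumed to be the dates of encounters already filtered to those carrying a CD or UC diagnosis code.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=3 * 365)  # assumed approximation of "within 3 years"


def max_in_window(dates, window=WINDOW):
    """Largest number of dates that fall inside any single window."""
    dates = sorted(dates)
    best = start = 0
    for end in range(len(dates)):
        # Shrink the window from the left until it spans at most `window`.
        while dates[end] - dates[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def meets_two_step_definition(contact_dates, hospitalisation_dates,
                              had_endoscopy):
    """Apply the endoscopy-dependent thresholds described in the text."""
    need_contacts, need_hosp = (4, 2) if had_endoscopy else (7, 3)
    return (max_in_window(contact_dates) >= need_contacts
            or max_in_window(hospitalisation_dates) >= need_hosp)


# Hypothetical patient: four IBD-coded physician contacts in one year.
contacts = [date(2003, m, 1) for m in (1, 4, 7, 10)]
print(meets_two_step_definition(contacts, [], had_endoscopy=True))
print(meets_two_step_definition(contacts, [], had_endoscopy=False))
```

The same four contacts qualify a patient who had an endoscopy but not one who did not, which is the asymmetry the two-step design is meant to capture.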
Algorithm validation by chart review
Of the 31 sites contacted, seven practitioners failed to respond, three refused to participate and nine practices had not diagnosed a child <18 years old with IBD during the relevant time period. Twelve medical practices participated in the validation study (3 academic paediatric gastroenterology, 3 community paediatric gastroenterology, 5 adult gastroenterology, 1 consultant general paediatrics). With chart review, 599 patients were confirmed to have IBD, of which 593 could be linked to administrative data (342 with CD, 226 with UC, 26 with IBD-U). A total of 551/593 (92.9%) underwent diagnostic sigmoidoscopy/colonoscopy. Of patients without IBD, 1251 charts were reviewed, 1241 of which could be linked to administrative data. Of these, 370 (29.8%) had sigmoidoscopy/colonoscopy. Based on chart review at 11/12 participating centres, the most common non-IBD diagnosis given was gastro-oesophageal reflux disease (18.2%), with other common diagnoses including idiopathic/functional abdominal pain (15.7%), irritable bowel syndrome (10.3%) and chronic constipation (9.4%). The diagnostic properties achieved for each algorithm using chart information from the different centres are provided in tables 1 and 2. The two-step algorithm achieved a sensitivity of 91.1% (95% CI 88.4% to 93.2%), specificity of 99.5% (95% CI 98.9% to 99.8%), LR+ of 188.3 (95% CI 84.7 to 418.6) and LR− of 0.0898 (95% CI 0.0694 to 0.116).
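For readers less familiar with likelihood ratios, the Python sketch below shows how LR+ and LR− follow from a 2×2 table. The cell counts are hypothetical: they were chosen to be consistent with the 593 linked IBD charts, the 1241 linked non-IBD charts and the reported sensitivity and specificity, but they are not given in the text. The log-method confidence interval is likewise an assumption, since the article does not state how the likelihood ratio intervals were calculated.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def likelihood_ratios(tp, fn, fp, tn, level=0.95):
    """Return (estimate, lower, upper) for LR+ and LR- from 2x2 counts.

    Uses the log method for the intervals; all four cells must be non-zero.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    se_pos = sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    se_neg = sqrt(1 / fn - 1 / (tp + fn) + 1 / tn - 1 / (fp + tn))

    def interval(estimate, se):
        centre = log(estimate)
        return estimate, exp(centre - z * se), exp(centre + z * se)

    return {"lr_pos": interval(lr_pos, se_pos),
            "lr_neg": interval(lr_neg, se_neg)}


# Hypothetical cells consistent with 593 IBD and 1241 non-IBD charts.
result = likelihood_ratios(tp=540, fn=53, fp=6, tn=1235)
print(result["lr_pos"], result["lr_neg"])
```

An LR+ of this size means that a child meeting the algorithm is far more likely to come from the IBD group than from the non-IBD group, while an LR− below 0.1 means that failing the algorithm makes IBD much less likely.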
The seven most recent available physician billing claims accurately determined a diagnosis of CD (AUROC 0.9618). Patients with CD were distinguished from patients with UC if five of their last seven diagnoses were for CD (sensitivity 95.1%, specificity 86.0%, PPV 92.0%, NPV 91.2%). Conversely, patients with UC were labelled as such if five of their last seven physician claims had a diagnosis of UC. Otherwise, patients were labelled unclassifiable. Using this strategy, 5.4% of patients diagnosed with CD or UC were inaccurately deemed to be unclassifiable compared with their charts. If patients did not have a record of seven physician contacts, they were labelled CD if all of their diagnostic codes were for CD, UC if all of their diagnoses were UC, or unclassifiable.
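The classification rule described above can be written as a short Python sketch. The function name and the label strings are hypothetical, the input is assumed to be a chronological list in which each physician claim has already been mapped to "CD" (ICD-9 555.x) or "UC" (ICD-9 556.x), and "five of their last seven" is read as "at least five", which is an interpretation rather than something the text states.

```python
def classify_subtype(claims, window=7, threshold=5):
    """Label a patient as 'CD', 'UC' or 'unclassifiable'.

    claims: chronological list of 'CD' / 'UC' labels, oldest first.
    """
    if not claims:
        return "unclassifiable"
    if len(claims) < window:
        # Fewer than seven claims: require every code to agree.
        return claims[0] if len(set(claims)) == 1 else "unclassifiable"
    recent = claims[-window:]
    for subtype in ("CD", "UC"):
        if recent.count(subtype) >= threshold:
            return subtype
    return "unclassifiable"


# Hypothetical histories, oldest claim first.
print(classify_subtype(["UC", "UC", "CD", "CD", "CD", "CD", "CD", "CD"]))
print(classify_subtype(["CD", "UC", "CD", "UC", "CD", "UC", "CD"]))
print(classify_subtype(["UC", "UC", "UC"]))
```

Looking only at the most recent claims lets an early working diagnosis that was later revised drop out of the window, which fits the use of the latest chart information as the reference standard.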
Estimates of incidence and prevalence
Table 3 describes prevalence and incidence among children <18 years old by year of diagnosis. Table 4 presents crude incidence and prevalence stratified by sex, age group and diagnosis (total IBD, CD or UC). Figures 1 and 2 illustrate trends over time from 1994 to 2005. Age- and sex-standardised prevalence of IBD per 100 000 population has increased from 42.1 (95% CI 39.6 to 44.8) in 1994 to 56.3 (95% CI 53.6 to 59.1) in 2005 (p<0.0001). Prevalence of CD has increased from 23.9 (95% CI 22.0 to 25.9) to 31.6 (95% CI 29.6 to 33.7) (p<0.0001). Prevalence of UC has increased from 16.2 (95% CI 14.6 to 17.8) to 19.7 (95% CI 18.1 to 21.4) (p<0.0001).
The OCCC contains 3169 incident cases of paediatric IBD diagnosed between 1994 and 2005. The incidence of IBD per 100 000 population has increased from 9.5 (95% CI 8.4 to 10.8) in 1994 to 11.4 (95% CI 10.2 to 12.7) in 2005 (p = 0.03). The incidence of CD has changed from 5.0 (95% CI 4.1 to 5.9) to 6.0 (95% CI 5.2 to 7.0) (p = 0.19). The incidence of UC has remained comparatively unchanged from 4.1 (95% CI 3.3 to 5.0) to 4.2 (95% CI 3.5 to 5.1) (p = 0.55).
Results of the adjusted regression models stratified by age group and diagnosis (CD or UC) are presented in table 5. Significant increases in incidence are seen in patients with IBD in 6 month to 4 year olds (5.0% per year, p = 0.03) and 5–9 year olds (7.6% per year, p<0.0001). When stratified by age group and diagnosis, the only group with a statistically significant increase in incidence was patients with CD aged 5–9 (8.7% per year, p<0.0001). No statistically significant interaction existed between sex and year of diagnosis. Table 5 also describes the male predominance of patients with CD in the younger age groups (5–9 years and 10–14 years), with a balanced incidence between males and females in 15–17 year olds. UC is more likely in males at younger ages (<10 years), while it is more common in females in preadolescence and adolescence.
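To make the "% per year" figures concrete, the Python sketch below converts between a Poisson regression coefficient and an annual percentage change, and compounds that change over several years. It assumes the usual log-link form of the model, in which a coefficient β for calendar year corresponds to a change of 100 × (exp(β) − 1)% per year. The coefficients themselves are not reported in the text, and the function names are hypothetical.

```python
from math import exp, log


def annual_percent_change(beta_year):
    """Percentage change per year implied by a log-link coefficient."""
    return 100 * (exp(beta_year) - 1)


def cumulative_fold_change(percent_per_year, years):
    """Fold change implied by compounding a constant annual change."""
    return (1 + percent_per_year / 100) ** years


# Back-calculated coefficient for the reported 7.6% per year (assumed model).
beta = log(1.076)
print(round(annual_percent_change(beta), 1))
# Compounded over the 11 annual steps from 1994 to 2005.
print(round(cumulative_fold_change(7.6, 11), 2))
```

Under the assumption of a constant trend, 7.6% per year compounds to a little more than a doubling over 11 years. This is a model-implied figure, not an observed ratio of the 2005 rate to the 1994 rate.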
Canadian health administrative data provide an outstanding resource for population-based chronic disease surveillance. We developed a novel algorithm for identifying children with IBD within Ontario’s administrative data. The strengths of our algorithms include an accurate determination of the PPV in light of the population-based sample with accurate estimation of prevalence in Toronto, specific applicability to children and its validation across paediatric age ranges, in both ambulatory and hospitalised populations, in multiple geographic regions and across different practice types. The higher PPVs seen in younger patients emphasise the need to validate algorithms in all age groups to which they will be applied. Our development and validation of a definition specific for children and youth should allow for better case ascertainment in Canada and other jurisdictions with comparable physician claims and hospitalisation data.
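The dependence of PPV on disease prevalence, which is why a population-based sample matters here, can be illustrated with a short Python sketch based on Bayes' theorem. The specificity of 0.99988 is an assumed value (the article reports only ">99.9%"), and the function name is hypothetical. The two prevalences are derived from counts given in the text: 183 cases among the Toronto population, and 593 IBD charts among 1834 linked validation charts.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


SENS = 0.896
SPEC = 0.99988  # assumed; the text reports only ">99.9%"

population_prev = 183 / (183 + 936_514)  # Toronto development sample
chart_prev = 593 / (593 + 1241)          # chart validation sample

print(f"population-based: {ppv(SENS, SPEC, population_prev):.1%}")
print(f"chart sample:     {ppv(SENS, SPEC, chart_prev):.1%}")
```

With identical sensitivity and specificity, the PPV is far lower at the population prevalence than in a sample enriched with cases. This is consistent with the validation results being reported as likelihood ratios rather than predictive values.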
To our knowledge, the OCCC is the largest population-based cohort of paediatric patients with IBD in the world. We found an increased prevalence and incidence over the 12 years of surveillance. This is in keeping with population-based studies from other jurisdictions which have reported increased rates of IBD in children, particularly CD.2 3 4 A recent Norwegian study demonstrated a doubling of the incidence of IBD over 15 years in children <16 years old compared with a historical cohort.29 One other study reported incidence trends by age in paediatric IBD. Armitage et al30 31 reported significantly increased incidence rates between 1981 and 1995 in both CD and UC, with females 7–11 years and both sexes 12–16 years at higher risk in later years. Our study was sufficiently powered to examine incidence trends in all age groups including the youngest children, despite the rarity of IBD in that population. We were also able to determine the ages at which the gender ratio and CD:UC ratio change, and we found no difference in the rate of increase in incidence by sex. The preponderance of male children with IBD has been well documented, and we demonstrated that an adult pattern, in which females are slightly more likely to develop IBD, occurs after the onset of puberty. Changes in gender ratio after puberty have previously been demonstrated in other immune-mediated conditions such as myasthenia gravis32 and type 1 diabetes.33
Ontario appears to have higher rates of IBD than those reported in other large and well-designed population-based studies from the USA,34 Scotland35 and other parts of the UK.30 The incidence and prevalence rates for children <18 years old are lower than those reported for some other Canadian provinces, although those provinces reported rates for <20 year olds.8 Given the peak ages of occurrence of IBD, inclusion of 18 and 19 year olds would be expected to increase rates for any “paediatric” cohort that includes these older adolescents. Additionally, the observed difference may be influenced by environmental factors, migration patterns or the accuracy of the diagnostic algorithm, which was validated primarily in adults,17 and may have performed differently in the <20 year age group.
It is noteworthy that the increased incidence demonstrated by this study appears to have occurred primarily after 2001 (see fig 2). This may be explained by an as yet undetermined environmental change or by evolving immigration patterns in recent years. In fact, the proportion of immigrants to Ontario from South Asia (India, Pakistan, Sri Lanka, etc.) has more than doubled between 1981 and 2000,36 and this group was reported to have increased rates of IBD following arrival in Canada.37 The proportion of immigrants to Ontario from other regions of the world including China, the Middle East, Europe and Africa has remained stable or decreased.36 This changing pattern of immigration may explain the stable rates of paediatric IBD in other jurisdictions (with smaller proportions of South Asian patients).
A number of limitations to this study exist. These include the lack of medication data to aid in the identification of patients with IBD. We were unable to determine whether medication data improved the test properties of our algorithm, and we would encourage jurisdictions with such data to assess whether they improve identification of children with IBD. We were able to identify that a physician billing for colonoscopy improves the confidence with which we can identify children with IBD within health administrative data. Published guidelines allow for the diagnosis of IBD without colonoscopy (using radiology or surgical pathology)25; however, the vast majority of young patients in Ontario underwent colonoscopy. Should the investigation of IBD change in the future (and the frequency of colonoscopy decrease), the two-step algorithm may no longer apply. As such, we have reported the accuracy of algorithms excluding colonoscopy (table 2). Unlike other studies validating IBD diagnostic algorithms, we accurately reported PPVs because we attempted to identify all patients with and without IBD within a jurisdiction using administrative data. The lower PPVs seen in older patients from the Toronto cohort can be explained in a number of ways. Our algorithm may be less robust in older adolescents; however, this was not observed in the chart validation portion of this study. More probably, the reference standard SickKids IBD database contains a lower percentage of mid-adolescent patients with IBD residing in Toronto than had previously been documented.22 We feel this is more likely because no matter how restrictive our algorithm, we were unable to achieve a PPV of >59.8% in <15 year olds. To be certain of the accuracy of our algorithm, we therefore confirmed it with a second reference standard: patient charts from multiple practices across the province. We achieved excellent accuracy in this sample, providing reassurance that the algorithm would accurately identify children with IBD when applied to administrative data.
It is important to view our data as physician-identified prevalence. It is possible that the reported increase in IBD may not be due to more prevalent disease but rather due to improved physician detection of the disease due to changes in physician practice patterns, improved access to diagnostic procedures or more awareness of the possibility of early-onset IBD. However, if our findings were due to earlier diagnosis of IBD (without increased overall incidence), we would expect that the incidence trends in the >10 year old groups would have decreased as they increased in the <10 year old groups, which was not the case.
In summary, we have reported on the development and rigorous validation of an algorithm to allow accurate identification of a population-based cohort of Ontario children with IBD using health administrative data from multiple sources. We report a significant rise in the prevalence and incidence of IBD, especially in children under the age of 10 years. Overall, the quality and availability of health administrative data are improving in North America and elsewhere. We encourage researchers to apply this algorithm to administrative data from other jurisdictions after validation, which will allow for future collaborative research examining paediatric IBD internationally. The population-based nature of the OCCC makes it ideal to act as an IBD surveillance programme in order to track trends over time and answer important epidemiological and health services questions about patients with childhood-onset IBD.
The authors wish to thank the physicians involved in the chart validation portion of this study: Drs Susan Kovacs (Toronto), Carol Durno (Toronto), Latifa Yeung (Toronto), Steven Brien (Peterborough), Teresa Bruni (Thunder Bay), Naoki Chiba (Guelph), William McMullen (Mississauga), Brian Murat (Huntsville), and Kimberly Tilbe (Sudbury). We are grateful to Courtney Francoeur and Hossai Muniri, RN for their assistance, as well as Peter Austin, PhD for his guidance.
Funding This research was funded by a Clinical Research Award from the American College of Gastroenterology and was made possible with the support of the Institute for Clinical Evaluative Sciences which receives funding from the Ontario Ministry of Health and Long-Term Care (MOHLTC). The results and conclusions are those of the authors; no official endorsement by the Ontario MOHLTC should be inferred. EB is a Canadian Institutes of Health Research (CIHR) training fellow in the Canadian Child Health Clinician Scientist Program, in partnership with the SickKids Foundation and the Child & Family Research Institute of British Columbia, and was also supported by fellowships from the North American Society for Pediatric Gastroenterology, Hepatology and Nutrition-Children’s Digestive Health and Nutrition Foundation, and the Clinician Scientist Training Program of the Research Institute of the Hospital for Sick Children. AG was supported by a CIHR New Investigator Award.
Competing interests None.
Provenance and Peer review Not commissioned; externally peer reviewed.
Ethics approval This study was approved by the research ethics boards of the Hospital for Sick Children (SickKids), Sunnybrook Health Sciences Centre and all institutions involved in the validation study.