

Prospective comparison of faecal incontinence grading systems


Background Existing scales for assessing faecal incontinence have not been validated against clinical assessment, or with regard to reproducibility. They also fail to take into account faecal urgency, and the use of antidiarrhoeal medications.

Aims To establish the validity, and sensitivity to change, of existing scales and a newly designed incontinence scale.

Methods (1) Twenty three patients (21 females, median age 57 years) were prospectively evaluated by two independent clinical observers, using three established scales (Pescatori, Wexner, American Medical Systems), a newly devised scale which also includes details about urgency and antidiarrhoeal drugs, and by a 28 day diary. (2) A further 10 female patients were assessed by the same scales before and after surgery for faecal incontinence.

Results (1) Assessments by two independent clinicians correlated well. All four scales and a diary card correlated highly and significantly with the clinical impression, with the new scale reaching the highest correlation (r=0.79, p<0.001). (2) All except one score changed significantly in response to surgical treatment; the new scale showed the greatest change, at the highest level of significance (p=0.004), and correlated best with the clinicians’ assessment of change (r=0.94, p<0.001).

Conclusions Existing scales for the assessment of faecal incontinence correlate well with careful clinical impression of severity, and serve as useful and reproducible measures for comparison of patients and treatments. A newly devised scale has shown high clinical validity and utility.

  • faecal incontinence
  • grading systems


A scoring system for the assessment of severity of faecal incontinence is required to gain an objective comparison of outcomes of both conservative and surgical treatments. A number of scales have been published,1-5 but their reproducibility and value have not been compared. This paper presents a validation study of three commonly used continence grading scales (tables 1, 2, and 3). These scales have not been compared with a diary system, which formed a further aim of this study.

Table 1

The Pescatori score3

Table 2

The Wexner score4

Table 3

The American Medical Systems score5

This study introduces a new scoring system, which combines components of these scales, and also contains an assessment of faecal urgency and the need to take antidiarrhoeal medication. We have been impressed by how patients may avoid incontinence by remaining close to a toilet; previous scales have not taken this urgency into account and may therefore underestimate the severity of the condition. Antidiarrhoeal drugs may also mask the underlying condition, and have therefore been taken into account in developing this modified scale. The latter is somewhat akin to the incorporation of antidiarrhoeal drug use into the Crohn’s disease activity index (CDAI).6

Creating a faecal incontinence scoring system which is both reproducible and simple to use is complex due to the variable nature of the condition. Unlike urinary incontinence, where only liquid is lost, faecal incontinence may be for solid or liquid stool or for flatus alone. Frequency and quantity of stool lost must be included in the scoring system. Faecal incontinence may be passive (that is, without the patient’s awareness) or urgent (that is, the inability to defer defecation), and both of these should be reflected in the scale. Finally, an indication of the effect of the incontinence on lifestyle adds information which may best indicate the need for treatment. This includes the need to use pads or plugs and the ability or confidence to perform work and leisure activities. These factors are all taken into account when a focused history is taken from a patient with faecal incontinence, and this has been used for comparison with the established and new scales.

Patients and methods


The Wexner Continence Grading Scale4 has become widely used for the assessment of severity of faecal incontinence. It is simple to use and easily understood by patients. We felt that there were three areas in which this scale could be improved. Firstly, the scale does not take account of faecal urgency, which can be present without overt faecal incontinence. Secondly, the need to wear a pad is given equal weighting to the occurrence of incontinence. However, the use of a pad may not be a measure of the severity of faecal incontinence, but may rather reflect the patient’s degree of fastidiousness. The use of a pad also often relates to the presence of coexistent urinary leakage. Finally, in the comparison of degree of incontinence preoperatively and postoperatively, the introduction of antidiarrhoeal drugs should be taken into account. These are often given as a part of the treatment package and a failure to recognise this could give a false impression of the surgical success rate. In developing a new scale, we felt that the Wexner scale formed an excellent basis, but with these modifications.

Our new scale (see table 4) has introduced an assessment of the ability to defer defecation and an additional score for the use of antidiarrhoeals, and reduced the emphasis on the need to wear a pad.

Table 4

The newly developed incontinence score


Twenty three consecutive patients (21 females, median age 57 years, range 30–78 years) with faecal incontinence, referred for anorectal physiological testing, were prospectively evaluated. It was calculated that a sample size of 23 would be sufficient to detect a correlation of 0.55 or better at the 5% significance level with 80% power. Eight had passive incontinence, seven had urge incontinence, and eight had both passive and urge incontinence. A further healthy female volunteer, aged 57 years, without faecal incontinence, was added to the group as a negative control. This was mainly for the purpose of ensuring that all questions were unambiguous.
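The sample size statement above can be checked with a short sketch. It assumes the widely used Fisher z approximation for a two-sided test of a correlation coefficient; the text does not say which method was actually used, and the function name `n_for_correlation` is introduced here for illustration only. Under that assumption the unrounded estimate is about 23.5 for a correlation of 0.55, close to the 23 patients reported.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation r.

    Uses the Fisher z approximation (an assumption; the source does not
    state the method): n = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    fisher_z = math.atanh(r)                       # Fisher z transform of r
    return ((z_alpha + z_beta) / fisher_z) ** 2 + 3


# Unrounded estimate for the main cohort (target correlation 0.55)
print(round(n_for_correlation(0.55), 1))
# Unrounded estimate for a target correlation of 0.70
print(round(n_for_correlation(0.70), 1))
```

The approximation gives a value slightly above the published figure, so it should be read as a plausibility check rather than a reconstruction of the original calculation.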

We chose the most recently developed and commonly used scores for evaluation: the Pescatori3 (table 1), Wexner4 (table 2), American Medical Systems (AMS)5 (table 3), and our new scale (table 4).

Two investigators (CJV and EC) independently took a detailed history from each patient and had access to examination findings, anorectal physiological tests, and the endoanal ultrasound. Each then gave the patient a “clinical score”, on the scale of 0 to 20, designed to reflect the severity of faecal incontinence based on the clinical information without use of a formal scoring system. A third investigator (JAC) did not take a history or have access to clinical information but assisted the patients with the written incontinence scoring systems. All three investigators were blinded to each other’s results.

As a separate measure, each patient was sent home with a 28 day scored diary (fig 1). Items were each allocated a numerical value based on our perceived estimate of the severity of a particular symptom, ranging from 0.5 to 2, with a possible maximum score of 10 each day, and a possible maximum for the 28 days of 280.

Figure 1

Diary card. The patients were sent home with 28 of these diary cards and requested to fill out one each night for four weeks. Each positive answer resulted in a numerical score as listed. Maximum score per day = 10 = worst incontinence.
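The diary arithmetic described above (a maximum of 10 per day, 280 over 28 days) can be sketched as follows. The individual item weights from the diary card are not reproduced in this text, so the sketch starts from daily totals; the function names and the example values are hypothetical.

```python
DAYS = 28            # length of the diary period, from the text
MAX_PER_DAY = 10.0   # maximum daily score, from the text


def diary_total(daily_scores):
    """Sum 28 daily diary scores after basic range checks."""
    if len(daily_scores) != DAYS:
        raise ValueError("expected one score for each of the 28 days")
    for score in daily_scores:
        if not 0 <= score <= MAX_PER_DAY:
            raise ValueError("daily score must lie between 0 and 10")
    return sum(daily_scores)


def as_percent(score, maximum):
    """Express a score as a percentage of its possible maximum."""
    return 100.0 * score / maximum


# Made-up daily totals, for illustration only
example_days = [3.5] * 14 + [2.0] * 14
total = diary_total(example_days)
print(total, as_percent(total, DAYS * MAX_PER_DAY))
```

Expressing the diary total as a fraction of 280 puts it on the same 0 to 100 footing as any other scale whose maximum is known.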


Retesting using each of the four scoring systems was performed on a randomly selected subset of 13 of the 24 patients at a median of 14 days (range 8–20 days) after the first test. Retesting 13 patients allowed detection of a correlation of 0.7 or better at the 5% significance level with 80% power.


A further 10 female patients (median age 57 years, range 31–64), were prospectively evaluated using the four scoring systems before and six weeks after surgery for faecal incontinence. The improvement in incontinence scores was then correlated with the investigators’ assessment of improvement. Five patients underwent an overlapping anterior sphincter repair7 for obstetric damage and five underwent implantation of an artificial bowel sphincter.8


To compare the clinical assessment with the four incontinence scales and the diary card, all scores were converted to percentages. The data were found to be normally distributed using the Shapiro-Francia W′ test. Statistical analysis by paired t test compared the mean of the clinical impression scores of investigators 1 (CJV) and 2 (EC). Analysis of variance (ANOVA) was used to determine interobserver reliability using the variance between observers, between patients, and error. The mean of the clinical impression scores for the two observers was then correlated with each of the incontinence scoring systems using the Pearson correlation. The test-retest reliability was calculated as the proportion of the total variability (patients + occasions + error) due to variation between patients. A value of p<0.05 was considered significant.
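A minimal sketch of the correlation step described above: the two observers' clinical scores (0 to 20) are averaged, expressed as percentages, and correlated with a grading scale using the Pearson coefficient. All data values below are made up for illustration, and the function names are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_clinical_percent(observer1, observer2, maximum=20):
    """Mean of two observers' scores, as a percentage of the maximum."""
    return [100.0 * (a + b) / 2 / maximum for a, b in zip(observer1, observer2)]


# Hypothetical scores for five patients (not study data)
observer1 = [4, 10, 15, 8, 18]
observer2 = [6, 9, 16, 10, 17]
scale_percent = [20.0, 45.0, 80.0, 50.0, 85.0]

clinical = mean_clinical_percent(observer1, observer2)
r = pearson_r(clinical, scale_percent)
print(clinical, round(r, 2))
```

The Pearson coefficient is unchanged by rescaling either variable, so the conversion to percentages matters for comparing means across scales, not for the correlations themselves.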



Results

Table 5 details the clinical and grading scale scores. Eighteen of 23 patients (78%) completed and returned the 28 day diary card. The mean (SD) diary score was 91 (59), range 0 to 205. There was no significant difference between the mean clinical impression scores of investigators 1 and 2 (difference in means 4.2, 95% confidence interval −0.8 to 9.1, p=0.09, paired t test). There was no significant bias between the two observers. The interobserver reliability was 0.88 (ANOVA). The mean of the two clinical impression scores was correlated with each of the incontinence grading scales. Table 6 summarises correlation coefficients. The control scored zero on clinical assessment and on all of the scoring systems evaluated. There were significant correlations between the mean clinical impressions and all the incontinence grading systems. The highest correlation was with our newly devised scale and the Wexner scale, and the lowest with the AMS score.

Table 5

Patient score on the clinical assessments and grading scales

Table 6

Correlation of scoring systems with clinical assessment


Table 7 summarises the estimated variance components and test-retest reliability of the four scales. A value of zero was recorded wherever the estimated variance component was negative. The newly devised scale had the highest test-retest reliability.

Table 7

The estimated variance components and test-retest reliability of the four scales
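The reliability measure used here (between-patient variance as a proportion of patient, occasion, and error variance) can be sketched as below. The estimator is an assumption: it is the standard method-of-moments estimate from a two-way layout without replication, with negative components set to zero as described in the text. The function names and the scores are hypothetical.

```python
def variance_components(scores):
    """Estimate (patient, occasion, error) variance components.

    scores[i][j] is the score of patient i on occasion j. Components are
    estimated from two-way ANOVA mean squares; negative estimates are
    truncated to zero.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    ss_pat = k * sum((m - grand) ** 2 for m in row_means)
    ss_occ = n * sum((m - grand) ** 2 for m in col_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_tot - ss_pat - ss_occ

    ms_pat = ss_pat / (n - 1)
    ms_occ = ss_occ / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    var_err = max(0.0, ms_err)
    var_pat = max(0.0, (ms_pat - ms_err) / k)
    var_occ = max(0.0, (ms_occ - ms_err) / n)
    return var_pat, var_occ, var_err


def reliability(scores):
    """Proportion of total variability due to variation between patients."""
    var_pat, var_occ, var_err = variance_components(scores)
    total = var_pat + var_occ + var_err
    if total == 0:
        raise ValueError("no variability in the scores")
    return var_pat / total


# Made-up test and retest scores (percent) for five patients
example = [[10, 12], [40, 38], [70, 75], [25, 24], [55, 50]]
print(variance_components(example), round(reliability(example), 3))
```

The same calculation applies to interobserver reliability if the columns hold the two observers' scores rather than two test occasions.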


There were significant changes for all scores except for the AMS score, but the largest and the most significant change was for the newly devised scale (table 8). There were significant correlations between all incontinence grading systems and the investigators’ assessment of improvement, the highest correlation being for the newly devised scale and the lowest for the AMS scale (table 9).

Table 8

Pre- and postoperative assessments (sensitivity of scales to change)

Table 9

Correlations between incontinence grading systems and the investigators’ assessment of improvement


Discussion

Browning and Parks produced one of the first scoring systems for faecal incontinence.1 That scale had the advantage of simplicity but only assessed whether the patient was incontinent for solid or liquid stool, or flatus. A patient with daily loss of large volumes of liquid stool was scored as less severely incontinent than one with infrequent loss of a small amount of both solid and liquid stool.

Millar et al devised a score which took into account both the degree and frequency of incontinence.2 This score was further modified by Pescatori et al to increase the sensitivity of the frequency scale.3 This scoring system was limited to a score out of only six points and did not take account of the amount of stool lost. Williams and colleagues9 and Baeten and colleagues10 used similar scales when evaluating the outcome of treatment with dynamic graciloplasty.

Wexner developed the first incontinence scoring system to take into account usage of pads and lifestyle alteration as well as the consistency and frequency of incontinence.4 The recently developed scoring system from American Medical Systems5 has been used to evaluate the newly designed artificial bowel sphincter. It used a more complex scoring questionnaire, asking the patient for a retrospective evaluation of the previous four weeks. It included consistency of stool lost, frequency, and effect on lifestyle. However, it was complex and the final scores ranged from 0 to 120 with a choice of six different frequencies of incontinence.

This study has shown that our new scale closely correlates with a detailed clinical assessment by two independent observers. It has also shown this scoring system to be superior to the other scores with respect to reproducibility and sensitivity to change produced by definitive treatment. We have also shown that three out of the four tested clinical scales and the prospectively collected diary card correlated well with clinical evaluation of two observers.

Clinical assessment of severity of faecal incontinence varies between clinicians according to their expertise. This causes difficulties when comparing results of published data, often making comparisons of treatment modalities meaningless. Many attempts have been made in the past to develop scoring systems but their clinical applicability has not been validated adequately. This study has established the validity of these scoring systems, and refined a well established scale to take into account important clinical parameters.


We are grateful to Mrs Caroline Dore, Department of Medical Statistics and Evaluation, Imperial College School of Medicine, for her statistical analysis of this study.




  • Abbreviations:
    Crohn’s disease activity index
    American Medical Systems
    analysis of variance
    artificial bowel sphincter
