Computational modelling
ELDA: Extreme limiting dilution analysis for comparing depleted and enriched populations in stem cell and other assays

https://doi.org/10.1016/j.jim.2009.06.008

Abstract

ELDA is a software application for limiting dilution analysis (LDA), with particular attention to the needs of stem cell assays. It is the first limiting dilution analysis software to provide meaningful confidence intervals for all LDA data sets, including those with 0% or 100% responses. Other features include a test of the adequacy of the single-hit hypothesis, tests for frequency differences between multiple data sets, and the ability to take advantage of cases where the number of cells in the sample is counted exactly. A webtool at http://bioinf.wehi.edu.au/software/elda/ provides an easy user interface.

Introduction

A limiting dilution assay is an experimental technique for quantifying the proportion of biologically active particles in a larger population (Finney, 1952, Fazekas de St. Groth, 1982, Taswell, 1987). It is a type of dose–response experiment in which each individual culture allows a negative or positive response. Replicates are conducted which vary in the number of active particles tested. The process of dilution of the dose is typically continued to extinction of the response, or close to it. The rate of positive and negative responses at each dose allows the frequency of biologically active particles to be inferred.

Limiting dilution assays have been actively used in a wide variety of biological and scientific contexts for more than a century, most notably for quantifying bacteria (Phelps, 1908), immunocompetent cells (Makinodan and Albright, 1962) or stem cells (Breivik, 1971). In immunology, limiting dilution assays were popularized by the work of Lefkovits and Waldmann (1979) as a systematic technique for the study of B-cells and T-cells and their interactions. In different application areas an individual assay can take on different forms. In stem cell or cancer research, an assay might actually consist of, for example, an in vivo transplantation or injection. In this article, we will use the term “culture” to refer to an individual assay, regardless of the application area.

We use the term limiting dilution analysis (LDA) to refer to the statistical analysis of data from limiting dilution assays. LDA typically assumes the Poisson single-hit model, under which the number of biologically active particles in each culture varies according to a Poisson distribution, and a single biologically active cell is sufficient for a positive response from a culture (Greenwood and Yule, 1917, Taswell, 1981). As a statistical technique, LDA applies equally to a range of experimental scenarios which produce dose–response data, whether or not these are limiting dilution assays in the strict sense. From this wider point of view, the main requirements are that the cultures are independent and that the frequency of biologically active particles is constant.
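
As a minimal sketch of the Poisson single-hit model: if the number of active cells in a culture is Poisson with mean equal to frequency times dose, the culture is negative only when that number is zero, so the probability of a positive response is 1 - exp(-frequency × dose). The function name `prob_positive` and the example frequency and doses are illustrative assumptions, not taken from the article.

```python
import math

def prob_positive(dose, frequency):
    """Probability that a culture is positive under the Poisson single-hit model.

    The count of active cells is Poisson with mean frequency * dose, and a
    culture is negative only if that count is zero.
    """
    return 1.0 - math.exp(-frequency * dose)

# Hypothetical example: one active cell per 1000 cells, three doses.
for dose in (100, 1000, 10000):
    print(dose, prob_positive(dose, 1.0 / 1000.0))
```

At a dose equal to the reciprocal of the frequency, the expected number of active cells is one and the positive rate is 1 - exp(-1), roughly 63%.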

The classical aim of LDA is to estimate the active cell frequency (Finney, 1952, Lefkovits and Waldmann, 1979, Taswell, 1981). Fisher (1922) showed that the estimator with the best possible precision can be derived by the statistical method of maximum likelihood estimation (MLE). The same estimation strategy was outlined even earlier by McCrady (1915). An efficient computational algorithm for MLE was worked out by Mather (1949) and Finney (1951). The MLE computations for LDA became available in general purpose statistical software after they were shown to fall within the framework of generalized linear models (GLM) by Nelder and Wedderburn (1972) and McCullagh and Nelder (1989). Free open-source GLM software has been available through the R project (www.r-project.org) since the late 1990s, although this software is designed for mathematicians and statisticians rather than biologists or immunologists. The GLM approach to LDA frequency estimation is also implemented in the Microsoft Windows software application L-Calc (Stem Cell Technologies, www.stemcell.com), and this version has proved highly popular (Omobolaji et al., 2008, Bowie et al., 2007, Chen et al., 2008, Eirew et al., 2008, Huynh et al., 2008, Janzen et al., 2006, Kent et al., 2008, Liang et al., 2007, Maillard et al., 2008, Oostendorp et al., 2008, Sambandam et al., 2005, Schatton et al., 2008, Walkley et al., 2007).
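
To make the maximum likelihood idea concrete, the sketch below finds the frequency that maximises the single-hit likelihood for a small table of doses, cultures tested and positive cultures. It uses plain bisection on the derivative of the log-likelihood, which is a simplification chosen for readability and not the iterative GLM algorithm used by statistical packages. The function names and the example data are hypothetical.

```python
import math

def score(frequency, doses, tested, positive):
    """Derivative of the single-hit log-likelihood with respect to frequency."""
    total = 0.0
    for d, n, r in zip(doses, tested, positive):
        x = frequency * d
        # Each positive culture contributes d / (exp(x) - 1); the term is
        # negligible for very large x, where exp(x) would overflow.
        pos_term = d / math.expm1(x) if x < 700 else 0.0
        total += r * pos_term - (n - r) * d
    return total

def mle_frequency(doses, tested, positive, iterations=200):
    """Maximum likelihood estimate of the active cell frequency by bisection."""
    if sum(positive) == 0 or sum(positive) == sum(tested):
        raise ValueError("no finite, non-zero MLE when all cultures agree")
    # The score decreases in frequency, so bracket its root and bisect.
    lo = hi = 1.0 / max(doses)
    while score(lo, doses, tested, positive) < 0:
        lo /= 2.0
    while score(hi, doses, tested, positive) > 0:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if score(mid, doses, tested, positive) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical data: three doses, six cultures each.
estimate = mle_frequency([10, 100, 1000], [6, 6, 6], [1, 3, 6])
print("estimated frequency: 1 in", round(1.0 / estimate))
```

With a single dose the estimate reduces to the closed form -log(1 - r/n)/d, which gives a convenient check on the routine.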

MLE is not the only efficient estimation strategy for LDA. Taswell (1981) showed that minimum chi-square (MC) estimation has equal or even better accuracy than MLE in certain situations, when the number of distinct doses is small but the number of replicates is large. Strijbosch et al. (1987) argued that MLE could be further improved by incorporating a jackknife correction for bias. However, the difference in performance between these methods is small. MLE remains our method of choice because it provides the most flexible and powerful framework for confidence intervals and hypothesis testing as well as estimation. Unfortunately, Lefkovits and Waldmann (1979) recommended a more statistically naïve method for LDA based on least squares regression (LS). Taswell (1981) showed LS to be an order of magnitude less accurate than either MLE or MC. While LS gives acceptable results when the number of replicate cultures is very large (Lefkovits and Waldmann recommend a minimum of 60 replicate cultures per dose), it proves dangerously unreliable in the common situation that the data are less plentiful (Taswell, 1981).

There are at least two other distinct scientific aims which LDA might have, apart from the classical aim of estimating the active cell frequency. A second common aim is to check the validity of the single-hit hypothesis. A third possible aim, which has so far received little attention, is to compare the active cell frequency between different cell populations. Understanding these aims has a profound influence on the experimental design.

In stem cell research, a very common aim, perhaps the key aim, is to isolate as pure a population of stem cells as possible. In pursuit of this aim, it is common to sort cells according to different markers, and test for stem cell enrichment in the sorted subpopulations. In this process, a precise estimate of stem cell frequency may not be required in populations which are clearly depleted for these cells. Indeed, when an effective stem cell marker is discovered, the sorting process leads naturally to subpopulations which contain no stem cells, and hence give no positive cultures at any dose in a dilution assay (Vaillant et al., 2008). In this situation, it is of interest to establish that the subpopulation is significantly depleted relative to the enriched population or, even better, to place an upper bound on the stem cell frequency which could reasonably be in the depleted population. Pursuing a precise estimate of the active cell frequency would be meaningless. The converse situation also arises. Quintana et al. (2008) show that cancer stem cells are more common than previously appreciated, and present many assays with 100% positive results. In this situation it is of interest to place a lower bound on the stem cell frequency. LDA methods have not so far covered these situations.
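
One natural way to obtain such one-sided bounds under the Poisson single-hit model is sketched below. It is offered as an illustration under that assumption, not as a statement of the exact procedure implemented in ELDA. With no positive cultures, the probability of the data is exp(-frequency × total cells tested), so the upper bound is the largest frequency for which that probability is still at least alpha. With all cultures positive, the lower bound is the smallest frequency for which the all-positive outcome still has probability at least alpha. Function names are hypothetical.

```python
import math

def upper_bound_all_negative(doses, tested, alpha=0.05):
    """Largest frequency for which P(no positive cultures) is at least alpha."""
    total_cells = sum(d * n for d, n in zip(doses, tested))
    return -math.log(alpha) / total_cells

def log_prob_all_positive(frequency, doses, tested):
    """Log-probability that every culture is positive; log(1 - exp(-x)) per culture."""
    return sum(n * math.log(-math.expm1(-frequency * d))
               for d, n in zip(doses, tested))

def lower_bound_all_positive(doses, tested, alpha=0.05, iterations=200):
    """Smallest frequency for which P(all cultures positive) is at least alpha."""
    target = math.log(alpha)
    # The log-probability increases with frequency, so bracket and bisect.
    lo = hi = 1.0 / max(doses)
    while log_prob_all_positive(lo, doses, tested) > target:
        lo /= 2.0
    while log_prob_all_positive(hi, doses, tested) < target:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if log_prob_all_positive(mid, doses, tested) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical depleted population: 10 cultures of 100 cells, none positive.
print("upper bound: 1 in", round(1.0 / upper_bound_all_negative([100], [10])))
```

Both bounds tighten as more cells are tested, which matches the intuition that a larger all-negative experiment rules out smaller frequencies.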

Having good statistical power to check the single-hit model requires that a wide range of different dilutions are used, with a moderate to large number of replicate cultures and with a worthwhile number of both positive and negative results. Many lack of fit tests have been proposed (Stein, 1922, Moran, 1954a, Moran, 1954b, Armitage, 1959, Cox, 1962, Shortley and Wilkins, 1965, Gart and Weiss, 1967, Thomas, 1972, Lefkovits and Waldmann, 1979, Taswell, 1984, Bonnefoix and Sotto, 1994, Bonnefoix et al., 1996, Bonnefoix et al., 2001). Some of the tests are graphically motivated (Shortley and Wilkins, 1965, Gart and Weiss, 1967, Bonnefoix et al., 2001). Lefkovits and Waldmann (1979) also emphasize the need to plot the data to check the assumptions. Two major types of deviation from the model can be detected. Firstly, there is the multi-hit possibility, whereby the single-hit hypothesis might be false, and some sort of mechanism involving multiple cells might in fact contribute to a positive culture response. In this case, the proportion of positive assays is likely to increase more rapidly than expected as the cell dose is increased. Secondly, the single-hit model might be correct but the assays may not be homogeneous in terms of the active cell frequency. In this case, the proportion of positive cultures is likely to increase more slowly as the dose increases than the classic model would predict, although rapid increase is also possible if the heterogeneity is correlated with dose. These two possibilities correspond respectively to curves bending down and curves bending up in the plots of Lefkovits and Waldmann (1979). However, these two possibilities have not always been clearly distinguished in the literature. Cox (1962) and Thomas (1972) test a particular multi-hit model, although this test is relatively difficult to implement and interpret. Shortley and Wilkins (1965) and Gart and Weiss (1967) concentrate on heterogeneity whereas Bonnefoix et al. (1996) concentrate on the single-hit hypothesis. However, these are all regression-based tests which are straightforward to implement and interpret, and have good properties in small samples. The Pearson goodness of fit tests proposed by Lefkovits and Waldmann (1979) and Taswell (1981) do not distinguish the two types of deviation. Pearson tests also have poor power (Bonnefoix and Sotto, 1994), and are unreliable when the number of replicate cultures is small (McCullagh, 1985).
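
As a hedged illustration of how a regression-based test can separate the two kinds of deviation, one common formulation adds a slope on log dose to the single-hit model. The symbols α and β are introduced only for this sketch, and p_i denotes the probability of a positive culture at dose d_i.

```latex
\log\{-\log(1 - p_i)\} = \alpha + \beta \log d_i
```

Under the single-hit model, -log(1 - p_i) equals the frequency times d_i, so β = 1 and α is the log frequency. A slope β > 1 means the positive rate rises faster with dose than the model predicts, as expected under multi-hit mechanisms, while β < 1 means it rises more slowly, as expected under heterogeneity.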

In many immunological contexts, the only practical way to assess the single-hit hypothesis is by way of the statistical tests described above. However, there are experimental situations for which it is worthwhile and practical to validate the assumption experimentally. Shackleton et al. (2006), Quintana et al. (2008), Leong et al. (2008) and Vermeulen et al. (2008) validate the single-hit hypothesis experimentally, by confirming a single input cell in each culture by microscope visualization, before the assay is conducted. The fact that any of the single-cell assays leads to a positive response is then proof that a single cell is sufficient. Where the single-hit hypothesis can be confirmed experimentally, as in these cases, the need to validate the hypothesis statistically in each and every assay is no longer compelling, although the need to check heterogeneity remains. If the single-hit model can be assumed, then the active cell frequency may be accurately estimated from a limited number of distinct dilutions, provided that a worthwhile number of positive and negative cultures are available from at least one dilution.

Counting the number of cells also has the consequence that the number of cells no longer follows a Poisson distribution, but rather is a fixed quantity. This means that the classical Poisson model of LDA does not apply.
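
The contrast between the two situations can be written out as a short worked comparison. This is a sketch under the single-hit assumption; the symbols λ (active cell frequency in the Poisson case), π (probability that any one counted cell is active) and λ* are notation introduced here for illustration.

```latex
\begin{align*}
\text{Poisson number of cells, mean } d: \quad & P(\text{negative}) = e^{-\lambda d}, \\
\text{exactly } d \text{ cells}: \quad & P(\text{negative}) = (1 - \pi)^{d} = e^{-\lambda^{*} d},
\qquad \lambda^{*} = -\log(1 - \pi).
\end{align*}
```

The fixed-count probability has the same exponential form in the dose, but the fitted rate λ* must be transformed back through π = 1 - exp(-λ*) to give the proportion of active cells. The two agree closely when π is small, since -log(1 - π) ≈ π.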

This article describes a coherent approach to LDA which includes extreme data situations, multiple populations and non-Poisson situations. The approach is implemented in the ELDA (Extreme LDA) webtool for LDA. ELDA provides a convenient interface for users without any need to download or install software. ELDA implements the GLM approach to LDA, with a number of extensions to cover situations commonly seen in current stem cell and other medical research, but not covered by classical analysis. Hypothesis tests are provided, using standard GLM theory, to compare active cell frequencies between two or more cell populations. Although these tests use standard GLM theory, they have not been fully available previously in specialist LDA software. In a novel extension, one-sided confidence intervals are provided for the active cell frequency when 0% or 100% positive responses are observed at all doses. The traditional assumption that the number of cells follows a Poisson distribution is also varied to allow for the possibility that the number of cells in the culture is observed exactly. We show that the GLM framework still applies, with a minor modification, even when the total number of cells is not Poisson but is fixed. The graphical displays recommended by Lefkovits and Waldmann (1979) are included but with efficient estimation of the active cell frequency.
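
A likelihood ratio test is one standard way to compare frequencies between two populations within likelihood theory. The sketch below assumes that choice, since this passage does not say which test statistic ELDA uses. It fits the single-hit model separately to each population and to the pooled data, then refers twice the gain in log-likelihood to a chi-square distribution with one degree of freedom. All function names and data are hypothetical, and each data set is a list of (dose, tested, positive) rows.

```python
import math

def log_lik(f, data):
    """Single-hit log-likelihood, up to a constant, for (dose, tested, positive) rows."""
    total = 0.0
    for d, n, r in data:
        x = f * d
        if r > 0:
            total += r * math.log(-math.expm1(-x))  # log(1 - exp(-x))
        total -= (n - r) * x
    return total

def mle(data, iterations=200):
    """Bisection on the score; needs at least one positive and one negative culture."""
    positives = sum(r for _, _, r in data)
    if positives == 0 or positives == sum(n for _, n, _ in data):
        raise ValueError("extreme data: use a one-sided bound instead")

    def score(f):
        s = 0.0
        for d, n, r in data:
            x = f * d
            s += (r * d / math.expm1(x) if x < 700 else 0.0) - (n - r) * d
        return s

    lo = hi = 1.0 / max(d for d, _, _ in data)
    while score(lo) < 0:
        lo /= 2.0
    while score(hi) > 0:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def compare_two_populations(data_a, data_b):
    """Likelihood ratio test of equal frequency; returns (statistic, p_value)."""
    separate = log_lik(mle(data_a), data_a) + log_lik(mle(data_b), data_b)
    pooled_data = data_a + data_b
    pooled = log_lik(mle(pooled_data), pooled_data)
    stat = max(2.0 * (separate - pooled), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 degree of freedom
    return stat, p_value

# Hypothetical enriched and depleted populations, one dose each.
print(compare_two_populations([(100, 10, 9)], [(100, 10, 1)]))
```

This sketch deliberately refuses populations with 0% or 100% positive cultures, where the maximum likelihood estimate is not finite and non-zero and a different treatment is needed.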

We give tests of heterogeneity and the single-hit hypothesis which are adapted from Gart and Weiss (1967) and Bonnefoix et al. (1996) and which take advantage of the GLM framework. The GLM test has the best performance of the goodness of fit tests in small samples, and it also has the ability to distinguish heterogeneity of samples from multi-hit alternatives.

ELDA has already proved valuable for LDA in a wide variety of high-profile research areas (Diaz-Guerra et al., 2007, Hosen et al., 2007, Leong et al., 2008, Quintana et al., 2008, Shackleton et al., 2006, Siwko et al., 2008, Vaillant et al., 2008, Vermeulen et al., 2008).

The ELDA webtool is described in Section 2, and Section 3 gives examples of usage. These two sections are written for readers wishing to use the webtool. Section 4 gives details of the statistical methodology for readers wanting the mathematical background. Section 5 finishes with discussion and conclusions.

The ELDA webtool

ELDA is an online tool for limiting dilution analysis. Users simply cut and paste a table of data into the web page. There is no need to download software or to undertake any programming.

ELDA accepts an input data table of three or four columns, separated by any combination of commas, spaces or tabs (Table 1). Users can type the data directly into the webpage text field, or can simply cut and paste the whole table from any spreadsheet application. Each row of data gives results for a particular

Confidence intervals and tests

A key facility of ELDA is the ability to handle extreme data situations. Table 1 shows a small data example which illustrates some of the capabilities of the software. This gives data on the frequency of repopulating mammary cells from a tumorigenic mouse model (Vaillant et al., 2008). Here a positive assay is one which results in a visible mammary epithelial outgrowth. In this experiment, the wild-type cells did not produce any outgrowths, although this might be due to insufficient cell

Generalized linear models

In this section, we outline the statistical methodology behind the ELDA software. We begin by outlining the GLM approach to LDA. Alternative introductions to GLMs can be found in Collett (1991) and Bonnefoix et al. (1996).

The fundamental property of limiting dilution assays is that each culture results in a positive or negative result. Write p_i for the probability of a positive result given that the expected number of cells in the culture is d_i. If n_i independent cultures are conducted at dose d_i

Discussion and conclusion

Despite more than a century of methodological development for LDA, the best methods have not generally been available to immunologists because of lack of easily accessible software.

The ELDA webtool gives researchers access to optimal LDA statistical techniques without the need to install software or to undertake any programming. The aims are (i) to give confidence intervals for the active cell frequency, (ii) to compare the active cell frequency across multiple cell subpopulations and (iii) to

Acknowledgments

Thanks to Mark Shackleton, Francois Vaillant, Jane Visvader and Geoff Lindeman for valuable discussions and feedback and for the use of unpublished data. Keith Satterley created the original web interface for ELDA.

References (56)

  • Bonnefoix T. et al. Graphical representation of a generalized linear model-based statistical test estimating the fit of the single-hit Poisson model to limiting dilution assays. J. Immunol. (2001)
  • Bowie M.B. et al. Identification of a new intrinsically timed developmental checkpoint that reprograms key hematopoietic stem cell properties. Proc. Natl. Acad. Sci. (2007)
  • Breivik H. Haematopoietic stem cell content of murine bone marrow, spleen, and blood. Limiting dilution analysis of diffusion chamber cultures. J. Cell. Physiol. (1971)
  • Clopper C.J. et al. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika (1934)
  • Collett D. Modelling binary data (1991)
  • Cox D.R. Further tests of separate families of hypotheses. J. R. Stat. Soc. (1962)
  • Cox D.R. et al. Theoretical statistics (1974)
  • Diaz-Guerra E. et al. CCL2 inhibits the apoptosis program induced by growth factor deprivation, rescuing functional T cells. J. Immunol. (2007)
  • Eirew P. et al. A method for quantifying normal human mammary epithelial stem cells with in vivo regenerative ability. Nat. Med. (2008)
  • Fazekas de St. Groth S. The evaluation of limiting dilution assays. J. Immunol. Methods (1982)
  • Fears T.R. et al. A reminder of the fallibility of the Wald statistic. Am. Stat. (1996)
  • Finney D.J. The estimation of bacterial densities from dilution series. J. Hyg. (1951)
  • Finney, D.J., 1952, Statistical Method in Biological Assay 1st, 2nd and 3rd Eds., Charles Griffin,...
  • Fisher R.A. On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. Lond. Ser. A. (1922)
  • Gart J.J. et al. Graphically oriented tests for host variability in dilution experiments. Biometrics (1967)
  • Greenwood M. et al. On the statistical interpretation of some bacteriological methods employed in water analysis. J. Hyg. (1917)
  • Holm S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. (1979)
  • Hosen N. et al. Bmi-1-green fluorescent protein-knock-in mice reveal the dynamic regulation of bmi-1 expression in normal and leukemic hematopoietic cells. Stem Cells (2007)