Tackling the widespread and critical impact of batch effects in high-throughput data

Nat Rev Genet. 2010 Oct;11(10):733-9. doi: 10.1038/nrg2825. Epub 2010 Sep 14.

Abstract

High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.
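To illustrate why correlation between batch and outcome matters, the following is a minimal simulation sketch and is not taken from the article. It assumes a single hypothetical gene with no true case/control difference, Gaussian measurement noise, and a constant additive batch shift. The function name `simulate`, the design labels and all parameter values are illustrative assumptions.

```python
import random
from statistics import mean


def simulate(design, batch_shift=2.0, n_per_group=200, seed=0):
    """Return mean(case) - mean(control) for a gene with NO true effect.

    design: "confounded" puts every case in batch 1 and every control in
            batch 0; "balanced" alternates batches within each group.
    batch_shift: hypothetical additive offset applied to batch 1 samples.
    """
    rng = random.Random(seed)
    values = {"case": [], "control": []}
    for group in ("case", "control"):
        for i in range(n_per_group):
            if design == "confounded":
                batch = 1 if group == "case" else 0
            else:
                batch = i % 2  # half of each group in each batch
            # Biological signal is identical in both groups (mean 0);
            # only the batch offset differs between samples.
            values[group].append(rng.gauss(0.0, 1.0) + batch_shift * batch)
    return mean(values["case"]) - mean(values["control"])


confounded_diff = simulate("confounded")
balanced_diff = simulate("balanced")
print(f"confounded design: apparent case-control difference = {confounded_diff:.2f}")
print(f"balanced design:   apparent case-control difference = {balanced_diff:.2f}")
```

Under these assumptions, the confounded design reports a difference close to the batch shift even though the groups do not differ biologically, whereas the balanced design reports a difference near zero because the shift contributes equally to both group means.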

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Biotechnology / methods*
  • Biotechnology / standards
  • Biotechnology / statistics & numerical data
  • Computational Biology / methods
  • Genomics / methods*
  • Genomics / standards
  • Genomics / statistics & numerical data
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Periodicals as Topic / standards
  • Research Design / standards
  • Research Design / statistics & numerical data
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Sequence Analysis, DNA / statistics & numerical data