Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses

  John L. Rinn 1,3,6,7
  1. Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02142, USA;
  2. Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA;
  3. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, Massachusetts 02138, USA;
  4. Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02140, USA;
  5. Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02140, USA
  6. These authors contributed equally to this work.

    Abstract

    Large intergenic noncoding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalog of >8000 human lincRNAs. Our catalog unifies previously existing annotation sources with transcripts we assembled from ∼4 billion RNA-seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of >30 properties, including sequence, structural, transcriptional, and orthology features. We found that lincRNA expression is strikingly tissue-specific compared with coding genes, and that lincRNAs are typically coexpressed with their neighboring genes, albeit to an extent similar to that of pairs of neighboring protein-coding genes. We distinguish an additional subset of transcripts that have high evolutionary conservation but may include short ORFs and may serve as either lincRNAs or small peptides. Our integrated, comprehensive, yet conservative reference catalog of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes.

    Footnotes

    • Received July 12, 2011.
    • Accepted August 11, 2011.

    Freely available online through the Genes & Development Open Access option.
