Reference
Citation
Boley, N., Stoiber, M.H., Booth, B.W., Wan, K.H., Hoskins, R.A., Bickel, P.J., Celniker, S.E., Brown, J.B. (2014). Genome-guided transcript assembly by integrative analysis of RNA sequence data. Nat. Biotechnol. 32(4): 341--346.
FlyBase ID
FBrf0224665
Publication Type
Research paper
Abstract

The identification of full-length transcripts entirely from short-read RNA sequencing data (RNA-seq) remains a challenge in the annotation of genomes. Here we describe an automated pipeline for genome annotation that integrates RNA-seq and gene-boundary data sets, which we call Generalized RNA Integration Tool, or GRIT. Applying GRIT to Drosophila melanogaster short-read RNA-seq, cap analysis of gene expression (CAGE) and poly(A)-site-seq data collected for the modENCODE project, we recovered the vast majority of previously annotated transcripts and doubled the total number of transcripts cataloged. We found that 20% of protein-coding genes encode multiple protein-localization signals and that, in 20-d-old adult fly heads, genes with multiple polyadenylation sites are more common than genes with alternative splicing or alternative promoters. GRIT demonstrates 30% higher precision and recall than the most widely used transcript assembly tools. GRIT will facilitate the automated generation of high-quality genome annotations without the need for extensive manual annotation.

PubMed ID
PubMed Central ID
PMC4037530 (PMC) (EuropePMC)
Associated Information
Comments
Associated Files
Other Information
Secondary IDs
    Language of Publication
    English
    Additional Languages of Abstract
    Parent Publication
    Publication Type
    Journal
    Abbreviation
    Nat. Biotechnol.
    Title
    Nature biotechnology
    Publication Year
    1996-
    ISBN/ISSN
    1087-0156
    Data From Reference
    Genes (4)