Reference
Citation
Yandell, M., Bailey, A.M., Misra, S., Shu, S., Wiel, C., Evans-Holm, M., Celniker, S.E., Rubin, G.M. (2005). A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome. Proc. Natl. Acad. Sci. U.S.A. 102(5): 1566-1571.
FlyBase ID
FBrf0183655
Publication Type
Research paper
Abstract

Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various sources by sequencing RT-PCR products to confirm gene structures. Our data provide experimental evidence for 122 protein-coding genes. Our analyses suggest that the entire collection of predictions contains only approximately 700 additional protein-coding genes. Although we cannot rule out the discovery of genes with unusual features that make them refractory to existing methods, our results suggest that the D. melanogaster genome contains approximately 14,000 protein-coding genes.

PubMed ID
PubMed Central ID
PMC545494
Other Information
Secondary IDs
    Language of Publication
    English
    Additional Languages of Abstract
    Parent Publication
    Publication Type
    Journal
    Abbreviation
    Proc. Natl. Acad. Sci. U.S.A.
    Title
    Proceedings of the National Academy of Sciences of the United States of America
    Publication Year
    1915-
    ISBN/ISSN
    0027-8424
    Data From Reference