FlyBase Genome Annotators (2011). Analysis of novel conserved protein-coding genes from the modENCODE consortium.
FlyBase analysis
Research paper

Identification of functional elements and regulatory circuits by Drosophila modENCODE.
modENCODE Consortium et al., 2010, Science 330(6012): 1787--1797 [FBrf0212741]

All entries in Supplementary Dataset S2 from the integrative modENCODE analysis of the Drosophila genome (FBrf0212741) were assessed. After elimination of duplicates, 136 predictions remain: 55 marked "CompleteORF" and 81 marked "IncompleteORF". (FBrf0212741 states that 138 conserved protein-coding genes were predicted; the discrepancy appears to be due to 3 independent entries that correspond to CG43051.) Of the 136 predictions, 63 were already annotated in FlyBase, including 4 pseudogenes, and 39 were created as new gene models (r5.34), including 6 new pseudogenes, for a total of 102 gene models. All 55 "CompleteORF" predictions correspond to gene models; 3 of these are pseudogenes. Of the 34 predictions that FlyBase rejected, most (27) correspond to fragments of transposable elements, and 4 are mitochondrial fragments. A file with assessment data for each entry, including the release in which the gene model was added, is available from the FlyBase ftp site.
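The counts in the summary above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the numbers stated in the text. The function name `tally_predictions` and the dictionary keys are illustrative and not part of any FlyBase tool. The "other rejected" figure is derived by subtraction and is not stated in the summary.

```python
def tally_predictions():
    """Recompute the totals quoted in the summary from their stated parts."""
    # Predictions remaining after duplicate removal, by ORF status.
    complete_orf = 55
    incomplete_orf = 81
    total_entries = complete_orf + incomplete_orf

    # Predictions that correspond to gene models.
    previously_annotated = 63   # includes 4 pseudogenes
    newly_created = 39          # includes 6 new pseudogenes
    gene_models = previously_annotated + newly_created
    pseudogenes = 4 + 6

    # Everything that is not a gene model was rejected.
    rejected = total_entries - gene_models
    te_fragments = 27
    mito_fragments = 4
    # Derived value (assumption: the remainder falls into neither category).
    other_rejected = rejected - te_fragments - mito_fragments

    # 138 predictions are reported in FBrf0212741; merging the 3 entries
    # for CG43051 into one removes 2 of them.
    published_total = 138
    after_merging_cg43051 = published_total - (3 - 1)

    return {
        "total_entries": total_entries,
        "gene_models": gene_models,
        "pseudogenes": pseudogenes,
        "rejected": rejected,
        "other_rejected": other_rejected,
        "after_merging_cg43051": after_merging_cg43051,
    }


if __name__ == "__main__":
    for name, value in tally_predictions().items():
        print(f"{name}: {value}")
```

The merged total matches the number of assessed entries, which is consistent with the explanation given for the difference from the published figure.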

File date: 2011.1.26; File size: 40960; File format: xls; File name: FlyBase_Analysis_modENCODE_coding_genes_260111.xls
