A non-redundant collection of cDNA clones derived from a variety of developmental stages and tissues
cDNA clones from new cDNA libraries (RH_cDNA, RE_cDNA and AT_cDNA) were analyzed together with clones from the cDNA libraries analyzed in the previous DGCr1 build (LD_pBS_cDNA, LD_pOT2_cDNA, GM_pBS_cDNA, GM_pOT2_cDNA, HL_pBS_cDNA, HL_pOT2_cDNA, GH_cDNA, LP_cDNA and SD_cDNA) to generate a non-redundant set of full-length cDNA clones. First, the 5' ends of the clones were sequenced, and the resulting 5' ESTs were grouped by sequence to select the clones that extended furthest upstream. The remaining clones were then clustered on the basis of their 3' EST sequences. Duplicate clones were eliminated, as were clones lacking a poly(A) tail and chimeric clones whose 5' and 3' ESTs did not align to the genome in proximity to each other. In this way, a total of 10,910 validated cDNA clones were selected to generate a second, more comprehensive version of the Drosophila Gene Collection.
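The selection steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the collection: the `Clone` record, its field names, the `upstream_extent` score, and the `MAX_SPAN` proximity threshold are all hypothetical, and EST grouping and genome alignment are assumed to have been done beforehand. For simplicity the sketch applies the poly(A) and chimera filters before picking representatives, which is an assumption about ordering rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Hypothetical proximity threshold (bp); the source only says "in proximity".
MAX_SPAN = 100_000


@dataclass(frozen=True)
class Clone:
    clone_id: str
    five_cluster: str     # id of the 5' EST sequence group (precomputed)
    three_cluster: str    # id of the 3' EST cluster (precomputed)
    upstream_extent: int  # larger value = 5' EST extends further upstream
    has_polya: bool       # True if a poly(A) tail was detected
    chrom_5: str          # chromosome arm of the 5' EST alignment
    pos_5: int            # genomic position of the 5' EST alignment
    chrom_3: str          # chromosome arm of the 3' EST alignment
    pos_3: int            # genomic position of the 3' EST alignment


def is_chimeric(clone: Clone, max_span: int = MAX_SPAN) -> bool:
    """Flag clones whose 5' and 3' ESTs align far apart or on different arms."""
    if clone.chrom_5 != clone.chrom_3:
        return True
    return abs(clone.pos_5 - clone.pos_3) > max_span


def _keep_furthest_upstream(clones: Iterable[Clone], key) -> List[Clone]:
    """Keep, for each key value, the clone with the largest upstream_extent."""
    best: Dict[str, Clone] = {}
    for clone in clones:
        current = best.get(key(clone))
        if current is None or clone.upstream_extent > current.upstream_extent:
            best[key(clone)] = clone
    return list(best.values())


def select_nonredundant(clones: Iterable[Clone],
                        max_span: int = MAX_SPAN) -> List[Clone]:
    # Drop clones lacking a poly(A) tail and putative chimeras.
    valid = [c for c in clones if c.has_polya and not is_chimeric(c, max_span)]
    # Within each 5' EST group, keep the clone extending furthest upstream.
    by_five = _keep_furthest_upstream(valid, key=lambda c: c.five_cluster)
    # Collapse remaining duplicates that share a 3' EST cluster.
    by_three = _keep_furthest_upstream(by_five, key=lambda c: c.three_cluster)
    return sorted(by_three, key=lambda c: c.clone_id)
```

Calling `select_nonredundant` on a list of `Clone` records returns one representative per 3' EST cluster, sorted by clone identifier. In practice the proximity threshold and the tie-breaking rule would need to be chosen to match the actual alignment data.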