A Database of Drosophila Genes & Genomes

FB2013_03, released May 7th, 2013
 

Dataset RP_cDNA

General Information
Name RP_cDNA Species D. melanogaster
Dataset type cDNA sequence FlyBase ID FBlc0000049
Source & Content
Consists of
Created by
Available from
Strain
Stage & tissue
Stage
Tissue/Position (including subcellular localization)
Reference
Comment:0-24 hr AEL
Cell Line
Description & Members
Description
Parent collections
Component collection(s)
Number in collection
Comment on number in collection
Members
Experimental protocol
Vector
Sample preparation
Collection preparation
Total RNA isolated from mixed-stage embryos of the 'iso-1' isogenic strain was provided by the laboratory of Peter Cherbas (Indiana University, Bloomington). Gene-specific 5' RLM-RACE products were generated using the FirstChoice RLM-RACE procedure (Ambion). Normalized pools of RACE products were created and either cloned and sequenced with an ABI3730 or directly sequenced on a 454 Life Sciences sequencer. Vector sequences were removed; the RNA adapter sequence was used to determine the orientation of each clone and was then removed from the sequence. Each sequence represents a potential transcription start site and is oriented along the direction of transcription.
 
Primers were designed to target all FlyBase release 5.12 (October 2008) transcript models that overlap 5' ESTs from the RE (FBrf0152058) and LD (FBrf0127297) cDNA libraries, both constructed from mixed-stage embryos. Transcripts of genes expressed in the embryo based on whole-mount RNA in situ hybridization (FBrf0205271) and literature surveys were also targeted. In all, 8570 distinct primer pairs representing 7742 genes were designed.
Nested RACE PCR reactions were carried out using HotFirePol. For 1453 RACE reactions that lacked detectable product, second-round PCR conditions that included 5 extra amplification cycles were used. Individual PCR products were pooled to create molar-normalized mixtures of 1,440 to 2,677 products using a Tecan Genesis 2 robot. The sample pools were then concentrated to approximately 50 ng/µl.
Mode of assay
cDNA was characterized by high-throughput sequencing.
Assay platform
Roche Genome Sequencer 454 GS FLX (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL9333)
Data analysis
Clustering of RLM-RACE reads: the unique locations of all reads associated with a given target were collected, and the Euclidean distances between all pairs of distinct start locations were calculated. An iterative hierarchical clustering approach using the "complete" agglomeration method was applied to find the minimum number of clusters for which the number of TSS in a cluster is greater than three and the cluster span is less than 300 bp. To remove outliers, tags that mapped farther away than 1.5 times the inter-quartile range were excluded from a promoter region.
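The clustering and outlier steps above can be illustrated with a minimal Python sketch. It is one possible reading of the description, not the original analysis code. The function names, the choice to stop merging once a merged cluster would span 300 bp or more, and the use of Tukey-style fences (first quartile minus 1.5 times the inter-quartile range, third quartile plus 1.5 times the inter-quartile range) are all assumptions made for illustration.

```python
from statistics import quantiles


def cluster_start_sites(positions, max_span=300, min_sites=4):
    """Complete-linkage clustering of 1-D start sites (hypothetical helper).

    Assumption: merging stops when the closest pair of clusters would
    produce a cluster spanning max_span bp or more; clusters with fewer
    than min_sites distinct start sites are then discarded.
    """
    # Work on unique start locations, each beginning as its own cluster.
    clusters = [[p] for p in sorted(set(positions))]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Complete linkage: the largest distance between any member
                # of a and any member of b (both lists are kept sorted).
                d = max(abs(a[-1] - b[0]), abs(b[-1] - a[0]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= max_span:
            break  # any further merge would violate the span limit
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return [c for c in clusters if len(c) >= min_sites]


def drop_outlier_tags(tags, k=1.5):
    """Remove tags outside Q1 - k*IQR .. Q3 + k*IQR (assumed fence rule)."""
    q1, _, q3 = quantiles(tags, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t in tags if low <= t <= high]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    sites = [100, 105, 110, 120, 130, 1000, 1005, 5000]
    print(cluster_start_sites(sites))
    print(drop_outlier_tags([100, 101, 102, 103, 104, 105, 106, 400]))
```

In one dimension the complete-linkage distance between two clusters equals the span of their union, so cutting the merge sequence at 300 bp yields clusters that each span less than 300 bp.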
From 2.1 million raw RACE reads, 1.2 million were oriented, trimmed, mapped to the genome, and associated with a transcript. In total, 8418 transcripts of 7546 genes were identified.
Mapping of RLM-RACE sequences: RACE reads were processed to trim low-quality regions and adapter sequences; only sequences with a significant match to the 3’ end of the 3’-most 38 bases of the adapter sequence were retained. Only sequences that were aligned to the genome beginning at the most 5’ nucleotide were used in subsequent analyses. A read was associated with a particular transcript and RACE reaction by searching for interrogated transcripts on the same strand as the aligned sequence and within 5 kb of the 3' end of the aligned sequence. If there were multiple targeted transcripts within the region, then the sequence was assigned to the transcript and RACE reaction with the highest scoring BLAST hit to the inner transcript-specific PCR primer if applicable, or to the highest scoring BLAST hit to a targeted transcript. Each read was presumed to represent a transcription initiation event, but because the protocol relies on PCR amplification, each read does not necessarily represent an independent event.
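The read-to-transcript association rule can be sketched as follows. This is a hedged illustration, not the pipeline that was actually run. The `Transcript` and `AlignedRead` classes, the function names, and the precomputed score dictionaries are hypothetical. The sketch also assumes that "within 5 kb" means the gap between the 3' end of the aligned read and the nearest base of the transcript model, which the description does not specify. BLAST itself is not run here; the scores are supplied as inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    strand: str  # "+" or "-"
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate


@dataclass(frozen=True)
class AlignedRead:
    strand: str
    start: int
    end: int


def three_prime_end(read):
    """Genomic coordinate of the read's 3' end, which depends on strand."""
    return read.end if read.strand == "+" else read.start


def candidate_transcripts(read, transcripts, window=5000):
    """Targeted transcripts on the read's strand within `window` bp."""
    anchor = three_prime_end(read)
    hits = []
    for t in transcripts:
        if t.strand != read.strand:
            continue
        # Gap between the anchor and the transcript interval (0 if inside).
        gap = max(t.start - anchor, anchor - t.end, 0)
        if gap <= window:
            hits.append(t)
    return hits


def assign_read(read, transcripts, primer_scores, transcript_scores,
                window=5000):
    """Pick one transcript for a read, or None if nothing is in range.

    primer_scores / transcript_scores: hypothetical dicts mapping a
    transcript name to a precomputed BLAST score for this read.
    """
    hits = candidate_transcripts(read, transcripts, window)
    if not hits:
        return None
    if len(hits) == 1:
        return hits[0]
    # Prefer the best hit to an inner transcript-specific primer, if any.
    with_primer = [t for t in hits if t.name in primer_scores]
    if with_primer:
        return max(with_primer, key=lambda t: primer_scores[t.name])
    # Otherwise fall back to the best hit to a targeted transcript.
    return max(hits, key=lambda t: transcript_scores.get(t.name, 0.0))


if __name__ == "__main__":
    # Made-up transcript models and scores, for illustration only.
    models = [
        Transcript("A", "+", 1000, 3000),
        Transcript("B", "+", 4000, 6000),
        Transcript("C", "-", 1000, 3000),
    ]
    read = AlignedRead("+", 900, 1100)
    print(assign_read(read, models, {"A": 40.0, "B": 22.0}, {}))
```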
Additional data
More information is available under:
Associated files
Additional sites
Comments
Clones were not preserved and are not available for distribution.
 
Synonyms & Secondary IDs
Reported As
Symbol Synonym
RP
 
RP_cDNA
 
Secondary FlyBase IDs
    References (2)
    Research paper
    Hoskins et al., 2011, Genome Res. 21(2): 182-192
    Genome-wide analysis of promoter architecture in Drosophila melanogaster. [FBrf0213090]
    Supplementary material
    Hoskins et al., 2011, Genome Res. 21(2):
    Supplemental Material. [FBrf0213251]