Bergman, C., Daub, J., Griffiths-Jones, S. (2006.8.16). Annotation of snoRNAs in CAF1 genome assemblies. 
Personal communication to FlyBase
Annotation of snoRNAs in CAF1 genome assemblies
Non-redundant sets of snoRNA predictions in the CAF1 Drosophila
genomes are available from:
Annotating D. melanogaster snoRNAs
Mapping of Drosophila melanogaster snoRNAs available from the public
databases (EMBL and RefSeq Genome).
D. melanogaster genome assembly used is from:
- 291 sequences with snoRNA annotations were obtained from the EMBL and
RefSeq Genome databases (14/07/06).
- snoRNA fragments, whole-gene submissions and multiple-gene submissions
already accounted for elsewhere in the dataset were manually removed.
- 276 sequences mapped to the CAF1 assembly using BLAST (WUBLASTN 2.0
-- params "W=3 -kap").
- Overlapping hits were manually inspected and the longer of the two
snoRNA annotations retained.  Details of overlapping mappings are provided
below. Some of these have redundant FlyBase entries.
- All BLAST annotations were extended to the full length of the snoRNA query.
- 250 snoRNA annotations provided in file dmel_snorna.gff3
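
The overlap rule described above (retain the longer of two overlapping
annotations) can be sketched in Python. This is only an illustration and not
the procedure actually used: the original overlaps were resolved by manual
inspection. The Mapping record, its field names and the requirement that
overlapping hits share a strand are all assumptions made for the example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """Hypothetical record for one snoRNA mapped to the genome."""
    name: str
    seqid: str   # chromosome arm or scaffold
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlaps(a: Mapping, b: Mapping) -> bool:
    # Assumption: only mappings on the same sequence and strand can overlap.
    return (a.seqid == b.seqid and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def keep_longer(mappings: list[Mapping]) -> list[Mapping]:
    """Drop any mapping that overlaps a longer mapping already kept."""
    kept: list[Mapping] = []
    # Visit the longest mappings first so the shorter of a pair is dropped.
    for m in sorted(mappings, key=lambda m: m.length, reverse=True):
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: (m.seqid, m.start))
```

A fully automatic rule like this would not reproduce every manual decision;
borderline cases still need to be inspected by hand.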
***** NOTES *****
- Numerical summary of 276 sequences -> 250 mappings in the CAF1 genome:
	 19 mappings excluded due to overlapping hits (redundant Flybase entries)
	  5 mappings excluded due to overlapping hits (non-redundant Flybase entries)
	  2 sequences failed to map to the D. melanogaster CAF1 genome
	250 non-redundant genome mappings
	276 database sequences
- In the 'Attributes' field, the Name tag used for the snoRNA is that
of the sequence that was mapped. The other synonym (in some cases the
better-known name) is given under the Alias tag. The FlyBase
identifiers and GenBank accessions for the mapped and
overlapping/shorter sequences are provided, where available, under the
Dbxref tag.
- The 63 snoRNAs with NR GenBank accessions correspond to the FlyBase
release 4.3 GFF annotations.
- Remaining unresolved annotations:
	(1) Sequences AJ784385 and NR_002093. These sequences overlap by
	only 30 bp. Both are currently mapped to the CAF1 assembly because it
	is unclear why both entries exist and which should be retained. The
	longer sequence, AJ784385, has been experimentally verified and is
	possibly the preferred annotation. It does not currently have a
	FlyBase entry.
	Unresolved overlapping annotations:
	GenBank  	Name       	FlyBase
	---------	-----------	-----------
	AJ784385 	Me28S-Gm980	.
	NR_002093	snoRNA:M   	FBgn0044508
	(2) 2 GenBank sequences failed to map to the D. melanogaster CAF1 genome.
	Neither of these sequences has a FlyBase entry.
	Unmapped sequences:
	Genbank 	Name		Flybase			
	---------	---------	---------
	AJ809564	psi28s-2996	.
	AJ784386	Me18S-Cm419	.
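
As an illustration of how the Name, Alias and Dbxref tags described in the
notes above might be read from the 'Attributes' column of a GFF3 file, here is
a minimal Python sketch. It follows the general GFF3 convention of
semicolon-separated tag=value pairs with comma-separated values. The example
string is made up for illustration (the identifiers come from the first row of
the overlap table, but the exact formatting in dmel_snorna.gff3, including the
"FlyBase:" and "GenBank:" prefixes, is an assumption).

```python
from urllib.parse import unquote


def parse_attributes(column9: str) -> dict[str, list[str]]:
    """Split a GFF3 attributes column into {tag: [values]}."""
    attrs: dict[str, list[str]] = {}
    for field in column9.strip().split(";"):
        if not field:
            continue  # tolerate a trailing semicolon
        tag, _, value = field.partition("=")
        # GFF3 percent-encodes reserved characters inside tags and values.
        attrs[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attrs


# Hypothetical attributes string, not copied from the real file.
example = ("Name=psi28s-1180;Alias=snoRNA:535;"
           "Dbxref=FlyBase:FBgn0065063,GenBank:AJ809562")
attrs = parse_attributes(example)
```

With this layout, attrs["Name"] holds the name of the mapped sequence and
attrs["Dbxref"] holds one entry per cross-referenced database identifier.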
Tables of overlapping snoRNA mappings 
Column headers:
(1) Genbank accession of mapped sequence
(2) snoRNA name for mapped sequence
(3) Flybase identifier for the mapped sequence
(4) Genbank accession for duplicate, overlapping or shorter sequences
(5) Synonym for snoRNA currently used in FlyBase
(6) Current Flybase identifier for this snoRNA
19 instances of overlapping hits with redundant Flybase entries:
AJ809562        psi28s-1180     FBgn0065063     NR_002542        snoRNA:535       FBgn0083002
AJ629273        psi18S-1347c    FBgn0065075     NR_002546        snoRNA:203       FBgn0083052
AJ809560        psi18s-176      FBgn0065068     NR_002554        snoRNA:314       FBgn0083043
AJ629267        psi28S-1837a    FBgn0065079     NR_002477        snoRNA:11        FBgn0082996
AJ629276        psi28S-3327c    FBgn0065062     NR_002485        snoRNA:586       FBgn0082965
AJ809571        psi28s-2566     FBgn0065072     NR_002489        snoRNA:269       FBgn0082985
AJ809574        psi28s-3186     FBgn0065077     NR_002491        snoRNA:165       FBgn0082977
AJ809591        psi18s-1275     FBgn0065050     NR_002494        snoRNA:783       FBgn0083056
AJ629212        psi28S-2149     FBgn0065078     NR_002505        snoRNA:143       FBgn0082992
AJ629207        psi18S-1377a    FBgn0065067     NR_002506        snoRNA:328       FBgn0083051
AJ629277        psi28S-2179     FBgn0065054     NR_002459        snoRNA:734       FBgn0082991
AJ809593        psi28s-3342     FBgn0065061     NR_002461        snoRNA:644       FBgn0082964
AJ629266        psi28S-3436b    FBgn0065052     NR_002463        snoRNA:75        FBgn0082955
AJ629265        psi28S-3436a    FBgn0065056     NR_002464        snoRNA:708       FBgn0082956
AY805215        orphan_CD2      FBgn0065065     NR_002541        snoRNA:461       FBgn0082920
AJ629258        psi18S-841a     FBgn0065060     NR_002551        snoRNA:66        FBgn0083019
AJ809569        psi28s-2622     FBgn0065069     NR_002472        snoRNA:3         FBgn0082984
AJ629201        psi28S-2876     FBgn0065049     NR_002467        snoRNA:825       FBgn0082982
AJ809558        DmOr_aca5       FBgn0065074     NR_002540        snoRNA:227       FBgn0082921
4 instances of overlapping hits (multiple GenBank accessions under one FlyBase entry; 5 excluded accessions in total):
AJ809549        psi28s-2648     FBgn0065055     NR_002512        snoRNA:72        .
AJ809592        psi28s-291      FBgn0065064     NR_002496        snoRNA:50        .
NR_001718        snoRNA:Z1        FBgn0015543     U46015  .       .
AF089836        snoRNA H1       FBgn0026169     NR_001911, AJ809561       .       .
Mapping annotated melanogaster snoRNAs to the other CAF1 genomes
Drosophila CAF1 genome assemblies from:
- A non-redundant set of 250 snoRNA sequences was annotated previously
in the D. melanogaster CAF1 assembly. 2 additional D. melanogaster
snoRNA sequences available from the public databases, which failed to
map to the D. melanogaster CAF1 assembly, were included in this
analysis of the other 11 genomes.
- Regions in the genome assemblies with similarity to each of the 252
D. melanogaster snoRNA sequences were identified using BLAST (WUBLASTN
2.0 -- params "W=3 -kap").
- All regions with BLAST hit E-values < 1e-6 were accepted as
significant hits.
- Regions with BLAST hit E-values > 1e-6 and < 1e-2 were accepted only
if syntenic with the D. melanogaster snoRNA genome mappings. The
MAVID/MERCATOR whole genome alignments were used for the synteny
analysis. Only BLAST hits which mapped to the same genome alignment
fragment with the same orientation as the D. melanogaster snoRNA were
accepted as significant.
- Only the regions of the HSP alignment have been provided in the
annotations. The genome co-ordinates provided have not been extended
to the full length of the query sequences and in many instances these
annotations will not represent the complete snoRNA gene.
- The highest scoring BLAST hit for each genome region was used to
provide the putative snoRNA annotations.
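
The acceptance rules above can be summarised in a short Python sketch. It is
a hedged illustration rather than the code actually used: the Hit record and
its field names are hypothetical, the synteny flag is assumed to have been
computed beforehand from the MAVID/MERCATOR alignments, a "genome region" is
assumed to mean a set of overlapping hits on the same sequence, and an
E-value of exactly 1e-6 (not covered by the text) is treated here as
requiring synteny.

```python
from dataclasses import dataclass

STRICT_EVALUE = 1e-6   # below this, a hit is accepted outright
RELAXED_EVALUE = 1e-2  # below this, a hit is accepted only if syntenic


@dataclass
class Hit:
    """Hypothetical record for one BLAST HSP in a target genome."""
    query: str
    seqid: str
    start: int
    end: int
    strand: str
    score: float
    evalue: float
    syntenic: bool  # same alignment fragment and orientation as in D. melanogaster


def is_significant(hit: Hit) -> bool:
    if hit.evalue < STRICT_EVALUE:
        return True
    return hit.evalue < RELAXED_EVALUE and hit.syntenic


def best_per_region(hits: list[Hit]) -> list[Hit]:
    """Keep the highest scoring significant hit for each genome region."""
    best: list[Hit] = []
    significant = [h for h in hits if is_significant(h)]
    for hit in sorted(significant, key=lambda h: h.score, reverse=True):
        # Assumption: hits overlapping on the same sequence share a region.
        clash = any(b.seqid == hit.seqid and b.start <= hit.end
                    and hit.start <= b.end for b in best)
        if not clash:
            best.append(hit)
    return best
```

The coordinates kept this way are HSP coordinates only, matching the note
above that annotations were not extended to the full query length.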
