From smount@XXXX Wed Apr 05 19:37:01 2000
Date: Wed, 5 Apr 2000 14:41:28 -0400 (EDT)
From: Stephen M Mount <smount@XXXX>
To: Michael Ashburner <ma11@XXXX>
Subject: RE: drosophila snRNAs (fwd)
Message-ID: <Pine.GSO.4.21.0004051440480.10337-100000@XXXX>

Michael,

This is what I sent to Celera for the annotation. The Excel file is being sent separately.

Steve

---------- Forwarded message ----------
Date: Mon, 28 Feb 2000 21:16:48 -0500 (EST)
From: Stephen M Mount <smount@XXXX>
To: 'Skupski, Marian P.' <Marian.Skupski@XXXX>
Subject: RE: drosophila snRNAs

Marian,

This email contains, in the body of the text, NOT as attachments,

1) established Drosophila snRNA sequences for U1, U2, U4, U5 and U6
2) hypothetical Drosophila snRNA sequences for U4atac, U6atac and U12
3) updated versions of the summary files I sent yesterday

The latter includes the three additional U2 genes that were missing yesterday. You may want to correlate these with existing (published) snRNA genes for U1, U2, U4 and U6. Note also that I found U5 genes that have variant 3' termini and may not be real genes at all.

I hope this does it for you. If not, feel free to contact me again.

Steve Mount

##################################################

1) established Drosophila snRNA sequences for U1, U2, U4, U5 and U6

What follows are established Drosophila snRNA sequences; most are based on RNA sequencing. U3 is not a spliceosomal RNA, and I did not investigate it.

> U1 RNA ACCESSION X04257
atacttacctggcgtagaggttaaccgtgatcacgaaggcggttcctccggagtgaggcttggccattgcacctcggctg
agttgacctctgcgattattcctaatgtgaataactcgtgcgtgtaatttttggtagccgggaatggcgttcgcgccgtc
ccga

>U2 RNA
atcgcttctcggccttatggctaagatcaaagtgtagtatctgttcttatcagcttaacatctgatagttcctccattgg
aggacaacaaatgttaaactgatttttggaatcagacggagtgctaggggcttgctccacctctgtcgcgggttggcccg
gtattgcagtaccgccgggatttcggcccaac

>U4 RNA ACCESSION K03095
agcttagcgcagtggcaataccgtaaccaatgaagcctccctgaggtgcggttattgctagttgaaaactttaaccaacc
cacgccatgggacgtgaaataccgtccactacggcaatttttggaagcccttacgagggctaa

> U5 RNA. ACCESSION K03096
atactctggtttctcttcaatgtcgaataaatctttcgccttttactaaagatttccgtggagaggaacactctaagagt
ctaaaactaattttttagtcagtcttgtcgcaagactggggcca

> U6 RNA. ACCESSION X06669
gttcttgcttcggcagaacatatactaaaattggaacgatacagagaagattagcatggccccagcgcaaggatgacacg
caaaatcgtgaagcgttccacattttt

##################################################

2) hypothetical Drosophila snRNA sequences for U4atac, U6atac and U12

What follows are HYPOTHETICAL Drosophila snRNA sequences, based on the genomic sequence. I found these with blast searches using modified parameters.

> hypothetical fly U4atac; What's depicted here is the region of similarity; the actual RNA may extend beyond these nucleotides.
accttccttgtcttggggagcagaaatgttcaatgaacgtctagtgaggacattgctgctgacaccaatgatgacacccc
cgctcgccgatcgttcgcgattggagttcggaatttttgga

> hypothetical fly U6atac;
gtggtccaaacgtgttgtttggaaggagagcaagttagcactcccctagacaaggatggaacacataaacggtcggctag
gcacagacaaaagccgtccacaaattttt

> hypothetical fly U12 RNA; matches vertebrate U12 snRNA at both ends.
gtgcctcaaactaatgagtaaggaaaaccaatcagccttgctaatcgcttggcagtattggcttctaggcaggggggcgt
gtcccgcgccccttgaagctcaaatttttgcaagggcacaggtcgtcccctcctcctccgcgtgggtggcgttcggccga
gcgaaccggcgcctactttgcgtccggctagcgaggatctctgggtgccatcccacggctgggtgttgcgatctgccc

Support for U4atac:

gb|AC013956.1|AC013956 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered pieces
Length = 65679

Score = 52.1 bits (116), Expect = 5e-06
Identities = 88/129 (68%), Gaps = 13/129 (10%)
Strand = Plus / Minus

Query: 2    accatccttttcttggggttgcgctactgtccaatgagcgcatagtgagggcagtactgc 61
            ||| ||||| ||||||||  ||   | ||| |||||| ||  |||||||| || | ||||
Sbjct: 1325 accttccttgtcttgggga-gcagaaatgttcaatgaacgtctagtgaggacattgctgc 1267

Query: 62   taacgcc--tgaacaacacacccgcatcaactagagcttttgc---tttattttggtgca 116
            | || ||  || |  |||| ||||| ||  |  || | || ||   |  | || ||   |
Sbjct: 1266 tgacaccaatg-atgacacccccgc-tcgcc--gatcgttcgcgattggagttcgg---a 1214

Query: 117  atttttgga 125
            |||||||||
Sbjct: 1213 atttttgga 1205

##################################################

3) updated versions of the summary files I sent yesterday

snRNA promoter hits:

AC018327.1C  taattcccaactagttctagttgcgccctcatggaaa  U1-82.3   001
AC015109.1C  caattcccaactggttttagctgctcagccatggaaa  U1-95.1   002
AC019896.1C  caattcccaactgcttctggccgtttgctcatggaag  U2        003
AC019896.1C  gaattcccaactgcttctggccgtttggtcatggaag  U2        004
AC015154.1C  taattcccaactggttctggctacttccctatggaga  U1-95.2   005
AC015154.1C  aaattcccaaacagttctggcagatctctcaaggaga  U1-95.3   006
AC017493.1C  taattcccaactgcttctggccatcagctcatggaaa  U1-21.1   007
AC019965.1C  taattcccaaatggttctggccgtttgcccatcgaga  U2        008
AC015392.1C  taattcccaactgctactggctgcgcttgcatggagt  ??        009
AC017832.1C  taattcccaaatggttctggcttgctgtgaatggaat  U4-1      010
AC014745.1C  taattcccaactgcttctggcagcgccggcatggtat  U4-2      011
AC019965.1C  tgattcccaacatgttcaagctcgttctaaatgatcg  U5        012
AC019603.1C  taattcccaacgtgttaaagcagtcactgaatagagt  U5 (RNA)  017
AC019905.1C  gaattcccaaaaagttctatcacagaacgaatctagg  U5        018
AC017727.1C  tgattcccaacacgttcaagcaatttcttagtggtac  U5        019
AC019732.1C  aaattcccaactccttctggccaacactgatcctaga  U5 var.   020
AC017832.1C  taattcccaagcggttctattcaatattgagtatgga  U5 var.   021
AC017832.1C  aaattcccaattccttctggccaatactgatcctaga  U5 var.   022
AC014181.1C  tgattcccaaccggttctggttgcatggccatgagtt  U12       013
AC018038.1C  tgattcccaagtacatattctgcaagagtacagtata  U6-1      014
AC018038.1C  taattctcaactgctctttcctgatgttgatcattta  U6-2      015
AC018038.1C  taattctcaacttctttttccagactcagttcgtata  U6-3      016
AC018217.1C  aaattcccaagttctttttccgcatggagtgcttata  U6atac    023
AC013956.1C  taattcccaactagtactggccacttttgcttgaggt  U4atac    024
AC017832.1C  taattcccaactgattttagctgcagtcgcatgaagt  U2        025
AC017832.1C  taattctcaactgattttagctgcagtcgcatgaagt  U2        026
AC019732.1C  taattcccaactggtcttggctgcagtcgcatcaagt  U2        027

Small nuclear RNA gene report. 2-28-00

I found 5 genes for U1, 6 genes for U2, 2 genes for U4, 7 genes for U5 (3 of which are quite divergent at the 3' end), 3 genes for U6, 1 gene for U12, 1 gene for U6atac and 1 gene for U4atac.

Sequences for Drosophila U1, U2, U4, U5 and U6 RNAs were known. Finding the genes for U4atac, U6atac and U12 absolutely required varying the blast search parameters. -r 10 -q -11 -W 7 -G 15 -E 4; -r 7 -q -14 -W 7 -G 7 -E 3; and -r 4 -q -5 -W 8 -G 10 -E 2 were among the more successful parameter sets. Details follow below.

Unfortunately, I did these searches through GenBank, and so I have GenBank accession numbers rather than Celera numbers.

That these are indeed snRNA genes is clear from their promoter sequences. ALL of these genes have a characteristic snRNA promoter properly placed upstream of the RNA, so I have high confidence in them.
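
The promoter argument can be checked mechanically. What follows is only a minimal sketch: it assumes the PSE consensus TAATTCCCAA quoted in the manuscript excerpt later in this thread, and it assumes the PSE is the first ten bases of each promoter hit as listed above. The function name pse_matches is hypothetical, and the four sequences are copied from the hit list.

```python
# PSE consensus as quoted in the manuscript excerpt in this thread (lower-cased).
PSE_CONSENSUS = "taattcccaa"


def pse_matches(promoter: str, consensus: str = PSE_CONSENSUS) -> int:
    """Count positions at which the start of the promoter hit matches the consensus.

    Assumption: the PSE sits at the very start of each listed promoter hit.
    """
    head = promoter.lower()[: len(consensus)]
    return sum(a == b for a, b in zip(head, consensus))


# A few rows copied from the promoter hit list (label -> sequence).
hits = {
    "U1-82.3 (001)": "taattcccaactagttctagttgcgccctcatggaaa",
    "U2 (003)": "caattcccaactgcttctggccgtttgctcatggaag",
    "U6-2 (015)": "taattctcaactgctctttcctgatgttgatcattta",
    "U12 (013)": "tgattcccaaccggttctggttgcatggccatgagtt",
}

for label, seq in hits.items():
    print(f"{label}: {pse_matches(seq)}/{len(PSE_CONSENSUS)}")
```

For these four rows the count is either 9 or 10 out of 10, which is the "9/10 or perfect match" criterion. The same loop can be run over the full hit list.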
I do not mention U11 because I could not find anything that was compelling; the nature of snRNA conservation is such that I could be missing something.

U1 -- five genes, all of which were previously described:

gb|AC018327.1|AC018327 Drosophila melanogaster, *** SEQUENC...   327  8e-89
    my 001; U1-82.3
gb|AC017493.1|AC017493 Drosophila melanogaster, *** SEQUENC...   311  5e-84
    my 007; U1-21.1
gb|AC015109.1|AC015109 Drosophila melanogaster, *** SEQUENC...   311  5e-84
    my 002; U1-95.1
gb|AC015154.1|AC015154 Drosophila melanogaster, *** SEQUENC...   311  5e-84
    my 005; U1-95.2
    my 006; U1-95.3

Additional U1-related sequences at 82E: 82.1 and a pseudogene (82.2) are missing. Their absence may be a polymorphism (due to recombination); the published sequences of the real genes, 82.1 and 82.3, are identical, and they were known to be directly oriented on the phage clone. Alternatively, the tandem repeat could have caused an assembly error.

U2 -- six genes and what is probably a pseudogene. I have yet to work out the relationship between these genes and what was previously published.

gb|AC019896.1|AC019896 Drosophila melanogaster, *** SEQUENC...   381  e-105
    2 genes (my 003, 004)
gb|AC019965.1|AC019965 Drosophila melanogaster, *** SEQUENC...   373  e-102
gb|AC017832.1|AC017832 Drosophila melanogaster, *** SEQUENC...   373  e-102
    2 genes (my 025, 026). There is also a U4 gene and two U5 genes in this contig.
gb|AC019732.1|AC019732 Drosophila melanogaster, *** SEQUENC...   365  e-100
gb|AC014977.1|AC014977 Drosophila melanogaster, *** SEQUENC...    52  8e-06
    A pseudogene ?

U4 -- two genes, both previously published.

gb|AC017832.1|AC017832 Drosophila melanogaster, *** SEQUENC...   297  6e-80
    my 010; U4-1. There are also two U2 genes and two U5 genes in this contig.
gb|AC014745.1|AC014745 Drosophila melanogaster, *** SEQUENC...   238  5e-62
    my 011; U4-2

U5 -- 4 genes and 3 variant genes

gb|AC019603.1|AC019603 Drosophila melanogaster, *** SEQUENC...   143  2e-33
    This matches the published Drosophila U5 RNA sequence. My number 017.
gb|AC019965.1|AC019965 Drosophila melanogaster, *** SEQUENC...   141  7e-33
    This differs from the published U5 RNA in the 3' end stem-loop. My number 012.
gb|AC019905.1|AC019905 Drosophila melanogaster, *** SEQUENC...   139  3e-32
    This differs from the published U5 RNA in the 3' end stem-loop. My number 018.
gb|AC017727.1|AC017727 Drosophila melanogaster, *** SEQUENC...   139  3e-32
    This differs from the published U5 RNA in the 3' end stem-loop. My number 019.
gb|AC019732.1|AC019732 Drosophila melanogaster, *** SEQUENC...   137  1e-31
    This differs greatly from the published U5 RNA in the 3' end stem-loop. My number 020.
gb|AC017832.1|AC017832 Drosophila melanogaster, *** SEQUENC...   139  3e-32
    These two genes differ greatly from the published U5 RNA in the 3' end stem-loop. My 021 and 022. There are also two U2 genes and a U4 gene in this contig.

U6 -- 3 genes on a single contig. These were all previously described.

gb|AC018038.1|AC018038 Drosophila melanogaster, *** SEQUENC...   212  2e-54

U11 -- MISSING! There is nothing in the genome that convincingly matches U11. However, given the difficulty of finding highly divergent snRNAs, this result is inconclusive. I tried hard, varying parameters and checking all sites that match the snRNA promoter consensus, but I did not find any convincing hits.

U12 -- 1 gene.

gb|AC014181.1|AC014181 nucleotides 16743-16506 (238 nt.).

U6atac -- 1 gene.

gb|AC018217.1|AC018217 Drosophila melanogaster, *** SEQUENC...    74  1e-12

U4atac -- 1 gene

gb|AC013956.1|AC013956 Drosophila melanogaster, *** SEQUENC...    52  5e-06

##################################################

On Mon, 28 Feb 2000, Skupski, Marian P. wrote:

> Steve
>
> The sequences will be good enough. We're spending the next week or so
> getting the data ready to go to GenBank, and I can blast the sequences and
> get coordinates ready if I have the sequences.
> I'll try to get the two extra U2 sequences added to the count that you
> sent to Mark.
>
> marian

##################################################

Stephen M. Mount
Cell Biology and Molecular Genetics
H. J. Patterson Hall
University of Maryland
College Park, MD 20742-5815
Phone 301-405-6934
FAX 301-314-9081
permanent email address sm193@XXXX

##################################################

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From smount@XXXX Fri Apr 28 04:08:17 2000
Envelope-to: ma11@XXXX
Delivery-date: Fri, 28 Apr 2000 04:08:17 +0100
X-Authentication-Warning: rac9.wam.umd.edu: smount owned process doing -bs
Date: Thu, 27 Apr 2000 23:13:05 -0400 (EDT)
From: Stephen M Mount <smount@XXXX>
To: Michael Ashburner <ma11@XXXX>
cc: flybase-helpXXXX, Helen Salz <hksXXXX>
Subject: snRNA report.
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY='-559023410-851401618-956891585=:3153'
Content-Length: 21169

Michael,

You suggested that I send you a report of snRNA positions. Here it is!

I have not renamed ANYTHING. New genes have been named according to the system used for the old names. Notes relevant to this are at the bottom, below my assignments and above the reference RNA sequences.

It is interesting that there are three or four clusters of snRNA genes:

AE003639 at 34A -- three U2 genes and one U5 gene in 35 kb.
AE003664 at 38AB -- two U2, one U4, and two U5 genes in 6 kb.!!!
AE003501 at 14B -- a U2 gene and a U5 gene in 1 kb.
AE003604 and AE003603 at 82E -- two U1 genes and a U4atac gene some unknown distance apart in beta heterochromatin.

##################################################

On Sun, 2 Apr 2000, Michael Ashburner wrote:

> Steve
>
> All the tRNAs are in the xml, since they are displayed on Genescene.
> Suzi might be able to provide a list with coordinates and (predicted)
> cytology.
>
> We might be able to give you some help in correlating existing snRNA
> genes with the Celera sequence.
> Aubrey can, I am sure, translate coordinates into approximate cytology
> and that might help.
>
> I would like to avoid, if at all possible, the naming of snRNA/tRNA genes from
> the Celera sequence independent of what FB has already.
>
> Michael

##################################################

Gene name   Promoter sequence and spacing to RNA      Accession RNA location
U1-21D      taattcccaactgcttctggccatcagctcatggaaa 24  AE003588  18,650   18,813
U1-82Ea     taattcccaactagttctagttgcgccctcatggaaa 24  X53542    2,592    2,755
U1-82Ec     taattcccaactagttctagttgcgccctcatggaaa 24  AE003604  196,233  196,396
U1-95Ca     caattcccaactggttttagctgctcagccatggaaa 24  AE003745  55,799   55,636
U1-95Cb     taattcccaactggttctggctacttccctatggaga 24  AE003745  23,856   24,027
U1-95Cc     aaattcccaaacagttctggcagatctctcaaggaga 24  AE003745  22,328   22,491
U2-14B      taattcccaactggtcttggctgcagtcgcatcaagt 25  AE003501  221,890  222,081
U2-34ABa    caattcccaactgcttctggccgtttgctcatggaga 25  AE003639  91,282   91,091
U2-34ABb    gaattcccaactgcttctggccgtttggtcatggaga 25  AE003639  95,005   95,196
U2-34ABc    taattcccaaatggttctggccgtttgcccatcgaga 25  AE003639  123,728  123,537
U2-38ABa    taattcccaactgattttagctgcagtcgcatgaagt 25  AE003664  83,768   83,577
U2-38ABb    taattctcaactgattttagctgcagtcgcatgaagt 25  AE003664  80,609   80,821
U4-38AB     taattcccaaatggttctggcttgctgtgaatggaat 25  AE003664  78,697   78,838
U4-39B      taattcccaactgcttctggcagcgccggcatggtat 25  AE003669  203,879  203,737
U4-25F      taattctcaaaaggttttagcagactccgcatagaga 24  AE003610  126,856  127,003
U5-14B      aaattcccaactccttctggccaacactgatcctaga 26  AE003501  221,335  221,226
U5-23D      gaattcccaaaaagttctatcacagaacgaatctagg 25  AE003581  87,935   87,805
U5-34A      tgattcccaacatgttcaagctcgttctaaatgatcg 25  AE003639  124,015  124,141
U5-35D      tgattcccaacacgttcaagcaatttcttagtggtac 25  AE003648  179,845  179,720
U5-38ABa    taattcccaagcggttctattcaatattgagtatgga 24  AE003664  80,037   79,911
U5-38ABb    aaattcccaattccttctggccaatactgatcctaga 26  AE003664  84,377   84,503
U5-63BC     taattcccaacgtgttaaagcagtcactgaatagagt 24  AE003477  26,662   27684
U6-96Aa     tgattcccaagtacatattctgcaagagtacagtata 27  AE003748  102,469  102,574
U6-96Ab     taattctcaactgctctttcctgatgttgatcattta 26  AE003748  103,072  103,178
U6-96Ac     taattctcaacttctttttccagactcagttcgtata 26  AE003748  103,595  103,701
U4atac-82E  taattcccaactagtactggccacttttgcttgaggt 21  AE003603  245,404  245,525
U6atac-29B  aaattcccaagttctttttccgcatggagtgcttata 26  AE003621  31,949   32,045
U12-73B     tgattcccaaccggttctggttgcatggccatgagtt 25  AE003526  210,776  210,539

Notes:

U1-82Ea is not in the Celera sequence. That could be due to a strain difference or to an error in sequence assembly (the U1-82Ea and U1-82Ec genes are located nearby and are nearly identical [Lo & Mount. Nucleic Acids Res. 18: 6971-6979]). However, it should be noted that U1-82Ea encodes the variant U1b, which differs from U1a by a single nucleotide and is known to be expressed in Kc cells and Oregon R flies.

U1-82Eb is a pseudogene and so is not listed. It lies between U1-82Ea and U1-82Ec and is also missing from the Celera sequence.

U1-95Cc encodes the U1c variant.

I have used the term U4d for U4-25F to avoid confusion. The U4a, U4b and U4c sequences published by Guthrie and Patterson (A.R. Genetics 1988) are attributed to Kiss, unpublished, and the U4-38AB gene resembles both U4a and U4c (sometimes one and sometimes another) at points where they differ.

No U5 gene entirely agrees with the published RNA sequence. Distances to the cap site are therefore approximate.

------------------------------------

RNA sequences (for confirmation):

>gi|174317|gb|K00787.1|DROUR1A D. melanogaster U1 small nuclear RNA
GATACTTACCTGGCGTAGAGGTTAACCGTGATCACGAAGGCGGTTCCTCCGGAGTGAGGCTTGGCCATTG
CACCTCGGCTGAGTTGACCTCTGCGATTATTCCTAATGTGAATAACTCGTGCGTGTAATTTTTGGTAGCC
GGGAATGGCGTTCGCGCCGTCCCGA

>U2 snRNA (from various)
atcgcttctcggccttatggctaagatcaaagtgtagtatctgttcttatcagcttaacatctgatagttcctccattgg
aggacaacaaatgttaaactcatttttggaatcagacggagtgctaggggcttgctccacctctgtcacgggttggcccg
gtattgcagtaccgccgggatttcggcccaac

>gi|174319|gb|K03095.1|DROUR4 D. melanogaster U4 small nuclear RNA
AGCTTAGCGCAGTGGCAATACCGTAACCAATGAAGCCTCCCTGAGGTGCGGTTATTGCTAGTTGAAAACT
TTAACCAACCCACGCCATGGGACGTGAAATACCGTCCACTACGGCAATTTTTGGAAGCCCTTACGAGGGC
TAA

>gi|174320|gb|K03096.1|DROUR5 D. melanogaster U5 small nuclear RNA, 3' end
ATACTCTGGTTTCTCTTCAATGTCGAATAAATCTTTCGCCTTTTACTAAAGATTTCCGTGGAGAGGAACA
CTCTAAGAGTCTAAAACTAATTTTTTAGTCAGTCTTGTCGCAAGACTGGGGCCA

>gi|8768|emb|X06669.1|DMU6 Drosophila melanogaster U6 snRNA
NGTTCTTGCTTCGGCAGAACATATACTAAAATTGGAACGATACAGAGAAGATTAGCATGGCCCCAGCGCA
AGGATGACACGCAAAATCGTGAAGCGTTCCACATTTTT

> Drosophila U4atac -- hypothetical
caataatgttataaataataaacaatttttaatttttagaaggaagtcaaaagtagagtgtaaatcgcttattacacttt
atttacaaacgatattttagtgtatgcaatatttcccttgc

> Drosophila U6atac -- hypothetical
gtgttgtttggaaggagagcaagttagcactcccctagacaaggatggaacacataaacggtcggctaggcacagacaaa
agccgtccacaaatttt

> Drosophila U12 -- hypothetical
gtgcctcaaactaatgagtaaggaaaaccaatcagccttgctaatcgcttggcagtattggcttctaggcaggggggcgt
gtcccgcgccccttgaagctcaaatttttgcaagggcacaggtcgtcccctcctcctccgcgtgggtggcgttcggccga
gcgaaccggcgcctactttgcgtccggctagcgaggatctctgggtgccatcccacggctgggtgttgcgatctgccc

##################################################

Relevant text from Mount and Salz (manuscript in preparation for a special issue of the Journal of Cell Biology):

The Drosophila genome contains multiple copies of the 5 UsnRNAs found in the major class of spliceosomes. We found five genes for U1, six genes for U2, three genes for U4, seven genes for U5, and three genes for U6. With the exception of U4-25F and the U5 genes (which were previously known only by in situ hybridization), these genes had been described previously [Lo, 1990; Saba, 1986; Das, 1987; Alonso, 1984; Saluz, 1988]. The variant U4-25F gene presumably escaped detection because the predicted RNA has only 69% identity with the major form of fly U4 [Saba, 1986] and 68% with human U4.
This degree of divergence is therefore quite high (human and fly U4 share 73% identity), yet the gene is likely to be functional because the promoter is conserved and some of the variation includes compensatory changes that allow formation of the conserved stem loop structures.

The Drosophila genome also contains minor class, or U12, introns [Hall, 1994; Tarn, 1996] and is therefore likely to contain the U12-type spliceosome. Identification of snRNAs for the U12-type spliceosomes was difficult because they had not been described previously, and were somewhat diverged. However, by modification of the standard parameters for blastn (see Methods), it was possible to find one gene for U12, one gene for U6atac and one gene for U4atac. These are almost certainly real genes, as critical sequences are conserved. In addition, the highly conserved snRNA promoter, including a 9/10 or perfect match to the PSE consensus TAATTCCCAA approximately 52 nucleotides upstream of the start [Lo, 1990 #19; Jensen, 1998], is present in each case.

No U11 gene was found. This may be due to divergence beyond what can be detected by blastn searches. However, it is striking that the one identified protein component unique to the U11/U12 snRNP (and minor spliceosome), the U11 35 kd. subunit [Will, 1999 #20], cannot be found either. In fact, the U11 snRNP may not be required for splicing, as its role should be 5' splice site recognition, and the highly conserved minor splice site consensus is also complementary to U6atac snRNA. Thus, minor class 5' splice sites could be recognized by the U6atac snRNA alone, or by an unknown protein that acts during the early steps of splicing. This mechanism would be analogous to a situation seen in vitro where certain vertebrate introns can be processed in the absence of U1 snRNP if the 5' splice sites can be recognized by U6 snRNA [Crispino, 1996 #21].

##################################################

Stephen M. Mount
Cell Biology and Molecular Genetics
H. J. Patterson Hall
University of Maryland
College Park, MD 20742-5815
Phone 301-405-6934
FAX 301-314-9081
permanent email address sm193@XXXX

##################################################

From smount@XXXX Fri May 12 19:25:14 2000
Envelope-to: ma11@XXXX
Delivery-date: Fri, 12 May 2000 19:25:14 +0100
X-Authentication-Warning: rac3.wam.umd.edu: smount owned process doing -bs
Date: Fri, 12 May 2000 14:30:13 -0400 (EDT)
From: Stephen M Mount <smount@XXXX>
To: Michael Ashburner <ma11@XXXX>
Subject: Re: snRNA report - some questions
MIME-Version: 1.0

Michael,

Sorry to take so long to reply. I had meant to go back and check all of the obscure Flybase snRNA entries (most of which are based on poor data), but I haven't finished yet, and I am about to fly to CA for Gerry's 50th. What follows is as far as I got.

Steve

##################################################

> Steve .. I am now updating the snRNAs in FB with your data.
> I have some questions, which I need answered before I actually
> pass the updates on to the database.
>
> 0. You made some other minor changes
> to the names (ie the cytologies) .. were these on the basis of new evidence ?
>
> snRNA:U5:23DE > snRNA:U5:23D
> snRNA:U5:34AB > snRNA:U5:34A
> snRNA:U5:35EF > snRNA:U5:35D
> snRNA:U5:63A > snRNA:U5:63BC

The old evidence was in situ hybridization, and I suspect that some of it was fairly inaccurate (see below). The names I assigned were based entirely on the sequence, specifically on interpolation between nearby genes within that same sequence accession (e.g. snRNA:U5:63BC is near 'BtbVII,' which was assigned to 63B-C). I recognize that this has its own problems. In this case, BtbVII is over 50 kb. to the right, and the only mapped gene on the adjacent accession, Shab (= 'CG1066', 63A1-63A7), is 130 kb. in the other direction.

> 1. snRNA:U2:39B . & snRNA:U2:40AB . FlyBase had these genes but with NO
> attached sequence. You have U2-38ABa and U2-38ABb.
> Do I assume one of them is the U2:39B gene & the other is the U2:40AB gene ?

Probably not. U2-38ABa and U2-38ABb are less than 3 kb. apart and would not have been distinguished by the method used (in situ hybridization).

My response to the following questions is the same. The in situs in those old papers were just not that good. Remember, snRNAs make short probes. Your guess as to which is which is probably better than mine.

> 2. FB had these genes with these accession numbers. Not on your lists:
> snRNA:U2:84Ca , X04247
> snRNA:U2:84Cb , X04245, X04246
>
> 3. FB has snRNA:U4:39B , AQ034021 .. is this your U4:38AB
>
> 4. FB has snRNA:U4:40AB , no sequence .. is this your U4:39B
>
> 5. FB has U5:39B with no sequence data. Is this your U5:38ABa ?
>
> 6. FB has these:
> snRNA:U2:84Ca , X04247
> snRNA:U2:84Cb , X04245, X04246
>
> Not on your list
>
> 7. I think I asked you this before: Any idea what these are ?
>
> *a snRNA-a
> *z FBgn0003454
> *x FBrf0042443 == Arrigo et al., 1985, EMBO J. 4: 399--406
> *g M26817
> #
> *a snRNA-b
> *z FBgn0003455
> *x FBrf0042443 == Arrigo et al., 1985, EMBO J. 4: 399--406
> *g M26818
> #
> *a snRNA-c
> *z FBgn0003456
> *x FBrf0042443 == Arrigo et al., 1985, EMBO J. 4: 399--406
> *g M26819
> #
> *a snRNA-d
> *z FBgn0003457
> *x FBrf0042443 == Arrigo et al., 1985, EMBO J. 4: 399--406
> *g M26820
> #
> *a snRNA:K2a
> *z FBgn0016982
> *x FBrf0038653 == gm626.h == Wooley et al., 1982, Proc. Natl. Acad. Sci. USA 79(22): 6762--6766
> #
> *a snRNA:K2b
> *z FBgn0016981
> *x FBrf0038653 == gm626.h == Wooley et al., 1982, Proc. Natl. Acad. Sci. USA 79(22): 6762--6766
> #
> *a snRNA:K8
> *z FBgn0016980
> *x FBrf0038653 == gm626.h == Wooley et al., 1982, Proc. Natl. Acad. Sci. USA 79(22): 6762--6766
> #
> *a snRNA:K9
> *z FBgn0016979
> *x FBrf0038653 == gm626.h == Wooley et al., 1982, Proc. Natl. Acad. Sci. USA 79(22): 6762--6766
> #
>
> *g is the sequence accession and *x the reference
>
> Thanks for all the help.
>
> Michael
>