Subject: nc5 report, third installment

Hi Aubrey and Gillian,

Here is another installment of P insertions. Let me know if you have any questions about them.

cheers,
sima

>l(2)05642=Uba1
p_name l(2)05642 name CG1782 gene Uba1 ct_name CT5340 relation inside r_orientation + inside_intron 0 inside_exon 1 dist5 22 dist3 5018
<up>maps at nt 22 of gene in same orientation as gene; EP(2)2375 maps 544bp downstream in same gene</up>

>l(2)05643=Rpt1
p_name l(2)05643 name CG1341 gene Rpt1 ct_name CT3016 relation inside r_orientation + inside_intron 0 inside_exon 1 dist5 8 dist3 1389
<up>l(2)05643 and EP(2)2153 inserted at same nucleotide, 8nt downstream of Rpt1 gene, a gene nested in an intron of CG17985; insertion in same orientation as gene</up>

>l(2)05714=CG8886
p_name l(2)05714 name CG3792 gene ct_name CT12669 relation behind r_orientation + inside_intron 0 inside_exon 0 dist5 6081 dist3 7323
p_name l(2)05714 name CG8891 gene ct_name CT25526 relation front r_orientation - inside_intron 0 inside_exon 0 dist5 8023 dist3 7320

CG8886|FBan0008886|CT25498|FBan0008886 last_updated:000321
Length = 940

Score = 178 bits (90), Expect = 3e-45
Identities = 90/90 (100%), Positives = 90/90 (100%)

Query 1   gcgccgagtaaccgtcattactagacgccagccagctggagagctccggatatttctccc 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 290 gcgccgagtaaccgtcattactagacgccagccagctggagagctccggatatttctccc 231

Query 61  actctgtgcgatagcgttggtggtacagtc 90
          ||||||||||||||||||||||||||||||
Sbjct 230 actctgtgcgatagcgttggtggtacagtc 201

<up>not near any annotation by Guochun (not sure why); but by BLAST, inserted in opposite orientation at bp 208 of CG8886 (in LD03394.5')</up>

>l(2)05836=CG7392=WD-40-family-member
p_name l(2)05836 name CG7392 gene WD-40-family-member ct_name CT22741 relation inside r_orientation + inside_intron 1 inside_exon 0 dist5 359 dist3 7378
<up>l(2)05836 and EP(2)2514 are 1bp away from each other in intron of CG7392; l(2)05836 inserted in same orientation as gene, 53bp downstream of end of first exon</up>

>l(2)05847=CG8732
p_name l(2)05847 name CG8732 gene CG8732 ct_name CT25221 relation inside r_orientation + inside_intron 1 inside_exon 0 dist5 3089 dist3 12768
<up>l(2)05847 inserted in intron of CG8732, in same orientation as gene, 473bp downstream of end of first exon; EP(2)2365 also inserted 286bp downstream of l(2)05847 in same intron; plasmid rescue flanking sequence probably needs trimming since only bp56-368 match genomic sequence (or polymorphic between strains)</up>

>l(2)06225=CG6105
p_name l(2)06225 name CG6105 gene ct_name CT19171 relation inside r_orientation - inside_intron 2 inside_exon 0 dist5 692 dist3 782
<up>inserted in opposite orientation in second intron of CG6105</up>

>l(2)06270=CG8846=Phas1?
p_name l(2)06270 name CG8846 gene Phas1 ct_name CT9265 relation behind r_orientation + inside_intron 0 inside_exon 0 dist5 20 dist3 1163
<up>l(2)06270 in hotspot for insertion: EP(2)2244 at 3396566, EP(2)1010, l(2)k00609 (but complemented), EP(2)2266, EP(2)2085, EP(2)2255, EP(2)2429, EP(2)2350, l(2)k13506 (but complemented), l(2)06270 at 3396689, EP(2)0818, EP(2)2550, and l(2)k07736 at 3396696 (all within 130bp); l(2)06270 only 20bp upstream of Phas 1 start in same orientation as gene</up>

>l(2)06496=CG9893
p_name l(2)06496 name CG9893 gene CG9893 ct_name CT27878 relation inside r_orientation + inside_intron 0 inside_exon 1 dist5 18 dist3 622
<up>l(2)06496 inserted in same orientation as gene 18bp downstream of start of CG9893 transcription and in LD16285.5'</up>

>l(2)06655=?
<up>no computed analysis but BLASTing against All fly sequence: 3' sequence indicates inserted at 110608 minus orientation of AE003804.1, not near any annotation; 5' says inserted at 112619 in minus orientation of AE003804.1; polymorphism in strain or multiple inserts</up>

3' sequence:
Score = 326 (48.9 bits), Expect = 4.2e-08, P = 4.2e-08
Identities = 70/76 (92%), Positives = 70/76 (92%), Strand = Minus / Plus

Query: 76     GCGCCTTAGACCGCTCGGCCACGCTACCATGTCGTAGATTATGTCAAACGCACATGAAGA 17
              |||||||||||||||||||||||||||||||| |||||||||||||||||| ||  ||||
Sbjct: 110533 GCGCCTTAGACCGCTCGGCCACGCTACCATGTTGTAGATTATGTCAAACGCTCAACAAGA 110592

Query: 16     CACAAGGCTAGGAAGC 1
               || ||||||||||||
Sbjct: 110593 GACGAGGCTAGGAAGC 110608

5' sequence:
Score = 411 (61.7 bits), Expect = 6.1e-12, P = 6.1e-12
Identities = 83/84 (98%), Positives = 83/84 (98%), Strand = Minus / Plus

Query: 84     GGCGACCCTTTTGCCAGTACGCTTGTTGCCAGTTTCAAGTGTTTTTGTTGCACGTTGACA 25
              ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 112612 GGCGACCCTTTTGCCGGTACGCTTGTTGCCAGTTTCAAGTGTTTTTGTTGCACGTTGACA 112671

Query: 24     GTCCGTTAAAGGTGACACAGGCGC 1
              ||||||||||||||||||||||||
Sbjct: 112672 GTCCGTTAAAGGTGACACAGGCGC 112695

>l(2)06708 inserted in repeat in genome

>l(2)06850=grh
<up>no computed analysis; BLASTN against All shows l(2)06850 inserted in minus orientation at 250845 of AE003801.1, in intron of grh in same orientation as gene; also in intron of LP11035.5'; may be polymorphism because no match from 122 to 249 of flanking sequence to genomic sequence</up>

gadfly| SEG:AE003801 |gb|AE003801.1|Drosophila melanogaster genomic scaffold 142000013386047 section 11 of 52, complete sequence.|AE003801.1 GI:7302679
Length = 272,650

Minus Strand HSPs:

Score = 902 (135.3 bits), Expect = 9.1e-56, Sum P(2) = 9.1e-56
Identities = 186/192 (96%), Positives = 186/192 (96%), Strand = Minus / Plus

Query: 441    CGAATTCCCCCAAAGGGAAGTGGNTCGATGTGTGACTGCGATGGTGCTATGTAGCGAGCT 382
              CGAA TC C CAAAGG AAGTGG TCGATGTGTGACTGCGATGGTGCTATGTAGCGAGCT
Sbjct: 250407 CGAACTCGCGCAAAGG-AAGTGGTTCGATGTGTGACTGCGATGGTGCTATGTAGCGAGCT 250465

Query: 381    CTGTGCGGGAGCGAGAGCACCATCCAAATCACGTACGCACACATACGATCGTATAACAGC 322
              CTGTGCGGGAGCGAGAGCACCAT CAAATCACGTACGCACACATACGATCGTATAACAGC
Sbjct: 250466 CTGTGCGGGAGCGAGAGCACCATACAAATCACGTACGCACACATACGATCGTATAACAGC 250525

Query: 321    GTGCGAGCGTATAACAGTTGACCGTGCAGTGGCAGCAGCAAAGGCTCATTGTTGTTGTTG 262
              GTGCGAGCGTATAACAGTTGACCGTGCAGTGGCAGCAGCAAAGGCTCATTGTTGTTGTTG
Sbjct: 250526 GTGCGAGCGTATAACAGTTGACCGTGCAGTGGCAGCAGCAAAGGCTCATTGTTGTTGTTG 250585

Query: 261    CTCGCGTACATA 250
              CTCGCGTACATA
Sbjct: 250586 CTCGCGTACATA 250597

Score = 600 (90.0 bits), Expect = 9.1e-56, Sum P(2) = 9.1e-56
Identities = 120/120 (100%), Positives = 120/120 (100%), Strand = Minus / Plus

Query: 121    ACGCTGATTCGTATTGGCTGCAGCCACTAATACTCATTGCTCACTCACACCAGCAATTGA 62
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 250726 ACGCTGATTCGTATTGGCTGCAGCCACTAATACTCATTGCTCACTCACACCAGCAATTGA 250785

Query: 61     CAGACCAATTGCAGGTCTGTTAGACAACAGCAGCGGCAGCAACAACTACATCGCCCAGCA 2
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 250786 CAGACCAATTGCAGGTCTGTTAGACAACAGCAGCGGCAGCAACAACTACATCGCCCAGCA 250845

>l(2)06949=CG8151?
p_name l(2)06949 name CG8151 gene CG8151 ct_name CT24236 relation behind r_orientation - inside_intron 0 inside_exon 0 dist5 35 dist3 2283
<up>l(2)06949 inserted 35bp upstream of CG8151 in opposite orientation of gene</up>

>l(2)07129=CG10941=mm?
p_name l(2)07129 name CG10941 gene mm ct_name CT30649 relation behind r_orientation + inside_intron 0 inside_exon 0 dist5 61 dist3 96016
<up>l(2)07129 inserted just 61bp upstream of start of mm; hotspot since same insertion position as l(2)k07110 (but complemented) and l(2)k04222b (but complemented)</up>

>l(2)07806=?
<up>no computed analysis but BLASTN against All indicated inserted in reverse orientation relative to AE003452.1 around 193215, not near any annotations</up>

gadfly| SEG:AE003452 |gb|AE003452.1|Drosophila melanogaster genomic scaffold GI:7291191
Length = 305,505

Minus Strand HSPs:

Score = 1299 (194.9 bits), Expect = 4.7e-52, P = 4.7e-52
Identities = 267/276 (96%), Positives = 267/276 (96%), Strand = Minus / Plus

Query: 277    TTGGCAAATTCTTGGTTATAGCACTTCTATAACTTGCATTTCAAATATCTTAAGAGATAG 218
              || ||||||||||||||||||||||||| |||||||||||||||||||| ||||||||||
Sbjct: 192940 TTTGCAAATTCTTGGTTATAGCACTTCTTTAACTTGCATTTCAAATATCATAAGAGATAG 192999

Query: 217    CTAATTAAGTCTTAGCCCAACTTTTTAAGCCAAGCTTGGAAGCTCCCGCTCTCTTCCAAG 158
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 193000 CTAATTAAGTCTTAGCCCAACTTTTTAAGCCAAGCTTGGAAGCTCCCGCTCTCTTCCAAG 193059

Query: 157    CTTTTTTCGCGCTCTTTGTGGAGCGCGGCTTTTGGTGGCTCGCCTGGCAATGCAATTCAG 98
              |||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
Sbjct: 193060 CTTTTTTCGCGCTCTTTGTGGAGCGCGGCTTTTGGTAGCTCGCCTGGCAATGCAATTCAG 193119

Query: 97     TTGCCAAACCAAACGCGATCGGCTAACAACCTCGTACCGCTCGCTCGCAGCTTTAGCTTT 38
              ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 193120 TTGCCAAACAAAACGCGATCGGCTAACAACCTCGTACCGCTCGCTCGCAGCTTTAGCTTT 193179

Query: 37     ACCGTTACGCCAGTGATTGAGCACTACTATATACCA 2
              |||||||| |||||||||||||||||| |||| | |
Sbjct: 193180 ACCGTTACACCAGTGATTGAGCACTACAATATCCGA 193215

>l(2)07837=?
p_name l(2)07837 name CG3245 gene PpN58A ct_name CT10905 relation behind r_orientation + inside_intron 0 inside_exon 0 dist5 5491 dist3 6551
<up>l(2)07837 5.5kb upstream of PpN58A, in same orientation, but not near any other gene</up>

>l(2)08014=?
<up>no computed analysis for l(2)08014 but BLASTN of All indicates insertion in positive orientation at 186016 of AE003626.1; CG4539 starts at 186414 so ~400bp upstream in same orientation as gene</up>

gadfly| SEG:AE003626 |gb|AE003626.1|Drosophila melanogaster genomic scaffold 142000013386055 section 19 of 63, complete sequence.|AE003626.1 GI:7297545
Length = 265,236

Plus Strand HSPs:

Score = 1502 (225.4 bits), Expect = 3.2e-61, P = 3.2e-61
Identities = 330/354 (93%), Positives = 330/354 (93%), Strand = Plus / Plus

Query: 1      GCGCCAAGTCGAGAATTTGTTCCTTCAGTCTCGTCATCTGTTTCATCATTTGAGGCATAA 60
              GCGCCAAGTCGAGAATTTGTTCCTTCAGTCTCGTCATCTGTTTCATCATTTGAGGCATAA
Sbjct: 185960 GCGCCAAGTCGAGAATTTGTTCCTTCAGTCTCGTCATCTGTTTCATCATTTGAGGCATAA 186019

Query: 61     CACTCTTCCATTTAAGCACTTTAATTCACTTGATGTAAAACTATTTACGAAGAGCACACA 120
              CACTCTTCCATTTAAGCACTTTAATTCACTTGAT TAAAACTATTTACGAAGAGCACACA
Sbjct: 186020 CACTCTTCCATTTAAGCACTTTAATTCACTTGATATAAAACTATTTACGAAGAGCACACA 186079

Query: 121    ACTTACCATTTCGACTCTTGAAATAAAATGAACGAGTTTCGTCATTGCACGTGGAAGGAA 180
              ACTTACCATTTCGA TCTTGAAATAAAATGAACGAGTTTCGTCAT GCACGTGGAAGGAA
Sbjct: 186080 ACTTACCATTTCGAGTCTTGAAATAAAATGAACGAGTTTCGTCATCGCACGTGGAAGGAA 186139

Query: 181    AAATTCTGTGAAAAGATGGCGGCCAACTATCGATGTCTCTGAATGCAACCATGGTAGTAT 240
              AAATTCTGTGAAAAGATGGCGGCCAACTATCGATGTCTCTGAATGCAACCATGGTAGTAT
Sbjct: 186140 AAATTCTGTGAAAAGATGGCGGCCAACTATCGATGTCTCTGAATGCAACCATGGTAGTAT 186199

Query: 241    CGTATCGGAAAAATTGTTTGGTTTTTGCCTGATATATATTGTATAAGAATGTG-AATAAA 299
              CGTATCGGAAAAATTGTTTGGTTTTTGCCTGATATATATTGTATAAGAATGTG AATAA
Sbjct: 186200 CGTATCGGAAAAATTGTTTGGTTTTTGCCTGATATATATTGTATAAGAATGTGCAATAAT 186259

Query: 300    TGATATT-CC-ATTTAAT--T-TTC-ATTAAATCATTCATAAAATACTTAATATA 348
              T A  T  CC A T AAT  T TTC ATT AAT  TTCAT AAAT  TT ATA A
Sbjct: 186260 TAACTTAACCGAATAAATGATATTCCATTTAATT-TTCATTAAATCATTCATAAA 186313

>l(2)08307 line has insertion that maps to CG3971 (on III)
p_name l(2)08307 name CG3971 gene CG3971 ct_name CT13185 relation inside r_orientation + inside_intron 1 inside_exon 0 dist5 296 dist3 11219
<up>l(2)08307 inserted in same orientation as CG3971 in first intron about 120bp downstream of end of first exon; element maps to 33A but gene maps to 73B; multiple insert strain or needs to be resequenced?; l(3)02281 and l(3)neo21 also appear to have insertions in this intron, as well as a number of 3rd chromosome EP lines</up>

>l(2)08492=CG1952
p_name l(2)08492 name CG1952 gene CG1952 ct_name CT6120 relation inside r_orientation + inside_intron 0 inside_exon 2 dist5 685 dist3 3063
<up>l(2)08492 in second exon of CG1952, in same orientation as gene</up>

>l(2)08717=CG15095
p_name l(2)08717 name CG15095 gene CG15095 ct_name CT34970 relation inside r_orientation - inside_intron 1 inside_exon 0 dist5 1790 dist3 3335
<up>l(2)08717 in opposite orientation of gene in first long intron about 1.3kb downstream of end of first exon</up>
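
For anyone who wants to load the computed records above into a script, here is a minimal Python sketch of one way to split a "p_name ... dist3 N" line into its fields. The names parse_record, FIELDS and INT_FIELDS are made up for this illustration and are not part of the pipeline that produced the report; the field list is just the set of keys that appear in the records above. It assumes values never coincide with a field name, treats a key with no value (such as the empty "gene" for l(2)05714) as an empty string, and drops a leading backslash on the orientation sign if one is present.

```python
# Keys as they appear in the computed records of this report.
FIELDS = ["p_name", "name", "gene", "ct_name", "relation", "r_orientation",
          "inside_intron", "inside_exon", "dist5", "dist3"]
# Keys whose values are whole numbers in every record shown.
INT_FIELDS = {"inside_intron", "inside_exon", "dist5", "dist3"}


def parse_record(line):
    """Split one 'p_name ... dist3 N' record into a dict of field -> value."""
    collected = {}
    key = None
    for token in line.split():
        if token in FIELDS and token not in collected:
            # A new key starts; anything after it is its value.
            key = token
            collected[key] = []
        elif key is not None:
            collected[key].append(token)

    record = {}
    for field, values in collected.items():
        text = " ".join(values).lstrip("\\")  # tolerate an escaped '\+' or '\-'
        if field in INT_FIELDS and text:
            record[field] = int(text)
        else:
            record[field] = text  # '' when the key had no value
    return record


if __name__ == "__main__":
    example = ("p_name l(2)08492 name CG1952 gene CG1952 ct_name CT6120 "
               "relation inside r_orientation + inside_intron 0 "
               "inside_exon 2 dist5 685 dist3 3063")
    print(parse_record(example))
```

The sketch only splits the fields; it does not interpret them, and the hand-written notes in the <up> comments remain the authoritative description of each insertion.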