Reference
Citation
Misra, S. (2000.8.25). nc5 report, second installment. 
FlyBase ID
FBrf0131112
Publication Type
Personal communication to FlyBase
Text of Personal Communication
From sima@XXXX Fri Aug 25 18:55:49 2000
Envelope-to: gm119@XXXX
Delivery-date: Fri, 25 Aug 2000 18:55:49 +0100
Date: Fri, 25 Aug 2000 11:02:17 -0700 (PDT)
From: Sima Misra <sima@XXXX>
X-Sender: sima@fruitfly
To: Aubrey de Grey <ag24@XXXX>
cc: curators@XXXX
Subject: nc5 report, second installment
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY='-559023410-851401618-967225668=:13947'
Content-Length: 33026
Hi Aubrey,
Here is the second installment of nc5 curations. I had substantial help
from Guochun, who has placed most of our P insertions on the genomic
sequence. Guochun provided me with data that I include in my report--here
is the key from Guochun of what the various fields are:
p_name: P insertion name.
name: CG name
gene: if the CG is a known gene, then here is the gene name
ct_name: is the CT of the above CG, and being used to decide the
relation to P insertion
relation: is the relation of the P insertion to the CG. There are 3
values for this field: inside, front, and behind. 'inside'
is obvious, and if a P is inside a CG, then the inside_intron
or inside_exon field will indicate which intron/exon it
actually inserts into. I'll use the following diagram to explain
the 'front' and 'behind' relations.
===> is a CG with orientation.
------- is genomic sequence.
Normally there are 4 CGs around a given P insertion site.

                    P insertion site
                           |
             front         |        behind
          ==========>      V       =======>
5' ----------------------------------------------> 3'
3' <---------------------------------------------- 5'
            <======                <========
             behind                  front

so each P insertion will have TWO CGs with relation 'front'
or 'behind'.
You can also go to my P insertion Genescene launch page
'http://weasel.lbl.gov:94/cgi-bin/pins/test.pl' to see
some examples. It'll help to explain the relation.
r_orientation: is the relative orientation of P insertion alignment
and CG. e.g. if CG is on plus strand, and P aligned on
minus strand, then r_orientation is '-'.
inside_intron: indicates which intron the P insertion inserts into. 0 means not
inside.
inside_exon: indicates which exon the P insertion inserts into. 0 means not
inside.
dist5: gives the distance from 5' end of CT to the P insertion
site
dist3: gives the distance from 3' end of CT to the P insertion
site
My personal comments after looking at the BFD report and GeneSeen are in
brackets [].
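The key above describes a simple tagged layout: a '>' header line, one or more blocks that each start with p_name, and an optional bracketed comment. Below is a minimal Python sketch of how such a report could be read into records. The names parse_report, Entry, and SAMPLE are hypothetical and not part of the original report, and the sketch assumes r_orientation values are written as a plain '+' or '-'.

```python
from dataclasses import dataclass, field

# Fields named in the key; the last four hold integers.
INT_FIELDS = {"inside_intron", "inside_exon", "dist5", "dist3"}
FIELDS = {"p_name", "name", "gene", "ct_name", "relation", "r_orientation"} | INT_FIELDS


@dataclass
class Entry:
    """One '>' record: header, one dict per p_name block, bracketed comment."""
    header: str
    hits: list = field(default_factory=list)
    comment: str = ""


def parse_report(text):
    entries, current, in_comment = [], None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == ".":          # blank lines and '.' separators
            continue
        if in_comment:                       # continuation of a [...] comment
            current.comment += " " + line.rstrip("]")
            in_comment = not line.endswith("]")
        elif line.startswith(">"):           # start of a new record
            current = Entry(header=line[1:])
            entries.append(current)
        elif line.startswith("[") and current is not None:
            current.comment = line[1:].rstrip("]")
            in_comment = not line.endswith("]")
        elif current is not None:
            key, _, value = line.partition(" ")
            if key not in FIELDS:            # ignore anything outside the key
                continue
            if key == "p_name":              # each p_name opens a new block
                current.hits.append({})
            if current.hits:
                value = value.strip()
                current.hits[-1][key] = int(value) if key in INT_FIELDS else value
    return entries


# Two records copied from the report (the gene field may be empty).
SAMPLE = """\
>l(2)01351=?
p_name l(2)01351
name CG13109
gene
ct_name CT32343
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 22258
dist3 23996
[no annotations nearby; hotspot for 10 insertions 10.7kb to right]
>l(2)03050=CG9350?
p_name l(2)03050
name CG9350
gene CG9350
ct_name CT26565
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 203
dist3 961
[CG9350 is 203bp downstream but only evidence for gene was homology to EST, no
gene prediction]
"""

for entry in parse_report(SAMPLE):
    for hit in entry.hits:
        print(hit["p_name"], hit["name"], hit["relation"], hit["dist5"])
```

Keeping a list of blocks per record matters because some records report the same insertion against more than one CG or CT.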
I also looked at each line in GeneSeen, and looked at the BFD insertion
line report to check the genetics. I found many cases where insertions
mapped to the same place, or very close together, but surprisingly (to me,
anyways) many times these insertions complemented each other. I'm not
sure what to do in cases where insertions map to the same transcription
unit but complement; I'm sure you all are experts at this sort of
situation, and will annotate appropriately.
.
Let me know if anything is unclear.
-sima
--------------------------------------------------------------------------------
>l(2)01528,l(3)rJ880,l(3)j12B4 sequenced insertions in repeat in genome
>l(2)00231=?
p_name l(2)00231
name CG4272
gene BcDNA:GH09817
ct_name CT9361
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 351
dist3 3998
[l(2)00231 complements l(2)k08232, even though they are only 115bp away from
each other; l(2)k08232 is 236bp upstream of BcDNA:GH09817, which is nearest
gene to l(2)00231]
>l(2)00232 sequence=CG7664
p_name l(2)00232
name CG7664
gene crp
ct_name CT23425
relation inside
r_orientation -
inside_intron 2
inside_exon 0
dist5 18013
dist3 5643
.
>l(2)00248=l(2)01275=l(2)k17014=l(2)k07303=Ef1alpha48D
p_name l(2)00248
name CG8280
gene Ef1alpha48D
ct_name CT24517
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 42
dist3 2850
[most likely Ef1alpha48D, since insertion is only 42bp upstream of
transcription start at 6892926; hotspot for insertion since l(2)01275 22bp
upstream of l(2)00248, l(2)k17014 17bp upstream, and l(2)k07303 14bp upstream;
l(2)01275 and l(2)k17014 are known to be alleles of Ef1alpha48D and both are
further away (more 5'); none ever tested for complementation]
>l(2)00629=?
p_name l(2)00629
name CG13438
gene CG13438
ct_name CT32796
relation front
r_orientation -
inside_intron 0
inside_exon 0
dist5 5530
dist3 4790
[l(2)00629 insertion not near any annotations, but hotspot for insertion,
since l(2)00629, l(2)k07001, and l(2)k06409 within 7bp of each other; amazing
l(2)00629 was complemented by l(2)k06409 & l(2)k07001]
>l(2)01038=mm
p_name l(2)01038
name CG10941
gene mm
ct_name CT30649
relation inside
r_orientation +
inside_intron 2
inside_exon 0
dist5 38086
dist3 57869
[gene has 4 introns, this insertion in middle of largest (huge) intron]
>l(2)01085=CG15426
p_name l(2)01085
name CG15426
gene
ct_name CT35488
relation inside
r_orientation +
inside_intron 0
inside_exon 6
dist5 20251
dist3 3841
>l(2)01094=?
p_name l(2)01094
name CG9403
gene CG9403
ct_name CT9101
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 3149
dist3 8614
p_name l(2)01094
name CG15234
gene CG15234
ct_name CT35171
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 3130
dist3 3358
[possible that l(2)01094=l(2)k03204 since they are inserted 288bp away from
each other, but never tested for complementation (unless typo and l(2)k03404
that non-complements l(2)01094 should be l(2)k03204)]
>l(2)01296=CG3186
p_name l(2)01296
name CG3186
gene CG3186
ct_name CT10685
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 493
dist3 1150
>l(2)01351=?
p_name l(2)01351
name CG13109
gene
ct_name CT32343
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 22258
dist3 23996
[no annotations nearby; hotspot for 10 insertions 10.7kb to right]
>l(2)01424=CG3845
p_name l(2)01424
name CG3845
gene CG3845
ct_name CT12829
relation inside
r_orientation -
inside_intron 0
inside_exon 1
dist5 4
dist3 7047
>l(2)01466=ATPCL=CG8322
p_name l(2)01466
name CG8322
gene ATPCL
ct_name CT18257
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 937
dist3 6500
>l(2)01810=CG5304
p_name l(2)01810
name CG5304
gene CG5304
ct_name CT16877
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 334
dist3 7870
>l(2)01848=?
p_name l(2)01848
name CG2672
gene Tkr
ct_name CT9053
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 514
dist3 13063
[l(2)01848 is 514bp upstream of Tkr, but l(2)03263 is inserted 17bp closer and
remarkably they complement]
>l(2)01857=l(2)k00107=l(2)00681=CG2140
p_name l(2)01857
name CG2140
gene CG2140
ct_name CT6982
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 645
dist3 1587
[inserted in first intron just 10bp downstream of l(2)00681; l(2)k00107 is
located just upstream of start of transcription of CG2140; none tested for
complementation; l(2)00681 must be multiple insert line since one insert maps
to 51B5 and doesn't complement ttv, but sequence maps to CG2140 at 43D3]
>l(2)02045=CG11546
p_name l(2)02045
name CG11546
gene CG11546
ct_name CT36453
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 4301
dist3 5002
[in intron of transcript CT36453; upstream of 2 other transcripts, CT36451 and
CT9385]
>l(2)02074=CG1512
p_name l(2)02074
name CG1512
gene CG1512
ct_name CT3821
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 24
dist3 4050
[only 24bp upstream of transcription start]
>l(2)02836, l(3)03928, l(3)04069 sequences all map to repeat in genome
p_name l(2)02836
name CG6983
gene CG6983
ct_name CT21627
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 1055
dist3 6919
[must be multiple insert line since genetically insertion is at 53B1, but
sequence maps to 66D1, and other 2 insertions are supposed to be on third, but
instead sequences map to identical nucleotide]
>l(2)03050=CG9350?
p_name l(2)03050
name CG9350
gene CG9350
ct_name CT26565
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 203
dist3 961
[CG9350 is 203bp downstream but only evidence for gene was homology to EST, no
gene prediction]
>l(2)03105=l(2)k16702=?
p_name l(2)03105
name CG12464
gene CG12464
ct_name CT32655
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 13822
dist3 14253
p_name l(2)03105
name CG18369
gene CG18369
ct_name CT41749
relation front
r_orientation +
inside_intron 0
inside_exon 0
dist5 26306
dist3 24529
[nothing nearby; l(2)k16702 inserted at identical nucleotide; these were never
tested for complementation]
>l(2)03497=wun?
p_name l(2)03497
name CG8804
gene wun
ct_name CT4876
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 164
dist3 9085
[seems to be a hotspot, with 4 insertions within 219bp of each other, just
upstream of wun; complementation never tested with l(2)k09507, which is
inserted in first exon of wun]
>l(2)03563=?
p_name l(2)03563
name CG17390
gene CG17390
ct_name CT33481
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 572
dist3 7041
[CG17390 is only gene close by, but 572bp downstream]
>l(2)03605=l(2)03832=?
p_name l(2)03605
name CG17952
gene CG17952
ct_name CT39996
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 990
dist3 3746
[CG17952 990bp downstream; l(2)03832 inserted 1bp upstream of l(2)03605, but
these were not tested for complementation]
>l(2)03709=CG15081
p_name l(2)03709
name CG15081
gene CG15081
ct_name CT42565
relation inside
r_orientation +
inside_intron 0
inside_exon 1
dist5 20
dist3 2334
>l(2)03771=CG18323=CG14028
p_name l(2)03771
name CG18323=CG14028
gene
ct_name CT41595=CT33587
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 141
dist3 619
[nothing else around but CG18323=CG14028 141bp downstream]
>l(2)03832=l(2)03605=?
p_name l(2)03832
name CG17952
gene CG17952
ct_name CT39996
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 991
dist3 3747
[see record for l(2)03605; start of transcription of CG17952 is 991bp
downstream]
>l(2)03996=CG8258
p_name l(2)03996
name CG8258
gene CG8258
ct_name CT8297
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 7
dist3 2159
[insertion is just 7bp upstream of the start of transcription of CG8258]
>l(2)04008=?
p_name l(2)04008
name CG6320
gene
ct_name CT19694
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 1499
dist3 8603
p_name l(2)04008
name CG16874
gene Vm32E
ct_name CT19798
relation front
r_orientation +
inside_intron 0
inside_exon 0
dist5 2850
dist3 2499
[not close to anything, 1.5kb from start of transcription of CG6320]
>l(2)04111=?
p_name l(2)04111
name CG10871
gene
ct_name CT30433
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 20188
dist3 28966
[nearest annotation, CG10871, is 20kb away]
>l(2)04154=CG5935
p_name l(2)04154
name CG5935
gene EG:EG0003.6
ct_name CT18411
relation inside
r_orientation -
inside_intron 1
inside_exon 0
dist5 252
dist3 3886
[but complemented by l(2)k0997, which is inserted in second intron, 769bp
downstream]
>l(2)04329=Nacalpha
p_name l(2)04329
name CG8759
gene Nacalpha
ct_name CT25274
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 7
dist3 1158
>l(2)04493=smt3
p_name l(2)04493
name CG4494
gene smt3
ct_name CT14617
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 33
dist3 832
[also shown to non-complement l(2)04841, which is inserted in smt3]
>l(2)04530=CG9342
p_name l(2)04530
name CG9342
gene CG9342
ct_name CT3751
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 36
dist3 5986
[maps only 36bp upstream of gene]
>l(2)04535=l(2)k16713=tkv
p_name l(2)04535
name CG14026
gene tkv
ct_name CT33585
relation inside
r_orientation +
inside_intron 2
inside_exon 0
dist5 5237
dist3 17291
[l(2)04535 must be multiple insert line since in situ said it mapped to
42C1-42C2; sequence of l(2)04535 and l(2)k16713 maps to large second intron of
tkv, 211bp apart; l(2)04535 shown to be an allele of tkv genetically,
l(2)k16713 never tested for complementation but is also a known tkv allele]
>l(2)04723 sequence=dock
p_name l(2)04723
name CG3727
gene dock
ct_name CT42218
relation inside
r_orientation +
inside_intron 1
inside_exon 0
dist5 798
dist3 6434
p_name l(2)04723
name CG3727
gene dock
ct_name CT12313
relation inside
r_orientation +
inside_intron 1
inside_exon 0
dist5 827
dist3 6434
.
>l(2)04845=l(2)k00208=AGO1?
p_name l(2)04845
name CG6671
gene AGO1
ct_name CT20708
relation inside
r_orientation -
inside_intron 2
inside_exon 0
dist5 1210
dist3 8908
[l(2)04845 maps to second intron of CT20708, l(2)k08121 maps to third intron
of CT20708, second intron of CT42234, and l(2)k00208 maps to first intron of
CT20708; l(2)k08121 surprisingly complemented l(2)04845 genetically but
l(2)k00208 not tested]
>l(2)05070=CG8392
p_name l(2)05070
name CG8392
gene CG8392
ct_name CT18263
relation inside
r_orientation -
inside_intron 0
inside_exon 1
dist5 9
dist3 957
[l(2)05070 in first exon]
>l(2)05095=?
p_name l(2)05095
name CG10248
gene Cyp6a8
ct_name CT28799
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 5329
dist3 7009
p_name l(2)05095
name CG17453
gene Cyp317a1
ct_name CT31993
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 5185
dist3 6742
[must be multiple insert line since in situ at 39E1-39E2 and sequence maps to
51D3; sequenced insertion not really near any gene]
>l(2)05248=?
p_name l(2)05248
name CG8297
gene CG8297
ct_name CT20148
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 12775
dist3 13826
p_name l(2)05248
name CG8291
gene CG8291
ct_name CT21678
relation front
r_orientation -
inside_intron 0
inside_exon 0
dist5 9789
dist3 1070
[original in situ for both l(2)k02205 and l(2)05248 at 52D1-52D2 but inferred
genomic sequence map is 52D9; insertion sequence maps 50bp from l(2)k02205,
which surprisingly complements l(2)05248; not really near any gene]
>l(2)05287=CG12050
p_name l(2)05287
name CG12050
gene CG12050
ct_name CT3661
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 10
dist3 7350
[insertion maps 10bp upstream of gene; in situ was at 39A1-39A2 but inferred
genomic sequence map is 39B1]
>l(2)05428=l(2)k06503=Cdk4/6
p_name l(2)05428
name CG5072
gene Cdk4/6
ct_name CT16072
relation inside
r_orientation -
inside_intron 3
inside_exon 0
dist5 4371
dist3 1562
p_name l(2)05428
name CG5072
gene Cdk4/6
ct_name CT15896
relation inside
r_orientation -
inside_intron 3
inside_exon 0
dist5 3765
dist3 1562
[l(2)05428 and l(2)k06503 map to third intron of gene within 71bp of each
other; never tested for complementation]
>l(2)k00107=l(2)01857=l(2)00681=CG2140
p_name l(2)k00107
name CG2140
gene CG2140
ct_name CT6982
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 36
dist3 2268
[insertion is located 36bp upstream of start of transcription of CG2140; other
insertions map to intron, within 10bp of each other; none tested for
complementation; l(2)00681 must be multiple insert line since one insert maps
to 51B5 and doesn't complement ttv, but sequence maps to CG2140 at 43D3]
>l(2)k00208=l(2)04845=AGO1?
p_name l(2)k00208
name CG6671
gene AGO1
ct_name CT20708
relation inside
r_orientation +
inside_intron 1
inside_exon 0
dist5 560
dist3 9558
p_name l(2)k00208
name CG6671
gene AGO1
ct_name CT42234
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 1849
dist3 9558
[l(2)04845 maps to second intron of CT20708, l(2)k08121 maps to third intron
of CT20708, second intron of CT42234, and l(2)k00208 maps to first intron of
CT20708; l(2)k08121 surprisingly complemented l(2)04845 genetically but
l(2)k00208 not tested]
>l(2)k02205=?
p_name l(2)k02205
name CG8297
gene CG8297
ct_name CT20148
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 12825
dist3 13876
p_name l(2)k02205
name CG8291
gene CG8291
ct_name CT21678
relation front
r_orientation +
inside_intron 0
inside_exon 0
dist5 9839
dist3 1120
[original in situ for both l(2)k02205 and l(2)05248 at 52D1-52D2 but inferred
genomic sequence map is 52D9; insertion sequence maps 50bp from l(2)k02205,
which surprisingly complements l(2)05248; not really near any gene]
>l(2)k03204=?
p_name l(2)k03204
name CG9403
gene CG9403
ct_name CT9101
relation behind
r_orientation +
inside_intron 0
inside_exon 0
dist5 3437
dist3 8902
p_name l(2)k03204
name CG15234
gene CG15234
ct_name CT35171
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 2842
dist3 3070
[possible that l(2)01094=l(2)k03204 since they are inserted 288bp away from
each other, but never tested for complementation]
>l(2)k06409=?
p_name l(2)k06409
name CG13438
gene CG13438
ct_name CT32796
relation front
r_orientation -
inside_intron 0
inside_exon 0
dist5 5523
dist3 4783
[l(2)k06409 insertion not near any annotations, but hotspot for insertion,
since l(2)00629, l(2)k07001, and this one within 7bp of each other; amazing
since l(2)k06409 was complemented by l(2)00629 & l(2)k07001]
>l(2)k07001=?
p_name l(2)k07001
name CG13438
gene CG13438
ct_name CT32796
relation front
r_orientation +
inside_intron 0
inside_exon 0
dist5 5525
dist3 4785
[l(2)k07001 insertion not near any annotations, but hotspot for insertion,
since l(2)00629, l(2)k07001, and l(2)k06409 within 7bp of each other; amazing
since all 3 complement]
>l(2)k08121=AGO1?
p_name l(2)k08121
name CG6671
gene AGO1
ct_name CT20708
relation inside
r_orientation -
inside_intron 3
inside_exon 0
dist5 4098
dist3 6020
p_name l(2)k08121
name CG6671
gene AGO1
ct_name CT42234
relation inside
r_orientation -
inside_intron 2
inside_exon 0
dist5 1689
dist3 6020
p_name l(2)k08121
name CG6671
gene AGO1
ct_name CT42236
relation inside
r_orientation -
inside_intron 0
inside_exon 1
dist5 5
dist3 6020
[l(2)04845 maps to second intron of CT20708, l(2)k08121 maps to third intron
of CT20708, second intron of CT42234, and l(2)k00208 maps to first intron of
CT20708; l(2)k08121 surprisingly complemented l(2)04845 genetically but
l(2)k00208 not tested; hard to determine why these are all in AGO1 but 2
complement]
>l(2)k16702=l(2)03105=?
p_name l(2)k16702
name CG12464
gene CG12464
ct_name CT32655
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 13822
dist3 14253
p_name l(2)k16702
name CG18369
gene CG18369
ct_name CT41749
relation front
r_orientation +
inside_intron 0
inside_exon 0
dist5 26306
dist3 24529
[nothing nearby but l(2)03105 at same nucleotide; not tested for
complementation]
>l(3)03928, l(2)02836, l(3)04069 sequences all map to repeat in genome
p_name l(3)03928
name CG6983
gene CG6983
ct_name CT21627
relation behind
r_orientation \-
inside_intron 0
inside_exon 0
dist5 1055
dist3 6919
[must be multiple insert line, since insertion maps genetically to third, but
sequence maps to 66D1, and other 2 insertions' sequence likewise map in wrong
position at identical nucleotide]
>l(3)04069, l(3)03928, l(2)02836 sequences all map to repeat in genome
p_name l(3)04069
name CG6983
gene CG6983
ct_name CT21627
relation behind
r_orientation -
inside_intron 0
inside_exon 0
dist5 1055
dist3 6919
[must be multiple insert line, since insertion maps genetically to third, but
sequence maps to 66D1, and other 2 insertions' sequence likewise map in wrong
position at identical nucleotide]
>l(3)j12B4,l(2)01528,l(3)rJ880 inserted in repeat in genome
>l(3)rJ880,l(2)01528,l(3)j12B4 inserted in repeat in genome
--------------------------------------------------------------------------------
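Several bracketed comments in the report quote separations between insertions, such as 288bp between l(2)01094 and l(2)k03204, or three insertions within 7bp of each other near CG13438. These figures appear to match the difference in dist5 when two insertions are reported against the same CT. The short Python sketch below checks that arithmetic on values copied from the records. It rests on that assumption, and the names HITS and separations are hypothetical, not part of the report.

```python
from itertools import combinations

# (p_name, ct_name, dist5) copied from the records in the report.
HITS = [
    ("l(2)00629", "CT32796", 5530),
    ("l(2)k06409", "CT32796", 5523),
    ("l(2)k07001", "CT32796", 5525),
    ("l(2)01094", "CT9101", 3149),
    ("l(2)k03204", "CT9101", 3437),
    ("l(2)05248", "CT20148", 12775),
    ("l(2)k02205", "CT20148", 12825),
]


def separations(hits):
    """Pairwise distance in bp between insertions that share a ct_name.

    Assumes dist5 is measured from the same 5' end for every insertion
    reported against a given CT, so the difference is the separation.
    """
    by_ct = {}
    for p_name, ct_name, dist5 in hits:
        by_ct.setdefault(ct_name, []).append((p_name, dist5))
    result = {}
    for members in by_ct.values():
        for (p1, d1), (p2, d2) in combinations(members, 2):
            result[(p1, p2)] = abs(d1 - d2)
    return result


for pair, bp in separations(HITS).items():
    print(pair[0], pair[1], f"{bp}bp")
```

Closely spaced pairs found this way would be the natural candidates for the complementation checks discussed in the comments.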
Language of Publication
English
Data From Reference
Alleles (62)
Genes (66)
Insertions (69)