FB2026_01 , released March 12, 2026
Reference Report
Reference
Citation
Robertson, H.M. (2001.7.7). Third set of annotation suggestions from bee ESTs. 
FlyBase ID
FBrf0138600
Publication Type
Personal communication to FlyBase
Text of Personal Communication
Subject: Third set of annotation suggestions from bee ESTs
Dear Gillian,
Attached is a TEXT file with the last set of suggestions for annotation
improvements deriving from our honey bee EST project.
Thanks,
Hugh
Hugh M. Robertson
Professor
Department of Entomology
University of Illinois at Urbana-Champaign
--------------------------------------------------------------------------------
**********A. mellifera BB260007A20E1.F
TGGCTTTTAAAATAAATAATTTTTAGATTTTATCAAAAAGGTTAGTAAGTCCGGTCAGGTACGACGAGAATTTAAGTTT
TCCAGTGCAGAATGTTTTGGAAAATGGTTGGTGAACCGGAGAACAACAAGCAGAGCCATTTAAAGCTATAAGATTTAAT
CTGAACGGAAATGGAAAATTATTTTCCCAAGGTGAAATGGAAAATTAATTTCCCAATCATTCAGTTTAATAGATTTTTA
TATTTTAAATATCATGCTATTATTTTATTTTACACAGATTAAGGGAATTGGATAGAGATGAACTTTTCGAAAAAGCAAG
AGGAGAAATCTTGGATGAAATTGTAAATCTGTCCCAAGTTTCGCCTAGACACTGGGAAGAAGTGTTAATGGTCAGAATT
TGGGATAAAGTTAGTATGCACGTATTTGAAAACATCTATCTACCCGCGGCTCAAAGCGGAAGTCCAAGTATATTTCACA
ATAATTTTATATGATTTTCCCTTATGAATATTTACTATTATAATTATCATTATAATTATTTTTCTTTGTTATAGGTACA
TTTAATACTACCGTTGATATAAAACTTCGTCAATGGGCTGAACAACAATTGCCAGCTCGAAGTGTCGAAAGTGGTTGGG
AATGTTTACAACAGGAATTTCAACATTTTATGAATCAAGCAAAACTCAGTCCTGATCATGATGATATTTTTGATAATTT
AAAAAATGCAGTAGTCAGTGAAGCTATGAGACGTCATTTTT
Unspliced transcript, yet clearly shows that CG8479 is missing an exon, the
sequence of which is also present in mammalian and nematode orthologs
**********A. mellifera BB260009B10G1.F RC
GGTCATAGGGGTCAGGATCGAGGGTTATGCCGAATTTATCTTCGTTATGCTCAAAATTTTTTTCAATGAAACAATATAA
TTAATGAATTTAAAAGTTTTAGAAATCCTGATTTTCAAGAAGCAGTTTTTCAAAAAGATGGTCAGATACTCTTAACATT
TGCAATTGCTAATGGATTTAGAAATATACAAAATCTTGTACAAAAATTGAAGCGTGGAAAATGTCTATATGACTATGTT
GAAATTATGGCATGTCCTTGTGGATGTCTTAATGGAGGAGCACAAATTAGACCTTTAAATAATGTTCAATCACGTGAAC
TGGCATTAAAATTAGAATCTATATATCGAGAACTTCCTCAAAGTGATCCTGAACAAAATTTAATAGTGAAAAATTTATA
CAAAAATTGGTTAGGAGGAGAATATACAGACAAAGCTTTAGCATATTTTCATACTCAATATCATGAAATAAAGAAAGTA
AATACTGCTCTTGCTATAAAATGGTAATTTATGATAAATATATATTATTATTAATTTAAATTAGAATTTTTAAATACAA
CTAATATCAAATTAATATGATTTAAAAATATAGTTGTATGATTAGATGTATATTTATTCTATTACTATTATATATTAAA
AAACCTTAATTGTTGGCAAATATTCATTAAAGAATATAAAAGACAAATTTTTAATAAATATTTATATCTTTTATAAAAA
AAAAAATAATTTAATAATATACATGTAAATATAAATTTGATATTATATAGAGTATAATGTATTTAATGTAAATTATCTA
ATTATAAGGATATAAAAGAAATTTC
The vertebrate matches for this are about 476aa, while the Drosophila CG17683
is only 2141, but when searched with vertebrate proteins, the Drosophila genome
shows the N-terminus in an exon just upstream of CG17683.
**********A. mellifera BB260011A10D4.F
AATATTCCAGAAGGAACTCATCAATATGATTTGATGAAGCATGCAGCAGCCACTTTAGGTTCTGGTAATTTAAGACTTG
CAGTTATGCTTCCAGAAGGTGAAGATTTGAATGAATGGGTTGCAGTAAATACTGTTGATTTTTTCAACCAAATCAATAT
GTTATATGGCACTATCACAGAATTCTGTACTGAAGAAAGTTGTCCCATCATGTCTGCAGGACCAAAATATGAATACCAT
TGGGCTGATGGGCATACTGTCAAAAAACCAATAAAATGTTCTGCACCAAAATATATTGATTATTTAATGACTTGGGTAC
AGGATCAATTAGATGATGAAACCTTATTTCCTTCAAAAATTGGTGTTCCTTTTCCAAAAAACTTCTTGTCTATTGCTAA
AACAATATTGAAAAGACTATTCAGAGTATATGCTCATATTTATCATCAACATTTTAGTGAAGTCGTTCAACTTGGCGAA
GAAGCACATTTAAATACATCATTCAAACATTTTATATTTTTTGTTCAAGAATTTAATTTGATAGAAAGAAGAGAATTGG
CACCATTGCAAGAATTAATAGAGAAGTTAACAGCAAAGGATGCTCGATGAATGATCTTGATAGTTAACTGAAAACTCCT
CCAAAGAATCTCACTGTGCTATCTTTTTAACGTGTAATTGTGCATGTTATTTATTAATCCAATTCTTGCT
This and excellent mammalian matches show that the N-terminus is missing from
CG13852
**********A. mellifera BB260015B20F5.F
GTTGCCTTTTTTTTTTTTTTTTTTTTTTGCTCTGTACTTATTTGAATCTGTAGCTAACAGAGAGCTAATTTCAGAGATA
TACTTCGCTATATAGAGGAGGATATTCTACCAGAAATGCATGTAAAATTTGGCCAAGAAATTCTATATTTAGAAGGATG
GTGTGCACGAACTCAGTATAATGCGTGCTGTAGATTACTTGGTCCGGGAATAAATATTCATCTTGCGGAAAATCAACTT
CTGCGCGAAATTTTTCACCTTGGTAACAAAGTTATGCCGATACAATTGAATCAAAAGACAAGTAAATTAGAAAGGACAT
TAATGAATGCAGCTGCATTCAAAGCGCGCACTATTCAGCGTAATAAAAACCGAGACAAACGATCAGCTGCCTTGGCACC
GTGATTCTTCAACTTTCTTTTTTTATATATATTAGATAATTTCTTTTAAAATAAGAAAAAAAAAAGCAACGC
This EST and the vertebrate matches show that CG3098 and CG15401 need to be
fused.
**********A. mellifera BB260017A20C9.F
CCAGGATCTCAATTGCCGAGATGTCGTACCGCGGAATGCTGAGCACCTCGGACGGTTCGCCCGTCGAGCTGCGCGTCCA
GGATGTGGACTCCTTTGACGGATCGATGATGTCCCATGCGGCCACCAATGCGGCCGGTCCGCTGGGCAGCAAGGCCAGC
GGCAAGTCCCTCAACGCCTGCACCTCCATGTCCATGCCCCAAAAGCCGAGTTCCCTAAGGGCGACACCCAGCCGCTGCC
CATGTGATCTCGCCGATTCACCGGTGGATGAGAAGGCCGAAGCTGATGTGGAGGTGACCAACGAGATCTCGGCGCCCTA
TGAAGTGCCCCAATTTCCCATTGAACAGATCGAGAAGAAGCTGCAAATCCAGCGCCATCTCAATGAAAAGCAGGCTACA
GGACCTAGACCAGTGGCCACAGCCGCCGTATTGGCCACAAACAGGGAATCATCCAGCAGCACCGAGGGCCGCGAGTCCG
CCGTTACTATGGAGAGAAACGATGCGGACATTAACTTTCAACGGGTCTCAATTTCGGGTGAGGACACCAGTGGCGTACC
GCTCGAGGATTTGGAGCGGGCCTCGACCCTGCTGATCGAGGCACTGCGCCTGCGTAGCCACTACATGGCCATGTCGGAT
CAATCGTTTCCCTCGACCACGGCTCGATTCCTCAAGACAGTGAAGCTCAAGGATCG
This EST and vertebrate matches and then a whole set of old and new Drosophila
ESTs show that CG15762, CG11065, and CG11058 need to be fused.
**********A. mellifera BB260018B20A5.F
GGGAGTAAATTCTGGTGCTCATCAGCTTGACCAGTTCAGCACAACAGAATTGACGCCCGAACATCAGAAGATTCTAATA
GACATCAGACGAAAGAAGACGGAACTGCTACTCGAAATACAACAACTGAAAGATGAGCTGGGTGAAGTGGTGGCTGAAA
TGGAGGCGATGGAGGGCGGTGGTCTCGCCGTCGATGAAACCAAGCCATCGAATAAGGCTAAGCAGACGTCGATCGGCCG
TAAGAAGTTCAACATGGACCCTAAGAAGGGCATCGAATACCTGATCGAGCACAATCTGCTAGCGCCGACGCCCGAGGAC
GTCGCCCAATTCCTCTACAAGGGCGAAGGTCTTAACAAAACGGCGATTGGCGATTACCTTGGTGAAAGACACGACTTCA
ACGAGAGGGTGTTGAGGGCATTCGTCGAATTGCATGATTTCACCGATTTGATCCTCGTACAAGCTCTTCGACAATTTCT
TTGGTCGTTCCGAATGCCCGGCGAAGCGCAAAAGATCGATGGATGGAGGAATGCTTCGCACAAAGGGACTGGCCGGTGA
AATCCCATTTTTTACCAATTCCGACCCTTGGTACCTGGTCAGGTTGCGTTTATATGGTGGGAACCTCTTTTCCCAATCC
GAGGGCAAGGATAACCTACGGGGAG
This and vertebrate matches show that CG11633 and CG11628 need to be fused -
indeed a Drosophila EST from clone AT31091 unifies them already.
**********A. mellifera BB260020A10B2.F
GCTTTTTTTACTGGTGGAGAAATAGTCAGTACTTTTGATAATCCTGATATGGTGAAACTTGGCAAATGTGATCTTATTG
AACAGGTGATGATAGGTGAAGATACTTTATTACGATTTTCAGGAGTTCCTTTAGGAGAAGCTTGTACAGTAATCATTAG
AGGTGCAACCCAGCAAATTCTTGATGAAGCTGAGAGGTCTTTACACGATGCACTTTGTGTATTATCAGCTACCGTTCGC
GAATCAAGAATTGTTTATGGAGGAGGATGTAGTGAGATGATAATGGCTTGTGCTGTTATGAGAGCTGCTGCTTCTACTC
CTGGAAAAGAATCAGTTGCAATGGAATCTTTTGCTCGAGCATTGCAACAATTACCTACTGTTATTGCTGATAATGCTGG
TTATGATTCAGCTCAATTAATTAGCGAATTACGAGCTGCACATAATTCTGGTGCAAATACCATGGGTTTGGATATGGAA
CGAGGAAAAGTAAGCTGTATGAAACAATTAGGAGTAACTGAATCTTGGGCTGTGAAGAGACAAGTTTTACTTAGTGCAG
CAGAAGCAGCAGAAATGATTTTACGTGTGGATGATATTTTACGAGCAACGCCCAGAAAGCGTGTTAAAGATCGCGGACG
TTGTAATTATAGGTTACATATAACAATGCTCATTTATTGCTT
This and vertebrate matches show that CG7033 needs a C-terminus, and it is
readily encoded after an intron.
**********A. mellifera BB260020A10C8.F
AATTTCAATATTTAAAGTTACCTCTTCTTAGGAAATGAAAAGAAATATTGAAACTATATTGAACTTATTAAAATAGAAA
AAATTTATCATAAAAACTCTTTTAATTGATACTTGCGATTTGTGATAGATGATTTGTATAGTACGATAAAAGGATGAGG
ATAGCAAAACAATTTAAATATTTTATACTTGCGAATAATTATCTGTGTAAATATATTTAGGCTTAGATTTACGTCATGA
TCCTATGATTGTACTCCAATATGCGTTAAATCATCAAAAATATCATCAAAATGCCACGTATAGATCCTCTATCTTTATT
GAAATGTCTTAGTGTTTTATTGGGGCCAACTGGAGGTATAAAAAGTAAAGAAGAAGTTCATCGGTTGGCTAGTTTGATG
ACAAAATTTTCCAAAAAACTTGTTTCAAAATGCATTTATATACAAATATTGAAAACTACAAATACAGATTTACTTAGTC
AATTTATGGGTGCTGGAGGATGGAATCTCATACATATGTGGCTTACAGATGGTATTCTTGCAAAAAATTGGGCTTTAAT
TCAAGAACTTCTAGAACTTTTATTATTATGCCCAGTAGATATAGAAAGATTAAAAAGCAACAATTGTCCAAAATTAATA
AAAGGTCTATCAAAAGAAAGTAGTCATCAAAGTGGTAAAATGTTA
Shows an N-terminus is needed for CG4124 - it is annotated as mRNA, but not
translated. There are ESTs from Drosophila and Bombyx mori also showing it.
**********A. mellifera BB260020A20G9.F
TGGTTCAGAGACTTTGCACTAGTGTAGCTGCGCTTAGCCTCGCCCTAAGTGCTTGCGTTTTTTTTCCGCGTGCTGTCAT
GGCGATCGACTTAAGTCGATTTTATGGTCATTTCAACACGAAACGTTCAGGAGATGCTTGTCGTCCATATGAGCCATTC
AAATGTCCAGGAGACGATACATGCATTTCGATTCAATATCTGTGTGACGGAGCTCCCGATTGTCAAGATGGATATGATG
AAGATTCGCGATTATGCACAGCTGCAAAACGACCACCAGTAGAGGAAACTGCTAGTTTTCTGCAGTCATTGTTAGCAAG
TCATGGTCCAAACTATCTTGAAAAATTGTTCGGGACTAAAGCGCGGGATACTTTAAAGCCTCTTGGTGGAGTGAATACA
GTTGCTATAGCGCTTTCCGAATCTCAGACGATCGAGGATTTTGGTGCAGCCTTGCATTTGTTACGCACAGATTTGGAAC
ATTTGCGCTCCGTATTTATGGCAGTGGAAAATGGCGATCTAGGCATGTTGAAATCAATAGGTATTAAAGATTCTGAATT
GGGAGATGTGAAATTTTTCCTTGAAAAGCTCGTGAAAACTGGTTTTCTCGACTGAACAAGCTTTTTTTTAAACGTAGGG
GCTATAACGTATATACGTTGACTGTATCGTGTTATAATCGCACTAACATATTTCTAGAAAGATTGTTGGAAGATTCTTT
CACTAATTTTTGCAAATGATCACCGCCTCTCTTTTCTTCATTATTACTCT
Shows that CG7237 needs both N- and C-terminal regions, readily available in
the genome.
**********A. mellifera BB260020B20A4.F
GTTGGCCAGGAGTCGGAGCACTCACGCATTGAAATCGAGGGAGCCGAGCCCGGAGCGAGACAGGGTGGGCGCAGAGAAG
GATGGGGCTGCGTTAAGTTCGTGGGCACGGTACTTGAAGAACAAGTACGGGAATCGAACGACCAAAGATAAGGAGCCTT
CGTCCTCTGCCTCGACGATTCCATCATCGAGTGGAAGCACCTCGAGGAGGTTATCGCTCGGATTACCTTTGAGGCACGG
TGGCCAAACGTCCTTCGAATCTTCTGACGACGATCAAAAAAACCCGTCAGGCTCCCCCACGTCCCCTACGGCAGCTCCC
GTTATACCCGCGGCAGCAGGTTCCTCCACTAGCAATGGGCGGAGGAGTCACTACTTGCTGAAGCGGCGGCAGCTGTTCA
AGTTCGGGATGCGGGGGAGCGAACCCGGATGCTTTACTTGGCCCAGAGGCCTCGCGGTTGGCCCTGACAACTCCATCGT
AGTGGCCGACAGCTCTAACCATCGTGTTCAAGTGTTCGACTGTAATGGGAACTTCATGAAGGAGTTCGGATCGTACGGC
AGCGGCGAGGGTGAATTCGATTGCCTGGCCGGGGTAGCGGTGAACAGGATCGGGCAATACATCATAGCGGATCGTTACA
ACCACAGAATTCAAGTTCTCGATCCTTCCGGTCGTTTCTGAGAGCGTTTGGCTCCCAAGTACCGCAGACGGGCGGTTTA
ATTATTCTTGGGGAATACCACCGATGCTCTTGGATTATTT
Shows that there are other alternative splices of CG15105, and there is a
Drosophila EST that confirms this.
**********A. mellifera BB260021B10H4.F
GCGTCATAAGATTTGGAAACCATTACGAATGTTAATTGTCAAAAAATCATTATATCAATGTTAATTATTTAATTAAATT
GATTATATGAATTGCTAATAAAAGCAATTAATTTAATTATTAAATAATGGAAAAACAACCGTTAAATCCAAAAGATGAT
AAACCTCCGTCTTACTCAGCGGCAACTGCACCAAAAGTAAATTGTTCATGGCAACCACCACCAGGTTACAATTCCACAC
AATGTCAAGAATCTACTTCATGGCCTCCACCTCCAGGATATTATCCTAGTGCCAGTGAAACTAATAGTCATAACTATGT
ACCTTCTTATGGCTCTACACAATCAACTACAATAATTATGCCAGAAATTATTTTAGTTGGAGGATGTCCTGCATGTAGA
GTGGGCATTATGGAAGATGATTTTACATGTCTTGGACTACTATGTGCTATTCTATTCTTCCCAGTGGGCATAATTTGTT
GTTTGTTATTAAAGACACGACGTTGTTCTAATTGTGGTGCATATTTTGGTTAATGTGTTACTATATTACAGTAACCATT
TCTGAAACTAAACATATTATTAATCTAATTTTAATATAATTATGTAGCTAATATAATTATATGATATTGAAAAATAAGA
ACATAGGCAATTAATTTTTTATTAATATTATATAATTATAATTATCTATGCATTTGCATTATTTGAGGAAAAATAATGC
CTCAATCATTCATATAATATACAATAATTTGTTAAATAGAAGTTCATGAAATATTGACATATATACATTAAGAAGATCT
CTATGCATTTAAGACTTTATAATTTAAG
This and vertebrate matches show that N-terminus is missing from CG12012;
Bombyx mori ESTs agree.
**********A. mellifera BB270008B10B6.F
ATATTCTCTATTATTAATGTAAATAGTTATTAATTTTAAGTTTTATTTTGTTACAACTGGTACCTGATTAATTTGTAGA
TAGTAACATAACGTTTAACATCATGCCTATTCTAAGGAAAAAATCTAGCAAAAAAGTAAATGTGGAAAATGGCAATGAC
AATTTGATTGGAGATATAACTATCAATGATTTTGCAAATTCATCACGGATACACACAAGATTTAAAAAGGCAATAATAA
ATGTCTCCTCTGAAGTGAGACAAAAAATAACAATAGATGAAGAAAGTTTTTTAAATGAAATGAATGAAAATTACCAAAA
TAATTTGTATGATTCGAATAAGTCAAAGAAAAGTACTAAAGGAGTTTTAACAAATCAAAGAAATAAAAAGACAAATTAT
TTTAAAAAAGATATAGAAGAACAAGAAGGTATAGATTATCCATTAGATATATGGTTTATTATATCTGAATATATCAGTC
CTGAGGCTATAGGAAAATTTGCTCAAATATGTAGAAGTTCTTATTATGTAGTTTCAACTGGAAAATTTTGGTTTCATTT
GTATAAATCTTATTATAAATTTGTTCCTGGTTTACCAGAACGTCTACAACCACAATGTATGGTTCGTACTCATGGACTT
CGAGCTTGTGTCATCAGAACATTACATTATACTTATTTTGCTTTGAAAAGAAAAGTTGATGATGTATCCTATTTAAGAA
CAGATGAACCACATTCACTTATAAAA
This and a new testes cDNA suggest that CG12765 needs a longer N-terminus. So
do the vertebrate matches in mouse.
**********A. mellifera BB270010A10B7.F
CTGTTCGGCGGCGGTGTTATTCGTACGGCCAATGAGGAAAGCGGAGCACGTGACCATGCTGGATCCTTTTCAAGAGAGA
TACGGTGCCGGAGTGGGGGGTCTCTTGTTCCTGCCTGCCCTCTTCAGTGATCTGTTCTGGTGTGGCGGCGTGTTGAGAG
CTTTGGGAAGCTCGTTGGCAGTGGTTGCTGGCGTGAATCCCGACATCAGCATAGTCGCCTCTGCCCTCTTGGCAGCCGT
ATACACAGTGTTCGGTGGGCTTTACTCCGTCGCGTGCACCGATGCGTTGCAGCTGGCATGCATCGTGATCGGGTTGGGA
TTGGCCGCGCCATTTTCCGTTCTCCACCCCGCCGTCAACTTCGAGAAAAATCTAACGCCGCACGAATGGCTCGGGGAGA
TCAAGAACGAGGATCTCGGCGAGTGGGTGGATTGCATGCTGTTGTTGGTATTTGGCGGTATACCCTGGCAGGTATATCG
AATATTATATTCGTCAAGATAATTCTGATTTGTGAATACGAATCGCGTGAATCTACACGAGCGTGAATAGATACTCGAT
ACGCGCGAACTCTCGTTTC
This and vertebrate matches indicate that CG7708 needs a different C-terminus,
and it's there in the genomic matches.
**********A. mellifera BB270020A10D10.F
TGGCCGTACACAACCAGAACTTATGGGAAGAGAACGACATGCAAAAACATTGGAAATTGCTCAAGAAGAGGTACTTACA
TGTTTAGGAATGTGTGTTGCTGAACGTTTACATAGAGTTCATAGACGATTAAGAGAAGAGGAAACTGTATGCAAAGTGT
TAGCTGCTGTTGCAGTAGATGCATTATCTAGAAATTTTCAAATGGCTGTAGAAGTTAAACAAGGTATTTCTCAATTAGA
ACTTCTCTATGAAGAATTAACAAGAGAAGAGATAGCAAAACAACAAAGACGAGAAAAGTTACGTTTGAAACGTAAAAAG
AAGAAAGAACGACGATATGAGACAGAAGAAAAAGAAAATACATGTGATGTATGTTATTATTTGAAAATTAAAATTTTCA
AAATTTATATATTTAATATTCTTACATAAGTTAATTATTATTAGTGTTCGAGCAAAAAACAAAGTGGTAATAGTGATAC
ATCTTGTGTTTGTGCGGATTCAAAACCGACAACACAAAATATAGATCAACATAAGTTACAAGTATTAGATCCAAAAAAT
AAGGGACCACCTACTTGTAATGTCCGGATTGTGTAAAAAAATCAAAATCTAGTATATCACGTTCACAGAGTCAAACACA
ATTAGCATTCCCAAAAAAATCATCGAACGTGCAAAAAACTACAATTAAAAAGAGTTCTTCTGAATCAAAGACCAATTTC
TACAAT
This and a B. mori EST, and now two Drosophila ESTs from testes and Schneider
cells, and vertebrate match indicate there is a segment missing from CG2182,
and it's there in the genome.
**********A. mellifera BB270023A20B8.F
CATATCGTATGTTGGTTATATACGAATAGTTTAAGATATCTACCAGTATTGGTAAGGCAATGGTGGAGTACTGCTGATA
GTAGAGTCAGCGCTGCCGTGGATAAGATCACAACACATTATGTTAGCCCTATGCTTTGTCAAGAGGAACTTCTCAATAA
TAAATTACAAAATATTGAAAACATGCAAGTAAAAGTACATCCAACATTCCGTGAAGTGATAGCTTTATATCAAATGGAT
GATACAAAATTAGAACTTAATATTACATTGCCATCTAATCATCCTTTGGGACCAGTTAGCGTTGAACCTGGACAACACG
CAGGTGGTACTGCGAATTGGCGGAATTGTCACATGCAATTATCCATATTTTTTACACATCAAAATGGATCTGTTTGGGA
TGGACTTGCATTATGGAAAAGAAATTTAGATAAGAAATTCGCCGGCGTTGAAGAATGTTACATATGTTTCAGTATTTTT
CACATAAATACGTATCAAATACCAAAATTATCTTGTCACACATGTCGTAAGAAATTTCATACTGCATGCTTGTATAAAT
GGTTTAGTACAAGTCAAAAATCCACGTGTCCAATTTGTAGAAATATATTTTAATCTATTATATTATTTATAAACAAAAT
TTGTATTTGTATTTAAATAAATAAAATAATATCTGTACATAAAAAAAAAAAAAAAAAAAAAAAAAAACCTCGTGCCGCC
TCGTGCC
This encodes a protein with an excellent match to the end of a very long human
protein - suggesting that the annotation of CG9274 needs revision - and
there is now a Drosophila Schneider cell EST matching it too.
**********A. mellifera BB270023B20C3.F RC
GTTGGGCAACGTGAACAAGATAATAATACGGCCGCCAGGGCATAGCGGGAAGGCGAAGAAAGGGCACATCTGCTTCGAC
GCCTCGTTCGAGACCGGCAACTTGGGCAGGGTGGATCTTATCTCGGAATTCGAGTACGATCTGTTCATCAGGCCGGACA
CTTGCAACCCGCGACTTCGTATGTGGTTCAATTTCACCGTGGACAACGTGAAGGCCGACCAACGAGTGATCTTCAACAT
AGTTAACATATCCAAGAGCGCGAATCTGTTTCGAAATGGGATGACACCGTTGGTAAAGAGTAGCAGTAGATCGAAGTGG
CAAAGAATTCCCAGGGATCAAGTGAGTCATCGTCGAAAAGAACGAAGAAATACGAAGATATATCGAGAGATATCTTATC
TTCCAAAAATTTTTATCCTAGGTTTTCTACTACAAATCGGCGCAACATCAAAACCATTACGTGCTCAGTTTCGCATTTT
CTTTCGACCGCGAGGAGGACGTGTATCAATTCGCCCTGACGTATCCGTACTCGTACAGCCGTTATTTGGCGCATCTGGA
CAACCTTTGCACCAGGTTAACGTACACGAGGAGAGAAACTATAGCCACGTCGATACAAAAGAGGAAGATCGAGTTGGTC
ACGATAAGTTCGAATCTGGACGACGTTCAAGATCGTTCGAGAAAGGTGGTGGTTGTCCTCGCAAGGGTGTATCCA
continuous ORF encodes
LGNVNKIIIRPPGHSGKAKKGHICFDASFETGNLGRVDLISEFEYDLFIRPDTCNPRLRMWFNFTVDNVKADQRVIFNI
VNISKSANLFRNGMTPLVKSSSRSKWQRIPRDQVSHRRKERRNTKIYREISYLPKIFILGFLLQIGATSKPLRAQFRIF
FRPRGGRVSIRPDVSVLVQPLFGASGQPLHQVNVHEERNYSHVDTKEEDRVGHDKFESGRRSRSFEKGGGCPRKGVS
Not part of any annotation, indeed is within the first 5.5kb intron of CG2246!
And in opposite orientation.
Best matches are C. elegans and mouse genes, both to the N-terminus, although
they differ greatly in size (447 versus 137, but mouse may be incomplete).
Try to reconstruct it.
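The claim that this EST carries a continuous ORF can be checked by translating it in each forward frame. Below is a minimal Python sketch, assuming the standard genetic code; the names `translate` and `EST_START` are illustrative and not part of the original message. `EST_START` is just the first 46 bases of the BB270023B20C3.F RC sequence as listed above.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate dna starting at offset (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    residues = []
    for i in range(offset, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


# First 46 bases of the EST as given in the message (already reverse-complemented).
EST_START = "GTTGGGCAACGTGAACAAGATAATAATACGGCCGCCAGGGCATAGC"

for offset in range(3):
    print(offset, translate(EST_START, offset))
```

With an offset of 1, this excerpt gives `LGNVNKIIIRPPGHS`, which is how the ORF quoted in the message begins. Running the same function over the full EST would show whether that frame stays open throughout.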
This is the entire intron sequence, in forward strand, from about
163912-169487 - try using FGENESH at the CGG website, and it predicts this gene,
which is almost perfect.
First, I probably would not have found the large first intron, but in fact it is
fairly well predicted and quite possibly right.
Second, I don't think the internal third phase X intron is needed, because it
is open, so could add those aa to total.
|AE003774
aattcaaatgattagtacgagttagggtcaagagagtgcaaatcaaagccataacgaaacattaacttggattagggaa
aagctacctgcgtttggacatgctgtcctagctgtagcttttggggttattggttatgttaatggggtggtgcatggtt
ctctcggtggcataatcgttaaagtaacacacacgcagtcgttacggttcccacgttgcgatgtacgaccgtatcgatt
agccccagtgaactgaatagcccccggtttttgcttgaacgcaattaatcctgccgctcctctcggttgcccagttgtg
gaaactgctctctggctgcggaggaacccaaaaccacaggcgaaccgaagcccctagccggaaactactgcgaattatc
gatcaaacaaaaaggcgacacacgtgagtggcgtggcaacgtcaaagtgccaacatgcgaatcacagttccggagtccg
cagcccgaattccggattccaaggcagtcgtctgtgcggtgcggaggaagcggcgaggagcgaccaggagctgcccggg
attagcagatagagatcgacctccagcgcccagacttcgcacgcgacttcaccgtcttgggagccccactcgagaaaag
tgaaaggccaggaggcgttggctaattgaatttctatttttgatcctgcccaaggagcgaagccccaaagaaaagagaa
gttgatgggttgaggcggcggaATGGGCGACTCAGgtaagcgaaatggtcataaatactcgcgataagtacgttgaggg
aatcccctagcaaagcaatgtaaaaatacactaggagtttatgaagatgttattgaaaaatagttaaccatgaacaaaa
gctagcaatctatacctcattccgaccaaaaatcacaacacccacggttagggtataaatgaagctatttaaaaacctt
tcaagcctcgagacattcacctccggctttcaactcaggattggtttggcatgggcttgggattggggggccgacgcaa
ccgcgaggcgactattcaattttagacatttgacgacttgagctgctcgaggctcagccaaaaggcgttggcaggtgca
gtgagtgcaatggaaacttgcggcacgcatatgtggcacaaggaaaaggaaaggaccggcgggatgggatgccttgatt
taagatagcagccgggtggcattaggaggaggattagaaggcgccccggccaagaaagttatgacaggccgggcagagg
ccaaagcctcgagggattagggccactgcactcggattggcttggttgcattagaagaacgtagatttttgccaaagaa
agacgcatcagcggataagagtcatcatgttgggaatagcctgcgatccacttcagcatacgagcatgtacgaacgcgg
atcagatggtagtgaaattgatattaaacactcatcttgaacagattagaaactaagagatagtatactacgcagatca
aaaaaagaatattggaaatatttgctcttatcaatcagtttcttatttcctaaactagaggcatgccgcttaaaatctt
tttggctcaattcatagtttttatacttctaatggatattcctttgcagATAGCGAAGACAGCGACGGAGAAGGCGGTC
TGGGCAACGTTTCCCGGGTTATCATCCGTCCGCCGGGTCAAAGTGGCAAGGCCAAGAGGGGCCATCTCTGCTTTGATGC
GGCCTTCGAAACAGGAAACCTGGGAAAAGCGGAGCTGGTGGGCGAATTCGAGTACGATTTGTTCCTTAGACCGGATACG
TGTAATCCTCGCTTCCGTTTCTGGTTTAACTTCACCGTGGACAACGTAAAGCAGGATCAGCGAGTGCTCTTTCACATTG
TTAACATCAGCAAGAGCAGGAATCTCTTCTCCTCGGGACTGACTCCCTTGGTGAAGAGCTCCAGTCGACCCAAGTGGCA
GAGGCTGTCCAAGCGGCAGGTGTTCTTCTACCGATCGGCCATGCACCAGGGTCACTATGTCTTGAGCTTCGCCTTTATC
TTCGACAAGGAAGAGGATGTCTACCAGTTTGCGTTGGCCTGGCCTTATAGCTATTCGCGTCTGCAGTCCTATTTGAATG
TGATTGATGCCCGGCAAGGATCGGA
M G D S
---------------------------1------------------------------------------------1-
-----------------------------------------------1-------------------------------
-----------------1------------------------------------------------1------------
------------------------------------1------------------------------------------
------1------------------------------------------------1-----------------------
-------------------------1------------------------------------------------1----
--------------------------------------------1----------------------------------
--------------1------------------------------------------------1---------------
---------------------------------1---------------------------------------------
---1------------------------------------------------1--------------------------
----------------------1------------------------------------------------1-------
---------------D S E D S D G E G G L G N V S R V I I R P
P G Q S G K A K R G H L C F D A A F E T G N L G K A
E L V G E F E Y D L F L R P D T C N P R F R F W F N
F T V D N V K Q D Q R V L F H I V N I S K S R N L F
S S G L T P L V K S S S R P K W Q R L S K R Q V F F
Y R S A M H Q G H Y V L S F A F I F D K E E D V Y Q
F A L A W P Y S Y S R L Q S Y L N V I D A R Q G S D
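The genomic listing above appears to mark predicted coding exons in upper case and intron or flanking sequence in lower case. Under that assumption, the predicted protein can be rebuilt by joining the upper-case runs and translating them. The Python sketch below does this for a short excerpt around the first intron, with most of the intron left out. The helper names and the excerpt boundaries are illustrative choices, not part of the original message.

```python
import re

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def spliced_cds(genomic):
    """Join the upper-case runs (assumed exons), dropping lower-case sequence."""
    return "".join(re.findall(r"[ACGT]+", genomic))


def translate_cds(cds):
    """Translate a coding sequence from its first base; a trailing partial codon is ignored."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


# Excerpt: end of the 5' flank, exon 1, both ends of the first intron
# (its interior is omitted here), and the start of exon 2.
GENOMIC_EXCERPT = (
    "ggcggaATGGGCGACTCAGgtaagcgaaatgg"
    "cctttgcagATAGCGAAGACAGCGACGGAGAAGGCGGTCTGGGCAACGTTTCC"
)

cds = spliced_cds(GENOMIC_EXCERPT)
print(translate_cds(cds))
```

For this excerpt the result is `MGDSDSEDSDGEGGLGNVS`, which agrees with the start of the fly protein as listed in this message. The first exon ends one base into a codon, so the `G` before the intron pairs with the `AT` after it to give the aspartate codon `GAT`.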
Fly protein
MGDSDSEDSDGEGGLGNVSRVIIRPPGQSGKAKRGHLCFDAAFETGNLGKAELVGEFEYDLFLRPDTCNPRFRFWFNFT
VDNVKQDQRVLFHIVNISKSRNLFSSGLTPLVKSSSRPKWQRLSKRQVFFYRSAMHQGHYVLSFAFIFDKEEDVYQFAL
AWPYSYSRLQSYLNVIDARQGSDKRFTRCVLVKSLQNRNVDLLTIDHVTAKQRSTNRLDRSFIRVIVVLCRTHSSEAPA
SHVCQGLIEFLVGNHPIAAVLRDNFVFKIVPMVNPDGVFLGNNRCNLMGQDMNRNWHIGSEFTQPELHAVKGMLKELDN
SDvsrgietdligiifvcsynisfqTYQIDFVIDLHANSSMHGCFIYGNTYEDVYRYERHLVFPRLFASNAQDYVADHT
MFNADERKAGSMRRFSCERLSDTVNAYTLEVSMAGHYLKDGKTISLYNEDGYYRVGRNLARTLLQYYRFINILPMPIVT
EVRSKRRGRNRHAHHSRSRSKTRYEVKPRPKTPRCHAPIAYTNLSICYDSGGGGGSSDEGGFSPARPLAPGSSCFSGYR
NYRRAATASCSAHPGHDQYSPFALGALKTGSDHGGGVGGSKGKRSAAVTIEVPLPVNVPPKPYLSIIDLNQLTRGSLKL
KSNSFDAADRRZ
Bee EST
LGNVNKIIIRPPGHSGKAKKGHICFDASFETGNLGRVDLISEFEYDLFIRPDTCNPRLRMWFNFTVDNVKADQRVIFNI
VNISKSANLFRNGMTPLVKSSSRSKWQRI-not sure why no match after
this?-PRDQVSHRRKERRNTKIYREISYLPKIFILGFLLQIGATSKPLRAQFRIFFRPRGGRVSIRPDVSVLVQPLFG
ASGQPLHQVNVHEERNYSHVDTKEEDRVGHDKFESGRRSRSFEKGGGCPRKGVS
No Drosophila ESTs for this region
Great BLASTP matches to both C. elegans and human proteins of comparable size
in GenBank
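The fly and bee proteins listed above line up without gaps over 108 residues, from `LGNV` through `KWQR` plus one further residue, before the bee translation stops matching. A quick identity estimate over that stretch can be sketched in Python as follows. The ungapped register is an assumption made by reading the two listings side by side, and the function name is illustrative. Both strings are copied from the "Fly protein" and "Bee EST" blocks.

```python
def ungapped_identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


# Fly protein, residues following the N-terminal MGDSDSEDSDGEGG.
FLY_SEGMENT = (
    "LGNVSRVIIRPPGQSGKAKRGHLCFDAAFETGNLGKAELVGEFEYDLFLRPDTCNPRFRFWFNFT"
    "VDNVKQDQRVLFHIVNISKSRNLFSSGLTPLVKSSSRPKWQRL"
)

# Bee EST translation up to the point flagged in the message.
BEE_SEGMENT = (
    "LGNVNKIIIRPPGHSGKAKKGHICFDASFETGNLGRVDLISEFEYDLFIRPDTCNPRLRMWFNFT"
    "VDNVKADQRVIFNIVNISKSANLFRNGMTPLVKSSSRSKWQRI"
)

print(f"{ungapped_identity(FLY_SEGMENT, BEE_SEGMENT):.2f}")
```

A position-by-position comparison like this only makes sense while the two sequences stay in register. Past the flagged point a proper alignment tool would be needed.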
**********A. mellifera BB270028B10A5.F
AGAACCCTTAGATCTTATAAGATTGAGTCTTGATGAAAGAATATACGTTAAAATGAGAAACGAGAGAGAATTAAGGGGA
CGATTACATGCTTACGACCAACATTTGAATATGGTGTTGGGTGAAGCAGAGGAAACCGTAACCACAGTAGAAATTGATG
AAGAAACATACGAAGAAGTGTATCGTACTACTAAAAGGAATATTTCTATGCTTTTTGTTCGTGGCGACGGTGTGATTTT
GGTTTCACCACCGAGCATGAGAGCACCGATATAAAAAATTTCTATTTACATTACAATCTGTCTAAAAATTAAGAAACAT
ATAAGTAATATATAATATTACGTCGATACAATTTTATCAACTTTTGTAACGAGCATTTAACAAGTGTGTCGACAAAAAT
CTAACGACAAACCAACCATAGATATTTAATTATAGATGAGAAGAATCTGTAACGATAGGAATATAAAAAATTGTAATCC
AAATATGGAAAAAAGAAAACAAATCAAGATGTGGTACACTATGTTAAAATGATATATTAATTAAAAGTTTATAAATAAT
TATAAAAAAAAAAAAAAAAAAAGCAAC
This EST, plus genomic matches, plus human match, plus a Drosophila EST all
indicate that the N-terminus of CG5926 is missing
**********A. mellifera BB270029A10G1.F
TTTTTTTTTAAATATGGAATATATATACCGTTTACCATTTTTAGCATTAGAAGTACCTAACTTAAAATTAAAGAAGCCA
TCATGGTTTGTGAAACCTAGTGCTATGATAGTTTTTTCATTTATACTTTTATCATATTTTCTAGTAACTGGAGGTATAA
TATATGATGTAATTGTGGAACCACCTAGTGTAGGCTCAACAACAGATGAACATGGCCATACAAGACCTGGAGCATTTAT
GCCGTATCGAGTAAATGGGCAATATATTATGGAAGGATTGGCATCTAGTTTCCTTTTTACATTAGGTGGAATTGGTTTT
ATAGTATTAGATCAAACACATAATCCATCAACACCTAAGCTTAATAGAATTCTTTTAATATGTGTTGGATTTATTAGTG
TTATTGTCTCATTTATTACCTGTTGGGTTTTTATGAGAATGAAACTACCGGGATATCTGCAATCATAAATTTAATAGAT
TTAAAATAATATGGCAAATTGAGCTTAAAACTCAATATTTTGTACTCATCAAAACTATTACATTTTGTATTAAATATGT
GGATTAGAAATGTGATTTATAAATAATATTATAAAAAAATTCTGTCAAATATATTTAAAAAAAAATAAATTCTTATTTT
TATTTAAAAAAAAAAAAAAAAAAAGCAAC
This and vertebrate matches, B. mori EST, and Drosophila EST, show there is an
intron in the annotation of CG9662 that is open and in frame and encodes
needed aa.
**********A. mellifera BB270029A20C10.F
TATGGTCGCACGAAACCGCTCATTTCCAGGACAATGATGAAGAACATTCTTGGCCAAGCTATCTATCAGTTGACTGTAA
TTTTTATGCTTCTTTTCGTTGGTGATAAGATGCTCGACATCGAAACAGGCCGAGGAGTTGCGCAGGCTGGTGGCGGTCC
AACGCAACACTTTACTATTATCTTTAATACATTCGTCATGATGACACTTTTCAACGAATTTAACGCTAGAAAAATCCAT
GGTCAGCGTAATGTCTTCCAAGGAATATTCACCAATCCCATCTTTTACACTATCTGGATCGTGACATGTCTATCGCAGG
TAGTTATCATACAATATGGTAAAATGGCGTTCAGCACGAAAGCTCTCACATTAGAACAATGGATGTGGTGCCTATTCTT
CGGAGTCGGTACTCTATTGTGGGGCCAAGTAATTACAACTATTCCTACGCGCAAGATTCCTAAAATCCTTTCATCCGCG
TAGTGAACGCATTTCGGCAGGGCCTAGACGCACGCTACACGAGCGAGCACAGCAGCACTACATTGGCGGAGGTACTGAG
AAAACAGTCGTCCTTAAGCAAGCGGCTCTCGCAACGAGCAGCATTGAATACGCCGATAACAACCCAGACGAACTGACCA
TACCCGAGATAGATGTGGAAAGATTGTCAAGTCACAGTCATACAGAGACCGCTGTTTAGAATGGGAGAAGATGGCGGGT
CAGCA
This and vertebrate and nematode and Drosophila EST indicate there is an
intron boundary problem in CG2165
**********A. mellifera BB270030A10C8.F
GAGTGGACAATTTCTCAGAGGTGCCTTTAGTAGTGAATCAGATTACTGTACACGATAATTTGCAATGGACAGTCAGACC
ACATTCCTCTGTACCATATGCATTAGACGAACCCATACAATCTGCTGGTTTGGTTCTAACTGCACCAGGAGGTGTGACA
GCTACATATGATTTAAATATACTTGGAGAGGGCAGAGTTCTGACTTATGAAAATTTTATTTATATTGCTTTTACAGGAA
CCTTTAAAAATGATTGTGATCTGGGAACAGGAGAAAGTGGTTTAGATCCTTTGGATGTTGAAACACAGAGATTAGTTTT
AGAAGCTGGTCCTGGTGGTTGGGTTGCTTTGCGCAGAAAACAATATGGAGCTAGATCTCAATTATGGAGAATGACTGGA
GATGGTCAATTACAACATGAGGGATCTAGTCCACCTAGAGATAAACATTCAAAAGTTCCAGAGACAGTATTAGTATTAG
ATATAGCAGGCACAGCACCTCAACCATTCACTTATTGTGCTCTTGCCTTAAGAAAGCCTGATCCCCGAAGGAGATCTAC
ACAAACTTGGCGATTCACAGATGATGGACGATTGTGTTGTGCACATAAAAATATGTGTGTGCAATCGAAAGATGGATTC
ATGGGATTACATGAAGGGAGTGAAGCAGTATTAGGTCCTCAACACATACTAATCCTCCTCCAATTGAACAGCGTGTTGG
TAGACAA
The N-terminal region of the translation of this EST has clear vertebrate and
Drosophila genomic matches, indicating that the annotation of CG11003 needs a
change.
**********A. mellifera BB270032A10D4.F
TGCAACCTTTTTTGCGAAAAATGTTGGAATTACAAATATTTCAACAATTTATAGAAGAAAGACTGAATATGCTTAATTC
AGGACTTGGTTTTTCTGATGAATTTGAAATGGAGGCTTGCAGTTATTCTGCTAAATCTGGTAGCAAATTTATGCAGCAA
TATCGAGAATGGACTTATACTATGCGAAAAGAAAGCTCTGCATTTTTCCGCAGTGTAAAAGACAAGGCCAATCCAGCTG
TCAAATATGCAGTTAAATCAGTAAAAGATAAAGGAAAAGATATGAAAACGGTATATAAAGGATTAAAATGGAAAGGACG
ATCAAATAGAAGCGATACAAGTTTGAGATTCCATCAACCAAGATCAGCACCTAGTTCACCTACATCGGATCGAAGGCCT
ATCGATTTTTCATCACCTCCAAAATCTCCAAATGGTTTTACTGCTACTACTAGTTATAGAAAAGATCTTCGAATACGTA
ATAGTAATTTCACCGATTCAAGCAGAAAACAATATTCACCATTAAGTCCTAGTTCACCAGAAGAATCTGATTTTCCACC
AGAAAGAGTAAATATTGATTTGATGCAAGAACTCCGTCATGTAATATTTCCGAACACACCTCCTGTTGATAGAACAGTT
TCTCCTGAAGTGCCAGATTTAATTAGATTAGATTCGACAACAAGTACTGAAGATTTCGATCCACTACTTTCTAAAT
This and a new Drosophila testes cDNA, and vertebrate matches, show that
CG18659 needs a C-terminus
**********A. mellifera Contig1366
GAATTTCTTACCACTTTTAAAATCGACATCTGTCATTTCAAATGGAATGCAACCAGTGACCGGAAGGGGTATTGTAGAA
AAAATAAAAATTATTACTAAATCTGACGTCAACCAGATTCTCGTTCAAAGTTCGCAAGAATCTCTTATTACAGGAGCTC
TACGAGTGAGTTCAGGAATTGGAGCAAGTTTATTAGCAACGCAGAGGAGGTTAGCACATACTGATATTCAGTGGCCAGA
TTTTAGTGATTATCGTCATGAAGCTGTACAAGATCCAAGAAGTAAAAGCAAAGAAAATTCAAGTAGCCGTAAATCATTT
GCATATGTTATGACAGCTGCAAGTGGAATTACCGGTGCTTATATAGCAAAATCAGCCATACATGATTTAGTAGCTACAT
TTAGTGCTTCAGCTGATGTACTTGCATTGGCAAAAATTGAAATAAAACTTGATGCTATTCCTGAAGGAAAAAGTGCTGT
CTTTAAATGGCGAGGAAAACCTATATTTGTACGACACAGGTCAAAAAAAGAAATTGAAAAAGAGGCAGCAGTTGATATT
AAGATTCTTAGAGATCCACAAGTGGATTTAGATCGTGTAAAGCAGCCACAATGGTTGATTGTTTTGGGTGTATGTACAC
ATTTAGGATGTGTACCAATTGCAAATGCAGGTGATTTTGGTGGTTATTATTGTCCTTGCCATGGATCTCATTATGATGC
TAGTGGCAGGATTAGGAAAGGACCAGCTCCATTAAATTTGGAAGTACCACCTTATGATTTTATCGACGATAATACAGTA
GTTATCGGTTAATGTATTAATATATGTAGTGAAATGATAAATGTAATTTATATCAAATTGTTTTACACAAATTCGATTT
TTATTTGTAAATTTTTTCCTAGAACCATCACATGTTGAAATGTTAAACTACAGTGAAATATAAATAAAAAACTTAAGTT
AATTTTAAAAAAAAAAAAAAAAAGCAAC
This and vertebrate matches show there is additional complexity to CG7361 -
lots of Drosophila ESTs?
**********A. mellifera Contig1440
GTCAATGTTAATTTAAATCGAAGTTTTTAATATTTTTGTGGCGTGGCTCGAATAAACTCTTAATATTCGTATATGTAAA
TCAGAAAATTATGTGGAAGGCAGTGTTGTTGATCGCAATCCTGTCCGCTGCAACGAATAAAAGTGTTGCAAACGACTGC
GTGCCACGAAGTTTCGGGACAAATAATATCGTATGCGTTTGCAACTCGACTTACTGCGACAGCACACCGGAACCGAAGC
CGAGCAGTCCAGAAAAAGGTACTTTTCACTGGTACGTATCGAGCAGAGATGGCCTCAGGCTGAGCTTGTCAAAAGGACA
AATGGGCCGTTGTCAAAACGATGGATCTTTAACCCTGAACATCGATACCTCGAAAAGATATCAAACGATCCTCGGTTTT
GGTGGCGCTTTTACCGACTCGGCGGGAATGAATATCAAGAATCTGAGCGAGGCTACTCAGGATCAATTAATCAGAGCAT
ATTTCGATCCGAAAGATGGAAGTAGATACACGTTAGGCCGTATACCAATAGGAGGAACAGACTTCTCTACGAGAGCCTA
TACATTGGACGATTACGATGACGATGCGACGTTGCAACATTTCGCGCTTGCTCCTGAAGATGTCGAGTATAAGATACCG
TATGCGAGGAAAGCTGTCGAATTGAATCCCGATTTAAGATTCTTTAGCGCCGCGTGGTCGGCGCCGACATGGATGAAAA
CTAATCACAAAATCAATGGATTTGGTTTCTTGAAGACCGAATATTATCAAACTTTTGCCAATTACATATTGAAATTCAT
AGAGGAATATAAAAAGAATGGAGTAGATATATGGGGTGTTTCAACTGGAAACGAACCATTCGATGCCTATATTCCTTTT
GAACGTCTTAACAGTATGGGATGGACACCAGAGCTGGTTGGCGATTGGATCGCTAACAACTTGGGCCCAACTTTGGCAA
ATTCCGAATACAACGCCACACATATTTTCGTTTTGGACGATCAAAGACTAGGATTACCTTGGTTCGTTAACGAAATCTT
TAAAAATGAAATTGCGAGAAATTATGTTTACGGCATAGCCGTACATTGGTACGCGGATATATTGATTCCACCGGTAGTA
TTAGATCAAACGCACAATAATTTTCCTGACAAAAATCTGTTGATGACCGAAGCATGTGAAGGATCTTTTCCATTGGAAA
AAAAGGTTGTGTTGGGATCATGGGAAAGAGGAAAAAGATATATATTAAGTATAACGCAGTATATGAATCATTGGGGAGT
TGGATGGGTGGATTGGAACATAGCTTTGAACAAAGATGGTGGACCAACCTATATCAATAATAACGTCGACTCGCCCATT
ATTGTAAATCCGGAAAATGATGAATTTTATAAACAGCCGATGTATTACGCTCTTAAGCACTACAGCAGATTCGTCGACA
GAGGATCGGTCAGGATTTTCATCACCGACACGATTGAAATTAAGGCCGCAGCCTTTATAACGCCCTCGAACGAAATTGT
GGTTGTTGCGTATAACGACAATAATGAAAAAACAAATGTGGTTCTGAACGATGTGACAATTGAAGATTATATTTGTTTG
GAATTACCCCCACATTCTATGAATACTGTAATTTACAACAAATAGAATACAACATATGAAATGAAAGATCATATGAATG
AAAAGCAACCGATGAAGTATATTACAAATTCTATCATGTTAAA
This encodes a full-length ~500aa protein with full-length matches to various
vertebrate and nematode proteins, and indicates that CG10299 may need to be
split into two or three proteins.
**********A. mellifera Contig1578
CGGACCGATGGTCGGCCTACGGTATCCCCACCCAAGTCCCGGCGGGGTACGGTTCCCCGGCAAAGCAACCGATTTCAGT
AGCAGGACAACAACAGAATGCAGCGTCTACTGGTAAAGTCCTGACTGGAGATTTGGATAGCAGTCTTGCTAGTCTTGCT
CAAAATTTGACCATCAACAAAAGTGCTCAGCAACAAGTCAAAGGTATGCAATGGAATTCGCCTAAAAATGCTGCCAAAA
CTGGTGGCTCAGCTGGAGGATGGACACCGCAACCTATGGCAGCTACAACTGGCGCTGGATATCGTCCAATGGGAATGCA
AGGTGTGCCAATAGGCATGCAAAGTATGCAAGGCATAAGACCTATGATGAGCACAATATCTGGTGGTCCTGGTAACATG
ATGGTCACAGGAGGAGCTGCACCAATGATGATGTCTAGTTCAAATTCGATGATGGGTACTAACCTTCAACAACAACCAC
AGCAGCAACCACAGCAGAATATTGCGCAACCACAAAATAATCAAGTTCAACTTGATCCATTTGGTGCCCTGTGAACAAA
TTAGGTATCTATTTGATATCTCCTTGTGAAAAAAAAATTATCAATAATAAAATATGGATTATGATTTTAATTAAGTAAA
ACATTATTTAATTGAATACTTAAAATATTTATAAATTTTGCCTCTTTACATTCCCATTTTTTGAATTTTATGAATTTTC
AGTTTGCGAATGAACAGAAAAATGTATTGATTTGGTAATGTTATTTTGAAAAATTTTTTTTCAGAAGTAATTATATATA
ACAGAATATAATTACTTCTGAAAAAAAAAGCAAC
This EST, plus vertebrate matches, plus an A. gambiae EST, indicate that there
are two exons missing near the C-terminus of CG2520
**********A. mellifera Contig1667
TTAAAAAAAATATTTATATTTTCCTATTATATTTGGTCAAATAAAACACAATTCTTTTGACAAAGAAAAATTATCAGAA
TTGTTTCATTGAAAGAAGTTTGCTTTGCAATTTGTAGATATTTCAATTAAGGGCAAGATTATTGGATTTTAATAAAATT
TTAAAAATGGAGGTGGATTATAGTAGCAATTGTGATGTTAAAATTCCAGAATGTAAAAAATTAGCAAGTGAAGGAAAGT
TACATGATGCTTTGGACCAATTACTAGCATTAGAAAAACTAGCACGAACAAGTGCAGATGTGGCATCTACATCTCGAAT
TCTTGTTGCTATTGTTCAAATTTGTCTAGAAGCAAAGAATTGGGCAGTATTAAATGAACACATAGTATTGTTGTCCAAA
AGACGTTCTCAATTAAAACGAGCTGTTACAGCAATGGTTCAAGAATGTTGTACTTATGTAGATAAAATGCCTGATAAAG
AAACCAAAATTAAATTGATAGAAACATTACGTACTGTAACAGAAGGAAAGATATATGTAGAAGTTGAAAGAGCAAGACT
TACTCATCGTTTGGCAAAAATCAAAGAAGAAGATGGAGATATTTCGGGTGCAGCAGCTGTTATGCTTGAATTACAAGTT
GAAACATATGGTAGTATGTCACGTTTAGAAAAAGCATCTCTTATTCTAGAAGCAATGCGATTATGTTTAGCaAAAAAAG
ATTTTATGCGCACTCAAATAATAGCTAAGAAAATTAATGTTAAATTTTTTAATGATGAAAATGATGAAGAAACACAATC
TCTTAAATTGAAATATTATGATCTAATGATGGAATTGGCTCGTCATGAAGGTTGGCATTTAGAATTATGTAGACATAAT
CGAGCAGTATTGGAAACTCCAGCAGTTAGAGATGATCCTGAAAAAAGACATGTTGCACTTTCACGAGCTGTTCTGTATC
TCGTACTTGCACCACATGAACCAGAACAAGCTGATTTGACCCATAGGTTGCTTTCTGATAAACTTCTTGATGAGATACC
AACATACAAGGAATTATTACGACTTTTTGTAAATCCAGAACTAATAAAATGGTCAGGACTTTGCGAAATTTACGAAAGA
GATCTTAAAGCTACAGAAGTTTTTAGTCTATGGACTGAAGAAGGACGTAAACGATGGGCCGATCTTCGAAATCGTGTTG
TCGAACATAATATCAGAATTATGGCAAAATATTACACAAAAATTACATTGACTCGCATGGCTGAATTATTGGATTTACC
AGTTGAAGAAACTGAAGCATGTCTATGCAGTTTAGTAGAAACTGGTGTGATAAATGCCCGTACGGATCGTCCAGCCGGT
GTGGTTCGTTTTACAGGAACCCAAGAGCCGGCTGCTCTTTTGGACGCATGGGCTGCATCTTTATCAAAATTAATGAGTC
TTGTCAATCATACAACTCATCTTATTCATCAAGAAGAAATGTTGGCTGTAGCTCAATCCTGAAAGATAAATTTTCACTT
TTTCTTTTTCCCTATTTTTTTTTCTCCTTTCTTTCTTCTCTTTGGATCAAAACTTGGTTAATTTTCTTCTTTATATAAT
CCTTTTACTTTTTTAAAAGTTATTTCCATAAAATGACTGCACATAAATAATTGTACACTTTCTATAAAACTTTAAAAAA
CTTTAAAAAAAACATTTAA
This long contig indicates that there might be an unspliced intron in the
Drosophila gene CG3294; it's a proteasomal subunit and the honeybee sequence
aligns well with all others, but Drosophila has a large insertion.
**********A. mellifera Contig1674
GTATACTTGTTGACCATCCCCGGCCTTCGACGTGCACCTGTCAGCGTCATGTATTCGCGTTCTCGAATGCGACCACGCC
GACTCTTCTCCACGATGCCTTTCGTGTTGTAAATAATACTGTGAAAAGAAACCTCAGACCGAAGATGGGCAAGCTTTTG
AGCCTTCTGGCTCGAGATGAGTCTACCTGTTGCACCCCTCAAAAGTACGACGTCTTTTTGGATTTCGAAAATGCACAAC
CTTCCGATATAGAACGGGAGACCTTTGAAGCGGTGCAAAGAGTTTTGAAAAATTCAGAATCTATTTTAGAGGAGATTCA
ATGCTACAAAGGTGCCGGAAAAGAAATCAGGGAGGCGATTTCGGCTCCCACGGAAGAGTGTCAACGAAAAGCTTACCTG
ACTGTTGCACCTCTAGTCGCCAAGCTGAAAAGATTCTATGAATTTTCATTGGAACTTGAGAAGGTAGTACCAAAAATCT
TAGGCCAACTGTGCTCCGGTAATCTCTCCCCGACCCAACATCTCGAGACTCAACAGGCATTGGTGAAACAGCTGGCGGA
AATTCTGGAATTCGTCTTGAAATTCGACGAGCACAAGATGAAGACACCCGCTATTCAAAATGATTTTAGTTATTACAGA
AGAACGTTGACCAGAGCATCTCTGGCGCGACAAGAAAGCGCTGAAAAGGACCTCGTGGTCGGGAACGAGCTCGCCAACC
GAATGTCCTTGTTCTACGCCCACGCGACACCCATGCTTCGTGTTCTGAGTCACGCGACCATTACTTTCTTGATGGACAA
CGAAGATGTAGCACGTGAAAATATTACTGAAACTCTTGGCACCATGGCCAAAGTTTGTCTGCGCATGTTAGAAAATCCG
AATTTATTGGCGCAATTCCAACGAGAAGAAACTCAACTCTTTGTTCTGAGAGTGATGGTAGGGTTAGTGATCCTTTATG
ATCATGTTCATCCCCAAGGTGCCTTTGTTAAGGGTTCGAATGTCGATGTTAAAGGCTGTGTAAAGCTATTGAAGGATCA
ACCACCTTGCAAAAGCGAGGGTCTTTTGAATGCTCTTCGCTACACCACCAAGCACCTGAACGAGGAGAACACACCGAAG
AACATTAAGAACCTTCTAGCAGCATGATTACCAAAGCAGCGCGGATGCTGCATCAGAAGCTAAGACAAATTGAACGAAT
GACATCCATTCAAATTTTGTGTGTACGAGATCGTCGAAGCGACTTTATATGACACCAAACAAGCAATCTCCTGTTTCTG
ATCGTGGCTGGGTGGTGGACCATTAAAAAAAAAAAAACGTGTCTTTCGTAATAGAGTCTATAGCTGCCCGAAAACTGCC
GACTTCTACATACCTTCTGACGAAAAAAAGAAATTAAAAAAAAATAAATAAATAAATAAATAAATAAAAAAAACAAAAA
AAAGTAAAAAAGAAAAAAAAAAAAAAAAAAAGCAACGC
This contig, plus vertebrate matches, show that CG6487 and CG6491 should
probably be fused. There's a B. mori EST that agrees, but amazingly no
Drosophila ESTs at all.
\**********A. mellifera Contig1686
GATAAAAAGAGTATGCAAGCTTTTTTAAAAACTGGTAAATTAGGTCCTGGCGAGTTCAAAAAAGTTTCGAATTCACGTT
CAAAAGAAGAACGTAGTGGTCCTGCACCGCCATGGGTTGAGAAATATCGTCCAAAGAACGTAGAAGATGTTGTTGAACA
AACAGAAGTAGTAGAGGTATTACGACAATGTTTAAAAGGAGGTGATTTTCCAAATTTATTATTTTATGGTCCACCTGGA
ACTGGTAAAACAAGTACTATATTAGCTGCAGCTAGACAATTATTTGGTAGTCTTTATAAAGAAAGAGTATTAGAATTAA
ATGCTTCTGATGAACGGGGTATTCAAGTTGTAAGAGAAAAAATCAAATCTTTTGCACAACTTACAGCAGGTGGTATGAG
AGATGATGGAAAAAGTTGCCCTCCTTTTAAAATTATTGTCTTAGATGAAGCAGATAGTATGACTGGTGCTGCACAAGCT
GCACTTCGTCGTACTATGGAGAAAGAATCTCATAGTACTAGATTTTGTTTGATTTGTAATTATGTATCAAGAATCATAG
AACCTTTGACTTCTCGTTGTACAAAATTCAGATTTAAACCATTAGGAGAAAATAAAATTATTGAGAGATTAGAATATAT
ATGTAAAGAGG
This contig and human matches show that CG8142 needs an N-terminus and it is
available in the genomic sequence, with an N-terminal exon and intron. There
are embryonic Drosophila ESTs showing this.
\**********A. mellifera Contig1749
TCAAAATGGGAAATTGTTTGAAACGCGCTGGAAGCGGTCAACAGGACAATACCACTTTGCTGAGTAACAATCCTGATCC
TCCTACATTGACTAGCGGTTCTTTACAAGAAGGCCTTGGACCTCCGATACCTAACAATGAGGCTGTAACCTTTTCTTAT
GCACCAGTTTTTACAAGGGAACTTCATCTTCAACAAATTGGCATTGGTGTTAATCTAGGACCAGGTAGTGAAGAAGAAC
AACAAGTTAGAATAGCAAAACGCATAGGACTTATTCAACATTTGCCTATGAGAGAATATGATGGGACTAAAAAAGGAGA
ATGCGTGATATGTATGATGGAGCTGCAGGTGGGAGAGGAAGTGCGTTATTTACCCTGTATGCATACTTATCATGCAGTA
TGTATCGACGATTGGTTGCTGCGTTCTTTGACTTGTCCATCGTGCATGGAGCCTGTAGATGCAGCATTGATTAGTTCAT
ATCATCCAACCACTTAACACCAAGGAGATGGAGAATGAAATAGCTAATCGGTAATCAAACTATATTATCGCGATGAAAC
AGAGAAATTGGAAAAGATTATTAATTTTATTATCATATATAATGGCTTCTAAATAGCAAAAGGGTATTTTCTTCTTGTT
ATGGCTAAATTTGCCATTGTATTCAAATATATCTGTTCGCATGCTAATAAT
Judging by vertebrate and other matches, this contig encodes a full-length
protein that is unannotated in the Drosophila genome:
MGNCLKRAGSGQQDNTTLLSNNPDPPTLTSGSLQEGLGPPIPNNEAVTFSYAPVFTRELHLQQIGIGVNLGPGSEEEQQ
VRIAKRIGLIQHLPMREYDGTKKGECVICMMELQVGEEVRYLPCMHTYHAVCIDDWLLRSLTCPSCMEPVDAALISSYH
PTT
There's even a head cDNA for it.
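As a rough check on the conceptual translation given for Contig1749, the sketch below translates the first line of that contig from its first ATG using the standard genetic code. It is illustrative only and is not the method used to produce the annotations in this message: the function name `translate_from_first_atg` and the variable `contig1749_start` are made up for this example, and taking the first ATG as the start codon is an assumption that happens to fit this contig.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the sequence.

    Hypothetical helper: assumes the first ATG is the true start codon.
    Codons containing characters other than A, C, G, T are reported as 'X'.
    """
    dna = "".join(dna.split()).upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First line of A. mellifera Contig1749, copied from the sequence above.
contig1749_start = (
    "TCAAAATGGGAAATTGTTTGAAACGCGCTGGAAGCGGTCAACAGGACAATACCACTTTGCTGAGTAACAATCCTGATCC"
)

print(translate_from_first_atg(contig1749_start))
```

The printed peptide can be compared by eye with the start of the bee protein listed above. Translating the whole contig would need all of its lines joined into one string first.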
AE003844
TAACACCATCTTCTAACAATCGTCAACTTTCCGATGAAAATCAAGTGAAAATTGCAAAGCGAATTGGATTAATGCAGTA
CTTGCCAATAGGAACATACGACGGGAGCTCAAAGAAAGCACGAGAATGTGTTATCTGCATGGCTGAATTTTGTGTTAAT
GAAGCCGTACGTTATCTACCTTGCATGCATATTTATCATGTTAATTGTATAGATGATTGGTTGTTGAGAAGTCTAACTT
GTCCCAGTTGTTTAGAGCCTGTTGATGCTGCCTTGCTGACGAGTTATGAATCGACATAGCGTTATAAAAACATTAGCTT
ACAATTTTGCTGCTGTAATGTGTTTTGGGATAACAAAACCTTTG
GH26713.5prime
GAATAAATCAGGCTTTATTAAATCGAATCTAGTCTAATTTCAAAGAAAGTCACATTTAATGTTTTTTTTTTTTTAAATC
AACTAACTAAATTGTTTCTGTTTATTATGAAAGTTGTGTATACATATGTGCATTTTATATACATGCATGCGTACTTATT
AATTTAAGATTTCTTGGGGATTGGTACTAATTGGTACTGTATATTTAAATCTTCGAAAAACGCATGAAATGGGTAATTG
CTTAAAAATTAGCACTTCAGATGACATTTCACTTTTACGCGGCAATGACAGTCAAATCAGCGGGACACAGCCAGTGTAT
CATCAGGGAGAGCATTATCAACGAGAATTGTACCCTTCCACGTCGTCTTCGACAACGCTAACACCATCTTCTAACAATC
GTCAACTTTCCGATGAAAATCAAGTGAAAATTGCAAAGCGAATTGGATTAATGCAGTACTTGCCAATAGGAACATACGA
CGGGAGCTCAAAGAAAGCACGAGAATGTGTTATCTGCATGGCTGAATTTTGTGTTAATGAAGCCGTACGTTATCTACCT
TGCATGCATATTTATCATGTTAATTGTATAGATGATTGGTTGTTGAGAAGTCTAACTTGTCCCAGTTGTTTAGAGCCTG
TTGATGCT
M G N C L K I S T S D D I S L L R G N D S Q I
S G T Q P V Y H Q G E H Y Q R E L Y P S T S S S T T
L T P S S N N R Q L S D E N Q V K I A K R I G L M Q
Y L P I G T Y D G S S K K A R E C V I C M A E F C V
N E A V R Y L P C M H I Y H V N C I D D W L L R S L
T C P S C L E P V D A A L L T S Y E S T *
Drosophila protein is
MGNCLKISTSDDISLLRGNDSQISGTQPVYHQGEHYQRELYPSTSSSTTLTPSSNNRQLSDENQVKIAKRIGLMQYLPI
GTYDGSSKKARECVICMAEFCVNEAVRYLPCMHIYHVNCIDDWLLRSLTCPSCLEPVDAALLTSYEST
\**********A. mellifera Contig2388
TGGGATGTGAAAACACTTTCGGAATGCTGTAGGACTGATCATGGTTATACACCAGATTCTCGTGCTATTCGTTTTCTAT
TTGAAGTTATGTCAAAATATAATAGTGAAGAACAAAGGCAGTTCGTTCAATTTGTTACAGGTTCACCTCGATTACCAGT
AGGAGGTTTCAAGAGTTTAACACCGCCGTTAACAATAGTGCGTAAAACGTTCGATCCATCTATGAAAACAGACGATTTC
TTACCATCCGTAATGACTTGTGTTAATTACTTAAAACTGCCTGATTATACAACATTAGAAATAATGCGGGAAAAGTTGC
GAATAGCTGCACAAGAAGGACAACATTCGTTCCACCTTTCCTAGAAACGGATGGAAAACCGGCGCGCCATTTGCTCTTT
TGCTATCGTATTGTCCAGGTTGAAAAAATTCTTCAAACACCATTTAAAATATTTAAAAAAGTAACTGCGCGCGCGTTTT
CTTCAACTCAGTAATCGTATATTTGCACTTTATATGAGCTTTTTCGAAAATATCTTTTTCTTGATTTTAGTGAGAATTC
ATTTAAATAAATTTTAATTTGATGTTTATCTTTTTACTTATAAATATGGATTGTGATATATTTTCTCAAAAAAGTGCAA
AAATGACATTAGCTCTATACTATGAGTTTCCTCTTGTGGTATTATGAACCAGGTAGAATGGGAGCAGCCTTGTTCTAGG
AAACTTATCTTTGATGTGTTACAATAATTAATAAAAGTTGTAAATATAGTTTAATATGCGCAAGTCTATATAAATACTT
TATATAATAGTAACAAAAAAAAATGATATTGCTACGAAACTCTATACTAATTAAATATAATTTAATCTGAGAGGACGTC
TATTAGACATAAGTAAATTTTATATCAGTGGTGCATTTAGCATCGCCATAAATCGCCAATTTCATGCCTCACTGCAGAT
TGTAATGAAATCACAAGTCAAAATTGATCAAAAAAGAAAAAAAATACGCGCTAATAACGTAGCTCATGTTTCATCATCA
TAAAAATATAATAACTGCGAATATAATTGCGTGTTTACGATTATCGTTTTAATTTCTTAATCGATCTCTTTTGTTCGCT
GCTTAGTTCTCAGTATGCGAGCACTGCGCGAATTAACGTGCAAAACTCTTATATTTAAGTTTATTCAATGTATCACTAA
TTAAACTTATTTGATATTTTTATTC
This EST and mammalian matches indicate that CG17735 needs a C-terminus. Seems
to be indicated as a transcribed region, but not in translation?
\**********A. mellifera Contig2672
GGTATTATGATGGTTGTACGCGATGCCTGATCCCGTAACGCTGAGCAACAATGGTGACGTAGAGCTGTTCATCCTGGCG
GTCGTATGATCCTCGGCGGTCTTGTTCGGCAAATTGCCCTTGCTGGCGTTCAGCGCACTGTTTGTCGGGTCTTCGTCAA
TGTATGTGAAGTCAGAAAGTGGTCGGGGCCCAGTGTGGCTGCGTGAACTGCGCATACTGTGTAAGCTGTGTAAACTATG
GACAGAGCGACGCTCCTCCAAATCTTCTAGAGTCGACTTGAACGCTCTGATTACTCGAAGCTGTGTCTGTAGTCGTGTT
AGACCACGGATCCATAGAATTTGTCCTGCGCGCGGCTTTTTATCCGAGTCAGGGTCGAATTTCTCATCTCCTAGATTGA
TCGCACTGATATCATCCGGCTGGCCGCGGCCCCATGAAAGGATTTTAGGAATCTTGCGCGTAGGAATAGTTGTAATTAC
TTGGCCCCACAATAGAGTACCGACTCCGAAGAATAGGCACCACATCCATTGTTCTAATGTGAGAGCTTTCGTGCTGAAC
GCCATTTTACCATATTGTATGATAACTACCTGCGATAGACATGTTACGATCCAGATAGTGTAAAAGATGGGATTGGTGA
ATATTCCTTGGAAGACATTACGCTGACCATGGATTTTTCTAGCGTTAAATTCGTTGAAAAGTGTCATCATGACGA
This and vertebrate matches show that a longer C-terminus is needed for
CG2165, and it is available in two exons in the genomic sequence.
\**********A. mellifera Contig2806
AACACATGCAAAATGGAACCACTCTATAGGACAATCTGGATTATCACAACCTATCATTTCCCCATAAGAAACTTGATGA
CATAGGCAATAAGTTGGTTCGTTAGGATCAACTGGCATATCTAAAACATCTGCTGGATGACCAAGTGCAGTAGAATCTA
CTTGAGCACCACTTCCTACAGCTCCAGCTGATGAAGCAGAAGCAACAGATCCTCCTTTTTTCTGTTTTTTTCTAGCTGT
TTTCGATTCATCTTCACTGTTAGTACCTGCACCTTTCTTTCGTTTTTCTTTTTCTTTTAATTTTTTCCTGCCCTTTTTA
CTAGCATTATTTTCTTCTTGTGCCCTACTACTATTTAAAGCTTTATCTTGTATTTCAGCTTCAAATCTAGCCAAATCAG
AATCTAGTCTCCTAATATGTTTATCAACTAATTCATATGTTTGTATTGCTAATTGTACTTTATCATCACCATATTCCTT
TGCCTTGTTAAATAAGTTTTGAATATGAGTCAATTGTTCCTTTTTCTTTTCTGGGGATTCTTTCTTTACATTTTTTAAA
TAATCATCTGCTAATTTATCTATATCTTTCATTAATCCTTGTGCTCTAGCATCAAGGTCTCGCATTAAAGTGAAATTTC
TTTGTAATTCAATAGGTAGATGTTCCAAACTGTCTAAATAATGTTCTAAATACAGTGCTGTTGTCATGTTTATTGTTAC
TTAAAGAAAAA
This EST, and vertebrate matches, and even Drosophila cDNA LD46333, show that
CG9293 has three open introns retained that need to be spliced out.
\**********A. mellifera Contig321
TTGCTCCTTTTAGGATTGGATAATGCAGGAAAAACAACAATTTTAAAATCATTGGCCAGTGAAGATATTACACAGGTAA
CACCGACACAAGGATTTAATATAAAAAGTGTTCAAAGCGAAGGTTTTAAATTGAATGTCTGGGACATAGGAGGTGCTCG
AAAAATTCGACCTTATTGGCGAAATTATTTTGAAAATACAGATGTTTTAATATACGTCGTGGATAGTGCGGATGTAAAG
AGATTAGAGGAAACGGGTCAAGAACTATCAGAACTTTTATTGGAGGAGAAATTGAAAGGTGTTCCATTATTAGTTTATG
CGAATAAACAAGATCTTGGACAAGCAGTCACAGCAGCAGAAATTGCCGAAGGCCTCGGATTACATAATATCAAAGATCG
CGATTGGCAGATACAATCGTGCATTGCTATCGACGGGAAAGGCGTGAAGGAGGGTCTTGAATGGGCATGCAAAAATATC
AAAAGAAAGTAATATTGGCACAAGTGTCTTAAAAAATCGAACAGCTTACAATGTACTCTCATACATTCAACATACTTTT
TATTTGGCTTTTATCTAATATTTACAAAAGCTGTTAACAGACAACGTAAACGAATGGATTTACTAGTCAATTAACTTCC
GTCCTTCATGGAAGGAGCTTTTCTAAAGCTCTAATTTACGATGCCTTAATGAATACGATATCTATTATA
This and vertebrate matches, and cDNA AT01916 indicate that there is an open
intron that needs to be spliced from CG6560.
\**********A. mellifera Contig457
ATATATACAGTTGCGACACAGTTGCCGGTGCGACACAACGCGATATTTAGTCGAGCGATATGCCGACGAGTTTGTTCAG
CGAAATGGATCGCGGAGCTAATGGCGGTGGAACTGCTTCCCTCGAAGACCAGAGAGCCCTACAGTTGGCATTAGAATTA
TCTATGCTTGGTCTCGAAGGTACTCCCGGATGTCCGGGCACTGGTACAGGAACCGGCACCGGAACGGCCAACGATCCGG
ATCCTTTGCAAACGACACCGGCAGGCGTTTTCGAGGAAGCACGTTCTAAGAAGAGCCAGAATATGACCGAGTGCGTGCC
GGTGCCTAGCAGCGAGCACGTGGCAGAGATCGTCGGCCGACAAGGTTGTAAGATCAAAGCGCTCCGAGCGAAGACCAAC
ACCTACATCAAGACGCCGGTGCGCGGCGAGGAGCCGGTGTTCGTGGTGACCGGGCGCAAGGAGGACGTGGCCCGCGCGA
AGCGCGAGATACTGTCGGCCGCCGAGCATTTCTCGCAGATCCGTGCCTCCCGCAAGAGCTCGTTGGGCGCCCTGTTGGG
CGCGCCCCCCGGCCCGCCAGCCTCCGTTCCGGGCCACGTGACGATTCAAGTACGAGTCCCGTACAGAGTAGTCGGGCTT
GTGGTAGGACCGAAGGGCGCGACCATCAAGAGAATCCAGCACCAGACCCACACCTACATAGTGACGCCGAGCCGCGACA
AGGAGCCGGTGTTCGAGGTGACGGGGCTACCCGAAAGCGTGGAGGCTGCGAGGCGCGAGATCGAGGCACACATAGCGCT
TAGAACCGGCACAGGAACCACCCTCGACGATTCGGAACTGCTAAGCGTGCTCTGTCGCGGTGGCCTGGGCTCGATCCTC
GGTTGCCTCGACCCACCCGGCTCGAACGGCTCGAACGGATCCAGCGGAGCGTTCTCCAGCAGCGGCAGTTGCAGCAGCT
CGTCCAGCAGTTCCGGCGCGCCCGGGCTCAACGATCTCGTCGCGATTTGGGGCGCCGGCATGGAGAGGGACGAGGGCCT
GGGGGAGTCGCCCTCCTTCGAGTCCCAAACGGCGTCCGCCTCCTCGATCTGGTCGTTCCCAGGCGTCGCGCTACCCTCG
AGGCCATCCCCGCCGGCGTCAGCGAGCCCGACGTCGCCCACGGACTCGCTGCTGGGCGGTGGACGGCGGGAATGCGTGG
TGTGCGGCGAC
This and a vertebrate match indicate that at least two exons are missing from the
N-terminus of CG11360, and they are encoded in the genome.
\**********A. mellifera Contig49
TCAACCACCAACTCAAGTTACAACTTTAGATTGTGGTATGAGAATTGCTACTGAAGATAGTGGGGCACCAACAGCCACA
GTTGGATTGTGGATTGATGCTGGTAGTCGTTTTGAAACTGATGAAAACAATGGAGTTGCCCATTTTATGGAGCATATGG
CATTCAAAGGAACTACTAAACGTTCACAAACTGATTTAGAATTAGAAATTGAAAATATGGGTGCTCATTTAAATGCATA
TACAAGTAGGGAACAAACAGTATTTTATGCTAAATGTTTGGCAGAAGATGTTCCAAAAGCTGTTGAAATTTTAAGTGAC
ATCATTCAAAATTCAAAACTTGGCGAAAATGAAATAGAAAGAGAACGTGGTGGTATTTTAAGATAAATGCAGGAAGTTG
AAACAAATCTTCAAGAAGTTGTTTTTGATCATTTACATGCTAGTGCTTATCAGGGTACACCATTAGGAAGAACAATTCT
TG
This and vertebrate matches and many ESTs show CG3731 needs N-terminus, and
there is an exon in the genomic sequences
\**********A. mellifera Contig663
AGGAGGTAATGTTTTTAAACGAACTCGAAGAAATTCTTGACGTGATCGAACCTGCGGAATTCCAAAAAGTTATGGATCC
GTTGTTTAGACAATTAGCGAAATGTGTATCATCTCCGCACTTTCAGGTCGCTGAAAGAGCTTTGTATTACTGGAACAAC
GAATACATAATGTCACTTATATCGGATAATTATTCAGTCATTCTACCAATTATGTATCCAGCATTCTATAGAAATTCCC
GAAATCATTGGAACAAAACTATCCATGGTTTAATCTATAATGCATTAAAGCTTTTTATGGAGATGAATCAGAAAGTATT
CGACGAGTGTACTCAACAGTATTATCAGGATCGACAAAGAGAAAGAAAACTTATGAAAGACAGAGATGAAGCGTGGATG
CGCGTTGAAGCTCTGGCAATGCGACATCCAAATTACAACGCGACTATAAAAGGTATTACGAATACAACAATTGGCACAA
TATCGCAACAACAATTGGACAGCCCTCCGCCCGATGAAGATGGTGATACTGATCAGACACCGCTTACATTGGAAAAAAT
AGAGGCGAAAGCAAATGAGGCAAAAAAAATGACGAACTCTAACAAAACAAAGCCACTTTTACGGAGGAAAAGCGATTTA
CCCCAAGACACGTATACAATGCGGGCATTATCTGATCACAAACGTG
This and vertebrate matches show that CG7913 needs a different C-terminus, and
it's available in the genomic sequences.
\**********A. mellifera Contig75
CCAAATTCTACCCATGCAAGTAGTAATTCTTGTTTTGAATTTTGACGATTTTTTACAGTATAATAATGTTTCCTTAACC
GCCATCCCATATCAGGAATGGCTTTACAAACTTCAAAATTTGCATCGCTACGTAATGCTAATGACACCTGTTGCAAAGC
TATTTCATGTAATGGTGCTGTATAATTATCTGGATCTAAGAATTTTTTCATAGGAAGTGATGAAGCCAATATAGGATTC
ATTAGTATATTGTTTAAATAATTCTGTAGGGCTATTTGACGTTGAGCTATGAAGTCTGGTTCCATATTACCAATGATTT
TCTTTGGAGGAAATGCCAGATCAATGCCAGATATTGATAAAGCTGCATTAAGCTGTACAAAGTCATTGTAGCGTCTGCT
AACTCTCCAAAATTTTTCAGAAAGTGGACCTCTTTGTGTTCTAATCACATATTCCGTATGTCCGTCGATTGTTCTTGCA
TTTTCAATGACGCTCGTCAGCTTTTCTGTATCATCTAACAGCACTTTATTTGTATATCGTTTCTCAAATAAAGCCATTG
ATCATTAGTCCAACAAAATGCAGAGCCCGTGCGGGGCGAACTGCGTATGGCTAGCGCATATTTTTCGTAAGTC
This and vertebrate matches, and cDNA LD23236 show that CG8726 needs an
N-terminal exon, and it's there in the genomic sequence.
\**********A. mellifera Contig94
GGAATATTAAAGATTGTTTTCCAGATCTTCAAACGCAAGTTAAACTTAAATTACTTCTTTCATTTTTCCACATACCAAG
ACGGAATGTCGAGGAGTGGCGTGTTGAATTGGAAGAAATCATTGAAGTGGCCTCATTGGATAGTGAATTATGGGTATCA
ATGCTATCCGAAGCGATGAAAACATTTCCATCAACAGGTTCTTTGAATACAGACATTACAGATTTAGATGAACATAGGC
CTATTTTTGGAGAACTAGTCAATGATCTTCGAAAACTTTTGAAAAAACAAAATGATCCAGCTATGCTTCCATTAGAATG
TCATTATCTTAATAAGACTGCTCTTACTTCTGTGGTTGGCCAACAACCTGCACCAGTTAAGCATTTTACTTTAAAGAGA
AAACCAAAAAGTGCTGCATTAAGAGCAGAGTTACTTCAAAAAAGTACAGATGCTGCAAGTAATTTAAAAAAAAGCACTG
CTCCTACAGTACCTGTAAGAAGTAGAGGAATGCCTAGGAAAATGACTGACACAACACCTTTGAAAGGCATTCCCAGTAG
AGTCCCTACAAGTGGTTTTCGTTCTCCCTCGCTTACGAGTTCTTCAATGTCTAACAGGACACCTC
This and vertebrate matches, and cDNA SD13146 show that CG5874 needs at least
two additional exons
\**********A. mellifera Contig950
AACGAAGTTGGTAGTTTTGTAATGAGTTCTTTGCTTTCAAGGTAGAATTTAGGAAAGGATAAATTATTTTTTCGTAGTA
TTCTTGAGAATTTATATTAAAAGATGGAGGACGTTCTCGAAGAGGTTGTTTCTTCTGATGATTTAAAGAAATTTGAACG
TATATATAATGAGCAATTACGTTCATCAGTAATAACACAAAAAGCTCAATTTGAATATGCATGGTGTCTTGTTAGAAGC
AAATATCCTGCAGATATCAGAAAAGGAATAATGTTATTGGAAGATTTATATTGCAATCATAGTGATAGCGAAAAACGGG
ATTGTCTTTATTATTTAGCCATTGGAAATGCTAGAATAAAGGAATATACAAAAGCTTTAGCGTATGTCAGATCGTTTCT
TCAGGTTGAACCTGGAAATCAACAAGTACAGCATCTGGAAACATTGATCAAGAAAAAAATGGAAAAAGAGGGACTTTAT
GGTATGGCCATTGCAGGAGGAGTTATTATTGGTCTTGCAAGTATTCTCGGCCTCAGCATTGCTATGGTCAAAAGAAACT
AATCATTTCATGTGAAAAAATCTGTAACTATTGTGTGATGTGAAATAGTTGCTTTGTACAACGCGTTATATATATTATT
TATTGATTGAAATA
This and vertebrate matches show that CG17510 needs an N-terminal exon,
present in genomic sequences.
\**********A. mellifera Contig982
GCAAGTGTCAATTTGATTATTGTCATCGTTTCTGTGACTATCGATAGTTCTGTAAGTTTGGAATTTCTACCGATGATCG
ACAAATTGCATATACGTATTTCTGTGGGTGCCGTGAGGCGGCGAGAAAGGCCACCTCTTTAAATTAAAAGAAGGAATGA
CGCTAGAGAATCCGTTCTTTGTCGTCAAGGACGAAGTTTGTAAAGCTTTAAATAAAAATCGCGGTTTATACGGGCGCTG
GACGGAATTGCAGGATGTTGTCACGAGTCCTACTGTGAGTGGGGGAATCCCAATCTCACGCGACGAATTAGAGTGGACT
ACTACTGAATTACGGAAAGCTTTACGTTCAATCGAATGGGATCTTGATGATTTAGAAGACACAATTTGTATTGTAGAAA
AAAATCCAACAAAGTTTAAAATAGATAACAAAGAATTAACGGTTCAACGAAGTTTTATCGAACAAACTCGAGAAGAAGT
TAAGACCATGAAAGATAAAATGAATTTAAGTAGAGGTCGGGATCGTGATAACACAGCAAGACAGCCACTTTTAGATAAT
AGTCCTGCTCGAGTTCCTGTCAATCATGGCACAACAAAATATAGCAAATTGGAAAATGAAATTGATAGTCCAAATAGAC
AATTTTTAGGAGATACCTTACAACAACAAAATGATATGATGAGACAACAAGATGAGCAACTAGATATGATAGGTGAAAG
CATTGGAACATTGAAAACAGTATCTAGACAAATCAATACTGAATTAGATGAACAAGCAGTTATGTTAGATGAATTT
This and vertebrate matches suggest that CG7736 needs different C-terminal
splicing.
\------------------------------------------------------------------------------
--
Other Information
    Language of Publication
    English