FB2025_01 , released February 20, 2025
Reference Report
Reference
Citation
Robertson, H. (2001.5.6). FlyBase error report on Sun May 06 00:40:30 2001. 
FlyBase ID
FBrf0136023
Publication Type
Personal communication to FlyBase
Text of Personal Communication
Subject: more bee-suggested annotation improvements
Dear Gillian,
Early this year I sent to you, via Sima Misra, a set of unannotated genes,
plus some annotation improvements, suggested by some of the 15,000 honey bee
brain ESTs we have. I've been working on this more, and have about another 40
such instances, only 2-3 of which are completely unannotated genes in
Drosophila. There are many more instances like these and I will try to get
them to you soon. For now the attached WORD MAC TEXT file has a simple format
of each bee EST sequence with a short note about what I think it suggests, but
I leave working up the details of the needed changes to the Drosophila gene
annotations to you folk.
Hugh
Hugh M. Robertson
Professor
Department of Entomology
University of Illinois at Urbana-Champaign
--------------------------------------------------------------------------------
More interesting honey bee things, found by actually comparing TBLASTX and
BLASTX scores;
wherever the former is much higher than the latter, one might expect that
BLASTX found a member of a gene family, while TBLASTX found the real
ortholog.
Also, I am now comparing BLASTX against nr with all of these searches of the
Drosophila genome, and those comparisons turn up additional annotation
improvements.
I've not worked any of these up in detail; I simply note the honey bee EST that
provides the data, and give some indication of the problem.
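The score comparison described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: it assumes the best BLASTX and TBLASTX e-values for each EST have already been collected into a dictionary, and the function names, the ten-orders-of-magnitude threshold and the second EST entry are all hypothetical. The e-18 and e-60 values are the ones quoted for BB260004B20D3.F.

```python
import math


def neg_log10(evalue):
    """Convert an e-value to a positive score; cap e-values reported as 0."""
    return 300.0 if evalue <= 0 else -math.log10(evalue)


def flag_candidates(hits, min_log_gap=10.0):
    """Return (est_id, gap) pairs where TBLASTX beats BLASTX by a wide margin.

    hits maps an EST id to (blastx_evalue, tblastx_evalue).
    min_log_gap is an arbitrary, assumed threshold in orders of magnitude.
    """
    flagged = []
    for est_id, (blastx_e, tblastx_e) in hits.items():
        gap = neg_log10(tblastx_e) - neg_log10(blastx_e)
        if gap >= min_log_gap:
            flagged.append((est_id, gap))
    # Largest discrepancy first: these are the most likely annotation problems.
    return sorted(flagged, key=lambda item: -item[1])


hits = {
    "BB260004B20D3.F": (1e-18, 1e-60),   # e-values quoted in this message
    "hypothetical_EST": (1e-20, 1e-21),  # invented entry, barely better in TBLASTX
}
for est_id, gap in flag_candidates(hits):
    print(est_id, round(gap))
```

Only the first entry is flagged here; an EST whose TBLASTX score is barely better than its BLASTX score falls below the threshold.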
***************A. mellifera BB260004B20D3.F
AGATCGCCCTCACGACAACATCCCCGTGGCTCCTCTTCTTCGACCTGGTCGCCATTTACAAAGACGACCCGGACCTGAG
GCTCTGCCTCGAGGTGTGGCCGCGCCCGAAAGACGAAACCCTCTTCTTCCTGATCGGCAATCTGACCCTGTGCTACGTA
CTACCCACGATCCTCATCTCCCTCTGCTACATATTGATCTGGATCAAGGTGTGGCGGAGGCACATACCCTCCGACACGA
AGGACGCCCAAATGGAGAGGATACAGCAGAAGTCGAAGGTGAAGGTGGTGAAAATGTTGGTCGTGGTCGTAATACTGTT
CGTCCTCTCGTGGCTACCCCTCTACGTCATCTTCACTGTGATCAAGCTGGGCGACGAGCAAAGGGAGGACGAGATCGTC
CCCATAGCAACGCCGATCGCCCAATGGCTGGGAGCGAGCAACTCGTGCATCAATCCGATCCTCTACGCCTTCTTCAACA
AAAAGTATCGGCGAGGCTTCGTCGCGATTCTGAAG
Very unusual example. The EST has two forward ORFs. One matches the
translation of CG10823 at about e-18, but there is a TBLASTX genomic match at
e-60 for the other.
It turns out that the latter is the correct translation, matching as it does
tachykinin receptors in vertebrates, and the Drosophila annotation of CG10823
is using the wrong reading frame!
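To see how one EST can carry more than one forward ORF, it helps to translate it in all three forward frames and compare the longest stop-free stretch in each. The following is a minimal sketch assuming the standard genetic code; the function names are illustrative and it is not the tool used for the searches reported here.

```python
from itertools import product

# Standard genetic code, built in TCAG order (the usual NCBI table 1 layout).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate seq in forward frame 0, 1 or 2; unknown codons become X."""
    seq = "".join(seq.split()).upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def longest_open_stretch(seq):
    """For each forward frame, return the longest peptide free of stop codons."""
    return {
        frame: max(translate(seq, frame).split("*"), key=len)
        for frame in range(3)
    }


# The start of EST BB260007B10E5.F from this message, read in frame 1.
print(translate("GCTACAAGAAAAAGTATTAGCCGAAACATCA", 1))  # LQEKVLAETS
```

When two frames both give long open stretches, as reported for this EST, searching each peptide separately shows which one has the stronger and more plausible matches.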
***************A. mellifera BB260006A20B4.F
TGACTGAAGAAACACAACAGTCTGAGACTGCCGCACAAAATGAGGCACAAACTAGTTCTCCAGATGTTGGAAAAGTTAA
AGATAGTAAAAAAGAAAAATGCCGACCTATGACAAAGGTAGTAATACGGAGATTACCTCCAACTATGACTCAAGAACAA
TTTCTAGAACAGGTTTCTCCATTGCCAGAAAATGATTATCTTTATTTTGTGAAAGCTGATATGTCTATGGGACAATATG
CTTTTGCCCGTGCTTATATTAACTTTGTTGAACAACAGGATATTTTTATGTTCAGAGAAAAATTTGATAATTATGTATT
TATCGACTCTAAAGGTACAGAATATCCAGCTGTAGTAGAATTTGCACCTTTTCAAAGATTACCAAAAAAAAGAACAGGA
AAAAAGAAAGATTTAAAATGTGGTACAATAGAATCAGATCCTTATTATATAAGCTTCTTAGAAACTCGTAAAAATCAAG
AAGCTGAATCTAATATATCACAACCAAAAACAGAATACTCATATCAACCACCTGATAATACACCAAAAAAAATTACAAC
CACTCCTCTTTTGGAATATGTAAAACAACGTAAACAAGAAAAGCAACGTCTCAGAGATGAAAAACGTGAAGAGAGACGA
CGAAGAGACCTAGAACGGAGGCGAACAAAAGAAGATCCTATCATATCTAAGGTATTGAAAATCAAGATCTTGATAAAGA
AATGTGTAAAGATTATAAAGAAATAGGGAAGAAAAGATAAT
Shows that the N-terminus of CG11184 is missing
***************A. mellifera BB260007B10E5.F
GCTACAAGAAAAAGTATTAGCCGAAACATCAATAAAAAAAGAAAATAAAGACAACACTGCAAGTGGTGGCGAATCTTTA
GAAACGGTTTTAGAAACAGAAACAACTGAAAATGTAGAGGAAACCTTTAATGATCAAGAAAGTATACCTTATTGCGTGA
AGACTAAACCTTATTATTCTATTAATGCGAAATTAAATACAAGAGAAAAAATTAATATTCCTTCCATTAATGCTTCCAC
CAACAAATATGTTGAGAAAACATATTCGGAAAGTGGAATAGATGAGAGTACACCTGCTTTAGAAGATCAAATTACTCCA
ATAGAAGATACAAAACCTCCTCCTGTTTGGGATGCGAATTTTAAATTTCTTACTAGCTCTGTGAGCGGTAAAAGTTCTC
AGAATTATAATAAATCTAGTGTTACTTGTCTTATTTGCGGAAAGCAATTAAGTAATCAATATAATTTGCGAGTACATAT
GGAAACTCATAGTAATAGTTCATATAACTGCACAGCTTGTTCACATGTATCAAGATCACGAGATGCCCTTAGAAAGCAC
GTTTCTTATAGACATCCAATGGCCTCGCCACAAAAACGTTCACGTTATAGTACACCGAAATCCTAAAAAATGGTATCTT
TAATCTTCGATTAAATCTTTGATCTAAGAATAGGACGTTTTAATTAAAACATAAAACTTTGTGTAATTTAATGCCCAAA
TTTTTAACGGTTAATGGTCTAAGGAAATGCCGATTGATTTAAAAAAAAATAAAAGCCA
Encodes the C-terminus of a >200aa protein
LQEKVLAETSIKKENKDNTASGGESLETVLETETTENVEETFNDQESIPYCVKTKPYYSINAKLNTREKINIPSINAST
NKYVEKTYSESGIDESTPALEDQITPIEDTKPPPVWDANFKFLTSSVSGKSSQNYNKSSVTCLICGKQLSNQYNLRVHM
ETHSNSSYNCTACSHVSRSRDALRKHVSYRHPMASPQKRSRYSTPKS
BLASTX match is weak at e-06 to multiple four-cysteine repeats within CG2889;
but the genomic e-15 match is to a single four-cysteine repeat interrupted by
an intron; the N-terminus doesn't align.
Genomic match is to a 7kb unannotated region - I try to reconstruct the gene
below: no ESTs to help.
I think there are two more exons upstream, with excellent splice sites, but
they don't splice in frame to this below, so I am unsure what's going on in
the absence of cDNA.
ATGCCAGACAACCGCAGCATCATCATTTTGATCAGCAGCTGCAGCACAATGACAGCCAACAGCAATTCTGCCTGCGATG
GCATAACCATCAGgtgagatcctccagataaaattcatttgaattttcatttgtggggggactatattcaacattacct
cataagtaagatccattagaatggactggacatcgttcagtgaagcgagcaggccgaaagtaaaatgtcgccgcatcta
tatataggatccatctatccaaacacattcccctatacgtacagattcacacatagttttccgcattttcatatagtga
caatttttgtgtttcggcccaaagggtgtaatgcggtgaaagcggggcagcggtgaaactggctgtccaaaatagcaaa
acagccaaatggacaaagagtagtgaggaggagtgttcagtgtgtgtgtcacataaatgaacagattcgtgcaattgcc
aaaatccaaagaatttccacagcgagtaaatcgaaaagttggaaagcagctgcagccaacccctgtcccctgttacacc
gcagacagaggagcatcatggggcccaggaaattgtatttttatgatcgtgtgacccacttatgagcacttttcacatc
cccaaccccattccctctgatcccttccagACAAGTTTGCTGAGCACTCTGCCCATTCTGCTGGACCAATCCCATCTGA
CGGATGTGACCATTTCGGCCGAGGGACGCCAACTGAGGGCCCATCGCGTGGTACTGAGTGCCTGCAGCAGCTTTTTTAT
GGACATCTTCCGGGCTCTAGAGGCCAGTAACCATCCAGTCATCATCATACCAGGGGCCAGCTTCGGCGCCATCGTCTCA
CTGCTCACCTTCATGTACTCCGGAGAGGTGAATGTATACGAGGAGCAGATACCGATGCTGCTCAACCTGGCCGAGACAC
TGGGCATCAAGGGACTGGCCGATGTCCAGAACAACAATgtgagtagctcaagtgcaagatctagttagataatttaaat
aaacttgtagTTTTTCCCTGAAACTCATACGCCTTTCTCTTTGGTCTCATAAATCCAAGCAGTTACCAAAAACAGCGAG
AAGTGGAGGTGGCTCCTAC
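The reconstruction above appears to write exons in upper case and introns in lower case. Under that assumption, a short Python sketch can pull out the introns, check that each begins with gt and ends with ag, and join the exons. The function names are hypothetical, and lower-case sequence at the very start or end is treated as flanking sequence rather than as an intron.

```python
import re


def find_introns(seq):
    """Return lower-case runs that sit between two upper-case (exon) runs."""
    seq = "".join(seq.split())  # drop line breaks from wrapped sequence
    return [m.group(0) for m in re.finditer(r"(?<=[A-Z])[a-z]+(?=[A-Z])", seq)]


def check_splice_sites(seq):
    """Return (intron_length, has_gt_ag) for every intron found."""
    return [
        (len(intron), intron.startswith("gt") and intron.endswith("ag"))
        for intron in find_introns(seq)
    ]


def spliced_exons(seq):
    """Concatenate the upper-case (exon) bases only."""
    return "".join(ch for ch in "".join(seq.split()) if ch.isupper())


# Toy example built from the two ends of the first intron shown above.
toy = "CATCAGgtgagatccttccagACAAGT"
print(check_splice_sites(toy))
print(spliced_exons(toy))
```

The spliced exon string can then be translated to test whether the exons join in frame, which is the question raised for the possible upstream exons.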
And not even sure about the C-terminus, although it does align somewhat
AE003538.2
ATGGATACGACGAATGAGAAGTCCTCTGAGTTTGAACGTCCCACCACACCCTCGCCCACGCCCACCCCCACCCTTACGC
CCTCCCACACGCCTACCCCAAGCCATTCACTCCCCCTGCCCCAGCTACCAAGTGCAGCACTTAACACCCCTCTCCTGGC
CAACAAACTGGGATCGGTGAACTCCAGTGGAATGGGCACCACGCCCCTGGAGAATCTCTTTAAATCACTGCAGTTCTAC
CCCAGTCTGCTGCCCCAACCACTCAACTTCTCGCAGACGGCGCTCAACAAGACCACTGAGCTGTTGGCCAAGTATCAGC
AGCAATGTCAGCTCTACCAGAGCGGGATGCAGGAGGATCAACTGGAGACGGACTGTTTTGGCTCCAAGAGGCTAAAGGG
CGACAGTCCGCCGAAGGAGCTTAGGCGACTGGAAAAAAGCCTTTTAAAGAATCCAAAATCCTCATCGACGACCAACAGT
AGCAGTAGCAAGTCGCCTCAGGAATGCTCCAACCCGAATCCCATTGTGGCCACTTCACCAGTAACTCTCGCTCCCCCGA
CCATGGCACACTTCTCCCCTCAGTTGCCGGTGGTCAAGTGCTCCTCGGCTAGTTACCCCAGCGCTCTCGGCCAAGGTCA
ACTCTATAGTAGCAAGCCACCACTTTATAGTGCAGCAGTCACGCCGACTGCCGCCCAGCAGGCGGCGCAGATGCACCAC
CACCCGCAACCCGGGCCATCGCCCTACATCTCGGCCGAGGATCATGCCAAGCTGCAGCTCCACATCGAACAGTATCAGC
GGGAGGCGGCAGCAGCAGCGGCAGCGGCCGCCGGCGGAATGGCGCTGGTCAGCGCCAAGTCGGAGCCCAATCTGCTCTC
GCTGAGCGCCGACCGCGACAAGTCGCTGGCCACCGCTCCCATCAAGCCGCCGTCCAACTCGAAGCTCTATGCCACCTGT
TTCATCTGCCACAAGCAGCTGAGCAACCAATACAACCTGCGCGTCCACCTCGAAACCCATCAGAATGTTCGgtaagtgg
tctgaattttaattgttataaaacaatagaagtcacctttggaattacttttcgattccatcagGTATGCCTGCAATGT
CTGCTCCCATGTGTCCCGCAGCAAGGATGCCCTGCGCAAGCACGTTAGCTACCGACATCCTGGGGCGCCATCGCCATGT
CGAAAACGAGGCTCGCCGGAAGAGGGTCTCCAAGCTAGCAGCGACCACTGTGCCCACTTCCACGCCCATGTCCATGAGC
GCCAGTCACACGGTCACCAGTGGCGATGTGGGTCCAGCTCCAGCGACCACACTTGGATGTTCGGGTCAGGAGGCAAGGA
ATCCGTACCTTTTCCTGCCCAATCAATTTCAGATGGCTGCTGCTGCAGCAGCCGTAGCAGTGGCCGAATCCTCGCCAGC
TTCTGGTCAACCATCGCTAGACTTGGCACACGAAGCGCCACCGAGCATCAAAAGTGAGCGGGAGCCACCGACGGCGAGC
AACGGAGAGGCGACCGGTGTAGAGGCATCGGCGTCAACCACCTGAGGATAATTTTTTTGAATATTTTT
translation M D T T N E K S S E F E R P T T P S P
T P T P T L T P S H T P T P S H S L P L P Q L P S A
A L N T P L L A N K L G S V N S S G M G T T P L E N
L F K S L Q F Y P S L L P Q P L N F S Q T A L N K T
T E L L A K Y Q Q Q C Q L Y Q S G M Q E D Q L E T D
C F G S K R L K G D S P P K E L R R L E K S L L K N
P K S S S T T N S S S S K S P Q E C S N P N P I V A
T S P V T L A P P T M A H F S P Q L P V V K C S S A
S Y P S A L G Q G Q L Y S S K P P L Y S A A V T P T
A A Q Q A A Q M H H H P Q P G P S P Y I S A E D H A
K L Q L H I E Q Y Q R E A A A A A A A A A G G M A L
V S A K S E P N L L S L S A D R D K S L A T A P I K
P P S N S K L Y A T C F I C H K Q L S N Q Y N L R V
H L E T H Q N V
R---------------------------------------2-------------------------------- Y
A C N V C S H V S R S K D A L R K H V S Y R H P G A
P S P C R K R G S P E E G L Q A S S D H C A H F H A
H V H E R Q S H G H Q W R C G S S S S D H T W M F G
S G G K E S V P F P A Q S I S D G C C C S S R S S G
R I L A S F W S T I A R L G T R S A T E H Q K Z
Leave it at this for now. It encodes weak similarity to several other zinc
finger proteins in Drosophila, e.g. sob.
*********** extra - there is a single testis EST near the end of this
7kb section, beyond this gene, which could encode a small protein; no matches
at all.
bs85b10 5'
GCACCAGGTTTTTCGTCTTCTGCACTCGAGCGCCTTTCTAGCTATCTTAATTTATGTACTTACGTTTAGCACAATTTAC
GTTTATTTTTTCTGAAACTTTAGCAACCTCCGAGGCTCTCAAACTGCAAGTTACAAACACAATCCTTTCGCTTTCACGC
TCTCTCACAAGCACACGTCCACACATTCTAGTAATCTAAGCCAAGTTTTTATAAATATTGTAATTATACACCTGAACAC
GCACACACACTTGCACACAAAAAGACAGGGTGAAGACCTACCAAAAAAAAAAAAAAAAAA
translation M Y L
R L A Q F T F I F S E T L A T S E A L K L Q V T N T
I L S L S R S L T S T R P H I L V I
************A. mellifera BB260008A10B2.F
GCTAAGGGAAAGGCGAAGTCTCGTTCCAACAGAGCTGGTTTGCAATTCCCTGTTGGTCGTATTCATAGACTTCTTCGCA
AAGGAAATTACGCAGAACGTGTCGGTGCAGGAGCACCAGTGTATTTAGCAGCCGTTATGGAATATTTGGCTGCTGAAGT
GTTGGAATTGGCGGGAAATGCTGCTCGTGATAATAAAAAAACCAGAATTATACCACGTCATCTTCAACTTGCTATCCGT
AATGATGAAGAATTAAATAAATTACTTTCTGGAGTAACTATTGCTCAAGGTGGTGTTTTACCAAATATTCAAGCAGTTT
TATTGCCAAAGAAAACTGAAAAAAAAGCTTAACCATATAAACATCGATATAAATGGCCCTTTTTAGGGCCACAATTTTT
TAAAACGAAGGAATTTCTTATCCGATTCATAATTAAATAAAATACATTTTTATATAATAAAAAAATATAGAATAGAGAA
GTATAAATAATCTAAAGATATATTATTCAAAGATAGAATTAAATATATTATTGAAAAAAAAAGATATGATATAGAAAAT
TTCGTGAAATCATTTTTAATATAATAAAAGCAATATATTTTTATATTATTAACTACAATTTAAATCATACATATCTATT
TTCAAAAATTTTATATTATAAGTTATTTTATAAATATAATATATTATTATTTTTCTTTTTTGTTAATTATTAAATATTT
GTAATTTTCTTTATATCTAATAAGACTAGTAAATATAACATAAGAATATTGGACTATTAGATTCGTAAATTGCATTATA
AAATAATAAAAATTATATTTAGTCTTATACCGTATATTTTTTTTATACG
This encodes the end of Histone 2A, but the genomic match is not annotated
because it is on a small isolated scaffold with the N-terminus truncated!
And there is something else wrong with it because it does not encode the
C-terminus - indeed the ESTs show there is a deletion in the genomic DNA.
There are many ESTs.
AE002735.2
TGAAGGGAAAGGCAAAGTCCCGCTCAAACCGTGCCGGTCTTCAATTCCCTGTGGGCCGTATTCACCGTTTGCTCCGGAA
GGGCAACTACGCAGAGCGTGTTGGTGCAGGCGCTCCAGTTTACCTAGCTGCCGTAATGGAATATCTAGCCGCTGAGGTT
CTCAAGTTGGCTGGCAATGCTGCTCGTGAGAACAAGAAGACGAGAATTATTCCGCGTCATCTGCAACTGGCCATCCGGA
ACGACGAGGAGTTAAACAAGCTGCTCTCCGGCGTCACAATTGCACAAGGT-----------------------------
-------------------------------------------------------------------------CGTCAA
TCAAACCGTCCTTTTCAGGACGACCAAATTATTAGCAAAGAATTGAAAAAAATTTTAACCACGCAATTTGTTGTATAAT
ATTAAATCATACAAAAAATATTTCAAACTATTTATTTACGTAAAGATTGTAATATAATACGGTTTTTGTATTTTTTCTA
TTATATGCGGTATAAACTATAATTTGTTTCTTTAATTACTCACACATTACTCTAATTACTAATTAGATTACTCTCCAAT
TATAATTACTAATAAATTACTCTCCACAAATCAATGCTAGGAATACACCTTGGTATACCTGAAGGAGTACGAACGCCGG
ACATTTATCATACGCGTTACTTTTAGAGTAAAAGGGTATACTAGATCAGTTGAAAAGCATGTAACAGGCAGAAGCCCCA
CCGCTATCGCCCTGGAACCGTGGCCTTGCGTGAAATTCGTCGCTACCAAAAGAGCACCGAGCTTCTAATCCGCAAGCTG
CCTTTCCAGCGTCTGGTGCGTGAAATCGCTCAGGACTTTAAGACGGACTTGCGATTCCAGAGCTCGGCGGTTATGGCTC
TGCAGGAAGCTAGCGAAGCCTACCTGGTTGGTCTCTTCGAAGATACCAACTTGTGTGCCATTCATGCCAAGCGTGTCAC
CATAATGCCCAAAGACATCCAGTTAGCGCGACGCATTCGCGGCGAGCGTGCTTAAGCTGACACGGCATTAACTTGCAGA
TAAAGCGCTAGCGTACTCTATAATCGGTCCTTTTCAGGACCAAAAACCAGATTCAATGAGATAAAATTTTCTGTTGCCG
ACTATTTATAACATAAAAAAAAATAAGAGAACAAAATTCATATTCTATTATTTATGGCGCAAATGGTACTGGGTCTTAA
ATGTAAAAATAGTAATTCTTTCAGAGAAAGAATCAAAATAATCTT
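In the genomic sequence above, the run of dashes seems to mark where the ESTs have sequence that the scaffold lacks. Assuming the dashes are alignment padding of that kind, a small Python helper (the name is hypothetical) can report the position and length of each gap once the wrapped lines are joined.

```python
import re


def gap_runs(aligned):
    """Return (start, length) for every run of '-' in a gapped sequence.

    Whitespace is removed first, so a gap that wraps across lines is
    counted as a single run. Positions are 0-based in the joined string.
    """
    joined = "".join(aligned.split())
    return [(m.start(), m.end() - m.start()) for m in re.finditer(r"-+", joined)]


# Toy example: a five-base gap between two short flanks.
print(gap_runs("ACAAGGT-----CGTCAA"))
```

A gap length that is not a multiple of three would shift the reading frame of everything downstream, so the length is worth checking before any translation is attempted.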
ESTs
GGCACGAGGACTAAGTGAAATAAACGCAAAGCAAAATGTCTGGACGTGGAAAAGGTGGCAAAGTGAAGGGAAAGGCAAA
GTCCCGCTCAAACCGTGCCGGTCTTCAATTCCCTGTGGGCCGTATTCACCGTTTGCTCCGGAAGGGAAACTACGCAGAG
CGTGTTGGTGCAGGCGCTCCAGTTTACCTAGCTGCCGTAATGGAATATCTGGCCGCTGAGGTTCTCGAGTTGGCTGGCA
ATGCTGCTCGTGACAACAAGAAGACTAGAATTATTCCGCGTCATCTGCAACTGGCCATCCGCAACGACGAGGAGTTAAA
CAAGCTGCTCTCCGGCGTCACAATTGCACAAGGTGGCGTGTTGCCTAATATACAGGCTGTTCTGTTGCCCAAGAAGACC
GAGAAGAAGGCCTAAACGTTTCAAAGGCTAAGCTAAAAACCTACATGTACATAAAATCGTCAATCAAACCGTCCTTTTC
AGGACGACCAAATTATTACCAAAGAATTGAAAAATTTTTTAGCTTGGCAATTTCTTGTA-ATTAGTAAATCATAAAGAA
TTATTAACGTAAA
Histone 2A M S G R G K G G K V K
G K A K S R S N R A G L Q F P V G R I H R L L R K G
N Y A E R V G A G A P V Y L A A V M E Y L A A E V L
E L A G N A A R D N K K T R I I P R H L Q L A I R N
D E E L N K L L S G V T I A Q G G V L P N I Q A V L
L P K K T E K K A *
**********extra There are some weaker matches to this histone and they are
not annotated either.
This one is to a 15kb scaffold with just one protein annotated at the front.
No ESTs for this clear histone relative - not sure why.
AE002870.2
tattgccccacaagcttagccgaaaaATGTTTCGGTCACATTCCCTCCTCTTCACTTGGTGCAAAATAAATTGCCGGTG
CCGGTCTTCAATTCCTGTGGGCCGTATTCACCGTCTGCTCTGGAAAGGCAACTACGCGTGTGGGTGCAGGCGCCCCAGT
TTACCTAGCTGCCGTAATGGAATATCTGGCCCTGAGGTTCTCGAGTTGGCTGGCAATGCTGCTCGTGACAACAATAAGA
CTAGAATTATTCCTCGCCATTTGCACCTGGCCATCCGCAACGACACGGAGTTAAACATGCTGCTCTCCGGCGTCACAAT
TACACAAGGATGCTCTCTGTTGCCTAAAAAGTCAGAAAAGAAGGCCTAAACGTTTCAAAGGCTAAGCTAAAAAACAACA
CGTACATAAAATCGTCAATCGATGC
translation M F R S H S L L F T
W C K I N C R C R S S I P V G R I H R L L W K G N Y
A C G C R R P S L P S C R N G I S G P E V L E L A G
N A A R D N N K T R I I P R H L H L A I R N D T E L
N M L L S G V T I T Q G C S L L P K K S E K K A *
*********** Check out the rest of this 15kb: in the first half, the above
histone relative is all there is with BLASTX matches, and there are no ESTs
for it.
But the second half has two long 1kb ORFs that must encode something, on
the opposite strand to the histone gene.
But it's just a boring LTR retrotransposon!
**************A. mellifera BB260007B20F9.F
AAATCATGTCGATTACTCGCGCACTTAAGTAGCAATTAATTAATATTCGTATATATATCCGATCCGATTTTTTTAGAGC
ATGATAATGTACCGTGAATATGCTCTAACACTGAAATGTGAGTTTGAATGTGTCGAGTGTCCAAGGAATTTTTTTTTTT
TTTCAAAAAGAACACGTGCGTGTGATTGTGCGGGGAGGAAGACGGGATGACCACTGGCTTCTTGTCTTCTTCTGCCCTG
CGTTGCGCGTGAAATGGTAAGCGTGAATGTAGATGCGTGACTGTGGACGAGTGATTCTGACGGGAGAGACACTGCGAGA
GTTGTTTTTTCTCTCTCTCTCTCTCTCTCCCCCCCTCCTCCTTTTCCTTTGTAACTATTGTTCGCGATCGATCCCGCAA
AAGTATGTTACGTAATAAAAGTATCTGGCACATTTTCTTTCAGTTACTGGACACCCTCCCGGTGTGCCAAGATTTTAAT
CGACAAGTGTGCAACCGTCCCGCCTGCAAGTTTATCCATCTCAGTGACGGAAACGTGGAGGTGATCGAGAATCGCGTGA
CCGTGTGCAGGGACGCACTGAAGGGCGCGTGCATGCGACCCCAGTGTAAATATTATCACATACCGGTCGCGTTGCCGCC
GGCGCCCTTGATGGCGATCACGTTCCCTGCGACGCCCTAATTACTCCTTCTCGTTTAGCGGATGTGTGCATCACGACGA
GGAGAATAGAACACCGGCCAACCGATGATGATGAACGCGGAGAGAAAATTTTGACAGGGATCGAAAGAAACGAGGATCC
AACCAAGGAATTAATTGCCCAAGAAAGCATCGC
This appears to be an unspliced transcript with an exon in the middle.
It has genomic matches at e-21, but BLASTX only at e-08, and to vertebrate
equivalents of the muscleblind B isoform.
It turns out the current annotation is incorrect near the C-terminus, leaving
out at least one exon.
It is a massive 100kb gene, with huge introns, and I can't find anything in
them!
************A. mellifera BB260003B20D12.F
AAATCCTAACAGTTCTCATGTACTTACGGAAGATACTATATCGAGAAAAGTTAAAAATGGTATATTATATAGCACACGT
CTTTTGACAAAAACTAATAGAGTACCTAAATGGGGAGAGAGATTTGTTAGCAAAAATATAGTAAAAATTATTGAAGAGA
GTATAGTGGATCCAAAAACAAAAACTTTAACAACATATACAAGAAATTTAGGTTACACTAAAGTCATGAGCATTGTAGA
GAAGGTTGTTTATAAAGTATGTGAAGAAAACTCTAATTGGACAGTAGCAAAACGATCAGCTTGGATTGATAGTCAAGTA
TTTGGATTCAGTAGAGCTATCCAAGCATTTGGATTGGATAGATTTAAAAAGAATTGTACTCTGATGTATAACGGGTTTA
ATTACGTTCTAGCTCATTTGTTTCCTCACACAGCACAATATATGAATCCATCGCTTTCTCAAATGGGTTTTGCTCATCT
AGTCGAAGAATTTCCTGGAAAGACAAGTTTAGCAGAAGATTTCCAACATTCGTTACAAGGTAAAGCAGAAAAAGTAAAA
GATGCTGCGAAAAAGGCAACTGATTTAGCAAAGAAAAAGGCTGGCACCATTTATGCTACATAACATTCTGAGCAATCAT
AATTAAAACAACATCGTCGGTGTGAATGATTTAAAATATTTGGCGAATAGAGAAAATGAAAATTCAGAAAATAATTTTA
AAATGGAAACAAAGGATAATTATTTGATGGTTTATTGGATGAGTAAATAATAGAATGAAATTGTTAGCGTTATTCGTTA
AAAAATA
This one is barely better in TBLASTX, but the human BLASTX is much better, and
indeed I find that the annotation of CG8806 needs fixing, I think by removing
an exon.
************A. mellifera BB260004A10B10.F
TCAACAAACTATTCCATCGAATATGAATCATAACTTTCTTCAGAGCCTAAAGATGGCAAATCAATTTTAAATCGATATA
TGAAGTTGTACTCGTCTGCCATCTCGTGGCGCTACGTAAAAAATATTTATCGTACCGATATAACATGGCGTTCACGTTG
TGGGCACTTTTCGAGGTAACCGTGTTATGTTTGAATGCGGTCTGTATTTTGAACGAAGAGAGATTTCTTGCAAAAGTTG
GCTGGGCATCGTGGCAAAATGTTCAAGGTTTTGGAGAAACTGCTACAGCTAAATCACAAATTTTGAATCTTATTAAATC
GATACGAACGGTAGCACGAGTTCCATTGATATTTTTAAATATCATAACAATAATTGTGAAACTGGTGCTCGGTTGAAAG
AAACAATTGATGATGTAAATAAGAGGATAATGCACAGTGAAAAGTCAAAGCGTGATGTACATAAATGAACAATTAAATT
CTTTTATTTTTATCGCTGCATTAAAAAAAAAAAAAAAAAAAGCAAC
Shows that CG6316 needs to be split into two genes.
The C-terminus matches this cDNA, which is full-length and encodes an 80aa
protein with excellent matches to similar length human and yeast proteins
***************A. mellifera BB260004A20B3.F
TTTCTTCTTAGATTTGATACTAAATAAAGAACTTAAATTGCATAATTTCAATTTATATAAATGACTATGCAGACTTTAA
ATTGTGTGATGAAAGTGATTTTGAACTATATATGTTTTTTTAAGTGCGCTTCTTATAGTGAAATTAAAGCATAAGGTGA
TAATTAATAATTAATCAAAATGGCTAACATTCAACTTCGTGAATTAGAAGAATATTTACAACAGTTGGATGGATTTGAT
AAACCAAAAATATTACTTGAACAATATTGTACTAGTGCTCATATTGCATCACGCATGTTGTACTGTGCTGAAGTTCAAT
TTAATGACATAGAGGGACATTCAGTAGGTGACTTAGGTTGTGGGTGTGGTGTTTTATCACTTGGGGCACAGATGCTTGG
AGCAAGTCATGTAATTGGTTTTGAAATAGATTCTGATGCACTTAAAATTCAATCTAAAAATTGTAATGAAATAGATTTG
TTTGTGGAAACTGTACAATGTGATGTATTACAATATTTACCAGGCCGATTTGAGAAGTACTTTGATACAATTATTATGA
ATCCACCATTTGGTACAAAGCATAATACAGGTACAGATATGAAATTTTTAAAAGTTGCAACCAAATTAGCATCAAATAC
AGTGTATTCATTACATAAGACAAGTACCCGTAACTATGTTCTTCAGAAAGCTGCACAATATGGAGCCAAAGGCAAAGTT
ATTGCAGAACTGAGATATGATTTACCAAAAGCATATAAGTTTCATAAAAAAATGTCTGTAGATGTTCAAGTGGATTTTA
TACGATTTGAATTAAATTACTAAATATTTTCATGAAAAAAAA
Shows that CG9666 needs to lose its N-terminus and gain a C-terminus by adding
a simple exon. Human and C. elegans proteins confirm this new structure.
cDNA LD25448.3prime confirms this
***************A. mellifera BB260004A20G4.F
TCATGTGTGAGGATACAGTGACAGCTGTGGACAGGTACGCGTTTTAAATCTGTTATGATTAAAAACGTACCAAGTAAAG
CCGGTCATCGACCAGCTGTGAATTCAAGAAAGAGATCGTTGTAAACGCGCGTGTCGAAAATTCTGTGAACGTGCGTTTT
GATTTATCGGTTTATTACGCCTAGTCGAAATCGTATTACCGAAAAGAGTCTTACCGAAAGAAACGGTGCGTGTAAGAAC
CTTTTTGGATACTACCTGACAAGATGTGGTGTCTCGTTAGTCAAGCTAATTCTGTCATTCTTGAGGTACAAGTCGATCC
CAAAGCTATTGGTCAGGAGTGTCTCGAAAAGGCATGCGATTGTTTGGGCATTAGCAAGGAATGCGACTACTTTGGGCTG
AAGTATCAGAACGCGAAGGGCGAGGAGCTCTGGTTGAATCTGAGGAATCCTATAGAGAGGCAAACGGGCGGCGGTGTGG
CCCCGCTAAGATTCGCATTGAGGGTTAAGTTTTGGGTACCGCCTCACCTGTTGCTGCAAGAAGCTACCAGGCATCAATT
CTACTTGCACTCTCGCCTCGAGCTTCTCGAAAGTAGGCTAAAAATGGCGGATTGGAGTTCGGTGGTGCGTCTGTTTGCT
TGGATAGCGCAAGCCGATATTCGTGATTACGATCCATTGTCGCACCGAACGCCCTCTTCTTGCATTGCTGTCCAATTCA
ACGGCGGAAACAAGCGAATCAAACCGTTGGTCTTATTCACCGGATGGTCCACCACCCAAGAATTGAAGGGAAGAAGCCT
TCCCGGG
This suggests the annotated N-terminus of CG12489 might be wrong, since the
honey bee sequence nicely matches the N-terminus of the human ortholog, yet I
cannot find a better N-terminus in the Drosophila genome.
Drosophila cDNAs agree with the current annotation, so this may be a real
difference between Drosophila and other insects.
***************A. mellifera BB260008B10H2.F
GCGAGCGGTGGGACGGGCGGGTACATCGCGCCTACCACGTGGGATCAGCTGATGCAAGACGATCATTTCCTCGGCAAAT
TCTTCCTCTACTTCTCCGCCATCGAGAGGAGGATTTTGGCTCAGGTATGCTTAAGATGGAGAGACATACTTTACGCGCG
GCCTCGACTTTGGGCAGGCTTGGTGCCCGTAGTAAGATGTCGCGAGGTACGTGCCATGCCTTCTACTTCACGCACGCGG
CTCTACGCCTCCTTAGTTAGAAGAGGATTTCATTCGTTGGTTCTTCTCGGAGCATCGGACGAGGATATCCCGGAACTGA
CGCACGGCTTTCCATTAGCGCAAAGAAATATTCACTCGTTATCGTTGAGATGTTGCGCGGTGACCGACAGGGGACTAGA
AGCTCTCTTAGATCATTTGCAAGCGTTGTTCGAGCTTGAACTAGCAGGTTGCAACGAAATAACGGAAGCCGGGTTGTGG
ACTTGCTTGACACCTAGAATAGTATCGCTCTCCTTGTCGGATTGTATCAACGTGGCTGATGAAGCCGTCGGTGCTGTCG
CTCAATTACTGCCGAGTCTCTACGAGTTCTCGTTGCAAGCTTATCATGTAACCGACGCCCGCCTTGGATATTTTCACGC
AACCCAGAGCAGCTCCCTTAGCATCTCAGGCTGCAGTCCTGCTGGGGACCTACCCACCATGGTATAGTCAATATTGTAC
ATTTCTTGCCCAATCTGACTGGTCTATCGGTGGCCGGATGCAGCAAAGTAACCGAGGACGG
Something very strange here. An excellent human match is recovered, as is a
TBLASTX match to the Drosophila genome, but I can't get a BLASTX match to a
Drosophila protein;
yet there is a protein annotated in this region, CG6060, but the annotation
doesn't quite match - check it out.
Matches entire human protein, but clearly is longer at N-terminus at least.
Weak full-length match to Partner of paired, from 170 of 550aa
225198RC-ATGGAGAGAGCACGGGGAAAGCCACGGGTCGAAAATCGATACTGCTGCCCAGGCAGCACGCAGGCTTACG
GATACGGACTCTCCACCCGGACCCTGGAACCCCGGATTCGGGCGAAGGCGgtgcggagtgcattatggcaatgtggag-
2822-ccccatggtgtatcgtgttgcctcgttacagGTGAAACTCGCTAATTGTCGGGTGGAAAAAGgttagcaataca
cagagcaaacgggatgtaaaggataagcaaagccgacaccataagcgattgaaaaacgaagcagaggagaggacgaagg
actccacttttccagagcgcttcgtttcccagGAGGCAAGGTGAATCTGTTCCAGCACACGATGTCATCGATCTCGGCG
CAGGGCGTGGTCGAGCGAGCATCCGCGGAGTTGTCGAAGCGGATCAATGGCCTGGGCCTGCGCTCGAAGCACCATCATA
GCAGCACATCCAGTGGTGCTGGTGGCGCCGGCGATGCTGCATCCCCGGCAGGAGCCACGCCCACTCCGGCAGCGCCCAG
CGGCAAGACGTCGGTGATGGAGCGCGTAACGAACGCCCTGTGCGGCGGTGGCAACTCAAATTCTAACTCAGGATCGAAT
AGCTCCAATAGCAACACCTCTTCAGCGTCTGCCACCGCTGCCACATCGCCCGCCAGCAACGCCAATCCTCCACAGACGC
CGGACAAACCGTCGCGTGGCAGTAGCCCCAGTCCCGGCGGTATCACAATGCCAGgtggccagtcgcaggtccagaactc
cacacaccacctcctgcagcagcaacaacagcaacagcagcatatgcagctgcaacaatcgcagcagcagcatctccag
ctgcaagcctccacgctgatcaactccaaccaccatgtgatggtgggtcctgctccgcccactggcatgcctctgggtg
ccccgcccacgccgacagtgaagtccattgccaagcagatgaacataaccataccggg
CG6060 M E R A R G K P R V E N R Y C C P G S T
Q A Y G Y G L S T R T L E P R I R A K A
-------------------------------0--------------------------------- V K L A
N C R V E K
----------2----------------------2----------------------2---------------------
-2----------------------2--------------------G G K V N L F Q H T M
S S I S A Q G V V E R A S A E L S K R I N G L G L R
S K H H H S S T S S G A G G A G D A A S P A G A T P
T P A A P S G K T S V M E R V T N A L C G G G N S N
S N S G S N S S N S N T S S A S A T A A T S P A S N
A N P P Q T P D K P S R G S S P S P G G I T M P
-------------------------------1----------------------------------------------
-----1---------------------------------------------------1---------------------
------------------------------1------------------------------------------------
--- G R S K S R F A H L Q H H G H G G R P E G G G Q C
W R Y T V A V A E T T A Q S S P A
PSVWQYGHQRSIAADHATPAPPAAYPSMSTRSVCAAPRLATGDWRPCLTTCRVCLNWSWLAATR
I can't reconstruct this gene with their coordinates, so I leave it up to them
to sort out.
************A. mellifera BB260008B20A1.F
AAGCGCAGCTAGGACGTCTCCAGTACGGTCGACGCCCTAGTTCCCCTCCGCATGGAGGGAACAGGCGCGCACCCCGAGG
GCCAAATCTACTTCCTCACCCCCTGACGCGACCACGACATGGGAAGATCCACGAAAAACAGCGGCGGCGGCGAACGTTG
CGGCTGTGGCCGCAGCCGTCGACAATGGGAAATCCTCGACCGGCGCTACCAATTCTCTAGGTCCATTGCCCGACGGATG
GGAACAAGCGCGTACTCCCGAAGGAGAAATCTATTTCATTAATCATCAGACACGCACCACTTCGTGGTTCGATCCAAGA
ATCCCTACTCATCTTCAAAGGGCTCCGACCTCAGGTGCAATGTTACCGCAAAATTGGCTTCAACAGCAACAACCTACAG
GTGGTGGTATTCAGAATAATCAAACATTGCAAGCGTGTCAACAGAAACTTCGCCTCCAGTCGCTACAAATGGAACGCGA
GCGTCTCAAACAACGGCAACAGGAAATTATACGTCAGCAAGAGCTAATGCTTCGACAGAGCACCACCGACGCCGCTATG
GACCCATTTTTGTCGGGAATCAACGAGCAACACGCACGCCAGGAGAGCGCGGACAGCGGCCTGGGCCTTGGTTCCGCTT
ATTCCCTCCCTCACACACCGGAAGATTTTCTTGCAAATATCGACGATAATATGGATGGTACAAGCGATGGCGGCGCACC
CATGGAGACCCCGGATCTTTCTACTCTGAGCGATAATATCGATTCGACCGACGATCTCGTTCCATCGTTACAGCTGAGC
GAAGATTTTAGTAGCGATATTTTGGACGATGTGCAATCGTTGATAAACCC
Has much better mammalian matches, and indeed indicates that there is at least
an exon missing from CG4005
************A. mellifera BB260008B20D1.F
TCAACCCCTTGCGTGGTACCGATGTCATTTTACCGGAAACCGCAGTATTCGTAATAGCGCACAGCCAAGCTTGTCATAA
CAAAGCTTCCACAACAGATTATAATTTAAGAGTTGCAGAATGTCGCTTAGCTGCACAGATGATAGCAAAGAAAAGAAAC
AAACCTTGGGAACATGTACAAAGACTAATCGATATCCAAGAGAGTCTTAATATGAGCTTAAACGAAATGGTTTCAGTTA
TAACAACCGACCTTCACGAAGAACCATATACCCTGAGCGAGATTAGCAAGAACCTTGATACAACGAATGAGAAACTTCG
TGAAATATCATTATTACAAAATTTTAGCAATGCGCAAATTTTCAAATTGAAACAACGCGCTCTCCATGTGTATCAAGAG
GCGGCTAGAGTGCTCGAATTCCAACATATTAGTGAGAAAAATGCAATTATGGAAGAGGAGAAGCTAAAACAACTGGGCA
ATCTGATGTCCAACAGCCATTTCAGTATGCACAAACTATACGAGTGCAGTCATCCTAGTGTCAATTCACTCGTTGACAA
AGCTATGGCTTGTGGTGCACTCGGTGCAAGGCTCACGGGAGCTGGATGGGGTGGCTGCATAGTGGCCATCATAACGAAA
GACAAGGGTTCTCACATTGTGGATACACTGAAAAAAGAACTCGATCTATGCGGGATAAAGGATGGATTCAAGCTCCACG
ATTTGGGTTTTCCAACGGAACCGAACCAGGGTGCTGCAATTTATATGAGCTAAGTTTCATTTTCATCTGTTCCTGGCTC
TTGCATTTATAATTTCGACCATAATTATTAAAGGGTTCTAAGATATATTCTAATTTAAG
The C-terminus of this and the mammalian orthologs indicate that the
Drosophila protein CG5288 needs to be longer, and the genome encodes the
appropriate sequence.
************A. mellifera BB260008A20D9.F
TTACGTTAATATTTGAATGAATAAATCGAAGAAATTTGGTTCATAAAAAATTTTTAAGACATTAATCGTGATGTGTCTA
CAAATTATCCATTGATTTATTCGAAAAACTTTCGAGATATTCAACGGTCAGTTCCTATTAATAAATCGATTTTTCGCAT
TCGAGTGAAGTTGCAGAGAGTTCGATCGATGGAACGACCAGAGGATTTAGTGTCGAAAGAAATGTGCAGAATGTGGTCA
ATTTACAACGTATAAATTGTCTATCAGTTAAAACACGAAGAGCAAAATGTCGGTTGAAGTGAAGGGGGGTCGACCAACA
ATGCCAACAATTCCACAATCCAAGAGACCAACCATTTTCGTTTATCCCACAGTAACTCCAGAGAGCATTATCATCCCGA
TAGTATCATGCATACTCGGATTTCCATTACTGGCCCTTATGGTCATCTGTTGCTTAAGAAGAAGAGCAAAGTTAGCGAG
AGAACGTGCACGAAGAAGAAATTGTGATCTAAATCATGGAACCCTTAGTCTCGGTCGTTTTAGCCCTGTTCACCGGTTA
AGTAAATTAAACATCTTCTTTTTTATTTTAATACCATGTACATATCTAAAAAAAAGAAATGCAATTTATTTAAATTAAA
TTTTACTTGAATTATTCCATCCTTACTACCACATTATGGTCGAATTAAAAATTAATGGTTAAGTCTTAAACTTAAAAGC
CACGTTTTCCTACCAAACCTTTTTGTAACAAAAACACATGGAGGCCCCAAACAATTGGGTTCAACCCCAACCGCTAATT
CGGTTGTTTTTCCCCTAAAACGCTGCTGACT
Unspliced transcript has an exon in the middle, flanked by splice sites, with
a good match to an unannotated region of the Drosophila genome - no BLASTP or
EST matches.
7kb in front and 3kb after are available.
agcctttcccttttaaatccatttcagTTTCACCCGAATCGATTGTCATCCCGATCGTCTCCTGTATCTTCGGCTTCCC
CATCCTGGCGCTTCTGGTGATCTGTTGCCTTCGAAGGAGGGCCAAGTTGGCCAGGGAGAGGGATAGGAGGCGTAACTAC
GATATGCAGGACCATGCCGTCAGCCTGGTCAGATTTAGTCCAATACATAGGCTTAGTGAGTTTGAATGTCGTAATGTAT
TGTGTGTACGAAGAACGTTTCTTTTTCTCGTCTCGTCTCGTTTTCTTTTTTATCTCTGGTTTT
V S P E S I V I P I V S C I
F G F P I L A L L V I C C L R R R A K L A R E R D R
R R N Y D M Q D H A V S L V R F S P I H R L S E F E
C R N V L C V R R T F L
There are several good looking splice sites and ORFs in the 3kb 3' to this,
but can't easily put a gene together without any more guidance.
Same for the 5' 7kb.
*************A. mellifera Contig1
GGAAAACTATTTTAACTGTCAAAAGAGAAATTTCGAACACATATTATATTATGTCTCATTCCACATCTCAGAATAAAAA
CTACAATATTATTATTGGAGATATGCGAATTGCATACAAAGACAAAAACGAAACTTTTACAGAAGATCATCTTGTAAGT
AAAGAACCAATTGGTCAATTTAGAGCCTGGTTCGATGAAGCATGCAAAATTCCGCAAATTTTTGAAGCAAATACAATGT
TTCTTGCCACAGCTACCAAAAATGGAATCCCGTCCGTGCGACCAGTATTACTCAAAGATTATGGAGAAGATGGTTTCAA
ATTTTACACTAATTATGAAAGTAGGAAAGCTCGCGAAATAGCTGAAAATCCAAATGTGGAGGTGAATTTTTACTGGCAA
CCTTTACATCGAAGTGTACGTATAGCAGGTACAATAAAGAAAACTTCCTTAAAAGATTCAGAACGTTATTTTCAAAGCC
GACCATATGCAAGTCAAATAGGATCAATGGCTAGTAAACAGAGTAGTGTAATTGCAAATAGAAATACACTTATAATAAA
AGAAAGAGAATTGTTAGCTCAATTTCCAGAAGGGAAAGTTAAAAAACCAGATTGGTGGGGAGGATATATTATTATTCCA
CATTCCATAGAATTTTGGCAAGGTCAAAGCGATCGCTTACACGATAGAATTCATTTTAGACGATTAAAACCAAACGAAA
AAATCGACAATGTACTTGTTCAT
Vertebrate matches are 260aa proteins, which suggests that the CG2649 gene
product is a fusion of two related genes, since the first half matches well
and the second half matches more weakly.
***********A. mellifera Contig1058
ACTAATCTTCCTTTGCCCCGAGAGCCGGAAATTAGCTATTGCGGGGAGCGGAAGGCACGTTGTCTTGTTCAAATTCAAG
AAAGTAGAGAGTATGTCCGAGGTAGTGACTTTGGATATATCACTGACGGCCGAACCGGTTAAGGAAGTGGAAAGTTCAT
CCGATCACGATTCTCCGGCTGGCGGCAACACTTCAGGGAGCAGCGAGTCGAAGAATAATGAATCGAGCCAATCGCTGAA
AATTAAGACTGGTTTGCAGAAGAGGGCCGCGGGTTTCCAAGCGACCCTGGTCTGCTTGACGGTTACCAACAGCGGGGAA
CAAGCTGAGAACATAACTGCTCTCAGTTTGAACTCTTCCTACGGTTTAATGGCTTACGGGAACGAGTGCGGCATAGTGA
TAATAGACATTGTCCAGAAGATCTCTTTGATCGTGTTGAACACGGGCGACATAGGTGGTAACGTGGATCTGTGCCAACG
GGTGCTGCGCAGCCCGAAACGTCAGGACGAGTTGAAACGGGATAACGAGGACAAAGCGAGGAGTCCTAGCACAGATCAG
CCAACTATGTGTCTACCCACGTTGAAACAAGTTCAAATCAGTTTTGCGGTCTTCCCCGACAGCAAGGTTGATTCAGACA
AATATGACAGTTCGTTTCAACGGTCAAGGAGCTCGTGCATGTCGTCGCTCGAGAACATCACCACCGAGACCATCAGCTG
CCTTACATTCGCAGATTCCTACACGAAAAAGAGCGATACCAGCCCCGTGCCAACGCTTTGGATCGGTACCTCTTTAGGA
TCTATACAAACGGTGATATTTAACACGCCGCCTCGTGGAGAACGACACGCGCATCCGGTCGTTGTTTCTACGTGCAACG
GATCAACGTTCAAGTTGAAAGGATGCATTCTGTCTATGTCATTCCTGGATTGTAACGGGGCCCTGATCCCGTATTCCTA
CGAATCTTGGAAAGATGACAGTATGGAAAGCAAAGAGCGCAACAGGAGTC
Shows there are problems with annotation of CG17762; genome has exons for the
additional matches, and vertebrate homologs indicate there are sections
missing too
*************A. mellifera Contig1107
TTTTTTTTTTTTTTGTCCACTTATATTAGTGTTCAATTTGTTTAATAGTTATAAACGTTTGAATTTTCAGTGAATTTTA
CTTATTTTTAGCAATTCAAAACTTCTAATGATGTCGAAGTTTTTATTGCTTCTTTGCTTTACTGCTGTCCACGTTCTAG
GTGAAGACCAGGAAAATGTTTTATCTAAAAAGAATGATACTCTTCTCACTTCTCCCAGGACATCTTTACAAAATGATTC
CGAATATTTAAAAGCAAATATTCTATCAAATAAGAATATTTCCATGACAGATTTAGGAATTATTTCAAACACAGAGAAT
AAAATAAAGCAACAAGAAGTTATTAAAAATTCTCTTCAAACTTCTACAATTATTCCCAATCATTCTATTGTCATGCCAT
TAGATATGACAGCTATTTCAATTTCTACAAATGAAACATCTAAAAAAATTTCGGATATAACTAATCCAATAGTTATACA
TGCTCCAATAAATTCTACCTTGTCTTCTTACACAACTGGTAAATGGACAGTTGTTAATGGAACAGATCAAATTTGTATT
GTAATACAGATGTCTGTAATGTTTAATATCTCTTATGTCAACATTAATAATAAGACATCTTTTATAACATTCGATATAC
CAACAGATAATGTTACTACAAAAGCAAGTGGATATTGTGGAAAACTGGAACAAAATTTGACATTAGAATGGTCTGCTAA
AAATATAACTAATGGTAGTATGACATTGCATTTTATGAGAAATGCAACTGAAAATGATTATTCTCTTCACCATTTGGAA
GTCATTCTTCCAGCATCAGATTTTCCTTCAAATTTAAAACTGAATGGATCAGTATCTTTAGTACATGAAACACCTGATT
TTGAAGTTAGATTATCTAATTCTTATAGATGTTTAAAACAACAAACACTCAACTTAAAACAGAATAATAGTAATGAGAC
ATCTGGTTATTTAATTGTATCAGGACTCCAATTTCAAGCATTCAAAGTTGATAATTCTACTATGTTTGGTTTAGCCAAA
GATTGCGCTTTTGATACACCAGACGTCGTACCAATAGCAGTAGGCTGTGCATTGGCAGGATTAGTGATTATAGTATTGA
TCGCGTACTTGATTGGTCGTCGTCGAAATCAAGCTCATGGCTATCTTAGTATGTAATGTTAATATGTTTTTTATTTTTA
ATTTTTTTCGTTGATTCAGTGAAAATTTCATTATTTAGTTTATATAATGTATTATAATGTTAGCTGCAAAACTTAAAAA
AAGATATTTCCCAAAAATCATATATTAAAAATTGCAATAC
Both this and the matches to vertebrates strongly indicate that the
N-terminus, and certainly the C-terminus, of CG3305 are missing; a good
C-terminus is encoded in the genome.
**********A. mellifera Contig1163
AGGAGTACGCCACCGCTGGATACTGCACCCCACATTGCACGCACACGATGTTCCCCGAGAGCGGAGTGAACATCGTTTC
GGTGGTGCTGCACTCCCATCTGGCCGGTCGGCGGCTAAGCCTGAAGCATATCCGTCAAGGGAAAGAATTGCCGAGGATA
GTGGAGGACAATCACTTCGATTTCGAGTACCAGCAGTCTCACACTCTGGAAAAGGAAGTGAAGGTGCTTCCGGGAGACG
AGCTGGTGGCCGAATGCGTTTACGGCACTCTGGATAGAACCAAGCCCACTTTGGGGGGATACGCCGCTTCTCAGGAGAT
GTGTCTCGCATTCGTGGTCCATTACCCGAGAACCCCGCTTGCCGCCTGCTACAGCATGACTCCGTTGAAACATCTGTTC
AAAACATTGGGGGTGTACAGCTTCAAAGGCGTCACTATGGACCACTTGGAGAAACTCTTCCTAACGACCAGAACGGACG
CAGTAACCATTCCTTCGACCGGCCAACAACAACTTCCTATCTACCCGGCAACCAGGCCTAGCGAGGACATCGACGAAGA
GCTTATTAGGGAGGCCAAGTCAGCGTTGAGGGCCGTGAAGGATTACACTCTGGAGCAGGATAACGAAAATGTTTTCTCT
AGATTGATCATCGAGGAACCGGAAGAGTTCAGAGGTCGAACTTTGGCAGAGCACATGCTGGCGTTACCTTGGACCGAAG
AACTTCTGGCAAGGGCCATCGAGCAGAACCTGTACCACGGAAGGCACATGACTTTCTGCAGGAAGAGAGACGATAAACT
CGCTCTGCCAGCAGACATACAAACGTTCCCTAATTACACGGAATTACCGGAAGCAAATGAAACGATGTGCACGGAAATG
GCAAAATTATCCAATGCGTCGGGGAGGATGTCGTACCTCGATATCGCCACGTTCCTCGCG
Identifies a 1500bp ORF at the C-terminus of CG13075 that needs to be included
in the annotation
***********A. mellifera Contig1167
GCAATAGGCGAGGAAATTTTAGATTTATCTGCTATTGCGCATCTATTCGATGGACCATTATTAAAAAACAAGCAAGATG
TATTTCGTCGTGATTATCTCAATGATTTTATGGCCTTGGGAAGATCCGCTTGGATAGAAGCCAGGAACAAACTTCAAGA
CTTATTATCAATCAGTAATCCAACCTTGCAGGAATCTAATATTCGTTCAAATGCCTTTGTAAAACAAAATGAAGCAACA
ATGCATCTACCAGCAAAAATTGGTGATTACACAGATTTTTATTCCTCGATTTACCATGCTACAAATGTGGGCATCATGT
TCCGTGGAAAAGAAAATGCTCTGATGCCAAATTGGAAACATTTACCAGTCGCTTATCATGGAAGAGCGAGTTCAGTGGT
CGTTTCTGGAACACCGATAAGAAGACCTTTAGGTCAAACAGTTCCGATAGAGGATGCAGATCCAGTTTTTGGCCCTTCA
AGATTAGTAGACTTTGAATTGGAAGTAGCTATCTTTGTCGGAGGACCACCTACAAATCTAGGTGACGCTGTTCCAGCAT
CCAAAGCTTACGATCATATTTTTGGAATGGTTACTATGAACGACTGGAGTGCAAGAGACATTCAAAAATGGGAATACAT
TCCATTGGGACCTTTCGGTGCAAAAAATTTTGGAACTACTATTTCTCCATGGATAGTCACTATGGAAGCTCTAGAGCCT
TTCAAAGTGCCCAATGTGCATCAAAATCCAACCCCATTCCCCTATTTACAACACAATGAATCTTGTAACTTTGATATTA
AATTAGAAGTTGACATTAAATCTCCAAACGGTACCGTCACAACCGTCTGTCGCAGTAACTATAAATTCCAATACT
Annotation of CG14993 needs fixing; N-terminus is missing, shown in a nearby
exon by this transcript and the mammalian proteins.
\**********A. mellifera Contig2454
ATTCTGATCCTAGCCAACAAGCAGGATCTGCCAGGTGCCAAAGAGGTGGGCGAATTGGAAAAGCACCTGGGCGTGCTGG
AATTGGCGGGGATGCCGGGGAGCGCGTGCATCAGGGTGCAGCCGGCCTGCGCGATCACCGGCGAGGGGCTTCACGAGGG
TTTGGACACTCTTTATCAGCTGATACTGAAGCGGCGCAAGCTCGCGAAGCTGAACAGGAAACGGGCCAGGTAGGCCAGG
GCCTCGGAGGACTGCACGTGCGTCTTCTGATCTTGCAAGTCGCGGACAGTCTCTCCTTCGGGCACAGCCACGCCTTCCA
CGGTGTCTTCTTCCTCCCCGTCACGATGCCGCGGCGCCGACCCGCTCGAGTGAACAACGGAGCTGCGCGCGCATCCAAC
GTCTCCACGAACATTGCCCACTTCGCCGTGGACGAAGTCGTCTCGTCCGTTCCAACGTAGTTGGAATCTCTGTTGCGTC
GCGTCGAAGATCTTTCCAAGCTTTCATCGATTCGTAGAGGATCTCCTCTTTATTCCTCTCGATCGGGAGATCTCGGTTC
ATCGAGTTTCGGATGAACTCGAAACAAGGATGGAATTTGGAACGGGCGCTTTTTTAACGCGCGAGGAGGAATGTCGAAC
GGGATATCCCTCTCCGCGAAGAGGAGGATAAAATAATCTAGGA
Matches region annotated as CG2219; yet does not show up in the translation?
\************A. mellifera Contig2801
TTTTTCTTTAAGTAACAATAAACATGACAACAGCACTGTATTTAGAACATTATTTAGACAGTTTGGAACATCTACCTAT
TGAATTACAAAGAAATTTCACTTTAATGCGAGACCTTGATGCTAGAGCACAAGGATTAATGAAAGATATAGATAAATTA
GCAGATGATTATTTAAAAAATGTAAAGAAAGAATCCCCAGAAAAGAAAAAGGAACAATTGACTCATATTCAAAACTTAT
TTAACAAGGCAAAGGAATATGGTGATGATAAAGTACAATTAGCAATACAAACATATGAATTAGTTGATAAACATATTAG
GAGACTAGATTCTGATTTGGCTAGATTTGAAGCTGAAATACAAGATAAAGCTTTAAATAGTAGTAGGGCACAAGAAGAA
AATAATGCTAGTAAAAAGGGCAGGAAAAAATTAAAAGAAAAAGAAAAACGAAAGAAAGGTGCAGGTACTAACAGTGAAG
ATGAATCGAAAACAGCTAGAAAAAAACAGAAAAAAGGAGGATCTGTTGCTTCTGCTTCATCAGCTGGAGCTGTAGGAAG
TGGTGCTCAAGTAGATTCTACTGCACTTGGTCATCCAGCAGATGTTTTAGATATGCCAGTTGATCCTAACGAACCAACT
TATTGCCTATGTCATCAAGTTTCTTATGGGGAAATGATAGGTTGTGATAATCCAGATTGTCCTATAGAGTGGTTCCATT
TTGCATGTGTTC
Encodes excellent match to N-terminus of CG9293; but alignment, along with
vertebrate matches, and confirmed by cDNA LD46333.5prime, shows two and maybe
three introns that need removing.
\***********A. mellifera Contig379
TTGTAATCGAAGTAAATATCGTGAGTTATTCGTTTCATTTTACGGGAGAAAAAAAATTTTCTCCTAACAGTGTCACAAT
GCTACAATCGTTAATCTAAACAAACTATAAAGAAAGGGGAACAAAGATGATGTTTTGCTGTTTGAGAAATTGTTTTGAC
GGCCTTGGCTTTGCCGCAACTCAAACACCGAAGAGAGAACCAAATCCTATATCTTTAGACACGTCTTATATGGGACATG
AAGTTGTGATAGTAAAAAATGGTCTAAGAGTATGCGGTCGTGGTGGTGCCTTAACAAATGCTCCTCTTGTCCAAAATAA
AAGTTATTTTGAAATAAAAATACAACAAGGTGGTATATGGGCTATTGGATTGGCTACAAGATCCACAGATCTCAATATT
ACTATTGGAGGAAATGATAAAGAAAGTTGGGCTCTTAATTATGATTCTATTATAAGGCATAATCAACAGGAAATACATA
AGATTCAAAGTTCGGTTCAAGAAGGAGATATTATAGGCGTATCTTATGATCACATAGAACTTAATTTCTATTTAAATGG
AAAACCAATAGGTGCTCCAGTAATGGGCATAAAAGGAACTGTTTATCCAGTACTTTATGTGGATGATGGGGCCATTCTA
GACTTAATTTTGGATAATTTTATTCATCCTCCACCTACAGGTTTTGAAAAATCATGTTGGAGCAATCATTACTCTAGCA
AAAAATATTATTACATAAGTATCAATCTTTTTATTCCTTGACACTAAAAGCATT
This, and the homologous human and C. elegans proteins, strongly indicate that
there are at least two in-frame ORF introns retained in the annotation of CG7785.
\************A. mellifera Contig400
GAAAAATAAACTCAGGCCTAACAGTGGAAATGGTGCTGATCTGCCTAATTACAGATGGACGCAGACACTTCAGGATTTG
GAGATCAAAGTGCCTTTGAAAGTGAACTTCTCAGCCAGGCCCAAGGACGTGTCGGTGACGATCACGAAAAAACGATTGA
CCTGCGGCATCAAGGGTCAACCGCCGATCATCGACGGTGATTTTCCACACGAAGTCAAAGTCGAAGAATCCACCTGGGT
GATCGAGGATGGAAAAGTGTTGCTTCTCAACCTGGAGAAGGTGAACAAAATGCAATGGTGGGCTCACGTGGTAACCTGC
GATCCGGAGATCAGCACGAAGAAAGTGAACCCCGAGCCGAGCAAGCTTTCCGATCTCGATGGTGAAACTAGAGGCCTGG
TGGAGAAGATGATGTATGACCAGAGACAAAAGGAACTGGGTTTGCCAACGTCCGACGAGCAGAAGAAGCAGGACGTGAT
CAAAAAGTTCATGGAACAGCATCCAGAGATGGATTTCTCCAAGTGCAAGTTCAATTGAAATTCCAATTAGATAGGGGAG
CAGCCGAATATTCAAGGCGTTGTTGTGAAACTGATAATACAAATGATAAAACAGATGATATATTATCGATATATCCAAT
CCAGAAAAGCTCTTTATGATACTCTCTTGTTAATTGTCACCGCGGCGAAGTTTTTCTGCACCTACTTTATCGTATCGTA
TTCCAAAGCGTAAAAGATGGCGCCACGATCCATGCCACGATTGAATCG
This, and vertebrate matches, suggest there is an intron retained in CG9710
\***********A. mellifera Contig58
CCGCGTATTTTTAATACAATTGTTACAATATCGTTTTTTTCATTGTGGAAATAACGAATTTTCAACATGGTGCTAAGTG
AAGTGAATAAATTTCTTCATGAATTAGAAAAAGCTGAATTAGAGGCGCCTGGTGGAGTTGCATCATCACAAACTTATGC
TCAATTATTAGCTGTATATCTTTATCAAAACGATCTATGCAATGCCAAATACTTGTGGAAGCGGATACCAACGGATCTG
AAAAGCGGAAATGCAGAACTTGGTCAAATATGGATGGTAGGACAGCGTATGTGGCAAAGAGACTGGCCTGCAGTTCATG
TCGCCCTCAATGCAGAATGGAGTGAAGATGTTTCTGATATTATGGCTGCTTTGAAAGATAATGTTCGAGAAAGGGCAAT
CACCTTAATATCAAAGGCTTATTCTTCACTAAGTTTAACCGTATTTGCGTCAATGACAGGCTTAACATTAGAGGAAGCG
CGTCGTGTAGCAATTGAAAGGGGTTGGAACGTAGATGGAACGATGGTGCAACCTTGTAAGATTCAGAAAGAAGAGAGTA
ACCTCGTGAACGAGGTGTGTCTTACTGAGGATCAGCTGTACAAACTCACTCAATTCGTGTCTTTCTTGGAAAACTGAAC
AAGCAATGAAAGTCATCAATGATGAAATTCACGCAACACAAAGACACGATCGACAGTACTTTAAACTTAGTT
This and vertebrate matches show that the N-terminus of CG13383 is missing,
and it is there in the DNA separated by an intron.
\************A. mellifera Contig622
AAAAAATTCGTTGAGCGGTGAAAAACCAAAATGAGAAACTCCTAATGGACCAAAGAAGGACGTGCTTATGCTGTTACTG
CTGGGATGACTATGCCTTTGTGATCCAAGACGCACACCAACTCCAGCAAGGTCAGAAAATAAATCTTCAAAAGATGAAC
CGCCAAAGAATTCTCTAAATACTTCTTCAGGATCCCTGAACATGAAAGTACCAGCAAAGTGTGGATCAAAGTCTTCCTT
ATGTCGCCTCTTGCCACCAGGCATTTGAAGTCCTTCCTTTCCATATTGGTCATAAACCCTCCTTTTCTTTTCATCGCTT
AGCACTTCATATGCCTCAGATATTTCTTTAAATCTCTTGTTTGCTTCCTCCAAATTTTCAGGATTTTTATCCGGATGCC
ATCTCAACGCCAATTTTCTATATGCTTTTTTGATATCTCCGCTCGTGGCGGTTCGCTGCACTTCTAGTACCTTGTAATA
GTCAACCATCGTTCACAATATTCACGCTCGGATGTTAGGGCTTAGGTAACCACCTAGAAGTGGCTCCTGCTTCTTCTTC
GCCGTTCACGCTC
CG8448 is annotated for this ortholog as mRNA, but not translated?
\***********A. mellifera Contig78
GAAACATCAACAACTCCAGAATATGCTCAGGGATTAACTTCAATTAATCCACCTGCTACCCCCATTACTCCTGTAGCAT
CTGTACAATCTTATACACCTACTACTCCAAGTGGAGTAGTACCAGTAACAACACCACAAACTCCTACAACACCAAGTAC
ACCTACAAATCCGAGTACAGTCATACCTGTTACGACACCGACAGTTATTACACCTGTGGAAAGTACACCTGTTGGAGTG
CAAACGGTTAGACCAGCCCAAACAGTTACACAAATTCGTATACAAACTACTGCACAACCTGCTAATGCGGCAGCAAATA
CGAGAAAAGGCTTGTCTCTCACGCGAGAACAAATGTTGGAAGCACAAGAAATGTTTAGAACAGCTAATAAAGTAACTCG
ACCAGAGAAAGCTCTTATTCTAGGTTTCATGGCTGGTTCAAGAGATAATCCTTGTCCAAAGTTGGGTAATATAGTCACA
GTAATGCTCTCAGAAAATATAGAAGAAGTGACTCAACCGGATGGTACAACAGTTCCCATGTTGGTTGAGACACATTTCC
AAATGAATTATACAAATGGCGAATGGAAGAGGATAAAGAAAAATCGACGAATTATTACAGAAGAATCAACGTCCACTAC
GACTCCTACTCCCAGTGTGACGGCAACAGCTTCCAATTGAAAAAAAGATAAAAGAAATTTTTATCGAATTGCAATTATA
TAGGTCAATAATTCTGTCATTTTTTGGATGACGGTGTTTTAAAAAACTCCTGTTCTATAGATAGCAAAAGAATTTGAGTG
Indicates that CG5874 needs additional N- and C-terminal sequence, encoded in
the genome; vertebrate matches agree.
\***********A. mellifera Contig974
TTGGATACTCTTTTCATATTCTCGTGCTCACGTGCGTTATGCAAGAATTCATGCTTCTCCAAATTCTGATTGGTTTTTC
TTTCTTGGCAACGGCAATTCCGAAGCCGGAAAATGATCACAAGCCGCGAGTTATTAATAAGGAACCAAACAGTGAAGAA
CATTATGTCAATTCTCAACATAATCCTGCCTATGATCATGAAGTTTTTTTAGGTGAAGAAGCAAAAACTTTTGATCAGC
TTACTCCTGAAGAAAGTACAAGAAGATTAGGAATAATAGTTGATAAAATAGATAAAGATAATGATGGTTATGTTACTGG
AGAAGAACTTAAAGATTGGATATTATATTCTCAACGGCGTTACATACGGAACAATATTGAACATCAATGGAAATCTCAT
AATCCTGAAGAAAAAGAGAAGCTTCCATGGACAGAATACTTAGCAATGGTTTATGGAGATATGGATGAACAGGAAGCAG
AAAATCACGAAAAATCTAAAGATAATACTTTTTCGTATGCTGCTATGCTTAAAAAAGATCGCAGACGTTGGACAGCTGC
AGATTTAGATGGTGATGATGCTCTTACAAAAGAAGAGTTTGCTGCTTTCCTTCATGTAGAGGAAGCTGATCATACAAAA
GATATTGTAGTATTAGAAACCATGGAAGATATTGATAAAGATGGTGATGGAAAAATATCTCTTTCAGAATATATTGGTG
ATGTATATGA
This and several Drosophila ESTs suggest there are problems with the scf gene
annotation.
\***********A. mellifera BB260012B10B5.F
GGGTTCACTGGTGGGCCTGGGTGGTGCTAAATCAGGTGGTAATACCCCGATGAACCCATCGCTACAACAGCGGATCAAC
TTCCTCCAAAGTCATCTGAGCCAAGCACCAATGCCTTCCGTTGCTACCAAGAGGCGGCAACTGCCGTCTATAGAAGAGG
CTTGGAACTTACCCATTAGTGCTGAGATGTCTAGTAGACAGCAACAACAGCAACAAACACCCACTGGTCCGGGTTATAA
ATATGGTTCCACTCCTTCTGGACCACCACCTCCTTATCCTCAAGGACAAGGGCAGAATCTAAATACAAAAAGATTTAAG
CCGGGAGAAGAACCAATTTCTCCAGGTTCACAACAGAGACCACCACCATTTTATCTCACGTCTCAACAACTGCAGATGT
TACAGTTTCTTCAACAAAATCATGGAAGTTTAACGCAACAGCAGCAAGGTTTGCTTGCACAATTACAACAACAATACAG
ATGTATGCAACAACATCAACAACAAATTAGATTACAACAGCAACAAGCTGCTCAAAGAGGTTTAAGGCCAGGACAACCT
GGTTATCCTACAGGTTACAATCATTCACAACTAGGACAACCTGGCGTGATCAAGAATTACGGGATACCTCAGCAACCGT
TGCAACAAGGTGGAACTGTTGCTTTACAAACAGGATTCTCAGATTCTAATGTCGGTTATAACACGGCAGCAACTGGGAA
CAGTCAAAC
This, mammal matches, and cDNA HL02950.5prime all seem to suggest there is a
region missing from CG5640
\**********A. mellifera BB260013A20E9.F
GAAACCTAGTGCTATGATAGTTTTTTCATTTATACTTTTATCATATTTTCTAGTAACTGGAGGTATAATATATGATGTA
ATTGTGGAACCACCTAGTGTAGGCTCAACAACAGATGAACATGGCCATACAAGACCTGTAGCATTTATGCCGTATCGAG
TAAATGGGCAATATATTATGGAAGGATTGGCATCTAGTTTCCTTTTTACATTAGGTGGAATTGGTTTTATAGTATTAGA
TCAAACACATAATCCATCAACACCTAAGCTTAATAGAATTCTTTTAATATGTGTTGGATTTATTAGTGTTATTGTCTCA
TTTATTACCTGTTGGGTTTTTATGAGAATGAAACTACCGTAAGATTTATATATATATATATATATTATATATTGTAATT
TATAAAAGAACAAATAACACATTAATTAATTTATAAATACATTTTTTATATTTCTTTATACATTTAAATGAGACTTATT
TACATACTTTTTTGTAATATACATATATAATATAAATACATAAGAAAAATAAAAAAAAGACTAGATTTTAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAGCAAC
This, vertebrate matches, and a Bombyx mori EST indicate that there is an
in-frame open reading intron in the annotation for CG9662 that should not be
there; that is, it should be coding.
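A note like this one can be sanity-checked mechanically: an annotated intron that is really coding should have a length divisible by three and no stop codon in frame. The following is only a minimal Python sketch under stated assumptions: the function name could_be_coding and the toy sequences are hypothetical, the standard genetic code is assumed, and the intron is assumed to start on a codon boundary. It is not FlyBase tooling and uses no CG9662 sequence.

```python
# Stop codons of the standard genetic code (assumption: nuclear code applies).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def could_be_coding(intron: str) -> bool:
    """Return True if the segment could be read through as coding sequence.

    Assumes the segment begins on a codon boundary (phase 0).
    """
    seq = intron.upper()
    # A read-through segment must keep the downstream exon in frame.
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    print(could_be_coding("gtaaagcttttg"))  # in frame, no stop codon
    print(could_be_coding("gtatgactt"))     # in-frame TGA stop
    print(could_be_coding("gtaag"))         # length not a multiple of 3
```

A segment that passes this check is only a candidate; agreement with ortholog and EST alignments, as described in the note, is still the real evidence.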
\**********A. mellifera BB260014B10A5.F
AGCTTGCGGCCGCGTTGCTTTTTTTTTTTTTTTTTTAATCGTCTCAATACGAAAGACCACGCGGATGTGTCGTTAGTGT
GCACGTGTTCTTTACGCTCGCGCAGTGTTTCGTAAAGAGAAGAGGCCAAGATACATACGTACGCGGAACTGCAGCTGAA
AGGGAGGAAAAGAGAAGAAACCCAACGAAGAAGAAAAGGAAGAAGTAGAAGATGAGCAAGACGTGCGCCCGCTGCGAGA
AAACGGTCTACCCGATCGAGGAGCTCAAGTGCCTCGACAAGATATGGCACAAACAGTGCTTCAAGTGTCAGGGCTGCGG
CATGATCCTGAACATGCGGACGTACAAGGGTTTCAATAAACAGCCATACTGCGAGGCGCACATACCAAAGGTGAAAGCC
ACCACAATGGCCGAGACGCCGGAACTGAAACGCATCGCGGAGAACACGAAGATTCAGAGCAACGTGAAATACCACGCCG
AATTCGAGAAGGCGAAAGGCAAGTTCACCCAGGTTGCGGACGATCCAGAGACACTGAGAATCAAGCAGAACAGCAAAAT
TATCTCGAACGTTGCCTATCACGGTGAACTTCAAAAGAAGGCCATCATGGAGCAGAAAAGGACGATGACTGGTGAAAAT
GGCGAACAGATCGTGACGAATCCACCGACCAGAAAGATTGGTTCTGTAGC
This and vertebrate and C. elegans matches, and several Drosophila ESTs, show
that the N-terminus needs revision, including a 5' exon about 16kb upstream in
genome; plus current annotation is twice as big as the other matches.
\**********A. mellifera BB260018B20E5.F
TAAAAATGCCAAGGGGAAAATGGTATTGTTCTAATTGCCACAGTAAACAACCAAAGAAGAGAAATAGTAGTCGAAGGAG
TCATACCAAAGGGGGAGGCACCAGAGAAAGTGAAAGTTCTGATCATCCACCAGCTAGTCCAACGCCGTCAACGGCATCG
AACACACACGTAGAGGACGTCAGTTCATCGGAACCAGCAACCCCAACTGCCTCACCACGGAAGGAGGGAAACAATAGGA
CGCTCACGAAGAAACAACAACGAGAGTTGGCTCCTTGTAAGGTGCTACTCGAACAGTTGGAGCAACAGGACGAGGCCTG
GCCGTTCCTCTTGCCGGTGAACACCAAACAGTTTCCTACCTACAAGAAAATTATTAAAACACCCATGGATCTCAGTACT
ATTAAGAAGAAATTGCAGGATTCCGTGTACAAGTCTCGCGATGAGTTTTGCGCCGATGTCAGACAGATGTTCATCAACT
GCGAGGTATTCAACGAGGACGACAGTCCCGTGGGCAAGGCCGGACATGGGATGCGCAGTTTCTTCGAAATGCGTTGGAC
CGAGATTACTGGGCACCACCTCCACACCCGCAACGCATAGCTGAGGCTCGGTTCGCTCTCGCAACACCTGCAACAGTTT
TGCCCACTTCTACTACTAGAGGGTAAAACCCTCGAGAG
Ortholog is CG10897, which is annotated as mRNA but not translated; two
Drosophila ESTs show it well.
\**********A. mellifera BB260019A10H3.F
AGATTCTCGGGAAAGCCTTGAAGTAGCGATACAATGTTTAGAAAGTGCTTATAATGTACAAGCATCAGATACTCCAACA
AATTTTAACTTATATGAAGTTTATAAGTCTTCCGTAGAAAATGCAAAACCTTATTTAGCTCCAGAAGCTACTCCAGAAG
CAAAAGCTGAAGCTGAAAGATTAAAAAATGAAGGAAATACTCTCATGAAGGCTGAAAAACATCATGAAGCTCTTGCCAA
TTATACAAAAGCAATTCAATTAGATGGTCGTAATGCTGTGTATTATTGTAACCGTGCTGCAGCATATAGTAAAATTGGC
AATTATCAACAAGCAATTAATGATTGTCATACTGCATTGTCCATTGATCCCTCATACAGTAAAGCATATGGACGTTTAG
GTTTAGCATATTCCAGCTTGCAAAGACATAAAGAGGCTAAAGAAAGCTATCAAAAAGCTTTAGAAATGGAACCTGACAA
TGAAAGTTATAAAAATAATTTACAAGTAGCAGAAGAAAAATTAGCTCAGCCAAGCATGAGTAATATGGGATTAGGGGGA
AGTGCATTACCAGGCATGGATCTTAGTTCACTCTTGAGTAATCCTGCTCTTATGAACATGGCTCGTCAAATGTTATCCA
ATCCAGCTCTACAAAATATGGTGAGCAATTTTATGAGTGGACAAGTTGAACAGGGAGGACATATGGATGCTCTTATAGA
AGCTGGTCAACATTTTGCACGA
Human match is much better than Drosophila CG5094, but this may be because
Drosophila has several large insertions that might be unspliced introns; these
are also not present in B. mori ESTs!
\**********A. mellifera BB260019A20B12.F
GTTGCTTTTTTTTTTTTTTTGTGAACCTGTGAGAATAGCGTTGCTTTCTCATTCGTGCCAAAGAGAATTCTGCCTGTCT
TGTGAGCTAGGATTCCTGTTTCACATGTTGGATACATCTCGAGGATTGCCGTGTCAAGCTGCTAATTTTCTTCGAGCTT
TTAGAACAGTACCTGAAGCGGCAGCTTTGGGACTTATACTCAGTGATCTCCATCCGGAGGCGAAAAGGAAAACAAATTT
GGTACGATTAATACAGAGTTGGAACAGATTTATATTGCACCAGATTCATTATGAAGTTTTGGAAACAAGAAAACGACAG
AAAGAGGAAGAAGAAGCTGCTCGATTAAAATCAGGACCAAAATGTCCACCGT
Together with a new Drosophila EST shows there is an exon missing from CG8232
\**********A. mellifera BB260019A20D5.F
AAAAAAGGAATTGAACTTCTTCGTATTCAATTTCCGATGTTCTGATTTAAAGATCAACTATAATGTTAACAGTGTCATT
AATTTCATTTAAACGTGTGTGATCAAGTTAATTAAAAATAAAGTGTAAATAAACAACAAAATTCTTTGAAATATTTTAA
GAGAGTACGAATGTTTTATTCGCTTTGATAACGCTACATTGCTTTGTCGTTTTTAACCTAAATCGAGATGGCTGATTCA
GAACAAGATTTCGGAGATCGTGGAGATAATGACAATTTAAAAACTGATAAATTATTTATCTTAAAGAAATGGAATGCTG
TAGCTATGTGGAGTTGGGATGTGGAATGTGACACTTGTGCAATTTGTCGAGTTCAAGTAATGGATGCATGTCTTCGATG
TCAAGCGGAGAGCAAAAAAGATGATAGCCGACAAGACTGTGTCGTCGTCTGGGGAGAATGCAATCATTCATTTCATTAT
TGTTGCATGTCACTTTGGGTGCAACAGAATAATCGTTGTCCATTATGCCAGCAAGAATGGTCCATTCAACGAATGGGAA
AATAACTAAATCAATCAAGCGAAACTTCAATACTTATTTTGTTTCCTTTTTGTTTCGTTATTTTCTGCTTATTTTTCCT
TCTTTCATTTCTTTCTCCCTTCTCTCTCACATATACACACATATGCACACGCACATACACATACACTCTCTCATTCACT
CACTAAGTGG
NEW GENE - There is a Drosophila testis EST that has an excellent match, but
the gene is unannotated.
\***********A. mellifera BB260021A20F4.F
GCGTTGCTTTTTTGTCTGCTCTTGATAAGATGGTTTCAGATAATATACAAGATAGAATGCGAGATTCAGTAAAACCACA
ACAAGTAGATATTTCAGTTCCTTTACATGTAAAAAGTACTAAAAAAACATATGAACAATTGCAAGAAAGACCTTCTGAT
AATAGTACAGTTGATTTTGTACTTATGTTGAGAAAAGGTAACAAGCAACAATATAAAAATTTAGCAGTTCCAGTATCAT
CAGAATTAGCAATGAATCTTCGAAACAGAGAACAAGAACAGAAAGAAGAAAAAGAACGAGTTAAAAGATTGACATTAAA
TATTACAGAAAGACAAGAGGAAGAAGATTATCAAGAAACAATTAATCAGAGTACCAAGCCAGTAACGGTAAACTTGAAT
AGAGAACGGCGACAAAAATATAATCATCCCAAAGGTGCACCAGATGCCGATCTTATTTTTGGTCCTAAAAAAATACGGT
AGATTTAATATTTTAATGTTTTTGGGACAAATTAATACTTTCCTATTAGAAAATCACAAGTGATCTAAGTTATGGACTA
TTTGAAGTCCATATTTTTGTGTAGAATTTATAACAAATTAATAATTTTTATTTTTAATTTAGAATACATATCTACATAT
ACATTATTTAAACGAATTTTCAACCATACTATATGATTTTTGGTGTAAAAATACATTTTACAATTATATT
Comparisons with this and human homologs strongly suggest there is an extra
intron near the C-terminus of this protein
\**********A. mellifera BB260021B10D9.F
GAATTTTCAACATGGTGCTAAGTGAAGTGAATAAATTTCTTCATGAATTAGAAAAAGCTGAATTAGAGGCGCCTGGTGG
AGTTGCATCATCACAAACTTATGCTCAATTATTAGCTGTATATCTTTATCAAAACGATCTACCTGAACGAGAGAGgttt
cgatctgtctttgtttgatcgaaaaatcgccgtcgcagATGCAATGCCAAATACTTGTGGAAGCGGATACCAACGGATC
TGAAAAGCGGAAATGCAGAACTTGGTCAAATATGGATGGTAGGACAGCGTATGTGGCAAAGAGACTGGCCTGCAGTTCA
TGTCGCCCTCAATGCAGAATGGAGTGAAGATGTTTCTGATATTATGGCTGCTTTGAAAGATAATGTTCGAGAAAGGGCA
ATCACCTTAATATCAAAGGCTTATTCTTCACTAAGTTTAACCGTATTTGCGTCAATGACAGGCTTAACATTAGAGGAAG
CGCGTCGTGTAGCAATTGAAAGGGGTTGGAACGTAGATGGAACGATGGTGCAACCTTGTAAGATTCAGAAAGAAGAGAG
TAACCTCGTGAACGAGGTGTGTCTTACTGAGGATCAGCTGTACAAACTCACTCAATTCGTGTCTTTCTTGGAAAACTGA
ACAAGCAATGAAAGTCATCAATGATGAAATTCACGCAACACAAAGACACGATCGACAGTACTTTAAACTTAGTTNTACC
TCGTAATCATATTATATAAAGGAGGAAGTGTGCTTCAAAAGAGACACCTGTGTCCCTTNCCCTAGTATACACCCCGGAC
TACCTACGCGTNCTGCTCATTCAATCGGAAGACTCGCGATGAAGCT
Comparisons with human COP9 homolog show that the N-terminus is missing from
the Drosophila homolog annotation, CG13383, and there is one available in the
genomic sequences.
\**********A. mellifera BB260021B20G3.F
ATTTCCACAAAATTTATTAGCTTTGGCACCATTGCTACGTACATTGGATTTATCGGAAAATGAATTTGTTCATATTCCC
GATAATATTGGTAATTTTACGTTATTAAAGCTATTGAATGTTAATCATAACAAATTGACAACTTTACCCGAAGCACTTG
GAGCATTGACAAAATTAGAATGTTTAAATGCAAGTTCGAATCAAATAAAAACTATCCCATGGTCATTGTCAAAACTAAC
ACGATTGAAACAAGTCAACTTATCTGATAATCGTATAACCGAATTTCCTCCTATGTTTTGTGATTTAAAATTTCTGGAT
GTGTTAGATTTATCGAAGAATCGAATTACGACAATCCCTGATGCGGCTGGAGCGTTACATATAGTTGAACTTAACCTCA
ATCAAAATCAGATATCAACTATATCTGAGAAATTGGCGGAATGTTCGCGCCTAAAAACATTAAGACTTGAAGAAAATTG
TTTACAACTGAATGCAATACCTAGTAAAATTTTGAAAAATTCTAAAATTTCAGTCCTGTCTGTTGAAGGAAATTTATTT
GAGATGAAACAATTTGCTAATCTTGATGGTTATGATAACTATATGGAAAGATATACCGCTGTAAAGAAAAAACTCTTTT
AAGAGATATTTTAAATGAATATTTATTATTGAATCTATTATAGGTAATTATTATACATATATAATTATTTTATAATATT
GAAAAAGATGCCGCATCGTGTTCGCGTAATTATACTCGATACCTGCGATAATATAATATAGAAATAAATTTTCAATTAA
TTGATAAT
This and vertebrate matches and cDNA GM01152 indicate that major reannotation
of CG3040 is needed.
\***********A. mellifera BB260024A20E12.F
AGAAAAATCAGATGTTTGTTTAATTGGTCTACATGCTTGTGGAGATCTTAGTATACATGCATCAAAAATATTTCGAGAT
ATGAAAATAGCACGTATTTTTATTTTAATTCCTTGTTGTTATCATAAGCTTTCAATATCAAAAAGGATAAGAATAAATA
CATCAAGTGAAAAGCAATACTTTAATAATTTTCCTTTATCTAATTGTTTTAAAACTATTATTAATAATACTAATTTTGA
TATTGGTACTTTTTTGAGGCAACCTTTTTTACGACTAGCATGTCAAGAACCAGTAGATAGATGGTATAACATGTCTATT
GAAACACATAATAAACATTCTTTTTATGTTCTTGCAAGAGCTGTCCTTCAATTGTATGCAACTAAAAATGGATTTTCTC
TTAAGAAATGTACTCAAAAAGGAACAAGAAAATCACAATGTTTAAATTTTGAAACATATATTAAAGATGCATTGACTAG
GTACATTTTACAACCACAAGAAAAAGAAACATTCAAAAAACAAGATGTAGAATTTAATCTTGATACACATAAAAGAAAT
ATAATAGAATTATGGAAAAATCATTGTGATAAATTTAAAATTGTAGAAATATATACTGGTTTACAACTGATGTTGCAAG
CACCAGCAGAATCACTTGTTTTACAAGACAGATTATGTTGGATGGAAGAACAAGGTGGATTTCCTAATGATTGTCTGGA
GTTTGTTGCTGAATTCCAG
This and a B. mori EST and the human, Arabidopsis and C. elegans orthologs
indicate that the annotated N-terminus of CG8447 is incorrect; it should be
another +200aa.
\***********A. mellifera BB270001B10B9.F
ATCGCGATATAAAGGTGTTCCTGTAGAATATCTCGTGAGTTACATTAACCGAGTCATTCTGTCATCTGTTTCAAAGAAG
AGAGAGGTCGGTGGCTCTGTGTATTAGTGTACAATCGCGCCATGGGGCATCAGTGTTGGTTCTTCACAAACGACGGGAC
CTGAATGTGTGCATATTTCTCCTGGCTGACTGGCCTGCCTTCCACCGTTAAGCTGCATTCACAAAGGAAGCCGATTTTT
GGGCAAGGACCGTATCTCGGTCGTTCTTGCCAAGAGTACGAATCGAACCGGTTAATGAGGACTATCTGGCTTCGATCGT
CCAATCAGCTAGCAGCTAGACTGGACGAGCATAATCGGATGAAGTTGGGCCAAGCATGCCCTCTAAGAAGCAATATAAT
CTCGTACATAATGACGAGTACGACACGAGGATACCACTGCACAGTGAAGAGGCATTCCACCGTGGAATTGTCTTCCATG
CCAAGTTCATCGGCTCTATGGAGGTTCCTCGACCGACCAGCCGAGTGGAGATCGTGGCGGCGATGCGAAGAATCCGCTA
CGAGTTCAAGGCCAAAGGGATCAAAAAGAAGAAAGTGACGCTGGAGGTATCCGTGGACGGGTTGAAAGTCACTCTTCGA
AAGAAGAAGAAGAAGCAACAGCAGTGGATGGACGAGAATAA
This EST and its mammalian orthologs suggest major problems with annotation of
CG17357 and CG3179
\***********A. mellifera BB270004A10C3.F
GTTGCTTTTTTGCATCTTCATAGGATTGTGGATGTGCATTCCATTCGCTTGGACCAATCCAAAGGTGCAATCTCTCAAG
TCTATGGAGGTGGATTGGATCGGTGAAGTGAAGCCTGGGGAATACTGGTCTTACGTGGATTATGGTCTTTTGTTGATAT
TCGGTGGTATCCCTTGGCAAGTATATTTCCAACGTGTCCTGTCCTCGAAAACTGCTGGAAGAGCGCAAGTGTTGAGCTA
CGTAGCCGCGATAGGGTGCATTATCATGGCCATACCACCTGTCCTGATCGGTGCACT
This EST and vertebrate matches show problems with the C-terminus of CG7708
\***********A. mellifera BB270007B20H3.F
GTTGCTTTTTTTTTTTTTTTTTTTATTAAAACATATAATTCATAGTAATTAATCAATAGATTCATATTGCAGCGTGAAA
CATGTCGAATAAATTTGAAGCGTTTGCAAGTGTGGAGCAGTTTTGGAGTCTTTACAGTCATTTAGTCCGGCCATCAGAA
TTAACAACATCTACAGATTTTCATCTTTTCAAAGTTGGCATAAAACCAATGTGGGAAGATGAGGCAAATCAAAAAGGTG
GTAAATGGATAGTACGATTAAGAAAAGGTTTAGTTTCTAGATGTTGGGAAAATCTTATATTAGCTATGTTAGGAGAACA
ATTTATGGTTGGAGAAGAGATATGTGGAGCTGTTGTATCTATAAGGTTTCAAGAGGATATAATATGTGTATGGAATAAG
ACTGCATCTGATTATGCAACAACAGCACGTATTAGAGATACATTAAGGAGAGTTTTACATCTTCCAGCAAGTGCCTCAA
TGGAATACAAAACTCATAATGAAAGTTTAAAGAATGTTCATCGGCTCTAAAATCTTGTGATGTCAACTCAAAGATTTGA
ATTCTTTATGGATTTTCAGCCAATTGATACTTGTT
This unspliced EST and its vertebrate matches indicate there is a problem
with the C-terminus of CG10716; the appropriate sequence is available in the
genome sequences.
\**********A. mellifera BB270021A20F1.F
ATTATTGTGAATATTGTGATAGATCGTTTAAGGATGATCCGGAAGCCAGAAAAAAACATCTTTCAAGTTTGCAACATGC
GAAAAATCGTGCAGATCATTATAATATGTTCAAAGATCCAGAAATTATTTTAAGGGAAGAATCTACAAAGATACCATGT
AAATGGTATTTAACTAATGGTGAATGTGCATTTGGCCTTGGTTGCAGATATTCCCATTATACTCCTCCTATGATATGGG
AACTTCAACGTCTTGTTGCTATGAAAAATCAATCAAAGTTGAATATAAATCTCGAAAATGGCTGGCCAAATCCTGACGA
TATAATTAAAGAATATTTTGAGAATAATACGAGCACAAGCACTACAGATGATTTTACGTATCCAAATTGGCACAGACCA
TCGGAGCTACATGATTATTCTATGCTATCACCATCGTTATGGCCTATTACGCCTGAAAGTTTAGCAAATACTACAAGAT
TCGAAGAATGGGGTTAAAAATGTAAGGAATATAATGTTCCTTGTTAAATTAAATGAAAATACATATATTTAAGACTCGT
CCAAAAGAGTAAATAATATATGATATATAAACAATTTAAATATATAGTTTTTCAAAAAACTGAATATTTTGTTAATAAA
TGAACATATAAAATTGGGAAATGTGTTCACTTTCTTAAATAGGAAATATTATTGTTAATATTATATTGTAATAATATTA
TT
Bee translation Y C E Y C D R S F K D D
P E A R K K H L S S L Q H A K N R A D H Y N M F K D
P E I I L R E E S T K I P C K W Y L T N G E C A F G
L G C R Y S H Y T P P M I W E L Q R L V A M K N Q S
K L N I N L E N G W P N P D D I I K E Y F E N N T S
T S T T D D F T Y P N W H R P S E L H D Y S M L S P
S L W P I T P E S L A N T T R F E E W G Z
NEW GENE - completely unannotated - between CG5105 and CG5118 - also
mammalian matches
AE003587
ATGGGTGGCAAAAGTTATTATTGCGACTACTGCTGTTGCTTTCTGAAAAACGATCTGAATGTGAGGAAATTGCACAATG
GTGGTATTGCACACGCAATTGCAAAGAGCAACTATTTGAAGCGTTACGAGGgtaaagcttttgttgcacgatattcccg
aaaaagctgaattttcaatatttgtagATCCCAAAAAGATTTTGACTGAAGAGCGGCAGAAAACTCCTTGCAAGCGATA
CTTTGGCAGTTACTGCAAGTTTGAAACATATTGCAAGTTTACCCACTATAGTGGCGATAATCTACGGGAACTGGAGAAG
TTGGgtgcgtgataaaacgcagatttaaaacgaaaaataacttcttcttgaaaactttcagTTCTCGCTAGAAAGAAGA
GAAAATCCCGAAAGAAAACCAACAAATGCAAGAGATGGCCCTGGAAAACTCATCTGCGAAAGGGATTACCCCCTTCCTT
GCAACCCATTAACCCGGAAAAACTCAAGCAAACCGACTTTGAACTCAGTTGGGGCTAAATATATTTACAGAATGCACAC
TTTATGTCAAGTATTCAGTATGCTAATTACTTTCTCCATGACGCGCGCCACTTTCAGGTTG
translation M G G K S Y Y C D Y C C C F L K N D L N
V R K L H N G G I A H A I A K S N Y L K R Y E
---------------------------1---------------------------D P K K I L T E
E R Q K T P C K R Y F G S Y C K F E T Y C K F T H Y
S G D N L R E L E K L
-------------------------------1-------------------------V L A R K K R
K S R K K T N K C K R W P W K T H L R K G L P P S L
Q P I N P E K L K Q T D F E L S W G Z
Nice small gene with two short introns
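The genomic excerpt above marks introns in lowercase and exons in uppercase, and the translation writes the stop codon as Z. As a rough check of that translation, the Python sketch below removes lowercase bases and translates the rest with the standard genetic code. The helper names splice and translate are hypothetical, and the fragment spans only the first intron of the excerpt.

```python
# Standard genetic code, built from the conventional TCAG-ordered string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def splice(genomic: str) -> str:
    """Keep uppercase (exon) bases and drop lowercase (intron) bases."""
    return "".join(ch for ch in genomic if ch.isupper())


def translate(cds: str, stop: str = "Z") -> str:
    """Translate in frame 1, writing the stop codon as `stop` and halting."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for codons with N etc.
        if aa == "*":
            protein.append(stop)
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Fragment spanning the first intron of the AE003587 excerpt.
    fragment = (
        "CGTTACGAGG"
        "gtaaagcttttgttgcacgatattcccgaaaaagctgaattttcaatatttgtag"
        "ATCCCAAAAAGATT"
    )
    print(translate(splice(fragment)))  # RYEDPKKI
    print(translate("TGGGGCTAA"))       # WGZ
```

The first result agrees with the "R Y E ... D P K K I" stretch of the translation, where the GAT codon for D is split by the intron.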
Fly
MGGKSYYCDYCCCFLKNDLNVRKLHNGGIAHAIAKSNYLKRYEDPKKILTEERQKTPCKRYFGSYCKFETYCKFTHYSG
DNLRELEKLVLARKKRKSRKKTNKCKRWP------------------------------WKTHLRKGLPPSLQPINPEK
LKQTDFELSWGZ
Bee
      YCEYCDRSFKDDPEARKKHLSSLQHAKNRADHYNMFKDPEIILREESTKIPCKWYL-TNGECAFGLGCRYSHY
TPPMIWELQRLVAMKNQSKLNINLENGWPNPDDIIKEYFENNTSTSTTDDFTYPNWHRPSELHDYSMLSPSLWPITPES
LANTTRFEEWGZ
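One rough way to put a number on a pairwise comparison like the Fly/Bee one is percent identity over the columns where both rows carry a residue. The Python sketch below assumes two rows of equal length that are already aligned, with "-" as the gap character. The function name percent_identity is hypothetical, and the pairing of the two short fragments (taken from the Fly and Bee rows) is an illustrative assumption, not a statement about the author's intended alignment.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns without a gap in either row."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns where either row has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    fly = "PCKRYFG"
    bee = "PCKWYL-"
    # 4 identical residues over 6 gap-free columns
    print(round(percent_identity(fly, bee), 1))  # 66.7
```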
\**********A. mellifera BB270024A20E1.F
AACAGGAAAGGCTTGACGCCGCTCTACTACAGCGTCATCTACAAAACCGATCCGATGCTGTGCGAAACGTTGCTCCACG
ACCACGCGACGATAGGCGCCCAGGATTTGCAGGGATGGCAGGAAGTGCATCAGGCCTGCCGCAACAACCTGGTCCAACA
CCTGGATCACTTGCTCTTTTACGGTGCCGACATGAACGCGCGTAACGCGTCCGGCAACACGCCGTTGCACGTATGCGCT
GTGAACAACACGGACTCGTCGTGCATACGCCAGTTGCTGTTCAGAGGCGCGCAGAAGGACAGCCTGAATTACGCGAACC
AGACTCCCTACCAGGTGGCGGTGATCGCTGGGAACATGGAGCTGGCCGAGGTCATTAAGAATTATCAGCCGGAAGAAGT
TGTACCGTTTAAAGGGCCGCCACGCTACAACCCGAAACGGCGCTCGGTGGCGTTCGGAGGCACGTCGACGATGACCACT
AGCTGCTCGGCCAGCAACTTGGGCACCCTGACCAGGATACCGTCCGCCGAACACCAACACGCCTCTGGCGGAACAGGAG
GAGGAGGAGGAGGAGGGGGAGGTGGTAGCCTGACCAGAACGATCTCAGTGGAGCAGTACGCGGTGACGAGAGTACCGTC
CGCCGAGCAATACGCGACCGGCAACCTGACCAGAGTACCGTCCACGGAACAATACGCCAGCCCTATCGCGACCGCGAC
This EST and its vertebrate matches suggest there is a major unannotated
region upstream of CG8122 that should be added to it.
\**********A. mellifera BB270025A20A3.F
AGTTCTAGATCTTGCCACTGAAACTGCTACTGCTGTAAGAGAAACAAGTAGAAGTGCTCATCGTACGATACCAAAACGC
GATAGACCTCCTCGTGTGGCAAGTGGTTCTGCTGGTCTATTACCACCCTATAATCGCCAACAAGCAGAGGGCCAAGAAT
TTCTTTATATAATAAATGAACATAATTATTCAGAATTATTTGTGGCATATGAGTGTTTACGTAGTGGAACGGAGAATCT
AAGAATTCTTGTTTCTAATGAAAGAGTTCGAGTGATTTCCGGAGGTACCAAAGGAGTTGTAACCGAAGTCAGTCTAGCG
GACTTATTATATTGTCAACCAATGCATAAGCTAGAAAGTAATGGTGTTACTTTATACTATATTGAATTAATATCTAGAT
CAGATTCAACGATAACCGTTAACATGGACGGTCCAGAACTTCTAAGAAGACCTAAAGTTCGATGTGACAATGAAGAAGT
AGCCAAAAGAGTATCGCAGCAAATTAATTACGCTAAAGGAATGCACGAGGAACGTAGCTTGACTCTTTCTTCTTCGGAT
AATATGTTAGATGATGTACAGTACTATAAGTAGTTACAAACAATCATATATGAAAATTTATTTTGTATTTGACAAAAGT
TTGGAATCACTATGTTTTTACAAAAAATTTTTATGGGAAATTAATGCATTAAAATATTTTCATTTCAATGTTAATTCC
This EST plus its human match show that the C-terminus of CG11003 is
incorrectly annotated.
\**********A. mellifera BB270031A10B1.F
GAAGACAGAAAGCTGTTCGTGGGAATGCTCAGCAAGCAACAAACAGAAGACGATGTCAGACAGTTGTTCACTGCCTTTG
GCACAATAGAGGAGTGTACCATCCTCCGAGGACCTGACGGCAGTAGCAGAGGCTGTGCATTCGTAAAACTTTCATCGCA
TCAAGAAGCGCTAGCGGCGATCAATACCTTACACGGTAGCCAAACTATGCCGGGTGCGTCATCCAGCTTAGTGGTAAAG
TTCGCAGATACTGAGAAAGAGAGACAACTAAGACGCATGCAGCAAATGGCCGGGAACATGAGCCTCCTTAACCCTTTCA
ACGTCTTCAATCAGTTCGGCGCTTACGGCGCTTACGCTCAGCAGCAAGCAGCCCTGATGGCTGCGGCAACGGCACAAGG
GACGTATATCAATCCAATGGCGGCATTGGCACACGTTGGCGCTGGCCAACTGCCGCACGCGTTGAACGGCATGCCAAAC
CCCGTCGTTCCACCGACTTCCGGTTTGCTCGTAGGTACCGGTACAGGGCAGCCTGTTAACGGGGCGATACCGTCGTTAC
CCA
This EST suggests that CG12478 connects to CG10046.
\**********A. mellifera BB270032A10D3.F
TCGTCCACTAACGCGGGAGGAGCGGGTGCACCCAGCGAAGACAATACGCAGATATTGGTGATGAACAATTACTTCGGTA
TCGGCCTCGATGCCGATCTTTGTTTAGACTTTCACAACGCCAGGGAAGAAAATCCGAATAAATTTAAAAGCAGATTGCG
TAACAAAGGGGTGTACGTAACCATGGGTTTGCGAAAAATGGTAAAACGGAAACCGTGCAAAGATTTGCACAAAGAGATA
CGACTGGAAGTGGACGGGAGACTCGTCGAATTACCTCAAGTCGAAGGAATAATTATTCTAAACATTTTAAGTTGGGGTT
CTGGGGCAAATCCTTGGGGACCAGACATCAAGGAAGACCACTTTCAAACACCGAATCACGGGGATGGGATGTGGGAAAG
TTGGCGAAGTCACGGTGTTTGCATCTTTGGACAAATCCAATCTGGTCTCCGTACAGCGATGAGGATAGCACAGGGTGGA
CATATAAAAATTCATTTGTACTCCGACATACCAGTGCAAGTAGACGGAGAACCATGGATCCAGAGTCCGGGGGATATCG
TAGTTCTGAAATCGGCACTGACGGCCACCATGTTGAAGAGCATAAGATCAAGCGTCGGAATACCGAACCTTCGATTCCA
CCCGCTAATGGGGTGGGGGGCAAGAGCTCGGACGAGTGTCGCGACGAAAGCTTGATGCCGCAGTTTCCGCGCTA
This EST and its nematode and human matches show that there is a long
C-terminus to CG5875; and there is a Drosophila EST.
\**********A. mellifera BB270032B10E8.F
TTCAATAACACCATCAACAGAAAACTCAATTAGTCCAGAACCTGAGATTAAACCATTGACAGACATTAATATTAATCTT
CATGATATTAAACCAGGTATTAATCCACCTATAACAGTAATTGAAGAAAAAAATGGTATATCCGTAGTGCTTCATTTTG
CTCGAGATAATCCAAGAAAAGATGTATTTGTTGTAGTGATTACAACAATGAGTAAGAATTTGAAGCCACTTACTAATTA
CTTGTTTCAAGCAGTGGTGCCAAAAAGATGTAAATGTAGACTTCAGCCACCTTCTGGAACAGAATTGCCTGGTCATAAT
CCATTTTTACCTCCATCTGCAATTACTCAAATTATGTTAATTGCAAATCCTACCAAGGAAACGGTATCGTTAAAATTTA
TGTTAAGTTATACTATGGATGATGAAACTTTCACAGAAATGGGTGAAGTAGAAAAATTGCCTTTAGTTTAAAGTACTTA
AAGTGTAATTATAATATAAATTAAATTCAAGAATTACAGACTCTAATAGCCAAAAGAAGAATATATTGTTTTTTAAGAT
ATGAATTTTCAAAAGATATTGTTTCTTTTACTTTATTTCTCAGATTGTAATTACAGTTTTATCTTTATTATTATATTCT
GAGATATATTAATGTTATGTAGATATTTATAAGTCATGTTGTGATTACATAACGAAAATATCAATAACAATCTATTTTA
AGATCGA
This EST and mammalian matches indicate the C-terminus of CG3002 is not
annotated; it is there in the genome.
\------------------------------------------------------------------------------
--
    Language of Publication
    English