Subject: Drosophila odorant binding protein annotations

Dear Dr. Smith,

Here are the files via our technician's e-mail account.

Sincerely,
Laurie Graham

----

Hello,

We amplified cDNAs corresponding to 21 of the 38 odorant binding protein (OBP) homologues in Drosophila. This work is in press (Graham L.A. and Davies P.L. 2002. The odorant-binding proteins of Drosophila melanogaster: Annotation and characterization of a divergent gene family. Gene). I have given the coding sequences below, with notes describing how our annotation compares with that done on the whole genome. Comparisons are also made with the annotations given by Galindo & Smith (Galindo K, Smith DP (2001) A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159(3):1059-72) or by Hugh Robertson via annotation updates.

Our cDNAs were amplified from Canton-S, so between 0 and 8 polymorphisms were observed per gene. The primers that we used to amplify the cDNAs are shown with each sequence. The 3' primer is shown in reverse complement so you can easily visualize its position relative to the cDNA sequence. The cDNA was derived from a mixture of all developmental stages (embryo to adult). All of the genes except Obp99d and Obp99b contain at least one intron, which was absent from the cDNA sequence. Obp99b was independently confirmed by ESTs.

OBPs typically contain 6 highly conserved Cys residues. However, some family members contain only 4 of these, in which case Cys 2 and 5 are not present. Others contain insertions or deletions. These differences are noted below. Where a gene name has not been given by Galindo & Smith, we have noted a compatible gene name below (Obp followed by the band number with an alphabetic suffix, e.g. Obp23a).

CG13517. Original gene prediction is fine. New name proposed: Obp59a. My coding sequence is shown below. Amplified from first-strand cDNA using the following primers.
5' end CACGGATCCCAAGATGAAACAGTTGAT 3' end GCTCTTCAATTAGGATTAAACTCGAGAGG.
This isoform contains two insertions (a 60 a.a. Gly-rich insert + a 15 a.a. insert), each containing a Cys residue (for 8 Cys in total).
ATGAAACAGTTGATTTTCCTGCTGATTTGCTTGAGCTGCGGCACCTGCTC CATTTACGCACTGAAATGCAGATCCCAGGAGGGACTAAGTGAAGCGGAACTCAAGCGAAC TGTGCGCAACTGTATGCATCGCCAGGACGAGGACGAAGATCGAGGACGAGGTGGACAGGG CCGGCAAGGAAATGGCTATGAGTACGGTTACGGAATGGATCAGGATCAGGAGGAGCAGGA CAGGAATCCAGGCAACAGGGGCGGCTATGGCAATCGAAGGCAGCGAGGACTAAGGCAATC GGATGGCAGGAACCACACCAGCAACGATGGAGGTCAGTGTGTGGCCCAGTGCTTTTTCGA GGAGATGAATATGGTGGATGGCAATGGGATGCCCGATCGGCGCAAGGTGAGCTATTTGCT GACCAAGGACCTTCGGGACCGGGAGCTGCGCAACTTCTTCACGGACACCGTGCAGCAGTG CTTCCGCTATCTGGAGAGCAACGGAAGGGGCCGGCACCACAAGTGTTCAGCGGCCCGGGA ACTGGTCAAGTGCATGTCGGAGTACGCGAAGGCGCAGTGCGAGGATTGGGAGGAGCACGG CAACATGCTCTTCAATTAG

CG15883 Obp18a. Original gene prediction incorrectly identified the first exon. Our cDNA is compatible with the prediction made by Hugh Robertson in his annotation updates. The sequence predicted in Galindo & Smith 2001 has 3 extra a.a. that we did not observe.
5' end catgaaggttgtgtgcag 3' end ctatcgcctcgtaatttacctcgaggtg. Also confirmed by ESTs (see BI243891. RE41704.5prime RE... gi:14712817).
ATGAAGGTTGTGTGCAGCATAGCTGTACTATGGATTTGCTTGATAACTATGTGG CAATCAGCTGGCCGCGTTAACGCAGAGGGTTGCCTAAAGCACCACAATCTGACCAGTGCC CAAGTGCAGGCAGTGGCTCCATCCACTCCCGTTGCGGATGTTCCAGTGGCCGTTAAGTGC TATAGCCGGTGTCTGATCCAGGATTATTTCGGTGATGATGGGAAAATCGATCTGCAGAAG GTGGGAAAGCGAGGATCTCAAGAGGACCACGTGATTTTGTCCCAGTGTAAGCAGCAGTTC GATGGCGTCACCAATCTGGACACGTGCGACTATCCATACCTAATTCTCCAGTGTTATTTT AAGGGCAAGCAGAGTGGAACTATCGCCTCGTAA

CG15457 Obp19c. Original gene prediction is fine.
5' end GAAGGATCCAAGATGAAGCCATCCACT 3' end GTCTGGTAGAGGACATCTCGAGGTG
ATGAAGCCATCCACTCCAGTCGCAGCCATTCCGCTAATGACGATAGTGGTC GCTGTCCTGCTGCAGACGCACTGCGTCCGTGGCCAGACCCAGGCTTTCGATCTCGCCAAG CTGCTGCCCAAGACCGGAACGGAGCCCATCTGGGCGGTAATCGATCGCAACTTGCCGCAG GTGCAGGAGCTGGTCACCGCGGCCAGGATGGAGTGCATCCAGAAGCTGCAGCTGCCCAGG GACCAGCGACCGCTGGGGAAGGTGACCAATCCAAGTGAGAAGGAGAAGTGCCTGGTGGAG TGCGTGCTCAAGAAGATCAAGTTGATGGACGCCGACAACAAGCTGAACGTTGGCCAGGTG GAGAAGCTGACCAGCCTGGTGACCCAGGACAACAAGATGGCCATCGCCGTTAGCTCCAGC ATGGCGCAGGCCTGCAGCCGCGGCATCTCCTCGAAGAACCCCTGCGAGGTGGCCCACCTC TTCAACCAGTGCATCAGTCGCCAGCTGGAACGCAACAACGTAAAGCTGGTCTGGTAG

CG11748 Obp19a. I agree with Hugh Robertson that the initiator Met is closer (further downstream) than originally predicted. There is a high-scoring TATA-containing promoter well positioned relative to this later Met. The second intron proposed by Galindo & Smith is longer than the intron we observed.
5' end gaaggatccggaaatgaagttccatctg 3' end gtattcgtgtttccctaggctcgaggtg
ATGAAGTTCCATCTGCTGCTGGTCTGCGTCGCCATATCCCTGGGCCCAAT CCCCCAGTCGGAGGCAGGGGTGACGGAGGAGCAGATGTGGTCTGCCGGAAAGCTGATGCG CGATGTCTGCCTGCCCAAGTATCCGAAGGTCAGCGTCGAGGTGGCCGACAACATTCGCAA CGGTGACATACCCAATAGCAAGGACACCAACTGCTACATCAATTGCATCCTGGAAATGAT GCAGGCAATCAAGAAGGGAAAGTTCCAGCTGGAGTCGACCCTCAAGCAGATGGACATCAT GCTGCCGGACAGCTACAAGGACGAGTACCGCAAGGGCATCAATCTGTGCAAGGACTCCAC CGTCGGCCTGAAGAACGCCCCCAACTGCGATCCCGCCCACGCCCTGCTCAGCTGCCTGAA GAACAACATCAAGGTATTCGTGTTTCCCTAG

CG15583 Obp83g. This region contains three OBP motifs that were originally predicted to be fused into one construct. I agree with Hugh Robertson that the first motif forms a separate gene. One codon was absent in our Canton-S construct. We did check for alternative splicing to the other motifs using PCR, but this was the only product recovered.
5' end gaaggatccagaaatgcagtcccaatc 3' end ggtcttcgcctgacctgaatctcgagagg.
EST data support this gene structure (see BI229555.RE27361.5prime RE... gi:14696819).
ATGCAGTCCCAATCCCTCCTGCTGATCGCAGCCGTTGCCACGTTTCT GGTGGCCCAGACTACGGCCAAGTTCCGGCTGAAGGACCACGCCGACGCAGAGAAGGCGTT CGAGGAGTGCCGTGAGGACTACTACGTGCCGGACGACATCTACGAGAAGTACCTGAACTA CGAGTTTCCCGCCCACCGGCGCACCAGCTGCTTCGTCAAGTGCTTCCTGGAGAAGCTTGA GCTGTTCTCGGAGAAGAAGGGATTTGACGAGCGCGCCATGATCGCCCAGTTCACCTCCAA GAGCAGCAAAGACCTGTCCACGGTCCAGCACGGCCTGGAGAAGTGCATCGACCACAACGA GGCCGAGTCGGATGTCTGCACCTGGGCCAACCGAGTCTTCTCCTGCTGGCTGCCCATCAA CCGCCACGTGGTGCGTAAGGTCTTCGCCTGA

CG15583 Obp83f (Obp83e). This is the second portion of the original prediction, containing two OBP motifs. These motifs are not separated by an intron. Galindo & Smith predicted a monomer at this location, and Hugh Robertson suggested either a monomer or a dimer and raised the possibility of alternative splicing. A cDNA containing a single motif could be produced by removal of an intron, which would introduce a stop codon at the end of the first motif. However, the PCR products obtained were not spliced at this location. Rather, the cDNA contains two fused OBP motifs (Obp83f + Obp83e) and is an 'obligate heterodimer'.
5' end CACGGATCCGCAGAGAGATGAGTTCC 3' end GAGGACAACGAAGAGCAGTAGAAGCTTGTC.
To check for alternative splicing, we used the internal primer ctgtcgactaagtgggactcgagagg in combination with the primer at the 5' end. We also used the 5' primer of Obp83g in combination with these two 3' primers and did not recover a product, indicating that this locus does encode two separate genes (a monomer and an obligate heterodimer). No evidence of alternative splicing.
ATGAGTTCCTCTCGTGCGGTTCTAGTCAGCCTGTTCCTGATCTGCAGTCAGGCACTAGCT GACCTTTCTGGTGATGCCCAGACTCTGGAAAAGTGCCTGCGGGAACTTAGTTCGCCGGAG AGCATTGCTGGCGATCTTCGAAAGCTGGAACGGTATTCATCTTGGACGCGGGAGGAGGTA CCTTGCCTGATGCGCTGCTTGGCCAGAGAAAAGGGCTGGTTCGACGTGGAGGAAAACAAG TGGAGGCTCAAGCAACTGACTGAGGACCTGGGCGCCGATGTCTATAACTACTGCAGATTC GAGCTGCGCCGGATGGGGTCCGATGGCTGCAGCTTCGCCTATCGGGGACTCAGGTGCCTG AAGCAGGCCGAGATGCATGCGGGCACCAGCCTGAGCACACTGCTACAGTGTTCCCGCCAG CTGAACGCCACCAACGTGGAGCTGCTGCAGTACAGTAAGCTGAAGTCAAAGGAACCTATT CCCTGCCTCTTTCAGTGCTTTGCGGATGCCATGGGATTCTACGATCCCGATGGAAACTGG CGGCTGGAAAACTGGAAGCAGGCGTTTGGGCCTTCCGGAAATGAGGATCAGTCTTCTGGC GCTGACTACAGTGGCTGTCGACTAAGTGGGACTCAGCGGGAGGTGGCGCTAAGCAAGTGC TCGTGGATGTACCATGAGTACAAATGCTGGGAGCGAGTAAATGGGAATAAGCTAGTGGAG GACAACGAAGAGCAGTAG

CG15582 Obp83c/d. This was originally annotated as a fusion of two OBP motifs, but the 5' exon was not predicted correctly and did not encode a signal peptide. Galindo & Smith predicted two separate genes, and Hugh Robertson raised the possibility of alternative splicing or a heterodimer. I agree with Robertson that there is no suitable signal peptide for the second half of the gene. If the intron separating the two motifs were not removed, a monomer would be produced. Alternatively, the second motif could be spliced to the signal-peptide-encoding exon upstream of the first motif. Using primers to test each possibility, we recovered only a cDNA encoding the obligate heterodimer and found no evidence of alternative splicing. In addition, although the portion of the ESTs sequenced does not include the entire coding sequence, it is long enough to confirm the heterodimeric structure (see gi|15319558|gb|BI485590.1|BI485590).
5' end GAAGGATCCGCAATGCAGATGAAAAGTGGA 3' end CTACGCGATGGCTGTTTAGCTCGAGGTG
ATGCAGATGAAAAGTGGAATATTAATAGCATTATGTCTATGCCTTTCGTTG AACGAAGGCCTGGCCCTTCTGGAGCACGAAGGCGAGACCATCAACAGATGCATCCAAAAC TATGGCGGACTTACTGCGGAAAATGCCGAACGTCTAGAACGATTCAAGGAATGGTCGGAT AGCTACGAGGAAATCCCCTGCTTCACGCGCTGCTATTTGTCCGAGATGTTCGACTTTTAC AATAACTTAACGGGCTTCAATAAGGACGGAATTGTGGGCGTCTTTGGAAGACCCGTCTAC GAAGCCTGCCGAAAGAAATTGGAACTGCCATTCGAATCAGGCGAGAGCAGCTGCAAACAT GCCTACGAGGGCTTCCACTGCATCACCAACATGGAAAGCCACCCGTTCACGGTTATTGAC AACATGCCAAACATATCCCCGTCGGCCAAGGATGCAATGAAGGACTGCCTGCAGGATGTC CACCAGGACGAGTGGAAGAGCTTCGATGCCTTCGCCTACTATCCTGTCAATGAACCGATT CCGTGCTTCACCCGGTGCTTCGTGGACAAGCTGCATATCTTCGAGGAGAAAACGCGTCTT TGGAAACTGGAGGCGATGAAGCAAAACCTGGGCATTCCGGCCAAAGGAGCTCGCATAAGG ACCTGCCATCGGCACCGCGGCAGGGACCGATGTGCCACATATTACAAACAGTTCACCTGC TACGCGATGGCTGTTTAG

CG15505. New name proposed: Obp99d. This is a four-Cys isoform. The original gene prediction is fine. This OBP has a small internal deletion of about 17 a.a. relative to other OBPs.
5' end gaaggatccgcaatgaatcacttgagac 3' end ggaagtggaacgatgaaatactcgaggtg
ATGAATCACTTGAGACTGGAGATCATCTGCTGGAGCTGCCTGCTTATTGCG ATGGCTGTTATCACGGAAGCTGCTTCTGTTTGGAAACTACCCACCGCGCAGATGGTCTAC GAGGATCTGGAGAAGTGTCGCCAAGAAAGCCAAGAAGAGGATGCTGCTACCCTGAGGTGT TTGGTTAAGAAACTGGGTCTTTGGACGGATGAGAGTGGCTACAATGCCAGGCGGATAGCA AAGATCTTTGCCGGACACAACCAGATGGAGGAGCTGATGCTGGTGGTGGAGCACTGCAAC CGGATGGAGCAGGACACGAGCCACCTGGACGACTGGGCCTTCCTGGCCTACAGGTGCGCC ACTTCCGGGCAGTTTGGCCATTGGGTCAAGGACTTCATGAGTCAAAAGGAAGTGGAACGA TGA

CG13429 Obp57e. I agree with Hugh Robertson and Galindo & Smith that the 5' exon was missed and that the 3' exon does not exist as indicated in the original prediction.
5' end GCTGGATCCGTATGTTGGACCAACTTACAC 3' end GTTTCAATTAGTAATGCAAAGTAGCTCGAGAGG
ATGTTGGACCAACTTACACTGTGTTTGTTGCTAAATTTTCTGTGCGCAAATG TTCTCGCTAACACTTCAGTATTTAATCCGTGTGTTTCGCAAAATGAGTTATCCGAATATG AAGCCCACCAAGTGATGGAGAATTGGCCAGTTCCGCCCATCGATCGGGCTTACAAATGCT TTCTAACTTGCGTCCTTTTGGATTTGGGCCTGATTGATGAACGGGGTAATGTGCAGATCG ATAAGTACATGAAATCCGGAGTGGTGGACTGGCAATGGGTGGCAATAGAGTTGGTAACAT GTCGCATAGAATTCAGCGACGAAAGGGATCTGTGCGAGCTATCATATGGAATCTTCAACT GCTTCAAGGATGTGAAGCTTGCGGCCGAGAAGTATGTTTCAATTAGTAATGCAAAGTAG

CG13874 Obp56h. I agree with Galindo & Smith and Robertson that the 5' signal-peptide-encoding exon was missed.
5' end gagggatcctcaaaatgaagttcaccct 3' end acatcactaatgcctgatcctcgaggtg
ATGAAGTTCACCCTATTCTGTATTGCTCTGGCAGCTTTTTTGTCCATGG GACAGTGTAATCCGGACTTTCGCCAAATAATGCAACAGTGCATGGAGACCAACCAAGTGA CCGAGGCTGATCTCAAGGAGTTCATGGCCAGCGGGATGCAGAGCAGTGCCAAGGAGAACC TCAAGTGCTACACCAAGTGCCTGATGGAGAAGCAGGGTCATCTCACCAATGGCCAGTTCA ATGCACAGGCTATGCTCGACACTCTCAAAAATGTGCCTCAGATCAAGGACAAAATGGACG AGATTTCCTCGGGAGTGAATGCCTGCAAGGACATCAAGGGAACCAACGATTGCGACACGG CCTTTAAGGTTACCATGTGCCTGAAGGAGCACAAGGCCATTCCAGGACATCACTAA

CG13873 Obp56g. I agree with Galindo & Smith and Robertson that the 5' signal-peptide-encoding exon was missed. Two splice donor sites are predicted 3 bases apart. Our transcript was spliced at the earlier (somewhat lower scoring) site, so the protein encoded by our cDNA is one a.a. longer than that predicted by Hugh Robertson.
5' end gtcggatccatgagggctacattcgcattg 3' end caggcagccaattagggtctctcgagtcc
ATGAGGGCTACATTCGCATTGACTCTGTTGCTCGGCTGCCTTTCAGGAATTTTG GCGCAGCAAGCCAACATAGACAGTTCGGTGTCCAAGGAACTGGTGACGGATTGCCTCAAG GAGAACGGTGTCACTCCCCAGGATCTGGCTGACTTGCAATCGGGCAAGGTGAAGGCCGAG GATGCCAAGGACAATGTGAAGTGCTCCTCACAGTGCATTCTGGTCAAGAGCGGTTTCATG GACTCCACTGGCAAACTGCTGACCGACAAGATTAAGTCTTACTATGCGAACTCGAACTTT AAGGATGTCATCGAAAAGGATTTGGACAGGTGTAGCGCGGTCAAGGGGGCCAATGCCTGT GACACTGCCTTCAAGATATTATCCTGCTTCCAGGCAGCCAATTAG

FBgn0043532 Obp56i.
I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions. I have submitted the cDNA sequence to GenBank (AF457143) and release was requested on May 30th (still pending).
5' end GCTGGATCCTACCCTGCTGAAAAATGCAT 3' end CGAAATCACAGAGGATAAAGTCTCGAGTCC
ATGCATTTTTTCGCCTGCTGTGCATTATTGTTAGTCGTCG TTACTTTACCAACATGTTTCGTACAAGCAGGTCCCATTAAGGATCAATGCATGGCGGCGG CGGGCATCACAGCACAAGATGTTGCGAATCGTCATGAGACCGACGACCCTGGCCATAGTG TCAAGTGCTTTTTCCGCTGTTTTCTAGAAAACATTGGCATTATCGCCGATAACCAGATAA TACCCGGTGCTTTTGACCGAGTTCTAGGCCATATAGTTACCGCGGAAGCCGTAGAGCGAA TGGAAGCGACGTGTAATATGATTAAGAGCGAGACAACCTATGACGAGTCTTGTGAATTCG CCTGGCAAATCTCCGAGTGCTACGAAGGAGTAAGATTATCAGATGTTAAGAAGGGCCAAA GAACTCGAAATCACAGAGGATAA

FBgn0043536 Obp57d. I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions. I have submitted the cDNA sequence to GenBank (AF457149) and release was requested on May 30th (still pending). There is some uncertainty as to the initiator Met. The others have predicted a start at the second ATG in this sequence. Our primer location does not unequivocally differentiate between the two, as it contains 15 nt which lie between these two ATGs.
5' end AGCGGATCCTATGATATCTTCAACTAGT 3' end GTCATAAAAGCCGTTGTAAAGACTCGAGCCT
ATGATATCTTCAACTAGTTTGATGCCTGAAAAAATGTCTCTAAGACTCATACC GCATCTGGCTTGTATTATTTTTATTTTGGAAATTCAGTTTAGAATTGCCGATTCTAACGA TCCGTGCCCCCATAATCAAGGAATAGACGAAGATATAGCCGAATCAATTCTAGGTGACTG GCCTGCAAATGTGGATTTGATTAGCGTGAAAAGGTCCCACAAGTGTTATGTGACCTGCAT TTTGCAATATTACAATATTGTGACCACTTCTGGTGAGATATTTCTGAACAAATACTACGA TACTGGAGTCATTGATGAATTGGCGGTGGCACCCAAAATCAATCGATGCCGATATGAGTT TAGAATGGAAACAGATTATTGTAGCCGAATTTTTGCTATATTCAATTGTTTAAGGCAAGA AATATTAACAAAGTCATAA

FBgn0043534 Obp57b. I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions.
I have submitted the cDNA sequence to GenBank (AF457147) and release was requested on May 30th (still pending).
5' end actggatcctacaatgttcatctacagac 3' end cgtaattgattcggaataaatgctcgagagg
ATGTTCATCTACAGACTTGTATTTATTGCGCCTCTGATTTTGTTATTGTT CAGCTTGGCCAAGGCTCGCCACCCCTTTGATATATTTCATTGGAATTGGCAAGACTTTCA GGAGTGTCTACAAGTTAATAATATTACCATAGGAGAATATGAGAAATACGCGCGACACGA AACTTTGGATTACCTGCTCAACGAGAAAGTCGACTTGAGGTACAAGTGCAATATTAAATG TCAGCTGGAAAGGGATTCAACGAAATGGTTGAATGCTCAAGGCAGAATGGATTTGGATTT GATGAATACCACCGATAAGGCATCCAAATCCATTACCAAGTGCATGGAGAAGGCTCCCGA AGAACTTTGTGCGTACAGTTTTAGACTGGTGATGTGTGCATTTAAGGCCGGCCATCCGGT AATTGATTCGGAATAA

FBgn0043535 Obp57a. I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions. I have submitted the cDNA sequence to GenBank (AF457148) and release was requested on May 30th (still pending).
5' end actggatccaatgttcaacactagacttgc 3' end gattacgataccatcgatttataactcgagagg
ATGTTCAACACTA GACTTGCCATTTTTTTGCTTCTTATCGTTGTTTCGCTTAGCCAAGCTAAGGAAAGCCAAC CCTTTGACTTTTTCGAAGGAACCTATGACGATTTTATTGATTGTCTGAGAATCAATAATA TTACCATTGAAGAGTATGAGAAGTTTGACGATACCGACAATTTGGATAATGTCCTCAAGG AAAATGTCGAACTGAAGCACAAGTGCAACATTAAGTGTCAACTGGAAAGAGAGCCAACCA AATGGCTAAATGCTCGGGGTGAAGTCGATCTGAAATCAATGAAAGCAACCAGTGAGACAG CGGTATCCATATCAAAGTGCATGGAGAAGGCTCCCCAAGAAACCTGTGCCTACGTCTATA AATTGGTAATATGTGCATTCAAATCCGGACATTCAGTCATCAAGTTCGATTCATATGAAC AAATACAAGAGGAAACCGCTGGACTAATAGCTGAACAGCAGGCGGATCTGTTTGATTACG ATACCATCGATTTATAA

FBgn0043533 Obp56f. I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions. I have submitted the cDNA sequence to GenBank (AF457146) and release was requested on May 30th (still pending).
5' end actggatcctatgaaagtattcctgttgttc 3' end CCATTACCAAGCATTAGACTTCTCGAGACT
ATGAAAGTATTCCTGTTGTTCATTTTCATCTCTGCTATCTGGCTCCAAGCATTTTGTATG AAATCTTCTGAAAAAATAAAAGCCTGCTTGAAACGGCAGCTGGGGTATA CAATTACAGAAAATACAAAATTTGATGCTAAAGAAGACTCTCTTCAAAGCAAGTGTTTTT ATCACTGCTTACTGGAAGTGAAAGGTGTTATTGCAAATGATGCGATCAGTTCGGAGCAAC CGAGGAAAGTACTTGAAAAAAAGTATGGCATTACTGACACAGATGAATTGGAAAAGGCTG AAGAAAAGTGTCATTCCATCAAGGCTTCAGGAAAATGTGAATTGGGCTACGAAATCTTGA AATGCTATCAGTCCATTACCAAGCATTAG

CG12944 Obp47a. I agree with Robertson and Galindo & Smith that the initiator Met is earlier than originally predicted. However, I agree with the original prediction of the splice acceptor site, which lies 15 nt further downstream than that predicted by Robertson and Galindo & Smith. There were 7 polymorphisms relative to the Celera sequence, so we amplified the genomic sequence as well. The polymorphisms were confirmed, and the intron sequence had numerous additional polymorphisms which could potentially result in use of an earlier junction (end of lowercase on bottom line) in the strain sequenced by Celera. This is the intron alignment.
Canton-S (Mine) gtgagtttgggttcaaaaaacaag----caaagggatttgcttataaaagcacctcttttcttcag
                ::::::::::::: :::::: ::     ::: ::::::::::::::::::::: :::: :  ::::
Celera          gtgagtttgggttaaaaaaaaaaaaaaacaatgggatttgcttataaaagCACATCTTCT-GTCAG
5' end GAGTCTAGAAATGAATCGAGTTCTAGTGC 3' end CTTAGAACTTTGCACAACACTCGAGGTG
ATGAATCGAGTTCTAGTGCTATTGCTGGTGCTTAAAATGTTCGCTTTGAGCGAGTCCCGT TTCGCCAAGATAAACATCAATCTGGGACTAACCGTTGCTGATGAATCCCCCAAAACGATC ACCGAGGAAATGATTCGCCTGTGCGGAGATCAAACGGATATATCCCTCAGGGAGTTGCAC AAGCTGCAAAGGGAGGACTTTTCGGATCCCTCGGAATCCGTCCAGTGTTTCACCCATTGC CTCTACGAGCAAATGGGTCTCATGCACGATGGTGTTTTTGTGGAACGCGATCTATTCGGG CTTCTTTCCGATGTCAGTAATCCCGATTACTGGCCAGAACGTCAATGCCACGCGATTCGT GGCAATAACAAATGTGAGACGGCCTACAGGATTCATCAATGCCAACAGCAGTTGAAACAA CAGCAAAAGAACTTATTGGCCACCAAGGAGGTTGAGGTCACCACCACACCAGCTGGATCC GATGAAACAAAACCTTAG

FBgn0043530 Obp51a.
I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted), and our cDNA matches their predictions. I have submitted the cDNA sequence to GenBank (AF457145) and release was requested on May 30th (still pending).
5' end atcggatccagtattcattggcctggttc 3' end cttgtaaaattgtatctgaacctcgagagg
ATGAAAGTATTCATTGGCCTGGTTCTGTTGTTAGCTGTCACTACGCTGTCATCCGCTTT ATTCGAATCTGAAGCGAACGAATGTGCTAAAAAGCTGGGAATTACCCCAGATTACTTCGA AAATTTTCCGCACAGCAGTCGGGTGAAGTGCTTTTACCACTGCCAAATGGAAAAACTTGA AATAATTGCCAATGGTGTGGTAACACCATTCGATTTGAAAGTATTGAACATATCACCGGA GAGCTATGATAAGTATGGTGTAAAGGTAAAACCATGCCTCAAACTATCGCATCGCGACAA ATGTGAGCTCGGTTACTTGGTGTTCCAGTGCTTGAAACGAGAATTCAACTTGTAA

CG15129 Obp56b. I agree with both Robertson and Galindo & Smith that the original prediction has fused two separate OBP genes. I was not able to amplify a cDNA corresponding to the prediction, although I did amplify the cDNA for each OBP individually (second one below). My cDNA matches the prediction of Robertson and Galindo & Smith.
5' end GAGGGATCCGAATGTAGCTTTGGAAAGATG 3' end GTTAAGGGTATTAGTGCCTAACTCGAGGTG
ATGAAACTTATCTACTTGTTGGTTGTATTCCTAATT TTCGCTCTAAGCGAACTAGTAGCGGGCCAGTCAGCTGCGGAATTGGCAGCCTACAAGCAA ATTCAACAAGCCTGCATCAAGGAGCTGAATATTGCTGCCAGTGATGCTAATTTGCTGACC ACCGACAAGGAGGTGGCGAATCCCTCTGAGTCGGTGAAGTGCTATCACAGCTGCGTCTAC AAGAAACTGGGTCTCCTGGGTGACGATGGAAAGCCCAATACTGATAAGATCGTTAAGTTG GCCCAGATCCGTTTCAGCAGTCTGCCGGTGGATAAGCTAAAGAGTTTGCTTACCAGCTGC GGAACCACAAAGTCAGCCGCCACCTGTGACTTTGTCTACAACTATGAAAAGTGTGTTGTT AAGGGTATTAGTGCCTAA

CG15129 Obp56c. Second OBP in the fused prediction. Differed from the predictions of Robertson and Galindo & Smith in that we did not observe a second intron. Therefore, in addition to having longer than usual N and C termini, this isoform also contains an insertion of about 40 a.a. Galindo & Smith also predicted a different 5' signal peptide.
Original gene prediction, although fused upstream of Obp56b, includes the observed signal peptide (predicted start codon 21 nt downstream of actual start) but not the second intron.
5' end GAGGGATCCTTAGATGTATTTTCGAGCCAG 3' end GCTAAGTCGGAGTAAATAGCTCGAGGTG
ATGTATTTTCGAGCCAGTTTGATGGCATTGCTTTGCCTCACTCTTAGTGAATTCGTTTCT AAAGCATGGACCCGATCGCTTTCCGTCTCGCTGAACATGTCGATGACACGAACCCTGGTT CCAGATCCGCTAAATGGAACAGAAAACAAACTCAGCCAGGAGATGCTGAGGGCTTGTATG CGTAGGACCGAGATCTCAATGTCGCAACTGAAACTATTTCACATGAGCCTGATGAACAGC GACTACAATAATGACAACGATATAGCCCCTACGCCAGTTCAATCCATTGGCGATGTAAAT AACCTGGGTGATCTGGACTTCAATGGCAACTCGCAGATGCCCTATCTCGATCTGAAGCAT AATGAGCCGCTGCAGTGCTTTGTGAGCTGTCTGTATGAGACCCTGGATTTGGATAGGTAC AATGTCCTGCTGGAGGAGGCCTTTAAGAATCAGGTGCAAACGATCATACAGCATGAGAAG GCGGAGATCAAGGAGTGTAGTGATCTTCAGGGCAAAACACGATGCGAGGCAGCCTACAAG CTGCACCTGTGCTACAATCACCTGAAAACTCTGGAGGCGGAGCAGCGTATCCGTGAGATA CTTGAGCGGACCGAGGCGGAGAACGAAGGATTCGGTCCGGAGGGCAGCGACTTTATCGAC GGCATCCAGCATTCCGGAGAAGCAATGACCACCGCTAAGTCGGAGTAA

FBgn0043539 Obp22a. I agree with both Robertson and Galindo & Smith that there is a gene here (not originally predicted). Our cDNA did not contain the second intron that they predicted. We designed three primers to test three possible scenarios: that there is no intron at the 3' end, that there is a short intron as predicted by the others, or that there is a longer intron. All three possibilities introduce only a small number of a.a. to the C terminus of the protein, so the actual transcript was difficult to predict. We only obtained the cDNA corresponding to the no 3' intron version. I have submitted the cDNA sequence to GenBank (AF457144) and release was requested on May 30th (still pending). This isoform has a small internal deletion of about 12 a.a. relative to other OBPs.
5' end actggatccttcgagatgcgagtgttgct 3' end ggatagatagaggatagagtctcgagcgt
ATGCGAGTGTTGCTGGCTTTTGTACTTCTGCTTGGCCTCTCAGTTTTG GCCACTAAGGAACCGGAAGAAGTTAAAATTGTAAGCGAGTGTGCCAAGGAGAACAATGTT CATAGGAAGAAGGCACTGGACCTTTTAATGAGCTATCGTTTGAAGAAGAAAACCCACAAC GTCATGTGCTTCATCAACTGCATCTTCGAGCGAACCAACATACTGCAGAAAGTTAAGGAA AAGGTTGTAAAGGAAAATCACAACTGCGACTCCATCAAGGACGCTGATAAGTGTGCAGAA TCCTTCCAAAAATTTCAATGCTTGGTCAAGATTGAGATGAAAGTGAGGGGGATAGATAGA GGATAG

CG11797 Obp56a. There are 18 ESTs (several of which appear not to be the same clone resequenced) which confirm the original gene prediction. See gi|15481152|gb|BI589730.1|BI589730 for example.

CG13421 Obp57c. There are 2 ESTs which confirm the original gene prediction. gi|15523123|gb|BI627598.1|BI627598

CG2297. We propose the name Obp44a. This is a four-Cys isoform. There are 87 ESTs which confirm the original gene prediction. gi|15531140|gb|BI628930.1|BI628930

CG11218 Obp56d. There are 24 ESTs which confirm the original gene prediction. gi|15484370|gb|BI592948.1|BI592948

CG1670 Obp19b. There are 10 ESTs, several of which appear to be independent clones, which determine the coding sequence to be as below. The initiator Met is downstream of that originally predicted but upstream of that predicted by either Robertson or Galindo & Smith. See gi|15532575|gb|BI630365.1|BI630365 or gi|3833396|gb|AI238538.1|AI238538.
atgatgcagtg cagccgaatg acgacgacgt tgaagatgac gaaccttctg ctagcagtgg cctgcgccgc cgtgctgatg ggatcggcga cggcggacga ggaggagggg tccatgaccg tggacgaggt ggtggagctg atcgagccct ttggcgacgc ctgcacgcca aagccgtcga gggagaacat cgtcgagatg gtgctgaaca aggaggacgc caagcacgag accaagtgct tccgccactg catgctggag cagttcgagc tgatgcccga ggatcagttg cagtataacg aggacaagac ggtcgatatg atcaacatga tgttcccgga tcgcgaggac gacggcaggc gcatcgtcaa gacctgcaac gaggagctaa aggccgagca ggacaagtgc gaggcagccc acgggatcgc tatgtgcatg ctgcgcgaga tgcgctcttc gggcttcaag attcccgaga tcaaggaatg a

CG8462 Obp56e.
There are 24 ESTs which confirm the original gene prediction. gi|15531268|gb|BI629058.1|BI629058

CG18111 Obp99a. I agree with Robertson and Galindo & Smith that the first exon was not identified in the original gene prediction. I don't understand why the intron report of Steve Mount is here, as the gene was not originally predicted to have an intron. However, the intron in his report is definitely the actual one but with incorrect ends. There are 15 ESTs, of which several were independently obtained, which confirm the following coding sequence. See gi|15506074|gb|BI610549.1|BI610549, gi|13692677|gb|BF500840.2|BF500840
atgaaggtt ttcgttgcca tctgcgtgct gattggactg gcctccgccg actatgtggt gaagaaccga cacgacatgc tggcctaccg cgatgagtgc gtcaaggagc tggccgtgcc cgtggatctg gtggagaagt accagaagtg ggagtacccc aacgacgcca agacccagtg ctacatcaag tgcgtcttca ccaagtgggg cctgttcgac gtccagagcg gtttcaacgt ggagaacatc caccaacagc tggtgggcaa ccacgctgac cacaacgagg ccttccacgc ctccttggcc gcctgcgtgg acaagaacga gcagggatcc aatgcctgcg agtgggccta ccgcggagcc acctgtctgc tgaaggagaa cctggcccag atccagaaga gcctggcccc gaaggcctag

CG12665. We propose the name Obp8a. I agree completely with Robertson that the first exon was not identified in the original gene prediction. This is another 4-Cys isoform. There are four ESTs confirming the following coding sequence. See gi|4247802|gb|AI404715.1|AI404715 (orientation backwards).
ATGATGCGGAGATCACAGATCGGTTTGCTCAGCAGGCTGCTGCTGTTGCTGCTGGTGGTGGAACTGACGC CCCCTGCTATTCCGGTGCCCATGCGATCCTCACCCCAATCGCTGGCCCTACTGCGAGCACGGGATCAGTG CGGCAGGGAGCTGACTGCTGCCCAGCGTCTGCAGCTGGACAGGATGCAATTCGAGGATGCTGCCCATGTG CGTCACTATCTCCATTGCTTCTGGTCACGGCTGCAGCTCTGGCTGGATGAGACCGGATTCCAGGCACAGC GCATCGTTCAGAGTTTCGGCGGCGAGAGGCGTCTCAATGTGGAGCAGGCACTGCCAGCCATCAACGGGTG CAATGCGAAAACGAGCTCCAGAGGATCGGGCGCTCAGACAGTGGTCGACTGGTGTTTCCGTGCCTTTGTC TGCGTGCTGGCCACTCCAGTCGGTGAGTGGTACAAGCGCCACATGTCCGATGTCATCAATGGGAATGCCTAG

CG7584. We propose the name Obp99c. This is another 4-Cys isoform.
The original gene prediction is confirmed by 36 ESTs. See gi|15521987|gb|BI626462.1|BI626462.

CG7592 Obp99b. The original gene prediction is confirmed by 12 ESTs. See gi|3868657|gb|AI261132.1|AI261132

Dr. Laurie Graham
Department of Biochemistry
Queen's University
Kingston, Ontario K7L 3N6
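
Postscript for readers who want to check primer positions programmatically. Because each 3' primer above is written in reverse complement (that is, in the same orientation as the cDNA), both primers can be located by simple end matching. The following is a minimal Python sketch of that idea, not part of the original analysis. The function names are hypothetical, and the cDNA string holds only the first and last stretch of the CG13517 sequence given above, with the middle omitted for brevity.

```python
# Illustrative sketch only: the helper names below are hypothetical and
# were not used in the original work.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T, either case)."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overlap(primer: str, cdna: str) -> str:
    """Longest suffix of the 5' primer that matches the start of the cDNA."""
    primer, cdna = primer.upper(), cdna.upper()
    for k in range(min(len(primer), len(cdna)), 0, -1):
        if cdna.startswith(primer[-k:]):
            return primer[-k:]
    return ""


def three_prime_overlap(primer_as_shown: str, cdna: str) -> str:
    """Longest prefix of the 3' primer (as written in the message, i.e. in
    cDNA orientation) that matches the end of the cDNA."""
    primer, cdna = primer_as_shown.upper(), cdna.upper()
    for k in range(min(len(primer), len(cdna)), 0, -1):
        if cdna.endswith(primer[:k]):
            return primer[:k]
    return ""


# CG13517 (Obp59a): first and last stretch of the coding sequence only;
# the middle of the sequence is omitted here for brevity.
CDNA_START = "ATGAAACAGTTGATTTTCCTGCTGATTTGCTTGAGCTGCGGCACCTGCTC"
CDNA_END = "CAACATGCTCTTCAATTAG"
cdna_ends = CDNA_START + CDNA_END

fwd = "CACGGATCCCAAGATGAAACAGTTGAT"          # 5' primer as listed
rev_shown = "GCTCTTCAATTAGGATTAAACTCGAGAGG"  # 3' primer as listed

print(five_prime_overlap(fwd, cdna_ends))         # ATGAAACAGTTGAT
print(three_prime_overlap(rev_shown, cdna_ends))  # GCTCTTCAATTAG
print(reverse_complement(rev_shown))              # the 3' primer in oligo orientation
```

The bases of each primer that fall outside the reported overlap do not match the coding sequence. Taking the reverse complement of the listed 3' primer gives it in the orientation of the oligo itself.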