FB2026_01, released March 12, 2026
Reference Report
Reference
Citation
Robertson, H. (2005.7.22). Dpse Ors and Grs. 
FlyBase ID
FBrf0188558
Publication Type
Personal communication to FlyBase
Text of Personal Communication
Subject: Re: Dpse Ors and Grs
Dear Peili and FlyBase,
I've completed a first pass at trying to reconcile all of my versions of
the DpOrs with yours. Below are comments on every gene. I am also
attaching a draft of the Or table from our manuscript if that helps at
all. I hope to have the Grs done by the end of the week. I don't imagine
these changes can make it into Release 2, but perhaps there will be a
Release 3? Any chance I could get GA numbers for those that don't
currently have them to include in the manuscript?
Hugh
Or1a - GA14708 - Fine as is.
Or2a - GA16647 - Needs nine aa on the N-terminus.
MQPGSENQPELNTHSAVDYHWRVWQLTGLIRPAGISKSLYRTYAIGLNLMVTLLFPLSLLARLILTRSMKELCENLTITI
TDITANLKFGNVYLVRRHLGEIRSLIQQLDCRALAICDRDELCALVQAVTTARNTFRTFAWIFVCGTSLSCVRVALARSR
QLLYPAWFGVDWKNSNEAFVGIYVYQLFGLIVQAVQNCASDSYPPAYLCLLTGHMRALEVRVARIGCECASGQQSYDQLL
ACIQDLTLIHRMHTIIQQILSVPCMAQFACSVAVQCTVAMHFLYVADADDRLAMILSVIFFVAVTLEVFVICYFGEKMRT
QSEALCNAFYACNWVEQLPRFRRDLLFTLARAQRPSLILAGSYIPLTLETFKEVMHFAYSAFTLLLRAK
Or7aP - No GA# - A fragmentary pseudogene in Dp - here are chunks of
translated pieces
APEDSRPARTPLWRRIYRCFAAMIYVWQLXGIWGGMEITQVMTRKDLGSDLGSAAAATALDGQSRETSEFRRIGKAVRFC
NRLVWFYQLLLFYLHQAQQPIALTAMKIFPINLAIAKFSFSLYALIKAMTLGERE
Or9a - GA13635 - First two and last aa missing.
MAKKDETNSLRVQILVYRLMGIDLWSPTAVNDRHLVTFVTMGPLFAFMLPMFLSARENITEVSLLSDTLGSTFASMLTLV
KYLLFVYYRKEFVGMIYRIRSILEKEINVWPEAKEIVDAENRSDQMLSLTYTRCFCMAGVFAAIKPFVTMSLALLRDGGD
GSKLHLELPHMGVYPYDYQVMWFFVPTYLWNVMASYSAVTMALCVDTLLFFFTYNVCAIFKIARQRLVHLPAMGEADPRG
ELEAAVQVLLLHQKGLWIADFIADHYRPLIFLQFFLSALQICFIGFQVADLFPKPQSLYFIAFVGSLLIALFIYSRCGEN
IKEASLDFGDGIYESNWMEFAPSTKRVLLIASMRAQRPCQMKGYFFEASMATFSTIVRSAMSYIMMLRSFNV
Or10a - GA14703 - Needs five aa on N-terminus.
MWSFHRLLRRDQPLRSYFFAVPRLSLDVMGYWPMGDDLPVRAIVHFVILSIGVVTELHAGFMFLQNAQITLALETLCPAG
TSAVTLLKMLLMLRYRRDLTNVWRQLQRMLFDAGLNRPEQKAIIHDNSVLAARINFWPLSAGFFTCTTYNLKPLLIAFIL
YLQDPDQELPWNTPFNMTMPKVLLAAPFFPLTYAFIAYTGYVTIFMFGGCDGFYFEFCVHISSLFQSLQEETRAIFRPFE
EYLKLTPAQCARLELQLRGLIIRQNSVFELISFFRKRYTVITLAHFVSAALVIGFSICNLLTVGNNGLGALLYVAYTVAA
LSQLLVYCYGGTLVAESSFELSRVAASCPWSLGAPRQRRVILLLILRSQRAPTMAVPFFSPSLNTFASILQTSGSIIALA
KSFQ
GA17050 is the same gene, but with the N-terminus intact, an open intron
unspliced, and the C-terminus spliced into the Gr10a ortholog
immediately downstream. As in the Gr notes, I recommend removing the
Or10a coding region from the front of this annotation and making it just
the Gr10a ortholog.
Or13a - Not in BLASTP or in list - A single long ORF - unclear why not
annotated.
MFNLQQYTAITFRLPVQCVWLKLNGSWPFVPSSRKDNSSMCSILYTAWAWYVIVSVGITIGFQTAYLVTHLSDIIMTTEN
CCTTFMGALNFVRLLHLRFNQRKFRQLIEQFERDIWIPENTHISVNAECHKRMFTFTIMTGLLSCLICMYCALPLVEIFF
RTGVDAVEKPFPYKMLFPYDPYSSWMRYVFTYMFTSYAGICVVTTLFAEDSIFGFFITYTCGQFRLLHERVDNVLTAAKE
RADEQHLQLQRLHNIVQQHNKIIRFAKCLEDFFNPILLVNLMISSLLICMVGFQIITGKNMFIGEYVKFIVYISSAISQL
YVLCENGDALIIHSTNTARHLYGCDWENPDIRNFYSYTSMSHQLRNDLKFMILCSQRPVRITAFKFSTLSLQSFTAILST
SMSYFTLLRSLYYEDQL
Or19a - GA17168 - This protein can be extended 18 aa at N-terminus,
although the Dm ortholog does not extend like this. I'm not sure what
the best approach is for examples like this, of which there are several
in Dp.
MGYKRVWPSRGQSKVLDDMRPVDSMSAFRYHWQIWRVMGMHPADPQTLWGRHYTLYGIVWNAFFRLGMALSLVVNFLLST
SLESFCESLSVAVPHTVANLKVFFLWRMRQQILQTHPILHHLDGRIGSLAEKQSILEGIDRAYFTFISFLRAIIFILAVG
ILILCLSSDRPLLYPSWMPWNYKDSSFTVYAMTVCLHSVGIIENALLVCNVDTYPGSYLNMLAAHTQALAHRVSRLGYDP
RLTRSQACDRLRSCILDHQIIMNLFKSLEHSLSMSCFLQFASTAIAQCATCFFVIFVSVGTMQSVNMIFLFLVFTTQTLL
LCSSAELVRHEGENLIKAIYDCNWLDQSVEFRRMLLLMLARSQRPMILRAGLIIPVQMSTFMVVCKGAYTMLTLLREVDN
SEVA
GA15104 is listed in your table as an ortholog of DmOr19a, and is shown
in the genome browser in the same location, but it doesn't come up in
the BLAST searches. Dm Or19a/b are very recent duplicates of a single
ortholog in Dp, GA17168, so perhaps GA15104 should be dropped?
Or22a/b - This duplication is independent of a triplication in Dp
yielding two intact genes for which I propose the models below for the
GA numbers shown in the genome browser.
Or22a1 - GA11469 - not in BLASTP - here's my model
MLSKLFPRIKAKPLTERIQSRDAFVYLDRVQKLWGWRATEDERWMVLYNIWAFIWNVLLLVLLPLSMSMEYVQRFKNFSP
GEFFGSLEICVDMYGCSLKCVYTMFGYKRFQAARKLLDRLDLRCTSDEDRASVHRSVALANRCYVTYHILYSGFVVINWT
GYLLLGSHAWRMYLPGLDSEKNFLVTSFFELLLMSGVVTMNQCTDVSPLAHMIMARCHMGLLKDRLNKLHSDPSKTEEEH
QEDLNRCIHDHCVILDYVNLLRPVYSVTIFVQFLLIGLVLGLSMIHIMFFSNFWTGIGTMCFIFDVCLETFPFCYLCNII
IEDCRELSESLFQSDWLGASRKYKSTLVYFLHNLQQPIVLTAGGVFPICMQTNLSMVKLAFSVVTVIKQFNLAEKFQ
Or22a2 - GA18049 - not in BLASTP - here's my model
MLSKLVPRIKAKPLTERTSSRDAFVYLDRVQKFFGWTAVEDKRWRIPYILWGIFMNLLLIFFLPISMLVAYIQMFKSFTA
GEFLSSLEITVNMYGCVLKCIYTIWGFKGFTAARKVLDELDLRCTSDEERTSVHRCVALGNLSYVLFHIFYSGFVVINWT
GYVLMGRHAWMMYLPGLDAENNFFVASLCEILLMSGVVTMDQCTDVSPLAHMLMARCHICLLKDRLTKLRTDPTKDEDEH
YEELSNCVHDHRLILDYVKALRPTFSGTIFVQFLLIGIVLGLSMINVMFFSTLWTGLGTVCFMFCVCLETFPFCYLCNMI
IDDCQKLSDNLFQSDWTTASRRYKSTLVYFLQNLQKPIILTAGGVFPICMQTNLSMVKLAFSVVTVIKQFNLADKFQ
Or22aP - Then there is a pseudogene most similar to GA11469 immediately
upstream of GA18049
RKSKLVPRITAKPXDVLVYLDRVKKLWGWRAAEDEQWRILYNIWAFVWNMLLLVLLPLSMSMEYVQRFKXFSPGEFFGSL
EICVDMXAKTLLDGLDLRCTSDEERASVHRYVAPANXCYVAYHILYSGFVVINWTGYLLLGRHAWRMYLPGLDSEWNFLV
TSFFELLLMTAVVTMNLARCHMGLFKDRFTKLHSDPSKSEEEHQEDLNRYIQDHSEILEYVHLLRPVYSSTIFVQFLLIG
LVLGLSMIHIMFFSNFWTGIGTMCFMFDVCLETFPFCYLCNIITDHCQDLADSLFQSNWMAASRRYKSTLVYFLHNLQQP
IVLTAGGIFPICMQTNLSMAKMAFSVVTIIKQFNLADKFQ
Or22c - GA13684 - fine as is
Or23a - GA22094 - fine as is
Or24a - GA11185 - fine as is
Or30a - GA12048 - Needs an internal segment restored.
MDLKSMDTVDMPIFGSTLKLMKFWSYLFVHNWRRYVAMTPYIVINCTQYVDIYLSTESLDFIIRNVYLAVLFTNTVVRGV
LLCVQRGSYERFIEVVKAYYIQLLESKDAHILRLVEEITRLSITIGRINLLMGTCTCIGFVTYPIFGSERVLPYGMYLPA
IDEYKYATPYYEVFFVIQAIMAPMGCCMYIPYTNLVVTFTLFGILMCRVLQHKLRSLEKLEKGLVRREIIWCIQYHLKLA
GLVDAMNSLNTHLHLVEFICFGAMLCVLLFSLIIAQTIAQTVIVIAYMVMIFANSFVLYSVANELYFQSFDIAIAAYESN
WMDFDIDTQKTLKFLIMRSQKPLAILVGGTYPMNLKLLQSLLNVIYSFFTLLRRVYG
Or33N - lost from Dm - This gene is upstream of the three that are
orthologs of DmOr33a-c
MELPSPVIASDYIYRTYWLYWRLLGVEGEHPLRYLLLIMQFFFVTIWYPIHLIVGLICDGTLAEVCRGIPITASCFFASF
KIICFRWKLAEIKKVQQLFVELDQRIATAEERSSFYRETIRVAEFIGKSLLVAAFLAIITGTAFGLFRRERNLLYPGWFP
YDVYSSDQRFWLSFSYQAAGHSLAILQNLANDSYPPMTFCVLAGHVRLLSMRLSRMGYDLTKPKELIVRELKDNIEDYRK
LMKIVQLLRSTMHLSQLGQFISSGINIAITLVNILFFADNNFARTYYGVYFMAMLMEIFPSCYYGTLVSMELNGLTDSIF
SSNWVGMDRGYCRTLLIFMQLTLAKVEIRAGGMIGISLNAFFATIRMAYSFFTLAMSLRK
Or33a - GA14239 - Not in BLASTP - My model is
MASDPPDVRSGHIYRTYWLYWRLLGVEGEYPLRYLLDVILNFFVTIWYPTHLIIGLFQERTIGHVCKNLPFTAESFFCSF
KIICFRWRLAEIKKIEQLLMELDQRAVSPEERVFFHQNTRSVAEFISKSYVAAGISATVTGTASVLFSSGRKLIYPAWFP
YDVQASALRYWLSFTYQATGATLTILQNMANDSYPPMTFCVVAGHVRLLAMRLRRMGHHEKASKQGNAKKLIENIEDHRK
LMQIVRLMHSTLYLSQLGQFISSGINISIVLINILFFAENGFAIIYYVVYFMAMVLELFPSCYYGTLMSMEFQKLPYAIF
SSNWLGMGRRYCHTLLVLMQFTLTEVDIKAGGMVGISMNAFFATIRMAYSFFTLAMSFR
Or33b - GA14240 - Fine as is.
Or33c - GA18589 - Just needs three aa on N-terminus.
MAAVIDSVRVYQPFWWCMRVMAPTFFGATQRPVQIYVGLLHLLVTFLFPVHLLVNLALQPTSAELFQNLSISMTCAACSL
KHVAHLYHLQEIAEIQKLLIELDGYVDSEEEHRYYVDHLQCQARRFTRCLYASFVVIYVLFLLNLMILIASEDRMLVYPA
YFPFDWQGNGYLYAIAVGYQSICLVLEGIQGVSNDTFSPLTLCFLGGHIHMWGLRMQKLGYEEEDEESPSVNHHQQLLNY
IEQHKILMRLHRLTRHTVSLAQLVQLGCSGASLCIIVCYVLFFVRDIITLLYYVIFIAVICVQLFPACYFASVVAEEMQS
FPYAIFSSKWYEESREHRRDLLIFTQLTLVGRSRVIKAGGLIELNLNAFFVTLKTAYSLFALVVQVKDI
Or35a - GA14704 - Not in BLASTP - My model is
MVRYVPRLADGQRVRLAWPLALFRLNHIFWPLDPSTGKWGRYLDRFLAVLGCLIFVQHNDAELRYLRAEASNLNMDAFLT
GMPTYLILVEAQFRSLHVLLHFEELQRFLQRFYTTIYIDPRAEPDMFRRVDGQMLINRLVSAMYGAVISGYLISPVLSVI
NRRKDFLYSMVFPFDTEPLAVFVPLLLSNVWVGIVIDSMMFGETSLLCELIVHLNGRYLLLKRDLEESIQRILAERRRPQ
MARQLKELIIATLRQNVALNQFGEQLEAQYTVRVFIMFAFAAGLLCALSFKAYTNPMANYIYAIWFGAKTVELLSLGQLG
SSLAYTTDSLGSMYYHTHWEQVLEQSSNPLETLRLLRLIQLAIEMNSRPFYVTGLKYFRVSLQAGLKILQASFSYFTFLT
SMQRRQMSN
Or42a1 - Not in BLASTP - a recently duplicated copy of the gene below
MVLRKIFPAMYTLSEEAPACSRNGTLYLMRCIFVMGVRKPPARFFVAYCLWSIVMNLSSTFYQPIAFLTGYISHLSELSA
GELLTSLQVAFNAWSCSAKVLIVWALIKHFDDANDILDEMDRRLTQPSVRLRVHRAVSKSNRIFFIFMTVYMSYATNTCL
TAIANGKPLYQNYYPYLDWRSSSLHLGLQTGLEYFAMAGACLQDVCVDCYPVNFVLVLRAHMSIFADRLRQLGSDPEESP
EQRYEQLIQCIQDHKTILRFVDCLRPVISGTIFVQFLVVGLVLGLTLINIVLFANLGSAIAALFFMAAVLLETTPFCILC
NYLTDDCYNLADALFESNWIDGEQRYKKTLMYFLQKLQQPIKFMAMNAFPISVGTNIVVTKFSFSVFTLVKQMNIAEKLA
KVEGEADFN
Or42a2 - GA14414 - Fine as is
Or42b - GA11791 - Fine as is
Or43a - GA14981 - Needs three aa on N-terminus.
MATTIDDIGLVGINVRIWRYMAVLYPTPGTSWRKFAFVLPVCAMNLMQFFYLLRMWSDLPAFLLNLFFFAAIFNSLMRTW
LVIIKRREFEKFLEELFRLYRWILDSGDEYSRTILLEAEREAHRLAVFNLTASFLDIVGALVFTLFKDERSHPFGVALPL
LDMTRTPVYEIFYLLQIPTPLLLSVLYMPFVSVFAGFALFGRAMLRILVHKLSLIGGQQQDAGARYQRLTACIRFYIEVL
GYVRNLNNLVNLIVAIEAIVFGSIICSLLFCLNIITSPTQIVSIVMYILTMLYVLYTYYNRANDLVIENALVADAVYNVP
WYEGNMRFRKTLLIFLMQTQCPLEIRVGNVYPMTLAMFQSLLNASYSYFTMLRGVTNK
Or43b - GA14700 - Needs a couple aa on each end
MFFKLVYPAPLSEPIGTRDSTVYLLKTLHIAGLDFYNDFGIGRKILRVISFSYNIFYLPLSFPINYKIHFSQFPPDLLLQ
SLQLCLNTWCFSIKFFTLSILKERFEMANKCFDELDVYCVTPEEKRKVRVTVATINKLYLIFGIVYFLYATSTLVDGLFH
DRVPYNTYYPFIDWRLDRRQLYIQSFVEYFTVGYAIFVATATDSYPVIYVAALRTHILMLKDRIVRLGEANNEANADPDN
IFKSLVECIKAHRTMLNFCDTIRPIISGTIFAQFIICGSILGIVMINMVLFADQSTRFGIVTYVMAVLLQTFPLCFYCNA
IVDDCNDLADSLFHSAWWMQDKRYQSTALQFLQKLQQPITFTAMNIFTINLATNINVAKFAFTVYAIASGMNLDEKLQLQ
DSGADNP
Or45a - GA15169 - Needs a C-terminal exon.
MDDSYFSIQRRALEIVGFDPSTQRLHMRRPLWAGLLILSLVSHNWPMIVYGLQDLSDLTRLTDNLAVFMQGSLCTLKFL
AFIVKRRRIGALVHRLHGLNQEACASPLQREKILRENRLDMYVSRAFRNAAYAVTVASMIAPMLNGLIAYLTEGVFRPT
TPMEFNFWLDERQARFYWPIYAWGVLGVAAAVWLAIVADTLFSWLVHNVVAQFQLLKLLLADKERQQAADSDSHLAECI
RRHRLALELARELSAIFAEIVFVQYMLSYLQLCMLAFRFTRSGWSSQVPFRAAFLVTVFIQLSSYCYGGEYLKQQSSGV
ALAVYSGCDWSQMPPARRRLWQMMIMRAQRPAKVFGYMFDVDLPLLLWVTRTTGSFLALLRTFER
Or45b - GA11917 - fine as is
Or46aA - GA14697-PA - This is not in BLASTP - needs to be alternatively
spliced to the same C-terminal exon as Or46aB, just as in Dmel. Proteins are
MSKRAEIFYTGQTFIFNIYSLMPQEQRWKRILHEINYWHVMGFWVLLFDLLLVLHVVSNLNNMFEIVRAIFVLATSAGH
TTKLISVKMNNVALQQLFDRLDDEDFRPEGPEERAIFAAACETTRKTRDYYAALSFAALAMILIPQFVLDWSHLPLGTY
NPFDDNPGSAGYWLLYCYQCLALSTSCLTNIGFDSLCCSLFIFVNCQLDILALRLQKIGQGKDNDNGNTKMSIDVQLKQ
CIRFHMAIVDLAETIERRLCTPISMQIFCSVLVLTANFYAIALLSDEKLALFKFVTYQACMLMQIFMLCYFAGEVTHCS
AELPHRLYNTNWMDWSRSDRRNALLFMQRLHYELRIRTINPSRAFDLALFSSIVNCSYSYFALLKRVNS
And
MNQQHLKVTGHFYKYQVWYFQILGIWKLPDGATGQQRRWHQLRFCSIFAILSGMLLLFAMELAGSIAHLREILKVFYMFA
TEISCMTKLMHLKLKSRKLAGLVTMIKSSSFSTKSEQEEKLMEAGRVSVVNLRNLYGISCLVTATLILLVPFFAGNSELP
LTMYELCSIEGRMCYWVLFLTHAVSLMSTCCLNIAFESVAYSVLTYLRVQVQMFALRLEQLGPAETPQDNQRIARELREC
SAHYNRIVQLKDLVEVFIKVPGSVQLMCSILVLVSNLFDMSTISIANGEAIYMTKTCIYQLVMLWQIFIICYASNEVTIH
SSRLCHSIYKSQWTSWNKENRQMILLMMQRLDSPLCLRTINPTFTFSLEAFGSIVNCSYSYFALLKRVNS
GA14698 also corresponds to the Or46a gene.
Or47a - GA12137 - Needs a few aa on N-terminus
MSTTENFLLVQKATIAMLGFDLFSGSGEMWKYRHRCINVYSIATIFPFILAAVIHNMKNVMLLADAMVALLITILGLFKF
SMIIYLRKDFWRMIDTFRHLMTHEGEQGDEYAQIIVTANKQDQRVCGIFRTCFFLAWALNSVLPFVRMGLSYWLSGHVEP
ELPFPCLFPWDIHNKRNYALTFLWCAFASTGVVLPAVSLDTIFCSFTSNLCAFFKIAQYKVLRFKSETPEESQAKLNKIF
ALYQKSLDMCTELNHCYEPIICAQFFISSLQLCMLGYLFSITFSQTEGVYYASFIATIIIQAYIYCYCGENVKTESALFE
WAIYDSPWHESLGSGLESSSICRSLLISMMRASHGFRITGYFFEANMEAFSSIVRTAMSYITMLRSFS
Or47b - GA12120 - Needs internal additions and subtractions.
MAEPDYTSYLCLLRDFWGEFRSVQRQQTPGRIPRLLMHTQRAALVALCHYPNKKMSSKPVYRRINWILLFNQTLMFISMV
CGVHESSSIIDMGDDFVWLIGLGLISTKSYCMHARATEIDEVIRDMAYYDEVVRPIHDDEEILMWQRYCYMGEAYFGIGI
FSLVNAFGLAILLQPLLGEGRLPYHSLLPFGWHRQDLHPWTYRIAFGWLSVNSLHNLSTILFVDLLGISTILQTALNLKL
LSIELRKLGDLGSVSDNQFHVEFCRVVRYHQHIIRLVDKSNRAFYVTFIAQMIASFAMISISTFETMVAAADDPKMAAKF
VLFVMVGFVQLSAWCVAGNLVLYLSGEVGQAAFEISDWHTKSVSIQRDIAFIMLRAQKPLFYVARPFKPLSLGTYMIVLK
QCYRLLALLRESM
Or49a1 - GA12084 - Not in BLASTP - My model for this duplicated gene is
MQEKQREYQDFTFLANIMFKTLGYDFLDSARPSWQTGLLRCYFFVCIASSSYEAFFVALECLQVESVAGSPSKIMRRALH
FFYMLSAAVKFVTLMIYRKRLRTLILSLKELYPADESLRREYEVNKYYLPRSTRYVFYSYYCFMAVMAIGPLPQSFMMYF
LKGHFPFLRTFPTQLCFRSDTPVGYAVAYFMDLTYSQFVVNVSVGADLWMMCVSSQICMHFGYLAKKLAAYLPSRERERE
DCEFLASLVQKHQLILRLHKEVNQIFGILLASNLFTTASLLCCIGFYTVVEGRSEEGMSYMIIFVVVSAQFYMVSSFGQQ
LIDLSSSISMAAYSQYWYDGSLRYKKDLLLIMARAQRPAEISAKGIIIISLDTFKILMTITYRFFAAIRQTVGK
Or49a2 - Not in BLASTP and list - My model for this duplicated gene is
MKEKEKQCEYQDFIFFANIMFKTLGYDFLDSARPSWQKVLLRCYFFLCIASNCYEASFVALRIIQWESVAGSPSKIMRQA
LHFFYMLSAEVKFVTLIIYRKRLRTLILGLQELYPTDDSLRREYEVNRYYLPRATRYVLYFYYFVMALMALGPLLQSFTM
YFLQGNDAKFLFLRIFPTRLSFRVDTPKGYAVAYIMDFTYSQFIVNVSLGTDLWMMCVSSQICMHFGYLAKKLAAYLPSR
ERERADCEFLCSFVQKHQQILRLHKEVNQVFGLLLASNLFTTASLLCCMAFYTVVQGLNAEGISYMMLFASVAAQFYMVS
SYGQRLIDLSFSISMAAYLQNWYDGSIRYKKDLLLIMARAQRPAEISAKGIIVISLDTFKILMSITYRFFAVIRQTVGK
Or49b - GA14566 - Fine as is.
Or56a - GA11666 - Needs an open intron removed and a C-terminal exon added.
MFRVKELLLPRGIFKNPMLRLHLRCFRLYGYVASKYQRRPWLSQARCILFTASIWMSCVLMLARVFQGYERLNDGATTCA
TALQYFTVSIATMNAIVRRERVVSMLREVHEDMQKLMKEADDQELDLVLSTQKYTKTITLILWVSSIGAGLMALSDCIYR
TLFMPQTVFNLPAVRRGEERPLLLFRLFPFGELYDNFVVGFLCPWYALGLGVTTIPLWHTFIMCLMKYVHLKLMILNKRV
PEMDIMRHNPFLDLDRLTPAQLNRWRIRLFTKFVTDHLKIRKFVKELELLICVPVMIDFIIFSILICFLFFALAVGSPTK
MDYFFMCIYIFVMASILLIYHWHATLISECHDELSFAYYSTPWYEFERSAQRMILFMMIHSQRPLQIRALMIPVNLGTFL
DIVRAAYSYSNLLRQIY
Or56N - Not in BLASTP or list - Gene downstream of Or56a and lost in Dmel.
MFRVKELLLPRAIFKSHILGLHLRGIRMYGYVAEKYQRWPLLSLVRCIIFMVSIWVSTVTMLARVFQGFENPNDGVLCWA
TTIMYISLSISALNSFVQRKRVKGMVRAIHEDIQKLMKEADDQELVLMLSTQKYIRMATWMLWYPALLTGIIAFTDSLYR
TVLLTLSVFNITERRDEQQYIFLLKVYPFGDVYNNFVFGLFGAWYALGLGINTIPLWNSFIVCLIKYVHLKLLILKKRVT
EMEITRFNPLLDLDRLTPAQLNRWRMRLIKEFVKEHLKIRRFVKELEQLICLPVLIDFIFFAISICFELYALIVGTPNEM
EYFLILCYISLTTLILFLTYWHVTLIGECHDDLCFAYYSTPWYEYDPTMKRTILFMMMHAQSPLRIRALMFPVDLKTFLD
IVLGAYNYFNILRGLY
Or59a - GA22057 - Fine as is.
Or59b - GA17527 - Fine as is.
Or59c - GA14401 - A couple aa missing from C-terminus.
MKKPLFERLRPVPLTKSVVSSDACIYFYRAATFLGWVPPKARLHRWAYLLWTCTTMVLGLVYLPLGLTLTYVVHFDKFAA
SEFLTSVQVDINCIGNCVKACVTFSQMWRMRRINAMIAPLDERCPTLNQRQILHKMVARGNRIIVFFLSMYIGFTTTTLF
SSVFAGKAPWQVYNPLVDWRQGTRQLWEASLLEYIVINIGICQELLSDSYPIVFLSIFRGHLAILKDRIKNLRCNPELSE
NENYQKLVDCIKDYRTIVQCCDLIRPIMSATIFAQFMLIGIVVGVASVNILFFTTSFWMTLSNIIFIAAICAESFPLCMT
CELLIEDCESLASGIFHSNWMDAERRYRSAIIYFLHRVQQPIQFWAGAIFPISVQSNITVAKFAFSIITIVNQMNLADKF
RKEA
Or63a - GA22157 - Needs a couple more aa on C-terminus.
MYSPEEAAALEKRNYRSIREMIRLSYTVGFNLMRPRRWDVALRIWTVVLSLSSLLSLYGHWQMFRHYVEDMPRIVETVST
ALQVLTSVFKMWYFLFAHRRIYELLRQARCHELLQRCELFATIADLPVAQVLRRRVAAIMQRYWGSTRRQLLIYLYSVIA
LTSNYFINSFARNLYRYLTQPPGSFEIVLPLPALYPGWEDKGLAFPYYHIQMYIETCALYICGMCAVSFDGVFIVVCLHG
VGLMESLGEMIAGATSPLVPPERRVEYLRGCIYQYQRVASFAEEINDCFRHLTLSQFLLSLFGWGLALFQMSVGLGTSSA
ITMIRMTMYLTASGYQVAVYCYNGQRFATASEQIAGAFYGCEWYAECREFRQLIRMMLTRTGRCFRLDVSWFLAMSLPTL
MSMVRTSGQYFLLLQNVSEKSG
Or65a does not appear to have an ortholog in Dp; instead, the pair
DmOr65b/c is orthologous to a quintuplication in Dp:
Or65b1 - GA16875 - Add 17 aa, or even around 100 aa, to the N-terminus, e.g.
MVEGPRDRHGAFGLKSYWNRFIGAFLDARGLLRDPKMVNRHSIAYYSRDQMKVMGLYINAEDKGQPLRRAWHVFLLVQFS
ALYASMFYGLLKSLDDIVETGRDLAFILGMFFIVFKMVFFSLYADEVDVLIDIMEDSHHAEVKGPGTETCRAIKRHDFLL
NVGLDFVWVVAVVVFVVLLIVTPFWADQSLPIHAVFPLELHDPAKHPIAHLVIYVCQSFSMAYLNIWLVATEGLSISLYA
QVTTALSVLCVELQQLRHFWGDSSEDRLRLELTRLVRTHQSIILTVDRCNQLFHGPLIMQMTVNFLLVSLSVFEALMARH
EPKVAAEFMVLMILALGHLSLWSKFGDMMSQQSLEVAHAAYEAYDPSVGSKRIHRDIGFMVRRSQRPLIMRASPFPAFNL
SNYMAILNQCYGILTLLLKTLD
Or65b2 - Not in BLASTP - My model is
MFELPRERVGLGLGEKWKSFMKVYMTFVTVYRSPDECPEHTVPHHCRAQLKAMGYYPNSEERRIPGRRTWFLFLFSQMTM
FFCSQCYGIFDSMDDLVEWGRDLAFIIASFFIYFKFIYFLLYADNVDEVVDGLEECYRWERAGPAASGVRSAKRLHYLII
IGMQIIWVVSMVAFVLLLVTTPLWTQQDLPLHVSYPFHLHDSSKHPVTHILIYISQSWSILYFLTGLISTEGLSITIYSQ
LTTGLTVLCVELRHLHQLCDGDEDLLRWEINRLVKYHQKIISLVDRSNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP
KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDPNVGSKKIDCHIRLIMLRAQEPLIMRATPFPTFNMIN
YKAILNQCYGILTLLLNTLD
Or65b3 - Not in BLASTP - My model is
MSELPRERVGLGLGEKWKSFMKLYMTFVTVYRSPDESPEHTVPHQCRAQLKAMGYYPNSEERRIPGRRTWFLFIFSQITM
FFCSQCYGIFDSLDDLVEWGRDLAFIIASFFIYFKFIYFLLYADNVDEVIDGLEECYRWERAGPAAAGVRSAKRLHYLIV
IGMQIIWAFSMVVFVLLLVTTPLWTQQDLPLHVSYPFHLHDSSKHPVTHILIYISHSWSILYFLTGLVSTEGLSITIYSQ
LTTGLTVLCVELRHLHQLCDGDEDLLRWEINRLVKYHQKIISLVDRSNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP
KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDPNVGSKKIDCHIRLIMLRAQEPLIMRATPFPTFNMIN
YKAILNQCYGILTLLLNTLD
Or65b4 - Not in BLASTP - My model is
MVEPSGERVGLRLSKKWKRFMKPYAPFRRVYRTPGKCPEHTVPYLNREQLISTGYYPNSTQNSVSGQRSFHLFLLVKGTI
FYSSILYAASESLDDIVELGRDLAFIIASFFIYFKLIYFLLYADNVDEVVDGLEECYRWERAGPAAAGVRSAKRLHYLIV
IGMQIIWVVSMIIFIVLLISTPFWTQQELPLHAAYPFHLHDSLRHPRIHILIYLSQSFDILYYLTWLTVTECMSVSIYSQ
LTTALSVLCVELRHLHQFCDGDEDLLRWEIHRLVKYHQKIIKLVDRCNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP
KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDSSVGSKKIYWNIRFIIMRAQEPLIMRATPFPTFNMTN
YKAILNQCYGILTLLLNTLD
Or65b5 - Not in BLASTP - My model is
MVERVGFSLSDKGKSFLKPYIAIRMVYRRPDECPEHTVPYLNRDQLKAIGYYPNSEESSRPGRRNWHLFVMIKATIFFGS
VTYAIFESLDNIVEWGRDLAFIIASFFIYFKLIYFLLYADNVDEVIDGLEDCYRWERAGPAAAGVRSAKRLHYLIVIGMQ
IVWLAFMVVFVLLLVTTPLWTELSLPFHAAFPFHWHDPSKHPFTHILIYLSQTFDSAYFLMWLISIEGMSVAIYSQLTTA
LSVLCVELRHLHQFCDGDEDLLRWEINRLVKYHQKIINLLDRCNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDPKVAG
QFIVLMILALGHLSMWTKFGDVMSQESLEVADAAYEAYDPNIASKAVHKDLRVVIMRSQNPLIMRANPFPAFNMINYMAI
LNQCYGILTLLLNTME
Or67a - Nothing in BLASTP or list. My model is
MSVKFLKEKIFSRRGKSEKPKNAYIVIEDFMKLPIYFYRTIGLNPYELTGTNNKPGIGFHILFLLHMINANMVLALEIFF
VYVSFRNNENFIESCMVMSYIGFVIVGDLKIGAVLLQKQKLTNLVRQMESVFPPARQKEQEEYDVRRYLRRCLRYTKGFG
GLYMTLVITYNLFAICQYSIQKWILHSPHAKQSVPYVPLTPWTWQDNWKFYPTYLSQSMAGYTATCGHISADLMIFAVAI
QVIMHFDRLAKSLTEFTVRAQSEEDGAEKDLKKLQELIAYHNKILGLTDVMNEVFGLALLLNFLASSTLVCFVGFQISIG
ISPEMLAKLILILISANSEIYLICNFSQMLIDASGSICYAVYDMNWSEADPRFRKMLIVLALRAQKPVCLTATVFLDISI
ETMSIFLRMSYKFFCAIRTMYQ
Or67b - GA12805 - Fine as is.
Or67c - GA12792 - Needs 8 aa on N-terminus.
MTGQEQEPDTARTFKDMMRVPVQFYRTIGEDIYAHRSTSPWRSLLLKVYLYGGFINFNLLVIGELVFFYKSIQDFETVRL
AIAVAPCIGFSLVSDFKQFAMAYYKGTLVRLLDELEEMHPKTLERQRAYRMPDFERTMKRVISIFTFLCLAYTTTFSFYP
ALKASVKFNLLGYETFDRNFGFLIWFPFDATSSNLVYWIVYWDIAHGAYLAGIAFLCADLLLIVVITQICMHFDYVSRRL
EEHPCEPGRDRENIEFLVWIVRYHNKCLTLCEHVNNLYSFSLLLNFLMASMQICFIAFQVTESTVEVIIIYCIFLMTSMV
QVFLVCYYGDTLIATSLRVGDAAYNQKWFQCSKTYCQMLKMLIMRSQRPASIRPPTFPPISLVTYMKVISMSYQFFALLR
TTYNAN
Or67d - GA12793 - Needs a couple aa on each end.
MSEGPFERYCKINRAIRFCVGLCGNDVIAEDYRMWWLTYAVIGAILFFFGCTGYTVYVGVVLDGDLTVILQAFALVGSAV
QGLAKLLVTARMAAVVRQIQATYEAIYREYARRGGDYGRCLERRIKTTWHMLMSFMWVYVVLVGGLIAYPFFHLILHHKK
LLVMQFRVPWIDESTDGGYLVLISIHVMLLSMGGFGNFGGDMFLFLFISNVPTLKDIFSAKLREFNEVAVRRQDYQRMRT
LLWDLLAWHQQYVSILRDTERIYRIVLFVQLSTNCVSILCTISCIFIGAWPAAPIYLVYSFIVMYSFCGLGTIVETSNED
FSKEIYANCLWYELPVKEQRLVILMLAKSQHEISLTAADVMPLSMSTALQLTKGIYSFSMMLITYLGYES
Or69aA - No BLASTP and not in list - My model is
MQLEDLMRYPDIAFRYFGMAPRFEWTARRTVAPKQTVTRQIVFMVGSLCLGYQNLGMIIYWVRFNSQQKEISMYVAKIAE
MGSVLALIFAGFLNIWALTSKRAQIEAVLAELQEMYPEPRQRLYRIRHYNDQAVGLMKFTVNFYVVFIIYYNIAPLVLLL
CEHLMDSQDISYRAQSYTWYPWQVYGSPLGYSAAYLCQAIGSILGVGFSMSSQQLICLFTTQLQLHFDAMANHLTAIDAK
EPTANQQLRSLILYHRRILRLGDRVNRLFNFTFVVSLIVSTIAICLTSIATMLLELHKALLYISGLIAFVFYHFLICYRG
SVLTLASDKVMPAAFYNNWYEGDLVYRKMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK
Then there is a pseudogene version of this in between it and Or69aB - my
model is
MMLEDFMQYPDSTCQWIQFRRFEWSGRGSLRPKKSLIKRIIFFAGHHQYGVPCDVIYAYRTERESKNSIMYEADLFSVGS
SLGFIMEGLCMIGMLIYYRLQIEELLEQLEDLFGIVRKKIYRLGHYEEZWRVMRKSZIIFIVCCTVYNLQSVLILFYEPT
EAQAVSYRIQRLPLGSXASMVNLGAMMTTMELELHLDGLVRQLEELDAKHPREKEKLRSLIYYHSRILRVADNVNRLFNF
SVFVSFSTSSLSMCFMFMFMCFTMTVRQLGSALKYMFGLVLFLVYTFSISYNGIZITEASDKVMPAAFYNNWYEGDLVYR
KMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK
Then Or69aB - not in BLASTP - perhaps GA14713 in list? My model is
MVFISMQLADFMQYSDLGCQVALIRRYGWSGRQSPGAKQTLMKKVIFVLGALNMSCYFFSFITYGYHIERKTKEPIIYVA
ELSEVGGMLWFTVLGICNMYTLLIYRPQIEELLEGLEQLFAPARQSPYCTRYFYDDSALKMKRLSINFVCSATYYNLLPL
VKLLSELLTESQQVSYQVQSKAWYPWQVHGSTLGFWIAYASQAFASVMNLGMMMATECLVFVCTAQLELHFDGLARRLEA
LDARDPRAKEQLRALISYHTRLFKVADRANGIFNCTFLISYCVSSIAVCSMGFSMIMFDLGLALKYMVGMFLFMIYTFCI
CHNGTQVTMASDKVMPAAFYNNWYEGDLVYRKMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK
GA16694 also corresponds to the Or69a gene.
Or71a - GA14707 - Not in BLASTP - my model is
MDLDRIRPVRFLLRILRWTRVWPASGGPATQRGWTNWEGYLLHSTLTLIFVVLLWVEAILSPDVEHTVDVLFFTLTMTS
MCVKVLNSWRYAHVAQRLLTEWSSADRFRLKTLQEVDMWRIGHLRFNRVVGVYMMCSVGVIPVLVTPCLFAVPNKLPFS
MWTPLDLQQPLGFWTAFSYQAVAIPFACLCDITLNLVNWYLMLHLSLCLRMLGQRLSALHRSGLQQDEEQLCAEFLELV
SLHRRIKQQALDIETVISKSTFFQILVSSLVICCTVYSLKMTPVIQDMGKFAGLIQYCLSMVLEILSPSIYGNEVTQSA
DKLPEALYSSDWPDLSPRLRRLILMFMIYLNRPLSLRAGGFFYMGLPMFTKVMNQAYSMLALLFNMNN
Or74a - GA12488 - Fine as is.
Or82a - GA16295 - not in list, but in BLASTP and fine as is.
Or83a - GA10437 - Needs aa on both ends.
MKERNPPLKIKGDSERRDLFVFVRYTMCIAAMYPFGYSLQGSSFLGLLVRVLDWFYEIFNYFVSVHILGLYICTIYINYG
QGDLDFFVNCMIQTIIYLWTIAMKLYFRRFRPKMLDEIMSIINENYHTRSALGFSYVTMSGAHHVSKLWIKTYVYCCYIG
TIFWLVLPIAYRDKSLPLACWYPFDYTQPIVYETVFFLQAIGQIQVAASFASSSGLHMVFCVLLSGQYDVLFCSLKNVLA
TSYIHMGGNMAELRQLQSEQSIADAEPNQYAYSHEEQTPLEQLLQHPPQDESSRDFLKAFKRSFRHCIDHHRYIVEVLKK
MERFYSPIWFVKIGEVTFLMCLVAFVSTKSTTANSFMRMVSLGQYLLLVLYELFIICYFADVVYQNSQRSGEALWRSPWQ
RHLREIRSDYMFFMLNARKQFQLTAGKITNLNVDRFRGTITTAFSFLTLLQKMDARG
Or83b - GA10435 - Fine as is.
Or83c - GA13827 - Needs an N-terminal extension, removal of an open
in-frame intron, and addition of a C-terminal exon.
MSSPVAVNPAKRFRQLTKNINIWTNLLGVDVLAPKVKFNYRTWTTTFAIVNYTGFTIFSMVDNGGDWRVSLKASLMMGGL
FHGLGKFLTCLLKQRTMKRLTFFACNIYEEYEGKSEVHYRTLDMNIDRLLGLMRGIRNGYMATFLLMTILPMAMLLYDGT
RVTVMQYQIPGLPLENNVCYSITYLIQLVTIGVAGCGFYAGDLFVLLGLSQIFAFADILQLKVDDLNAALDRKSEARALV
SLGATITGEERRQELLLDLIKWHQLFTNYCHTVNELYHDLIATQVLSMAVAVMLSFCIILTSFHMPSAIYFLVSAYSMSV
YCVLGTKIEFAYDQVYESICSVSWQELSCDQRKLVGPMLREAQNPQSIKLLGILPLSVRTALQIIKLIYSLSMMMMQNRT
Or85b - GA11167 - Fine as is.
Or85c - GA14720 - Mine ends in a K.
MKFMKYANFFYKAVGIEPYTTDSRPQANSLKASIVFWANVLNLGAIVTGEILYLGVSLADGKLLDAVAVMSYIGFVIVGT
SKMFFIWWKKPALSDMVRDLEHIYPHGKEAEEEYKLQSYLRSSSRISVTYALLYSVLIWTFNLFSIMQFLVYEKLLHLRV
VGLALPYTVYYPWNWEAPWSYYMLLFCENFAGYTSAAGQISTDLLLCAVATQVVMHFDHLSTVLEGHELSGKWEEDSRFL
VNTVKYHQRILRLSEVLNDIFGIPLLLNFMVSTFVICFVGFQMTVGVPPDIMIKLFLFLFSSLCQVYLICHYGQLIADSS
LGLSNAAYKQNWNHADVRYRRALVFVIARAQKPAHLKATVFMNITRATMTDLLQISYKFFALLRTMYVK
Or85d - GA11171 - Needs aa on each end. And there is also a possible
long N-terminal extension with no Dm match. I don't know whether to
include it or not.
MEGTGKTPSTVEKAEKSEPITTERFLRYANIFYLSIGMEAYDHQGRRKMIELILRCIFIALILNLNAVLLSELIYVFLAI
GKGTNFLEATMNLSFIGFVIVGDLKVWHIWRKRDQLTNVVREMEKLHPKEGHHQKAYDVESHLSGYSRYSKFYFGMHLVL
IWTYNLYWAVYYLVCDFWLGIRHFVRMLPYYCWVPWDWSTNSSYYLMYVSQNMAGQTCLSGQLAADLMMCALVTLLVMHF
IRLGRGIEEHVAGLLSPQQDLEFLQAAVVYHQRLLQLCHNINEIFGVSLLCNFVSSAFIICFVGFQMTIGGKIDNLVMLV
LFLFCALVQVFMITTYAQRLLDASEHIGEAVYNHDWFQADLPYRKMLIFMVRRSQQASRLKATIFLNVSLVTVSDLLQLS
YKFFALLRTMYVK
Or85e - GA21973 - Not in BLASTP - presumably because the Dm ortholog is
truncated in the genome sequence - it is a polymorphic pseudogene in Dm.
Here's my model. Again there is a possible unaligned N-terminal extension.
MASLQFHGNIDADTRYDATLDPAREPELFRLLMGLQLAMGMNPSPRLPSWWPTWLRPVGGLMAKAYCSMVILTSLHLGLL
FTKTTLDVLPTGELQAITDALTMTIIYFFTAYANIYWCVRSQRLLAFMDHINREYRHHSLAGVTFVSSHAAHRWSRSFTT
IWILSCLVGVITWGVSPLMLGIRTLPLTCWYPFDALSPGTYTAVYATQLFGQISVGVTFGFGGSLFVTLCLLLLAQFDVL
YCSLKNLDAHSKLLSGETIAGLGLLQRELLQGAFTRELNQYAVLQEHATDLLRISAESQNLAQVKVFHSALVECVRLHRF
ILYCCAELENLFSPYCLVKSMQITLQLCLLVFVGVSGTREFLRIVNQIQYLALTLFELLMFTYCGELLSRHSVRSGEAFW
RGGWWKHAHLLRQDVLIFLVNSRRAVYVTAGKFYVMDVNRLRSVITQAFSFLTLLQKLAAKKATTEA
Or85f - GA14128 - The final intron has a GC donor, removing three aa
from your translation. A 72 aa unaligned N-terminal extension is also
possible in Dp.
MESVQRSYEDFPAMPSAVFRLMGYNVLDAPDETRSRRLLMWIYRWLCTCSHAVCVGFMIFRIFEVKTINSIPLIMRYVTL
VTYVINSDTKHATAMQRESIRNLNKKLADLYPKTTKDRVYYRVNEHYWSRSFLAMICIYIGSSIMVVIGPILQSIFAYFT
RHQFTYEHCYPYFIYDPNRHPVWVYIIIYATEWLHSTHMVISNVATDLWLLCFQVQICMHFSCMTRSLEEYQPDRTHDVD
DNRFLAQMVNKHEYLVILQNDLNGIFGGSLLLSLITTSAVICTVSVYSLIQGLTLEGITYVFFIGTSVMQLFLVCNHGQQ
LLDLSEDIGHAAYNHNWHKASIAYKKYLLIIIIRAQKPVELSAMGYLPISRDTFKQLMSVTYRGLAMIRQMIE
Or88a - GA12932 - Mine has a Q added on the end.
Or92a - Not in BLASTP or list - My model is
MLWGKRKEKRELRTFEDLTRFPMAFYKTIGEDLYSDRDKNLVRRYLLRFYLVMGFLNFNAYVVGEIAYFIVHITSTTTLL
EATAVAPCIGFSFMADFKQFGLTVNRGRLVQLLDDLKALFPTTLETQRAYNVSYYQRHMNQVMTLFTILCMTYTSSFSFY
PAIKATIKYYFLGSEIFERNYGFHILFPYDAETDLTVYWFSYWGLAHCAYVAGVSYVCVDLLLITTITQLTMHFSYMADT
LEAYDGDEHTDEENIKYLHNLVVYHSRALDLSEEVNSIFSFTILWNFIAASLVICFAGFQITASNVEDIVLYFIFFSASL
VQVFVVCYYGDEMISSSSRIGHAAFNQNWLPCSTQYKMILKYIIMRSQKPASIRPPTFPPISFNTFMKVISMSYQFFALL
RTTYYG
Or94a - GA14408 - Needs a K on the end.
Or94b - GA19774 - Fine as is.
Or98a1 - GA18957 - Not in BLASTP but is in list. My model is below.
MLNLLKEPTPGNLMSSPEAFKYMKNTMILMGWIPPQGGWTASSIIMSIFIVAVSVVYVPIGVFITFFVELKTLSPSEALS
VLQVALNAMGFPLKLLFFRLYMWRFYKIEKLLGRMDERCIDSTERSEVHRWVARCNIAYLIYQFIYISYTISTFLTATYS
GVVPWNIYNPFIDWRESTRNLWIDSVLELMFIIGIVIQTYMIDVFPLLYGLILRAHIKLLRQRVEKLCLDPSQSDDENNE
ELENCIEDHKLILEYASVIRPVIEPAILVQFMLVGLVLGISLINLYLFADTWAKLAIAAYIVVQIVQTFPFCYTCDLIRE
DCESLAVAIFHSNWKGSSRRYRSSLIYFLLNAQRTISFNAGSVFPICLNTNIRVAKLAFSVVTFVKHLGLGSR
Or98a2 - Not in BLASTP or list - Moved to another chromosome, but the tree
shows it is a Dp-specific duplication. My model is below, but a 17 aa
N-terminal extension is possible.
MLINLLKEPLPWNLMSSPEAFKYLEYAMILMGWTPPQEGWKPFRIILTIVISFWWILYVPIGVFITFFVELKTLSPSEAL
SVLQVGINAGGFPLKMIIMRLNMWRYHKIKELLGRMDKRCINITERLEVHRWVARCNIVYLIYQFMYTAYTLSTFFTAIF
SGVLPFHLYNPFIDWRESTMNLWIASVLELIPMNGIVSQTYMMDVFPLLYGLILRCHVKLLRQRIENLCSDPRKSDDENN
EDLVFCIKDHKLILEYVSVMRPVIETIIFVQFLLIGLLLGITMLNLFFFADFWAKLAIAAYINGLIIQTFPFCFTCDLLK
DDCESLALAIFHSHWKSSSRRYKSSLIYFLHNAQMPLSFTAGSIFPIGLKTNITVAKLAFSVVTFVQQLNIADKLRKE
Or98a3 - Not in BLASTP or list - Moved to another chromosome, but the tree
shows it is a Dp-specific duplication. My model is below
MSSPEAFKYLEYAMILMGWTPPQEGWKPFRIILTIVISFWWILYVPIGVFITFFVELKTLSPSEALSVLQVGINAGGFPL
KMIIMRLNMWRYHKIKELLGRMDKRCINITERLEVHRWVARCNIVYLIYQFMYTAYTLSTFFTAIFSGVLPFHLYNPFID
WRESTMNLWIASVLELIPMNGIVSQTYMMDVFPLLYGMIVRCHVKLLRQRIENLCSDPRKSDDENNEDLVFCIKDHKLIL
EYVSVMRPVIETIIFVQFLLIGLLLGITMLNLFFFADFWAKLAIAAYINGLIIQTFPFCFTCDLLKDDCESLALAIFHSH
WKSSSRRYKSSLIYFLHNAQMPLSFTAGSIFPIGLKTNITVAKLAFSVVTFVQQLNIADKLRKE
Or98b - GA15044 - Needs two aa on C-terminus.
MLTEKFLRLQSFYFRLLGLELLDQQEVQSVGDIRRSICCILAVATFLPLSIAFGLNNIHNMDKLTDTLCSVLVDLLALCK
IGIFLGLYKDFRRLIQRFHDMLERECHWEVAAKIVARQNDRDQFISSLYTICFLVAGGSACLMSPLHMIIRFWRTAEMEP
DYPFPSVYPWDNRRFHNYLLSYLWNVFAAFGVALATICVDTLFSSLTHNLCGLFEICRHKMLTFKRRRPVETQQNLRHIF
QLYGECLELGSSLNGFFRQIVFAQFIAASLHLCVLCYQLSSNLMQPGMLFYAAFMAAILGQVAIYCHGGACVQAESQLFA
QAIYESDWLPLLLGGRLDVGRSLQIGMVRAQRGCRLDGYFFEANRKTFDLIVRTAMSYVALLRSTS
OrN - No GA# - This gene was lost from Dm.
MTFFTQCRVVSATTYKRILPDESQAHCEMERLRELTQRIRPSDVDEGRIGSIELNVWLAQLTGLPLSGLKPETKAESIRI
LVVSGVVLPLLFCYVVLEIYDLVLNWDNVDIMTQNVVMTLTHVGYWFKVLNTFYYYEDIRRIVFTLKHLTRTCVLSPGQR
ETFHQVEVENKVVCLFYFCLVVFSSTLAMVMLLIVPDNLAGKRFPYRVHMPHFLPPIVQYLYMGLSIIWISCGIPTIDNV
NMLFMNQICMHLKILNMAFDVLQRQVDPNIWMVSIVKYHSVLIKLRQRLEQIYRLPVLFQFVSSLLVVAMTAFQAIVGDG
SGSSVLIYFLFGGVMCQIFLYCWFGNEVFEQSKTLSTSAFGCNWHEFDAQFKRTLLIFMINADRPFLFTAGGFMGLTLTS
FANILGKSYSIVTVLRHMYGRAH
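Comments above such as "Needs nine aa on the N-terminus" could be cross-checked by comparing a proposed model with the existing annotated protein and reporting the residues the model adds in front. The following is a minimal sketch and not part of the original message: the function names are hypothetical, and the "annotated" string is a made-up truncation of the first residues of the Or2a model above, not the actual GA16647 translation.

```python
def join_wrapped(lines):
    """Join a protein sequence that was wrapped across several lines."""
    return "".join(line.strip() for line in lines)


def n_terminal_extension(model, annotated):
    """Return the residues the model adds before the annotated protein.

    Returns None when the annotated protein is not an exact suffix of the
    model, i.e. when the difference is not a pure N-terminal extension.
    """
    if not model.endswith(annotated):
        return None
    return model[: len(model) - len(annotated)]


# Start of the Or2a model quoted above (first 21 residues only).
model = join_wrapped(["MQPGSENQPEL", "NTHSAVDYHW"])
# Hypothetical annotation lacking the first nine residues (illustration only).
annotated = model[9:]

extension = n_terminal_extension(model, annotated)
print(extension, len(extension))
```

An exact-suffix test is deliberately strict: it only reports a difference when the rest of the two proteins is identical, so models that also differ internally (for example Or30a or Or47b) would return None and need a proper alignment instead.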
Subject: Re: Dpse Ors and Grs
Dear Peili and FlyBase,
I've finished the Grs too now, and comments for each are below. Almost
every model needs at least slight changes, and many are new. Again I've
attached the table from the manuscript for additional detail - the
second table in the file is the Grs.
Hugh
Gr2a - GA14976 - Needs a D added to the C-terminus
Gr8a - GA13680 - Needs a Q added to the C-terminus
Gr9a - GA17078 - Needs a different C-terminus to align well with DmGr9a
MSADWLDRCLGGYFQLLALRCSYSKRRAGRLLSNLYLMLVILDLVGQMRSYAHGEEPMVDRMLFFPKAIQAVNVFYKLVH
ALIALFALVGCQRERRLLQQLPPTHATPGIYRQVALEFLMVVYALWISFFDSLNAGQILENLRYVFSSQAVRARYLQMLL
LVGRLQAQLEQLQRQLIDCSLDDYQQLRSSYAHLARLCRSLSQLYGPSLLLLNVLVLGDCLIVCNVYFMVEHLEAVPATW
LVLWQAVYIVVPTLVKIWTVCAACDRFVMGSKILRQQLSDRRGRTSEERSQIEEFVLQIMQDTLQFNVCGIYHLNLQTLA
SMFFFILEVLVIFLQFVSVIR
Gr10a - GA17050 - but this is the second half of this annotation and
perhaps the GA17050 number should be retained for the first half, which
encodes the Or10a ortholog. My model is
MSAAEEHEPKESFWERHEFKFYKYGHIYAVIYGQVVIDYVPQQPMRRGLKTVLIAYSHLLSLILIVVLPGYFVYHFRTLT
ETHDRRMQLMLYVSFANTAIKYATVIVTYVANTVHFEAINYRCTVQRQRLETAFVGAPKQPKRSFEFFMYFKFCLINLMM
LVQVAGIFALYEAADGPSVSQVRLHFAIYAFVLWNYTENMADYFYFVNGSVLKYYRQLHLQLVSLRRQLGGLRPGGMLME
HCCRLSDRLGELRQRFGEIHDLYSESFQMHQFQLLGLVLATLINNLTNFYAIFNMLAKQSLEEISFPIVIGSVYATGFYI
DTYIVTLLNEHIKQELEGFALTMRTFAEPPSTVEQRLTQEIEYFSLELLKCRPPMLCGLLNFDRRLIYLIAVTAFCYFIT
LVQFDLYLRKSS
Gr10b - GA11723 - Needs a few aa on the N-terminus. This gene is also
truncated at the C-terminus by the end of a contig, hence is missing the
last exon, which sadly is only available from a single bad read. Here's
my best version.
MGLFLSLAGWVRTHSKRTLVYRQLLSAMWARWLYTLLLRIWMIQGLVLGITGHYYSPSRRRLVPSRVLHWYSWLMMLATL
ALYWGYWLYAQDYFVQGTFRRHGFVQSLSYGTVVLQLIALVVQTVLRMFREQEVCAVFNELMAMRSTVSRVHPQGQPSRF
YYVVFFGRLFNYLQNLNFSLSILLIVDMRSVSVWDYLSNFYFVYMSLVRDTLLMSYILLLLELGEVLRVNGEHPRTSYAG
LMRQLQRQERLLRLVRRVHRLYAWQVLATMFFQLYFNMATFYVGYSFLAASSAPVSGFRVWNAKFLLTVLTFITKLFDGL
LLQIVGERLLAQGNKPCAGPRVEDVTYMQAAQRQMEMASLKRAIPAGSPKKRFLRMVCIDMKGPFPGINCNLSYGIKILP
LGISPG
Gr21a - Not in BLASTP, but GA12646 in list and genome browser - My model is
below, starting where it aligns with Dm. This is an extraordinarily
conserved protein. It has a potential N-terminal extension of 55 aa,
but these do not align at all with Dm (which has its own potential 47 aa
extension), and Anopheles gambiae, which has a highly conserved
ortholog, has no possible N-terminal extension. I therefore judge that this
is the likely true start codon for this protein.
MSFWAVSRGLTPPSKVAPMLNPNQRQFLEDEMRYREKLKLVARGDAMDEVYVRKQETVDDPLELDRHDSFYQTTKSLLVL
FQIMGVMPIHRNPPVKNLPRTGYSWGSKQVMWAIFIYSCQTTVVVLVLRERVKKFVTSPDKRFDEAIYNVIFISLLFTNF
LLPVASWRHGPQVAIFKNMWTNYQYKFFKTTGSPIVFPNLYPLTWSLCIFSWVLSIAINLSQYFLQPDFRLWYTFAYYPI
IAMLNCFCSLWYINCNAFGTASRALSDALQTTIRGEKPAQKLTEYRHLWVDLSHMMQQLGRAYSNMYGMYCLVIFFTTII
ATYGSISEIMDHGATYKEVGLFVIVFYCMGLLYIICNEAHYASRKVGLDFQTKLLNINLTSVDAATQKEVEMLLVAINKN
PPIMNLDGYANINRELITTNISFMATYLVVLLQFKITEQRRTNSQAA
Gr22a - 16376 - Needs four aa on N-terminus
MPQQPHRGYRQRVAHFTLKSTLYASWALGLFPFTYDSRTRQLTRSRWLLSYGLVLNLGLIGVVLLPGTEDHRDVRIDMFE
RNPIIQQVENMVEIISFLTAVAMHLGIFWKSREMVTILNELFFLEKRHFSNLILAHCHQFDKYVIQKCILVVLEVGSSLL
IYFGVPDSNLVVTRAFCIYLVQVGVLLGVTHFHLAVIYIYRFVWTINGQLLELANQQRRGQKVDPARIKRLFWLYSRLLE
VNSRLAAIYDIPVTLFMVTLMSANIMIAHVLIIIWINQFSLLDILLLFPQALLINFYDLWLSIAFCELVERTGRQTSDIL
KLYNDGEDMDEELQRSLSDFALFCSHRRLRFRHCGLFYVNYEMGFRMIITNILYLVFLVQFDYMNLKYK
Gr22b \- This is the ortholog of the Gr22b/c split in Dm (Gr22b is a
polymorphic pseudogene in Dm). In your list it is given the GA16572, but
it is not in BLASTP. My model is
MLRRREGFAPKLCRLVLQVTLYGSWAFGLFPFTLDPQTRQLRRHRWLLVYGVAINLFLLCIVVLLSEYSEETTQSLEVFQ
RNHLLGLMNVIIGVLALIASSGIILISFWKSGQALCIFNELLGLEHRHFGCLDVEDSSKFNLYVIQKGMSVVGELMGLVV
VNYGMPEYTMSYLYLALLCLTQFCVNLVVTQIYLAILYIYRSVWLINRQLLGVVSRLRVDPLSDSSRIPLLLSLYGRLLA
LHKQMETTFNGQITLILTSALAGNIVVIYFLIVYTVSLGQLSISLAIFPYSLIMNVWDYWLSISVCDLTERTGRHTSTIL
KLFSDLEHNDEDLERSINEFTWLCSHRRFRFGLYGLCSVNSETGFQMIVTSFMYLLCLVQFDFMNL
Gr22dP - Like Dm, this is a pseudogene, but the defect is different, so
it is independently a pseudogene. My model is
MFRPHRGCRHKLVYFILYSILYSSWALGIFPFTYVSKERKLRRSKWLLVYGIAFNATLVVLMLRPHGGEGESMANDPKLD
VFQRNFVLKQISLLLGIGGVITICAMYLRTFWRSRDLCRIYNQLLHLEVTYFKDYSVECPTFDRYVIQKGLLVIGGVAST
LVIHMGMPNESVSLVXAGSFLLAIHFHLGVAFIYRFVWLINRELLDLANRLRVRPEGSSSRVRFLLTLYGRLLDLNTRLT
ACYDYQTAMMMAIFLGGNIIVSFYMIVFSVSLSKMSVFVMLIMFPLALINNFLDFWRTGRQTSMILKLFNDMEILDKEME
RSFAEFALFCSHRRLRFHHCGLFYVNYEMGFQMLVTSVLYLLFLVQFDYMNL
Gr22e - GA16578 - Fine as is.
Gr22f - GA16574 - I consider the pseudoobscura version of this gene to
have been lost, but it is a close call - if not then the Gr22b gene
above is the equivalent.
Gr23aA - GA13700 - In list and browser only. This is an alternatively
spliced gene like DmGr23a; here are my two isoforms.
MPREPFTTITWSSDCSMNSFECLTRRCLAGVFWLMGLVPLPPSSQLCSLLLSLAIRCCWIVYLVYLLSIGIGFWSVATES
VGNVVGTMLFMGSTVLGLLLLLESVLKQKTHRQLEDLRFQSQLQLQRLGRFGRGRQAAYLLPLIGTQFACDLVRVLISAE
VGTMISPVFLVSLPLMWLLRLRYVQLVQHVMDLNHRSLQLRRSLLALAAGNDLWQPYGVQEWAQLQTLRKTYQRIFECYE
VLSDCYGWGMLGLHLATSFEFVTNAYWMITGLYEEQNLLILTFNGATAVDFGTPIATLSWYGDAGAENGREIGCLISKLV
KPLGSRRYNHLVSEFSLQTLHQRFVVTAKDFFSLNLHLLSSMFAAVVTYLVILIQFMFAERSANAYSG
And
MFPLSRKQAFARVVLRFLHLTLSALGLTSRRHSRAVQWLQFGCWLSWYTAIWALTVHRATRTEDCDLDCVLRYVLLVCET
GSHAIIVTNTFLQQDDSDSLECCDPVVGVTVVGLLVPIIAAQYLVCSNLDKFSERVISYYWKTLPSFLGLQFQIIAFIAQ
VMYVNLRVRLARRQLQALARELACSWPQSKLQAMYLDHQTARLVDLKRRYNELYHLYSRINERYGGSLLIIFIVFFAGFV
CNAYWLFIDLRTTPSRLYPILQNLGFIVNVALQMSAACWHCQQSYNLGREIGCLISKLVKPLGSRRYNHLVSEFSLQTLH
QRFVVTAKDFFSLNLHLLSSMFAAVVTYLVILIQFMFAERSANAYSG
Gr28a \- GA12527 \- not in BLASTP \- here's my model.
MAFKLWERISQADNVFQALRPLTYISLLGLAPFRLNLNPRKEVQTSTYSFVAGIVHYLFFVLCFVTSGLEGDSIIGYFFQ
TNITRLGDKTLRLTGILAMSTIFGFTMFKRQRLVSIIQNYIVVDEIFVRLGMKLDYRRILLFSFLISLGMLLFNVIYLCV
SYGLLVSATISPSFETFTTFALPHINISLMVFKFLCTTDLAKSRFSMLNEILQDILDAHIEQYNALELSPMHSVVNHKRY
SHRLRNFISTPMKRYSVTSIIRLNPEYAIKQVSNIHNLLCDICHTIEEYFTYPLLGIIAISFLFILFDDFYILEAVLNPK
RLDVFETDEFFAFFLIQMSWYIVIIILIVESSSRTILEGNQSAAIVHKILNITDDPELRDRLFRLSLQLSHRRVLFTAAG
LFRLDRTLIFTITGAATCYLIILVQFRATHHMEDAVGANASQLHFLHD
Gr28b \- GA12528 \- Like DmGr28b, and like other Ors and Grs, this is an
alternatively spliced gene in which the last two exons are shared by
multiple alternative first exons. Here are the five proteins in order, A-E.
MIRSGLKSYRACRRRVGDALSARDYYGAIQMLIAIAYVLGITPFVVTHSAKGESGMQQSWYGFVNAISRWVLLAYCYTYI
NLHNESLIGYFMRNRISQFSTRLHNICGIWSAVFTFIMPLLLRRHLQRFIEDVMEVDRRLDRLRHPVNFNAVFGVATLVL
ALVALLDTIITVTCLVCLAQMEVRASWQLIFILVYELMAISMTIVMFSLITRTVQRRLNYLHTVLKNLSHQWDSHSLKGV
VQKQRSLQCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKR
ESKFKTVEFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGL
FNIDRTLYFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN
MSTALRRVRKYFISSQVYEALRPLFFLTFLYGLTPFHVVRRKMGESYLKMSCFGVFNIFIYICLCGFCYISSLRQGESIV
GYFFRTEISTIGDRLQIFNGLIAGAVIYTSAILKRCKLLGTLTILHSLDTNFSNIGVRVKYSRIFRYSILVLVFKLLILG
VYFVGVFRLLVSLDVTPSFCVCMTFFLQHSVVSIAICLFCVIAFSFERRLSIINQVLKNLSHQWDSHSLKGVVQKQRSLQ
CLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTVE
FVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTLY
FTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN
MEAGKDTDTVEIEVESGLCQPLRRRVRRFLSAKQLYECLRPVFHVTYWHGLTSFYISSDSATGRKDIKKTIFGFVNGIIH
ITLYAICYTLTIFNNCESVASYFFRSRITYFGDMMQIVSGFIGVTVIYLTSIIPNHRLERCLEKFHTMDMQLQTVGIKIM
YSKVLRFSYMILFSMFMVNICFTCGTFSVLYSSLVAPSMALHFTFLIQHTVISIAIAVFSCFTYLVEMRLVMVNKVLKNL
SHQWDSHSLKGVVQKQRSLQCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAY
YVLETLLGKSKRESKFKTVEFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQL
LHLKINFTAAGLFNIDRTLYFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN
MSGFWRELFHPRDAHGSEQTLLLCTYILGLTPFRLRGEAGARHYQLSKIGYLNALIQLSFFSYCFLTALIQQQSIVGYFF
KSEISQVGESLQKFIGMTGMSILFLCSTIRVRLLIHLCNLISRIDGHLLDVGVVFNYPAIMRLRHSQLFLMSTVQLAYLI
SSTWMLLRNDVRPSYPAAVAFYVPLIFLLSTVILFGAFLHRLWQHLEALNKVLKNLSHQWDSHSLKGVVQKQRSLQCLDS
FSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTVEFVTF
FSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTLYFTIS
GALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN
MWLFNRLVYMARPHDVYTCYRVTLWLALWLGIVPYYLTSTSAGSSRLSASYFGYFNIIFRMIVYVVNFVYSALDPRSLMS
NFFLSDISNVIDGLQKVNGMFGIIAILLISLAKRRKLLHVLAVFDCLERESFPRVGITHHQGPAARRMNRLVLVMTGTTT
AYITCSFLMISMRDAGTFSISSVISYFSPHFIVCAVCFLAGNLMIKLRIYLSALHKVLKNLSHQWDSHSLKGVVQKQRSL
QCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTV
EFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTL
YFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN
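Because the isoforms share their last exons, their protein sequences should end in the same residues. One simple way to confirm this for a set of isoforms is to compute the longest common suffix. This is a sketch with a hypothetical function name and toy sequences; the real isoforms above would be passed in after joining their wrapped lines.

```python
def longest_common_suffix(seqs):
    """Return the longest string that every sequence in seqs ends with."""
    if not seqs:
        return ""
    shortest = min(len(s) for s in seqs)
    n = 0
    # Walk backwards from the C-terminus while all sequences agree.
    while n < shortest and len({s[-1 - n] for s in seqs}) == 1:
        n += 1
    first = seqs[0]
    return first[len(first) - n:]


# Hypothetical toy isoforms with distinct first exons and a shared tail.
isoforms = ["MAAAVLKNLT", "MCCVLKNLT", "MDDDDQVLKNLT"]
print(longest_common_suffix(isoforms))  # VLKNLT
```

The length of the shared suffix gives a rough lower bound on the protein region encoded by the shared exons, since the alternative first exons may by chance end in identical residues.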
Gr32a \- GA13351 \- Not in BLASTP \- here's mine.
MSPNTWVIEMPTQKARLHPYPRRISPYRTPSVNRYAFSHETPPPPPPPPPRTLEHPVFEDIRTIMSVLKASGLMPIYEQL
SSHEVGPPTKTNEFYSFFVRGVVHALTIFNVYSLFTPSSAQLFYSYRETDNVNQWIELLLCILTYTLTVFVCARNTKNIL
RIMNEILQLDDEVRRQFGANLSQNFGFSVKYLFGIAACQTYIIVLKIYAVDGVITPTSYVLLAFYAVQNGLTATYIVFAS
ALLRIVYIRFHFINQLLNGYTYAQQQRKKGGHRRQAAGATLMENFPEDSLFIYRMHNKLLRIYKGINDCCNLILVSFLGY
SFYTVTTNCYNLFVQITAKGMVSSNILQWCFAWLCMHVSLLALLSRSCGLTTREANATSQILARVYAKSKEYQNIIDKFL
TKSIKQEVQFTAYGFFAIDNSTLFKIFSAVTTYLVILIQFKQLEDSKVEDNIQDQQQT
Gr33a \- GA14395 \- The first exon on this is nearer and shorter, and
nicely shared with Dm, which needs updating as well.
MIQIMNWFSMAIGLIPLNRQQSESNVILDYAMMLLVPVFYLGCYFLINLTHAFGLCFLDACNSVCRLSNNLFMHLGAFLY
LTVTLMSLYRRKDFFLQFDERLNAIDAVIQKCRHVAEMDRVKVTAVKHSVAYHFTWLFLFCVFAFALYYDIRALYLTFGN
YAFIPFMVSSFPYLAGSIIQGEFIYHVSVISQRFEQINTLFEKINQEARHRHAPLTVFDIESEGKKQERKNLTPATAMDS
RGPSFGNEQKLSGEMKRQMAAPPPPPPQGQQKNEEDEMDSSYDEDEDDFDYDNATIAENTGNTSEANLPDLFKLHDKILS
LSVITNGEFGPQCVPYMAACFVVSIFGIFLETKVNFIVSGKSRLLDYVTYLYVIWSFTTMVVAYIVLRLCCNANNHSKQS
AMIVHEIMQKKPAFMLSNDLFYNKMKSFTLQFLHWEGYFQFNGIGLFALDYTFIFSTVSAATSYLIVLLQFDMTAILRNE
GLMS
Gr36a \- GA16444 \- Fine as is. The other GA models in here should be
dropped, that is, GA16445 and 16442. This is the single ortholog of a
triplication of DmGr36a-c.
Gr36d \- I dropped this from the chemoreceptor superfamily but the name
is stuck in FlyBase.
Gr39a \- GA16340 \- This gene is complicated. In Dm this is alternatively
spliced to give four proteins, each with its own first exon and sharing
the last 3 exons. In Dp there are seven isoforms similarly structured,
but two were lost from Dm, while one Dm first exon is duplicated in Dp.
It would be great to have these letters PA-PG along the chromosome as below:
MRDALHDLLKYQRRLGLTTVDTEDQNGNCCKLRPNWGTFFQFWLLQGVVVFTCSVFIIFWDHKFEATHTGVANHFAHVLE
VLEPLSISWLLVWMRLHEGRQVRLLNRLQEMARVCHQVVTIPRWLLRLWLISSVGIVLSCLLYAFTLTGLELVSLVPYGT
FILRHTYYNYLITFFTAIIFGMEQILMAHRRRIERSLRSTNKRELARSLCAIDEIHLLCETDINYIFGGSLALQMLYIVL
STASFGYILSLEWFELLTCGAIVLCIFPTMFYSTMPAWSIRLQVEANKTAKILAKVPRTGTGLDRMIEKFLLKNLRQQPI
LTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MGQKLLLTFLHYQRYLGLSDLDYSEAGRGYWLHATWYSYAAQFLVAGTFFSALVAALSEPLYYINTGSMTGNIFDNAVMM
TASVTQLLANLWFRSQQQTQVALLQRLSKIKEHLKVDTVALSSPRRMYRLWVGTWFFYGYMVGSFAASFWLAKPKLSHAL
TLLGFGLRVMSANFQYTCYSGMVCLLQRLLRAQAEELQILVDTHPIPLEALAKSLRVHDEILMLGQREFVQVYGGVLLFL
FLYQVMQCVLIFYVSTLKGSLNLRTTLTMSGWLAPMLLYLILPLMVNDVSNQANKTAKILAKVPRTGTGLDRMIEKFLLK
NLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MKLLKFLHYQRYLGLSNLDYSQHQQRYTLRGTCVSYALQFVVAVIFVSAFVSALAESVNYMQTNSLTGNIYDHAVILTVS
VTQLLANLWFRAHQQAQVTLLRRLSKVMRLLKVNTLALGQPRWVYRLWVAVCLWYAMMIGSFGSSIWLSGMKLSHILTLL
AFALRLLCANFQFTLYSSMVCVLQRLLSVQGELLQVLLGDPSGISRGALARCLRLHDEILMLSQGEFVQVYGGVLLFLFL
YQVMECVLIAYVSTLEGFRSMQELARIVCWISPMLVYLILPLMINDLSNQANKTAKILAKVPRTGTGLDRMIEKFLLKNL
RQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MEDPPSGELCVYYKICRYLGIFCIDYNLSKQRFWLRRSLICYVVHFAVQTYLIGCIAIMVLYWNHAFKEEMTQTGNHFDR
LVMLLALGMLLVQNAWLIWLQAPHLRIVRKLEFYRRKHLQHLRLQLPKRLLWIIIISNMLYLYNFVKICIFEWLSDATRL
FVITALGFPIRYLVTSFTMGTYCCMVHLMRHLLLSNQSQISLIISQIQDPKLGSANVLRLRGCLDMHDRLVLLCNVEISL
VYGFIAWLSWMFASLDVTGVIYLAMVVPSLRNPCVQIVGYLVWLTPSLMTCGASFMSNRVALQANKTAKILAKVPRTGTG
LDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MSLGVELRVYLSSLKCLGLLCITHEPHDSEYRTHASRTDETLALAGLVCSQCIALLALGHAIVQPRVYELPLYTNVGNVY
YVANYGLTCLTVSLFYAYFYLRRRSFLSLVSVLLYHNRVDLGNCHSRQFLRLYIIFVSQVLLTGLLQMMIMLYCNIDPLH
SFLLFFFVSFSYMLIALVIAFYTCLVQIVASLVRLYNRDLTAAVHSRAPLSTTLCRLRLLQRNRLLWVCQQHLTSDFGLV
LTIIIAFLLFSAPAAPFFMVTIVFEIDARLVGMRHLLIPLAVTLLWNLPIVVALLMTLRSDLVGKEANKTAKILAKVPRT
GTGLDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MRMHSFGELRTHLRTLRLLGVFRFHIDYDKCVVSSTPSEERVARFYLWGVLWILLNIQTYCTYMPQHFFMVNYNATGNCY
ALINIRTCNVTTVLIYTMLYVRRCRYARLLETMLRLNRASRDPQSSSLYGIHLTLFVLCMINYGHGYWRAQVRPTSIPIY
LFQYGFSYMLMGQLVVLFVSFQRILLSSLRCYNRKLLGSRQLSRECREFYEDFRDYNQIIRLCHEDINDCFGLLLLPITG
YVLVTTPSGPFYLISTLFEGLFRTPWRFAFMFLTCVFWSMPWVTLLVLAMGTTNVQREANKTAKILAKVPRTGTGLDRMI
EKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
MSVCRDLRLYLRALKFLGMMCWQFETERCLLQSTPECERYAQVYTVVVLSGTTGALAYAHLQPDRFRMKVYNRTGNFYEA
VIFRCSCVVLWLLYVCLYLRRHRHMELVQSLLTINRTCLAGSADRQFRNNFVLYGALSGLIFGNHINGYRHAGLDSIALT
LNVVLYTYAFLVLCLLLVFFVCLKQIMAAGLTHYNEELRQGIASLDVATIFCGRQKIPSLRGRQQILALRGRQQLLALCE
GELNECFGLLMLPIVALVLLLCPAGPFFLISTVLEGKYGPKQYILMTLTSIFWDVPWMIILFLMLRTNGITVEANKTAKI
LAKVPRTGTGLDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF
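To attach the letters PA-PG to the seven proteins in the order given, something like the following could be used. It is only a sketch: the function name, the "gene-PA" header format, and the toy input are assumptions made for illustration, and the input is assumed to be listed already in chromosomal order.

```python
from string import ascii_uppercase


def label_isoforms(gene, proteins, width=80):
    """Build FASTA text with headers gene-PA, gene-PB, ... in input order."""
    records = []
    for letter, protein in zip(ascii_uppercase, proteins):
        # Rejoin sequence lines that were wrapped in the message.
        seq = "".join(protein.split())
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append(">" + gene + "-P" + letter + "\n" + "\n".join(lines))
    return "\n".join(records)


# Hypothetical toy input: two short "isoforms", the first one wrapped.
print(label_isoforms("Gr39a", ["MKL\nLKF", "MGQ"], width=4))
```

With the seven sequences above supplied in order, the headers would run from Gr39a-PA to Gr39a-PG.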
Gr39b \- GA16339 \- Just needs an S on the end.
Gr43a \- GA14333 \- There is an open intron that needs to be spliced. It
might also need to be removed from DmGr43a.
MEISQPSIGIFYISKILALAPYATRKNSKGQYEIRRSWIFTVYSASLTVTLVFLTYRGLLFDANSKIPVRMKSATSKVVT
ALDVSVVVLAVVSGVYCGLFSLRETLELNARLYRIDNTLNACNYSRRDRWRALGMASVSLVTISVLLGLDVGSWMRKAEE
MNIEESDTELNVHWYIPFYSLYFILTGLHINFANTAYGLGRRFRRLNQMLSSSFLGEPKEASALKLRKMSTVKTVSTNTT
MTPTALHSNLTKLSSEMLPNESAAKNKGLLIQTMADNHESLTKCVLLLSNSFGIAVLFILVSCLLHLVATAYFLFLELLS
TRDNGYLWVQALWILLHSFRLLMVVEPCHLAARESRKTIQIVCAIERKVHEPILAEAVKKFWQQLLVVDSDFSASGLCRV
NRTILTSFASAIATYLVILIQFQKTNG
Gr47a \- GA11896 \- This is a pseudogene in Dp \- my best version is
YCLIHHSLCTTIPEFVSPPZPZHTACIAITCWQGRHGVRKVFKDLLHLERQYFKGQPSGTPLKARYKAEEQGGTDEEVAX
LLFSKMLRIGSRZIVSTRSALSVCCPATAGSRSPSTPATPKSLDLSLSLRFFVILSYXLSTDITQLREDSLRILLETNRM
DRMVRSYPRTYVTLELALHPRRVEFLNVFAFDRKLTLTLLAKTLLYAICWLQGDYMTLKR
Gr47b \- GA15589 \- Here's my model
MKPRKSSSDGFIYCYGNLYSLLFYWGLVTFRVRTQSDGGAASSPVSVLYAVCVRCFTVCGYLGSVLIKLEDERMAAAMIG
HLTPVVKVIIMWECLSSSITYVENFLTLDVQRRRHVKLLANMQGFDLDISEEFPSVRWNYQRTRSKYWYGTVIVTICFFS
FSLSLILNMARCTCGLSSTLLMAGTYTLLTSSLGLVGFVHIAIMDFVRLRLRLIQKLLHQEYEGSTGRDRPQTTVHRRIA
KLFQFTKRCSHLLAELNAVFGFSLMTCFAYVICQKVLGSEAWDLEYTFMLLHVTLHSYKLIITSTYGYLLKREKRNCLRL
LGLYSQHFPQQPLARSQVEDFQHWRMHNRQAALIGSSIQLSVSTIFLVYNGMANYVIILVQLLFQQQQIKERQRELGRDV
DIVGPMGPRTHLD
Gr57a \- 12290 \- Fine as is.
Gr58a \- 15816 \- Here's my model.
MSIGLLLKTFHSYGLGSGLLPAPLRLDLDLDRIHFSKRSHQRRFYLAYTACLNVLLIVLLPCTFPVFMYDESYMRDKLVL
QWTFNLTNVTRIMAMVACGYLTWTKREPLLQLGEGLARHCHRCRQLENGALHPSGYRELQKRIRGLLRQQVFVLNLSIVS
GTLLLMRIDTDVRRSNIIMVVVHMLQFIYVSIMMSGLYVICLLLYWQMERVNLALKELCSRLHHEERNALLLSASLARQT
LHSLGHLVQLHCEGQRLMRSLFGIFDVTIAFLLLKMFVTNVNLLYHAVQFGNDSIDTTSVTKLWGESLIVTHYWTAVLLM
NLVDNLTRQNGLETGEILRQFSDLELVKREFQLELERFSDHLRCHSTAYKCCGLFVFNKQTSLIYFFSALVNVLVLYQFD
LKNSVLKIP
Gr58b \- GA12328 \- Needs two aa on the C-terminus, plus internally seven
aa need to be added back at the end of the intron.
MLHPKLGQALRVAYYHALVFGLMGTTLHIRGNSRLIRVEKVSWIYLAYSLVISGGLLLDTYFMVPKAILDGYIHHNIVLQ
WNFFLMLGLRIVTIFCSYGLVWLQRRQLVKLYVDSRHLWRNYRRLLKRMVDQQDLEKLQLSLTSLFWRQTIVVYGALLCS
SVVQYQLLSVINRQSLTALCARLSQLLHVLAVKMTFYALLLMLDHQFQAVLLALRTLQRRKGGKQKAKDLRRIAALHLDT
YHLARHFFGLYDVANAMLFINMCVTTTSILYHAVQYRNQSIPSDGWGNLFGSGLVLFNLCGTLMLMEKLDRVVSSCNVGP
ALRQFCDLRKISKELQMELEIFSTQVHRNRLAYKICGLVEVNNSACLSYIGSILSHVIILMQFDIRRQQME
Gr58c \- GA12324 \- Needs five aa on the N-terminus, plus a C-terminal exon.
MVHVSLLRFYFELSRLIGLCNLHYDPPHHRLVLNHVPTVVYCLVLDVAYVLIMPFAFSLLVGNIYGCKQLGMFDTVYNVM
GQAKLFSMLVLIGGVWLRRCRMEGLGNEYLKLLFHFRSAALNHVRRLCLWKVALTSSRFVMLIQILLTPNSLMHCKYTLD
RTGVAPFYLAAMGYALIMELMVTYVDVTVYMIHVSGNWLISSMTERLQEMIDDVEVLPKRLGRPRDMGLRQILSAWLLLW
HRCLHLDDLLKQLRDIFQWQILFNLGTTYIFSIATVFRLWIYIDYSKDFSLWTCLIMLFVFLAHHCEVMMQFSIFHTNTS
KWRKLQEQLQHLWFLNQSQNGAGLTAEVVLSRKLEFAILYLNRKLQARPQRVRHLHILGLFDLSRASGHAMTSSVFSNAL
VLCQIAYKIYG
Gr59a1 \- GA15713 \- Please use this for the first of these three
duplicated genes.
MGILLMLDIFQWFAVLIGLTSYRPVDDRFIQTRIAKAYTLILNVITVTMLPVALMSAVNYFYVAVWLPRFMWITPFVLYA
VNYVVIVQTLISRCHRDSILMELHHLVVKLNREMGRAEKKMNSKLRRLFYVKTLTTSYLSLCYILGTFLFSDELTFSMML
SAFLINNGYNILIATTHFYFVSFWQVARGYDFVNQQLEELISTPSPLTSRYTEEMRSLWSFHSSLGQTAHKINRIYGRQM
LASRFDYIIFTVINGYIGLMYSSREPTTLFAKCYGGLLYWIRTVDFFMTDYICDLVAQYQSMPKHTASEGVMSNELSSYV
IYQNSMNLNLKVCGLFPANRKQWLNMMGAILCHSVMLLQYHLMMSAKQRNQ
Gr59a2 \-
MGILLMLDIFQWFAVLIGLTSYRLVDDRFIQTRIAKAYTLILNVITVIMLPVALVSMVDYFYVAVWLPRFMWITPFVLYA
VNYVVIVQTLISRCHRDSILMELHHLVVKLNREMGRAEKKMNSKLRRLFYIKTLTTSYLSLCYVLGTFLSTNELTFSMML
SAMLINNSYNILIATTHFYFVSFWQVARGYDFVNQQLEELISTRSPLTSGYAEELRNLWSLHGSLSQTAHKINRIYGRQM
LASRFDYITFTVINGYLGLMYSSSEPTNLFEKCYEGLLYLIRTVDFFMTDYICDLVAQYQSMPKHTASEGVMSNELCSYV
IYQNSMNLNLKVCGLFSANRKQWLNMMGAILCHSVMLLQYHLMMSAKQHNQ
Gr59a3 \-
MRRFLLLDIFQWFAVIIGLTSYRVVDDRFVQTRLSRMYTLIVNVITVTMLPAATLDLFNSFSMGFWLPQFMWITPYVLHA
VNYAVIVHTLIFRGHRNIIQMELHHLSVKLNREMGRAGKQMNSTLRRLFYLKTLTLSCMCLWHFLQSFIVIGVSSLSKIL
GLIIINSGFMILMSIVHCYFVSLWKVAIGYDFVNQQLEELIATRSPLTSRYAEELRSLWSLHASLSLTAHKINRIYGRQM
LASRFEYITFTAVDSYMEIIYYFSESAPAISKCFGISFFSIRTIDFFMADYICDLVAQYQSMPKHTASEGVMSNELSSFV
IYQNSMNLNLKICGLFPANRKQWLNMMACILGNSAVLMQYHLMMSGKEEKYFKS
Gr59b \- GA15716 \- Just needs an N on the C-terminus.
Gr59c \- GA15710 \- Needs an open intron removed.
MADFVWIIQRFVYLYGRLVGVVNFTVDWRTGRAMITRWATIQAAVQNICLIGLLTFQLLHGDTVLFTFKHAKYLHEYVFL
MVTAVRHWAVLLTLVSRWRHRGDIVLIWNRLFRATQQRPDVIPLYRRRLILKFIFAVLSDNLHMVLDLSALRQKFSPALV
LKLIVWYLFTTIFNMIVAQYYLAMLQVNVSYTLIKRDLRELLTETQALCGSTNRRGGVFVTKCCALSDRLDRIAETQSKL
QALVDGMSKIFQIQSFSMTIVYYLSTIGTIYFAFCTIKYSSTGLGASNWGLLLIVLSTTFFYADNFITINIGFIIMDSNP
ELMKMLEERTLLCEELDERLKSSFESFQLQLARNPLEFYVMGLFKIDRGRIMSMANSLITHSIILIQWELQNN
Gr59d \- GA15766 \- Needs a C-terminal exon.
MPDLVKWCVRISYLYGRVTGTLNFEIDLKTGRTRVTKKATIYSALAHVFLITILAHHLWRVKPTSDLLANANALHENVFM
VVAWMRVSCALAALAGRWYHRRRYMRLISSFRCLYLKNTEVMQHCRRGFVSKCFIATMAESMQFLMALLVVWDRLTFSLL
IGIWSVMTVTAVMNVIITQYYFALGNARGHYKLLNRDLREVLAEARSLGPKRKRQNGVFITKCCFLADRVDEIAQTQSEL
QTLIERMSRIYELQVLCLFCTYYLTSVGNAYLLFSIYKYNNITQGWSKLSLLAGATFLVFYYADCLINSYNVFYLLEAHQ
EMHKLLEQRNLFPWGLDERLESAFDSLELSLARNPLQLHCFGLFKLDRSSAFDVGNSLLINSVLLIQYDMQNY
Gr59e \- GA17326 \- Needs 5aa on the N-terminus
MSSSISNRWGNLLLTISRCLAVAPTARQEGRFARWIHCFWCLVLLGYVWTGCIWKCIVFDAEMPTIEKLLYLMEFPGNIT
ITGFLVYHAVLNCPYARDVETQIHLLIGRQDFGVAQRLYQKHGKRTRHLLVQTIVFHGACIVVDIVNYDFNWWTTWSSNS
VYNLPALMISLGVLQYALAVHLLWLLKSHLCHCLEQLQKRRRLPQGIVNLDARYDRFFASLVDAGGCSSLVLEELRATYT
SIDRLHRQLLDKFGLFLLLNFGNSLCSFCEELYMVFNFFERPQWAAGMLLFYRILWLVMHGGRIWVILAVNEQLVEQECQ
LFLQLNQMEVCGSHLERTINRFLVQLQTSIGQPLLACGVIDLDTLAMGGFVGVLMAIVIFLIQIGLGNKSLMGVALNQSG
WIYI
Gr59f \- GA17325 \- Needs a T on the C-terminus. The third intron has a GC
donor. And for the N-terminus I'm pretty much stumped because I can't
build a phase 1 intron like Dm and DyGr59f and all the related genes.
Rather I have a phase 2 intron there.
MEAFLMSLAVDMGKPKKVPPKHPMTAERLLKGHPSFHQQTRRLYKALHWLLLISVLANTAPIAVLPGRQGIVYRHLHLCW
MAVSYGWFCLASYWEFVLITLNKVSIDCYLNAMESAIYVVHSASILILTFQWRHRAPAVIDRIVKSDLERGYSINCRQSK
HFLRVQLSLVLVLACSAFAIDICSQRFVVYKAILSIHSFVMPNIISSLSFIQYYVLLQGIAWRQAAVTESLQSELQHLPC
PRRWEVQRLRLQHVELTRFTKLVNTAYQYSIVLLIVGCFFNFNLNLFLVYKGIDVPELADWVRWIYMVLWLAMHMGKVYG
ILYFNHKVQDEQRKCLALLNGVQCVGPDLLDTLNHFVLQLQTNVRQHVVCGVIVLDLKYLSALLVASANFFIFLLQYDVT
YEALYKLT
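For reference, intron phase is the position within a codon at which an intron interrupts the coding sequence: phase 0 falls between codons, phase 1 after the first base of a codon, and phase 2 after the second. A small sketch of that arithmetic follows; the function name and the exon lengths are hypothetical and are not taken from the Gr59f model.

```python
def intron_phases(exon_cds_lengths):
    """Return the phase of each intron, given the coding length (in bases)
    of each exon in order. An intron follows every exon except the last."""
    phases = []
    total = 0
    for length in exon_cds_lengths[:-1]:
        total += length
        # Bases of an incomplete codon left before the intron.
        phases.append(total % 3)
    return phases


# Hypothetical coding lengths for a three-exon gene.
print(intron_phases([100, 50, 30]))  # [1, 0]
```

Changing the first exon's coding length by one base shifts the phase of the first intron, which is the kind of difference between a phase 1 and a phase 2 model.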
Gr61a \- GA12601 \- Needs 8 aa on the C-terminus. And a 15aa N-terminal
extension is possible beyond that of DmGr61a.
MSKAPDSILRRLKVRRQKQRTILAMRWRCAKGGKEFKELDTFYRAIRPYLCVAQLFGIMPLSNVLSRDPQDVKFRLRSVG
MCFTGLFLLLGGIKTVMQANILFRTGLNAKNMMNLVFLIVGIVNWLNFTGFARSWSKLILPWSSLDILMQFAPYAPSKHS
LRSKLRLIGCVVGSLAVVDHLLYYASGYYSYHMHIFHCHTNHSRLSFGSYLEKEFSETFELLPYNMFSVCYGFWLNAAFT
FLWNFMDIFIVLTSIGLAQRFRQFADRVLALQGRQVPDTLWYDIRRDHIRLCELASLVDESMSNIVLMSCANNVYVICNQ
ALAIFTKLRHPINYVYFWYSLLFLLSRTSLVFMSASKIHDASLLPLRTLYLVPSTHWTEEVQRFVSQLTSEFVGLSGYRL
FYLTRKSLFGMMATLVTYELMLLQMDAKSHKAGLPDLCA
Gr63a \- GA13400 \- Needs six aa on the N-terminus.
MDMKFPHSFSKMANYYRRKKDAVFHNAKPINSGNAQAYLYGVRKYSIGLAERLDADYQPPPSDRKKSSDSTGSNNPEFTP
SVFYRNIAPVNWFLRIIGVLPIVRRGPARAKFEMSSASFVYSVVFFMLLACYVGYVANNRIHIVRSLSGPFEEAVIAYLF
LVNILPIMVIPILWWEARKIAKLFNDWDDFEVLYYQISGHSLPLRLRQKALYIAIVLPILSVLSVVITHITMSDLNINQV
VPYCILDNLTAMLGAWWFLICEAMSTTAHLLAERFQKALKHIGPAAMVADYRVLWLRLSKLTRDTGNAMCYTFVFMSLYL
FFIITLSIYGLMSQLSEGFGIKDIGLTITALWNIGLLFYICDEAHYASVNVRTNFQKKLLMVELNWMNSDAQTEINMFLR
ATEMNPSTINCGGFFDVNRSLFKGLLTTMVTYLVVLLQFQISIPTDKGDSDGGTNITVVDMLMDSLGNDMTILSASSSTT
THSTATSSTTPPPTSAKHGRGHRG
Gr64a \- GA16796 \- Fine as is.
Gr64b \- GA16793 \- Fine as is.
Gr64c \- GA16792 \- Needs MK on the N-terminus.
Gr64d \- GA17330 \- Mine is
MEWSAQSVPNKNTLHHAIGYVLVVAQFFGVLPLSGVEPSVPVASVRFRWFSPLNLLPVAALCFVLLDFVLSAKLVIQNGL
KLYTIGSLSFSVICIFCFGAFLLLAPRWPHIIRRTFECERIFLQSCYNSSIGRRFSQRLRRWAIALLVTALCEHLSYVVS
AVWSNWKQIRECHLDIDFWQNYFLRERQELFSILPYSTWFALYVEWCTLSMTFVWNFVDIFLILVCRSMQMRFQQLHWRI
RQHIGRRMSDEFWQEVRYDLLDLNDLLKLYDKELSGLVLVACANNMYFICVQIYHSFQVKGAVLDEVYFWFCLLYVVSRI
VNMVLAASSIPQEAKQINFTLDEVPTSCWSKELERLSEIFHNEAFALSGKGYFVLNRRLLFTMAATLMVYELVLINQMEG
EEVQRSICNRGAGSSMSIFFS
Gr64e \- No GA number \- Mine is
MARTTGDPVKRQKCIARIKFWRRSRVGSDITLGILKYKVVSNQAQRFQFSKINAFLRRAVRDDYRYSGSFQEAIKPVLII
AQIFALMPVRGIGSKLAEDLTFAWSSARTYYALAMMISFGVTSGYIVAFMTNISFDFDSVETMVFYGSIFLISMSFLQLA
TRWPAIAQEWQAVETKLPPLRLGKERRSLAHHIKMITLVATTCSLVEHLLSMTSTMTYSVACPRWPGHPVDNFLYFNFAT
VFHFVDYSTFLGLLGKVINVLSTFAWNFNDIFVMAVSVALASRFRHLNDYMQREARSATTVGLLDAVQSQFRNLCKLCQV
VDDGISTITLLCFSNNLYFICGKILKSMQTKPSASHTMYFWFSLTYLLGRTLVLSLYSSSINDESKRPLRIFRMVPREYW
CDELKRFSEEVHMDTVALTGMKFFRLTRGVVISVAGTIVTYELILLQFNKEETTAFTCENA
Gr64f \- GA16791 \- Needs an internal segment restored.
MKFLPAKLERKFRRLKKHSRSSLTRKLDVMHESARKKVIEENCDAYKNQKQSEYECRKRPTKFPGGTRETFLSEGSFHQA
VGRVLLVAEFFAMMPVKGVTAKHPGDLSFSWRNVRTCFCLVFIASSLANFGLSLFKVLNNPISFNSVKPIIFRGSVLLVL
IVALRLAQQWPTLMMYWHEVEQGLPQYPSQVGKGQMGHTIRMVMLVGMMLSFAEHLLSMISAIHYARYCNSTSDPIKNYF
LRTNDEIFYVTSYSTALALWGKFQNVYSTFIWNYMDMFVMIVSIGLAAKFRQLNNDLRNFKGMHMAPSYWSERRIQYRNI
CVLCDKMDDAISLITMVSFSNNLYFICVQLLRSLNTMPSVAHAVYFYFSLIFLIGRTLAVSLYSASVHDESRLTLRYLRC
VPKDSWCPEVKRFSEEVISDEVALSGMKFFHLTRKLVLSVAGTIVTYELVLIQFHEDNDLWDCNQSYYS
Gr65a \- removed from the superfamily as not clearly homologous.
Gr66a \- GA20169 \- Needs three aa on N-terminus, and a further 20 are
possible but don't align with DmGr66a. I also have a somewhat different
internal splice \- but not sure who is right.
MPPAQTESALPMVQPLLKEFQLLFYISKIAGILPQDLEKFRTKNVLEKSRNGMVYMLAMLIVYVLLYNILIYSFGEEDRT
LKASQSTLTFVIGLFLTYIGLIMMGTDQLTALRNQGRIGELYERIRQVDERLYKEKCVVDNSHIGGRIRFMLIMTFLFEL
SILLATYIKLVDYTQWMSVLWIVSAIPTFINTLDKIWFAVSLYALKERFEAINQTLEELVATHEKFKLWLRGDHDTSSRT
LDSSQPPEYDSNLEYLYRELGGLDMGSLKGSGKNKVAPVSHSMNSFGESIEASDKATHHPISVNMAHESELSNASKVEEK
LNNLCQVHDEICEIGKAMNELWSYPILSLMAYGFLIFTAQLYFLYCATQFQSIPSLFRSAKNPFITVIALSYTSGKCVYL
IYLSWKTSLASKRTGISLHKCGVVADDNLLFEIVNHLSLKLLNHSVDFSACGFFTLDMETLYGVSGGITSYLIILIQFNL
AAQQAKEAIQTFNSLNDTASLVGAATEMDNGSSTLYDLVTTTMLTPTV
Gr68a \- GA20248 \- Needs EV added to C-terminus.
Gr77a \- GA16898 \- Not sure why, but this time I extended the N-terminus
well beyond the Dm alignment.
MKFNQANSWRNQCSNAPMLQCANAPLPLALIRLCVLDSLWRTAQNASCIHNDMASSSLAGLTFYWLRKVAIAALLVLYGL
AKVFGLMAASTPRGGHRVRQSLYWRIHGYAMLVFVGCFSPIAFASVYHRMAFLRQNRLLLLIGFNRYVLMLLCAFATVCI
HTTKQEEIVGCLNQLLRCRRRLMRLMHTPELRQSIDRLSTRGNLLIVGVLIGVFILSPVHTLQILAWDPAVRENFLYVFS
LLFIYACQLILGLSLGLYVLVLVLLDHLGHHSNQLLARLLADAASLRAPLGCGIIRRRQQLYHSQQTWLALELWRLLHVH
HQLLALFRSICSLTGLQAVCFVSLVAMECMVHMFFTYFMHYSKFILRKYHRAFPFNYYAMAFVTGLFANLVLVICLTHRM
ICRFAHTREVLRSGVLALPPGGTVKQLNETLHYYGLYLKNAEHIFAVRACGLFKLNNVLLFCIVEGMLNYLMILIQFDKV
INK
Gr85a \- GA16233 \- This gene is truncated by the end of a contig, so I
built the rest of it from raw traces \- here's my extended version.
MSSLKRLVQLSFGFFCALNGIVPFYFGFSSGKLHWSRVLAYYRIIHNFIVIGLTIKFITNYWHFHTHVEHSRSKLMELNT
FTHFTIVLLSLLSCMECAHRQQDRIYGMIEKLLDLDRLSIELGYIAPKSHQRYIGLLVLAITPLLALRLFIHVGLNKIRT
RLGFDFPCNCFMSECMILGMSSVGFGIMAEICQCWWRLQTGLKRTLLDDSLPDQLNQLLQLQRMFQCLIDITAEFCTVFK
FVLLCFLVRNVWCGIVIGYMLVRIFCGHGVSELHLYQLYLAFVICIQPLLYSLLLNCLTHTTDSLMETTKEIVRESQGHE
LLVERSIQWFSLQLAWQHTNVHVYGTYRINRRLVFQSASVILLHVSYMVQSDYRSM
Gr89a \- GA13339 \- Not in BLASTP \- My version is below, but there is also
an 18aa N-terminal extension possible beyond the DmGr89a alignment.
MSRLPHVCGLCLLLWLWQLLALAPFSYSRSRGARCRRLLTLSGVLRWLLLIGLAPLMLWKSAAMYDATNVRHSMIFKNIA
LAAMTGDVFISLALLGAHLWHRRGLARLLNGLAQLHRKRKLGWGSTLLLWSKLLLSLYELLCNVPFLQGAGSRLPWTQLL
AYGVQLYVQHVSSVYANGIFGGMLLLLASLDHLEQESPALARLLKRERGWLRLSASFVDLFQLGIFLLVIGYFVNILANM
YAYMSYFVSQHGVPLTISNYCLIVTIQLYALILAAHLCQVRHGRLRQRCLELGYLPPELTHHQAMAWTPFPLFAPLDSLK
FSVLGLFTLDHAFWLFLVSYAMNFIVIILQFSLENMQHADDN
Gr93a \- GA12269 \- This needs N- and C-terminal extensions.
MDKYSQQKRGGGGVVAEAWSRGLLLTLYRAARVLGLISFRLDREELQLKLPKSGSRNRILETVWRCLVVLTYAGVWPQLS
AHLITDRPESYADMFALMQSFSVSILALVSFIIQAKGEDKFRTVLNRYLTLYRRICAVTRVDQLFPMKFIVYFLLKLLLT
IGGCVHEFPPLLKNEHFKDARNMVAVIVGIYMWLGTLFVLDACFMGFLVSGILYEHMAFNILLMLQRMQPIECEEKAVRM
SKYQRMRLLCDYADELDECASIYSELYGVTIAFRRMLQWQILFYVYYNFISICLILYLCILHYLNANEIALVSLAMATIK
LFNLILLIMCADYAVRESQKPNRLPLDIVCTDMDQRWDKSVETFLSQQQTQRLEIKVLGFFQLNNEFILLILSAIISYLF
ILIQFGITGGFEASEEVRKQFNSSSHQIQELLN
Gr93b \- GA16189 \- Needs 10 aa on N-terminus
MTGSSSARSTVMPRVSPWLNGPRISAGLLRGCFYYATVFGVATFRIGLQDDASKLRASSRKGYKWLSILIRVLGSCFYGY
SYGAWADQYTDWYLRLFFGLRLVGCLVCSVIILVLQVCYEKRILHLVNSFFGLFRRLRALTRTVKAGFGGRLELTLLMFK
LLSLAFVFLAFQWQYSPWVLLTILCDLYTSIGTGMIMHFCFVGYLSIGVLYAELNRYVDHQLRAQLSSLQDQVEEDDIQQ
QPDVQAHANLDECLAIYEEIHQVTCSFQRLFDLPLFLTLVQNLSAMAMVSYHAIMSREYHFSLWGLVLKLLIDVLLLTLA
VHGAVSSSRLVRRLSLENYTIGQSKSYHIKFEIFLGRLNHQQLRVCPLGMFEVSNELTLFFLSAMVTYLTFLVQYGIQTK
QF
Gr93c \- GA16064 \- Fine as is.
Gr93N \- GA16188 \- Not in BLASTP. In our opinion the ortholog of the
DmGr93d was lost in Dp, while the ortholog of this gene was lost in Dm.
Thus they are paralogs, hence the name Gr93N. This gene is actually more
similar in sequence and trees with DmGr92a, but that gene was lost from
Dp. Complicated, I know.
MFGVLRKMNASKLSAGILLVMYYHAIFMGIFSFKLQRHWISENGQMLMELRALPRPWLMRFYAIYRILAIGILAYFYLPW
ILRLEIFFERLVHFIRIITATLVCVCILRLQLLHKADSKQLMNTFFRLFRRVRALPSRKTFGYGGRRELVLLSLALICRI
HELVYILESDRQHFSMARFLSWWCDTFIVFGINMMMQMSFVCYLSIGILYSELNDFVRFQLRSELQALKRPHGLQPRRQH
LRTVRRKLNECLALYREIYALATTFQKLTDFPFFLSIVHNYTLLGVVIYRLTIVGWFDKHKIQLSILTTKVILDFFLVTM
AVEGAMTQFRVIRRLSLENCYISDHKDWHTTFDMFVTHLSLYEFRVRVLGLFDVSNELVLIVLSALVTFVIYVVQYRMQS
TGEAE
Gr94a \- GA16146 \- Needs aa on both ends.
MASSIDVTHRRMVKVLTITLILLMTVFGLLANRYDSRRRQSFKLSKVYLAYAILWTTAFAGIYGYQIYQDYIQGQINLRD
AVSLYSYMNITVAIINYVTQMIMNDTVAKTMSQVPLFQTLKMFHLDNASLLRSIAMALVKSVGFPLILETTFILQQRRLE
PELSLIWTVYRLLPLIISNLLNNCYFGAMIVVKEILKAINVRLESHRQQVNIMQREDQLKLNTSFYRMQRFCSLADELDR
LADRYIVIYVNSDKYLSLMSLSIILSLICNLLGMTVGFYSLYYALADTFLGSKPYDGLGALISLVFLFISLTEITMLAHL
CNNLLVATRRTAIILQEMNLQHADCRYRQAVHSFTLLVTLTKFKIKPMGLYELDMQLISNVFSAVSSFLLILIQADLSHR
FKMY
Gr97a \- GA17266 \- Fine as is.
Gr98a \- GA12669 \- I don't understand how you avoided the 11bp
frameshifting insertion relative to Dm around position bp90 in this
gene. My translation is below, with X for the frameshift. Thus I
consider that either this is a pseudogene or translation starts
after this frameshift.
MRLMSGELDSCSLRRMHRVMKCLGIIPFESXPIQHFYLKVLCNLFMVFVIGTASSWRFSFNYEFTYDFLNDHMSRILDLT
NFLVLMSAHFTIVMEILWGNRSAEIEQQMEQILHDLRVHLGREVSLKRFRHYSNAIYGSLISRFLLLFAVAVYNNEGLVF
SAMYSEAVMQLRFTEFSLYCGVALAFHQELCSAGSSLLVELHLTRFDLWPLRRFTLEKLSRLQQIHGRLWQTIRLIERNF
QRSLSIMLLKFFVDTAALPYWLYLSKIQHTSVSVQYYCATEEFCKLMEIIVPCWLCSRCDLMQRKFRSIFYRLATGRRNG
QLNAALIRICIQLRQEKYQFSAGGFAYISTEMLGTFLFGMISYIVIGIQFNLNFNASNSSKLAAEAAVTDAPI
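The X marker in a translation like this can be located programmatically, along with the next methionine downstream of it. This is a minimal sketch with hypothetical function names; whether a downstream methionine is actually used as a start is not something the code can decide.

```python
def marker_positions(protein, marker="X"):
    """Return 1-based positions of the marker residue in the protein."""
    return [i + 1 for i, aa in enumerate(protein) if aa == marker]


def next_met_after(protein, position):
    """Return the 1-based position of the first M after the given
    1-based position, or None if there is none."""
    idx = protein.find("M", position)  # 0-based search starts just past it
    return idx + 1 if idx != -1 else None


# Start of the Gr98a translation given above.
gr98a_start = "MRLMSGELDSCSLRRMHRVMKCLGIIPFESXPIQHFYLKVLCNLFMVFVIG"
shift = marker_positions(gr98a_start)
print(shift)                                  # [31]
print(next_met_after(gr98a_start, shift[0]))  # 46
```

Residue 31 corresponds to bases 91 to 93 of the translated sequence, which agrees with the insertion being around bp 90.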
Gr98b \- GA15973 \- This needs aa on both ends, plus there is a possible
53aa N-terminal extension not alignable in Dm. And GA15975 and GA15976,
which are kept for DmGr93c/d orthologs, can be dropped since their
ortholog in Dp was lost.
MKGRRRLLAAARPYLQIFSVFALTPPPNFFDRTTNRRLRRFLIVGYGVYSLFLLGILICMSYVNVLALNAEIEQFQLEDF
TRAMGRVQKVVLASMGIVIHLNMFLNYRRLGHIYEDIADLEMEIDDASQCFGGQPGHNSFRYRLATNCGLWLVALVGLMP
RFTIEAMGPFVSWPSKILSELVLIMLQLKCLEYCVFVVMVYGLVLRLRHTLRQLQVELADCNQRDMLQALCVALRRNQQL
LGRVWRLVGELEKYFTLPMMFLFLFNGLTILHVVNWAYINQFNPADCCRYVRLGNCVLLLINPLVACYLSQRCVNAYNSF
PRILHQIRCLSVANNFPILSMGLREYCLQLQHLRLLFTCGGFFDINLKNFGGLIVTILGYTIILVQFKFQAVAEEKGRFD
LNNSQSF
Language of Publication: English
Data From Reference: Genes (7)