Subject: Re: Dpse Ors and Grs Dear Peili and FlyBase, I've completed a first pass at trying to reconcile all of my versions of the DpOrs with yours. Below are comments on every gene. I am also attaching a draft of the Or table from our manuscript if that helps at all. I hope to have the Grs done by the end of the week. I don't imagine these changes can make it into Release 2, but perhaps there will be a Release 3? Any chance I could get GA numbers for those that don't currently have them to include in the manuscript? Hugh Or1a \- GA14708 \- Fine as is. Or2a \- GA16647 \- Needs nine aa on the N-terminus. MQPGSENQPELNTHSAVDYHWRVWQLTGLIRPAGISKSLYRTYAIGLNLMVTLLFPLSLLARLILTRSMKELCENLTITI TDITANLKFGNVYLVRRHLGEIRSLIQQLDCRALAICDRDELCALVQAVTTARNTFRTFAWIFVCGTSLSCVRVALARSR QLLYPAWFGVDWKNSNEAFVGIYVYQLFGLIVQAVQNCASDSYPPAYLCLLTGHMRALEVRVARIGCECASGQQSYDQLL ACIQDLTLIHRMHTIIQQILSVPCMAQFACSVAVQCTVAMHFLYVADADDRLAMILSVIFFVAVTLEVFVICYFGEKMRT QSEALCNAFYACNWVEQLPRFRRDLLFTLARAQRPSLILAGSYIPLTLETFKEVMHFAYSAFTLLLRAK Or7aP \- No GA# \- A fragmentary pseudogene in Dp \- here's chunks of translated pieces APEDSRPARTPLWRRIYRCFAAMIYVWQLXGIWGGMEITQVMTRKDLGSDLGSAAAATALDGQSRETSEFRRIGKAVRFC NRLVWFYQLLLFYLHQAQQPIALTAMKIFPINLAIAKFSFSLYALIKAMTLGERE Or9a \- GA13635 \- First two and last aa missing. MAKKDETNSLRVQILVYRLMGIDLWSPTAVNDRHLVTFVTMGPLFAFMLPMFLSARENITEVSLLSDTLGSTFASMLTLV KYLLFVYYRKEFVGMIYRIRSILEKEINVWPEAKEIVDAENRSDQMLSLTYTRCFCMAGVFAAIKPFVTMSLALLRDGGD GSKLHLELPHMGVYPYDYQVMWFFVPTYLWNVMASYSAVTMALCVDTLLFFFTYNVCAIFKIARQRLVHLPAMGEADPRG ELEAAVQVLLLHQKGLWIADFIADHYRPLIFLQFFLSALQICFIGFQVADLFPKPQSLYFIAFVGSLLIALFIYSRCGEN IKEASLDFGDGIYESNWMEFAPSTKRVLLIASMRAQRPCQMKGYFFEASMATFSTIVRSAMSYIMMLRSFNV Or10a \- GA14703 \- Needs five aa on N-terminus. 
MWSFHRLLRRDQPLRSYFFAVPRLSLDVMGYWPMGDDLPVRAIVHFVILSIGVVTELHAGFMFLQNAQITLALETLCPAG TSAVTLLKMLLMLRYRRDLTNVWRQLQRMLFDAGLNRPEQKAIIHDNSVLAARINFWPLSAGFFTCTTYNLKPLLIAFIL YLQDPDQELPWNTPFNMTMPKVLLAAPFFPLTYAFIAYTGYVTIFMFGGCDGFYFEFCVHISSLFQSLQEETRAIFRPFE EYLKLTPAQCARLELQLRGLIIRQNSVFELISFFRKRYTVITLAHFVSAALVIGFSICNLLTVGNNGLGALLYVAYTVAA LSQLLVYCYGGTLVAESSFELSRVAASCPWSLGAPRQRRVILLLILRSQRAPTMAVPFFSPSLNTFASILQTSGSIIALA KSFQ GA17050 is the same gene, but with the N-terminus intact, an open intron unspliced, and the C-terminus spliced into the Gr10a ortholog immediately downstream. As in the Gr notes, I recommend removing the Or10a coding region from the front of this annotation and making it just the Gr10 ortholog. Or13a \- Not in BLASTP or in list \- A single long ORF \- unclear why not annotated. MFNLQQYTAITFRLPVQCVWLKLNGSWPFVPSSRKDNSSMCSILYTAWAWYVIVSVGITIGFQTAYLVTHLSDIIMTTEN CCTTFMGALNFVRLLHLRFNQRKFRQLIEQFERDIWIPENTHISVNAECHKRMFTFTIMTGLLSCLICMYCALPLVEIFF RTGVDAVEKPFPYKMLFPYDPYSSWMRYVFTYMFTSYAGICVVTTLFAEDSIFGFFITYTCGQFRLLHERVDNVLTAAKE RADEQHLQLQRLHNIVQQHNKIIRFAKCLEDFFNPILLVNLMISSLLICMVGFQIITGKNMFIGEYVKFIVYISSAISQL YVLCENGDALIIHSTNTARHLYGCDWENPDIRNFYSYTSMSHQLRNDLKFMILCSQRPVRITAFKFSTLSLQSFTAILST SMSYFTLLRSLYYEDQL Or19a \- GA17168 \- This protein can be extended 18 aa at N-terminus, although the Dm ortholog does not extend like this. I'm not sure what the best approach is for examples like this, of which there are several in Dp. 
MGYKRVWPSRGQSKVLDDMRPVDSMSAFRYHWQIWRVMGMHPADPQTLWGRHYTLYGIVWNAFFRLGMALSLVVNFLLST SLESFCESLSVAVPHTVANLKVFFLWRMRQQILQTHPILHHLDGRIGSLAEKQSILEGIDRAYFTFISFLRAIIFILAVG ILILCLSSDRPLLYPSWMPWNYKDSSFTVYAMTVCLHSVGIIENALLVCNVDTYPGSYLNMLAAHTQALAHRVSRLGYDP RLTRSQACDRLRSCILDHQIIMNLFKSLEHSLSMSCFLQFASTAIAQCATCFFVIFVSVGTMQSVNMIFLFLVFTTQTLL LCSSAELVRHEGENLIKAIYDCNWLDQSVEFRRMLLLMLARSQRPMILRAGLIIPVQMSTFMVVCKGAYTMLTLLREVDN SEVA
GA15104 is listed in your table as an ortholog of DmOr19a, and is shown in the genome browser in the same location, but it doesn't come up in the BLAST searches. Dm Or19a/b are very recent duplicates of a single ortholog in Dp, GA17168, so perhaps GA15104 should be dropped?
Or22a/b - This duplication is independent of a triplication in Dp, which yielded two intact genes; I propose the models below for the GA numbers shown in the genome browser.
Or22a1 - GA11469 - not in BLASTP - here's my model
MLSKLFPRIKAKPLTERIQSRDAFVYLDRVQKLWGWRATEDERWMVLYNIWAFIWNVLLLVLLPLSMSMEYVQRFKNFSP GEFFGSLEICVDMYGCSLKCVYTMFGYKRFQAARKLLDRLDLRCTSDEDRASVHRSVALANRCYVTYHILYSGFVVINWT GYLLLGSHAWRMYLPGLDSEKNFLVTSFFELLLMSGVVTMNQCTDVSPLAHMIMARCHMGLLKDRLNKLHSDPSKTEEEH QEDLNRCIHDHCVILDYVNLLRPVYSVTIFVQFLLIGLVLGLSMIHIMFFSNFWTGIGTMCFIFDVCLETFPFCYLCNII IEDCRELSESLFQSDWLGASRKYKSTLVYFLHNLQQPIVLTAGGVFPICMQTNLSMVKLAFSVVTVIKQFNLAEKFQ
Or22a2 - GA18049 - not in BLASTP - here's my model
MLSKLVPRIKAKPLTERTSSRDAFVYLDRVQKFFGWTAVEDKRWRIPYILWGIFMNLLLIFFLPISMLVAYIQMFKSFTA GEFLSSLEITVNMYGCVLKCIYTIWGFKGFTAARKVLDELDLRCTSDEERTSVHRCVALGNLSYVLFHIFYSGFVVINWT GYVLMGRHAWMMYLPGLDAENNFFVASLCEILLMSGVVTMDQCTDVSPLAHMLMARCHICLLKDRLTKLRTDPTKDEDEH YEELSNCVHDHRLILDYVKALRPTFSGTIFVQFLLIGIVLGLSMINVMFFSTLWTGLGTVCFMFCVCLETFPFCYLCNMI IDDCQKLSDNLFQSDWTTASRRYKSTLVYFLQNLQKPIILTAGGVFPICMQTNLSMVKLAFSVVTVIKQFNLADKFQ
Or22aP - Then there is a pseudogene most similar to GA11469 immediately upstream of GA18049
RKSKLVPRITAKPXDVLVYLDRVKKLWGWRAAEDEQWRILYNIWAFVWNMLLLVLLPLSMSMEYVQRFKXFSPGEFFGSL
EICVDMXAKTLLDGLDLRCTSDEERASVHRYVAPANXCYVAYHILYSGFVVINWTGYLLLGRHAWRMYLPGLDSEWNFLV TSFFELLLMTAVVTMNLARCHMGLFKDRFTKLHSDPSKSEEEHQEDLNRYIQDHSEILEYVHLLRPVYSSTIFVQFLLIG LVLGLSMIHIMFFSNFWTGIGTMCFMFDVCLETFPFCYLCNIITDHCQDLADSLFQSNWMAASRRYKSTLVYFLHNLQQP IVLTAGGIFPICMQTNLSMAKMAFSVVTIIKQFNLADKFQ Or22c \- GA 13684 \- fine as is Or23a \- GA 22094 \- fine as is Or24a \- GA 11185 \- fine as is Or30a \- GA12048 \- Needs an internal segment restored. MDLKSMDTVDMPIFGSTLKLMKFWSYLFVHNWRRYVAMTPYIVINCTQYVDIYLSTESLDFIIRNVYLAVLFTNTVVRGV LLCVQRGSYERFIEVVKAYYIQLLESKDAHILRLVEEITRLSITIGRINLLMGTCTCIGFVTYPIFGSERVLPYGMYLPA IDEYKYATPYYEVFFVIQAIMAPMGCCMYIPYTNLVVTFTLFGILMCRVLQHKLRSLEKLEKGLVRREIIWCIQYHLKLA GLVDAMNSLNTHLHLVEFICFGAMLCVLLFSLIIAQTIAQTVIVIAYMVMIFANSFVLYSVANELYFQSFDIAIAAYESN WMDFDIDTQKTLKFLIMRSQKPLAILVGGTYPMNLKLLQSLLNVIYSFFTLLRRVYG Or33N \- lost from Dm \- This gene is upstream of the three that are orthologs of DmOr33a-c MELPSPVIASDYIYRTYWLYWRLLGVEGEHPLRYLLLIMQFFFVTIWYPIHLIVGLICDGTLAEVCRGIPITASCFFASF KIICFRWKLAEIKKVQQLFVELDQRIATAEERSSFYRETIRVAEFIGKSLLVAAFLAIITGTAFGLFRRERNLLYPGWFP YDVYSSDQRFWLSFSYQAAGHSLAILQNLANDSYPPMTFCVLAGHVRLLSMRLSRMGYDLTKPKELIVRELKDNIEDYRK LMKIVQLLRSTMHLSQLGQFISSGINIAITLVNILFFADNNFARTYYGVYFMAMLMEIFPSCYYGTLVSMELNGLTDSIF SSNWVGMDRGYCRTLLIFMQLTLAKVEIRAGGMIGISLNAFFATIRMAYSFFTLAMSLRK Or33a \- GA14239 \- Not in BLASTP \- My model is MASDPPDVRSGHIYRTYWLYWRLLGVEGEYPLRYLLDVILNFFVTIWYPTHLIIGLFQERTIGHVCKNLPFTAESFFCSF KIICFRWRLAEIKKIEQLLMELDQRAVSPEERVFFHQNTRSVAEFISKSYVAAGISATVTGTASVLFSSGRKLIYPAWFP YDVQASALRYWLSFTYQATGATLTILQNMANDSYPPMTFCVVAGHVRLLAMRLRRMGHHEKASKQGNAKKLIENIEDHRK LMQIVRLMHSTLYLSQLGQFISSGINISIVLINILFFAENGFAIIYYVVYFMAMVLELFPSCYYGTLMSMEFQKLPYAIF SSNWLGMGRRYCHTLLVLMQFTLTEVDIKAGGMVGISMNAFFATIRMAYSFFTLAMSFR Or33b \- GA14240 \- Fine as is. Or33c \- GA18589 \- Just needs three aa on N-terminus. 
MAAVIDSVRVYQPFWWCMRVMAPTFFGATQRPVQIYVGLLHLLVTFLFPVHLLVNLALQPTSAELFQNLSISMTCAACSL KHVAHLYHLQEIAEIQKLLIELDGYVDSEEEHRYYVDHLQCQARRFTRCLYASFVVIYVLFLLNLMILIASEDRMLVYPA YFPFDWQGNGYLYAIAVGYQSICLVLEGIQGVSNDTFSPLTLCFLGGHIHMWGLRMQKLGYEEEDEESPSVNHHQQLLNY IEQHKILMRLHRLTRHTVSLAQLVQLGCSGASLCIIVCYVLFFVRDIITLLYYVIFIAVICVQLFPACYFASVVAEEMQS FPYAIFSSKWYEESREHRRDLLIFTQLTLVGRSRVIKAGGLIELNLNAFFVTLKTAYSLFALVVQVKDI Or35a \- GA14704 \- Not in BLASTP \- My model is MVRYVPRLADGQRVRLAWPLALFRLNHIFWPLDPSTGKWGRYLDRFLAVLGCLIFVQHNDAELRYLRAEASNLNMDAFLT GMPTYLILVEAQFRSLHVLLHFEELQRFLQRFYTTIYIDPRAEPDMFRRVDGQMLINRLVSAMYGAVISGYLISPVLSVI NRRKDFLYSMVFPFDTEPLAVFVPLLLSNVWVGIVIDSMMFGETSLLCELIVHLNGRYLLLKRDLEESIQRILAERRRPQ MARQLKELIIATLRQNVALNQFGEQLEAQYTVRVFIMFAFAAGLLCALSFKAYTNPMANYIYAIWFGAKTVELLSLGQLG SSLAYTTDSLGSMYYHTHWEQVLEQSSNPLETLRLLRLIQLAIEMNSRPFYVTGLKYFRVSLQAGLKILQASFSYFTFLT SMQRRQMSN Or42a1 \- Not in BLASTP \- a recently duplicated copy of the gene below MVLRKIFPAMYTLSEEAPACSRNGTLYLMRCIFVMGVRKPPARFFVAYCLWSIVMNLSSTFYQPIAFLTGYISHLSELSA GELLTSLQVAFNAWSCSAKVLIVWALIKHFDDANDILDEMDRRLTQPSVRLRVHRAVSKSNRIFFIFMTVYMSYATNTCL TAIANGKPLYQNYYPYLDWRSSSLHLGLQTGLEYFAMAGACLQDVCVDCYPVNFVLVLRAHMSIFADRLRQLGSDPEESP EQRYEQLIQCIQDHKTILRFVDCLRPVISGTIFVQFLVVGLVLGLTLINIVLFANLGSAIAALFFMAAVLLETTPFCILC NYLTDDCYNLADALFESNWIDGEQRYKKTLMYFLQKLQQPIKFMAMNAFPISVGTNIVVTKFSFSVFTLVKQMNIAEKLA KVEGEADFN Or42a2 \- GA14414 \- Fine as is Or42b \- GA11791 \- Fine as is Or43a \- GA14981 \- Needs three aa on N-terminus. 
MATTIDDIGLVGINVRIWRYMAVLYPTPGTSWRKFAFVLPVCAMNLMQFFYLLRMWSDLPAFLLNLFFFAAIFNSLMRTW LVIIKRREFEKFLEELFRLYRWILDSGDEYSRTILLEAEREAHRLAVFNLTASFLDIVGALVFTLFKDERSHPFGVALPL LDMTRTPVYEIFYLLQIPTPLLLSVLYMPFVSVFAGFALFGRAMLRILVHKLSLIGGQQQDAGARYQRLTACIRFYIEVL GYVRNLNNLVNLIVAIEAIVFGSIICSLLFCLNIITSPTQIVSIVMYILTMLYVLYTYYNRANDLVIENALVADAVYNVP WYEGNMRFRKTLLIFLMQTQCPLEIRVGNVYPMTLAMFQSLLNASYSYFTMLRGVTNK Or43b \- GA14700 \- Needs a couple aa on each end MFFKLVYPAPLSEPIGTRDSTVYLLKTLHIAGLDFYNDFGIGRKILRVISFSYNIFYLPLSFPINYKIHFSQFPPDLLLQ SLQLCLNTWCFSIKFFTLSILKERFEMANKCFDELDVYCVTPEEKRKVRVTVATINKLYLIFGIVYFLYATSTLVDGLFH DRVPYNTYYPFIDWRLDRRQLYIQSFVEYFTVGYAIFVATATDSYPVIYVAALRTHILMLKDRIVRLGEANNEANADPDN IFKSLVECIKAHRTMLNFCDTIRPIISGTIFAQFIICGSILGIVMINMVLFADQSTRFGIVTYVMAVLLQTFPLCFYCNA IVDDCNDLADSLFHSAWWMQDKRYQSTALQFLQKLQQPITFTAMNIFTINLATNINVAKFAFTVYAIASGMNLDEKLQLQ DSGADNP Or45a \- GA15169 \- Needs a C-terminal exon. MDDSYFSIQRRALEIVGFDPSTQRLHMRRPLWAGLLILSLVSHNWPMIVYGLQDLSDLTRLTDNLAVFMQGSLCTLKFL AFIVKRRRIGALVHRLHGLNQEACASPLQREKILRENRLDMYVSRAFRNAAYAVTVASMIAPMLNGLIAYLTEGVFRPT TPMEFNFWLDERQARFYWPIYAWGVLGVAAAVWLAIVADTLFSWLVHNVVAQFQLLKLLLADKERQQAADSDSHLAECI RRHRLALELARELSAIFAEIVFVQYMLSYLQLCMLAFRFTRSGWSSQVPFRAAFLVTVFIQLSSYCYGGEYLKQQSSGV ALAVYSGCDWSQMPPARRRLWQMMIMRAQRPAKVFGYMFDVDLPLLLWVTRTTGSFLALLRTFER Or45b \- GA11917 \- fine as is Or46aA \- GA14697-PA \- This is not in BLASTP \- needs to be alternatively spliced to the same C-terminal exon as Or46aB, just as in Dmel. 
Proteins are MSKRAEIFYTGQTFIFNIYSLMPQEQRWKRILHEINYWHVMGFWVLLFDLLLVLHVVSNLNNMFEIVRAIFVLATSAGH TTKLISVKMNNVALQQLFDRLDDEDFRPEGPEERAIFAAACETTRKTRDYYAALSFAALAMILIPQFVLDWSHLPLGTY NPFDDNPGSAGYWLLYCYQCLALSTSCLTNIGFDSLCCSLFIFVNCQLDILALRLQKIGQGKDNDNGNTKMSIDVQLKQ CIRFHMAIVDLAETIERRLCTPISMQIFCSVLVLTANFYAIALLSDEKLALFKFVTYQACMLMQIFMLCYFAGEVTHCS AELPHRLYNTNWMDWSRSDRRNALLFMQRLHYELRIRTINPSRAFDLALFSSIVNCSYSYFALLKRVNS And MNQQHLKVTGHFYKYQVWYFQILGIWKLPDGATGQQRRWHQLRFCSIFAILSGMLLLFAMELAGSIAHLREILKVFYMFA TEISCMTKLMHLKLKSRKLAGLVTMIKSSSFSTKSEQEEKLMEAGRVSVVNLRNLYGISCLVTATLILLVPFFAGNSELP LTMYELCSIEGRMCYWVLFLTHAVSLMSTCCLNIAFESVAYSVLTYLRVQVQMFALRLEQLGPAETPQDNQRIARELREC SAHYNRIVQLKDLVEVFIKVPGSVQLMCSILVLVSNLFDMSTISIANGEAIYMTKTCIYQLVMLWQIFIICYASNEVTIH SSRLCHSIYKSQWTSWNKENRQMILLMMQRLDSPLCLRTINPTFTFSLEAFGSIVNCSYSYFALLKRVNS GA14698 also corresponds to Or46a gene. Or47a \- GA12137 \- Needs a few aa on N-terminus MSTTENFLLVQKATIAMLGFDLFSGSGEMWKYRHRCINVYSIATIFPFILAAVIHNMKNVMLLADAMVALLITILGLFKF SMIIYLRKDFWRMIDTFRHLMTHEGEQGDEYAQIIVTANKQDQRVCGIFRTCFFLAWALNSVLPFVRMGLSYWLSGHVEP ELPFPCLFPWDIHNKRNYALTFLWCAFASTGVVLPAVSLDTIFCSFTSNLCAFFKIAQYKVLRFKSETPEESQAKLNKIF ALYQKSLDMCTELNHCYEPIICAQFFISSLQLCMLGYLFSITFSQTEGVYYASFIATIIIQAYIYCYCGENVKTESALFE WAIYDSPWHESLGSGLESSSICRSLLISMMRASHGFRITGYFFEANMEAFSSIVRTAMSYITMLRSFS Or47b \- GA12120 \- Needs internal additions and subtractions. 
MAEPDYTSYLCLLRDFWGEFRSVQRQQTPGRIPRLLMHTQRAALVALCHYPNKKMSSKPVYRRINWILLFNQTLMFISMV CGVHESSSIIDMGDDFVWLIGLGLISTKSYCMHARATEIDEVIRDMAYYDEVVRPIHDDEEILMWQRYCYMGEAYFGIGI FSLVNAFGLAILLQPLLGEGRLPYHSLLPFGWHRQDLHPWTYRIAFGWLSVNSLHNLSTILFVDLLGISTILQTALNLKL LSIELRKLGDLGSVSDNQFHVEFCRVVRYHQHIIRLVDKSNRAFYVTFIAQMIASFAMISISTFETMVAAADDPKMAAKF VLFVMVGFVQLSAWCVAGNLVLYLSGEVGQAAFEISDWHTKSVSIQRDIAFIMLRAQKPLFYVARPFKPLSLGTYMIVLK QCYRLLALLRESM
Or49a1 - GA12084 - Not in BLASTP - My model for this duplicated gene is
MQEKQREYQDFTFLANIMFKTLGYDFLDSARPSWQTGLLRCYFFVCIASSSYEAFFVALECLQVESVAGSPSKIMRRALH FFYMLSAAVKFVTLMIYRKRLRTLILSLKELYPADESLRREYEVNKYYLPRSTRYVFYSYYCFMAVMAIGPLPQSFMMYF LKGHFPFLRTFPTQLCFRSDTPVGYAVAYFMDLTYSQFVVNVSVGADLWMMCVSSQICMHFGYLAKKLAAYLPSRERERE DCEFLASLVQKHQLILRLHKEVNQIFGILLASNLFTTASLLCCIGFYTVVEGRSEEGMSYMIIFVVVSAQFYMVSSFGQQ LIDLSSSISMAAYSQYWYDGSLRYKKDLLLIMARAQRPAEISAKGIIIISLDTFKILMTITYRFFAAIRQTVGK
Or49a2 - Not in BLASTP or list - My model for this duplicated gene is
MKEKEKQCEYQDFIFFANIMFKTLGYDFLDSARPSWQKVLLRCYFFLCIASNCYEASFVALRIIQWESVAGSPSKIMRQA LHFFYMLSAEVKFVTLIIYRKRLRTLILGLQELYPTDDSLRREYEVNRYYLPRATRYVLYFYYFVMALMALGPLLQSFTM YFLQGNDAKFLFLRIFPTRLSFRVDTPKGYAVAYIMDFTYSQFIVNVSLGTDLWMMCVSSQICMHFGYLAKKLAAYLPSR ERERADCEFLCSFVQKHQQILRLHKEVNQVFGLLLASNLFTTASLLCCMAFYTVVQGLNAEGISYMMLFASVAAQFYMVS SYGQRLIDLSFSISMAAYLQNWYDGSIRYKKDLLLIMARAQRPAEISAKGIIVISLDTFKILMSITYRFFAVIRQTVGK
Or49b - GA14566 - Fine as is.
Or56a - GA11666 - Needs an open intron removed and a C-terminal exon added.
MFRVKELLLPRGIFKNPMLRLHLRCFRLYGYVASKYQRRPWLSQARCILFTASIWMSCVLMLARVFQGYERLNDGATTCA TALQYFTVSIATMNAIVRRERVVSMLREVHEDMQKLMKEADDQELDLVLSTQKYTKTITLILWVSSIGAGLMALSDCIYR TLFMPQTVFNLPAVRRGEERPLLLFRLFPFGELYDNFVVGFLCPWYALGLGVTTIPLWHTFIMCLMKYVHLKLMILNKRV PEMDIMRHNPFLDLDRLTPAQLNRWRIRLFTKFVTDHLKIRKFVKELELLICVPVMIDFIIFSILICFLFFALAVGSPTK MDYFFMCIYIFVMASILLIYHWHATLISECHDELSFAYYSTPWYEFERSAQRMILFMMIHSQRPLQIRALMIPVNLGTFL DIVRAAYSYSNLLRQIY
Or56N - Not in BLASTP or list - Gene downstream of Or56a and lost in Dmel.
MFRVKELLLPRAIFKSHILGLHLRGIRMYGYVAEKYQRWPLLSLVRCIIFMVSIWVSTVTMLARVFQGFENPNDGVLCWA TTIMYISLSISALNSFVQRKRVKGMVRAIHEDIQKLMKEADDQELVLMLSTQKYIRMATWMLWYPALLTGIIAFTDSLYR TVLLTLSVFNITERRDEQQYIFLLKVYPFGDVYNNFVFGLFGAWYALGLGINTIPLWNSFIVCLIKYVHLKLLILKKRVT EMEITRFNPLLDLDRLTPAQLNRWRMRLIKEFVKEHLKIRRFVKELEQLICLPVLIDFIFFAISICFELYALIVGTPNEM EYFLILCYISLTTLILFLTYWHVTLIGECHDDLCFAYYSTPWYEYDPTMKRTILFMMMHAQSPLRIRALMFPVDLKTFLD IVLGAYNYFNILRGLY Or59a \- GA22057 \- Fine as is. Or59b \- GA17527 \- Fine as is. Or59c \- GA14401 \- A couple aa missing from C-terminus. MKKPLFERLRPVPLTKSVVSSDACIYFYRAATFLGWVPPKARLHRWAYLLWTCTTMVLGLVYLPLGLTLTYVVHFDKFAA SEFLTSVQVDINCIGNCVKACVTFSQMWRMRRINAMIAPLDERCPTLNQRQILHKMVARGNRIIVFFLSMYIGFTTTTLF SSVFAGKAPWQVYNPLVDWRQGTRQLWEASLLEYIVINIGICQELLSDSYPIVFLSIFRGHLAILKDRIKNLRCNPELSE NENYQKLVDCIKDYRTIVQCCDLIRPIMSATIFAQFMLIGIVVGVASVNILFFTTSFWMTLSNIIFIAAICAESFPLCMT CELLIEDCESLASGIFHSNWMDAERRYRSAIIYFLHRVQQPIQFWAGAIFPISVQSNITVAKFAFSIITIVNQMNLADKF RKEA Or63a \- GA22157 \- Needs a couple more aa on C-terminus. MYSPEEAAALEKRNYRSIREMIRLSYTVGFNLMRPRRWDVALRIWTVVLSLSSLLSLYGHWQMFRHYVEDMPRIVETVST ALQVLTSVFKMWYFLFAHRRIYELLRQARCHELLQRCELFATIADLPVAQVLRRRVAAIMQRYWGSTRRQLLIYLYSVIA LTSNYFINSFARNLYRYLTQPPGSFEIVLPLPALYPGWEDKGLAFPYYHIQMYIETCALYICGMCAVSFDGVFIVVCLHG VGLMESLGEMIAGATSPLVPPERRVEYLRGCIYQYQRVASFAEEINDCFRHLTLSQFLLSLFGWGLALFQMSVGLGTSSA ITMIRMTMYLTASGYQVAVYCYNGQRFATASEQIAGAFYGCEWYAECREFRQLIRMMLTRTGRCFRLDVSWFLAMSLPTL MSMVRTSGQYFLLLQNVSEKSG Or65a does not appear to have an ortholog in Dp, instead the pair of dmOr65b/c is orthologous to a quintuplication in Dp: Or65b1 \- GA16875 \- Add 17aa, or even around 100aa, to the N-terminus, e.g. 
MVEGPRDRHGAFGLKSYWNRFIGAFLDARGLLRDPKMVNRHSIAYYSRDQMKVMGLYINAEDKGQPLRRAWHVFLLVQFS ALYASMFYGLLKSLDDIVETGRDLAFILGMFFIVFKMVFFSLYADEVDVLIDIMEDSHHAEVKGPGTETCRAIKRHDFLL NVGLDFVWVVAVVVFVVLLIVTPFWADQSLPIHAVFPLELHDPAKHPIAHLVIYVCQSFSMAYLNIWLVATEGLSISLYA QVTTALSVLCVELQQLRHFWGDSSEDRLRLELTRLVRTHQSIILTVDRCNQLFHGPLIMQMTVNFLLVSLSVFEALMARH EPKVAAEFMVLMILALGHLSLWSKFGDMMSQQSLEVAHAAYEAYDPSVGSKRIHRDIGFMVRRSQRPLIMRASPFPAFNL SNYMAILNQCYGILTLLLKTLD Or65b2 \- Not in BLASTP \- My model is MFELPRERVGLGLGEKWKSFMKVYMTFVTVYRSPDECPEHTVPHHCRAQLKAMGYYPNSEERRIPGRRTWFLFLFSQMTM FFCSQCYGIFDSMDDLVEWGRDLAFIIASFFIYFKFIYFLLYADNVDEVVDGLEECYRWERAGPAASGVRSAKRLHYLII IGMQIIWVVSMVAFVLLLVTTPLWTQQDLPLHVSYPFHLHDSSKHPVTHILIYISQSWSILYFLTGLISTEGLSITIYSQ LTTGLTVLCVELRHLHQLCDGDEDLLRWEINRLVKYHQKIISLVDRSNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDPNVGSKKIDCHIRLIMLRAQEPLIMRATPFPTFNMIN YKAILNQCYGILTLLLNTLD Or65b3 \- Not in BLASTP \- My model is MSELPRERVGLGLGEKWKSFMKLYMTFVTVYRSPDESPEHTVPHQCRAQLKAMGYYPNSEERRIPGRRTWFLFIFSQITM FFCSQCYGIFDSLDDLVEWGRDLAFIIASFFIYFKFIYFLLYADNVDEVIDGLEECYRWERAGPAAAGVRSAKRLHYLIV IGMQIIWAFSMVVFVLLLVTTPLWTQQDLPLHVSYPFHLHDSSKHPVTHILIYISHSWSILYFLTGLVSTEGLSITIYSQ LTTGLTVLCVELRHLHQLCDGDEDLLRWEINRLVKYHQKIISLVDRSNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDPNVGSKKIDCHIRLIMLRAQEPLIMRATPFPTFNMIN YKAILNQCYGILTLLLNTLD Or65b4 \- Not in BLASTP \- My model is MVEPSGERVGLRLSKKWKRFMKPYAPFRRVYRTPGKCPEHTVPYLNREQLISTGYYPNSTQNSVSGQRSFHLFLLVKGTI FYSSILYAASESLDDIVELGRDLAFIIASFFIYFKLIYFLLYADNVDEVVDGLEECYRWERAGPAAAGVRSAKRLHYLIV IGMQIIWVVSMIIFIVLLISTPFWTQQELPLHAAYPFHLHDSLRHPRIHILIYLSQSFDILYYLTWLTVTECMSVSIYSQ LTTALSVLCVELRHLHQFCDGDEDLLRWEIHRLVKYHQKIIKLVDRCNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDP KVAGQFILLMILALGHLSMWTKFGDMMSQESLEVAEAAYEAYDSSVGSKKIYWNIRFIIMRAQEPLIMRATPFPTFNMTN YKAILNQCYGILTLLLNTLD Or65b5 \- Not in BLASTP \- My model is MVERVGFSLSDKGKSFLKPYIAIRMVYRRPDECPEHTVPYLNRDQLKAIGYYPNSEESSRPGRRNWHLFVMIKATIFFGS 
VTYAIFESLDNIVEWGRDLAFIIASFFIYFKLIYFLLYADNVDEVIDGLEDCYRWERAGPAAAGVRSAKRLHYLIVIGMQ IVWLAFMVVFVLLLVTTPLWTELSLPFHAAFPFHWHDPSKHPFTHILIYLSQTFDSAYFLMWLISIEGMSVAIYSQLTTA LSVLCVELRHLHQFCDGDEDLLRWEINRLVKYHQKIINLLDRCNEVFHGPLIMQMIVNFLLVSLSVFEAMMARHDPKVAG QFIVLMILALGHLSMWTKFGDVMSQESLEVADAAYEAYDPNIASKAVHKDLRVVIMRSQNPLIMRANPFPAFNMINYMAI LNQCYGILTLLLNTME Or67a \- Nothing in BLASTP or list. My model is MSVKFLKEKIFSRRGKSEKPKNAYIVIEDFMKLPIYFYRTIGLNPYELTGTNNKPGIGFHILFLLHMINANMVLALEIFF VYVSFRNNENFIESCMVMSYIGFVIVGDLKIGAVLLQKQKLTNLVRQMESVFPPARQKEQEEYDVRRYLRRCLRYTKGFG GLYMTLVITYNLFAICQYSIQKWILHSPHAKQSVPYVPLTPWTWQDNWKFYPTYLSQSMAGYTATCGHISADLMIFAVAI QVIMHFDRLAKSLTEFTVRAQSEEDGAEKDLKKLQELIAYHNKILGLTDVMNEVFGLALLLNFLASSTLVCFVGFQISIG ISPEMLAKLILILISANSEIYLICNFSQMLIDASGSICYAVYDMNWSEADPRFRKMLIVLALRAQKPVCLTATVFLDISI ETMSIFLRMSYKFFCAIRTMYQ Or67b \- GA12805 \- Fine as is. Or67c \- GA12792 \- Needs 8 aa on N-terminus. MTGQEQEPDTARTFKDMMRVPVQFYRTIGEDIYAHRSTSPWRSLLLKVYLYGGFINFNLLVIGELVFFYKSIQDFETVRL AIAVAPCIGFSLVSDFKQFAMAYYKGTLVRLLDELEEMHPKTLERQRAYRMPDFERTMKRVISIFTFLCLAYTTTFSFYP ALKASVKFNLLGYETFDRNFGFLIWFPFDATSSNLVYWIVYWDIAHGAYLAGIAFLCADLLLIVVITQICMHFDYVSRRL EEHPCEPGRDRENIEFLVWIVRYHNKCLTLCEHVNNLYSFSLLLNFLMASMQICFIAFQVTESTVEVIIIYCIFLMTSMV QVFLVCYYGDTLIATSLRVGDAAYNQKWFQCSKTYCQMLKMLIMRSQRPASIRPPTFPPISLVTYMKVISMSYQFFALLR TTYNAN Or67d \- GA12793 \- Needs a couple aa on each end. 
MSEGPFERYCKINRAIRFCVGLCGNDVIAEDYRMWWLTYAVIGAILFFFGCTGYTVYVGVVLDGDLTVILQAFALVGSAV QGLAKLLVTARMAAVVRQIQATYEAIYREYARRGGDYGRCLERRIKTTWHMLMSFMWVYVVLVGGLIAYPFFHLILHHKK LLVMQFRVPWIDESTDGGYLVLISIHVMLLSMGGFGNFGGDMFLFLFISNVPTLKDIFSAKLREFNEVAVRRQDYQRMRT LLWDLLAWHQQYVSILRDTERIYRIVLFVQLSTNCVSILCTISCIFIGAWPAAPIYLVYSFIVMYSFCGLGTIVETSNED FSKEIYANCLWYELPVKEQRLVILMLAKSQHEISLTAADVMPLSMSTALQLTKGIYSFSMMLITYLGYES Or69aA \- No BLASTP and not in list \- My model is MQLEDLMRYPDIAFRYFGMAPRFEWTARRTVAPKQTVTRQIVFMVGSLCLGYQNLGMIIYWVRFNSQQKEISMYVAKIAE MGSVLALIFAGFLNIWALTSKRAQIEAVLAELQEMYPEPRQRLYRIRHYNDQAVGLMKFTVNFYVVFIIYYNIAPLVLLL CEHLMDSQDISYRAQSYTWYPWQVYGSPLGYSAAYLCQAIGSILGVGFSMSSQQLICLFTTQLQLHFDAMANHLTAIDAK EPTANQQLRSLILYHRRILRLGDRVNRLFNFTFVVSLIVSTIAICLTSIATMLLELHKALLYISGLIAFVFYHFLICYRG SVLTLASDKVMPAAFYNNWYEGDLVYRKMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK Then there is a pseudogene version of this in between it and Or69aB \- my model is MMLEDFMQYPDSTCQWIQFRRFEWSGRGSLRPKKSLIKRIIFFAGHHQYGVPCDVIYAYRTERESKNSIMYEADLFSVGS SLGFIMEGLCMIGMLIYYRLQIEELLEQLEDLFGIVRKKIYRLGHYEEZWRVMRKSZIIFIVCCTVYNLQSVLILFYEPT EAQAVSYRIQRLPLGSXASMVNLGAMMTTMELELHLDGLVRQLEELDAKHPREKEKLRSLIYYHSRILRVADNVNRLFNF SVFVSFSTSSLSMCFMFMFMCFTMTVRQLGSALKYMFGLVLFLVYTFSISYNGIZITEASDKVMPAAFYNNWYEGDLVYR KMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK Then Or69aB \- not in BLASTP \- perhaps GA14713 in list? My model is MVFISMQLADFMQYSDLGCQVALIRRYGWSGRQSPGAKQTLMKKVIFVLGALNMSCYFFSFITYGYHIERKTKEPIIYVA ELSEVGGMLWFTVLGICNMYTLLIYRPQIEELLEGLEQLFAPARQSPYCTRYFYDDSALKMKRLSINFVCSATYYNLLPL VKLLSELLTESQQVSYQVQSKAWYPWQVHGSTLGFWIAYASQAFASVMNLGMMMATECLVFVCTAQLELHFDGLARRLEA LDARDPRAKEQLRALISYHTRLFKVADRANGIFNCTFLISYCVSSIAVCSMGFSMIMFDLGLALKYMVGMFLFMIYTFCI CHNGTQVTMASDKVMPAAFYNNWYEGDLVYRKMLLILMMRSTKSYVWKTYNLAPVSIQTYMATLKFSYQMFTCVRSLK GA16694 also corresponds to Or69a gene. 
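A minimal sketch, in Python, of one way to confirm that alternatively spliced models (such as the Or69a isoforms, which should end in the same C-terminal exon) really do share an identical C-terminus. The function name `common_suffix` and the two short sequences are hypothetical placeholders for illustration, not the actual models.

```python
def common_suffix(seqs):
    """Return the longest suffix shared by every sequence in seqs."""
    if not seqs:
        return ""
    shortest = min(len(s) for s in seqs)
    n = 0
    # Walk backwards from the C-terminus while all sequences agree.
    while n < shortest and len({s[-1 - n] for s in seqs}) == 1:
        n += 1
    first = seqs[0]
    return first[len(first) - n:]


# Toy placeholder peptides (NOT the real models): two isoforms with
# different N-terminal parts and an identical C-terminal tail.
isoforms = {
    "isoA": "MQLEDLAAAKFSYQMF",
    "isoB": "MVFISMCCKFSYQMF",
}

shared = common_suffix(list(isoforms.values()))
print(len(shared), shared)
```

In practice the full protein strings would be pasted in place of the placeholders; a shared tail much shorter than the expected common exon would suggest a splice-site or copy error in one of the models.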
Or71a \- GA14707 \- Not in BLASTP \- my model is MDLDRIRPVRFLLRILRWTRVWPASGGPATQRGWTNWEGYLLHSTLTLIFVVLLWVEAILSPDVEHTVDVLFFTLTMTS MCVKVLNSWRYAHVAQRLLTEWSSADRFRLKTLQEVDMWRIGHLRFNRVVGVYMMCSVGVIPVLVTPCLFAVPNKLPFS MWTPLDLQQPLGFWTAFSYQAVAIPFACLCDITLNLVNWYLMLHLSLCLRMLGQRLSALHRSGLQQDEEQLCAEFLELV SLHRRIKQQALDIETVISKSTFFQILVSSLVICCTVYSLKMTPVIQDMGKFAGLIQYCLSMVLEILSPSIYGNEVTQSA DKLPEALYSSDWPDLSPRLRRLILMFMIYLNRPLSLRAGGFFYMGLPMFTKVMNQAYSMLALLFNMNN Or74a \- GA12488 \- Fine as is. Or82a \- GA16295 \- not in list, but in BLASTP and fine as is. Or83a \- GA10437 \- Needs aa on both ends. MKERNPPLKIKGDSERRDLFVFVRYTMCIAAMYPFGYSLQGSSFLGLLVRVLDWFYEIFNYFVSVHILGLYICTIYINYG QGDLDFFVNCMIQTIIYLWTIAMKLYFRRFRPKMLDEIMSIINENYHTRSALGFSYVTMSGAHHVSKLWIKTYVYCCYIG TIFWLVLPIAYRDKSLPLACWYPFDYTQPIVYETVFFLQAIGQIQVAASFASSSGLHMVFCVLLSGQYDVLFCSLKNVLA TSYIHMGGNMAELRQLQSEQSIADAEPNQYAYSHEEQTPLEQLLQHPPQDESSRDFLKAFKRSFRHCIDHHRYIVEVLKK MERFYSPIWFVKIGEVTFLMCLVAFVSTKSTTANSFMRMVSLGQYLLLVLYELFIICYFADVVYQNSQRSGEALWRSPWQ RHLREIRSDYMFFMLNARKQFQLTAGKITNLNVDRFRGTITTAFSFLTLLQKMDARG Or83b \- GA10435 \- Fine as is. Or83c \- GA13827 \- Needs a N-terminal extension, removal of an open inframe intron, and addition of a C-terminal exon. MSSPVAVNPAKRFRQLTKNINIWTNLLGVDVLAPKVKFNYRTWTTTFAIVNYTGFTIFSMVDNGGDWRVSLKASLMMGGL FHGLGKFLTCLLKQRTMKRLTFFACNIYEEYEGKSEVHYRTLDMNIDRLLGLMRGIRNGYMATFLLMTILPMAMLLYDGT RVTVMQYQIPGLPLENNVCYSITYLIQLVTIGVAGCGFYAGDLFVLLGLSQIFAFADILQLKVDDLNAALDRKSEARALV SLGATITGEERRQELLLDLIKWHQLFTNYCHTVNELYHDLIATQVLSMAVAVMLSFCIILTSFHMPSAIYFLVSAYSMSV YCVLGTKIEFAYDQVYESICSVSWQELSCDQRKLVGPMLREAQNPQSIKLLGILPLSVRTALQIIKLIYSLSMMMMQNRT Or85b \- GA11167 \- Fine as is. Or85c \- GA14720 \- Mine ends in a K. 
MKFMKYANFFYKAVGIEPYTTDSRPQANSLKASIVFWANVLNLGAIVTGEILYLGVSLADGKLLDAVAVMSYIGFVIVGT SKMFFIWWKKPALSDMVRDLEHIYPHGKEAEEEYKLQSYLRSSSRISVTYALLYSVLIWTFNLFSIMQFLVYEKLLHLRV VGLALPYTVYYPWNWEAPWSYYMLLFCENFAGYTSAAGQISTDLLLCAVATQVVMHFDHLSTVLEGHELSGKWEEDSRFL VNTVKYHQRILRLSEVLNDIFGIPLLLNFMVSTFVICFVGFQMTVGVPPDIMIKLFLFLFSSLCQVYLICHYGQLIADSS LGLSNAAYKQNWNHADVRYRRALVFVIARAQKPAHLKATVFMNITRATMTDLLQISYKFFALLRTMYVK
Or85d - GA11171 - Needs aa on each end. And there is also a possible long N-terminal extension with no Dm match. I don't know whether to include it or not.
MEGTGKTPSTVEKAEKSEPITTERFLRYANIFYLSIGMEAYDHQGRRKMIELILRCIFIALILNLNAVLLSELIYVFLAI GKGTNFLEATMNLSFIGFVIVGDLKVWHIWRKRDQLTNVVREMEKLHPKEGHHQKAYDVESHLSGYSRYSKFYFGMHLVL IWTYNLYWAVYYLVCDFWLGIRHFVRMLPYYCWVPWDWSTNSSYYLMYVSQNMAGQTCLSGQLAADLMMCALVTLLVMHF IRLGRGIEEHVAGLLSPQQDLEFLQAAVVYHQRLLQLCHNINEIFGVSLLCNFVSSAFIICFVGFQMTIGGKIDNLVMLV LFLFCALVQVFMITTYAQRLLDASEHIGEAVYNHDWFQADLPYRKMLIFMVRRSQQASRLKATIFLNVSLVTVSDLLQLS YKFFALLRTMYVK
Or85e - GA21973 - Not in BLASTP - presumably because the Dm ortholog is truncated in the genome sequence - it is a polymorphic pseudogene in Dm. Here's my model. Again there is a possible unaligned N-terminal extension.
MASLQFHGNIDADTRYDATLDPAREPELFRLLMGLQLAMGMNPSPRLPSWWPTWLRPVGGLMAKAYCSMVILTSLHLGLL FTKTTLDVLPTGELQAITDALTMTIIYFFTAYANIYWCVRSQRLLAFMDHINREYRHHSLAGVTFVSSHAAHRWSRSFTT IWILSCLVGVITWGVSPLMLGIRTLPLTCWYPFDALSPGTYTAVYATQLFGQISVGVTFGFGGSLFVTLCLLLLAQFDVL YCSLKNLDAHSKLLSGETIAGLGLLQRELLQGAFTRELNQYAVLQEHATDLLRISAESQNLAQVKVFHSALVECVRLHRF ILYCCAELENLFSPYCLVKSMQITLQLCLLVFVGVSGTREFLRIVNQIQYLALTLFELLMFTYCGELLSRHSVRSGEAFW RGGWWKHAHLLRQDVLIFLVNSRRAVYVTAGKFYVMDVNRLRSVITQAFSFLTLLQKLAAKKATTEA
Or85f - GA14128 - The final intron has a GC donor, removing three aa from your translation. A 72 aa unaligned N-terminal extension is also possible in Dp.
MESVQRSYEDFPAMPSAVFRLMGYNVLDAPDETRSRRLLMWIYRWLCTCSHAVCVGFMIFRIFEVKTINSIPLIMRYVTL VTYVINSDTKHATAMQRESIRNLNKKLADLYPKTTKDRVYYRVNEHYWSRSFLAMICIYIGSSIMVVIGPILQSIFAYFT RHQFTYEHCYPYFIYDPNRHPVWVYIIIYATEWLHSTHMVISNVATDLWLLCFQVQICMHFSCMTRSLEEYQPDRTHDVD DNRFLAQMVNKHEYLVILQNDLNGIFGGSLLLSLITTSAVICTVSVYSLIQGLTLEGITYVFFIGTSVMQLFLVCNHGQQ LLDLSEDIGHAAYNHNWHKASIAYKKYLLIIIIRAQKPVELSAMGYLPISRDTFKQLMSVTYRGLAMIRQMIE Or88a \- GA12932 \- Mine has a Q added on the end. Or92a \- Not in BLASTP or list \- My model is MLWGKRKEKRELRTFEDLTRFPMAFYKTIGEDLYSDRDKNLVRRYLLRFYLVMGFLNFNAYVVGEIAYFIVHITSTTTLL EATAVAPCIGFSFMADFKQFGLTVNRGRLVQLLDDLKALFPTTLETQRAYNVSYYQRHMNQVMTLFTILCMTYTSSFSFY PAIKATIKYYFLGSEIFERNYGFHILFPYDAETDLTVYWFSYWGLAHCAYVAGVSYVCVDLLLITTITQLTMHFSYMADT LEAYDGDEHTDEENIKYLHNLVVYHSRALDLSEEVNSIFSFTILWNFIAASLVICFAGFQITASNVEDIVLYFIFFSASL VQVFVVCYYGDEMISSSSRIGHAAFNQNWLPCSTQYKMILKYIIMRSQKPASIRPPTFPPISFNTFMKVISMSYQFFALL RTTYYG Or94a \- GA14408 \- Needs a K on the end. Or94b \- GA19774 \- Fine as is. Or98a1 \- GA18957 \- Not in BLASTP but is in list. My model is below. MLNLLKEPTPGNLMSSPEAFKYMKNTMILMGWIPPQGGWTASSIIMSIFIVAVSVVYVPIGVFITFFVELKTLSPSEALS VLQVALNAMGFPLKLLFFRLYMWRFYKIEKLLGRMDERCIDSTERSEVHRWVARCNIAYLIYQFIYISYTISTFLTATYS GVVPWNIYNPFIDWRESTRNLWIDSVLELMFIIGIVIQTYMIDVFPLLYGLILRAHIKLLRQRVEKLCLDPSQSDDENNE ELENCIEDHKLILEYASVIRPVIEPAILVQFMLVGLVLGISLINLYLFADTWAKLAIAAYIVVQIVQTFPFCYTCDLIRE DCESLAVAIFHSNWKGSSRRYRSSLIYFLLNAQRTISFNAGSVFPICLNTNIRVAKLAFSVVTFVKHLGLGSR Or98a2 \- Not in BLASTP or list \- Moved to another chromosome, but tree shows is a Dp-specific duplication. My model is below, but a 17 aa N-terminal extension is possible. 
MLINLLKEPLPWNLMSSPEAFKYLEYAMILMGWTPPQEGWKPFRIILTIVISFWWILYVPIGVFITFFVELKTLSPSEAL SVLQVGINAGGFPLKMIIMRLNMWRYHKIKELLGRMDKRCINITERLEVHRWVARCNIVYLIYQFMYTAYTLSTFFTAIF SGVLPFHLYNPFIDWRESTMNLWIASVLELIPMNGIVSQTYMMDVFPLLYGLILRCHVKLLRQRIENLCSDPRKSDDENN EDLVFCIKDHKLILEYVSVMRPVIETIIFVQFLLIGLLLGITMLNLFFFADFWAKLAIAAYINGLIIQTFPFCFTCDLLK DDCESLALAIFHSHWKSSSRRYKSSLIYFLHNAQMPLSFTAGSIFPIGLKTNITVAKLAFSVVTFVQQLNIADKLRKE Or98a3 \- Not in BLASTP or list \- Moved to another chromosome, but tree shows is a Dp-specific duplication. My model is below MSSPEAFKYLEYAMILMGWTPPQEGWKPFRIILTIVISFWWILYVPIGVFITFFVELKTLSPSEALSVLQVGINAGGFPL KMIIMRLNMWRYHKIKELLGRMDKRCINITERLEVHRWVARCNIVYLIYQFMYTAYTLSTFFTAIFSGVLPFHLYNPFID WRESTMNLWIASVLELIPMNGIVSQTYMMDVFPLLYGMIVRCHVKLLRQRIENLCSDPRKSDDENNEDLVFCIKDHKLIL EYVSVMRPVIETIIFVQFLLIGLLLGITMLNLFFFADFWAKLAIAAYINGLIIQTFPFCFTCDLLKDDCESLALAIFHSH WKSSSRRYKSSLIYFLHNAQMPLSFTAGSIFPIGLKTNITVAKLAFSVVTFVQQLNIADKLRKE Or98b \- GA15044 \- Needs two aa on C-terminus. MLTEKFLRLQSFYFRLLGLELLDQQEVQSVGDIRRSICCILAVATFLPLSIAFGLNNIHNMDKLTDTLCSVLVDLLALCK IGIFLGLYKDFRRLIQRFHDMLERECHWEVAAKIVARQNDRDQFISSLYTICFLVAGGSACLMSPLHMIIRFWRTAEMEP DYPFPSVYPWDNRRFHNYLLSYLWNVFAAFGVALATICVDTLFSSLTHNLCGLFEICRHKMLTFKRRRPVETQQNLRHIF QLYGECLELGSSLNGFFRQIVFAQFIAASLHLCVLCYQLSSNLMQPGMLFYAAFMAAILGQVAIYCHGGACVQAESQLFA QAIYESDWLPLLLGGRLDVGRSLQIGMVRAQRGCRLDGYFFEANRKTFDLIVRTAMSYVALLRSTS OrN \- No GA \# \- This gene was lost from Dm. MTFFTQCRVVSATTYKRILPDESQAHCEMERLRELTQRIRPSDVDEGRIGSIELNVWLAQLTGLPLSGLKPETKAESIRI LVVSGVVLPLLFCYVVLEIYDLVLNWDNVDIMTQNVVMTLTHVGYWFKVLNTFYYYEDIRRIVFTLKHLTRTCVLSPGQR ETFHQVEVENKVVCLFYFCLVVFSSTLAMVMLLIVPDNLAGKRFPYRVHMPHFLPPIVQYLYMGLSIIWISCGIPTIDNV NMLFMNQICMHLKILNMAFDVLQRQVDPNIWMVSIVKYHSVLIKLRQRLEQIYRLPVLFQFVSSLLVVAMTAFQAIVGDG SGSSVLIYFLFGGVMCQIFLYCWFGNEVFEQSKTLSTSAFGCNWHEFDAQFKRTLLIFMINADRPFLFTAGGFMGLTLTS FANILGKSYSIVTVLRHMYGRAH Subject: Re: Dpse Ors and Grs Dear Peili and FlyBase, I've finished the Grs too now, and comments for each are below. 
Almost every model needs at least slight changes, and many are new. Again, I've attached the table from the manuscript for additional detail - the second table in the file is the Grs.
Hugh
Gr2a - GA14976 - Needs a D added to the C-terminus.
Gr8a - GA13680 - Needs a Q added to the C-terminus.
Gr9a - GA17078 - Needs a different C-terminus to align well with DmGr9a.
MSADWLDRCLGGYFQLLALRCSYSKRRAGRLLSNLYLMLVILDLVGQMRSYAHGEEPMVDRMLFFPKAIQAVNVFYKLVH ALIALFALVGCQRERRLLQQLPPTHATPGIYRQVALEFLMVVYALWISFFDSLNAGQILENLRYVFSSQAVRARYLQMLL LVGRLQAQLEQLQRQLIDCSLDDYQQLRSSYAHLARLCRSLSQLYGPSLLLLNVLVLGDCLIVCNVYFMVEHLEAVPATW LVLWQAVYIVVPTLVKIWTVCAACDRFVMGSKILRQQLSDRRGRTSEERSQIEEFVLQIMQDTLQFNVCGIYHLNLQTLA SMFFFILEVLVIFLQFVSVIR
Gr10a - GA17050 - but this is the second half of this annotation and perhaps the GA17050 number should be retained for the first half, which encodes the Or10a ortholog. My model is
MSAAEEHEPKESFWERHEFKFYKYGHIYAVIYGQVVIDYVPQQPMRRGLKTVLIAYSHLLSLILIVVLPGYFVYHFRTLT ETHDRRMQLMLYVSFANTAIKYATVIVTYVANTVHFEAINYRCTVQRQRLETAFVGAPKQPKRSFEFFMYFKFCLINLMM LVQVAGIFALYEAADGPSVSQVRLHFAIYAFVLWNYTENMADYFYFVNGSVLKYYRQLHLQLVSLRRQLGGLRPGGMLME HCCRLSDRLGELRQRFGEIHDLYSESFQMHQFQLLGLVLATLINNLTNFYAIFNMLAKQSLEEISFPIVIGSVYATGFYI DTYIVTLLNEHIKQELEGFALTMRTFAEPPSTVEQRLTQEIEYFSLELLKCRPPMLCGLLNFDRRLIYLIAVTAFCYFIT LVQFDLYLRKSS
Gr10b - GA11723 - Needs a few aa on the N-terminus. This gene is also truncated at the C-terminus by the end of a contig, hence is missing the last exon, which sadly is only available from a single bad read. Here's my best version.
MGLFLSLAGWVRTHSKRTLVYRQLLSAMWARWLYTLLLRIWMIQGLVLGITGHYYSPSRRRLVPSRVLHWYSWLMMLATL ALYWGYWLYAQDYFVQGTFRRHGFVQSLSYGTVVLQLIALVVQTVLRMFREQEVCAVFNELMAMRSTVSRVHPQGQPSRF YYVVFFGRLFNYLQNLNFSLSILLIVDMRSVSVWDYLSNFYFVYMSLVRDTLLMSYILLLLELGEVLRVNGEHPRTSYAG LMRQLQRQERLLRLVRRVHRLYAWQVLATMFFQLYFNMATFYVGYSFLAASSAPVSGFRVWNAKFLLTVLTFITKLFDGL LLQIVGERLLAQGNKPCAGPRVEDVTYMQAAQRQMEMASLKRAIPAGSPKKRFLRMVCIDMKGPFPGINCNLSYGIKILP LGISPG
Gr21a - Not in BLASTP, but GA12646 in list and genome browser - My model is below, starting where it aligns with Dm. This is an extraordinarily conserved protein. It has a potential N-terminal extension of 55 aa, but these residues do not align at all with Dm (which has its own potential 47 aa extension), and Anopheles gambiae, which has a highly conserved ortholog, has no possible N-terminal extension. I therefore judge that this is the likely true start codon for this protein.
MSFWAVSRGLTPPSKVAPMLNPNQRQFLEDEMRYREKLKLVARGDAMDEVYVRKQETVDDPLELDRHDSFYQTTKSLLVL FQIMGVMPIHRNPPVKNLPRTGYSWGSKQVMWAIFIYSCQTTVVVLVLRERVKKFVTSPDKRFDEAIYNVIFISLLFTNF LLPVASWRHGPQVAIFKNMWTNYQYKFFKTTGSPIVFPNLYPLTWSLCIFSWVLSIAINLSQYFLQPDFRLWYTFAYYPI IAMLNCFCSLWYINCNAFGTASRALSDALQTTIRGEKPAQKLTEYRHLWVDLSHMMQQLGRAYSNMYGMYCLVIFFTTII ATYGSISEIMDHGATYKEVGLFVIVFYCMGLLYIICNEAHYASRKVGLDFQTKLLNINLTSVDAATQKEVEMLLVAINKN PPIMNLDGYANINRELITTNISFMATYLVVLLQFKITEQRRTNSQAA
Gr22a - 16376 - Needs four aa on N-terminus
MPQQPHRGYRQRVAHFTLKSTLYASWALGLFPFTYDSRTRQLTRSRWLLSYGLVLNLGLIGVVLLPGTEDHRDVRIDMFE RNPIIQQVENMVEIISFLTAVAMHLGIFWKSREMVTILNELFFLEKRHFSNLILAHCHQFDKYVIQKCILVVLEVGSSLL IYFGVPDSNLVVTRAFCIYLVQVGVLLGVTHFHLAVIYIYRFVWTINGQLLELANQQRRGQKVDPARIKRLFWLYSRLLE VNSRLAAIYDIPVTLFMVTLMSANIMIAHVLIIIWINQFSLLDILLLFPQALLINFYDLWLSIAFCELVERTGRQTSDIL KLYNDGEDMDEELQRSLSDFALFCSHRRLRFRHCGLFYVNYEMGFRMIITNILYLVFLVQFDYMNLKYK
Gr22b - This is the ortholog of the Gr22b/c split in Dm (Gr22b is a polymorphic pseudogene in Dm). In your list it is listed as GA16572, but it is not in BLASTP.
My model is MLRRREGFAPKLCRLVLQVTLYGSWAFGLFPFTLDPQTRQLRRHRWLLVYGVAINLFLLCIVVLLSEYSEETTQSLEVFQ RNHLLGLMNVIIGVLALIASSGIILISFWKSGQALCIFNELLGLEHRHFGCLDVEDSSKFNLYVIQKGMSVVGELMGLVV VNYGMPEYTMSYLYLALLCLTQFCVNLVVTQIYLAILYIYRSVWLINRQLLGVVSRLRVDPLSDSSRIPLLLSLYGRLLA LHKQMETTFNGQITLILTSALAGNIVVIYFLIVYTVSLGQLSISLAIFPYSLIMNVWDYWLSISVCDLTERTGRHTSTIL KLFSDLEHNDEDLERSINEFTWLCSHRRFRFGLYGLCSVNSETGFQMIVTSFMYLLCLVQFDFMNL Gr22dP \- Like Dm, this is a pseudogene, but the defect is different, so it is independently a pseudogene. My model is MFRPHRGCRHKLVYFILYSILYSSWALGIFPFTYVSKERKLRRSKWLLVYGIAFNATLVVLMLRPHGGEGESMANDPKLD VFQRNFVLKQISLLLGIGGVITICAMYLRTFWRSRDLCRIYNQLLHLEVTYFKDYSVECPTFDRYVIQKGLLVIGGVAST LVIHMGMPNESVSLVXAGSFLLAIHFHLGVAFIYRFVWLINRELLDLANRLRVRPEGSSSRVRFLLTLYGRLLDLNTRLT ACYDYQTAMMMAIFLGGNIIVSFYMIVFSVSLSKMSVFVMLIMFPLALINNFLDFWRTGRQTSMILKLFNDMEILDKEME RSFAEFALFCSHRRLRFHHCGLFYVNYEMGFQMLVTSVLYLLFLVQFDYMNL Gr22e \- GA16578 \- Fine as is. Gr22f \- GA16574 \- I consider the pseudoobscura version of this gene to have been lost, but it is a close call \- if not then the Gr22b gene above is the equivalent. Gr23aA \- GA13700 \- In list and browser only. This is an alternatively spliced gene like DmOr23a; here are my two isoforms. 
MPREPFTTITWSSDCSMNSFECLTRRCLAGVFWLMGLVPLPPSSQLCSLLLSLAIRCCWIVYLVYLLSIGIGFWSVATES VGNVVGTMLFMGSTVLGLLLLLESVLKQKTHRQLEDLRFQSQLQLQRLGRFGRGRQAAYLLPLIGTQFACDLVRVLISAE VGTMISPVFLVSLPLMWLLRLRYVQLVQHVMDLNHRSLQLRRSLLALAAGNDLWQPYGVQEWAQLQTLRKTYQRIFECYE VLSDCYGWGMLGLHLATSFEFVTNAYWMITGLYEEQNLLILTFNGATAVDFGTPIATLSWYGDAGAENGREIGCLISKLV KPLGSRRYNHLVSEFSLQTLHQRFVVTAKDFFSLNLHLLSSMFAAVVTYLVILIQFMFAERSANAYSG And MFPLSRKQAFARVVLRFLHLTLSALGLTSRRHSRAVQWLQFGCWLSWYTAIWALTVHRATRTEDCDLDCVLRYVLLVCET GSHAIIVTNTFLQQDDSDSLECCDPVVGVTVVGLLVPIIAAQYLVCSNLDKFSERVISYYWKTLPSFLGLQFQIIAFIAQ VMYVNLRVRLARRQLQALARELACSWPQSKLQAMYLDHQTARLVDLKRRYNELYHLYSRINERYGGSLLIIFIVFFAGFV CNAYWLFIDLRTTPSRLYPILQNLGFIVNVALQMSAACWHCQQSYNLGREIGCLISKLVKPLGSRRYNHLVSEFSLQTLH QRFVVTAKDFFSLNLHLLSSMFAAVVTYLVILIQFMFAERSANAYSG Gr28a \- GA12527 \- not in BLASTP \- here's my model. MAFKLWERISQADNVFQALRPLTYISLLGLAPFRLNLNPRKEVQTSTYSFVAGIVHYLFFVLCFVTSGLEGDSIIGYFFQ TNITRLGDKTLRLTGILAMSTIFGFTMFKRQRLVSIIQNYIVVDEIFVRLGMKLDYRRILLFSFLISLGMLLFNVIYLCV SYGLLVSATISPSFETFTTFALPHINISLMVFKFLCTTDLAKSRFSMLNEILQDILDAHIEQYNALELSPMHSVVNHKRY SHRLRNFISTPMKRYSVTSIIRLNPEYAIKQVSNIHNLLCDICHTIEEYFTYPLLGIIAISFLFILFDDFYILEAVLNPK RLDVFETDEFFAFFLIQMSWYIVIIILIVESSSRTILEGNQSAAIVHKILNITDDPELRDRLFRLSLQLSHRRVLFTAAG LFRLDRTLIFTITGAATCYLIILVQFRATHHMEDAVGANASQLHFLHD Gr28b \- GA12528 \- Like Dm28b, this is an alternatively spliced gene like other Ors and Grs, where the last two exons are shared by multiple alternative first exons. Here are the five proteins in order, A-E. 
MIRSGLKSYRACRRRVGDALSARDYYGAIQMLIAIAYVLGITPFVVTHSAKGESGMQQSWYGFVNAISRWVLLAYCYTYI NLHNESLIGYFMRNRISQFSTRLHNICGIWSAVFTFIMPLLLRRHLQRFIEDVMEVDRRLDRLRHPVNFNAVFGVATLVL ALVALLDTIITVTCLVCLAQMEVRASWQLIFILVYELMAISMTIVMFSLITRTVQRRLNYLHTVLKNLSHQWDSHSLKGV VQKQRSLQCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKR ESKFKTVEFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGL FNIDRTLYFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN MSTALRRVRKYFISSQVYEALRPLFFLTFLYGLTPFHVVRRKMGESYLKMSCFGVFNIFIYICLCGFCYISSLRQGESIV GYFFRTEISTIGDRLQIFNGLIAGAVIYTSAILKRCKLLGTLTILHSLDTNFSNIGVRVKYSRIFRYSILVLVFKLLILG VYFVGVFRLLVSLDVTPSFCVCMTFFLQHSVVSIAICLFCVIAFSFERRLSIINQVLKNLSHQWDSHSLKGVVQKQRSLQ CLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTVE FVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTLY FTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN MEAGKDTDTVEIEVESGLCQPLRRRVRRFLSAKQLYECLRPVFHVTYWHGLTSFYISSDSATGRKDIKKTIFGFVNGIIH ITLYAICYTLTIFNNCESVASYFFRSRITYFGDMMQIVSGFIGVTVIYLTSIIPNHRLERCLEKFHTMDMQLQTVGIKIM YSKVLRFSYMILFSMFMVNICFTCGTFSVLYSSLVAPSMALHFTFLIQHTVISIAIAVFSCFTYLVEMRLVMVNKVLKNL SHQWDSHSLKGVVQKQRSLQCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAY YVLETLLGKSKRESKFKTVEFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQL LHLKINFTAAGLFNIDRTLYFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN MSGFWRELFHPRDAHGSEQTLLLCTYILGLTPFRLRGEAGARHYQLSKIGYLNALIQLSFFSYCFLTALIQQQSIVGYFF KSEISQVGESLQKFIGMTGMSILFLCSTIRVRLLIHLCNLISRIDGHLLDVGVVFNYPAIMRLRHSQLFLMSTVQLAYLI SSTWMLLRNDVRPSYPAAVAFYVPLIFLLSTVILFGAFLHRLWQHLEALNKVLKNLSHQWDSHSLKGVVQKQRSLQCLDS FSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTVEFVTF FSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTLYFTIS GALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN MWLFNRLVYMARPHDVYTCYRVTLWLALWLGIVPYYLTSTSAGSSRLSASYFGYFNIIFRMIVYVVNFVYSALDPRSLMS NFFLSDISNVIDGLQKVNGMFGIIAILLISLAKRRKLLHVLAVFDCLERESFPRVGITHHQGPAARRMNRLVLVMTGTTT 
AYITCSFLMISMRDAGTFSISSVISYFSPHFIVCAVCFLAGNLMIKLRIYLSALHKVLKNLSHQWDSHSLKGVVQKQRSL QCLDSFSMYTIVTKDPAEIIQESMEIHHLICEAAATANKYFTYQLLTIISIAFLIIVFDAYYVLETLLGKSKRESKFKTV EFVTFFSCQMILYLIAIISIVEGSNRAIKKSEKTGGIVHSLMNKTKSPEVKEKLQQFSMQLLHLKINFTAAGLFNIDRTL YFTISGALTTYLIILLQFTSNSPHNSNGNYYSCCETFNNITNHTN Gr32a \- GA13351 \- Not in BLASTP \- here's mine. MSPNTWVIEMPTQKARLHPYPRRISPYRTPSVNRYAFSHETPPPPPPPPPRTLEHPVFEDIRTIMSVLKASGLMPIYEQL SSHEVGPPTKTNEFYSFFVRGVVHALTIFNVYSLFTPSSAQLFYSYRETDNVNQWIELLLCILTYTLTVFVCARNTKNIL RIMNEILQLDDEVRRQFGANLSQNFGFSVKYLFGIAACQTYIIVLKIYAVDGVITPTSYVLLAFYAVQNGLTATYIVFAS ALLRIVYIRFHFINQLLNGYTYAQQQRKKGGHRRQAAGATLMENFPEDSLFIYRMHNKLLRIYKGINDCCNLILVSFLGY SFYTVTTNCYNLFVQITAKGMVSSNILQWCFAWLCMHVSLLALLSRSCGLTTREANATSQILARVYAKSKEYQNIIDKFL TKSIKQEVQFTAYGFFAIDNSTLFKIFSAVTTYLVILIQFKQLEDSKVEDNIQDQQQT Gr33a \- GA14395 \- The first exon on this is nearer and shorter, and nicely shared with Dm, which needs updating as well. MIQIMNWFSMAIGLIPLNRQQSESNVILDYAMMLLVPVFYLGCYFLINLTHAFGLCFLDACNSVCRLSNNLFMHLGAFLY LTVTLMSLYRRKDFFLQFDERLNAIDAVIQKCRHVAEMDRVKVTAVKHSVAYHFTWLFLFCVFAFALYYDIRALYLTFGN YAFIPFMVSSFPYLAGSIIQGEFIYHVSVISQRFEQINTLFEKINQEARHRHAPLTVFDIESEGKKQERKNLTPATAMDS RGPSFGNEQKLSGEMKRQMAAPPPPPPQGQQKNEEDEMDSSYDEDEDDFDYDNATIAENTGNTSEANLPDLFKLHDKILS LSVITNGEFGPQCVPYMAACFVVSIFGIFLETKVNFIVSGKSRLLDYVTYLYVIWSFTTMVVAYIVLRLCCNANNHSKQS AMIVHEIMQKKPAFMLSNDLFYNKMKSFTLQFLHWEGYFQFNGIGLFALDYTFIFSTVSAATSYLIVLLQFDMTAILRNE GLMS Gr36a \- GA16444 \- Fine as is. The other GA models in here should be dropped, that is, GA16445 and GA16442. This is the single ortholog of a triplication of DmGr36a-c. Gr36d \- I dropped this from the chemoreceptor superfamily but the name is stuck in FlyBase. Gr39a \- GA16340 \- This gene is complicated. In Dm this is alternatively spliced to give four proteins, each with its own first exon and sharing the last 3 exons. In Dp there are seven isoforms similarly structured, but two were lost from Dm, while one Dm first exon is duplicated in Dp. 
It would be great to have these letters PA-PG along the chromosome as below: MRDALHDLLKYQRRLGLTTVDTEDQNGNCCKLRPNWGTFFQFWLLQGVVVFTCSVFIIFWDHKFEATHTGVANHFAHVLE VLEPLSISWLLVWMRLHEGRQVRLLNRLQEMARVCHQVVTIPRWLLRLWLISSVGIVLSCLLYAFTLTGLELVSLVPYGT FILRHTYYNYLITFFTAIIFGMEQILMAHRRRIERSLRSTNKRELARSLCAIDEIHLLCETDINYIFGGSLALQMLYIVL STASFGYILSLEWFELLTCGAIVLCIFPTMFYSTMPAWSIRLQVEANKTAKILAKVPRTGTGLDRMIEKFLLKNLRQQPI LTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF MGQKLLLTFLHYQRYLGLSDLDYSEAGRGYWLHATWYSYAAQFLVAGTFFSALVAALSEPLYYINTGSMTGNIFDNAVMM TASVTQLLANLWFRSQQQTQVALLQRLSKIKEHLKVDTVALSSPRRMYRLWVGTWFFYGYMVGSFAASFWLAKPKLSHAL TLLGFGLRVMSANFQYTCYSGMVCLLQRLLRAQAEELQILVDTHPIPLEALAKSLRVHDEILMLGQREFVQVYGGVLLFL FLYQVMQCVLIFYVSTLKGSLNLRTTLTMSGWLAPMLLYLILPLMVNDVSNQANKTAKILAKVPRTGTGLDRMIEKFLLK NLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF MKLLKFLHYQRYLGLSNLDYSQHQQRYTLRGTCVSYALQFVVAVIFVSAFVSALAESVNYMQTNSLTGNIYDHAVILTVS VTQLLANLWFRAHQQAQVTLLRRLSKVMRLLKVNTLALGQPRWVYRLWVAVCLWYAMMIGSFGSSIWLSGMKLSHILTLL AFALRLLCANFQFTLYSSMVCVLQRLLSVQGELLQVLLGDPSGISRGALARCLRLHDEILMLSQGEFVQVYGGVLLFLFL YQVMECVLIAYVSTLEGFRSMQELARIVCWISPMLVYLILPLMINDLSNQANKTAKILAKVPRTGTGLDRMIEKFLLKNL RQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF MEDPPSGELCVYYKICRYLGIFCIDYNLSKQRFWLRRSLICYVVHFAVQTYLIGCIAIMVLYWNHAFKEEMTQTGNHFDR LVMLLALGMLLVQNAWLIWLQAPHLRIVRKLEFYRRKHLQHLRLQLPKRLLWIIIISNMLYLYNFVKICIFEWLSDATRL FVITALGFPIRYLVTSFTMGTYCCMVHLMRHLLLSNQSQISLIISQIQDPKLGSANVLRLRGCLDMHDRLVLLCNVEISL VYGFIAWLSWMFASLDVTGVIYLAMVVPSLRNPCVQIVGYLVWLTPSLMTCGASFMSNRVALQANKTAKILAKVPRTGTG LDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF MSLGVELRVYLSSLKCLGLLCITHEPHDSEYRTHASRTDETLALAGLVCSQCIALLALGHAIVQPRVYELPLYTNVGNVY YVANYGLTCLTVSLFYAYFYLRRRSFLSLVSVLLYHNRVDLGNCHSRQFLRLYIIFVSQVLLTGLLQMMIMLYCNIDPLH SFLLFFFVSFSYMLIALVIAFYTCLVQIVASLVRLYNRDLTAAVHSRAPLSTTLCRLRLLQRNRLLWVCQQHLTSDFGLV LTIIIAFLLFSAPAAPFFMVTIVFEIDARLVGMRHLLIPLAVTLLWNLPIVVALLMTLRSDLVGKEANKTAKILAKVPRT GTGLDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF 
MRMHSFGELRTHLRTLRLLGVFRFHIDYDKCVVSSTPSEERVARFYLWGVLWILLNIQTYCTYMPQHFFMVNYNATGNCY ALINIRTCNVTTVLIYTMLYVRRCRYARLLETMLRLNRASRDPQSSSLYGIHLTLFVLCMINYGHGYWRAQVRPTSIPIY LFQYGFSYMLMGQLVVLFVSFQRILLSSLRCYNRKLLGSRQLSRECREFYEDFRDYNQIIRLCHEDINDCFGLLLLPITG YVLVTTPSGPFYLISTLFEGLFRTPWRFAFMFLTCVFWSMPWVTLLVLAMGTTNVQREANKTAKILAKVPRTGTGLDRMI EKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF MSVCRDLRLYLRALKFLGMMCWQFETERCLLQSTPECERYAQVYTVVVLSGTTGALAYAHLQPDRFRMKVYNRTGNFYEA VIFRCSCVVLWLLYVCLYLRRHRHMELVQSLLTINRTCLAGSADRQFRNNFVLYGALSGLIFGNHINGYRHAGLDSIALT LNVVLYTYAFLVLCLLLVFFVCLKQIMAAGLTHYNEELRQGIASLDVATIFCGRQKIPSLRGRQQILALRGRQQLLALCE GELNECFGLLMLPIVALVLLLCPAGPFFLISTVLEGKYGPKQYILMTLTSIFWDVPWMIILFLMLRTNGITVEANKTAKI LAKVPRTGTGLDRMIEKFLLKNLRQQPILTAYGFFALDKSTLFKLFTAIFTYMVILVQFKEMENSTKSINKF Gr39b \- GA16339 \- Just needs an S on the end. Gr43a \- GA14333 \- There is an open intron that needs to be spliced. It might also need to be removed from DmGr43a. MEISQPSIGIFYISKILALAPYATRKNSKGQYEIRRSWIFTVYSASLTVTLVFLTYRGLLFDANSKIPVRMKSATSKVVT ALDVSVVVLAVVSGVYCGLFSLRETLELNARLYRIDNTLNACNYSRRDRWRALGMASVSLVTISVLLGLDVGSWMRKAEE MNIEESDTELNVHWYIPFYSLYFILTGLHINFANTAYGLGRRFRRLNQMLSSSFLGEPKEASALKLRKMSTVKTVSTNTT MTPTALHSNLTKLSSEMLPNESAAKNKGLLIQTMADNHESLTKCVLLLSNSFGIAVLFILVSCLLHLVATAYFLFLELLS TRDNGYLWVQALWILLHSFRLLMVVEPCHLAARESRKTIQIVCAIERKVHEPILAEAVKKFWQQLLVVDSDFSASGLCRV NRTILTSFASAIATYLVILIQFQKTNG Gr47a \- GA11896 \- This is a pseudogene in Dp \- my best version is YCLIHHSLCTTIPEFVSPPZPZHTACIAITCWQGRHGVRKVFKDLLHLERQYFKGQPSGTPLKARYKAEEQGGTDEEVAX LLFSKMLRIGSRZIVSTRSALSVCCPATAGSRSPSTPATPKSLDLSLSLRFFVILSYXLSTDITQLREDSLRILLETNRM DRMVRSYPRTYVTLELALHPRRVEFLNVFAFDRKLTLTLLAKTLLYAICWLQGDYMTLKR Gr47b \- GA15589 \- Here's my model MKPRKSSSDGFIYCYGNLYSLLFYWGLVTFRVRTQSDGGAASSPVSVLYAVCVRCFTVCGYLGSVLIKLEDERMAAAMIG HLTPVVKVIIMWECLSSSITYVENFLTLDVQRRRHVKLLANMQGFDLDISEEFPSVRWNYQRTRSKYWYGTVIVTICFFS FSLSLILNMARCTCGLSSTLLMAGTYTLLTSSLGLVGFVHIAIMDFVRLRLRLIQKLLHQEYEGSTGRDRPQTTVHRRIA 
KLFQFTKRCSHLLAELNAVFGFSLMTCFAYVICQKVLGSEAWDLEYTFMLLHVTLHSYKLIITSTYGYLLKREKRNCLRL LGLYSQHFPQQPLARSQVEDFQHWRMHNRQAALIGSSIQLSVSTIFLVYNGMANYVIILVQLLFQQQQIKERQRELGRDV DIVGPMGPRTHLD Gr57a \- 12290 \- Fine as is. Gr58a \- 15816 \- Here's my model. MSIGLLLKTFHSYGLGSGLLPAPLRLDLDLDRIHFSKRSHQRRFYLAYTACLNVLLIVLLPCTFPVFMYDESYMRDKLVL QWTFNLTNVTRIMAMVACGYLTWTKREPLLQLGEGLARHCHRCRQLENGALHPSGYRELQKRIRGLLRQQVFVLNLSIVS GTLLLMRIDTDVRRSNIIMVVVHMLQFIYVSIMMSGLYVICLLLYWQMERVNLALKELCSRLHHEERNALLLSASLARQT LHSLGHLVQLHCEGQRLMRSLFGIFDVTIAFLLLKMFVTNVNLLYHAVQFGNDSIDTTSVTKLWGESLIVTHYWTAVLLM NLVDNLTRQNGLETGEILRQFSDLELVKREFQLELERFSDHLRCHSTAYKCCGLFVFNKQTSLIYFFSALVNVLVLYQFD LKNSVLKIP Gr58b \- GA12328 \- Needs two aa on the C-terminus, plus internally seven aa need to be added back at the end of the intron. MLHPKLGQALRVAYYHALVFGLMGTTLHIRGNSRLIRVEKVSWIYLAYSLVISGGLLLDTYFMVPKAILDGYIHHNIVLQ WNFFLMLGLRIVTIFCSYGLVWLQRRQLVKLYVDSRHLWRNYRRLLKRMVDQQDLEKLQLSLTSLFWRQTIVVYGALLCS SVVQYQLLSVINRQSLTALCARLSQLLHVLAVKMTFYALLLMLDHQFQAVLLALRTLQRRKGGKQKAKDLRRIAALHLDT YHLARHFFGLYDVANAMLFINMCVTTTSILYHAVQYRNQSIPSDGWGNLFGSGLVLFNLCGTLMLMEKLDRVVSSCNVGP ALRQFCDLRKISKELQMELEIFSTQVHRNRLAYKICGLVEVNNSACLSYIGSILSHVIILMQFDIRRQQME Gr58c \- GA12324 \- Needs five aa on the N-terminus, plus a C-terminal exon. MVHVSLLRFYFELSRLIGLCNLHYDPPHHRLVLNHVPTVVYCLVLDVAYVLIMPFAFSLLVGNIYGCKQLGMFDTVYNVM GQAKLFSMLVLIGGVWLRRCRMEGLGNEYLKLLFHFRSAALNHVRRLCLWKVALTSSRFVMLIQILLTPNSLMHCKYTLD RTGVAPFYLAAMGYALIMELMVTYVDVTVYMIHVSGNWLISSMTERLQEMIDDVEVLPKRLGRPRDMGLRQILSAWLLLW HRCLHLDDLLKQLRDIFQWQILFNLGTTYIFSIATVFRLWIYIDYSKDFSLWTCLIMLFVFLAHHCEVMMQFSIFHTNTS KWRKLQEQLQHLWFLNQSQNGAGLTAEVVLSRKLEFAILYLNRKLQARPQRVRHLHILGLFDLSRASGHAMTSSVFSNAL VLCQIAYKIYG Gr59a1 \- GA15713 \- Please use this for the first of these three duplicated genes. 
MGILLMLDIFQWFAVLIGLTSYRPVDDRFIQTRIAKAYTLILNVITVTMLPVALMSAVNYFYVAVWLPRFMWITPFVLYA VNYVVIVQTLISRCHRDSILMELHHLVVKLNREMGRAEKKMNSKLRRLFYVKTLTTSYLSLCYILGTFLFSDELTFSMML SAFLINNGYNILIATTHFYFVSFWQVARGYDFVNQQLEELISTPSPLTSRYTEEMRSLWSFHSSLGQTAHKINRIYGRQM LASRFDYIIFTVINGYIGLMYSSREPTTLFAKCYGGLLYWIRTVDFFMTDYICDLVAQYQSMPKHTASEGVMSNELSSYV IYQNSMNLNLKVCGLFPANRKQWLNMMGAILCHSVMLLQYHLMMSAKQRNQ Gr59a2 \- MGILLMLDIFQWFAVLIGLTSYRLVDDRFIQTRIAKAYTLILNVITVIMLPVALVSMVDYFYVAVWLPRFMWITPFVLYA VNYVVIVQTLISRCHRDSILMELHHLVVKLNREMGRAEKKMNSKLRRLFYIKTLTTSYLSLCYVLGTFLSTNELTFSMML SAMLINNSYNILIATTHFYFVSFWQVARGYDFVNQQLEELISTRSPLTSGYAEELRNLWSLHGSLSQTAHKINRIYGRQM LASRFDYITFTVINGYLGLMYSSSEPTNLFEKCYEGLLYLIRTVDFFMTDYICDLVAQYQSMPKHTASEGVMSNELCSYV IYQNSMNLNLKVCGLFSANRKQWLNMMGAILCHSVMLLQYHLMMSAKQHNQ Gr59a3 \- MRRFLLLDIFQWFAVIIGLTSYRVVDDRFVQTRLSRMYTLIVNVITVTMLPAATLDLFNSFSMGFWLPQFMWITPYVLHA VNYAVIVHTLIFRGHRNIIQMELHHLSVKLNREMGRAGKQMNSTLRRLFYLKTLTLSCMCLWHFLQSFIVIGVSSLSKIL GLIIINSGFMILMSIVHCYFVSLWKVAIGYDFVNQQLEELIATRSPLTSRYAEELRSLWSLHASLSLTAHKINRIYGRQM LASRFEYITFTAVDSYMEIIYYFSESAPAISKCFGISFFSIRTIDFFMADYICDLVAQYQSMPKHTASEGVMSNELSSFV IYQNSMNLNLKICGLFPANRKQWLNMMACILGNSAVLMQYHLMMSGKEEKYFKS Gr59b \- GA15716 \- Just needs an N on the C-terminus. Gr59c \- GA15710 \- Needs an open intron removed. MADFVWIIQRFVYLYGRLVGVVNFTVDWRTGRAMITRWATIQAAVQNICLIGLLTFQLLHGDTVLFTFKHAKYLHEYVFL MVTAVRHWAVLLTLVSRWRHRGDIVLIWNRLFRATQQRPDVIPLYRRRLILKFIFAVLSDNLHMVLDLSALRQKFSPALV LKLIVWYLFTTIFNMIVAQYYLAMLQVNVSYTLIKRDLRELLTETQALCGSTNRRGGVFVTKCCALSDRLDRIAETQSKL QALVDGMSKIFQIQSFSMTIVYYLSTIGTIYFAFCTIKYSSTGLGASNWGLLLIVLSTTFFYADNFITINIGFIIMDSNP ELMKMLEERTLLCEELDERLKSSFESFQLQLARNPLEFYVMGLFKIDRGRIMSMANSLITHSIILIQWELQNN Gr59d \- GA15766 \- Needs a C-terminal exon. 
MPDLVKWCVRISYLYGRVTGTLNFEIDLKTGRTRVTKKATIYSALAHVFLITILAHHLWRVKPTSDLLANANALHENVFM VVAWMRVSCALAALAGRWYHRRRYMRLISSFRCLYLKNTEVMQHCRRGFVSKCFIATMAESMQFLMALLVVWDRLTFSLL IGIWSVMTVTAVMNVIITQYYFALGNARGHYKLLNRDLREVLAEARSLGPKRKRQNGVFITKCCFLADRVDEIAQTQSEL QTLIERMSRIYELQVLCLFCTYYLTSVGNAYLLFSIYKYNNITQGWSKLSLLAGATFLVFYYADCLINSYNVFYLLEAHQ EMHKLLEQRNLFPWGLDERLESAFDSLELSLARNPLQLHCFGLFKLDRSSAFDVGNSLLINSVLLIQYDMQNY Gr59e \- GA17326 \- Needs 5aa on the N-terminus MSSSISNRWGNLLLTISRCLAVAPTARQEGRFARWIHCFWCLVLLGYVWTGCIWKCIVFDAEMPTIEKLLYLMEFPGNIT ITGFLVYHAVLNCPYARDVETQIHLLIGRQDFGVAQRLYQKHGKRTRHLLVQTIVFHGACIVVDIVNYDFNWWTTWSSNS VYNLPALMISLGVLQYALAVHLLWLLKSHLCHCLEQLQKRRRLPQGIVNLDARYDRFFASLVDAGGCSSLVLEELRATYT SIDRLHRQLLDKFGLFLLLNFGNSLCSFCEELYMVFNFFERPQWAAGMLLFYRILWLVMHGGRIWVILAVNEQLVEQECQ LFLQLNQMEVCGSHLERTINRFLVQLQTSIGQPLLACGVIDLDTLAMGGFVGVLMAIVIFLIQIGLGNKSLMGVALNQSG WIYI Gr59f \- GA17325 \- Needs a T on the C-terminus. The third intron has a GC donor. And for the N-terminus I'm pretty much stumped because I can't build a phase 1 intron like Dm and DyGr59f and all the related genes. Rather I have a phase 2 intron there. MEAFLMSLAVDMGKPKKVPPKHPMTAERLLKGHPSFHQQTRRLYKALHWLLLISVLANTAPIAVLPGRQGIVYRHLHLCW MAVSYGWFCLASYWEFVLITLNKVSIDCYLNAMESAIYVVHSASILILTFQWRHRAPAVIDRIVKSDLERGYSINCRQSK HFLRVQLSLVLVLACSAFAIDICSQRFVVYKAILSIHSFVMPNIISSLSFIQYYVLLQGIAWRQAAVTESLQSELQHLPC PRRWEVQRLRLQHVELTRFTKLVNTAYQYSIVLLIVGCFFNFNLNLFLVYKGIDVPELADWVRWIYMVLWLAMHMGKVYG ILYFNHKVQDEQRKCLALLNGVQCVGPDLLDTLNHFVLQLQTNVRQHVVCGVIVLDLKYLSALLVASANFFIFLLQYDVT YEALYKLT Gr61a \- GA12601 \- Needs 8 aa on the C-terminus. And a 15aa N-terminal extension is possible beyond that of DmGr61a. 
MSKAPDSILRRLKVRRQKQRTILAMRWRCAKGGKEFKELDTFYRAIRPYLCVAQLFGIMPLSNVLSRDPQDVKFRLRSVG MCFTGLFLLLGGIKTVMQANILFRTGLNAKNMMNLVFLIVGIVNWLNFTGFARSWSKLILPWSSLDILMQFAPYAPSKHS LRSKLRLIGCVVGSLAVVDHLLYYASGYYSYHMHIFHCHTNHSRLSFGSYLEKEFSETFELLPYNMFSVCYGFWLNAAFT FLWNFMDIFIVLTSIGLAQRFRQFADRVLALQGRQVPDTLWYDIRRDHIRLCELASLVDESMSNIVLMSCANNVYVICNQ ALAIFTKLRHPINYVYFWYSLLFLLSRTSLVFMSASKIHDASLLPLRTLYLVPSTHWTEEVQRFVSQLTSEFVGLSGYRL FYLTRKSLFGMMATLVTYELMLLQMDAKSHKAGLPDLCA Gr63a \- GA13400 \- Needs six aa on the N-terminus. MDMKFPHSFSKMANYYRRKKDAVFHNAKPINSGNAQAYLYGVRKYSIGLAERLDADYQPPPSDRKKSSDSTGSNNPEFTP SVFYRNIAPVNWFLRIIGVLPIVRRGPARAKFEMSSASFVYSVVFFMLLACYVGYVANNRIHIVRSLSGPFEEAVIAYLF LVNILPIMVIPILWWEARKIAKLFNDWDDFEVLYYQISGHSLPLRLRQKALYIAIVLPILSVLSVVITHITMSDLNINQV VPYCILDNLTAMLGAWWFLICEAMSTTAHLLAERFQKALKHIGPAAMVADYRVLWLRLSKLTRDTGNAMCYTFVFMSLYL FFIITLSIYGLMSQLSEGFGIKDIGLTITALWNIGLLFYICDEAHYASVNVRTNFQKKLLMVELNWMNSDAQTEINMFLR ATEMNPSTINCGGFFDVNRSLFKGLLTTMVTYLVVLLQFQISIPTDKGDSDGGTNITVVDMLMDSLGNDMTILSASSSTT THSTATSSTTPPPTSAKHGRGHRG Gr64a \- GA16796 \- Fine as is. Gr64b \- GA16793 \- Fine as is. Gr64c \- GA16792 \- Needs MK on the N-terminus. 
Gr64d \- GA17330 \- Mine is MEWSAQSVPNKNTLHHAIGYVLVVAQFFGVLPLSGVEPSVPVASVRFRWFSPLNLLPVAALCFVLLDFVLSAKLVIQNGL KLYTIGSLSFSVICIFCFGAFLLLAPRWPHIIRRTFECERIFLQSCYNSSIGRRFSQRLRRWAIALLVTALCEHLSYVVS AVWSNWKQIRECHLDIDFWQNYFLRERQELFSILPYSTWFALYVEWCTLSMTFVWNFVDIFLILVCRSMQMRFQQLHWRI RQHIGRRMSDEFWQEVRYDLLDLNDLLKLYDKELSGLVLVACANNMYFICVQIYHSFQVKGAVLDEVYFWFCLLYVVSRI VNMVLAASSIPQEAKQINFTLDEVPTSCWSKELERLSEIFHNEAFALSGKGYFVLNRRLLFTMAATLMVYELVLINQMEG EEVQRSICNRGAGSSMSIFFS Gr64e \- No GA number \- Mine is MARTTGDPVKRQKCIARIKFWRRSRVGSDITLGILKYKVVSNQAQRFQFSKINAFLRRAVRDDYRYSGSFQEAIKPVLII AQIFALMPVRGIGSKLAEDLTFAWSSARTYYALAMMISFGVTSGYIVAFMTNISFDFDSVETMVFYGSIFLISMSFLQLA TRWPAIAQEWQAVETKLPPLRLGKERRSLAHHIKMITLVATTCSLVEHLLSMTSTMTYSVACPRWPGHPVDNFLYFNFAT VFHFVDYSTFLGLLGKVINVLSTFAWNFNDIFVMAVSVALASRFRHLNDYMQREARSATTVGLLDAVQSQFRNLCKLCQV VDDGISTITLLCFSNNLYFICGKILKSMQTKPSASHTMYFWFSLTYLLGRTLVLSLYSSSINDESKRPLRIFRMVPREYW CDELKRFSEEVHMDTVALTGMKFFRLTRGVVISVAGTIVTYELILLQFNKEETTAFTCENA Gr64f \- GA16791 \- Needs an internal segment restored. MKFLPAKLERKFRRLKKHSRSSLTRKLDVMHESARKKVIEENCDAYKNQKQSEYECRKRPTKFPGGTRETFLSEGSFHQA VGRVLLVAEFFAMMPVKGVTAKHPGDLSFSWRNVRTCFCLVFIASSLANFGLSLFKVLNNPISFNSVKPIIFRGSVLLVL IVALRLAQQWPTLMMYWHEVEQGLPQYPSQVGKGQMGHTIRMVMLVGMMLSFAEHLLSMISAIHYARYCNSTSDPIKNYF LRTNDEIFYVTSYSTALALWGKFQNVYSTFIWNYMDMFVMIVSIGLAAKFRQLNNDLRNFKGMHMAPSYWSERRIQYRNI CVLCDKMDDAISLITMVSFSNNLYFICVQLLRSLNTMPSVAHAVYFYFSLIFLIGRTLAVSLYSASVHDESRLTLRYLRC VPKDSWCPEVKRFSEEVISDEVALSGMKFFHLTRKLVLSVAGTIVTYELVLIQFHEDNDLWDCNQSYYS Gr65a \- removed from the superfamily as not clearly homologous. Gr66a \- GA20169 \- Needs three aa on N-terminus, and a further 20 are possible but don't align with DmGr66a. I also have a somewhat different internal splice \- but not sure who is right. 
MPPAQTESALPMVQPLLKEFQLLFYISKIAGILPQDLEKFRTKNVLEKSRNGMVYMLAMLIVYVLLYNILIYSFGEEDRT LKASQSTLTFVIGLFLTYIGLIMMGTDQLTALRNQGRIGELYERIRQVDERLYKEKCVVDNSHIGGRIRFMLIMTFLFEL SILLATYIKLVDYTQWMSVLWIVSAIPTFINTLDKIWFAVSLYALKERFEAINQTLEELVATHEKFKLWLRGDHDTSSRT LDSSQPPEYDSNLEYLYRELGGLDMGSLKGSGKNKVAPVSHSMNSFGESIEASDKATHHPISVNMAHESELSNASKVEEK LNNLCQVHDEICEIGKAMNELWSYPILSLMAYGFLIFTAQLYFLYCATQFQSIPSLFRSAKNPFITVIALSYTSGKCVYL IYLSWKTSLASKRTGISLHKCGVVADDNLLFEIVNHLSLKLLNHSVDFSACGFFTLDMETLYGVSGGITSYLIILIQFNL AAQQAKEAIQTFNSLNDTASLVGAATEMDNGSSTLYDLVTTTMLTPTV Gr68a \- GA20248 \- Needs EV added to C-terminus. Gr77a \- GA16898 \- Not sure why, but this time I extended the N-terminus well beyond the Dm alignment. MKFNQANSWRNQCSNAPMLQCANAPLPLALIRLCVLDSLWRTAQNASCIHNDMASSSLAGLTFYWLRKVAIAALLVLYGL AKVFGLMAASTPRGGHRVRQSLYWRIHGYAMLVFVGCFSPIAFASVYHRMAFLRQNRLLLLIGFNRYVLMLLCAFATVCI HTTKQEEIVGCLNQLLRCRRRLMRLMHTPELRQSIDRLSTRGNLLIVGVLIGVFILSPVHTLQILAWDPAVRENFLYVFS LLFIYACQLILGLSLGLYVLVLVLLDHLGHHSNQLLARLLADAASLRAPLGCGIIRRRQQLYHSQQTWLALELWRLLHVH HQLLALFRSICSLTGLQAVCFVSLVAMECMVHMFFTYFMHYSKFILRKYHRAFPFNYYAMAFVTGLFANLVLVICLTHRM ICRFAHTREVLRSGVLALPPGGTVKQLNETLHYYGLYLKNAEHIFAVRACGLFKLNNVLLFCIVEGMLNYLMILIQFDKV INK Gr85a \- GA16233 \- This gene is truncated by the end of a contig, so I built the rest of it from raw traces \- here's my extended version. MSSLKRLVQLSFGFFCALNGIVPFYFGFSSGKLHWSRVLAYYRIIHNFIVIGLTIKFITNYWHFHTHVEHSRSKLMELNT FTHFTIVLLSLLSCMECAHRQQDRIYGMIEKLLDLDRLSIELGYIAPKSHQRYIGLLVLAITPLLALRLFIHVGLNKIRT RLGFDFPCNCFMSECMILGMSSVGFGIMAEICQCWWRLQTGLKRTLLDDSLPDQLNQLLQLQRMFQCLIDITAEFCTVFK FVLLCFLVRNVWCGIVIGYMLVRIFCGHGVSELHLYQLYLAFVICIQPLLYSLLLNCLTHTTDSLMETTKEIVRESQGHE LLVERSIQWFSLQLAWQHTNVHVYGTYRINRRLVFQSASVILLHVSYMVQSDYRSM Gr89a \- GA13339 \- Not in BLASTP \- My version is below, but there is also an 18aa N-terminal extension possible beyond the DmGr89a alignment. 
MSRLPHVCGLCLLLWLWQLLALAPFSYSRSRGARCRRLLTLSGVLRWLLLIGLAPLMLWKSAAMYDATNVRHSMIFKNIA LAAMTGDVFISLALLGAHLWHRRGLARLLNGLAQLHRKRKLGWGSTLLLWSKLLLSLYELLCNVPFLQGAGSRLPWTQLL AYGVQLYVQHVSSVYANGIFGGMLLLLASLDHLEQESPALARLLKRERGWLRLSASFVDLFQLGIFLLVIGYFVNILANM YAYMSYFVSQHGVPLTISNYCLIVTIQLYALILAAHLCQVRHGRLRQRCLELGYLPPELTHHQAMAWTPFPLFAPLDSLK FSVLGLFTLDHAFWLFLVSYAMNFIVIILQFSLENMQHADDN Gr93a \- GA12269 \- This needs N- and C-terminal extensions. MDKYSQQKRGGGGVVAEAWSRGLLLTLYRAARVLGLISFRLDREELQLKLPKSGSRNRILETVWRCLVVLTYAGVWPQLS AHLITDRPESYADMFALMQSFSVSILALVSFIIQAKGEDKFRTVLNRYLTLYRRICAVTRVDQLFPMKFIVYFLLKLLLT IGGCVHEFPPLLKNEHFKDARNMVAVIVGIYMWLGTLFVLDACFMGFLVSGILYEHMAFNILLMLQRMQPIECEEKAVRM SKYQRMRLLCDYADELDECASIYSELYGVTIAFRRMLQWQILFYVYYNFISICLILYLCILHYLNANEIALVSLAMATIK LFNLILLIMCADYAVRESQKPNRLPLDIVCTDMDQRWDKSVETFLSQQQTQRLEIKVLGFFQLNNEFILLILSAIISYLF ILIQFGITGGFEASEEVRKQFNSSSHQIQELLN Gr93b \- GA16189 \- Needs 10 aa on N-terminus MTGSSSARSTVMPRVSPWLNGPRISAGLLRGCFYYATVFGVATFRIGLQDDASKLRASSRKGYKWLSILIRVLGSCFYGY SYGAWADQYTDWYLRLFFGLRLVGCLVCSVIILVLQVCYEKRILHLVNSFFGLFRRLRALTRTVKAGFGGRLELTLLMFK LLSLAFVFLAFQWQYSPWVLLTILCDLYTSIGTGMIMHFCFVGYLSIGVLYAELNRYVDHQLRAQLSSLQDQVEEDDIQQ QPDVQAHANLDECLAIYEEIHQVTCSFQRLFDLPLFLTLVQNLSAMAMVSYHAIMSREYHFSLWGLVLKLLIDVLLLTLA VHGAVSSSRLVRRLSLENYTIGQSKSYHIKFEIFLGRLNHQQLRVCPLGMFEVSNELTLFFLSAMVTYLTFLVQYGIQTK QF Gr93c \- GA16064 \- Fine as is. Gr93N \- GA16188 \- Not in BLASTP. In our opinion the ortholog of the DmGr93d was lost in Dp, while the ortholog of this gene was lost in Dm. Thus they are paralogs, hence the name Gr93N. This gene is actually more similar in sequence and trees with DmGr92a, but that gene was lost from Dp. Complicated, I know. 
MFGVLRKMNASKLSAGILLVMYYHAIFMGIFSFKLQRHWISENGQMLMELRALPRPWLMRFYAIYRILAIGILAYFYLPW ILRLEIFFERLVHFIRIITATLVCVCILRLQLLHKADSKQLMNTFFRLFRRVRALPSRKTFGYGGRRELVLLSLALICRI HELVYILESDRQHFSMARFLSWWCDTFIVFGINMMMQMSFVCYLSIGILYSELNDFVRFQLRSELQALKRPHGLQPRRQH LRTVRRKLNECLALYREIYALATTFQKLTDFPFFLSIVHNYTLLGVVIYRLTIVGWFDKHKIQLSILTTKVILDFFLVTM AVEGAMTQFRVIRRLSLENCYISDHKDWHTTFDMFVTHLSLYEFRVRVLGLFDVSNELVLIVLSALVTFVIYVVQYRMQS TGEAE Gr94a \- GA16146 \- Needs aa on both ends. MASSIDVTHRRMVKVLTITLILLMTVFGLLANRYDSRRRQSFKLSKVYLAYAILWTTAFAGIYGYQIYQDYIQGQINLRD AVSLYSYMNITVAIINYVTQMIMNDTVAKTMSQVPLFQTLKMFHLDNASLLRSIAMALVKSVGFPLILETTFILQQRRLE PELSLIWTVYRLLPLIISNLLNNCYFGAMIVVKEILKAINVRLESHRQQVNIMQREDQLKLNTSFYRMQRFCSLADELDR LADRYIVIYVNSDKYLSLMSLSIILSLICNLLGMTVGFYSLYYALADTFLGSKPYDGLGALISLVFLFISLTEITMLAHL CNNLLVATRRTAIILQEMNLQHADCRYRQAVHSFTLLVTLTKFKIKPMGLYELDMQLISNVFSAVSSFLLILIQADLSHR FKMY Gr97a \- GA17266 \- Fine as is. Gr98a \- GA12669 \- I don't understand how you avoided the 11bp frameshifting insertion relative to Dm around bp 90 in this gene. My translation is below, with X for the frameshift. Thus I consider this either a pseudogene, or translation starts after this frameshift. MRLMSGELDSCSLRRMHRVMKCLGIIPFESXPIQHFYLKVLCNLFMVFVIGTASSWRFSFNYEFTYDFLNDHMSRILDLT NFLVLMSAHFTIVMEILWGNRSAEIEQQMEQILHDLRVHLGREVSLKRFRHYSNAIYGSLISRFLLLFAVAVYNNEGLVF SAMYSEAVMQLRFTEFSLYCGVALAFHQELCSAGSSLLVELHLTRFDLWPLRRFTLEKLSRLQQIHGRLWQTIRLIERNF QRSLSIMLLKFFVDTAALPYWLYLSKIQHTSVSVQYYCATEEFCKLMEIIVPCWLCSRCDLMQRKFRSIFYRLATGRRNG QLNAALIRICIQLRQEKYQFSAGGFAYISTEMLGTFLFGMISYIVIGIQFNLNFNASNSSKLAAEAAVTDAPI Gr98b \- GA15973 \- This needs aa on both ends, plus there is a possible 53aa N-terminal extension not alignable in Dm. And GA15975 and GA15976, which are kept for DmGr93c/d orthologs, can be dropped since their ortholog in Dp was lost. 
MKGRRRLLAAARPYLQIFSVFALTPPPNFFDRTTNRRLRRFLIVGYGVYSLFLLGILICMSYVNVLALNAEIEQFQLEDF TRAMGRVQKVVLASMGIVIHLNMFLNYRRLGHIYEDIADLEMEIDDASQCFGGQPGHNSFRYRLATNCGLWLVALVGLMP RFTIEAMGPFVSWPSKILSELVLIMLQLKCLEYCVFVVMVYGLVLRLRHTLRQLQVELADCNQRDMLQALCVALRRNQQL LGRVWRLVGELEKYFTLPMMFLFLFNGLTILHVVNWAYINQFNPADCCRYVRLGNCVLLLINPLVACYLSQRCVNAYNSF PRILHQIRCLSVANNFPILSMGLREYCLQLQHLRLLFTCGGFFDINLKNFGGLIVTILGYTIILVQFKFQAVAEEKGRFD LNNSQSF
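
Many of the notes above take the form "needs N aa on the N-terminus" or "needs an X on the C-terminus". As a rough illustration only, here is a minimal Python sketch of how such a note could be derived by comparing an annotated protein with a curated model. The function name `describe_difference` and the toy sequences are hypothetical, made up for this example; they are not part of any FlyBase tool and are not real Dp sequences. The sketch assumes the annotated protein, when it is merely truncated, appears as an exact substring of the curated model.

```python
def describe_difference(annotated: str, curated: str) -> str:
    """Summarise how an annotated protein differs from a curated model.

    Hypothetical helper for illustration. It only recognises the simple
    case where the annotation is an exact, contiguous piece of the
    curated model; anything else is reported as an internal difference.
    """
    if annotated == curated:
        return "fine as is"

    start = curated.find(annotated)
    if start == -1:
        # Not a contiguous piece of the curated model: could be an
        # unspliced intron, a different splice site, a frameshift, etc.
        return "differs internally"

    n_missing = start                                   # residues absent at the N-terminus
    c_missing = len(curated) - start - len(annotated)   # residues absent at the C-terminus

    notes = []
    if n_missing:
        notes.append(f"needs {n_missing} aa on the N-terminus ({curated[:start]})")
    if c_missing:
        notes.append(f"needs {c_missing} aa on the C-terminus ({curated[-c_missing:]})")
    return "; ".join(notes)


# Toy example with made-up sequences (not real gene models).
curated_model = "MKTAYIAKQR"
print(describe_difference("TAYIAKQR", curated_model))
print(describe_difference("MKTAYIAK", curated_model))
print(describe_difference("MKTAAKQR", curated_model))
```

The exact-substring test is deliberately strict: cases such as an open intron or a disputed internal splice fall into "differs internally" and would still need a proper alignment and a look at the genomic sequence.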