Method: same as Categories 1 and 2; most were resolvable by ClustalW.

Category 9: genes which were properly linked in the penultimate XML but are not linked in the final one, and where the CG itself is still present and seems as though it should indeed be linked.

These should be easy to resolve by Clustal alignments of the proteins, though we may need advice on why the link was lost from the last XML.

Abd-B FBgn0000015 =CG10291|FBan0010291|CT28893|CT40560|
acj6 FBgn0000028 =CG9151|FBan0009151|CT26176|
anon-67Ea = no match
hay FBgn0001179 =CG8019|FBan0008019|CT24084|
    (anon-67Ea is an ORF of 170 amino acids that begins immediately upstream of hay and overlaps the N-terminal coding region of hay)
Apc FBgn0015589 =CG1451|FBan0001451|CT3529|
Atpα FBgn0002921 =CG5670|FBan0005670|CT17822|CT36459|CT36488|CT36562|
bsk FBgn0000229 =CG5680|FBan0005680|CT40896|
Ca-P60A FBgn0004551 =CG3725|FBan0003725|CT12479|CT40312|CT40314|CT40316
csw FBgn0000382 =CG3954|FBan0003954|CT37554|CT13063
eIF-4E FBgn0015218 =CG4035|FBan0004035|CT39424|CT13384
Fer1HCH FBgn0015222 =CG2216|FBan0002216|CT40862|CT40864|CT40866|
for FBgn0000721 =CG10033|FBan0010033|CT42452|
    (aligns over 561 aa; most of the C terminus of the long isoform is not here)
Gel FBgn0010225 =CG1106|FBan0001106|CT40886|CT40884|CT1647
    (CT40886 and CT1647 have identical CDS, 6 aa longer than Gel in the CDS set; CT40884 starts at aa 51 of Gel in the CDS set)
Hnf4 FBgn0004914 =CG9310|FBan0009310|CT26497|CT40906
    (CT40906 = Hnf4 in the CDS set; CT26497 is 217 aa longer at the N terminus and is missing a stretch of amino acids (no blast match of the 217 aa to known proteins))
Hs2st FBgn0024230 =CG10234|FBan0010234|CT28767|
    (good match for first 296 aa, mismatch at C terminus, to fix.)
ImpL3 FBgn0001258 =CG10160|FBan0010160|CT28577|
mts FBgn0004177 =CG7109|FBan0007109|CT21977|
ogre FBgn0004646 =CG3039|FBan0003039|CT9674|CT40095|
Pak FBgn0014001 =CG10295|FBan0010295|CT28905|CT40904|
Pep FBgn0004401 =CG6143|FBan0006143|CT41974|CT19128|
    (CT19128 CDS starts at aa 24 of CT41974 CDS.)
    (Aubrey/Chris? Note: CT40332 is blank in aa_gadfly.dros)
Pkc53E FBgn0003091 =CG6622|FBan0006622|CT20486|CT42082|
    (CT42082 has an extra 9 aa compared to CT20486)
Pp1β-9C FBgn0003131 =CG2096|FBan0002096|CT6778|
    (CT6778 CDS starts at 3rd Met of Pp1β-9C from CDS set.)
Pp1-87B FBgn0004103 =CG5650|FBan0005650|CT17842|
PpY-55A FBgn0003140 =CG10930|FBan0010930|CT30615|
Pten FBgn0026379 =CG5671|FBan0005671|CT40892|CT40894|CT17882
Ptp99A FBgn0004369 =CG2005|FBan0002005|CT6383|
    (CT6383 is missing 4 aa in midsection, starts at 2nd Met of Ptp99A)
_____________________________________________________________________
A MESS: ONE CG that has 3 CTs that belong to 3 separate genes; all 3 are in aa_gadfly.dros
repo FBgn0011701 =CG8045|FBan0008045|CT24072|
BUT:
CG8045|FBan0008045|CT24092| matches 14-3-3epsilon FBgn0020238
CG8045|FBan0008045|CT24072| matches repo FBgn0011701
CG8045|FBan0008045|CT24102| matches nothing known
(all 3 are annotated as '14-3-3epsilon' in GenBank.)
_____________________________________________________________________
Shc FBgn0015296 =CG3715|FBan0003715|CT12443|
Src64B FBgn0003501 =CG7524|FBan0007524|CT40878|CT1253|
Trfp FBgn0013531 =CG18267|FBan0018267|CT41393|
ttk FBgn0003870 =CG1856|FBan0001856|CT5673|CT36468|CT36466|
    (CT36466 corresponds to the 643 aa short isoform)
tws FBgn0004889 =CG6235|FBan0006235|CT19500|
    (CT36963 is 57 aa and doesn't match CDS in CDS set; check when annotating)
Vha16 FBgn0004145 =CG3161|FBan0003161|CT40117|CT12409|CT10607|
    (all same CDS)
pk FBgn000309 =CG11084|FBan0011084|CT42406|
    (good alignment for first 1050 aa; last 200 or so need fixing/checking)
Abl FBgn0000017 =CG4032|FBan0004032|CT13380|
Acp95EF FBgn0002863 =CG17924|FBan0017924|CT41862|
    (CG missing first 6 aa)
Actn3 FBgn0015008 =CG8953|FBan0008953|CT25722|
    (incomplete CDS in CDS set)
anon-88Bd FBgn0025554 =CG3321|FBan0003321|CT11157|
CanB2 FBgn0015614 =CG11217|FBan0011217|CT31322|
cpo FBgn0000363 =CG18434|FBan0018434|CT41984|
    (aa 6-450 match well, rest need fixing; cpo is incomplete CDS)
_________________________________________________________________
WARNING: inconsistency in pnr CGs:
pnr FBgn0003117 =CG3978|FBan0003978|CT41948|
NOTE: CG3978 product in GenBank is 531 aa and matches pnr.
CG3978 product in aa_gadfly.dros is only 32 aa and does not match pnr.
Aubrey: CT13235 listed by you is not in aa_gadfly.dros.
_________________________________________________________________
Pp1-13C FBgn0003132 =CG9156|FBan0009156|CT26196|
Ret FBgn0011829 =CG1061|FBan0001061|CT1245|
    (Ret incomplete CDS in CDS set; CG ORF longer and starts with the amino acids MSATAN.)
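
The link entries above follow a regular "symbol FBgn =CG|FBan|CT|..." shape, so a small script could help sanity-check them before the protein alignments are run. The following is only a sketch: the names Entry, parse_entry, fban_matches_cg and fbgn_well_formed are hypothetical, and the two rules it checks (FBan number equals the CG number zero-padded to 7 digits; FBgn carries 7 digits) are assumptions read off the entries in this list, not a statement of how the XML is generated.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    symbol: str
    fbgn: str
    cg: str
    fban: str
    cts: List[str]


# symbol, FBgn id, "=", then CG|FBan and zero or more |CT ids;
# the trailing "|" is optional because some entries omit it.
ENTRY_RE = re.compile(
    r"^(?P<symbol>\S+)\s+(?P<fbgn>FBgn\d+)\s*=\s*"
    r"(?P<ids>CG\d+\|FBan\d+(?:\|CT\d+)*)\|?"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one link line; return None for lines such as 'anon-67Ea = no match'."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    cg, fban, *cts = m.group("ids").split("|")
    return Entry(m.group("symbol"), m.group("fbgn"), cg, fban, cts)


def fban_matches_cg(entry: Entry) -> bool:
    """Assumed rule: FBan number is the CG number zero-padded to 7 digits."""
    return entry.fban == "FBan" + entry.cg[2:].zfill(7)


def fbgn_well_formed(entry: Entry) -> bool:
    """Assumed rule: an FBgn id is 'FBgn' followed by exactly 7 digits."""
    return re.fullmatch(r"FBgn\d{7}", entry.fbgn) is not None


if __name__ == "__main__":
    for raw in [
        "Abd-B FBgn0000015 =CG10291|FBan0010291|CT28893|CT40560|",
        "pk FBgn000309 =CG11084|FBan0011084|CT42406|",
        "anon-67Ea = no match",
    ]:
        e = parse_entry(raw)
        if e is None:
            print("unparsed:", raw)
        else:
            print(e.symbol, fban_matches_cg(e), fbgn_well_formed(e))
```

Entries that fail either check are only candidates for a second look by eye; the parenthetical alignment notes are not parsed at all.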