From benos@XXXX Wed Dec 13 14:36:42 2000
Envelope-to: gm119@XXXX
Delivery-date: Wed, 13 Dec 2000 14:36:42 +0000
Date: Wed, 13 Dec 2000 14:35:56 +0000 (GMT)
From: Takis Benos <benos@XXXX>
To: flybase-updates@XXXX
cc: Eleanor Whitfield <eleanor@XXXX>
Subject: EDGP-CG gene comparison....
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY='1053048513-1391734921-976718156=:45284'
Content-Length: 63419

Hello, dear All:

Please find attached a text file with the results of the comparison we
performed between EDGP genes and the ones reported by Celera in version
1.0 of the Drosophila Genome.

The differences between the genes are divided into the following categories:

# 0. less than 1% differences --> IDENTICAL
# A. differences on ATG selection
# B. additional/missing internal exon
# C. differences on termination codon selection
# D. major differences indicating very different gene model

Genes of category D were analysed in detail (as well as some of the other
categories). File EDGPvsCG.takis.for_FB contains the results of the
analysis. Note that some of the CG genes (category D) were found to be
wrong, based on EST hits. Also, two of the genes were split between two
EDGP clones; therefore their corresponding entries need to be joined.

Best regards,
takis ;)

From benos@XXXX Wed Dec 13 14:32:39 2000
Date: Wed, 13 Dec 2000 08:30:26 -0600
From: Takis Benos <benos@XXXX>
To: benos@XXXX

#
#23456789012345678901234567890123456789012345678901234567890123456789012345
#
# This file is summarising the differences between EDGP and CG
# a.a. sequences, as reported by Melanie Gatt in the form of ClustalW
# output. The categories of differences that are considered are:
#
# 0. less than 1% differences --> IDENTICAL
#
# A. differences on ATG selection
# B. additional/missing internal exon
# C. differences on termination codon selection
# D. 
major differences indicating very different gene model \# \# Plus/minus symbols mean that EDGP gains/misses in sequence length over CG \# gene. \# \# PVB-000724 \# >CG17636|FBan0017636|CT38919|FBan0017636 last_updated:000321 > EG:23E12.1 >Diff : 0 >CG17617|FBan0017617|CT38862|FBan0017617 last_updated:000321 > EG:23E12.5 >Diff : 0 >CG17960|FBan0017960|CT40020|FBan0017960 last_updated:000321 > EG:23E12.2 >Diff : B- >CG17707|FBan0017707|CT39230|FBan0017707 last_updated:000321 > EG:23E12.3 >Diff : 0 >CG3038|FBan0003038|CT10170|FBan0003038 last_updated:000321 > EG:BACR37P7.1 >Diff : 0 >CG2995|FBan0002995|CT10081|FBan0002995 last_updated:000321 > EG:BACR37P7.2 >Diff : 0 >CG13377|FBan0013377|CT32709|FBan0013377 last_updated:000321 > EG:BACR37P7.9 >Diff : A- >CG13375|FBan0013375|CT32707|FBan0013375 last_updated:000321 > EG:BACR37P7.8 >Diff : B- >CG18104|FBan0018104|CT40671|FBan0018104 last_updated:000321 > EG:171D11.4 >Diff : C+ >CG17829|FBan0017829|CT39573|FBan0017829 last_updated:000321 > EG:115C2.6 >Diff : 0 >CG5254|FBan0005254|CT16777|FBan0005254 last_updated:000321 > EG:BACR19J1.2 >Diff : 0 >CG5273|FBan0005273|CT16821|FBan0005273 last_updated:000321 > EG:BACR19J1.3 >Diff : 0 >CG12467|FBan0012467|CT32681|FBan0012467 last_updated:000321 > EG:34F3.2 \#>Diff : A-- B- C+ >Diff : D \# \#Comments: \# \# A : differences in NH2- terminus are probably due to different prediction \# methods used: in this case gene CG12467 should be EG:34F3.1 \+ \# EG:34F3.2 . \# B : CG12467 has one additional exon: this probably comes from Genie \# prediction. None of Genfinder/Genscan predictions support this exon. \# C : differences on the \-COOH are probably due to different prediction \# methods used. In this case, the two last exons of EG:34F3.2 (starting \# 'APSATPII...' should form another CG gene). \# \# There is no supporting evidence to favour of EDGP or CG model. 
#   (and the EST hits cannot enlighten us further on that)
#
#CONCLUSION: Ambiguous
#
>CG16785|FBan0016785|CT32689|FBan0016785 last_updated:000321
> EG:34F3.6
>Diff : A--
>CG11664|FBan0011664|CT34394|FBan0011664 last_updated:000321
> EG:BACR7A4.3
>Diff : 0
>CG3711|FBan0003711|CT12449|FBan0003711 last_updated:000321
> EG:BACR7A4.19
>Diff : 0
>CG3034|FBan0003034|CT10206|FBan0003034 last_updated:000321
> EG:BACR7A4.6
>Diff : 0
>CG3708|FBan0003708|CT12445|FBan0003708 last_updated:000321
> EG:BACR7A4.18
>Diff : A+
>CG3706|FBan0003706|CT12435|FBan0003706 last_updated:000321
> EG:BACR7A4.20
>Diff : 0
>CG11642|FBan0011642|CT36631|FBan0011642 last_updated:000321
> EG:BACR7A4.5
>Diff : 0
>CG3704|FBan0003704|CT12427|FBan0003704 last_updated:000321
> EG:BACR7A4.17
>Diff : 0
>CG3026|FBan0003026|CT10194|FBan0003026 last_updated:000321
> EG:BACR7A4.16
>Diff : 0
>CG3703|FBan0003703|CT12417|FBan0003703 last_updated:000321
> EG:BACR7A4.15
>Diff : A-
>CG3699|FBan0003699|CT12411|FBan0003699 last_updated:000321
> EG:BACR7A4.14
>Diff : D-
#
#Comments:
#
#The -COOH terminus (70-90 last a.a.) is completely different between the
#two genes. That is very strange, because EG:BACR7A4.14 is a single-exon
#gene. Obviously, CG3699 has a splice site at some point which leads to
#a different -COOH sequence.
#However, ESTs AI293564 and AI294621 support the view that the exon
#predicted by EDGP is terminal.
#
#CONCLUSION: EG:BACR7A4.14 is more correct.
\# >CG3690|FBan0003690|CT12383|FBan0003690 last_updated:000321 > EG:BACR7A4.13 >Diff : 0 >CG11638|FBan0011638|CT36619|FBan0011638 last_updated:000321 > EG:BACR7A4.12 >Diff : A++ >CG18091|FBan0018091|CT40606|FBan0018091 last_updated:000321 > EG:190E7.1 >Diff : 0 >CG3051|FBan0003051|CT10258|FBan0003051 last_updated:000321 > EG:132E8.2 |FBgn0023169;SNF1A >Diff : 0 >CG3719|FBan0003719|CT12473|FBan0003719 last_updated:000321 > EG:132E8.3 >Diff : 0 >CG11448|FBan0011448|CT36231|FBan0011448 last_updated:000321 > EG:132E8.4 >Diff : 0 >CG18531|FBan0018531|CT42308|FBan0018531 last_updated:000321 > EG:BACN32G11.1 >Diff : 0 >CG3021|FBan0003021|CT10176|FBan0003021 last_updated:000321 > EG:BACR7A4.8 >Diff : C- >CG14630|FBan0014630|CT34388|FBan0014630 last_updated:000321 > EG:BACR7A4.9 >Diff : A- >CG14629|FBan0014629|CT34387|FBan0014629 last_updated:000321 > EG:103E12.2 >Diff : 0 >CG3655|FBan0003655|CT12291|FBan0003655 last_updated:000321 > EG:103E12.3 >Diff : 0 >CG11392|FBan0011392|CT31794|FBan0011392 last_updated:000321 > EG:BACR42I17.1 >Diff : B+ >CG11382|FBan0011382|CT31774|FBan0011382 last_updated:000321 > EG:BACR42I17.10 >Diff : 0 >CG11398|FBan0011398|CT31827|FBan0011398 last_updated:000321 > EG:BACR42I17.11 >Diff : 0 >CG14628|FBan0014628|CT34385|FBan0014628 last_updated:000321 > EG:BACR42I17.12 >Diff : 0 >CG11378|FBan0011378|CT31764|FBan0011378 last_updated:000321 > EG:BACR42I17.2 >Diff : 0 >CG11384|FBan0011384|CT31776|FBan0011384 last_updated:000321 > EG:BACR42I17.3 >Diff : 0 >CG11379|FBan0011379|CT31766|FBan0011379 last_updated:000321 > EG:BACR42I17.4 >Diff : 0 >CG14627|FBan0014627|CT34384|FBan0014627 last_updated:000321 > EG:BACR42I17.5 >Diff : A++ >CG14626|FBan0014626|CT34383|FBan0014626 last_updated:000321 > EG:BACR42I17.6 >Diff : A- >CG11380|FBan0011380|CT31768|FBan0011380 last_updated:000321 > EG:BACR42I17.7 >Diff : A++ >CG14625|FBan0014625|CT34382|FBan0014625 last_updated:000321 > EG:BACR42I17.8 >Diff : A+ >CG11491|FBan0011491|CT36317|FBan0011491 last_updated:000321 
> EG:17A9.1_altrn_1 |FBgn0000210;br /alternative
>Diff : 0
>CG3791|FBan0003791|CT12681|FBan0003791 last_updated:000321
> EG:73D1.1
>Diff : 0
>CG14813|FBan0014813|CT34626|FBan0014813 last_updated:000321
> EG:63B12.10
>Diff : 0
>CG16903|FBan0016903|CT37506|FBan0016903 last_updated:000321
> EG:67A9.2
>Diff : C--
>CG3981|FBan0003981|CT13223|FBan0003981 last_updated:000321
> EG:67A9.1
>Diff : A-
>CG3954|FBan0003954|CT37554|FBan0003954 last_updated:000321
> EG:BACN25G24.2 |FBgn0000382;csw
>Diff : 0
>CG3895|FBan0003895|CT12875|FBan0003895 last_updated:000321
> EG:BACN25G24.3 |FBgn0004860;ph-d
>Diff : A-- B+ C+
> ##### WARNING Known gene ! #####
>CG3540|FBan0003540|CT11900|FBan0003540 last_updated:000321
> EG:152A3.2
>Diff : 0
>CG10755|FBan0010755|CT30144|FBan0010755 last_updated:000321
> EG:152A3.6 |FBgn0015036;Cyp4e4
>Diff : 0
>CG18031|FBan0018031|CT40356|FBan0018031 last_updated:000321
> EG:103B4.2
#>Diff : B C-
>Diff : D
#
#Comments:
#
# B : differences on internal exon(s) are due to different predictions:
#     EDGP followed Genefinder; Genscan (and, presumably, Genie) supports
#     one more exon. Similarity hits from both SW and TrEMBL, though,
#     favour EDGP's (a.k.a. Genefinder's) prediction (although the hits
#     are not very strong).
# C : differences in -COOH terminus are probably due to different
#     prediction methods used. There is no supporting evidence for any of
#     the gene structures there.
#
#CONCLUSION: Ambiguous or EG:103B4.2 is more correct.
\# >CG3218|FBan0003218|CT10801|FBan0003218 last_updated:000321 > EG:30B8.5 |FBgn0000810;fs(1)K10 >Diff : 0 >CG14050|FBan0014050|CT33609|FBan0014050 last_updated:000321 > EG:BACH48C10.1 >Diff : 0 >CG2854|FBan0002854|CT9762|FBan0002854 last_updated:000321 > EG:BACH48C10.2 >Diff : C- >CG14048|FBan0014048|CT33607|FBan0014048 last_updated:000321 > EG:BACH48C10.6 >Diff : 0 >CG2841|FBan0002841|CT9712|FBan0002841 last_updated:000321 > EG:BACH48C10.5 >Diff : A+ >CG14047|FBan0014047|CT33606|FBan0014047 last_updated:000321 > EG:BACH48C10.4 \#>Diff : A-- B- B+ B+ C- >Diff : D \# \#Comments: \# \# A : differences in NH2- terminus are probably due to incomplete analysis \# of EDGP. Gene EG:BACH48C10.4 is located at the end of BAC H48C10 \# (reverse orientation) and is probably continuing on BAC H7M4. In \# this case gene EG:BACH7M4.1 should match the NH2- terminus of \# CG14047. \# B : differences on internal exons should be due to different prediction \# algorithms used. Although, in this case, EDGP followed the Genscan \# prediction (which should be similar to Genie that Celera used), it is \# possible that the two software programs are utilising different exons. \# C : differences on the \-COOH are probably due to different prediction \# methods used. It applies the same as in (B). \# \# (There is no supportive evidence to favour one or the other prediction.) 
\# >CG13756|FBan0013756|CT33235|FBan0013756 last_updated:000321 > EG:BACH59J11.2 >Diff : B+ >CG14045|FBan0014045|CT33604|FBan0014045 last_updated:000321 > EG:BACH7M4.1 >Diff : A-- >CG14045|FBan0014045|CT33604|FBan0014045 last_updated:000321 > EG:BACH7M4.2 >Diff : A- C- >CG12496|FBan0012496|CT33234|FBan0012496 last_updated:000321 > EG:BACH7M4.4 >Diff : C- >CG12497|FBan0012497|CT33237|FBan0012497 last_updated:000321 > EG:BACR25B3.2 >Diff : A+ B+ >CG13758|FBan0013758|CT33238|FBan0013758 last_updated:000321 > EG:BACR25B3.3 \#>Diff : A+ B C+ >Diff : D \# \#Comments: \# \# A : differences in NH2- terminus are due to different prediction methods \# used. Although, in this case, EDGP followed the Genscan prediction \# (which should be similar to Genie that Celera used), it is possible \# that the two software programs are utilising different starting \# exons. There is no other evidence to favour one or the other \# version. \# B : differences on internal exons should be due to frameshift errors \# (most probably). \# C : differences in \-COOH terminus are due to different prediction methods \# used. Although, in this case, EDGP followed the Genscan prediction \# (which should be similar to Genie that Celera used), it is possible \# that the two software programs are utilising different end exons. In \# fact, the last exon that Genscan/EDGP predicts (not present in \# Genefinder prediction) is some 1.5 kb downstream of the previous one. \# There is no supportive evidence to favour one or the other prediction. \# \#CONCLUSION: Ambigious. 
\# >CG13759|FBan0013759|CT33239|FBan0013759 last_updated:000321 > EG:BACR25B3.5 >Diff : B+ >CG13760|FBan0013760|CT33241|FBan0013760 last_updated:000321 > EG:BACR25B3.6 >Diff : A-- >CG17437|FBan0017437|CT33240|FBan0017437 last_updated:000321 > EG:BACR25B3.7 >Diff : 0 >CG13761|FBan0013761|CT33242|FBan0013761 last_updated:000321 > EG:BACR7C10.4 >Diff : C+ >CG10742|FBan0010742|CT30103|FBan0010742 last_updated:000321 > EG:BACR7C10.6 >Diff : 0 (6/302) >CG9904|FBan0009904|CT27892|FBan0009904 last_updated:000321 > EG:BACR7C10.1 >Diff : 0 >CG13762|FBan0013762|CT33244|FBan0013762 last_updated:000321 > EG:BACR7C10.7 >Diff : B- >CG10260|FBan0010260|CT28483|FBan0010260 last_updated:000321 > EG:BACR7C10.2 >Diff : A+ B- B+ (over 2160 a.a.) >Diff : D \# \#Comments: \# \# A : differences in NH2- terminus are minor and involve different starting \# ATG selection (2 a.a. apart). \# B : the initial differences on internal exon should be due to different \# prediction algorithms used. EDGP follows Genefinder prediction, \# which in this case seems more trustworthy (Genscan predicts a tiny \# intron in the region). Both programs however, agree on the 'main \# body' of the prediction, i.e. region starting at 'ADKLCG...'. Thus, \# the absence of one region in CG10260 is unexplainable there. \# \#CONCLUSION: Ambigious or EG:BACR7C10.2 is more correct. \# >CG12498|FBan0012498|CT33246|FBan0012498 last_updated:000321 > EG:BACR43E12.1 >Diff : 0 >CG14416|FBan0014416|CT34073|FBan0014416 last_updated:000321 > EG:BACR43E12.7 >Diff : 0 >CG3588|FBan0003588|CT11992|FBan0003588 last_updated:000321 > EG:100G7.6 >Diff : A-- C+ >CG14424|FBan0014424|CT34081|FBan0014424 last_updated:000321 > EG:100G7.5 >Diff : 0 >CG7981|FBan0007981|CT23996|FBan0007981 last_updated:000321 > EG:BACR25B3.11 >Diff : D (identity starts at 3285 a.a. of CG) \# \#Comments: \# \# A : differences in NH2- terminus are due to different prediction methods \# used. 
EDGP follows the Genefinder prediction, whereas CG7981 seems to
#     be similar to the Genscan one. In this case the NH2- terminus of
#     CG7981 should be EG:BACR25B3.10 .
# C : differences in -COOH terminus are also due to different prediction
#     methods used. Genscan suggests a highly improbable 'long' gene
#     structure (big introns, small exons), probably derived from its
#     'human' training set. I don't know whether Genie (the program
#     primarily used by Celera) was trained on a similar data set.
#
#CONCLUSION: Ambiguous.
#
>CG7981|FBan0007981|CT23996|FBan0007981 last_updated:000321
> EG:BACR25B3.10
>Diff : D (identity starts at 2448 a.a. of CG)
#
>CG7981|FBan0007981|CT23996|FBan0007981 last_updated:000321
> EG:BACR25B3.1
>Diff : A++ C-
#
#==================================================================
#GENERAL Comments for EG:BACR25B3.1 , EG:BACR25B3.10 , EG:BACR25B3.11
#==================================================================
#
#This was one of the most difficult regions to annotate (PVB did it during
#his visit to Cambridge, Feb. 2000). It seems to have undergone
#multiple duplications. Genscan predicts one very, very long gene
#there (biased by its human training set, I suppose); so does Celera
#(based on Genie; from which organism was Genie's training set derived?).
#This is 'the longest Drosophila gene' that is described in the paper.
#PVB has split the gene into three 'smaller' ones and XDrosDB comments say
#that 'TVB-990205: i'm not sure about this gene; ...'.
#However, based on similarity matches (considering the ranges of the hits),
#we can safely say that 'the longest Drosophila gene' does NOT exist (not
#as such, anyway).
#
#CONCLUSION: EG:BACR25B3.1 , EG:BACR25B3.10 , EG:BACR25B3.11 should be more correct.
#
>CG12467|FBan0012467|CT32681|FBan0012467 last_updated:000321
> EG:34F3.1
#>Diff : A++ C
>Diff : D (CG extends 800 a.a. 
further down)
#
#Comments:
#
# A : differences in NH2- terminus are due to different selection of
#     starting ATG. CG12467 chooses an in-frame ATG 39 a.a. downstream of
#     the one of EG:34F3.1 , for no obvious reason.
# C : differences in -COOH terminus are probably due to different
#     prediction methods used. EDGP follows the Genefinder prediction;
#     Celera follows Genie's. I expect that the 'missing' 800
#     a.a. are part of the EG:34F3.2 gene (which is downstream of EG:34F3.1 ).
#
#Mel_Comments:
#
#Takis, you suggested that EG:34F3.1 and 34F3.2 equal CG12467 and I agree.
#
#34F3.1 starts at +40 of the CG gene
#       it has a perfect match until aa position 524
#
#34F3.2 starts at +525 of the CG gene
#       one gap is present and the C terminus is messed up.
#
#You have suggested that the last two exons of 34F3.2 should form another
#CG gene but I have been unable to find another matching CG gene for
#their aa sequence.
#
#CONCLUSION: Ambiguous or EG:34F3.1 is more correct.
#
> EG:87B1.5 |FBgn0004861_ph-p
> EG:BACN25G24.3 |FBgn0004860_ph-d
>Diff : D
> ##### WARNING Known gene! #####
#
#Comments:
#
# It seems that Celera has a mis-assembly in the genomic sequence,
# resulting in an incorrect model. The mis-assembly is probably due to a
# genomic duplication that resulted in ph-p and ph-d.
#
>CG4857|FBan0004857|CT15601|FBan0004857 last_updated:000321
> EG:EG0007.12 ; CAA21828.1
>Diff : D
#
#Comments:
#
# EG:EG0007.12 is a mere Genefinder prediction, although with a very high
#score (137.81, when our cutoff is 50 and BDGP's cutoff was 20). ESTs
#support practically the 2nd and 3rd exon; but there is no other evidence
#to favour EDGP's or CG's version.
#
#CONCLUSION: Ambiguous.
#
>CG4857|FBan0004857|CT15601|FBan0004857 last_updated:000321
> EG:EG0007.4 ; CAA21827.1
#>Diff : B- C
>Diff : D (CG extends 1000 a.a. 
further down)
#
#Comments:
#
# C : differences in -COOH terminus are probably due to different
#     prediction methods used. EDGP follows the Genefinder prediction;
#     Celera follows Genie's.
#
#CONCLUSION: Ambiguous.
#
>CG2766|FBan0002766|CT9405|FBan0002766 last_updated:000321
> EG:BACN33B1.2
#>Diff : A+
>Diff : D
#
#Comments:
#The first 61 a.a. of EG:BACN33B1.2 match the first 61 a.a. of
#CG2766, whereas the rest of the gene matches CG2716.
#
#I checked the genomic structure and all data. Gene EG:BACN33B1.2 is a
#mere Genscan prediction. Genefinder predicts two genes (probably the
#equivalent of CG2766 and CG2716). However, EST AI512291, which covers the
#first 4 exons of the gene, supports EG:BACN33B1.2 over CG2766 + CG2716.
#
#CONCLUSION: EG:BACN33B1.2 is more correct.
#
>CG2716|FBan0002716|CT9237|FBan0002716 last_updated:000321
> EG:BACN33B1.2
>Diff : D
#
#Comments:
#
#The first 61 a.a. of EG:BACN33B1.2 match exactly the corresponding
#CG2766 region. The whole gene matches CT9237 (from CT9237:140 onwards).
# EG:BACN33B1.2 is a mere Genscan prediction supported by multiple ESTs on
#all but its last exon.
#I don't know where CG2716 comes from or why CG2716 is different from
#CT9237.
#
#CONCLUSION: Ambiguous or EG:BACN33B1.2 is more correct.
#
#General Comments:
#Based on EST hit(s), genes CG2766 + CG2716 should be concatenated into
# EG:BACN33B1.2 .
#
>CG3713|FBan0003713|CT12463|FBan0003713 last_updated:000321
> EG:BACR7A4.2
>Diff : 0 (suspicious... only 83 a.a. 
long)
>CG4015|FBan0004015|CT41858|FBan0004015 last_updated:000321
> EG:140G11.3
>Diff : 0
>CG3526|FBan0003526|CT11880|FBan0003526 last_updated:000321
> EG:BACR43E12.4
>Diff : A++
>CG3939|FBan0003939|CT13113|FBan0003939 last_updated:000321
> EG:140G11.5
>Diff : A+
>CG14265|FBan0014265|CT33887|FBan0014265 last_updated:000321
> EG:96G10.8
>Diff : 0
>CG18166|FBan0018166|CT40990|FBan0018166 last_updated:000321
> EG:171D11.6
>Diff : D
#
#Comments:
#
#Completely different gene structure.
# EG:171D11.6 follows the 'consensus' prediction of Genefinder and Genscan
#(practically identical except the last exon(s)); and is supported by
#multiple ESTs.
#One thing that might have misled the Celera annotators is that this region/gene
#seems to be duplicated (as shown by the multiple hits of some of the ESTs,
#e.g. AA438842).
#
#CONCLUSION: EG:171D11.6 is more correct.
#
>CG18273|FBan0018273|CT41446|FBan0018273 last_updated:000321
> EG:171D11.6
>Diff : B+ C++
#
#General Comments:
#Both genes CG18166 & CG18273 match gene EG:171D11.6 , but in a 'weird'
#way. EG:171D11.6 follows both Genefinder and Genscan predictions, as well
#as the corresponding EST matches. If this is not a case of mis-assembly, I
#don't know how to explain it...
#
#Mel_Comments:
#The two clustal analyses of this gene which you looked at were incorrect.
#I can only assume that Michael tried clustals on these two CG genes,
#CG18166 and CG18273, before he found the correct matching CG gene, which is
#CG13372.
#
#This is matched perfectly until aa position 1006, gapped for 50 aa and
#finally the CG gene misses the last 250 aa of the EG gene.
#
#CONCLUSION: EG:171D11.6 is more correct.
#
>CG18503|FBan0018503|CT42214|FBan0018503 last_updated:000321
> EG:171D11.3_altrn_1 |FBgn0004648;svr /alternative
>Diff : D
#
#Comments:
#This is the ALTERNATIVE transcript/protein, based on EST and protein
#hits. The two transcripts differ on their 5' exon(s). 
CG18503 attaches
#the 5'-most exon in a completely different gene.
#PVB believes that EG:171D11.3_altrn_1 is the correct gene structure for
#the alternatively spliced EG:171D11.3 gene.
#
#CONCLUSION: EG:171D11.3_altrn_1 (svr alternative) is correct.
#
>CG14624|FBan0014624|CT34381|FBan0014624 last_updated:000321
> EG:BACR42I17.9
#>Diff : A++
>Diff : D (CG extends 460 a.a. further up)
#
#Comments:
#
# A : differences in NH2- terminus are due to different prediction methods
#     used. EDGP follows the Genefinder prediction. Probably, the first
#     460 a.a. are part of the EG:BACR42I17.8 gene.
#
#Mel_Comments:
#
#There are two clustals of this gene.
#The clustal of 42I17.9 with CG11381 begins at +25 of the CG gene - if you
#look closely you notice that the first 6 aa of the EG gene match the
#beginning of the CG gene. This then has a perfect match until position 446
#and the CG gene misses the next ~260 aa.
#In the clustal of 42I17.9 with CG14624 the CG gene begins at +463 and is
#matched perfectly until the end.
#
#CONCLUSION: Ambiguous.
#
>CG11381|FBan0011381|CT31772|FBan0011381 last_updated:000321
> EG:BACR42I17.9
>Diff : A-- C+
>CG18089|FBan0018089|CT40590|FBan0018089 last_updated:000321
> EG:100G7.1 |FBgn0014096;anon-3Ca
>Diff : 0 (suspicious... only 56 a.a. long)
>CG2945|FBan0002945|CT9961|FBan0002945 last_updated:000321
> EG:BACR37P7.3 |FBgn0000316;cin
>Diff : 0
>CG3114|FBan0003114|CT10322|FBan0003114 last_updated:000321
> EG:BACR37P7.7 |FBgn0005427;ewg
>Diff : B+
>CG12470|FBan0012470|CT32706|FBan0012470 last_updated:000321
> EG:BACR37P7.5
>Diff : 0
> ##### WARNING Can't find translation for EG:BACR37P7.5 ! #####
#
#Comments:
#Translation found and is 100% identical to CG12470.
\# >CG3777|FBan0003777|CT12604|FBan0003777 last_updated:000321 > EG:125H10.1 >Diff : 0 >CG3757|FBan0003757|CT12485|FBan0003757 last_updated:000321 > EG:125H10.2 |FBgn0004034;y >Diff : 0 >CG3796|FBan0003796|CT12661|FBan0003796 last_updated:000321 > EG:125H10.3 |FBgn0000022;ac >Diff : 0 >CG3827|FBan0003827|CT12777|FBan0003827 last_updated:000321 > EG:198A6.1 |FBgn0004170;sc >Diff : 0 >CG3839|FBan0003839|CT12815|FBan0003839 last_updated:000321 > EG:198A6.2 |FBgn0002561;l(1)sc >Diff : 0 >CG13374|FBan0013374|CT32705|FBan0013374 last_updated:000321 > EG:EG0001.1 |FBgn0011822;pcl >Diff : 0 >CG3258|FBan0003258|CT10959|FBan0003258 last_updated:000321 > EG:165H7.2 |FBgn0000137;ase >Diff : 0 >CG3972|FBan0003972|CT13187|FBan0003972 last_updated:000321 > EG:165H7.1 |FBgn0011757;ASC-T1 >Diff : 0 >CG3923|FBan0003923|CT13059|FBan0003923 last_updated:000321 > EG:165H7.3 \#>Diff : A+ C+ (probably different selection of starting exon) >Diff : D+ \# \#Comments: \# \#Only a.a. EG:165H7.3:40-190 match CG3923. \# EG:165H7.3 is the 'consensus' prediction of Genefinder and Genscan (aggree \#on all but one exon). I had added one more NH2- exon, following multiple \#EST hits (e.g. AA439864). \#I don't know where the rest of CG3923 comes from or why Celera missed this \#gene. \# \#CONCLUSION: EG:165H7.3 is correct. 
\# >CG13372|FBan0013372|CT32701|FBan0013372 last_updated:000321 > EG:171D11.6 >Diff : B+ C+ >CG3156|FBan0003156|CT10534|FBan0003156 last_updated:000321 > EG:171D11.2 >Diff : A- >CG17896|FBan0017896|CT39849|FBan0017896 last_updated:000321 > EG:171D11.1 >Diff : 0 >CG17896|FBan0017896|CT39849|FBan0017896 last_updated:000321 > EG:171D11.1_altrn_1 /alternative >Diff : A- >CG17778|FBan0017778|CT13856|FBan0017778 last_updated:000321 > EG:171D11.5 >Diff : 0 >CG4122|FBan0004122|CT41466|FBan0004122 last_updated:000321 > EG:171D11.3 |FBgn0004648;svr \#>Diff : A+ B+ D >Diff : D \# \#Comments: \# \#Gene EG:171D11.3 is KNOWN (FBgn0004648;svr) and is (or should be) \#identical to U29591; apart from the last exon, where the original cloning \#should had contained a frameshift. \#I don't know why CG4122 doesn't match. \# \#CONCLUSION: EG:171D11.3 is correct. \# >CG18503|FBan0018503|CT42214|FBan0018503 last_updated:000321 > EG:171D11.3_altrn_1 |FBgn0004648;svr /alternative, mismatched >Diff : D \# \#General Comments: \#Only the first 152 a.a. of EG:171D11.3a match the CG18503. \# \#This is the alternative svr gene, which corresponds to U29592. \# \#CONCLUSION: EG:171D11.3a is correct. \# >CG4262|FBan0004262|CT13920|FBan0004262 last_updated:000321 > EG:65F1.2 |FBgn0000570;elav >Diff : 0 >CG4293|FBan0004293|CT14025|FBan0004293 last_updated:000321 > EG:65F1.1 >Diff : 0 >CG7727|FBan0007727|CT23451|FBan0007727 last_updated:000321 > EG:65F1.5 |FBgn0000108;Appl >Diff : A+ >CG6172|FBan0006172|CT19338|FBan0006172 last_updated:000321 > EG:118B3.1 |FBgn0003986;vnd >Diff : 0 >CG13366|FBan0013366|CT32693|FBan0013366 last_updated:000321 > EG:118B3.2 >Diff : 0 >CG17828|FBan0017828|CT39571|FBan0017828 last_updated:000321 > EG:115C2.5 >Diff : 0 >CG13369|FBan0013369|CT32698|FBan0013369 last_updated:000321 > EG:115C2.1 >Diff : 0 >CG18451|FBan0018451|CT32691|FBan0018451 last_updated:000321 > EG:115C2.12 >Diff : 0 \# \#WARNING: Only 64 a.a. long!? 
>CG7622|FBan0007622|CT23257|FBan0007622 last_updated:000321 > EG:115C2.7 |FBgn0002579;RpL36 >Diff : 0 >CG6189|FBan0006189|CT19398|FBan0006189 last_updated:000321 > EG:115C2.2 |FBgn0001341;l(1)1Bi >Diff : 0 >CG7486|FBan0007486|CT22949|FBan0007486 last_updated:000321 > EG:115C2.9 |FBgn0020381;Dredd >Diff : B- >CG6222|FBan0006222|CT19476|FBan0006222 last_updated:000321 > EG:115C2.3 |FBgn0003575;su(s) >Diff : 0 >CG13367|FBan0013367|CT32696|FBan0013367 last_updated:000321 > EG:115C2.8 >Diff : A- >CG16982|FBan0016982|CT32695|FBan0016982 last_updated:000321 > EG:115C2.11 >Diff : B+ >CG13363|FBan0013363|CT32690|FBan0013363 last_updated:000321 > EG:115C2.10 >Diff : C- >CG16983|FBan0016983|CT32694|FBan0016983 last_updated:000321 > EG:115C2.4 >Diff : 0 >CG5227|FBan0005227|CT16627|FBan0005227 last_updated:000321 > EG:BACR19J1.1 |FBgn0021764;sdk >Diff : B+ >CG7434|FBan0007434|CT22861|FBan0007434 last_updated:000321 > EG:BACR19J1.4 |FBgn0015288;RpL22 >Diff : 0 >CG7359|FBan0007359|CT22657|FBan0007359 last_updated:000321 > EG:34F3.8 >Diff : 0 >CG13358|FBan0013358|CT32683|FBan0013358 last_updated:000321 > EG:34F3.1059 >Diff : A+ >CG13359|FBan0013359|CT32684|FBan0013359 last_updated:000321 > EG:34F3.9 >Diff : B- >CG7413|FBan0007413|CT22763|FBan0007413 last_updated:000321 > EG:34F3.3 |FBgn0015799;Rbf >Diff : 0 >CG16989|FBan0016989|CT32688|FBan0016989 last_updated:000321 > EG:34F3.4 >Diff : 0 >CG13360|FBan0013360|CT32685|FBan0013360 last_updated:000321 > EG:34F3.5 >Diff : 0 >CG12311|FBan0012311|CT21087|FBan0012311 last_updated:000321 > EG:34F3.7 >Diff : A- C- >CG3658|FBan0003658|CT12303|FBan0003658 last_updated:000321 > EG:BACR7A4.11 |FBgn0026143;CDC45L >Diff : 0 >CG3019|FBan0003019|CT10162|FBan0003019 last_updated:000321 > EG:BACR7A4.10 |FBgn0003638; su(wa) (/partial; this is the 5' terminus) >Diff : B- \#### WARNING: check the 3' terminus too.... 
\# identity up to EG:BACR7A4.10 1-224 >CG3638|FBan0003638|CT12157|FBan0003638 last_updated:000321 > EG:33C11.3 >Diff : A+ C-- >CG11403|FBan0011403|CT31837|FBan0011403 last_updated:000321 > EG:33C11.2 >Diff : 0 >CG11405|FBan0011405|CT31841|FBan0011405 last_updated:000321 > EG:33C11.1 >Diff : 0 >CG11408|FBan0011408|CT31847|FBan0011408 last_updated:000321 > EG:114D9.1 >Diff : 0 >CG14622|FBan0014622|CT34379|FBan0014622 last_updated:000321 > EG:114D9.2 >Diff : A-- >CG11411|FBan0011411|CT31859|FBan0011411 last_updated:000321 > EG:8D8.1 >Diff : 0 >CG11409|FBan0011409|CT31851|FBan0011409 last_updated:000321 > EG:8D8.2 >Diff : 0 >CG11412|FBan0011412|CT31863|FBan0011412 last_updated:000321 > EG:8D8.6 >Diff : B+ >CG11418|FBan0011418|CT31875|FBan0011418 last_updated:000321 > EG:8D8.8 >Diff : 0 >CG11415|FBan0011415|CT31869|FBan0011415 last_updated:000321 > EG:8D8.7 >Diff : 0 >CG12773|FBan0012773|CT31881|FBan0012773 last_updated:000321 > EG:8D8.3 >Diff : 0 > EG:8D8.4 >CG11417|FBan0011417|CT31873|FBan0011417 last_updated:000321 >Diff : 0 >CG11420|FBan0011420|CT31887|FBan0011420 last_updated:000321 > EG:8D8.5 >Diff : 0 >CG3056|FBan0003056|CT10284|FBan0003056 last_updated:000321 > EG:132E8.1 >Diff : 0 >CG3064|FBan0003064|CT10298|FBan0003064 last_updated:000321 > EG:49E4.1 \#>Diff : A- B- C+++ >Diff : D \# \#Comments: \# \#Gene EG:49E4.1 is a mere Genefinder prediction with very high score \#(404.95). Genscan prediction is practically the same, except it has a \#couple of additional NH2- terminus exons. Exons 3, 4, 5, 6, 8 and 9 are \#supported further by protein similarity hits (e.g. MAPB_HUMAN). \#I don't know where CG3064 comes from. \# \#CONCLUSION: EG:49E4.1 is more correct. 
\# >CG14785|FBan0014785|CT34595|FBan0014785 last_updated:000321 > EG:BACN32G11.2 >Diff : 0 >CG14786|FBan0014786|CT34596|FBan0014786 last_updated:000321 > EG:BACN32G11.3 >Diff : 0 >CG14787|FBan0014787|CT34597|FBan0014787 last_updated:000321 > EG:BACN32G11.4 >Diff : A- >CG14788|FBan0014788|CT34598|FBan0014788 last_updated:000321 > EG:BACN32G11.5 >Diff : 0 >CG14789|FBan0014789|CT34599|FBan0014789 last_updated:000321 > EG:BACN32G11.6 >Diff : A- >CG14777|FBan0014777|CT34587|FBan0014777 last_updated:000321 > EG:80H7.10 >Diff : 0 \##### WARNING: No corresponding CG to EG:80H7.1 \! \##### WARNING: Can't find translation of EG:80H7.2 \! >CG14780|FBan0014780|CT34590|FBan0014780 last_updated:000321 > EG:80H7.3 >Diff : 0 >CG14791|FBan0014791|CT34601|FBan0014791 last_updated:000321 > EG:80H7.4 >Diff : B- >CG14781|FBan0014781|CT34591|FBan0014781 last_updated:000321 > EG:80H7.11 >Diff : B+ >CG14782|FBan0014782|CT34592|FBan0014782 last_updated:000321 > EG:80H7.5 >Diff : 0 >CG14792|FBan0014792|CT34602|FBan0014792 last_updated:000321 > EG:80H7.6 |FBgn0003517;sta \#>Diff : A- B C+ >Diff : A- B C+ >CG14793|FBan0014793|CT34603|FBan0014793 last_updated:000321 > EG:80H7.7 >Diff : D \# \#Comments: \# \#Differences in COOH- terminus are probably due to different prediction \#methods used. In this case, EG:80H7.7 follows the Genefinder prediction \#and protein similarity hits. \#Genscan prediction extends downstream to include two more exons, which in \#fact correspond to gene sta (or EG:80H7.6 ). \#If this is true, the \-COOH terminus of CG14793 should align with 80H7.6. \# \#CONCLUSION: EG:80H7.7 is more correct. 
#
>CG14795|FBan0014795|CT34605|FBan0014795 last_updated:000321
> EG:196F3.3
>Diff : A+
>CG14783|FBan0014783|CT34593|FBan0014783 last_updated:000321
> EG:196F3.2
>Diff : C+
>CG14796|FBan0014796|CT34607|FBan0014796 last_updated:000321
> EG:56G7.1
>Diff : 0
>CG11491|FBan0011491|CT36317|FBan0011491 last_updated:000321
>CG11509|FBan0011509|CT36379|FBan0011509 last_updated:000321
>CG11511|FBan0011511|CT34608|FBan0011511 last_updated:000321
>CG11514|FBan0011514|CT36387|FBan0011514 last_updated:000321
> EG:17A9.1 |FBgn0000210;br
>Diff : B--
##### Note: There are alternatively spliced products for br.
>CG3093|FBan0003093|CT10396|FBan0003093 last_updated:000321
> EG:171E4.1 |FBgn0000482;dor
>Diff : 0
>CG3740|FBan0003740|CT12509|FBan0003740 last_updated:000321
> EG:171E4.4
>Diff : D
#
#Comments:
#
#Only the first 55 a.a. of the two genes match.
#
# EG:171E4.4 is a small two-exon gene, located between EG:171E4.1 (dor) and
# EG:171E4.2 (on their opposite strand). It is a modified prediction of
#both Genefinder and Genscan programs; but both give a relatively low score.
#The reason it was reported is that it matches EST AA142192 (in the correct
#orientation).
#I don't know where the -COOH terminus of gene CG3740 is located, though.
#
#CONCLUSION: EG:171E4.4 is more correct.
#
>CG3095|FBan0003095|CT10406|FBan0003095 last_updated:000321
> EG:171E4.2
>Diff : A+ C+
>CG3737|FBan0003737|CT12505|FBan0003737 last_updated:000321
> EG:171E4.3
>Diff : 0
>CG3100|FBan0003100|CT10412|FBan0003100 last_updated:000321
> EG:9D2.1 |FBgn0024897;b6
>Diff : 0
>CG3783|FBan0003783|CT12641|FBan0003783 last_updated:000321
> EG:9D2.2
>Diff : D
##### Note: Only one exon of EG:9D2.2 matches CG3783!
#
#Comments:
#
# EG:9D2.2 is the intact Genscan prediction. The first 4 exons match the
#genomic sequence for gene a6 (Y16065), but I decided to leave them in
#because: (a) the ORF of the a6 gene is 700-800 bp upstream and (b) exon 4 matches
#EST AI403894, which further extends to exons 5 and 6.
\#Maybe Celera didn't include the first 4 exons, because of their overlap \#with Y16065 \# \#CONCLUSION: EG:9D2.2 is more correct. \# >CG3771|FBan0003771|CT12608|FBan0003771 last_updated:000321 > EG:9D2.3 |FBgn0023130;a6 >Diff : C- >CG3795|FBan0003795|CT12705|FBan0003795 last_updated:000321 > EG:9D2.4 >Diff : 0 >CG14808|FBan0014808|CT34621|FBan0014808 last_updated:000321 > EG:4F1.1 >Diff : 0 >CG12598|FBan0012598|CT34611|FBan0012598 last_updated:000321 > EG:BACN35H14.1 >Diff : A+ >CG17968|FBan0017968|CT40057|FBan0017968 last_updated:000321 > EG:137E7.1 >Diff : 0 \##### Note: Only 64 a.a. long! >CG14801|FBan0014801|CT34614|FBan0014801 last_updated:000321 > EG:131F2.2 >Diff : A- >CG14812|FBan0014812|CT34625|FBan0014812 last_updated:000321 > EG:131F2.3 >Diff : 0 >CG14814|FBan0014814|CT34627|FBan0014814 last_updated:000321 > EG:63B12.6 >Diff : A- >CG14802|FBan0014802|CT34615|FBan0014802 last_updated:000321 > EG:63B12.13 >Diff : 0 >CG14815|FBan0014815|CT34628|FBan0014815 last_updated:000321 > EG:63B12.5 >Diff : 0 >CG14803|FBan0014803|CT34616|FBan0014803 last_updated:000321 > EG:63B12.9 >Diff : B+ >CG14816|FBan0014816|CT34629|FBan0014816 last_updated:000321 > EG:63B12.4 >Diff : 0 >CG14804|FBan0014804|CT34617|FBan0014804 last_updated:000321 > EG:63B12.8 >Diff : 0 >CG14817|FBan0014817|CT34630|FBan0014817 last_updated:000321 > EG:63B12.11 >Diff : 0 >CG14805|FBan0014805|CT34618|FBan0014805 last_updated:000321 > EG:63B12.7 >Diff : B+ >CG14818|FBan0014818|CT34631|FBan0014818 last_updated:000321 > EG:63B12.12 >Diff : 0 >CG3848|FBan0003848|CT12853|FBan0003848 last_updated:000321 > EG:63B12.3 >Diff : B++ >CG3109|FBan0003109|CT10436|FBan0003109 last_updated:000321 > EG:63B12.2 >Diff : B+ >CG11579|FBan0011579|CT12773|FBan0011579 last_updated:000321 > EG:86E4.6 |FBgn0000117;arm >Diff : A+ >CG3810|FBan0003810|CT36539|FBan0003810 last_updated:000321 > EG:86E4.2 >Diff : C+ >CG17766|FBan0017766|CT39345|FBan0017766 last_updated:000321 > EG:86E4.3 >Diff : A- 
>CG3480|FBan0003480|CT11715|FBan0003480 last_updated:000321 > EG:86E4.4 >Diff : 0 >CG3806|FBan0003806|CT12735|FBan0003806 last_updated:000321 > EG:86E4.1 >Diff : 0 >CG3573|FBan0003573|CT11908|FBan0003573 last_updated:000321 > EG:86E4.5 >Diff : 0 >CG11596|FBan0011596|CT12035|FBan0011596 last_updated:000321 > EG:39E1.1 >Diff : 0 >CG3857|FBan0003857|CT12883|FBan0003857 last_updated:000321 > EG:39E1.3 >Diff : 0 >CG3587|FBan0003587|CT12059|FBan0003587 last_updated:000321 > EG:39E1.2 >Diff : 0 >CG3600|FBan0003600|CT12107|FBan0003600 last_updated:000321 > EG:BACH61I5.1 >Diff : 0 >CG16902|FBan0016902|CT37504|FBan0016902 last_updated:000321 > EG:133E12.2 \#>Diff : B B- B+ C++ >Diff : D \# \#Comments: \# \#Gene EG:133E12.2 is a 'consensus' prediction of both Genefinder and Genscan \#programs. \# B : So, differences on internal exons should be due to the different \# prediction algorithms used. \# C : I had decided not to include the last exons (that both Genefinder and \# Genscan predict), because this was indicated by the protein similarity \# hits (e.g. O161228, THR4 gene). \# \#CONCLUSION: EG:133E12.2 is more correct.
\# >CG4406|FBan0004406|CT14360|FBan0004406 last_updated:000321 > EG:133E12.3 >Diff : A+ >CG4399|FBan0004399|CT14236|FBan0004399 last_updated:000321 > EG:133E12.4 >Diff : 0 >CG4376|FBan0004376|CT14163|FBan0004376 last_updated:000321 > EG:133E12.1 |FBgn0000667;Actn >Diff : 0 >CG4380|FBan0004380|CT14272|FBan0004380 last_updated:000321 > EG:22E5.1 |FBgn0003964;usp >Diff : 0 >CG4325|FBan0004325|CT14153|FBan0004325 last_updated:000321 > EG:22E5.12 >Diff : 0 >CG4322|FBan0004322|CT14137|FBan0004322 last_updated:000321 > EG:22E5.11 >Diff : C+ >CG4313|FBan0004313|CT14076|FBan0004313 last_updated:000321 > EG:22E5.10 >Diff : 0 >CG4290|FBan0004290|CT14053|FBan0004290 last_updated:000321 > EG:22E5.8 >Diff : 0 >CG4281|FBan0004281|CT14023|FBan0004281 last_updated:000321 > EG:22E5.7 \#>Diff : A- B- C+ >Diff : D \# \#Comments: \# \# EG:22E5.7 is mainly a Genefinder prediction apart from the first exon. \#Both Genefinder and Genscan agree on the prediction of the first 3-4 \#exons, but Genscan extends the gene further down (in the area of \# EG:22E5.8 ). \# \# A : I had decided to 'truncate' the first exon (which appeared in both \# predictions), because this was suggested by EST hits \# (e.g. AI133907). Thus, I adopted the first ATG found in this EST, as \# the starting ATG for gene EG:22E5.7 . \# CG4281 probably follows the Genscan prediction. \# B : I cannot explain the differences in the internal exon, since both \# Genefinder and Genscan predictions agree on them. \# C : If Celera followed the Genscan (or Genie) prediction 'blindly', then \# the \-COOH of CG4281 should match part of EG:22E5.8 . \# I believe EG:22E5.8 is a different gene, because of EST hits \# (e.g. AI108139) and protein similarity hits of different type. \# \#CONCLUSION: EG:22E5.7 is more correct.
\# >CG4199|FBan0004199|CT13610|FBan0004199 last_updated:000321 > EG:22E5.5 >Diff : A+ >CG4194|FBan0004194|CT13604|FBan0004194 last_updated:000321 > EG:22E5.6 >Diff : 0 >CG4061|FBan0004061|CT13484|FBan0004061 last_updated:000321 > EG:22E5.3 >Diff : 0 >CG4045|FBan0004045|CT13392|FBan0004045 last_updated:000321 > EG:22E5.4 >Diff : C+ >CG4025|FBan0004025|CT13374|FBan0004025 last_updated:000321 > EG:22E5.9 >Diff : 0 >CG3835|FBan0003835|CT42116|FBan0003835 last_updated:000321 > EG:87B1.3 >Diff : 0 >CG3724|FBan0003724|CT12475|FBan0003724 last_updated:000321 > EG:87B1.4 |FBgn0004654;Pgd >Diff : 0 >CG3717|FBan0003717|CT12461|FBan0003717 last_updated:000321 > EG:87B1.6 |FBgn0013432;bcn92 >Diff : 0 >CG3707|FBan0003707|CT42114|FBan0003707 last_updated:000321 > EG:87B1.2 |FBgn0004655;wapl >Diff : A+ \##### WARNING: Known gene \! >CG3656|FBan0003656|CT12233|FBan0003656 last_updated:000321 > EG:87B1.1 |FBgn0005670;Cyp4d1 >Diff : 0 >CG3630|FBan0003630|CT12189|FBan0003630 last_updated:000321 > EG:152A3.3 >Diff : 0 >CG3621|FBan0003621|CT12175|FBan0003621 last_updated:000321 > EG:152A3.7 >Diff : 0 >CG3466|FBan0003466|CT11675|FBan0003466 last_updated:000321 > EG:152A3.4 |FBgn0011576;Cyp4d2 >Diff : A- \##### WARNING: Known gene \!? 
>CG3461|FBan0003461|CT11665|FBan0003461 last_updated:000321 > EG:152A3.5 |FBgn0003116;pn >Diff : 0 >CG3460|FBan0003460|CT11655|FBan0003460 last_updated:000321 > EG:152A3.1 >Diff : 0 >CG3457|FBan0003457|CT11649|FBan0003457 last_updated:000321 > EG:17E2.1 >Diff : B- >CG3456|FBan0003456|CT11643|FBan0003456 last_updated:000321 > EG:103B4.3 >Diff : A- >CG18033|FBan0018033|CT40358|FBan0018033 last_updated:000321 > EG:103B4.4 >Diff : 0 >CG3299|FBan0003299|CT11081|FBan0003299 last_updated:000321 > EG:103B4.1 |FBgn0004397;Vinc >Diff : 0 >CG3443|FBan0003443|CT40368|FBan0003443 last_updated:000321 > EG:30B8.4 |FBgn0003048;pcx >Diff : B-- >CG3228|FBan0003228|CT10831|FBan0003228 last_updated:000321 > EG:30B8.2 |FBgn0001330;kz >Diff : 0 >CG3206|FBan0003206|CT10765|FBan0003206 last_updated:000321 > EG:30B8.7 >Diff : C >CG3193|FBan0003193|CT10256|FBan0003193 last_updated:000321 > EG:30B8.1 |FBgn0000377;crn >Diff : 0 >CG3191|FBan0003191|CT10198|FBan0003191 last_updated:000321 > EG:30B8.3 >Diff : 0 >CG3078|FBan0003078|CT9973|FBan0003078 last_updated:000321 > EG:30B8.6 >Diff : D \# \#Comments: \# \# EG:30B8.6 is the intact prediction of both Genefinder and Genscan with \#good scores. The first two exons are supported by EST hits and exon-2 has \#also some protein hits (not very strong though). \#CG3078 matches EG:30B8.6 only in the first 184 a.a. \#I don't know what happened with the rest of the prediction. \# \#CONCLUSION: Ambiguous or EG:30B8.6 is more correct. \# >CG3071|FBan0003071|CT41361|FBan0003071 last_updated:000321 > EG:25E8.3 >Diff : B+ >CG2924|FBan0002924|CT42120|FBan0002924 last_updated:000321 > EG:25E8.2 >Diff : A+ C- >CG2918|FBan0002918|CT9894|FBan0002918 last_updated:000321 > EG:25E8.1 >Diff : 0 >CG2879|FBan0002879|CT9868|FBan0002879 last_updated:000321 > EG:25E8.6 >Diff : D \##### Note: EG:25E8.6:235-283 matches CG2879:1-49 . \# \#Comments: \# \# EG:25E8.6 is a small gene predicted by both Genefinder and Genscan with \#marginally good scores.
It contains small repeats, thus I am not sure \#about its true existence. \#There is no other supporting evidence for this gene (the protein \#similarity hits are not convincing, as it is written in XDrosDB). \#The start of CG2879 matches the end of EG:25E8.6 . If CG2879 extends \#further down, it should match gene EG:25E8.1 . \# \#CONCLUSION: Ambiguous. \# >CG2865|FBan0002865|CT9798|FBan0002865 last_updated:000321 > EG:25E8.4 >Diff : 0 >CG2845|FBan0002845|CT9736|FBan0002845 last_updated:000321 > EG:BACH48C10.3 |FBgn0003079;phl >Diff : B+ \##### WARNING: Known gene \! >CG7952|FBan0007952|CT23972|FBan0007952 last_updated:000321 > EG:BACH7M4.5 >Diff : 0 >CG7925|FBan0007925|CT23896|FBan0007925 last_updated:000321 > EG:BACH59J11.1 |FBgn0003714;tko >Diff : 0 >CG7803|FBan0007803|CT23694|FBan0007803 last_updated:000321 > EG:BACH59J11.3 |FBgn0004050;z >Diff : 0 >CG9659|FBan0009659|CT27300|FBan0009659 last_updated:000321 > EG:BACR25B3.8 |FBgn0001404;egh >Diff : 0 >CG8590|FBan0008590|CT24639|FBan0008590 last_updated:000321 > EG:BACR25B3.9 >Diff : 0 >CG9900|FBan0009900|CT25082|FBan0009900 last_updated:000321 > EG:BACR7C10.3 |FBgn0004643;mit(1)15 >Diff : 0 \##### NOTE: 100% match on the first 260 a.a. only! \##### WARNING: Known gene \! \# \#Comments: \#This was a partial version of the gene (split between two clones). The \#complete version was sent to Melanie on 4-Sep-2000. Comparison with CG9900 \#showed (almost) 100% a.a. identity over the whole 721 a.a. \# >CG2621|FBan0002621|CT8875|FBan0002621 last_updated:000321 > EG:155E2.3 |FBgn0003371;sgg (/partial; this is the 3' terminus) >Diff : A+ \# \#Comments: \#This was a partial version of the gene. The complete version was sent to \#Melanie on 4-Sep-2000. Comparison with CG9900 showed (almost) 100% \#a.a. identity over the whole 721 a.a. \# \#After checking, I found that EG:155E2.3 is practically identical to Q27605 \#'PROTEIN KINASE SHAGGY, SGG46 ISOFORM (EC 2.7.1.-) (PROTEIN ZESTE-WHITE 3) \#(SGG46).'
\# >CG2655|FBan0002655|CT8939|FBan0002655 last_updated:000321 > EG:155E2.2 |FBgn0011276;HLH3B >Diff : 0 >CG2652|FBan0002652|CT8935|FBan0002652 last_updated:000321 > EG:155E2.5 >Diff : 0 >CG2647|FBan0002647|CT8963|FBan0002647 last_updated:000321 > EG:155E2.4 |FBgn0003068;per >Diff : A- B+ \##### WARNING: Known gene \! \# \#Comments: \# \#There are four alternatively spliced per genes reported by EDGP. \# \#CONCLUSION: No differences really. \# >CG2650|FBan0002650|CT8975|FBan0002650 last_updated:000321 > EG:155E2.1 |FBgn0000092;anon-3B1.2 >Diff : B- >CG2658|FBan0002658|CT8999|FBan0002658 last_updated:000321 > EG:100G10.7 >Diff : 0 >CG2662|FBan0002662|CT9011|FBan0002662 last_updated:000321 > EG:100G10.6 >Diff : 0 >CG2675|FBan0002675|CT9063|FBan0002675 last_updated:000321 > EG:100G10.5 >Diff : A+ >CG2677|FBan0002677|CT9073|FBan0002677 last_updated:000321 > EG:100G10.3 >Diff : 0 >CG2680|FBan0002680|CT9081|FBan0002680 last_updated:000321 > EG:100G10.4 >Diff : B+ >CG2681|FBan0002681|CT9087|FBan0002681 last_updated:000321 > EG:100G10.2 >Diff : B- >CG2685|FBan0002685|CT9103|FBan0002685 last_updated:000321 > EG:100G10.1 >Diff : 0 >CG2694|FBan0002694|CT9121|FBan0002694 last_updated:000321 > EG:100G10.8 >Diff : 0 >CG2701|FBan0002701|CT9167|FBan0002701 last_updated:000321 > EG:95B7.9 >Diff : 0 >CG2706|FBan0002706|CT9201|FBan0002706 last_updated:000321 > EG:95B7.8 |FBgn0000928;fs(1)Yb >Diff : 0 >CG2707|FBan0002707|CT9213|FBan0002707 last_updated:000321 > EG:95B7.4 |FBgn0000927;fs(1)Ya >Diff : A- >CG2709|FBan0002709|CT9223|FBan0002709 last_updated:000321 > EG:95B7.5 >Diff : 0 >CG2711|FBan0002711|CT9225|FBan0002711 last_updated:000321 > EG:95B7.6 >Diff : 0 >CG2713|FBan0002713|CT9227|FBan0002713 last_updated:000321 > EG:95B7.3 >Diff : 0 >CG2712|FBan0002712|CT9229|FBan0002712 last_updated:000321 > EG:95B7.7 >Diff : 0 >CG2714|FBan0002714|CT9231|FBan0002714 last_updated:000321 > EG:95B7.2 |FBgn0000376;crm >Diff : 0 >CG2715|FBan0002715|CT9233|FBan0002715 last_updated:000321 > EG:95B7.11
>Diff : 0 >CG2759|FBan0002759|CT9359|FBan0002759 last_updated:000321 > EG:BACN33B1.1 |FBgn0003996;w >Diff : 0 > EG:BACR43E12.6 >CG14417|FBan0014417|CT34074|FBan0014417 last_updated:000321 >Diff : 0 > EG:BACR43E12.5 >CG14418|FBan0014418|CT34075|FBan0014418 last_updated:000321 >Diff : B+ >CG3591|FBan0003591|CT12085|FBan0003591 last_updated:000321 > EG:100G7.2 |FBgn0014097;anon-3Cb >Diff : 0 >CG3598|FBan0003598|CT12109|FBan0003598 last_updated:000321 > EG:100G7.3 >Diff : 0