Date: Fri, 02 Jul 2004 18:10:59 +0100
Subject: Re: Ribosomal proteins
From: Steven Marygold <Steven.Marygold@XXXX>
To: 'Michael Ashburner (Genetics)' <ma11@XXXX>

Dear Michael,

Thanks a lot for getting back to me on this - I appreciate your time.

I agree that most RPs are correctly identified and named in FlyBase, though as you say, there are a few that could do with tweaking. I've attached my comments on the list you sent me. To date, I've only considered the cyto RPs.

I made my compilation of the D.m. RPs by combining my own gene ontology searches with the RPG data gleaned from BLAST hits against human RPs. In comparing my compilation to your list, there are only 2-3 that don't appear in both lists.

In my comments, I've assumed that the RPG classification and nomenclature conventions are the correct ones. Therefore there is no recognised L1, L2, L16, L20, L25, L33, S1 or S22 RP classification. I think the convention with the fly RPs is that they're named after their similarity to vertebrate/mammalian RPs, and I've followed this in my comments. I've also added a few ideas for fly gene name changes - though I'm not sure of the FlyBase policy on this.

...

Steven.

--
Steven Marygold Ph.D.
Growth Regulation Laboratory (Rm 503),
Cancer Research UK London Research Institute,
44 Lincoln's Inn Fields,
LONDON, WC2A 3PX
U.K.
Tel: +44 (0)20 7269 3351
Fax: +44 (0)20 7269 3581

L1            RpL10Aa   CG3843
L1            RpL10Ab   CG7283
    These genes are shown as RpL10A orthologs in RPG - there is no 'RPL1' classification. The confusion stems from these proteins having a 'Ribosomal L1 motif', which I think refers to the prokaryotic homologs; this corresponds to RPL10A in mammals.
L3            RpL3      CG4863
L4            RpL4      CG5502
L5            RpL11     CG7726
    This gene is the ortholog of mammalian RPL11. Again, I think there's confusion from the protein having a 'Ribosomal L5' motif, which corresponds to mammalian RPL11. CG17489 is the real RPL5 ortholog (below).
L5            yip6      CG17489
    I recommend this gene be renamed 'RpL5'.
L6            RpL6      CG11522
L7            RpL7      CG4897
L7-like       CG5317    CG5317
    (CG4897 protein is 55% identical to human RPL7 and CG5317 is only 28% identical, so it's probably right to refer to the latter as 'L7-like' rather than L7b.)
L7A           RpL7A     CG3314
L8            RpL8      CG1263
L9            RpL9      CG6141
L10           Qm        CG17521
    I recommend this gene be renamed 'RpL10'.
L11           RpL12     CG3195
    The real RPL11 ortholog is CG7726 (as in FlyBase). CG3195 is the RPL12 ortholog; again, I think the confusion stems from the CG3195 protein having a prokaryotic 'Ribosomal L11 motif' when it actually corresponds to mammalian L12.
L13           RpL13A    CG1475
    This gene is the RpL13A ortholog.
L13E          RpL13     CG4651
    This gene is just the 'RPL13' ortholog.
L14           RpL14     CG6253
L15           RpL15     CG17420
L17           RpL23     CG3661
    This gene is the RPL23 ortholog. It appears to have been mis-named in the past as L17 or L17A for some reason.
L18           RPL18     CG8615
L18A          RPL18A    CG6510
L19e          RpL19     CG2746
    This gene is just the 'RPL19' ortholog.
L21           oho23B    CG2986
    This is wrong. oho23B is the RPS21 ortholog.
L21E          RpL21     CG12775
    This gene is just the 'RPL21' ortholog.
L22           RpL22     CG7434
L22-like      CG9871    CG9871
    (CG7434 protein is 58% identical to human RPL22 and CG9871 is 39% identical, so it's probably right to refer to the latter as 'L22-like' rather than L22b.)
L23/L17-like  CG3203    CG3203
    This is the real L17 ortholog (see above).
L23A          RpL23A    CG7977
L24           RpL24     CG9282
L24-like      CG6764    CG6764
    (CG9282 protein is 65% identical to human RPL24 and CG6764 is only 22% identical, so it's probably right to refer to the latter as 'L24-like' rather than L24b.)
L26           RpL26     CG6846
L27           RpL27     CG4759
L28           RpL28     CG12740
L29           RpL29     CG10071
L27A          RpL27A    CG15442
L30           RpL30     CG10652
    The L31 ortholog (CG1821) is missing, though it's in FlyBase.
L32           RpL32     CG7939
L33           CG15458   CG15458
    I'm not at all sure about this one. There's no L33 classification in the RPG. There's a L33 reference in the CG15458 FlyBase record, but a SMART search shows the protein sequence has a transmembrane domain and a Pfam search shows it has no Pfam domains.
L34           RpL34     CG6090
L34-like      CG9354    CG9354
    I think these two should probably be renamed L34a and L34b as both protein products are 52% identical to human RPL34.
L35           RpL35     CG4111
L35A          RpL35A    CG2099
L36           RpL36     CG7622
L36A          RpL36A    CG7424
L37           RpL37     CG9091
L37-like      CG9873    CG9873
    There's also a case for these two being renamed L37a and L37b as CG9091 is 74% identical to human RPL37 and CG9873 is 65% identical.
L37A          RpL37A    CG5827
L38           RpL38     CG18001
    This has an updated CG no. of CG40278.
L39           RpL39     CG3997
L40           RpL40     CG2960
L41           RpL41     CR30425
    Clearly this is the RPL41 homolog and not a non-coding RNA (CR).
P0            RpLP0     CG7490
P2            RpLP1     CG4087
P1            RpLP2     CG4918
    The RPG now classifies these as RPLP0, RPLP1, and RPLP2 (like the fly gene names, in fact).
SA            sta       CG14792
    I recommend this be renamed RpSA.
S2            sop       CG5920
    I recommend this be renamed RpS2.
S3            RpS3      CG6779
S3A           RpS3A     CG2168
S4            RpS4      CG11276
S4-like       CG4866    CG4866
    I'm not sure what this is but it's not an ortholog of RPS4 - it only shares 3% identity with human S4 compared to the 75% identity of CG11276 and human RPS4. It has a prokaryotic ribosomal S4 domain that actually corresponds to eukaryotic S9, but it's only 16% identical to human RPS9 (compared to 84% identity between human S9 and CG3395 - see below).
S5            RpS5a     CG8922
S5            RpS5b     CG7014
S6            RpS6      CG10944
S6-like       CG11386   CG11386
    This is the only one on your list that I didn't find at all in my original Dm RP searches. It looks like S6 (it's 40% identical to human S6 whereas CG10944 is 75% identical) and it has a Pfam Ribosomal S6e domain.
S7            RpS7      CG1883
S8            RpS8      CG7808
S9            RpS9      CG3395
S10           RpS10a    CG12275
S10           RpS10b    CG14206
S11           RpS11     CG8857
S12           RpS12     CG11271
S13           RpS13     CG13389
S14           RpS14a    CG1524
S14           RpS14b    CG1527
S15           RpS15     CG8332
S15A          RpS15Aa   CG2033
S15A          RpS15Ab   CG12324
S16           RpS16     CG4046
S17           RpS17     CG3922
S18           RpS18     CG8900
S19           RpS19a    CG4464
S19           RpS19b    CG5338
S20           RpS20     CG15693
    The RPS21 ortholog is CG2986/oho23B from your RP-L list above. I recommend this gene be renamed RpS21.
S23           RpS23     CG8415
S24           RpS24     CG3751
S25           RpS25     CG6684
S26           RpS26     CG10305
S27           RpS27     CG10423
S27A          RpS27A    CG5271
S28           RpS28a    CG15527
S28           RpS28b    CG2998
S29           RpS29     CG8495
S30           RpS30     CG15697

Other structural components of cytosolic ribosome

CG32276       CG32276   ribosome associated membrane protein-like
CG1789        CG1789    ?
hoip          CG3949    Trisn small nuclear ribonucleoprotein

Other genes in FB said to encode ribosomal proteins, but with no sequence data and, therefore, not mapped to genome.

S11-like      anon-MMS23
7/8           RP7-8
34            Rp21
34            Rp34
?             RpL5
    RpL5 and yip6 reports should be merged.