transcripts having spurious FlyBase Annotation IDs

Post by malcook » Tue Oct 20, 2009 2:52 pm

I find that exactly two transcripts each have a spurious second 'FlyBase Annotation IDs' entry (two current annotation IDs where one is expected):

"FBtr0089686";2
"FBtr0100361";2

This query finds them:

Code:
-- Count current 'FlyBase Annotation IDs' dbxrefs per non-obsolete mRNA and
-- report transcripts whose count is not 1 (with inner joins: more than one).
SELECT tr.uniquename AS tr_uniquename, count(*)
FROM feature tr
  JOIN cvterm tr_type ON (tr.type_id = tr_type.cvterm_id
                          AND tr_type.name = 'mRNA')
  JOIN feature_dbxref ftr_annotation ON (tr.feature_id = ftr_annotation.feature_id
                                         AND ftr_annotation.is_current = TRUE)
  JOIN dbxref tr_annotation ON (ftr_annotation.dbxref_id = tr_annotation.dbxref_id)
  JOIN db db_annotation ON (tr_annotation.db_id = db_annotation.db_id
                            AND db_annotation.name = 'FlyBase Annotation IDs')
WHERE tr.is_obsolete = FALSE
GROUP BY tr.uniquename
HAVING count(*) != 1;


And this query shows them:

Code:
-- Show every current 'FlyBase Annotation IDs' dbxref attached to the two
-- transcripts reported by the counting query.
SELECT *
FROM feature tr
  JOIN cvterm tr_type ON (tr.type_id = tr_type.cvterm_id
                          AND tr_type.name = 'mRNA')
  JOIN feature_dbxref ftr_annotation ON (tr.feature_id = ftr_annotation.feature_id
                                         AND ftr_annotation.is_current = TRUE)
  JOIN dbxref tr_annotation ON (ftr_annotation.dbxref_id = tr_annotation.dbxref_id)
  JOIN db db_annotation ON (tr_annotation.db_id = db_annotation.db_id
                            AND db_annotation.name = 'FlyBase Annotation IDs')
WHERE tr.is_obsolete = FALSE
  AND tr.uniquename IN ('FBtr0089686', 'FBtr0100361');


In the case of FBtr0100361, the spurious accession value is "FBtr0100361" itself. It crops up in the web UI at http://flybase.org/reports/FBtr0100361.html, where Annotation Symbol shows the two values concatenated: "CG12537-REFBtr0100361".

In the case of FBtr0089686, the spurious accession value is "CRMP-RA", but for some reason it does NOT crop up in the web UI.
Malcolm Cook - Stowers Institute for Medical Research

Re: transcripts having spurious FlyBase Annotation IDs

Post by Josh Goodman » Wed Oct 21, 2009 10:46 am

Hi Malcolm,

Your first query needs a minor tweak to remove one false positive:

Code:
    -- Same count as before, now also excluding analysis features.
    SELECT tr.uniquename AS tr_uniquename, count(*)
    FROM feature tr
      JOIN cvterm tr_type ON (tr.type_id = tr_type.cvterm_id
                              AND tr_type.name = 'mRNA')
      JOIN feature_dbxref ftr_annotation ON (tr.feature_id = ftr_annotation.feature_id
                                             AND ftr_annotation.is_current = TRUE)
      JOIN dbxref tr_annotation ON (ftr_annotation.dbxref_id = tr_annotation.dbxref_id)
      JOIN db db_annotation ON (tr_annotation.db_id = db_annotation.db_id
                                AND db_annotation.name = 'FlyBase Annotation IDs')
    WHERE tr.is_obsolete = FALSE
      AND tr.is_analysis = FALSE
    GROUP BY tr.uniquename
    HAVING count(*) != 1;


Note the addition of tr.is_analysis = FALSE, which excludes features that are the result of an automated analysis rather than annotated features.
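
One way to check which of the two hits is the false positive would be to look at the flags directly. This is only a minimal sketch, assuming the standard Chado feature columns is_analysis and is_obsolete:

```sql
-- Show the analysis/obsolete flags for the two transcripts reported above.
-- A row with is_analysis = TRUE is excluded by the tweaked query.
SELECT uniquename, is_analysis, is_obsolete
FROM feature
WHERE uniquename IN ('FBtr0089686', 'FBtr0100361');
```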

Otherwise I do see the problem and I've forwarded this on to the database folks.

Cheers,
Josh
Josh Goodman
Site Admin
 

Re: transcripts having spurious FlyBase Annotation IDs

Post by malcook » Thu Oct 22, 2009 12:05 pm

Josh,

Thanks for the improvement, and for passing the data problem on.

I now find that a single transcript, FBtr0100361, still has a spurious FlyBase Annotation ID: oddly, its own uniquename also appears as a 'FlyBase Annotation IDs' accession.

You can see the effect of this at http://flybase.org/reports/FBtr0100361.html, where we see:
Code:
Annotation Symbol     CG12537-REFBtr0100361
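
The self-referencing case could also be searched for across all transcripts, not just this one. The following is a sketch only: it reuses the joins from the queries above and assumes that comparing dbxref.accession with feature.uniquename is a reasonable test for this kind of error.

```sql
-- Sketch: find non-obsolete, non-analysis mRNAs whose own uniquename is also
-- attached to them as a current 'FlyBase Annotation IDs' accession.
SELECT tr.uniquename, tr_annotation.accession
FROM feature tr
  JOIN cvterm tr_type ON (tr.type_id = tr_type.cvterm_id
                          AND tr_type.name = 'mRNA')
  JOIN feature_dbxref ftr_annotation ON (tr.feature_id = ftr_annotation.feature_id
                                         AND ftr_annotation.is_current = TRUE)
  JOIN dbxref tr_annotation ON (ftr_annotation.dbxref_id = tr_annotation.dbxref_id)
  JOIN db db_annotation ON (tr_annotation.db_id = db_annotation.db_id
                            AND db_annotation.name = 'FlyBase Annotation IDs')
WHERE tr.is_obsolete = FALSE
  AND tr.is_analysis = FALSE
  AND tr_annotation.accession = tr.uniquename;
```

Unlike the counting query, this does not depend on how many annotation IDs a transcript has, so it would also catch a transcript whose only annotation ID is its own uniquename.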


This cropped up while I was writing a query to list, for each mRNA, its transcript and gene IDs, which I will post in another message in case it is of general interest.

Cheers,

--Malcolm

