FlyBase_Transcript_Report error in "Associated gene" query

A forum for discussing Power User related features of FlyBase such as using Chado, GFF, FASTA files, etc...

FlyBase_Transcript_Report error in "Associated gene" query

Postby malcook » Mon Oct 19, 2009 3:56 pm

Upon reviewing

http://gmod.org/wiki/FlyBase_Field_Mapping_Tables#FlyBase_Transcript_Report

I was surprised at the use of DISTINCT in the "Associated gene" query', viz:

Code: Select all
SELECT DISTINCT(g.uniquename),g.name
  FROM feature tr, feature g, feature_relationship fr, cvterm fr_type, cvterm tr_type,
     cvterm g_type
  WHERE tr.uniquename='FBtr0087703' AND tr.is_obsolete=false AND
    tr.is_analysis=false AND fr_type.name='partof' AND
    g_type.name='gene' AND tr.feature_id=fr.subject_id AND
    fr.object_id=g.feature_id;


Removing `DISTINCT` from the query returns 86,782 identical rows.

The problem stems from two references to cvterm table that are uncorrelated to the rest of the query.

Recoding it as follows returns a single row that is perhaps closer to the intention.

Code: Select all
SELECT g_tx_agene.uniquename as gn_uniquename,g_tx_agene.name as gn_name, tr_type_tx_agene.name as gn_type
  FROM
     feature tr_tx_agene,
     feature_relationship fr_tx_agene,
     feature g_tx_agene,
     cvterm fr_type_tx_agene,
     cvterm tr_type_tx_agene,
     cvterm g_type_tx_agene
  WHERE
    tr_tx_agene.uniquename='FBtr0087703' AND
    tr_tx_agene.is_obsolete=false AND
    tr_tx_agene.is_analysis=false AND
    fr_type_tx_agene.name='partof' AND
    fr_tx_agene.subject_id = tr_tx_agene.feature_id AND
    fr_tx_agene.object_id = g_tx_agene.feature_id AND
    g_tx_agene.type_id = g_type_tx_agene.cvterm_id AND -- new
    g_type_tx_agene.name='gene' AND
    tr_tx_agene.type_id = tr_type_tx_agene.cvterm_id  -- new


I hesitate to make a change to the GMOD wiki....

Thoughts? Corrections?

--Malcolm Cook - Stowers Institute for Medical Research
Malcolm Cook - Stowers Institute for Medical Research
malcook
 
Posts: 8
Joined: Mon Mar 31, 2008 12:19 pm
Location: kansas city

Re: FlyBase_Transcript_Report error in "Associated gene" query

Postby Josh Goodman » Tue Oct 20, 2009 9:02 am

Hi Malcolm,

Thanks for correcting that sloppy query. I'll update the GMOD wiki. Your hesitation is justified because we generally do a flush and replace of the GMOD wiki from a script that works off our own internal documentation. I thought I had a note to that effect on the page itself but I don't see it so I'll add it as well.

Cheers,
Josh
Josh Goodman
Site Admin
 
Posts: 64
Joined: Mon Nov 26, 2007 2:39 pm

Re: FlyBase_Transcript_Report error in "Associated gene" query

Postby malcook » Tue Oct 20, 2009 9:52 am

Fine.

I revamped it a bit using explicit join sytax:

Code: Select all
SELECT
  tr."name" AS tr_name,
  g."name" AS g_name,
  g.uniquename AS g_uniquename
FROM
  feature tr
  join   cvterm tr_type on (tr.type_id = tr_type.cvterm_id AND tr_type."name" = 'mRNA' )
  join   feature_relationship fr on (fr.subject_id = tr.feature_id )
  join   feature g on (fr.object_id = g.feature_id )
  join   cvterm gtype on (gtype.cvterm_id = g.type_id AND gtype."name" = 'gene' )
  join   cvterm fr_type on (fr_type.cvterm_id = fr.type_id AND fr_type."name" = 'partof')
WHERE
  tr.uniquename = 'FBtr0087703'
Malcolm Cook - Stowers Institute for Medical Research
malcook
 
Posts: 8
Joined: Mon Mar 31, 2008 12:19 pm
Location: kansas city


Return to Power Users

cron