To make this slightly easier to understand I'll break this into separate queries.
CG Symbol -> FBgn#The first query resolves the annotation symbols to FlyBase IDs. The CG symbols are all stored as dbxref's in Chado. To ensure that you use the current symbol you need to filter out dbxref's from the 'FlyBase Annotation IDs' db and one where feature_dbxref.is_current=true.
- Code: Select all
select dbx.accession as CG_SYMBOL, f.uniquename as FBID
from feature f, feature_dbxref fdbx, dbxref dbx, db
where dbx.accession='CG4832' and fdbx.is_current=true and db.name='FlyBase Annotation IDs' and
f.feature_id=fdbx.feature_id and fdbx.dbxref_id=dbx.dbxref_id and dbx.db_id=db.db_id;
Once you get the current FBgn# it is best to use this for all the subsequent queries otherwise you will end up with some hairy SQL every time.
Gene Full NameThe full gene name is stored in the synonym table which is linked to the feature table via feature_synonym.
- Code: Select all
select distinct(s.synonym_sgml), f.uniquename
from feature f, feature_synonym fs, synonym s, cvterm type
where f.uniquename='FBgn0013765' and fs.is_current=true and type.name='fullname' and
f.feature_id=fs.feature_id and fs.synonym_id=s.synonym_id and s.type_id=type.cvterm_id;
A single full name entry can be returned multiple times in that query because it is attributed to many pubs via feature_synonym so we use DISTINCT to only return one.
GO IDsTo get the GO IDs for a gene we have to traverse to the cvterm and dbxref tables via feature_cvterm. This query will return all 3 types of GO terms.
- Code: Select all
select distinct(db.name || ':' || dbx.accession) as GOID, cvt.name as term, fcv.is_not, cv.name
from feature f, feature_cvterm fcv, cvterm cvt, dbxref dbx, db, cv
where f.uniquename='FBgn0013765' and db.name='GO' and f.feature_id=fcv.feature_id and fcv.cvterm_id=cvt.cvterm_id and
cvt.cv_id=cv.cv_id and cvt.dbxref_id=dbx.dbxref_id and dbx.db_id=db.db_id;
InterPro domainsInterPro domains are stored as dbxref's so the query is similar to the annotation symbol query above.
- Code: Select all
select f.uniquename as FBID, dbx.accession, dbx.description as domain
from feature f, feature_dbxref fdbx, dbxref dbx, db
where f.uniquename='FBgn0013765' and fdbx.is_current=true and upper(db.name)='INTERPRO' and
f.feature_id=fdbx.feature_id and fdbx.dbxref_id=dbx.dbxref_id and dbx.db_id=db.db_id;
This should get you on your way but feel free to let us know if you still have questions.
Cheers,
Josh