A Database of Drosophila Genes & Genomes

FB2013_03, released May 7th, 2013
 

Reference Report

Reference
Citation Rangarajan, A., Schedl, T., Yook, K., Chan, J., Haenel, S., Otis, L., Faelten, S., Depellegrin-Connelly, T., Isaacson, R., Skrzypek, M.S., Marygold, S.J., Stefancsik, R., Cherry, J.M., Sternberg, P.W., Muller, H.M. (2011). Toward an interactive article: integrating journals and biological databases. BMC Bioinformatics 12: 175.
FlyBase ID FBrf0214821
Publication Type Research paper
PubMed ID 21595960
PubMed Abstract Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have resulted in the development of automated tools that can recognize entities within a document and link those entities to a relevant database. Unfortunately, automated tools cannot resolve ambiguities that arise from one term being used to signify entities that are quite distinct from one another. Instead, resolving these ambiguities requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is a crucial goal to making text markup a successful venture. We have established a journal article mark-up pipeline that links GENETICS journal articles and the model organism database (MOD) WormBase. This pipeline uses a lexicon built with entities from the database as a first step. The entity markup pipeline results in links from over nine classes of objects including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC) step, along with help from authors via a web form that is provided to them by the journal. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand ensuring an accurate link. This pipeline has been extended to other databases, namely Saccharomyces Genome Database (SGD) and FlyBase, and has been implemented in marking up a paper with links to multiple databases. Our semi-automated pipeline hyperlinks articles published in GENETICS to model organism databases such as WormBase. Our pipeline results in interactive articles that are data rich with high accuracy. The use of a manual quality control step sets this pipeline apart from other hyperlinking tools and results in benefits to authors, journals, readers and databases.
DOI 10.1186/1471-2105-12-175
Related Publication(s)
Associated Information
Comments
Associated Files
Other Information
Secondary IDs
Language of Publication English
Additional Languages of Abstract
Also Published As
Parent Publication
Publication Type Journal
Abbreviation BMC Bioinformatics
Title BMC Bioinformatics
Publication Year 2000-
ISBN/ISSN 1471-2105
Data from Reference