FB2014_06, released November 12th, 2014
 
 

A Database of Drosophila Genes & Genomes

QuickSearch help

Overview

The QuickSearch tool on the home page provides access to the FlyBase report pages. By default, QuickSearch scans Drosophila melanogaster (Dmel) genes for the search term. To search for a gene in another species, add the 4-letter species prefix to the gene name, separated by a backslash ("\"), for example Dvir\dpp; the prefix overrides the "Dmel only" option. The "All species" option lets you look for a gene in all available species. A unique match for the search string produces the relevant report page (except when using the "All data classes" data class option), whereas more than one match generates a list of results linked to the report pages.
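
The prefix convention can be sketched in a few lines of Python. This only illustrates the naming rule and is not FlyBase code; the function name `split_symbol` and the fallback to "Dmel" for unprefixed terms are assumptions made for the example. Note that the backslash has to be doubled inside a Python string literal.

```python
def split_symbol(term, default_species="Dmel"):
    """Split a search term into (species prefix, gene symbol).

    Hypothetical helper: a 4-letter alphabetic prefix followed by a
    backslash is read as the species; anything else is treated as a
    plain symbol in the default species.
    """
    prefix, sep, gene = term.partition("\\")
    if sep and len(prefix) == 4 and prefix.isalpha() and gene:
        return prefix, gene
    return default_species, term


print(split_symbol("Dvir\\dpp"))  # ('Dvir', 'dpp')
print(split_symbol("dpp"))        # ('Dmel', 'dpp')
```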

Search fields

A search using the default "ID/Symbol/Name" option is case-insensitive and restricted to FlyBase IDs, valid symbols, synonyms (such as annotation symbols), and names. If the "All text" option is selected, QuickSearch searches the full contents of the web reports.

Data class

The default data class for QuickSearch is the genes data class. Data other than genes can be queried by selecting one of the options from the "Data Class" drop-down menu. The "All data classes" option (see below) searches across all FlyBase data classes except controlled vocabularies. The "gene associations" data class option (see below) allows you to enter a gene name and obtain all FlyBase data classes related to that gene.

All data classes

The "All data classes" option searches across all FlyBase data classes except controlled vocabularies. If the "ID/Symbol/Name" option is selected, it searches all IDs, valid symbols, and synonyms for all data classes. If "All text" is selected, it searches the full contents of the web reports for all data classes. The "All data classes" option supports wild cards for both "ID/Symbol/Name" and "All text" searches, and boolean operators for "All text" searches only.

Instead of the usual list of results (HitList) this search produces a page that displays all the data classes searched, the field that matched, and the number of matches. In addition to this information, each line of hits includes two buttons labeled "HitList" and "Refine". The "HitList" button sends you to a HitList for the corresponding entries. The "Refine" button directs you to a QueryBuilder session with those entries set as the first leg of your QueryBuilder query. This allows you to further refine your list of hits by adding additional query legs.

Gene expression patterns

The "gene expression patterns" option allows you to search curated statements that describe transcript and polypeptide expression. When you select this option the input form presents you with 3 input boxes that represent developmental stage, body part/tissue, and subcellular localization. Auto-complete dialog boxes will assist you in finding the appropriate controlled vocabulary (CV) term(s) that have been used during the curation of each descriptor.

Each filled search field further constrains the auto-complete function for the remaining fields. For example, if you have entered "gastrula stage" in the Stage field, the auto-complete function for the Tissue search field will include the CV term "parasegment 10", but will exclude the CV term "leg". Likewise, if you have entered the CV term "prothoracic leg" in the Tissue search field, the auto-complete function for the Stage search field will include "adult stage" but exclude "embryonic stage 4".

If you select only terms suggested by the auto-complete feature, your expression statement query should always match some results.
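
The way each filled field narrows the suggestions for the others can be pictured with a toy model. The sketch below is an assumption about the general idea, not a description of how FlyBase implements auto-complete: `STATEMENTS` is hypothetical toy data standing in for curated expression statements, and the function names are made up for the example. Because suggestions are drawn only from statements that already exist, a query built entirely from suggested terms always has at least one match.

```python
# Hypothetical toy data: (stage, tissue) pairs standing in for curated
# expression statements. Real FlyBase data is far larger.
STATEMENTS = [
    ("gastrula stage", "parasegment 10"),
    ("adult stage", "prothoracic leg"),
    ("adult stage", "leg"),
]


def tissue_suggestions(stage=None):
    """Tissue terms that co-occur with the chosen stage (all if none chosen)."""
    return sorted({t for s, t in STATEMENTS if stage is None or s == stage})


def stage_suggestions(tissue=None):
    """Stage terms that co-occur with the chosen tissue (all if none chosen)."""
    return sorted({s for s, t in STATEMENTS if tissue is None or t == tissue})


print(tissue_suggestions("gastrula stage"))   # ['parasegment 10']
print(stage_suggestions("prothoracic leg"))   # ['adult stage']
```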

Gene associations

The "gene associations" option allows you to enter a single gene identifier (ID, symbol, synonym, annotation symbol, name, etc.) and see a list of FlyBase data classes that are directly related to that gene. From there you can click the "HitList" and "Refine" buttons to further evaluate the items returned. If the gene identifier you enter matches multiple genes, you will be presented with a list of matching genes to select from before progressing to the final result page. The gene associations search does not support wild cards, boolean operators, or multiple genes.

In addition to the explicit "gene associations" data class, you can enter a gene identifier when using the alleles, clones, transcripts, polypeptides, insertions, sequence features and stocks data class options. Doing so returns a list of all records of that data class that are related to the gene you entered.

Protein domains

The "protein domains" option allows you to find a list of genes that encode a specific protein domain. Currently, this option only searches data obtained from InterPro. Accepted forms of input are InterPro IDs and InterPro domain names. The protein domain search supports wild cards but does not support boolean operators or searches for more than one domain at a time.

Controlled vocabularies

The "controlled vocabularies" data class option allows you to search for a term in the controlled vocabularies (CVs) used by FlyBase. When this option is selected, you can choose to search all CVs or select a particular CV from the drop-down menu. The text field supports wild cards (see below).

References

When searching references, the search form disables the "Species" and "Search" options and presents you with three fields that allow you to search authors, publication year, and the full contents of the reference report (All text). All three fields support boolean operators (see below). In addition to boolean operators, the year field supports mathematical comparison symbols (>, >=, <, <=) and range indicators (-, --, ..). For example:

  • >2003
  • <=1945
  • 1999-2003
  • 1970-1990 NOT 1976
  • 1992 OR 1995 OR 1998
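
To make the year syntax concrete, here is a minimal Python sketch that tests whether a given year satisfies an expression like those listed above. It is not FlyBase's parser. The function names are hypothetical, and three points are assumptions rather than documented behaviour: ranges are treated as inclusive, operators are applied left to right, and NOT is read as "and not".

```python
import re

# One year term: a comparison (>2003), a range (1999-2003, 1999--2003,
# 1999..2003), or a single year (1976).
_TERM = re.compile(
    r"^(?:(>=|<=|>|<)(\d{4})|(\d{4})(?:--|\.\.|-)(\d{4})|(\d{4}))$"
)


def term_matches(term, year):
    """Check one term against a year (ranges assumed inclusive)."""
    m = _TERM.match(term)
    if m is None:
        raise ValueError(f"unrecognised year term: {term!r}")
    op, bound, low, high, exact = m.groups()
    if op:
        b = int(bound)
        return {">": year > b, ">=": year >= b,
                "<": year < b, "<=": year <= b}[op]
    if low:
        return int(low) <= year <= int(high)
    return year == int(exact)


def year_matches(expression, year):
    """Evaluate 'term (AND|OR|NOT term)*' left to right (an assumption)."""
    tokens = expression.split()
    if len(tokens) % 2 == 0:
        raise ValueError("expected terms separated by AND, OR or NOT")
    result = term_matches(tokens[0], year)
    for op, term in zip(tokens[1::2], tokens[2::2]):
        hit = term_matches(term, year)
        if op == "AND":
            result = result and hit
        elif op == "OR":
            result = result or hit
        elif op == "NOT":
            result = result and not hit
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return result


print(year_matches("1970-1990 NOT 1976", 1976))    # False
print(year_matches("1992 OR 1995 OR 1998", 1995))  # True
```

Under these assumptions an expression that consists only of "NOT 1976" is rejected, which is in line with the rule that NOT must accompany another search condition.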

Wild cards

The wild card character (*) can be added to the beginning of the search term, the end, or both, to find all terms that share the same root. All searches support wild cards except the "gene associations" data class search. For example, an "All text" search of the genes data class for the term *meio* will identify genes that include 'meiosis', 'meiotic', 'premeiotic', or 'meiocyte' in some field. Similarly, searching stocks data for *bxd* will find stocks that carry either aberrations or alleles that include bxd in the genotype.
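
The effect of the wild card can be imitated in Python by translating each * into "any run of characters" and matching everything else literally. This is a sketch of the matching idea only. The function names are hypothetical, and matching word by word while ignoring case is an assumption about behaviour that this page does not spell out.

```python
import re


def wildcard_pattern(term):
    """Turn a search term with * wild cards into a compiled regex.

    Everything except * is matched literally, so symbols that contain
    brackets or braces are safe. Case is ignored (an assumption).
    """
    literal_parts = [re.escape(part) for part in term.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE)


def matching_words(term, words):
    """Return the words that match the wild card term in full."""
    pattern = wildcard_pattern(term)
    return [w for w in words if pattern.fullmatch(w)]


words = ["meiosis", "meiotic", "premeiotic", "meiocyte", "mitosis"]
print(matching_words("*meio*", words))
# ['meiosis', 'meiotic', 'premeiotic', 'meiocyte']
```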

Boolean operators

QuickSearch supports the AND, OR, and NOT boolean operators in the "All text" search of any data class and in the "Author(s)" and "Year(s)" fields of a "references" data class search. These operators are not supported for any ID/Symbol/Name search, nor for the "gene associations", "controlled vocabularies" and "protein domains" data class options. In addition, the NOT operator must be used in conjunction with other search conditions. In other words, you cannot use "NOT Smith" as your only search restriction for a reference search. Instead, you could use "Jones NOT Smith" or pair "NOT Smith" with a year or "All text" search.
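
The rule about NOT can be expressed as a small pre-check on a reference search. The sketch below is a simplification with hypothetical function names: it treats a field as supplying a positive condition when it is filled in and does not begin with NOT, and it does not attempt to parse more complex expressions.

```python
def has_positive_condition(field_value):
    """True if the field is filled in and does not begin with NOT."""
    tokens = field_value.split()
    return bool(tokens) and tokens[0] != "NOT"


def reference_search_is_valid(author="", year="", all_text=""):
    """Simplified check: at least one field must supply a positive condition."""
    return any(has_positive_condition(v) for v in (author, year, all_text))


print(reference_search_is_valid(author="NOT Smith"))                # False
print(reference_search_is_valid(author="Jones NOT Smith"))          # True
print(reference_search_is_valid(author="NOT Smith", year=">2003"))  # True
```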

Double quotes

In a few situations, obtaining the correct query result depends on putting double quotes around your search term. The first is when you want to search for a phrase using the "All text" option. For example, if you want to find all records that contain the phrase "protein-protein interaction", including the quotes in the text box ensures that only these records are returned. Without the double quotes you will get records that contain the terms "protein" and "interaction" somewhere in their records. The second is when you are performing an "All text" search with a symbol that contains non-word characters (<, {, ], $, etc.). In these cases, surrounding your search term with quotes should again return the correct result. Double quotes around symbols are not required when doing an ID/Symbol/Name search. The final case is when you are searching for authors in references. If the author's last name contains a space (e.g. Ponce de León) or other non-word characters, you must surround it with quotes for the search to work.
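
A script that prepares "All text" or author terms could apply these quoting rules with a helper along the following lines. This is a sketch with a hypothetical function name. It uses Python's `\W` (any character that is not a letter, digit or underscore) as a stand-in for "non-word character", which is an assumption, and it is meant for literal terms only, since this page does not describe how quoting interacts with wild cards.

```python
import re

# Any character that is not a letter, digit or underscore; this also
# covers spaces and hyphens, so phrases are caught by the same test.
_NON_WORD = re.compile(r"\W")


def quote_if_needed(term):
    """Wrap a literal term in double quotes when it needs them."""
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return term  # already quoted
    if _NON_WORD.search(term):
        return f'"{term}"'
    return term


print(quote_if_needed("protein-protein interaction"))
# "protein-protein interaction"
print(quote_if_needed("Muller"))
# Muller
```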

QuickSearch Hints & Examples
Find me the Adh gene in Drosophila pseudoobscura

The best strategy is to type 'Dpse\Adh' into the "Enter text" box, which takes you directly to the report page of the gene. Dpse is the 4-letter species abbreviation for Drosophila pseudoobscura, and the backslash "\" separates the 4-letter abbreviation from the gene name. Alternatively, you could enter 'Adh' into the search box with the "All species" option selected, and then choose the Drosophila pseudoobscura gene from the resulting list.

Find me all the Drosophila melanogaster genes that start with "dp"

Click the "Dmel only" button and type "dp*" into the "Enter text" box.

Find me all clones associated with the gene "bcd"

Select the data class "clones" and type "bcd" into the "Enter text" box.

Find me all the references with "Muller" as an author between 1921 and 1934 excluding 1928

Select the data class "references" and type "Muller" into the "Author(s)" box then type "1921-1934 NOT 1928" into the "Year(s)" box.

Searching for GAL4 and GFP Lines

Select the data class "stocks" and enter the search term surrounded by wild card characters, for example *GFP*. We recommend using wild card characters because the search term is often part of a more complex term such as P{w[+mC]=UAS-GFP.nls}8.

Searching for Stocks of Specific Alleles

Select the data class "stocks" and enter the allele as the search term with the superscript placed in square brackets; for instance, type "bcd[6]" to search for the allele written as bcd with a superscript 6.
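
When search terms are generated by a script, the bracket convention is easy to apply mechanically. The helper below is a hypothetical illustration of the notation only; its name and signature are not part of FlyBase.

```python
def allele_search_term(gene_symbol, superscript):
    """Write an allele with its superscript in square brackets."""
    return f"{gene_symbol}[{superscript}]"


print(allele_search_term("bcd", "6"))  # bcd[6]
```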

Searching for all data classes associated with the gene "pen"

Select the data class "gene associations" and enter "pen".

Searching for genes that encode polypeptides with a "Netrin domain"

Select the data class "protein domains" and enter "Netrin domain" or "IPR001134".