Report on the results of the FlyBase survey undertaken in November 2007.
We were very pleased that respondents find that FlyBase is proving a valuable resource,
with 65% using FlyBase once a day or more and 80% finding FlyBase invaluable or
very helpful in their genetic research activities. We were also pleased that most
respondents who had contacted us for help found our response very good or good
(88%). For a detailed breakdown of the results please see the accompanying PDF.
More importantly, your responses have been key in setting our priorities for the future,
and we urge you to continue to give us suggestions for how we can improve the ability of
FlyBase to assist your research. We need to understand FlyBase users' needs in order
to make best use of our limited resources. We received too many suggestions to
discuss all of them in this summary, and although all the points that were raised are
being considered by the FlyBase team, we would like to highlight here a few of the
lessons we learned from the survey.
Formalizing the structure of the data can result in reports that are not easy to read and
understand. On the one hand, users want to find out the basics of what is known about a
gene from a short summary written in plain English. On the other hand, using "controlled
vocabularies" or "ontologies" is essential to produce a database that can be effectively
searched (if different terms are used to describe the same thing, then to find all the
entries for that thing, you would have to search with all possible terms; if you always use
the same term, then one search is comprehensive). These contrasting needs were also
reflected in the responses we received. For example, when we asked "Are there data that
are difficult to interpret because of formatting or presentation?" we received these two
responses:
"The automated descriptions of gene function are very poor. I reckon that's because
GO doesn't work. GO and controlled vocabulary is great for informatics, but awful for the
average grad student wanting to do genetics."
"The textual descriptions of gene expression are utterly useless for the computational
community. Someone should sit down and translate them into the official vocabulary."
We recognize that both approaches are essential: controlled vocabularies for effective
searching and textual descriptions for summarizing subtle aspects of genetic, phenotypic
and expression information. We have realized that no single approach will solve all
problems, so we will be providing a variety of summaries. We will transfer "Red Book"
summaries for the classical markers and visible phenotypes in the adult. Many users
found Tom Brody's Interactive Fly a good source of information, and he has kindly
agreed to supply FlyBase with his gene summaries. In addition we will set up a gene
wiki, so that users can contribute to gene summaries, either in an attributed way or by
adding to a collective summary. We are investigating other approaches as well and welcome
your further suggestions.
We are currently working hard to eliminate the problem that one can click on a data
topic only to discover there is nothing in that category (empty matryoshka). If you haven't
already noticed it, you can use Profile Manager to set your own configurations of which
parts of the gene report are open by default.
We are significantly accelerating literature incorporation into FlyBase. We will describe
our detailed plans separately, but one important feature will be to seek your help, by
asking authors to give us key bits of information from the paper on a data entry web site.
A number of users also asked for protein interaction data and information about useful
reagents, such as antibodies. We will begin to incorporate these data types in the next
The phenotypic and expression data need to be improved, and methods developed to
improve our ability to search for genes with similar phenotypes or expression patterns.
Many of you would like FlyBase search tools to guess, "like Google", at what you are
looking for when you type something that is not in the database. FlyBase provides the
equivalent of Google spell-checking for symbols by including in the database extensive
symbol synonyms. If you search for a known variant of a gene or other symbol, the
record will be found. We plan to provide additional help by offering Google-style
suggestions based on extension of the letters you have already typed. We will not,
however, be able to provide search tools that test for all possible variants of what you
have typed (even Google doesn't do that).
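As a rough illustration of the two mechanisms described above (synonym lookup and suggestions that extend the letters already typed), here is a minimal Python sketch. It is not FlyBase's implementation: the small synonym table, the function names resolve and suggest, and the exact, case-sensitive matching are all assumptions made for the example.

```python
from bisect import bisect_left

# Illustrative synonym table (an assumption, not FlyBase data):
# every known variant, including the symbol itself, maps to one symbol.
SYNONYMS = {
    "w": "w", "white": "w",
    "wg": "wg", "wingless": "wg",
    "dpp": "dpp", "decapentaplegic": "dpp",
    "hh": "hh", "hedgehog": "hh",
}

# Sorted once so that prefix suggestions can use binary search.
SORTED_TERMS = sorted(SYNONYMS)


def resolve(term):
    """Return the symbol for a known variant, or None if it is unknown.

    Matching is exact and case-sensitive; a misspelling that is not in
    the table is simply not found.
    """
    return SYNONYMS.get(term)


def suggest(prefix, limit=5):
    """Return up to `limit` known terms that extend the typed prefix."""
    if not prefix:
        return []
    start = bisect_left(SORTED_TERMS, prefix)
    matches = []
    for term in SORTED_TERMS[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match the prefix
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


if __name__ == "__main__":
    print(resolve("white"))   # known synonym
    print(resolve("whtie"))   # unknown variant: not found
    print(suggest("wi"))      # terms beginning with "wi"
```

The sketch also shows the limitation noted above: a variant that is neither a stored synonym nor a prefix of one (such as a transposed spelling) returns nothing, because no attempt is made to test all possible variants of the input.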
Information about orthologs and gene families needs to be improved. We are replacing
the current ortholog identifiers with gene symbols, and if possible names, from the other
species. We are also evaluating optimal ways to present alignments and relationships
amongst orthologs and gene families, and we expect the first results of these efforts to
be publicly accessible in the next several months.
We had been concerned that FlyBase users at a great distance from our servers
in Indiana might have much slower access to FlyBase; if so, this would have argued for
the establishment of FlyBase mirrors around the globe. Here is the perception of the
speed of the website from those countries with substantial numbers of survey respondents.
It can be seen that there is no correlation between the speed of the FlyBase site and the
country of access; therefore we concluded that mirrors would not improve speed of
access to FlyBase. Rather, we expect that the speed of access reflects a complex set of
issues such as local network conditions, capacity and configuration of individual
desktops, browser choices and other issues that are largely out of our control.