For each genome release, there are generally a number of changes that
affect the total number or type of gene models (annotations) for that
annotated genome.
A summary of these changes can be found in the "Summary of changes from
previous release" section of the relevant genome release on the
"Release Notes" page (this page is accessible under the "Documents"
section of the Navigation Bar at the top of each FlyBase page).
In addition to the summary information on the "Release Notes" page,
FlyBase also curates standard comments indicating the nature of the
change to the gene model in each of the affected gene reports. These
comments are attributed to a "FlyBase analysis" type reference for that
genome release.
The types of changes, and the comments that may be found on the gene
reports, are listed here. The list contains all the different types of
changes that may be commented on for any genome release; a particular
genome release may not contain all of them.
This particular reference is the "FlyBase analysis" reference for
release 5.2 of the annotated D. melanogaster genome.
A. Changes affecting gene model number.
There are 5 main types of changes.
1. New Gene Models.
When a gene model is made where previously there was none annotated on
the genome, a comment of the following type is listed in the "OTHER
COMMENTS" subsection of the "OTHER INFORMATION" section of the relevant
gene report, attributed to this reference.
"New annotation (CGxxxxx) in release 5.2 of the genome annotation."
where CGxxxxx is the annotation symbol.
2. Restored Gene Models.
A gene model that had previously been deleted in an earlier release of
the genome (due to lack of sufficient supporting evidence) may be
restored in a later release of the genome if new evidence for the gene
model becomes available. In this case, a comment of the following type
is listed in the "OTHER COMMENTS" subsection of the "OTHER INFORMATION"
section of the relevant gene report, attributed to this reference.
"Annotation CGxxxxx restored in release 5.2 of the genome annotation."
where CGxxxxx is the annotation symbol.
3. Deleted Gene Models.
A gene model may be deleted due to lack of sufficient supporting
evidence. In this case, although there will no longer be a gene model
on GBrowse, the gene report is not deleted, so that data about the gene
from the literature remain available on FlyBase. A comment of the
following type is listed in the "OTHER COMMENTS" subsection of the
"OTHER INFORMATION" section of the relevant gene report, attributed to
this reference.
"Annotation CGxxxxx eliminated in release 5.2 of the genome annotation."
where CGxxxxx is the annotation symbol.
4. Merged Gene Models.
Gene models that were previously separate may be merged, due to new
supporting evidence. In this case, a comment of the following type is
listed in the "RELATIONSHIP TO OTHER GENES" subsection of the "OTHER
INFORMATION" section of the gene report representing the new merged
gene model, attributed to this reference.
"Annotations CGxxxxx and CGyyyyy merged as CGzzzzz in release 5.2 of the genome annotation."
where CGzzzzz is the annotation symbol of the new merged gene model.
5. Split Gene Models.
A gene model may be split into two or more separate gene models, due to new
supporting evidence. In this case, a comment of the following type is
listed in the "RELATIONSHIP TO OTHER GENES" subsection of the "OTHER
INFORMATION" section of the gene report representing each new separate
gene model, attributed to this reference.
"Annotation CGxxxxx split into CGyyyyy and CGzzzzz in release 5.2 of the genome annotation."
where CGyyyyy and CGzzzzz are the annotation symbols of the new separate gene models.
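The five comment templates in section A are regular enough to be
recognised mechanically. The following Python sketch is one possible
way to do this and is not a FlyBase tool: the function name
classify_comment, the label strings, and the assumption that an
annotation symbol is "CG" or "CR" followed by digits are all
illustrative choices. The release number is captured rather than fixed
at 5.2.

```python
import re

# Assumption: annotation symbols look like CG or CR followed by digits.
ANNOTATION_ID = re.compile(r"\bC[GR]\d+\b")

# Shared ending of every template in section A.
RELEASE_SUFFIX = r" in release (?P<release>\d+(?:\.\d+)*) of the genome annotation\.$"

# Hypothetical labels paired with patterns built from the templates above.
CHANGE_PATTERNS = [
    ("new", re.compile(r"^New annotation \(C[GR]\d+\)" + RELEASE_SUFFIX)),
    ("restored", re.compile(r"^Annotation C[GR]\d+ restored" + RELEASE_SUFFIX)),
    ("deleted", re.compile(r"^Annotation C[GR]\d+ eliminated" + RELEASE_SUFFIX)),
    ("merged", re.compile(r"^Annotations .+ merged as C[GR]\d+" + RELEASE_SUFFIX)),
    ("split", re.compile(r"^Annotation C[GR]\d+ split into .+" + RELEASE_SUFFIX)),
]


def classify_comment(comment):
    """Return the change type, release and annotation symbols, or None."""
    for change_type, pattern in CHANGE_PATTERNS:
        match = pattern.match(comment)
        if match:
            return {
                "type": change_type,
                "release": match.group("release"),
                # Symbols are returned in the order they appear in the comment.
                "annotations": ANNOTATION_ID.findall(comment),
            }
    return None


# Made-up annotation symbols, for illustration only.
example = "Annotations CG11111 and CG22222 merged as CG33333 in release 5.2 of the genome annotation."
result = classify_comment(example)
```

For a merged comment the last symbol in "annotations" is the new merged
gene model; for a split comment the first symbol is the original gene
model and the remainder are the new separate gene models.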
B. Changes affecting gene model type.
If a gene model is changed from protein coding (annotation symbol
begins with "CG") to non-protein coding (annotation symbol begins with
"CR") or vice versa, a comment of the following type is listed in the
"OTHER COMMENTS" subsection of the "OTHER INFORMATION" section of the
relevant gene report, attributed to this reference.
"Annotation CGxxxxx renamed CRxxxxx in release 5.2 of the genome annotation."
or
"Annotation CRxxxxx renamed CGxxxxx in release 5.2 of the genome annotation."
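The direction of a type change can be read from the two prefixes in the
comment. A minimal Python sketch, assuming the comment follows one of
the two templates above exactly; the function name
describe_type_change and the digit-only form of the symbol number are
assumptions made for illustration.

```python
import re

# Pattern built from the two rename templates above.
RENAME = re.compile(
    r"^Annotation (?P<old>C[GR])\d+ renamed (?P<new>C[GR])\d+"
    r" in release \d+(?:\.\d+)* of the genome annotation\.$"
)

# Prefix meanings as stated in section B.
PREFIX_MEANING = {"CG": "protein coding", "CR": "non-protein coding"}


def describe_type_change(comment):
    """Return (old type, new type), or None if this is not a type change."""
    match = RENAME.match(comment)
    if match is None or match.group("old") == match.group("new"):
        return None
    return (PREFIX_MEANING[match.group("old")], PREFIX_MEANING[match.group("new")])


# Made-up annotation symbol, for illustration only.
change = describe_type_change(
    "Annotation CG12345 renamed CR12345 in release 5.2 of the genome annotation."
)
```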
C. Additional comments.
The following types of additional comments may also be attributed to this reference.
1. Dicistronic gene models.
If a new dicistronic gene model is created, a comment of the following
type is listed in the "RELATIONSHIP TO OTHER GENES" subsection of the
"OTHER INFORMATION" section of the gene reports representing each open
reading frame, attributed to this reference.
"One or more of the processed transcripts for this gene contain(s) two non-overlapping open reading frames (ORFs). The non-overlapping ORFs are represented by X, Y."
where X and Y are the gene symbols representing each open reading frame.
2. Gene models sharing untranslated sequences.
If two separate gene models are newly identified to share untranslated
sequences but have non-overlapping open reading frames, a comment of
the following type is listed in the "RELATIONSHIP TO OTHER GENES"
subsection of the "OTHER INFORMATION" section of the gene reports
representing each open reading frame, attributed to this reference.
"One or more of the processed transcripts for this gene share(s) untranslated sequences with a transcript of an adjacent gene, but encode(s) a single open reading frame (ORF). The non-overlapping ORFs that share untranslated sequences are represented by X, Y."
where X and Y are the gene symbols representing each non-overlapping open reading frame.
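Both the dicistronic comment and the shared untranslated sequence
comment end with a list of gene symbols. The Python sketch below pulls
that list out. It assumes the symbols are separated by commas and that
no symbol itself contains a comma, which the templates do not
guarantee; the function name orf_gene_symbols and the symbols geneA and
geneB are hypothetical.

```python
import re

# Matches the closing sentence shared by the two templates above.
ORF_LIST = re.compile(
    r"The non-overlapping ORFs(?: that share untranslated sequences)?"
    r" are represented by (?P<symbols>.+?)\.$"
)


def orf_gene_symbols(comment):
    """Return the gene symbols listed at the end of the comment."""
    match = ORF_LIST.search(comment)
    if match is None:
        return []
    # Assumption: symbols are comma-separated and contain no commas.
    return [symbol.strip() for symbol in match.group("symbols").split(",")]


# geneA and geneB are placeholder symbols, not real genes.
symbols = orf_gene_symbols(
    "One or more of the processed transcripts for this gene contain(s) two "
    "non-overlapping open reading frames (ORFs). The non-overlapping ORFs "
    "are represented by geneA, geneB."
)
```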
3. Known mutations in the sequenced strain.
If a mutation in the sequenced strain is newly identified in a genome
release, an allele representing this mutation is created and a comment
of the following type is listed in the "COMMENTS" section of the allele
report, attributed to this reference.
"Mutation found during genome annotation of the strain used in the genome sequencing project."