Evolution of Genes and Genomes in the Genus Drosophila



n      Model system for evolutionary genomics

o       Use well-characterized model system to better understand genome evolution

o       Phylogeny imposes an extrinsic hypothesis for how features of the genome should change.

o       Take advantage of multiple species to study gain, loss, duplication and transformation of genomic features

o       Optimize species choice and methods for comparative genomics

o       Relate genome variation to phenotypic variation

n      Background

o       Species and phylogeny (Figure showing phylogeny and pictures of each species – Teri Markow)

o       Age of group

o       Evolutionary history and ecological niches

o       Characteristics of each species/species group


n      Overview of sequencing and assembly (Doug Smith)

o       Sequencing (Doug Smith, summarizing results of Agencourt, JCVI, Wash U, Broad)

        Not too much detail in the text, mostly in methods

        Differences between the projects (libraries, depth of coverage, etc.)

o       Assembly (Doug and Sergei)

        Reconciliation – need a paragraph on this since it's not standard – then point to paper on reconciliation process (Jim Yorke and colleagues)

o       Table summarizing sequencing stats for all species (will also be a supplement with additional information) (Mike Eisen and Venky Iyer)

        Number of reads and total bases

        Genome size (not from sequence – do we have estimates of this? if not, we will do these here)







o       Wolbachia (this is just a reference to the paper on discovering Wolbachia in these traces)

o       Maybe a note on mtDNA here too?
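
The sequencing table above lists number of reads, total bases and genome size, and the projects differ in depth of coverage. As a minimal sketch of how those columns relate, average depth can be taken as total sequenced bases divided by estimated genome size. The function name and the numbers below are hypothetical and are not project statistics.

```python
def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Return average depth of coverage: sequenced bases per genome base."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


# Hypothetical values for illustration only (not real project numbers).
depth = fold_coverage(total_bases=1_600_000_000, genome_size_bp=200_000_000)
print(f"{depth:.1f}x")
```

Using a genome size that does not come from the assembly itself keeps the depth estimate independent of how well repeats were assembled.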

n      Annotation

o       Coding genes (Venky Iyer, Mike Eisen)

        Different methods used (Venky Iyer, Mike Eisen, Mike Brent, Chris Ponting, Andreas Heger, Lior Pachter, Don Gilbert, NCBI, others?)

        Consensus sets (Venky Iyer, Aaron Mackey)

        Problems with consensus sets

        Masking of alignments (Tim Sackton)

        Synteny (Venky Iyer, AJ Bhutkar)

o       ncRNAs (Casey Bergman)

o       tRNAs (Casey Bergman)

        Have RFAM annotations

        Have asked Ian Holmes to provide de novo annotations

o       miRNAs (Chung-I Wu)

o       Transposable elements (there are several groups who have tried to characterize TEs in these different genomes, and, since there is no really solid method yet, I think they should be encouraged to submit a description of what they did and found)

n      Coding gene alignments (Venky Iyer, Tim Sackton)

n      Whole-genome alignments

o       Mercator (need Lior Pachter and Colin Dewey to describe these and provide some info on their quality)

o       12-way blastz alignments from UCSC (Angie Hinrichs)

n      Genome structure – (Thom Kaufman, Steve Schaeffer, AJ Bhutkar, Bill Gelbart, others?)

o       Transposition and chromosomal rearrangements

o       Inversions



o       Patterns of synteny

o       Relating rates of evolution to rearrangements

n      Genome Size (Eisen Lab)

o       Where is the variation in genome size coming from?

o       What are the relative sizes of different features in different genomes (e.g. introns, UTRs, intergenic space, other features)?

o       How are insertions and deletions of different sizes distributed across the genomes?

o       Are they expansions/losses?
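
One way to approach the relative-size question above is to compute, for each genome, the fraction of bases covered by each feature class. The sketch below assumes annotations are available as half-open `(start, end)` intervals per class on a single coordinate system. The function names, the input layout and the toy data are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), non-negative coordinates


def covered_bases(intervals: List[Interval]) -> int:
    """Count bases covered by at least one interval, merging overlaps."""
    total = 0
    current_end = 0
    for start, end in sorted(intervals):
        if end <= current_end:
            continue  # fully contained in what was already counted
        total += end - max(start, current_end)
        current_end = end
    return total


def feature_fractions(
    features: Dict[str, List[Interval]], genome_length: int
) -> Dict[str, float]:
    """Return the fraction of the genome covered by each feature class."""
    return {
        name: covered_bases(ivs) / genome_length
        for name, ivs in features.items()
    }


# Toy data for illustration only.
toy = {"intron": [(0, 10), (5, 15)], "intergenic": [(20, 30)]}
print(feature_fractions(toy, genome_length=100))
```

Comparing these fractions across species would show which feature classes account for differences in total genome size. Classes that overlap each other (for example UTRs inside genes) would need an explicit precedence rule, which this sketch does not impose.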

n      Coding gene evolution (Andy coordinating, with lots of contribution from others: Rasmus Nielsen, Melissa Hubisz, Montse Aguade, Julio Rozas, Mohamed Noor, Lindy McBride, Roman Arguello, Chris Ponting, Andreas Heger)

o       Rates


        adjusted for codon usage/GC, etc.

        amino acid sequence rates – radical vs. conservative across whole taxa

        rate heterogeneity

o       Relationship between rates and

        genome features



        TE density

        gene properties



o       Properties of

        extremely conserved genes

        rapidly diverging genes

        specific classes of genes

o       Selection

        Positive selection

        Lineage specific positive selection

        Inference of selection stratified by functional class

        Sex related genes – Rama Singh, Mariana Wolfner

        Genes involved in speciation (Dan Barbash, Allen Orr, Daven Presgraves)

o       Codon usage (Saverio Vicario, Jeff Powell, Akashi, Andreas Heger, Chris Ponting)

o       Selection at synonymous sites (Nielsen, Chip Aquadro)

o       Gene structure evolution

        Intron/Exon structure (Scott xxx, Hartl)

        Splice signals (Angela Brooks and Michael Eisen)
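
The rate analyses above are to be adjusted for codon usage and GC content. One commonly used summary that could serve as such a covariate is GC3, the fraction of G or C at third codon positions. The sketch below assumes an in-frame coding sequence given as a plain string. The function name is hypothetical and the outline does not specify which statistic will actually be used.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("CDS must be non-empty with length divisible by 3")
    third_positions = seq[2::3]  # every third base, starting at codon position 3
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Toy sequence for illustration only: codons ATG GCC AAA TAA.
print(gc3("ATGGCCAAATAA"))
```

Ambiguous bases such as N are counted as non-GC here. A real analysis would probably exclude them from the denominator and drop the stop codon.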


n      Intron Evolution (Josep Comeron, Wolfgang Stephan, Dmitri Petrov, Scott Roy, Dan Hartl, Andy Clark)

o       Mechanisms for expansion and contraction

o       Correlation with genome size

o       Driven by indel balance or selection?

n      Multigene families (Matt Rasmussen)

o       Expansion along different lineages

n      Specific families

o       Innate Immunity (Andy Clark w/ Brian Lazzaro, Tim Sackton, Todd Schlenke, Jay Evans, Dan Hultmark)

o       Cytochrome P450s – Phil Batterham, Charles Robin

o       OR/Gr – Roman Arguello, Lindy McBride

n      Novel/Lineage specific (Eisen)

o       lineage specific genes

o       lineage specific exons/introns

n      Genome organization

o       Rearrangement

o       Patterns of synteny

        Statistical model for this – do we have enough power?

        Rates of flux of genes onto and off of X chromosome

        Moving from arm to arm

o       Genes moving between euchromatin and heterochromatin

n      RNA evolution (Casey Bergman)

n      Regulatory sequence evolution (Mike Eisen, Dan Pollard)

o       Binding sites

o       Turnover

o       Couple this with Kevin White and Brian Oliver's evolution of gene expression?

n      Transposon evolution (Casey Bergman)

o       Gain/loss

o       Evolution of specific families


n      Comparison of rates and patterns of evolution in different genomic features

o       e.g. Are genome rearrangements related to transposon positions? GC?

n      Phylogeny

o       Gene trees v. species trees (Dan Pollard – pointer to discordance paper)

o       Species divergence times (Patrick O'Grady)
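
For the gene tree versus species tree comparison above, a minimal sketch of a topology check is to collect the clades (sets of descendant leaves) of each rooted tree and test whether they match. Trees are assumed here to be nested tuples of leaf names. The function names and the three-taxon example are hypothetical, and the discordance paper referred to above may use a different method.

```python
def clades(tree) -> set:
    """Return the leaf set under every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node) -> frozenset:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def is_concordant(gene_tree, species_tree) -> bool:
    """True if both rooted trees imply exactly the same clades."""
    return clades(gene_tree) == clades(species_tree)


# Toy three-taxon example for illustration only.
species = (("D. melanogaster", "D. simulans"), "D. yakuba")
gene = (("D. melanogaster", "D. yakuba"), "D. simulans")
print(is_concordant(gene, species))
```

Because clades are compared as sets, the order in which children are written does not matter. Both trees must be rooted and contain the same taxa for the comparison to be meaningful.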

n      What's missing from D. melanogaster?

n      Sections of genomic features

o       X chromosome (Manyuan Long, Nadia Singh)

        Contrasting rates and patterns of divergence between X and autosomes

        Codon usage differences

        Male-biased genes

        Male v. Female mutation rates

o       Y chromosome (Bernardo Carvalho, Leonardo Koerich, Doris Bachtrog, Amanda Larracuente, Andy Clark)

        Gene content, structure, organization

        Rates of gene gain and loss

        Rates and patterns of sequence divergence

o       dot chromosome (Sally Elgin, Manyuan Long)

        Gene content, structure, patterns of divergence

        Traffic of genes onto and off of dot


o       Mitochondria (David Rand, Kristi Montooth, Dawn Abt, Jeffrey Hoffman)

        Assembly of mitochondrial genomes

        Analysis of divergence

        Gene phylogenies

Discussion points

n      Benefits of having many species

n      Species/Tree choice – depends on the question(s)

n      Relationship between lineage specific biology and genomic features

n      Analysis of genomes on a phylogeny provides an opportunity to ask questions at the interface of full-genome biology and evolution:

o       To what extent are genomic features driven by adaptive evolution as opposed to being frozen accidents?

o       To what extent do genomic features constrain the evolutionary options of species?

o       Are there genomic signatures of extinction liability?