Evolution of Genes and
Genomes in the Genus Drosophila
Introduction
n Model system for evolutionary genomics
o Use well-characterized model system to better
understand genome evolution
o Phylogeny imposes an extrinsic hypothesis for how
features of the genome should change.
o Take advantage of multiple species to study gain,
loss, duplication, and transformation of genomic features
o Optimize species choice and methods for comparative
genomics
o Relate genome variation to phenotypic variation
n Background
o Species and phylogeny (Figure showing phylogeny and
pictures of each species – Teri Markow)
o Age of group
o Evolutionary history and ecological niches
o Characteristics of each species/species group
Results
n Overview of sequencing and assembly (Doug Smith)
o Sequencing (Doug Smith, summarizing results of
Agencourt, JCVI, Wash U, Broad)
¤ Not too much detail in the text, mostly in methods
¤ Differences between the projects (libraries, depth of coverage, etc.)
o Assembly (Doug and Sergei)
¤ Reconciliation: need a paragraph on this since it's not standard, then point to paper on reconciliation process (Jim Yorke and colleagues)
o Table summarizing sequencing stats for all species
(will also be a supplement with additional information) (Mike Eisen and Venky
Iyer)
¤ Number of reads and total bases
¤ Genome size (not from sequence – do we have estimates of this? If not, we will do these here)
¤ Assembly
á Contigs
á Scaffolds
á Size
á Coverage
¤ Genes
o Wolbachia (this is just a reference to the paper on
discovering Wolbachia in these traces)
o Maybe a note on mtDNA here too?
n Annotation
o Coding genes (Venky Iyer, Mike Eisen)
¤ Different methods used (Venky Iyer, Mike Eisen, Mike Brent, Chris Ponting, Andreas Heger, Lior Pachter, Don Gilbert, NCBI, others?)
¤ Consensus sets (Venky Iyer, Aaron Mackey)
¤ Problems with consensus sets
¤ Masking of alignments (Tim Sackton)
¤ Synteny (Venky Iyer, AJ Bhutkar)
o ncRNAs (Casey Bergman)
o tRNAs (Casey Bergman)
¤ Have RFAM annotations
¤ Have asked Ian Holmes to provide de novo annotations
o miRNAs (Chung-I Wu)
o Transposable elements (there are several groups who
have tried to characterize TEs in these different genomes, and, since there is
no really solid method yet, I think they should be encouraged to submit a
description of what they did and found)
n Coding gene alignments (Venky Iyer, Tim Sackton)
n Whole-genome alignments
o Mercator (need Lior Pachter and Colin Dewey to
describe these and provide some info on their quality)
o 12-way blastz alignments from UCSC (Angie Hinrichs)
n Genome structure – (Thom Kaufman, Steve
Schaeffer, AJ Bhutkar, Bill Gelbart, others?)
o Transposition and chromosomal rearrangements
o Inversions
¤ Macro
¤ Micro
o Patterns of synteny
o Relating rates of evolution to rearrangements
n Genome Size (Eisen Lab)
o Where is the variation in genome size coming from?
o What are the relative sizes of different features in
different genomes (e.g. introns, UTRs, intergenic space, other features)?
o How are insertions and deletions of different sizes
distributed across the genomes?
o Are they expansions/losses?
n Coding gene evolution (Andy coordinating with lots of
contribution from others: Rasmus Nielsen, Melissa Hubisz, Montse Aguade, Julio Rozas, Mohamed
Noor, Lindy McBride, Roman Arguello, Chris Ponting, Andreas Heger)
o Rates
¤ dN/dS
¤ adjusted for codon usage/GC etc.
¤ amino acid sequence rates – radical vs. conservative across whole taxa
¤ rate heterogeneity
o Relationship between rates and
¤ genome features
á GC
á recombination
á TE density
¤ gene properties
á GO
á expression
o Properties of
¤ extremely conserved genes
¤ rapidly diverging genes
¤ specific classes of genes
o Selection
¤ Positive selection
¤ Lineage specific positive selection
¤ Inference of selection stratified by functional class
á Sex related genes – Rama Singh, Mariana Wolfner
á Genes involved in speciation (Dan Barbash, Allen Orr, Daven Presgraves)
o Codon usage (Saverio Vicario, Jeff Powell, Akashi,
Andreas Heger, Chris Ponting)
o Selection at synonymous sites (Nielsen, Chip Aquadro)
o Gene structure evolution
¤ Intron/Exon structure (Scott xxx, Hartl)
¤ Splice signals (Angela Brooks and Michael Eisen)
¤ Promoters
n Intron Evolution (Josep Comeron, Wolfgang Stephan,
Dmitri Petrov, Scott Roy, Dan Hartl, Andy Clark)
o Mechanisms for expansion and contraction
o Correlation with genome size
o Driven by indel balance or selection?
n Multigene families (Matt Rasmussen)
o Expansion along different lineages
n Specific families
o Innate Immunity (Andy Clark w/ Brian Lazzaro, Tim
Sackton, Todd Schlenke, Jay Evans, Dan Hultmark)
o Cytochrome P450s – Phil Batterham, Charles
Robin
o OR/Gr – Roman Arguello, Lindy McBride
n Novel/Lineage specific (Eisen)
o lineage specific genes
o lineage specific exons/introns
n Genome organization
o Rearrangement
o Patterns of synteny
¤ Statistical model for this – do we have enough power?
¤ Rates of flux of genes onto and off of X chromosome
¤ Moving from arm to arm
o Genes moving between euchromatin and heterochromatin
n RNA evolution (Casey Bergman)
n Regulatory sequence evolution (Mike Eisen, Dan
Pollard)
o Binding sites
o Turnover
o Couple this with Kevin White and Brian Oliver's
work on the evolution of gene expression?
n Transposon evolution (Casey Bergman)
o Gain/loss
o Evolution of specific families
¤ age/history
n Comparison of rates and patterns of evolution in
different genomic features
o e.g. Are genome rearrangements related to transposon
positions? GC?
n Phylogeny
o Gene trees v. species trees (Dan Pollard –
pointer to discordance paper)
o Species divergence times (Patrick O'Grady)
n What's missing from D. melanogaster?
n Sections of genomic features
o X chromosome (Manyuan Long, Nadia Singh)
¤ Contrasting rates and patterns of divergence between X and autosomes
¤ Codon usage differences
¤ Male-biased genes
¤ Male v. Female mutation rates
o Y chromosome (Bernardo Carvalho, Leonardo Koerich,
Doris Bachtrog, Amanda Larracuente, Andy Clark)
¤ Gene content, structure, organization
¤ Rates of gene gain and loss
¤ Rates and patterns of sequence divergence
o dot chromosome (Sally Elgin, Manyuan Long)
¤ Gene content, structure, patterns of divergence
¤ Traffic of genes onto and off of dot
¤ Polymorphisms
o Mitochondria (David Rand, Kristi Montooth, Dawn Abt, Jeffrey Hoffman)
¤ Assembly of mitochondrial genomes
¤ Analysis of divergence
¤ Gene phylogenies
Discussion points
n Benefits of having many species
n Species/Tree choice – depends on the
question(s)
n Relationship between lineage specific biology and
genomic features
n Analysis of genomes on a phylogeny provides an
opportunity to ask questions at the interface of full-genome biology and
evolution:
o To what extent are genomic features driven by
adaptive evolution as opposed to being frozen accidents?
o To what extent do genomic features constrain the
evolutionary options of species?
o Are there genomic signatures of extinction liability?