Evolution of Genes and Genomes in the Genus Drosophila

 

Introduction

n      Model system for evolutionary genomics

o       Use well-characterized model system to better understand genome evolution

o       Phylogeny imposes an extrinsic hypothesis for how features of the genome should change.

o       Take advantage of multiple species to study gain, loss, duplication and transformation of genomic features

o       Optimize species choice and methods for comparative genomics

o       Relate genome variation to phenotypic variation

n      Background

o       Species and phylogeny (Figure showing phylogeny and pictures of each species – Teri Markow)

o       Age of group

o       Evolutionary history and ecological niches

o       Characteristics of each species/species group

Results

n      Overview of sequencing and assembly (Doug Smith)

o       Sequencing (Doug Smith, summarizing results of Agencourt, JCVI, Wash U, Broad)

¤        Not too much detail in the text, mostly in methods

¤        Differences between the projects (libraries, depth of coverage, etc.)

o       Assembly (Doug and Sergei)

¤        Reconciliation – need a paragraph on this since it's not standard – then point to paper on reconciliation process (Jim Yorke and colleagues)

o       Table summarizing sequencing stats for all species (will also be a supplement with additional information) (Mike Eisen and Venky Iyer)

¤        Number of reads and total bases

¤        Genome size (not from sequence – do we have estimates of this? If not, we will do these here)

¤        Assembly

á        Contigs

á        Scaffolds

á        Size

á        Coverage

¤        Genes

o       Wolbachia (this is just a reference to the paper on discovering Wolbachia in these traces)

o       Maybe a note on mtDNA here too?

n      Annotation

o       Coding genes (Venky Iyer, Mike Eisen)

¤        Different methods used (Venky Iyer, Mike Eisen, Mike Brent, Chris Ponting, Andreas Heger, Lior Pachter, Don Gilbert, NCBI, others?)

¤        Consensus sets (Venky Iyer, Aaron Mackey)

¤        Problems with consensus sets

¤        Masking of alignments (Tim Sackton)

¤        Synteny (Venky Iyer, AJ Bhutkar)

o       ncRNAs (Casey Bergman)

o       tRNAs (Casey Bergman)

¤        Have RFAM annotations

¤        Have asked Ian Holmes to provide de novo annotations

o       miRNAs (Chung-I Wu)

o       Transposable elements (several groups have tried to characterize TEs in these genomes; since there is no really solid method yet, I think they should be encouraged to submit a description of what they did and found)

n      Coding gene alignments (Venky Iyer, Tim Sackton)

n      Whole-genome alignments

o       Mercator (need Lior Pachter and Colin Dewey to describe these and provide some info on their quality)

o       12-way blastz alignments from UCSC (Angie Hinrichs)

n      Genome structure – (Thom Kaufman, Steve Schaeffer, AJ Bhutkar, Bill Gelbart, others?)

o       Transposition and chromosomal rearrangements

o       Inversions

¤        Macro

¤        Micro

o       Patterns of synteny

o       Relating rates of evolution to rearrangements

n      Genome Size (Eisen Lab)

o       Where is the variation in genome size coming from?

o       What are the relative sizes of different features in different genomes (e.g. introns, UTRs, intergenic space, other features)?

o       How are insertions and deletions of different sizes distributed across the genomes?

o       Are they expansions or losses?

n      Coding gene evolution (Andy coordinating, with contributions from many others: Rasmus Nielsen, Melissa Hubisz, Montse Aguade, Julio Rozas, Mohamed Noor, Lindy McBride, Roman Arguello, Chris Ponting, Andreas Heger)

o       Rates

¤        dN/dS

¤        adjusted for codon usage/GC, etc.

¤        amino acid sequence rates – radical vs. conservative across whole taxa

¤        rate heterogeneity

o       Relationship between rates and

¤        genome features

á        GC

á        recombination

á        TE density

¤        gene properties

á        GO

á        expression

o       Properties of

¤        extremely conserved genes

¤        rapidly diverging genes

¤        specific classes of genes

o       Selection

¤        Positive selection

¤        Lineage-specific positive selection

¤        Inference of selection stratified by functional class

á        Sex related genes – Rama Singh, Mariana Wolfner

á        Genes involved in speciation (Dan Barbash, Allen Orr, Daven Presgraves)

o       Codon usage (Saverio Vicario, Jeff Powell, Akashi, Andreas Heger, Chris Ponting)

o       Selection at synonymous sites (Nielsen, Chip Aquadro)

o       Gene structure evolution

¤        Intron/Exon structure (Scott xxx, Hartl)

¤        Splice signals (Angela Brooks and Michael Eisen)

¤        Promoters

n      Intron Evolution (Josep Comeron, Wolfgang Stephan, Dmitri Petrov, Scott Roy, Dan Hartl, Andy Clark)

o       Mechanisms for expansion and contraction

o       Correlation with genome size

o       Driven by indel balance or selection?

n      Multigene families (Matt Rasmussen)

o       Expansion along different lineages

n      Specific families

o       Innate Immunity (Andy Clark w/ Brian Lazzaro, Tim Sackton, Todd Schlenke, Jay Evans, Dan Hultmark)

o       Cytochrome P450s – Phil Batterham, Charles Robin

o       OR/Gr – Roman Arguello, Lindy McBride

n      Novel/Lineage-specific (Eisen)

o       lineage-specific genes

o       lineage-specific exons/introns

n      Genome organization

o       Rearrangement

o       Patterns of synteny

¤        Statistical model for this – do we have enough power?

¤        Rates of flux of genes onto and off of the X chromosome

¤        Moving from arm to arm

o       Genes moving between euchromatin and heterochromatin

n      RNA evolution (Casey Bergman)

n      Regulatory sequence evolution (Mike Eisen, Dan Pollard)

o       Binding sites

o       Turnover

o       Couple this with Kevin White and Brian Oliver's evolution of gene expression?

n      Transposon evolution (Casey Bergman)

o       Gain/loss

o       Evolution of specific families

¤        age/history

n      Comparison of rates and patterns of evolution in different genomic features

o       e.g. Are genome rearrangements related to transposon positions? GC?

n      Phylogeny

o       Gene trees v. species trees (Dan Pollard – pointer to discordance paper)

o       Species divergence times (Patrick O'Grady)

n      What's missing from D. melanogaster?

n      Sections of genomic features

o       X chromosome (Manyuan Long, Nadia Singh)

¤        Contrasting rates and patterns of divergence between X and autosomes

¤        Codon usage differences

¤        Male-biased genes

¤        Male v. Female mutation rates

o       Y chromosome (Bernardo Carvalho, Leonardo Koerich, Doris Bachtrog, Amanda Larracuente, Andy Clark)

¤        Gene content, structure, organization

¤        Rates of gene gain and loss

¤        Rates and patterns of sequence divergence

o       dot chromosome (Sally Elgin, Manyuan Long)

¤        Gene content, structure, patterns of divergence

¤        Traffic of genes onto and off of the dot

¤        Polymorphisms

o       Mitochondria (David Rand, Kristi Montooth, Dawn Abt, Jeffrey Hoffman)

¤        Assembly of mitochondrial genomes

¤        Analysis of divergence

¤        Gene phylogenies

Discussion points

n      Benefits of having many species

n      Species/Tree choice – depends on the question(s)

n      Relationship between lineage-specific biology and genomic features

n      Analysis of genomes on a phylogeny provides an opportunity to ask questions at the interface of full-genome biology and evolution:

o       To what extent are genomic features driven by adaptive evolution as opposed to being frozen accidents?

o       To what extent do genomic features constrain the evolutionary options of species?

o       Are there genomic signatures of extinction liability?