THE TREE OF LIFE

occam's razor - in science, prefer the simplest explanation that fits the evidence

on the origin of species

Darwin considered distribution and speciation but didn't try to make trees

ernst haeckel

made the first serious attempt to relate all animal groups together in a tree

retained elements of scala naturae - traces to Aristotle

inferring the tree of life

a lot of what we know about development comes from model organisms

the range of model organisms is increasingly diversifying (especially important in drug discovery, eg. mice are a poor model for human sepsis)

can set conservation priorities according to species remoteness (relative to the tree of life)

combining conservation status with evolutionary distinctiveness
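one published way of combining the two is the EDGE score (evolutionarily distinct and globally endangered), EDGE = ln(1 + ED) + GE × ln(2). Whether these notes have that exact metric in mind is an assumption. A minimal sketch, with made-up species names and ED values:

```python
import math

# IUCN Red List categories mapped to the GE weight used in the EDGE score.
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}

def edge_score(ed: float, red_list: str) -> float:
    """EDGE = ln(1 + ED) + GE * ln(2)."""
    return math.log(1 + ed) + GE_WEIGHT[red_list] * math.log(2)

# Hypothetical species: (evolutionary distinctiveness, Red List category).
examples = {"species_a": (10.0, "CR"), "species_b": (40.0, "LC")}
for name, (ed, category) in examples.items():
    print(name, round(edge_score(ed, category), 3))
```

each step up in threat category adds ln(2) to the score, so a moderately distinct but critically endangered species can outrank a very distinct but secure one.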

helps with temporal analysis and molecular clocks to pinpoint divergence

phenetics vs cladistics

phenetics

groups on overall similarity

homoplasy "averaged out"

no rooting

classification only

cladistics

groups on synapomorphy (shared derived characters)

parsimony, likelihood or bayesian inference

character evolution is reconstructed

tree attempts to reconstruct history

sources of data

1. morphology (possibly including fossils)

2. molecules

3. fossils (the only way to obtain time calibration)

morphological homology

a character shared between species that was also present in their common ancestor

eg, human arm has humerus, radius, ulna, carpals, metacarpals, phalanges... so do turtle flipper, horse leg, bird wing, bat wing, seal flipper

problems with morphological characters

  1. which characters to code
  1. how to code character states
  1. models of character evolution
  1. convergence and homoplasy

fitch parsimony (unordered)

changes between all states occur with equal cost to tree length

wagner parsimony (ordered)

changes between states 0 and 2 occur with a cost of 2 steps

complex eyes are a good example
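the difference between the two cost schemes can be sketched as two step-cost functions. The function names are illustrative, and character states are assumed to be coded as integers 0, 1, 2, ...:

```python
def fitch_cost(a: int, b: int) -> int:
    """Unordered (Fitch): any change between two different states costs 1 step."""
    return 0 if a == b else 1

def wagner_cost(a: int, b: int) -> int:
    """Ordered (Wagner): a change must pass through every intermediate state."""
    return abs(a - b)

for a, b in [(0, 1), (1, 2), (0, 2)]:
    print(f"{a} -> {b}: fitch = {fitch_cost(a, b)}, wagner = {wagner_cost(a, b)}")
```

the two schemes only disagree for changes that skip a state, such as 0 to 2.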

character conflict & homoplasy

bird, crocodile, lizard, turtle - how are they related

similarities according to external surface, physiology & limbs? - excludes bird

similarities according to skull and gait? - excludes turtle

parsimony

looks at where on the tree characters have to change & count them up
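counting changes on a fixed tree can be sketched with Fitch's algorithm for unordered characters. The trees are nested tuples, and the two binary characters below are made-up codings for illustration, not real data for these animals:

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character."""
    if isinstance(tree, str):              # a leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:                             # children agree: no extra change needed
        return shared, left_changes + right_changes
    # children disagree: one change must occur on a branch below this node
    return left_set | right_set, left_changes + right_changes + 1

def tree_length(tree, characters):
    """Total number of changes over all characters."""
    return sum(fitch(tree, states)[1] for states in characters)

# Hypothetical character codings (0 = absent, 1 = present).
characters = [
    {"bird": 1, "crocodile": 1, "lizard": 0, "turtle": 0},
    {"bird": 0, "crocodile": 1, "lizard": 1, "turtle": 1},
]
tree_a = (("bird", "crocodile"), ("lizard", "turtle"))
tree_b = (("bird", "lizard"), ("crocodile", "turtle"))
print(tree_length(tree_a, characters), tree_length(tree_b, characters))
```

under these invented codings tree_a needs fewer changes than tree_b, so parsimony would prefer it. A real analysis searches over many candidate trees rather than comparing two by hand.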

problems with nucleotide characters

  1. which molecules to sequence
  1. need genes evolving at right speed
  1. signal becomes saturated
  1. alignment
  1. need right model of sequence evolution
  1. third codon positions? (option: down-weight 3rd codon positions)
  1. gene duplication (need orthologous genes)

principally 3 classes

nuclear genes

mitochondrial genes

ribosomal genes - changes in regions that pair up stop the ribosome working, so selection keeps them conserved, but unpaired (open) regions can mutate more freely

if changes keep happening then signal gets lost

eg. can have 21 changes but only 17 differences

simulating sequence evolution

  1. begin with a DNA sequence of 10,000 basepairs
  1. pick one basepair at random and substitute it to another basepair
  1. repeat 10,000 times
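the steps above can be sketched as follows. The starting sequence is random and the seed is arbitrary, both assumptions made only so the run is repeatable:

```python
import random

BASES = "ACGT"

def simulate(length=10_000, n_substitutions=10_000, seed=0):
    """Apply random substitutions and compare true changes with observed differences."""
    rng = random.Random(seed)
    original = [rng.choice(BASES) for _ in range(length)]
    evolved = list(original)
    for _ in range(n_substitutions):
        site = rng.randrange(length)                     # pick one basepair at random
        alternatives = [b for b in BASES if b != evolved[site]]
        evolved[site] = rng.choice(alternatives)         # substitute a different base
    differences = sum(a != b for a, b in zip(original, evolved))
    return n_substitutions, differences

changes, differences = simulate()
print(f"{changes} changes, {differences} observed differences")
```

sites that are hit more than once, or that mutate back to the original base, are why the observed differences fall short of the true number of changes.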

sequences may be of different lengths

alignment must penalise the number of ad hoc insertions and deletions (gaps)
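a minimal sketch of global alignment with a gap penalty (Needleman-Wunsch dynamic programming). The scoring values and the example sequences are arbitrary assumptions, and real aligners use more elaborate gap costs:

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap                  # a aligned against leading gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap                  # b aligned against leading gaps

    def sub(i, j):                             # score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

best, aligned_a, aligned_b = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(aligned_a)
print(aligned_b)
```

making the gap penalty more negative discourages the algorithm from inserting gaps just to line up a few extra matching bases.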

jukes-cantor - all changes occur with equal probability (between AGTC)

difference between purines and pyrimidines makes this untrue

swapping G for A is easier than swapping G for C

Kimura two-parameter - alpha = rate of transitions, beta = rate of transversions
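both models give a corrected distance from the observed differences between two aligned sequences. A minimal sketch using the standard distance formulas, where p is the proportion of differing sites, P the proportion of transitions and Q the proportion of transversions. The example sequences are made up, and the code assumes equal-length, gap-free sequences:

```python
import math

PURINES = {"A", "G"}

def jc69_distance(s1, s2):
    """Jukes-Cantor: d = -3/4 * ln(1 - 4p/3)."""
    p = sum(a != b for a, b in zip(s1, s2)) / len(s1)
    return -0.75 * math.log(1 - 4 * p / 3)

def k2p_distance(s1, s2):
    """Kimura two-parameter: d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q)."""
    transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):   # purine<->purine or pyrimidine<->pyrimidine
            transitions += 1
        else:                                  # purine<->pyrimidine
            transversions += 1
    P = transitions / len(s1)
    Q = transversions / len(s1)
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

seq1 = "A" * 20
seq2 = "GG" + "A" * 18                          # two A->G transitions, p = 0.1
print(round(jc69_distance(seq1, seq2), 4), round(k2p_distance(seq1, seq2), 4))
```

both corrected distances exceed the raw proportion of differences (0.1) because the models allow for multiple changes at the same site.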

gene trees vs. species trees

distinguishing orthologues from paralogues

hemoglobin: alpha, beta, gamma in mammals

sequences are equally distant irrespective of organism

primordial hemoglobin

duplicates into alpha and beta

speciates into human alpha and cow alpha

speciates into human beta and cow beta

model of sequence divergence can be used to extract the duplication dates of the different haemoglobin chains

model explains why the distance between human alpha and cow alpha is shorter than the distance between human alpha and human beta

arthropod phylogeny

hexapoda

chelicerata

crustacea

myriapoda

potential relationships of major groups

atelocerata & mandibulata

LCA (chelicerata, mandibulata (crustacea, atelocerata (myriapoda, hexapoda)))

atelocerata & schizoramia

LCA (schizoramia (chelicerata, crustacea), atelocerata (myriapoda, hexapoda))

pancrustacea & paradoxopoda

LCA (paradoxopoda (chelicerata, myriapoda), pancrustacea (crustacea, hexapoda))

pancrustacea & mandibulata

LCA (chelicerata, mandibulata (myriapoda, pancrustacea (crustacea, hexapoda)))
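the four candidate topologies can be written as nested tuples, which makes it easy to check which named groups are clades under each hypothesis. The dictionary keys and helper names are illustrative only:

```python
def leaves(tree):
    """Set of terminal taxa below a node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def clades(tree):
    """Leaf sets of every internal node in the tree."""
    if isinstance(tree, str):
        return []
    found = [frozenset(leaves(tree))]
    for child in tree:
        found.extend(clades(child))
    return found

def is_clade(tree, group):
    """True if the group is exactly the leaf set of some internal node."""
    return frozenset(group) in clades(tree)

hypotheses = {
    "atelocerata & mandibulata": ("chelicerata", ("crustacea", ("myriapoda", "hexapoda"))),
    "atelocerata & schizoramia": (("chelicerata", "crustacea"), ("myriapoda", "hexapoda")),
    "pancrustacea & paradoxopoda": (("chelicerata", "myriapoda"), ("crustacea", "hexapoda")),
    "pancrustacea & mandibulata": ("chelicerata", ("myriapoda", ("crustacea", "hexapoda"))),
}
pancrustacea = {"crustacea", "hexapoda"}
atelocerata = {"myriapoda", "hexapoda"}
for name, tree in hypotheses.items():
    print(name, is_clade(tree, pancrustacea), is_clade(tree, atelocerata))
```

pancrustacea and atelocerata cannot both be clades on the same tree, because each claims hexapoda as the sister of a different group.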

what is probably true

  1. much depends on method of analysis
  1. myriapods dispersed throughout the tree
  1. pancrustacea is supported but neither hexapoda nor crustacea emerges as a clade

pancrustacea, myriapoda, chelicerata - separate clades

on a part of a tree where a lot has gone extinct

spiders

scorpions

horseshoe crabs

long debated whether paraphyletic or monophyletic

now mostly agreed to be a clade

phylogenomics of arthropods

insects emerge from a paraphyletic crustacea

molecular analyses

led to inconsistent and inexplicable results

contain some evidently untenable relationships

possible solutions

  1. better taxon sampling
  1. better character sampling - use several markers
  1. include data on morphology
  1. include data from fossils
  1. better/multiple analytical techniques
  1. taxonomic congruence or total evidence

total evidence analysis

uses all available characters and analyses the raw data (rather than trees)

can include many types of data

maximises resolution of phylogeny

strengthens weak but correct phylogenetic signals

avoids arbitrary choices of consensus methods

avoids dubious data set partitions

weighting molecular vs morphological characters

the largest morphological matrices are only several thousand characters

with sequence data 5000 base pairs is nothing - molecular data is much bigger

it's not always the case that lots and lots of data 'wins the argument'