THE TREE OF LIFE
Occam's razor - always prefer the simplest explanation in science
On the Origin of Species
Darwin considered distribution and speciation but didn't try to make trees
Ernst Haeckel
made the first serious attempt to relate all animal groups together in a tree
retained elements of scala naturae - traces to Aristotle
inferring the tree of life
a lot of what we know about development comes from model organisms
increasingly the range of model organisms is diversifying (especially important in drug discovery, eg. mice can't get sepsis)
can set conservation priorities according to species remoteness (relative to the tree of life)
combining conservation status with evolutionary distinctiveness
helps with temporal analysis and molecular clocks to pinpoint divergence
phenetics vs cladistics
phenetics
groups on overall similarity
homoplasy "averaged out"
no rooting
classification only
cladistics
groups on synapomorphy (shared derived characters)
parsimony, likelihood or Bayesian inference
character evolution is reconstructed
tree attempts to reconstruct history
sources of data
1. morphology (possibly including fossils)
2. molecules
3. fossils (the only way to obtain time calibration)
morphological homology
a character shared between species that was also present in their common ancestor
eg. human arm has humerus, radius, ulna, carpals, metacarpals, phalanges... so do turtle flipper, horse leg, bird wing, bat wing, seal flipper
problems with morphological characters
- which characters to code
- how to code character states
- models of character evolution
- convergence and homoplasy
Fitch parsimony (unordered)
changes between all states occur with equal cost to tree length
Wagner parsimony (ordered)
changes between states 0 and 2 occur with a cost of 2 steps
complex eyes are a good example
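the two cost schemes above can be sketched as below, assuming character states are coded as integers 0, 1, 2... (the function names are made up for illustration, they are not from any package)

```python
def fitch_cost(state_a: int, state_b: int) -> int:
    """Unordered character: any change between different states costs 1 step."""
    return 0 if state_a == state_b else 1


def wagner_cost(state_a: int, state_b: int) -> int:
    """Ordered character: a change must pass through the intermediate states,
    so the cost is the number of steps between the two state codes."""
    return abs(state_a - state_b)


# going from state 0 to state 2 under each scheme
print(fitch_cost(0, 2))   # 1 step when unordered
print(wagner_cost(0, 2))  # 2 steps when ordered
```

the choice only matters for characters with three or more states; for binary characters both schemes give the same cost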
character conflict & homoplasy
bird, crocodile, lizard, turtle - how are they related
similarities according to external surface, physiology & limbs? - excludes bird
similarities according to skull and gait? - excludes turtle
parsimony
looks at where on the tree characters have to change & count them up
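a rough sketch of counting changes on a tree, using Fitch's small-parsimony pass for unordered characters. the two candidate trees and the three binary characters are invented purely for illustration (they are not real data for these animals), and trees are assumed to be strictly binary nested tuples

```python
def fitch_pass(tree, states):
    """Return (set of possible states at this node, minimum changes below it)."""
    if isinstance(tree, str):                 # a leaf: its observed state
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_pass(left, states)
    right_set, right_cost = fitch_pass(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, characters):
    """Total number of changes needed over all characters."""
    return sum(fitch_pass(tree, states)[1] for states in characters)


# hypothetical character matrix (one dict per character)
characters = [
    {"bird": 1, "crocodile": 1, "lizard": 0, "turtle": 0},
    {"bird": 0, "crocodile": 1, "lizard": 1, "turtle": 0},
    {"bird": 1, "crocodile": 1, "lizard": 0, "turtle": 0},
]
tree_a = (("bird", "crocodile"), ("lizard", "turtle"))
tree_b = (("bird", "turtle"), ("crocodile", "lizard"))

print(tree_length(tree_a, characters))  # 4
print(tree_length(tree_b, characters))  # 5
```

with this made-up matrix tree_a needs fewer changes, so parsimony would prefer it; the second character conflicts with the other two and has to be explained as homoplasy on tree_a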
problems with nucleotide characters
- which molecules to sequence
- need genes evolving at right speed
- signal becomes saturated
- alignment
- need right model of sequence evolution
- third codon positions? consider down-weighting 3rd codon positions
- gene duplication (need orthologous genes)
principally 3 classes
nuclear genes
mitochondrial genes
ribosomal genes - changes in regions that pair up stop the ribosome working, so selection conserves them, but open (unpaired) regions can mutate more freely
if changes keep happening then signal gets lost
eg. can have 21 changes but only 17 differences
simulating sequence evolution
- begin with a DNA sequence of 10,000 basepairs
- pick one basepair at random and substitute it to another basepair
- repeat 10,000 times
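the simulation steps above could be sketched like this, assuming a random starting sequence, uniform choice of site and uniform choice of replacement base (the function name and the fixed seed are just for illustration)

```python
import random


def simulate_substitutions(length=10_000, n_substitutions=10_000, seed=0):
    """Apply random substitutions to a sequence and count what is still visible."""
    rng = random.Random(seed)
    ancestor = [rng.choice("ACGT") for _ in range(length)]
    descendant = ancestor.copy()
    for _ in range(n_substitutions):
        site = rng.randrange(length)
        # substitute to one of the three other bases
        descendant[site] = rng.choice([b for b in "ACGT" if b != descendant[site]])
    differences = sum(a != d for a, d in zip(ancestor, descendant))
    return n_substitutions, differences


changes, differences = simulate_substitutions()
print(changes, differences)
```

the number of observed differences comes out well below the 10,000 substitutions applied, because some sites are hit more than once and later changes overwrite or reverse earlier ones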
sequences may be of different lengths
alignment must penalise the number of ad hoc insertions and deletions
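one common way to penalise gaps is a global alignment score computed by dynamic programming (Needleman-Wunsch style). a minimal sketch, where the match, mismatch and gap values are arbitrary assumptions rather than recommended settings

```python
def alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score; every inserted gap position costs `gap`."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # aligning a prefix against nothing
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,   # align the two bases
                score[i - 1][j] + gap,        # gap in seq_b
                score[i][j - 1] + gap,        # gap in seq_a
            )
    return score[-1][-1]


print(alignment_score("ACGT", "ACGT"))          # 4
print(alignment_score("ACGT", "AGT"))           # 1  (three matches, one gap)
print(alignment_score("ACGT", "AGT", gap=-5))   # -2 (same alignment, costlier gap)
```

a heavier gap penalty makes the method less willing to invent insertions and deletions just to line up a few extra matches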
Jukes-Cantor - all changes occur with equal probability (between A, G, T, C)
difference between purines and pyrimidines makes this untrue
swapping G for A is easier than swapping G for C
Kimura Two Parameter - alpha = transitions, beta = transversions
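the standard distance corrections for these two models can be sketched as below, assuming two already-aligned sequences of equal length containing only A, C, G, T and no gaps (function names are illustrative)

```python
import math

PURINES = {"A", "G"}


def count_differences(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):   # purine<->purine or pyrimidine<->pyrimidine
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def jukes_cantor_distance(seq_a, seq_b):
    """d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    ts, tv = count_differences(seq_a, seq_b)
    p = (ts + tv) / len(seq_a)
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p_distance(seq_a, seq_b):
    """d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q),
    P = proportion of transitions, Q = proportion of transversions."""
    ts, tv = count_differences(seq_a, seq_b)
    P, Q = ts / len(seq_a), tv / len(seq_a)
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)


a, b = "AAAAAAAAAA", "GAAAAAAAAC"   # toy example: one transition, one transversion
print(count_differences(a, b))
print(jukes_cantor_distance(a, b), kimura_2p_distance(a, b))
```

both corrected distances come out larger than the raw proportion of differences (0.2 here), which is the point: they estimate the hidden multiple changes that simple counting misses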
gene trees vs. species trees
distinguishing orthologues from paralogues
hemoglobin: alpha, beta, gamma in mammals
sequences are equally distant irrespective of organism
primordial hemoglobin
duplicates into alpha and beta
speciates into human alpha and cow alpha
speciates into human beta and cow beta
a model of sequence divergence can be used to estimate the duplication dates of the different hemoglobin chains
model explains why the distance between human alpha and cow alpha is shorter than the distance between human alpha and human beta
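a toy illustration of the orthologue/paralogue point, assuming a strict molecular clock and a duplication that predates the human/cow split. the ages and the rate are invented numbers in arbitrary units, chosen only so that the duplication is older than the speciation

```python
DUPLICATION_AGE = 5.0   # hypothetical age of the alpha/beta duplication
SPECIATION_AGE = 1.0    # hypothetical age of the human/cow split
RATE = 0.1              # hypothetical substitutions per site per unit time


def split_age(gene_a, gene_b):
    """Age of the node where two genes last shared a common ancestor."""
    species_a, chain_a = gene_a
    species_b, chain_b = gene_b
    if chain_a != chain_b:        # paralogues: separated by the duplication
        return DUPLICATION_AGE
    if species_a != species_b:    # orthologues: separated by the speciation
        return SPECIATION_AGE
    return 0.0


def expected_distance(gene_a, gene_b):
    """Both lineages accumulate change since the split, hence the factor 2."""
    return 2 * RATE * split_age(gene_a, gene_b)


human_alpha, cow_alpha = ("human", "alpha"), ("cow", "alpha")
human_beta, cow_beta = ("human", "beta"), ("cow", "beta")

print(expected_distance(human_alpha, cow_alpha))   # orthologues: short
print(expected_distance(human_alpha, human_beta))  # paralogues: long
print(expected_distance(human_alpha, cow_beta))    # paralogues: same as above
```

alpha vs beta comparisons give the same distance whichever organisms are involved, because they all trace back to the one duplication event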
arthropod phylogeny
hexapoda
chelicerata
crustacea
myriapoda
potential relationships of major groups
atelocerata & mandibulata
atelocerata & schizoramia
LCA splits into chelicerata and mandibulata
mandibulata splits into crustacea and atelocerata
atelocerata splits into myriapoda and Hexapoda
LCA splits into schizoramia and atelocerata
schizoramia splits into chelicerata and crustacea
atelocerata splits into myriapoda and Hexapoda
pancrustacea & paradoxopoda
LCA splits into paradoxopoda and pancrustacea
paradoxopoda splits into chelicerata and myriapoda
pancrustacea splits into crustacea and hexapoda
pancrustacea & mandibulata
LCA splits into chelicerata and mandibulata
mandibulata splits into myriapoda and pancrustacea
pancrustacea splits into crustacea and hexapoda
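the four hypotheses can be written as nested tuples and queried for which groups each one treats as a clade. this encoding is one reading of the diagrams summarised above, and the helper names are made up for illustration

```python
HYPOTHESES = {
    "atelocerata & mandibulata":
        ("chelicerata", ("crustacea", ("myriapoda", "hexapoda"))),
    "atelocerata & schizoramia":
        (("chelicerata", "crustacea"), ("myriapoda", "hexapoda")),
    "pancrustacea & paradoxopoda":
        (("chelicerata", "myriapoda"), ("crustacea", "hexapoda")),
    "pancrustacea & mandibulata":
        ("chelicerata", ("myriapoda", ("crustacea", "hexapoda"))),
}


def leaf_set(tree):
    """All terminal groups below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """Leaf sets of every internal node, i.e. every clade the tree implies."""
    if isinstance(tree, str):
        return set()
    found = {leaf_set(tree)}
    for child in tree:
        found |= clades(child)
    return found


def supports(tree, group):
    """True if `group` is exactly the leaf set of some node in the tree."""
    return frozenset(group) in clades(tree)


for name, tree in HYPOTHESES.items():
    print(name,
          "| pancrustacea:", supports(tree, {"crustacea", "hexapoda"}),
          "| atelocerata:", supports(tree, {"myriapoda", "hexapoda"}))
```

the same check works for any other named group, eg. mandibulata as crustacea + myriapoda + hexapoda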
what is probably true
- much depends on method of analysis
- myriapods dispersed throughout the tree
- pancrustacea is supported but neither Hexapoda nor crustacea emerges as a clade
pancrustacea, myriapoda, chelicerata - separate clades
on a part of a tree where a lot has gone extinct
spiders
scorpions
horseshoe crabs
long debated whether paraphyletic or monophyletic
mostly agreed clade now
phylogenomics of arthropods
insects emerge from a paraphyletic crustacea
molecular analyses
led to inconsistent and inexplicable results
contain some evidently untenable relationships
possible solutions
- better taxon sampling
- better character sampling - use several markers
- include data on morphology
- include data from fossils
- better/multiple analytical techniques
- taxonomic congruence or total evidence
total evidence analysis
uses all available characters and analyses the raw data (rather than trees)
can include many types of data
maximises resolution of phylogeny
strengthens weak but correct phylogenetic signals
avoids arbitrary choices of consensus methods
avoids dubious data set partitions
weighting molecular vs morphological characters
the largest morphological matrices are only several thousand characters
with sequence data 5000 base pairs is nothing - molecular data is much bigger
it's not always the case that lots and lots of data 'wins the argument'