Neutral Theory and Diversity
Hardy Weinberg Equilibrium
p^2+2pq+q^2=1
expect the allele frequencies not to change between generations
Test observed vs expected genotype counts and use chi-squared for significance
Assumes
infinite population
no mutation
no migration
no selection
random mating
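The observed-vs-expected test can be sketched as below. This is a minimal illustration, assuming a single locus with two alleles (A and a) and genotype counts as input; the function name and the example counts are made up for illustration. With two alleles the statistic is usually compared against a chi-squared distribution with 1 degree of freedom (critical value about 3.84 at the 5% level).

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-squared statistic for Hardy Weinberg proportions at a biallelic locus.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1 - p                            # frequency of allele a
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Hypothetical counts with a deficit of heterozygotes.
p, expected, chi2 = hwe_chi_square(50, 20, 30)
print(p, expected, chi2)
```

A statistic well above 3.84 suggests that at least one of the listed assumptions is violated, but it does not say which one.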
Genetic drift
Violation of the infinite population size assumption
stochastic process
Differences in reproductive success that arise by chance (random sampling)
affects all DNA neutral or not
Wright Fisher Model
magnitude of genetic drift is related to the size of the population
Effective population size (Ne): size of the WF population that loses genetic variance at the same rate as the census population
Assumes
Non overlapping generations
constant population size
random union of gametes
Sexual reproduction
Probability of fixation of a new neutral allele = 1/(2N), its initial frequency
Average time to fixation of a new neutral allele that fixes: t = 4Ne generations
F is the sum of squared allele frequencies: F = Σ p_i^2
Increase in F per generation = 1/(2Ne), as a fraction of the remaining heterozygosity (1 - F)
Heterozygosity=H=1-F
Decay is geometric
H_{g+1} = H_g (1 - 1/(2N))
Expected heterozygosity in the next generation is always less than in the current generation
Equilibrium F = 1/(1 + 4Nu) when mutation balances drift
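The geometric decay and the equilibrium with mutation can be evaluated directly. The sketch below only restates the two formulas in this branch; the function names and the example values are illustrative assumptions.

```python
def heterozygosity_after(h0, n_diploid, generations):
    """Expected heterozygosity after drift alone: H_g = H_0 (1 - 1/(2N))^g."""
    return h0 * (1 - 1 / (2 * n_diploid)) ** generations


def equilibrium_f(n_diploid, u):
    """Equilibrium F when mutation (rate u) balances drift: 1 / (1 + 4Nu)."""
    return 1 / (1 + 4 * n_diploid * u)


# Hypothetical population of N = 50 diploids starting at H = 0.5.
for g in (0, 1, 10, 100):
    print(g, heterozygosity_after(0.5, 50, g))

# With 4Nu = 1, equilibrium F and heterozygosity (1 - F) are both one half.
print(equilibrium_f(1000, 0.00025))
```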
Allele frequencies change at random: an allele that reaches frequency 0 is lost, one that reaches 1 is fixed
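Drift as random sampling can be illustrated with a small simulation. This is a sketch of a Wright Fisher population under the assumptions listed above (non-overlapping generations, constant size, random union of gametes), with no mutation or selection; the function name, seed and parameter values are illustrative choices.

```python
import random


def wright_fisher(p0, n_diploid, max_generations, rng):
    """Follow one allele's frequency until it is lost (0), fixed (1), or time runs out."""
    copies = 2 * n_diploid               # gene copies in a diploid population
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):              # absorbed: lost or fixed
            break
        # Each copy in the next generation is drawn at random from the current one.
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)
traj = wright_fisher(0.5, 20, 10_000, rng)
print(len(traj) - 1, "generations simulated, final frequency", traj[-1])
```

Re-running with a larger `n_diploid` typically gives longer trajectories, which is the sense in which the magnitude of drift depends on population size.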
Neutral theory
It was first thought that most variation was shaped by selection
But variation found to be high in wild populations
Variation must be neutral in character
Genetic drift dominates evolution at the sequence level
Population size has no role in the rate of neutral substitution
Majority of base substitutions must be neutral
Infinite alleles model
Every mutation is a new allele
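Rate at which new neutral alleles fix = 2Nu x 1/(2N) = u (new alleles per generation x fixation probability of each)
Number of new alleles arising per generation = 2Nu
Mutation rate = u
Rate of evolution (substitution rate) p = u under neutrality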
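The cancellation of population size in the neutral substitution rate can be checked numerically. A minimal sketch, assuming N diploid individuals and a neutral mutation rate u per gene copy per generation; the function name is illustrative.

```python
def neutral_substitution_rate(n_diploid, u):
    """Substitutions per generation: (new mutations) x (fixation probability of each)."""
    new_mutations = 2 * n_diploid * u    # new neutral alleles arising per generation
    p_fix = 1 / (2 * n_diploid)          # fixation probability of each new allele
    return new_mutations * p_fix


u = 1e-8
for n in (10, 1_000, 1_000_000):
    print(n, neutral_substitution_rate(n, u))   # stays at u regardless of n
```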
Amount of molecular change we see in an allele genealogy will depend on the amount of time and the mutation rate- molecular clock
Molecular Clock
1962: There is a gradual and correlative relationship between the number of mutations and the divergence time
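One way the clock is commonly used is to convert sequence divergence into time. The sketch below rests on an assumption not stated in this map: two lineages each accumulate substitutions at a constant rate, so the expected divergence per site is d = 2 x rate x t. The function name and the numbers are hypothetical.

```python
def divergence_time(d_per_site, rate_per_site_per_year):
    """Time since two lineages split, assuming a strict clock: t = d / (2 * rate)."""
    return d_per_site / (2 * rate_per_site_per_year)


# Hypothetical values: 2% divergence at 1e-9 substitutions per site per year.
print(divergence_time(0.02, 1e-9))
```

The factor of 2 reflects that both lineages have been evolving since the split; multiple substitutions at the same site are ignored here.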
The coalescent
Predominant methods look at the evolution of haplotypes backwards in time
Estimates of population parameters
Simulation tool
Testing for selection and population movement
support for the Out of Africa model using coalescence
background selection and selective sweep in the dot chromosome in Drosophila
Modelling genetic drift backwards
The coalescent theory
Neutral coalescent process for k sequences in a population of N diploids
Assumes
Coalescence is rare (k is much smaller than N, so at most one pair coalesces per generation)
Lineages coalesce independently
P{kth allele does not coalesce} = 1 - (k - 1)/(2N)
Can be drawn through computational models
Expected TMRCA = 4N (1 - 1/k) generations
Introduced by John Kingman in the 1980s, opening a new area of population genetics
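The expected TMRCA can be recovered by adding up the waiting times between coalescent events. This is a sketch under the standard approximation that, while i lineages remain, the waiting time is exponential with mean 4N / (i (i - 1)) generations; function names, seed and sample values are illustrative.

```python
import random


def expected_tmrca(k, n_diploid):
    """Sum of mean waiting times while i = k, k-1, ..., 2 lineages remain."""
    return sum(4 * n_diploid / (i * (i - 1)) for i in range(2, k + 1))


def simulate_tmrca(k, n_diploid, rng):
    """Draw one TMRCA by sampling each waiting time from its exponential distribution."""
    t = 0.0
    for i in range(k, 1, -1):
        rate = i * (i - 1) / (4 * n_diploid)   # coalescence rate with i lineages
        t += rng.expovariate(rate)
    return t


k, n = 10, 1000
print(expected_tmrca(k, n), 4 * n * (1 - 1 / k))   # the two agree

rng = random.Random(0)
draws = [simulate_tmrca(k, n, rng) for _ in range(5000)]
print(sum(draws) / len(draws))                     # close to the expectation
```

Individual draws vary widely around the mean, because the final wait with two lineages is both the longest and the most variable.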
The neutral coalescent
Lineages coalesce rapidly at first, when many lineages are present
Even small samples have a high chance of containing the MRCA of the whole population
Adding another sequence adds a short terminal branch
A new branch does not change the expected total length of the terminal branches
Adding a new branch increases the expected total tree length by an amount proportional to 1/k
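The 1/k contribution of each added sequence can be seen in the expected total tree length. A minimal sketch, assuming the standard neutral coalescent in which i lineages persist for 4N / (i (i - 1)) generations on average; the function name is illustrative.

```python
def expected_total_length(k, n_diploid):
    """Expected sum of all branch lengths for k sequences, in generations."""
    # While i lineages remain, i branches each grow for 4N / (i (i - 1)) generations.
    return sum(i * 4 * n_diploid / (i * (i - 1)) for i in range(2, k + 1))


n = 1000
for k in (2, 5, 10, 20):
    gain = expected_total_length(k + 1, n) - expected_total_length(k, n)
    print(k, expected_total_length(k, n), gain, 4 * n / k)   # gain equals 4N/k
```

Because the gain shrinks as 1/k, enlarging an already large sample adds little new branch length, and so few new mutations.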
Measures of genetic diversity
Observed and expected heterozygosity
Expected is under the HW equilibrium
He= 1-F
If there is a discrepancy then the population is not in HW equilibrium
Measures of identity
Nei's gene diversity
Measures the probability that two alleles drawn at random from the population will be different from each other
Very similar to the expected heterozygosity
Scaled by sample size
Can be applied to haploid and diploid data
Does not work at high levels of polymorphism
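Gene diversity with the sample-size scaling can be sketched as follows. This assumes the commonly used form H = n/(n - 1) x (1 - Σ p_i^2), where n is the number of sampled gene copies; the function name and example data are illustrative.

```python
from collections import Counter


def nei_gene_diversity(alleles):
    """Probability that two different sampled gene copies carry different alleles."""
    n = len(alleles)                         # needs at least two sampled copies
    freqs = [count / n for count in Counter(alleles).values()]
    f = sum(p * p for p in freqs)            # sum of squared allele frequencies
    return n / (n - 1) * (1 - f)


print(nei_gene_diversity(["A", "A", "B", "B"]))   # 4 of the 6 possible pairs differ
print(nei_gene_diversity(["A", "A", "A", "A"]))   # no variation
```

The input is just a list of allele labels, so the same function applies to haploid data or to the two copies from each diploid individual.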
Population mutation parameter: theta = 4Neu = expected value of diversity under neutrality
Measures of nucleotide diversity to measure theta
Graphical representations
Mismatch distribution
Graphical representation of the pairwise differences
Raggedness statistic
Spiky (ragged, multimodal) = constant population size
Unimodal (smooth) = population expansion
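A mismatch distribution can be built by counting differences in every pair of sequences. The sketch below assumes aligned sequences of equal length. The raggedness function uses the commonly cited form of Harpending's index (the sum of squared differences between neighbouring frequency classes), which should be checked against the original definition before use; names and example data are illustrative.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]
    counts = Counter(diffs)
    return [counts.get(d, 0) / len(diffs) for d in range(max(diffs) + 1)]


def raggedness(dist):
    """Sum of squared steps between adjacent classes (assumed form of the index)."""
    padded = list(dist) + [0.0]              # class beyond the largest observed one
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


seqs = ["AAAA", "AAAT", "AATT", "TTTT"]      # toy alignment
dist = mismatch_distribution(seqs)
print(dist, raggedness(dist))
```

Larger raggedness values correspond to the spiky case; smooth unimodal distributions give small values.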
Coalescence
Relates the time spanned by each branch to the mutation rate
Site Frequency spectrum
Distribution of allele counts across the segregating sites
singletons
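Expectation under neutrality follows from the coalescent
theta = 4Nu; the expected number of sites where the derived allele appears i times is theta/i, so the expected number of singletons is theta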
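The spectrum itself is a simple tally. This sketch assumes haplotypes coded as strings of '0' (ancestral) and '1' (derived), which requires knowing the ancestral state at each site; the neutral expectation used for comparison is the standard coalescent result theta/i for derived-allele count i. Function names and data are illustrative.

```python
def site_frequency_spectrum(haplotypes):
    """Entry i-1 counts the sites where the derived allele appears in i of n sequences."""
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for column in zip(*haplotypes):          # iterate over sites
        derived = column.count("1")
        if 0 < derived < n:                  # keep segregating sites only
            sfs[derived - 1] += 1
    return sfs


def expected_sfs(theta, n):
    """Neutral expectation for a sample of n: theta / i for i = 1 .. n-1."""
    return [theta / i for i in range(1, n)]


haps = ["1000", "0100", "0110", "0001"]      # toy data: 4 sequences, 4 sites
print(site_frequency_spectrum(haps))         # singletons come first
print(expected_sfs(6.0, 4))
```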
Number of segregating sites (Sn)
Number of nucleotide sites that vary within the length of sequence alignments
Can put this into a coalescence frame to understand different tree topologies with the segregating sites
Theta = Sn / (1 + 1/2 + 1/3 + ... + 1/(n-1))
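The estimator based on segregating sites can be computed from an alignment in a few lines. A minimal sketch, assuming aligned sequences of equal length with no gaps or missing data; function names and the toy alignment are illustrative.

```python
def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def theta_from_segregating_sites(seqs):
    """Theta = Sn / (1 + 1/2 + ... + 1/(n-1)) for n sequences."""
    n = len(seqs)
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


seqs = ["AAAA", "AAAT", "AATT", "TTTT"]
print(segregating_sites(seqs), theta_from_segregating_sites(seqs))
```

The result is per alignment; dividing by the sequence length gives a per-site value.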
Pairwise differences between all possible pairs
Analogous to Nei's but sample size independent
Theta = Pi under neutrality
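The average number of pairwise differences (Pi) can be computed as below. A minimal sketch, assuming aligned sequences of equal length; the function name and the toy alignment are illustrative.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean number of differing sites over all possible pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


seqs = ["AAAA", "AAAT", "AATT", "TTTT"]
print(pairwise_diversity(seqs))
```

Under neutrality this value and the estimate based on segregating sites both estimate theta, so a large gap between them on real data is a hint that the neutral model does not fit.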