Improving Variant Calls
Filtering variants after variant calling
Hard filtering - fixed thresholds on quality metrics (e.g. variant quality, read depth)
Sophisticated filtering - model-based filtering, e.g. in GATK
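Hard filtering can be sketched in a few lines. This is only an illustration: the `Variant` record, the field names and the thresholds (`min_qual`, `min_depth`) are hypothetical and not taken from the lecture or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float  # Phred-scaled variant quality reported by the caller
    depth: int   # number of reads covering the site


def hard_filter(variants, min_qual=30.0, min_depth=10):
    """Keep only variants that pass every fixed threshold (values are assumptions)."""
    return [v for v in variants if v.qual >= min_qual and v.depth >= min_depth]


calls = [
    Variant("chr1", 1000, "A", "G", qual=50.0, depth=25),
    Variant("chr1", 2000, "C", "T", qual=12.0, depth=30),  # fails the quality threshold
    Variant("chr2", 500, "G", "GA", qual=45.0, depth=4),   # fails the depth threshold
]
passed = hard_filter(calls)
print([(v.chrom, v.pos) for v in passed])
```

Only the first call survives; the other two are dropped regardless of whether they are real, which is the usual trade-off of fixed cut-offs.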
Use other algorithms and take only the consensus variants
Increase in specificity, but decrease in sensitivity
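The consensus idea and its trade-off can be shown on toy data. In this sketch the variant tuples, the two callers and the truth set are all invented, and precision is used as a simple stand-in for specificity.

```python
def consensus(call_sets):
    """Return the variants reported by every caller (set intersection)."""
    sets = [set(calls) for calls in call_sets]
    return set.intersection(*sets) if sets else set()


def precision_recall(calls, truth):
    """Fraction of calls that are true, and fraction of true variants that were called."""
    true_positives = len(calls & truth)
    return true_positives / len(calls), true_positives / len(truth)


# Hypothetical variants: (chromosome, position, ref, alt)
v1 = ("chr1", 1000, "A", "G")
v2 = ("chr1", 2000, "C", "T")
v3 = ("chr2", 500, "G", "A")
v4 = ("chr2", 900, "T", "C")
v5 = ("chr3", 42, "G", "GA")

truth = {v1, v2, v4}
caller_a = {v1, v2, v3}  # v3 is a false positive
caller_b = {v1, v4, v5}  # v5 is a false positive

agreed = consensus([caller_a, caller_b])
precision_a, recall_a = precision_recall(caller_a, truth)
precision_c, recall_c = precision_recall(agreed, truth)
print(agreed, precision_a, recall_a, precision_c, recall_c)
```

The consensus set drops both false positives, so precision rises, but it also loses the true variants that only one caller found, so recall falls.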
By filtering poor-quality reads before alignment
Example - GATK protocol
Lecture 11, slide 4
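Filtering poor-quality reads before alignment can be sketched as a mean base quality cut-off. The assumptions here are that qualities use the common Phred+33 encoding, that each read is a hypothetical `(name, sequence, quality_string)` tuple with a non-empty quality string, and that a cut-off of 20 is reasonable; none of this is the GATK protocol itself.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a read, assuming Phred+33 encoded qualities."""
    return sum(ord(c) - offset for c in quality_string) / len(quality_string)


def filter_reads(records, min_mean_quality=20.0):
    """Keep reads whose mean base quality reaches the (assumed) threshold."""
    return [r for r in records if mean_phred(r[2]) >= min_mean_quality]


reads = [
    ("read1", "ACGTACGT", "IIIIIIII"),  # 'I' encodes Phred 40
    ("read2", "ACGTACGT", "########"),  # '#' encodes Phred 2
    ("read3", "ACGTACGT", "IIII####"),  # mixed qualities
]
kept = filter_reads(reads)
print([name for name, _, _ in kept])
```

Real pipelines often trim low-quality ends instead of discarding whole reads; the whole-read filter is used here only because it is the simplest form of the idea.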
Improving alignment before variant calling
Local Realignment around SNVs/indels
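Why is this required?
True indels near the ends of reads are usually not captured in the alignment, because close to a read end a few mismatches cost less than opening a gap. This leads to incorrect variant calls: false SNVs are reported and the indel is missed.
Reads are aligned one at a time, so the aligner cannot use other reads covering the same indel as evidence.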
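The scoring argument can be checked with a toy comparison. This is not a real aligner: the scoring values, the reference string and the helper names are assumptions chosen only to show why a deletion one base from the read end loses to a mismatch, while the same deletion in the middle of a read does not.

```python
# Hypothetical affine-gap scoring values
MATCH, MISMATCH, GAP_OPEN, GAP_EXTEND = 1, -4, -6, -1

REF = "GATTACAGGCTTCAGTCA"


def read_with_deletion(pos):
    """Simulate a read carrying a true 1-base deletion of REF[pos]."""
    return REF[:pos] + REF[pos + 1:]


def ungapped_score(read):
    """Score of laying the read against REF from position 0 with no gap."""
    return sum(MATCH if r == g else MISMATCH for r, g in zip(read, REF))


def gapped_score(read, del_len=1):
    """Score when the deletion is modelled: every base matches, one gap is paid for."""
    return len(read) * MATCH + GAP_OPEN + GAP_EXTEND * del_len


end_read = read_with_deletion(16)  # deletion one base before the read end
mid_read = read_with_deletion(8)   # same kind of deletion in the middle

for name, read in [("end", end_read), ("middle", mid_read)]:
    print(name, "ungapped:", ungapped_score(read), "gapped:", gapped_score(read))
```

For the end-of-read case the ungapped alignment scores higher, so a one-read-at-a-time aligner reports a mismatch instead of the true deletion; for the mid-read case the gap is clearly cheaper.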
Can do local assembly - take the reads that map to those regions and reassemble them
Local multiple realignment - after reads are aligned, select sets of reads around indels (e.g. by referring to dbSNP) and do multiple alignment again.
SNPs
Removing duplicate reads
Why do duplicates occur?
Optical duplicates - the camera/scanner reads one sequence cluster multiple times, e.g. on Illumina instruments
PCR amplification bias - some DNA fragments are amplified more than others (especially short fragments)
Remove reads that have the same length and map to the same location
Fewer reads are removed by mistake if the reads are paired-end, because both mates have to map to the same locations
Problem: when read depth is a measure of expression, as in RNA-seq
Problem: when sequencing is targeted with high depth for a small region
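The duplicate-removal rule, and why paired-end data leads to fewer mistaken removals, can be sketched as follows. The read dictionaries and key functions are hypothetical, and keeping the first read seen per key is a simplification made for this example.

```python
def remove_duplicates(reads, key):
    """Keep the first read seen for each key and drop later reads with the same key."""
    seen = set()
    kept = []
    for read in reads:
        k = key(read)
        if k not in seen:
            seen.add(k)
            kept.append(read)
    return kept


def single_end_key(read):
    """Same chromosome, start and length counts as a duplicate."""
    return (read["chrom"], read["start"], read["length"])


def paired_end_key(read):
    """The mate must also map to the same position to count as a duplicate."""
    return (read["chrom"], read["start"], read["length"], read["mate_start"])


reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "length": 50, "mate_start": 300},
    {"name": "r2", "chrom": "chr1", "start": 100, "length": 50, "mate_start": 300},  # duplicate of r1
    {"name": "r3", "chrom": "chr1", "start": 100, "length": 50, "mate_start": 420},  # different fragment
]

single = [r["name"] for r in remove_duplicates(reads, single_end_key)]
paired = [r["name"] for r in remove_duplicates(reads, paired_end_key)]
print(single, paired)
```

With the single-end key, `r3` is discarded although its mate shows it came from a different fragment; the paired-end key keeps it. The same logic explains the two problems listed above: in RNA-seq or deep targeted sequencing, many independent fragments legitimately share coordinates.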
Base quality score recalibration (BQSR)
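The base quality scores reported by the sequencer contain systematic errors. BQSR models these errors from covariates such as read group, reported quality, machine cycle and sequence context, by counting mismatches to the reference at sites that are not known variants, and then adjusts the quality score of each base accordingly.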
For high read depth, the software automatically does this.
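As background, a base quality score is Phred-scaled: Q = -10 log10(p), where p is the probability that the base is wrong. The sketch below assumes that recalibration compares the reported score of a group of bases with the mismatch rate actually observed for that group; the counts and the +1/+2 smoothing are assumptions for illustration, not the exact model of any tool.

```python
import math


def phred(p_error):
    """Convert an error probability into a Phred-scaled quality score."""
    return -10.0 * math.log10(p_error)


def empirical_quality(mismatches, observations):
    """Quality implied by the observed mismatch rate.

    The +1 / +2 smoothing (an assumption) avoids log10(0) when no mismatch is seen.
    """
    return phred((mismatches + 1) / (observations + 2))


# Hypothetical group of bases that the sequencer all reported as Q30
reported_quality = 30
mismatches, observations = 99, 9998

recalibrated = empirical_quality(mismatches, observations)
print(f"reported Q{reported_quality}, empirical Q{recalibrated:.1f}")
```

In this invented example the bases were reported as Q30 (1 error in 1000) but mismatch about once in 100 bases, so their score would be lowered to roughly Q20.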
chrUn - chromosome unknown in database
Holds contigs in the database whose chromosomal location is unknown
Mainly there to increase read mapping accuracy and decrease false positive variant calls
Guest lect. functional genomics, slide 3