Lecture 3: Transcriptomics
RNA-Sequencing protocol and implications for alignment
Gene expression quantification
Common goals of RNA-Seq based transcriptomics
Experiment design > RNA preparation > library preparation > sequencing > analysis
Optional: cDNA conversion, fragmentation, amplification
biases due to cDNA library construction, sequencing, read alignment
direct RNA and long-read sequencing reduce these biases
gene-level vs transcript-level expression counting
alignment- vs assembly-based transcript reconstruction
alignment vs pseudoalignment (the pseudoalignment of a read is simply the set of target sequences that the read is compatible with)
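The idea of pseudoalignment can be sketched with a toy k-mer index. This is a simplified illustration, not how any particular tool implements it: the function names, the toy transcripts, and the choice k = 5 are all assumptions made for the example. A read is reported as compatible with every transcript that contains all of its k-mers; no base-level alignment is computed.

```python
from collections import defaultdict

def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index

def pseudoalign(read, index, k):
    """Return the set of transcripts compatible with all k-mers of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        # Intersect the hit sets of successive k-mers.
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()  # some k-mer matches no remaining transcript
    return compatible or set()  # empty if the read is shorter than k

# Hypothetical toy transcripts sharing a common prefix.
toy_transcripts = {"t1": "ACGTACGGTCA", "t2": "ACGTACGGATT"}
toy_index = build_kmer_index(toy_transcripts, k=5)

shared = pseudoalign("CGTACGG", toy_index, k=5)    # lies in the shared prefix
specific = pseudoalign("ACGGTCA", toy_index, k=5)  # covers the t1-specific end
```

Here the first read is compatible with both transcripts and the second only with t1, which is exactly the ambiguity that transcript-level quantification has to resolve.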
Expression normalization
normalize by a) length of transcript and b) total number of reads
-> reads / fragments per kilobase of transcript per million mapped reads (RPKM / FPKM)
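The two normalization steps above can be written out as a small calculation. This is a minimal sketch; the function name and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000        # a) transcript length
    millions = total_mapped_reads / 1_000_000       # b) total number of reads
    return read_count / (kilobases * millions)

# Hypothetical example: 500 reads on a 2,000 bp transcript,
# in a library with 10 million mapped reads.
example_rpkm = rpkm(500, 2_000, 10_000_000)  # 500 / (2 * 10) = 25.0
```

FPKM uses the same formula with fragments (read pairs) counted instead of individual reads.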
Transcript quantification and reconstruction
if transcript structure is known: (pseudo-)alignment-based
no known transcripts available: optimization of transcript weights
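When transcript structures are known, reads that are compatible with several transcripts have to be distributed among them. One common way to do this, assumed here because the notes do not name a method, is an expectation-maximization (EM) loop over the compatibility sets. The sketch below is deliberately simplified: it ignores transcript length and sequencing errors, and all names and inputs are hypothetical.

```python
def em_abundances(compat_sets, transcripts, n_iter=100):
    """Estimate relative transcript abundances from read compatibility sets.

    compat_sets: one set of transcript names per read.
    Simplification: transcript (effective) length is ignored.
    """
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        counts = {t: 0.0 for t in transcripts}
        # E-step: split each read among its compatible transcripts
        # in proportion to the current abundance estimates.
        for compat in compat_sets:
            total = sum(theta[t] for t in compat)
            if total == 0:
                continue
            for t in compat:
                counts[t] += theta[t] / total
        # M-step: renormalize the fractional counts to abundances.
        assigned = sum(counts.values())
        if assigned == 0:
            break
        theta = {t: c / assigned for t, c in counts.items()}
    return theta

# Hypothetical reads: 3 unique to t1, 2 ambiguous, 1 unique to t2.
toy_reads = [{"t1"}] * 3 + [{"t1", "t2"}] * 2 + [{"t2"}]
toy_theta = em_abundances(toy_reads, ["t1", "t2"])
```

In this toy case the ambiguous reads end up split 3:1 in favor of t1, following the ratio of the uniquely assigned reads.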
Differential analysis of gene expression
Linear models, p-value, confounding factors, multiple testing correction, QQ-plot
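Because one test is performed per gene, the raw p-values need a multiple testing correction. The notes do not say which correction is meant; the sketch below assumes the Benjamini-Hochberg procedure, a widely used choice for controlling the false discovery rate. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a differential expression test.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]
```

Genes are then called differentially expressed when the adjusted value falls below the chosen threshold, rather than the raw p-value.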
Transcriptome = complete RNA content of a cell at a given time (varies over time unlike genome)