Principal Component Analysis
Dimension Reduction
Shoe size in ft
Shoe size in inches
Merge similar features into one
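A sketch of how that merge happens (hypothetical shoe-size data, using numpy): since inches is just 12 × feet, the two columns are perfectly correlated and a single principal component carries essentially all of the variance.

```python
import numpy as np

# Hypothetical data: the same shoe sizes measured in feet and in inches.
# The two columns are perfectly correlated (inches = 12 * feet).
feet = np.array([0.75, 0.80, 0.85, 0.90, 0.95])
X = np.column_stack([feet, feet * 12.0])

# Center the data, then eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# One eigenvalue carries all the variance: the two features merge into one.
print(eigvals / eigvals.sum())  # -> approximately [0., 1.]
```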
Pros
Improves visualization of data
Eliminates noise/redundancy
Cons
Loss of information
Reconstruction Error
Minimize Distance
Maximize Variance
Variance \(\propto \frac{1}{\text{similarity}}\)
Smaller variance = more redundant data
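These two objectives are two views of the same thing. A minimal numeric check (hypothetical random data; `split` is an illustration name): for any unit direction w, the variance captured along w plus the mean squared distance to the line is constant, so maximizing variance is exactly minimizing distance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
X -= X.mean(axis=0)  # center so the projection line passes through the origin

def split(w):
    w = w / np.linalg.norm(w)        # unit direction
    along = X @ w                    # coordinates along the line (dot products)
    resid = X - np.outer(along, w)   # perpendicular residuals
    return along.var(), (resid ** 2).sum(axis=1).mean()

# Captured variance + mean squared distance is the same for every direction,
# so the direction that maximizes variance also minimizes distance.
for w in [np.array([1.0, 0.0]), np.array([1.0, 1.0]), np.array([0.0, 1.0])]:
    var, dist = split(w)
    print(f"variance={var:.3f}  distance={dist:.3f}  sum={var + dist:.3f}")
```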
Math
Shrinking Column Space
Linear Transformation
Intuition
Principal Components
We remove extraneous information to find the principal information in the data
Dimension Reduction
Projection
Data is projected onto a shared line
The line can point in any direction
Dot product of two vectors
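A small sketch of that projection (hypothetical points; `u` is an assumed direction): the coordinate of each point on the line is its dot product with a unit vector along the line.

```python
import numpy as np

# Unit vector giving the direction of the shared line (assumed direction).
u = np.array([1.0, 1.0])
u = u / np.linalg.norm(u)

points = np.array([[2.0, 1.0],
                   [0.5, 0.5],
                   [-1.0, 2.0]])

# Dot product with u gives each point's 1-D coordinate on the line ...
coords = points @ u
# ... and scaling u by that coordinate gives the projected 2-D point.
projected = np.outer(coords, u)
print(coords)
print(projected)
```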
Question
Why do we assume the mean = 0?
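One standard reason (not spelled out in the map): centering the data to mean 0 makes the covariance matrix a simple scaled Gram matrix, \(\frac{1}{n-1}X^\top X\), and puts the projection line through the origin. A quick check with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, size=(100, 3))  # data with a nonzero mean

# Center the data so each feature has mean 0.
Xc = X - X.mean(axis=0)

# With mean 0, covariance is just a scaled Gram matrix: Xc.T @ Xc / (n - 1).
n = Xc.shape[0]
manual = Xc.T @ Xc / (n - 1)
print(np.allclose(manual, np.cov(X, rowvar=False)))  # True
```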
Covariance Matrix
Best Direction
Transformation/Projection
Loss of variance = Loss of info
We project to smaller dimensions
When we try to project back to the larger dimension, we lose some data
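A sketch of that round trip (hypothetical 3-D data that mostly varies along one direction, keeping k = 1 component): projecting down and then back reconstructs the data only approximately, and the gap is the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical 3-D data that mostly varies along one direction.
t = rng.normal(size=(100, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.1 * rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)

# Top k = 1 eigenvector of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W = eigvecs[:, -1:]                # (3, 1): the largest-eigenvalue direction

Z = Xc @ W                         # project down to 1-D
X_back = Z @ W.T                   # project back up to 3-D
error = ((Xc - X_back) ** 2).sum(axis=1).mean()
print(error)  # small but nonzero: some information is lost
```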
Eigenvalues and Eigenvectors
Projection onto k directions = the first k eigenvectors of the covariance matrix
\(\Sigma\) stretches/transforms our data
Eigenvalues
Larger eigenvalues are more important
They stretch or shrink data more
More influential
\(\Sigma\) decomposes into eigenvalues and eigenvectors
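Putting the pieces together, a minimal PCA sketch under these notes' assumptions (random data for illustration; `pca` is a hypothetical helper, not a library function): decompose the covariance matrix \(\Sigma\), sort by eigenvalue, and keep the first k eigenvectors as the projection directions.

```python
import numpy as np

def pca(X, k):
    """Project X onto the top-k eigenvectors of its covariance matrix."""
    Xc = X - X.mean(axis=0)                  # assume mean = 0 after centering
    cov = np.cov(Xc, rowvar=False)           # covariance matrix Sigma
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # largest (most influential) first
    W = eigvecs[:, order[:k]]                # first k eigenvectors
    return Xc @ W, eigvals[order]

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
Z, eigvals = pca(X, k=2)
print(Z.shape)   # (50, 2)
print(eigvals)   # sorted descending: larger eigenvalues stretch data more
```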