Dimensionality Reduction Methods
PCA (projection-based)
Criteria for a PC
Each PC is the direction that captures the most remaining variance in the data
(Which PC contributes most?)
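Properties of PCs / terminology
Data are centered; the PCs are mutually uncorrelated
If X ~ multivariate normal, the PCs are independent (but not identically distributed, since their variances are the eigenvalues)
PC loadings = eigenvector coefficients scaled by sqrt(eigenvalue); their sum of squares within a component equals that component's eigenvalue. They describe the relationship between the variables and the PCs
PC scores = sum of (eigenvector coefficient * standardized variable). They can also be computed from the SVD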
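The properties above can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up toy data set; the variable names and the mixing matrix are illustrative only. It computes the PCs from the correlation matrix, then obtains the same scores (up to sign) from the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 observations of 3 correlated variables.
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.5, 0.0],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.5]])

# Center and scale, so the covariance of Z is the correlation matrix of X.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.cov(Z, rowvar=False)

# Eigendecomposition, sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                    # PC scores
loadings = eigvecs * np.sqrt(eigvals)   # eigenvectors scaled by sqrt(eigenvalue)
explained = eigvals / eigvals.sum()     # share of variance per PC

# The same scores, up to the sign of each column, from the SVD of Z.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores_svd = U * s

# Scores are uncorrelated, with variances equal to the eigenvalues.
print(np.round(np.cov(scores, rowvar=False), 6))
# Column-wise sum of squared loadings equals the eigenvalues.
print((loadings ** 2).sum(axis=0), eigvals)
```

The `explained` vector answers the "which PC contributes most?" question: it lists the share of total variance captured by each PC.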
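Preprocessing the data
Standardize the data (PCA on the correlation matrix): use when the variables have different units
Do not standardize (PCA on the covariance matrix): use in the opposite case, when the variables share the same units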
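To see why units matter, here is a small sketch assuming NumPy and simulated height/weight data (the numbers and the helper `first_pc` are hypothetical). Changing height from metres to centimetres changes the first PC of the covariance matrix, but not that of the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: height in metres, weight in kilograms.
height_m = rng.normal(1.7, 0.1, size=300)
weight_kg = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, size=300)

def first_pc(X, standardize):
    """First eigenvector of the correlation (standardize=True) or covariance matrix."""
    M = np.corrcoef(X, rowvar=False) if standardize else np.cov(X, rowvar=False)
    _, vecs = np.linalg.eigh(M)      # eigh returns eigenvalues in ascending order
    v = vecs[:, -1]                  # so the last column belongs to the largest one
    return v * np.sign(v[np.argmax(np.abs(v))])  # fix the arbitrary sign

X_m = np.column_stack([height_m, weight_kg])
X_cm = np.column_stack([100 * height_m, weight_kg])   # same data, height in cm

pc1_cov_m = first_pc(X_m, standardize=False)
pc1_cov_cm = first_pc(X_cm, standardize=False)
pc1_corr_m = first_pc(X_m, standardize=True)
pc1_corr_cm = first_pc(X_cm, standardize=True)

print("covariance PCA :", pc1_cov_m, pc1_cov_cm)     # direction depends on units
print("correlation PCA:", pc1_corr_m, pc1_corr_cm)   # direction is unit-free
```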
Assumptions
Importance and covariance
The larger the variance, the more important the dynamics (signal rather than noise)
PCs are linear combinations of the variables
Biplot
Shows the correlation between variables through the direction of their arrows
Can pick out the observations associated with a particular variable (useful for finding outliers)
FA (model-based)
Motivation
Can the observed variables be explained by a linear combination of latent "common factors"?
Assumptions
Cov(F, U) = 0; Cov(U) = Ψ (diagonal)
E(F) = E(U) = 0
F, U ~ multivariate Gaussian
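Under these assumptions the model X = ΛF + U implies Cov(X) = ΛΛ' + Ψ. The sketch below checks this by simulation. It assumes NumPy and additionally assumes Cov(F) = I, which is the usual orthogonal-factor convention but is not stated in the notes above; the loadings `Lam` and unique variances `Psi` are made-up values.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 50_000, 4, 1

# Hypothetical loadings and unique variances, chosen so each Var(X_j) = 1.
Lam = np.array([[0.9], [0.8], [0.7], [0.6]])
Psi = np.diag([0.19, 0.36, 0.51, 0.64])

F = rng.normal(size=(n, k))                            # common factors: E(F)=0, Cov(F)=I (assumed)
U = rng.normal(size=(n, p)) * np.sqrt(np.diag(Psi))    # unique factors: E(U)=0, Cov(U)=Psi
X = F @ Lam.T + U                                      # observed variables

implied = Lam @ Lam.T + Psi             # covariance implied by the model
sample = np.cov(X, rowvar=False)        # covariance observed in the simulation
max_abs_diff = np.abs(sample - implied).max()
print(max_abs_diff)                     # small, and shrinks as n grows
```

The off-diagonal entries of `implied` come only from ΛΛ', which is why FA is said to explain the covariance between variables.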
FA vs PCA
Rule of thumb: run FA if you want to test a theoretical model of latent factors causing the observed variables; run PCA if you simply want to reduce a set of correlated observed variables
PCA explains the variance, while FA explains the covariance between the variables. In addition, the number of factors in FA is limited by the degrees of freedom
How to select # of factors?
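The max # of factors is the largest integer k with d = p(p+1)/2 - p(k+1) + k(k-1)/2 > 0, where p is the number of variables
The p-value of the goodness-of-fit test should be larger than 0.05; otherwise the chosen number of factors is inadequate and the results are not robust/trustworthy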
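The degrees-of-freedom bound can be worked out directly. This sketch assumes the usual parameter count for the orthogonal factor model, d = p(p+1)/2 - p(k+1) + k(k-1)/2, which simplifies to ((p - k)^2 - (p + k)) / 2; the function names are illustrative.

```python
def fa_degrees_of_freedom(p: int, k: int) -> int:
    """Degrees of freedom of a k-factor model for p observed variables.

    d = p(p+1)/2 - p(k+1) + k(k-1)/2 = ((p - k)^2 - (p + k)) / 2.
    The numerator is always even, so integer division is exact.
    """
    return ((p - k) ** 2 - (p + k)) // 2


def max_factors(p: int) -> int:
    """Largest k with d > 0 (returns 0 if even one factor leaves d <= 0)."""
    k = 0
    while fa_degrees_of_freedom(p, k + 1) > 0:
        k += 1
    return k


print(fa_degrees_of_freedom(6, 2))  # 4
print(max_factors(6))               # 2
print(max_factors(9))               # 5
```

For example, with p = 6 variables a 3-factor model has d = 0, so at most 2 factors leave room for a goodness-of-fit test.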
FA rotation & scores
Rotation: varimax (orthogonal) / promax (oblique)
Scores: Bartlett / Thomson (regression) method
MDS
Dissimilarity (distance) matrix
Manhattan
Maximum
Euclidean
Gower
Similarity measures
Cosine similarity
Correlation coefficient (ρ)
Converting similarity to dissimilarity: d = 1 - ρ
(the correlation matrix can then be visualized with MDS, which makes clusters easy to see)
Converting distance to inner product:
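One common way to do this conversion is the double-centering used in classical MDS: B = -1/2 * J * D2 * J, where D2 holds the squared distances and J = I - (1/n) * 11' is the centering matrix. The sketch below assumes NumPy, Euclidean distances, and made-up points; whether this is the exact formula the notes intended is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 2))          # hypothetical points, used only to build distances
Xc = X - X.mean(axis=0)              # centered copy, for checking the result

# Squared Euclidean distance matrix.
diff = X[:, None, :] - X[None, :, :]
D2 = (diff ** 2).sum(axis=-1)

n = D2.shape[0]
J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
B = -0.5 * J @ D2 @ J                # inner-product (Gram) matrix

# B matches the inner products of the centered points.
print(np.allclose(B, Xc @ Xc.T))
```

Only the distances enter the computation of `B`; the original coordinates are used here just to verify the result.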
MDS vs PCA
With Euclidean distance, classical MDS gives the same linear projection as PCA (PCA also uses Euclidean distance)
The ratio in the MDS loss function is analogous to the % of variance unexplained in PCA
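Both points can be illustrated numerically. This is a sketch under stated assumptions: NumPy, classical (eigenvalue-based) MDS on Euclidean distances, made-up data, and the loss ratio taken as the sum of discarded eigenvalues over the sum of all eigenvalues. In this special case the two quantities coincide; for other distances or MDS variants they are only analogous.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical data: 50 observations, 4 variables with different spreads.
X = rng.normal(size=(50, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)
n, q = X.shape[0], 2                  # keep q = 2 dimensions

# PCA on the covariance matrix.
cov_vals, cov_vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
cov_vals, cov_vecs = cov_vals[::-1], cov_vecs[:, ::-1]    # descending order
pca_scores = Xc @ cov_vecs[:, :q]
pca_unexplained = cov_vals[q:].sum() / cov_vals.sum()

# Classical MDS on the Euclidean distance matrix.
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
b_vals, b_vecs = np.linalg.eigh(B)
b_vals, b_vecs = b_vals[::-1], b_vecs[:, ::-1]            # descending order
mds_coords = b_vecs[:, :q] * np.sqrt(b_vals[:q])
b_pos = b_vals.clip(min=0)            # drop tiny negative rounding errors
mds_ratio = b_pos[q:].sum() / b_pos.sum()

# Same configuration up to the sign of each axis, and the same ratio.
print(np.allclose(np.abs(pca_scores), np.abs(mds_coords)))
print(pca_unexplained, mds_ratio)
```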