Basics
Matrix Factorizations
Data Science
Tensor Factorizations
Floating Point Arithmetic
Vector Norms
Matrix Norms
Matrix Calculus
Least Squares
QR Decomposition A=QR
Regression
Singular Value Decomposition \( A = U\Sigma V^T\)
Principal Component Analysis
CX Decomposition \( A \approx CX \)
CUR Decomposition \( A \approx CUR \)
Nonnegative Matrix Factorization \(X \approx WH,\ W,H\geq0 \)
Other Factorizations
Clustering
Discrete Laplace \(\Delta f = \tfrac{\partial^2 f}{\partial x^2} \)
Spectral Clustering
Regularization
Kernel Trick
Logistic
Classification
SVM
Soft Margin Classifier
Tensor \(X\in\mathbb{R}^{n_1\times \cdots\times n_d}\)
Basic Operations
Tucker Decomposition \(X = G \times_1 U_1 \cdots \times_d U_d\)
Uniqueness
Canonical Polyadic Decomposition \(X = \sum_{i=1}^r a_i^{(1)} \circ \cdots \circ a_i^{(d)}\)
Extensions
Other Factorizations
relative error, absolute error
backward error, overflow
Inner product \(\langle x,y\rangle\)
Euclidean norm \( \lVert x \rVert_2\)
Manhattan norm \( \lVert x \rVert_1\)
Max-norm \( \lVert x \rVert_\infty\)
p-norm \( \lVert x \rVert_p\)
Metric \( \lVert x - y \rVert\)
Angle
Operator Norm \( \lVert A \rVert = \sup_{x\neq0} \frac{\lVert Ax \rVert}{\lVert x \rVert}\)
Frobenius: \( \lVert A \rVert_F = \sqrt{\langle A, A \rangle_F} = \sqrt{\text{tr}(A^TA)} \)
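The norm identities above can be checked numerically; a minimal sketch, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
A = rng.standard_normal((4, 5))

# Vector norms: Euclidean, Manhattan, max-norm
l2 = np.linalg.norm(x, 2)
l1 = np.linalg.norm(x, 1)
linf = np.linalg.norm(x, np.inf)

# Operator 2-norm equals the largest singular value of A
op2 = np.linalg.norm(A, 2)
sigma_max = np.linalg.svd(A, compute_uv=False)[0]
assert np.isclose(op2, sigma_max)

# Frobenius norm equals sqrt(tr(A^T A))
fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))
```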
span
basis
rank
outer product
Jacobian \(\frac{\partial F(x)}{\partial x}\)
h-method
Residuals
Motivation
orthonormal
orthogonal
Givens Rotations
Householder reflectors \( H_v = I - 2 \frac{vv^T}{v^Tv} \)
Gram-Schmidt-orthogonalization
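A Householder reflector \(H_v = I - 2vv^T/v^Tv\) can be chosen to map a vector onto a multiple of \(e_1\), which is the building block of QR; a sketch (the sign choice avoiding cancellation is a standard trick, not part of the outline):

```python
import numpy as np

def householder_reflector(x):
    """Reflector H_v = I - 2 vv^T / (v^T v) mapping x to a multiple of e_1."""
    v = x.astype(float).copy()
    v[0] += np.copysign(np.linalg.norm(x), x[0])  # sign chosen to avoid cancellation
    return np.eye(len(x)) - 2.0 * np.outer(v, v) / (v @ v)

x = np.array([3.0, 4.0])
H = householder_reflector(x)
print(H @ x)   # all mass moved into the first entry: [-5, 0]
```

Applying one reflector per column (on shrinking trailing submatrices) yields the full decomposition A = QR; in practice one would call `np.linalg.qr`.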
Tikhonov/Ridge \( \lambda \lVert w \rVert_2^2\)
LASSO \( \lambda \lVert w \rVert_1 \)
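Tikhonov/ridge has the closed form \(w = (X^TX + \lambda I)^{-1}X^Ty\), while LASSO does not (the \(\lVert w \rVert_1\) term is non-smooth); a sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)

lam = 0.1
# Ridge closed form: w = (X^T X + lam I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# lam -> 0 recovers ordinary least squares; ridge shrinks the solution norm
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_ridge, w_ls)
```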
Singular values \( \sigma \)
Thin SVD
Eigenvalues \(\sigma^2 \)/Eigenvectors
Truncated SVD
Direction of maximal variance
Dimensionality reduction
Moore-Penrose-Pseudoinverse
Uniqueness
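The truncated SVD is the best rank-k approximation (Eckart–Young), and the pseudoinverse falls out of the same factorization; a sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)  # thin SVD

# Truncated SVD: best rank-k approximation (Eckart-Young)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
err = np.linalg.norm(A - A_k)
# Frobenius error equals the norm of the discarded singular values
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))

# Moore-Penrose pseudoinverse via the SVD: A^+ = V Sigma^+ U^T
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))
```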
leverage scores \(\ell_j = \lVert U_k(j, :) \rVert_2^2 \)
Interpolatory operator
highest
weighted random
Uniqueness, orthogonality
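For a CX decomposition, columns can be drawn weighted-randomly by leverage scores; a sketch using the top-k right singular vectors for column sampling (rows of \(U_k\) play the same role when sampling rows):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 8))
k, c = 3, 4

# Column leverage scores from the top-k right singular vectors
_, _, Vt = np.linalg.svd(A, full_matrices=False)
lev = np.sum(Vt[:k, :] ** 2, axis=0)          # l_j = ||V_k(j, :)||^2, sums to k
p = lev / lev.sum()                           # sampling probabilities

cols = rng.choice(A.shape[1], size=c, replace=False, p=p)  # weighted random
C = A[:, cols]
X, *_ = np.linalg.lstsq(C, A, rcond=None)     # X = C^+ A, so A ≈ C X
print(np.linalg.norm(A - C @ X))
```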
nonnegative rank
Alternating optimization
Alternating nonnegative LS
Alternating LS
Multiplicative Updates
Hierarchical ALS
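The multiplicative updates (Lee–Seung) keep both factors elementwise nonnegative while the Frobenius objective is non-increasing; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.random((10, 8))            # nonnegative data matrix
r = 3
W = rng.random((10, r))            # nonnegative initial factors
H = rng.random((r, 8))

eps = 1e-9                          # guard against division by zero
err_init = np.linalg.norm(X - W @ H)
for _ in range(200):
    # Lee-Seung multiplicative updates: factors stay elementwise nonnegative
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)
err_final = np.linalg.norm(X - W @ H)
print(err_init, err_final)          # the Frobenius error is non-increasing
```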
LU decomposition \(A = LU \)
Cholesky decomposition \(A = LL^T \)
Matrix square root \(A = BB \)
Eigendecomposition \( A = VDV^{-1}\)
Jordan decomposition \(A = PJP^{-1} \)
K-Means
Norms
Procedures
Datasets
Laplace matrix \(L = D - W \)
Similarity graph \(G = (V, E) \)
min-cut problem
RatioCut
Normalized Laplacians
Random Walk Laplacian \( D^{-1}L\)
N-cut
Symmetric Graph Laplacian \(D^{-\tfrac{1}{2}} L D^{-\tfrac{1}{2}} \)
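On a similarity graph, the eigenvector of the second-smallest eigenvalue of \(L = D - W\) (the Fiedler vector) approximately solves the relaxed min-cut; a sketch on a toy graph:

```python
import numpy as np

# Similarity graph: two dense triangles joined by one weak edge
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1            # weak edge between the two clusters

D = np.diag(W.sum(axis=1))
L = D - W                           # unnormalized graph Laplacian

vals, vecs = np.linalg.eigh(L)      # ascending eigenvalues; vals[0] ≈ 0 (connected)
fiedler = vecs[:, 1]                # eigenvector of the 2nd-smallest eigenvalue
labels = (fiedler > 0).astype(int)
print(labels)                       # the sign pattern separates the two triangles
```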
Mercer's theorem
Linear kernel
Polynomial kernel
RBF kernel
Sigmoid kernel
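By Mercer's theorem a valid kernel yields a symmetric positive semidefinite Gram matrix; a sketch of the listed kernels:

```python
import numpy as np

def linear_kernel(X, Y):
    return X @ Y.T

def polynomial_kernel(X, Y, degree=2, c=1.0):
    return (X @ Y.T + c) ** degree

def rbf_kernel(X, Y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), via the expanded squared distance
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X)
# Gram matrix is symmetric PSD with unit diagonal (k(x, x) = 1 for RBF)
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)
```

(The sigmoid kernel \(\tanh(\alpha x^Ty + c)\) is not PSD for all parameter choices, which is why it sits apart from the Mercer kernels.)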
\( y \approx Xw\)
log-odds
Likelihood
Hyperplane \(w^Tx+b=0\)
linearly separable
Maximal Margin Classifier
trivial
Quadratic Problem
Duality
KKT conditions
Sequential Minimal Optimization
mode-k fiber \(X(i_1, \ldots, i_{k-1}, :, i_{k+1}, \ldots, i_d) \in \mathbb R^{n_k}\)
slice \(X(i_1, \ldots, :, \ldots, :, \ldots, i_d) \in \mathbb R^{n_j\times n_k}\)
mode-k hyperslice \(X(:, \ldots, :, i_k, :, \ldots, :)\in\mathbb R^{n_1 \times \cdots \times n_{k-1} \times n_{k+1} \times \cdots \times n_d}\)
vectorization
mode-k unfolding \(X_{(k)}\)
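The mode-k unfolding arranges the mode-k fibers as columns of a matrix; a sketch using the colexicographic (Fortran-order) convention, noting that conventions for ordering the remaining modes differ between references:

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding X_(k): mode-k fibers become the columns.
    Remaining modes are ordered colexicographically (Fortran order)."""
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

X = np.arange(24).reshape(2, 3, 4)
for k in range(3):
    Xk = unfold(X, k)
    print(k, Xk.shape)             # shapes: (2, 12), (3, 8), (4, 6)
    # multilinear rank component: r_k = rank(X_(k))
    print(np.linalg.matrix_rank(Xk))
```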
Miranda Scientific Simulation
EEM Fluorescence Spectroscopy
Monkey BMI Neuronal Spike
Chicago Crime Count
Kronecker product ⨂
outer product ◯
Frobenius inner product
colexicographic (stack columns) / lexicographic
mode-k product \(X \times_k A\)
Khatri-Rao product ⨀
Hadamard product \(\ast\)
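The products above can be related directly in NumPy; the Khatri–Rao product (columnwise Kronecker) is written out by hand since NumPy has no built-in for it:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

kron = np.kron(A, B)               # Kronecker product, shape (4, 4)
had = A * B                        # Hadamard (elementwise) product

def khatri_rao(A, B):
    """Khatri-Rao: columnwise Kronecker product; inputs share a column count."""
    assert A.shape[1] == B.shape[1]
    return np.column_stack([np.kron(A[:, j], B[:, j]) for j in range(A.shape[1])])

kr = khatri_rao(A, B)              # shape (4, 2)
print(kron.shape, had.shape, kr.shape)
```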
multilinear rank \( r_k = \text{rank}(X_{(k)})\)
Approximate Tucker decomposition
approximation error & compression ratio
Higher-order SVD \(T_r\)
Sequentially Truncated HoSVD
Alternating optimization
All-at-once optimization
quasi-optimality
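The higher-order SVD takes each \(U_k\) from the left singular vectors of the mode-k unfolding and forms the core by mode-k products with \(U_k^T\); a sketch at full multilinear ranks, where the reconstruction is exact:

```python
import numpy as np

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

def mode_k_product(X, A, k):
    """X x_k A: apply A to every mode-k fiber, then fold back."""
    Xk = A @ unfold(X, k)
    shape = list(X.shape)
    shape[k] = A.shape[0]
    folded = np.reshape(Xk, [shape[k]] + shape[:k] + shape[k+1:], order='F')
    return np.moveaxis(folded, 0, k)

def hosvd(X, ranks):
    # Factor U_k: top-r_k left singular vectors of the mode-k unfolding
    Us = [np.linalg.svd(unfold(X, k), full_matrices=False)[0][:, :r]
          for k, r in enumerate(ranks)]
    G = X
    for k, U in enumerate(Us):
        G = mode_k_product(G, U.T, k)      # core G = X x_1 U_1^T ... x_d U_d^T
    return G, Us

X = np.random.default_rng(5).standard_normal((4, 5, 6))
G, Us = hosvd(X, (4, 5, 6))               # full ranks: no truncation error
Xhat = G
for k, U in enumerate(Us):
    Xhat = mode_k_product(Xhat, U, k)     # X = G x_1 U_1 ... x_d U_d
assert np.allclose(X, Xhat)
```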
CP-rank
factor matrix \(A_k\)
Approximate CP decomposition
Ambiguity
Normalized CP decomposition
Border rank problem
Matrix multiplication tensor
Nonnegative CP
Sparse CP
Orthogonal decomposition
symmetric
Symmetric CP
Generalized CP
Expectation Maximization
PARAFAC2
Tensor Train
Tree Tensor Networks
Netflix problem
Uniqueness
Relation to PCA
curse of dimensionality
\[ X^+ = (X^TX)^{-1}X^T \text{ (full column rank)}\]
\[ \min \tfrac{1}{2}x^TQx+c^Tx\]
feasible & bounded below
weak duality
strong duality
Stationarity
Dual feasibility
Primal feasibilty
Complementary slackness
\[ \prod_{i=1}^d n_i \text{ elements}\]
\[ r\sum_{i=1}^d n_i \text{ elements}\]
\[ \prod_{i=1}^d r_i + \sum_{i=1}^d r_i n_i \text{ elements}\]
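The storage counts can be compared on hypothetical dimensions, e.g. a \(100\times100\times100\) tensor with CP rank 5 and Tucker ranks \((5,5,5)\):

```python
from math import prod

n = (100, 100, 100)    # tensor dimensions n_1, ..., n_d
r = 5                  # CP rank
rk = (5, 5, 5)         # Tucker multilinear ranks

dense = prod(n)                                        # prod n_i elements
cp = r * sum(n)                                        # r * sum n_i elements
tucker = prod(rk) + sum(ri * ni for ri, ni in zip(rk, n))
print(dense, cp, tucker)   # 1000000 1500 1625
```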
\( \text{argmin}_w \lVert y - Xw \rVert_2^2\)
\(\sum_{i=1}^k \sigma_i u_i v_i^T \)
Other Loss