Matrix Methods

Basics

Matrix Factorizations

Data Science

Tensor Factorizations

Floating Point Arithmetic

Vector Norms

Matrix Norms

Matrix Calculus

Least Squares

QR Decomposition A=QR
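
A minimal NumPy check of the thin factorization (random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

Q, R = np.linalg.qr(A)                   # thin QR: Q is 5x3, R is 3x3
assert np.allclose(Q.T @ Q, np.eye(3))   # Q has orthonormal columns
assert np.allclose(np.triu(R), R)        # R is upper triangular
assert np.allclose(Q @ R, A)             # reconstruction up to rounding
```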

Regression

Singular Value Decomposition \( A = U\Sigma V^T\)
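
Likewise for the SVD via `np.linalg.svd` (thin form; matrix values are illustrative):

```python
import numpy as np

A = np.array([[3., 2., 2.],
              [2., 3., -2.]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T
assert np.allclose(U @ np.diag(s) @ Vt, A)
print(s)                                           # singular values, descending
```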

Principal Component Analysis

CX Decomposition \( CX \approx U_k \Sigma_k V_k^T\)

CUR Decomposition \( CUR \approx U_k\Sigma_k V_k^T\)

Nonnegative Matrix Factorization \(X \approx WH,\ W,H\geq0 \)

Other Factorizations

Clustering

Discrete Laplace \(\Delta f = \sum_i \tfrac{\partial^2 f}{\partial x_i^2} \)

Spectral Clustering

Regularization

Kernel Trick

Logistic Regression

Classification

SVM

Soft Margin Classifier

Tensor \(X\in\mathbb{R}^{n_1\times \cdots\times n_d}\)

Basic Operations

Tucker Decomposition \(X = G \times_1 U_1 \cdots \times_d U_d\)

Uniqueness

Canonical Polyadic Decomposition \(X = \sum_{i=1}^r a_i^{(1)} \circ \cdots \circ a_i^{(d)}\)

Extensions

Other Factorizations

relative error, absolute error

backward error, overflow

Inner product \(\langle x,y\rangle\)

Euclidean norm \( \lVert x \rVert_2\)

Manhattan norm \( \lVert x \rVert_1\)

Max-norm \( \lVert x \rVert_\infty\)

p-norm \( \lVert x \rVert_p\)

Metric \( \lVert x - y \rVert\)

Angle

Operator Norm \( \lVert A \rVert = \sup_{x\neq0} \frac{\lVert Ax \rVert}{\lVert x \rVert}\)

Frobenius: \( \lVert A \rVert_F = \sqrt{\langle A, A \rangle_F} = \sqrt{\text{tr}(A^TA)} \)
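
A quick check that the three characterizations of the Frobenius norm agree:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))

fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))   # sqrt(tr(A^T A))
assert np.isclose(fro, np.sqrt(np.sum(A * A)))       # entrywise definition
```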

span

basis

rank

outer product

Jacobian \(\frac{\partial F(x)}{\partial x}\)

h-method
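
The h-method approximates derivatives by finite differences with step h; a sketch for the Jacobian (the function `F` and step size are illustrative):

```python
import numpy as np

def jacobian_fd(F, x, h=1e-6):
    """Forward-difference approximation of the Jacobian dF/dx at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(F(x))
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e)) - f0) / h   # column j: dF/dx_j
    return J

F = lambda x: np.array([x[0]**2 * x[1], np.sin(x[1])])
print(jacobian_fd(F, np.array([1.0, 2.0])))   # ≈ [[4, 1], [0, cos(2)]]
```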

Residuals

Motivation

orthonormal

orthogonal

Givens Rotations

Householder reflectors \( H_v = I - 2 \frac{vv^T}{v^Tv} \)
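
A sketch of a Householder reflector; choosing \(v = x - \lVert x \rVert_2 e_1\) sends x to a multiple of \(e_1\), the building block of Householder QR:

```python
import numpy as np

def householder(v):
    """H_v = I - 2 vv^T / (v^T v): reflection across the hyperplane orthogonal to v."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    return np.eye(v.size) - 2.0 * (v @ v.T) / (v.T @ v)

x = np.array([3.0, 4.0])
v = x - np.linalg.norm(x) * np.array([1.0, 0.0])
H = householder(v)
print(H @ x)                           # ≈ [5, 0]
assert np.allclose(H @ H, np.eye(2))   # H is symmetric orthogonal: H^2 = I
```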

Gram-Schmidt-orthogonalization

Tikhonov/Ridge \( \lambda \lVert w \rVert_2^2\)

LASSO \( \lambda \lVert w \rVert_1 \)
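
Ridge has the closed form \(w = (X^TX + \lambda I)^{-1}X^Ty\); LASSO has no closed form and is solved iteratively. A ridge sketch (λ is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 5))
y = rng.standard_normal(50)
lam = 0.1

w = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)   # (X^T X + λI) w = X^T y
```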

Singular values \( \sigma \)

Thin SVD

Eigenvalues \(\sigma^2 \)/Eigenvectors

Truncated SVD
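
By Eckart–Young, keeping the top k singular triples gives the best rank-k approximation in the 2- and Frobenius norms; a sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 6))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # sum of the top-k sigma_i u_i v_i^T
print(np.linalg.norm(A - A_k, 2), s[k])       # spectral error equals sigma_{k+1}
```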

Direction of maximal variance

Dimensionality reduction

Moore-Penrose-Pseudoinverse

Uniqueness

leverage scores \(\ell_{k,j} = \lVert U_k (j, :) \rVert_2 \)
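
A sketch of leverage-score sampling from the thin SVD (note: some references define the score as the squared row norm; the squared scores divided by k form a sampling distribution):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((10, 4))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
scores = np.linalg.norm(U[:, :k], axis=1)   # ||U_k(j, :)||_2 for each row j
p = scores**2 / k                           # squared scores sum to k
rows = rng.choice(10, size=3, replace=False, p=p)   # weighted random row selection
```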

Interpolatory operator

highest

weighted random

Uniqueness, orthogonality

nonnegative rank

Alternating optimization

Alternating nonnegative LS

Alternating LS

Multiplicative Updates
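
A minimal sketch of the Lee–Seung multiplicative updates for the Frobenius loss (rank, iteration count, and the `eps` guard are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.random((20, 15))        # nonnegative data
r, eps = 4, 1e-10
W, H = rng.random((20, r)), rng.random((r, 15))

for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)   # update keeps H >= 0
    W *= (X @ H.T) / (W @ H @ H.T + eps)   # update keeps W >= 0

print(np.linalg.norm(X - W @ H, 'fro'))
```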

Hierarchical ALS

LU decomposition \(A = LU \)

Cholesky decomposition \(A = LL^T \)
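
A quick check of the Cholesky case, assuming A is symmetric positive definite:

```python
import numpy as np

rng = np.random.default_rng(7)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)      # symmetric positive definite by construction

L = np.linalg.cholesky(A)        # lower triangular, positive diagonal
assert np.allclose(L @ L.T, A)   # A = L L^T
```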

Matrix square root \(A = BB \)

Eigendecomposition \( A = VDV^{-1}\)

Jordan decomposition \(A = PJP^{-1} \)

K-Means
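
A minimal sketch of Lloyd's algorithm (assumes no cluster goes empty during the iteration):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]              # init centroids from data
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = d.argmin(axis=1)                            # assignment step
        C = np.array([X[labels == j].mean(axis=0) for j in range(k)])  # update step
    return labels, C

X = np.vstack([np.random.default_rng(1).normal(m, 0.3, (30, 2)) for m in (0, 5)])
labels, C = kmeans(X, 2)
```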

Norms

Procedures

Datasets

Laplace matrix \(L = D - W \)

Similarity graph \(G = (V, E) \)

min-cut problem

RatioCut

Normalized Laplacians

Random Walk Laplacian \( D^{-1}L\)

N-cut

Symmetric Graph Laplacian \(D^{-\tfrac{1}{2}} L D^{-\tfrac{1}{2}} \)
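
A sketch of the spectral embedding step: build the Laplacian, take the eigenvectors for the smallest eigenvalues of its symmetric normalization, then cluster the rows (e.g. with the k-means sketch above):

```python
import numpy as np

def spectral_embedding(W, k):
    """Eigenvectors for the k smallest eigenvalues of the symmetric Laplacian.

    Assumes W is a symmetric similarity matrix with no isolated nodes."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                  # unnormalized Laplacian L = D - W
    D_isqrt = np.diag(d ** -0.5)
    L_sym = D_isqrt @ L @ D_isqrt       # D^{-1/2} L D^{-1/2}
    _, vecs = np.linalg.eigh(L_sym)     # eigh returns ascending eigenvalues
    return vecs[:, :k]                  # cluster these rows
```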

Mercer's theorem

Linear kernel

Polynomial kernel

RBF kernel

Sigmoid kernel
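
The four kernels as plain functions (hyperparameters γ, c, and degree d are illustrative):

```python
import numpy as np

linear  = lambda x, y: x @ y
poly    = lambda x, y, c=1.0, d=3: (x @ y + c) ** d
rbf     = lambda x, y, gamma=0.5: np.exp(-gamma * np.sum((x - y) ** 2))
sigmoid = lambda x, y, gamma=0.1, c=0.0: np.tanh(gamma * (x @ y) + c)
```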

\( y \approx Xw\)

log-odds

Likelihood
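
Log-odds are linear in the features, \(\log\tfrac{p}{1-p} = w^Tx + b\), so \(p = \sigma(w^Tx + b)\); a quick check:

```python
import numpy as np

sigma = lambda t: 1.0 / (1.0 + np.exp(-t))   # logistic function

w, b = np.array([2.0, -1.0]), 0.5
x = np.array([1.0, 3.0])
p = sigma(w @ x + b)                                # P(y = 1 | x)
assert np.isclose(np.log(p / (1 - p)), w @ x + b)   # log-odds recover the linear score
```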

Hyperplane \(w^Tx+b=0\)

linearly separable

Maximal Margin Classifier

trivial

Quadratic Problem

Duality

KKT conditions

Sequential Minimal Optimization

mode-k fiber \(X(i_1, \ldots, i_{k-1}, :, i_{k+1}, \ldots, i_d) \in \mathbb R^{n_k}\)

slice \(X(i_1, \ldots, :, \ldots, :, \ldots, i_d) \in \mathbb R^{n_j\times n_k}\) (all modes fixed except \(j\) and \(k\))

mode-k hyperslice \(X(:, \ldots, :, i_k, :, \ldots, :)\in\mathbb R^{n_1 \times \cdots \times n_{k-1} \times n_{k+1} \times \cdots \times n_d}\)

vectorization

mode-k unfolding \(X_{(k)}\)
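
A sketch of mode-k unfolding with NumPy (column ordering is a convention choice, cf. colexicographic/lexicographic below):

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding X_(k): the mode-k fibers become the columns.

    Column ordering here follows NumPy's C (lexicographic) order; the
    colexicographic convention would use reshape(..., order='F')."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.arange(24).reshape(2, 3, 4)
print(unfold(X, 1).shape)   # (3, 8): n_2 rows, n_1 * n_3 columns
```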

Miranda Scientific Simulation

EEM Fluorescence Spectroscopy

Monkey BMI Neuronal Spike

Chicago Crime Count

Kronecker product ⨂

outer product ◯

Frobenius inner product

colexicographic (stack columns) / lexicographic

mode-k product \(X \times_k A\)

Khatri-Rao product ⨀

Hadamard product \(\ast\)
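
The Kronecker, Khatri-Rao, and Hadamard products side by side (shapes are illustrative):

```python
import numpy as np

A = np.arange(6.).reshape(2, 3)
B = np.arange(12.).reshape(4, 3)

K  = np.kron(A, B)            # Kronecker product: shape (2*4, 3*3)
KR = np.vstack([np.kron(A[:, j], B[:, j]) for j in range(3)]).T
                              # Khatri-Rao: column-wise Kronecker, shape (2*4, 3)
H  = A * A                    # Hadamard: entrywise product of equal shapes
```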

multilinear rank \( r_k = \text{rank}(X_{(k)})\)

Approximate Tucker decomposition

approximation error & compression ratio

Higher-order SVD \(T_r\)

Sequentially Truncated HoSVD
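
A truncated-HoSVD sketch: each \(U_k\) comes from the left singular vectors of the mode-k unfolding, and the core follows from mode-k products with \(U_k^T\):

```python
import numpy as np

def modek_product(X, A, k):
    """Mode-k product X x_k A: multiply each mode-k fiber by A."""
    Xk = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)     # mode-k unfolding
    new_shape = [A.shape[0]] + [n for i, n in enumerate(X.shape) if i != k]
    return np.moveaxis((A @ Xk).reshape(new_shape), 0, k)

def hosvd(X, ranks):
    """Truncated HoSVD: U_k from the left singular vectors of each unfolding."""
    Us = [np.linalg.svd(np.moveaxis(X, k, 0).reshape(n, -1),
                        full_matrices=False)[0][:, :r]
          for k, (n, r) in enumerate(zip(X.shape, ranks))]
    G = X
    for k, U in enumerate(Us):
        G = modek_product(G, U.T, k)                      # core G = X x_1 U_1^T ...
    return G, Us

X = np.random.default_rng(8).standard_normal((4, 5, 6))
G, Us = hosvd(X, (4, 5, 6))                               # full ranks: exact
R = G
for k, U in enumerate(Us):
    R = modek_product(R, U, k)                            # X ≈ G x_1 U_1 ... x_d U_d
assert np.allclose(R, X)
```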

Alternating optimization

All-at-once optimization

quasi-optimality

CP-rank

factor matrix \(A_k\)

Approximate CP decomposition
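
A CP-ALS sketch for a 3-way tensor (rank and iteration count are illustrative; unfoldings use NumPy's C order):

```python
import numpy as np

def unfold(X, k):
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product."""
    return np.vstack([np.kron(A[:, j], B[:, j]) for j in range(A.shape[1])]).T

def cp_als(X, r, iters=200, seed=0):
    """Rank-r CP decomposition of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, r)) for n in X.shape)
    for _ in range(iters):
        # each step is a linear LS problem; M^T M collapses to a Hadamard product
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```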

Ambiguity

Normalized CP decomposition

Border rank problem

Matrix multiplication tensor

Nonnegative CP

Sparse CP

Orthogonal decomposition

symmetric

Symmetric CP

Generalized CP

Expectation Maximization

PARAFAC2

Tensor Train

Tree Tensor Networks

Netflix problem

Uniqueness

Relation to PCA

curse of dimensionality

\[ X^+ = (X^TX)^{-1}X^T\]
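
This closed form requires X to have full column rank; the SVD-based `np.linalg.pinv` agrees with it in that case:

```python
import numpy as np

rng = np.random.default_rng(9)
X = rng.standard_normal((6, 3))   # full column rank (almost surely)

X_plus = np.linalg.solve(X.T @ X, X.T)          # (X^T X)^{-1} X^T
assert np.allclose(X_plus, np.linalg.pinv(X))   # SVD-based pseudoinverse agrees
```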

\[ \min \tfrac{1}{2}x^TQx+c^Tx\]

feasible & bounded below

weak duality

strong duality

Stationarity

Dual feasibility

Primal feasibility

Complementary slackness

\[ \prod_{i=1}^d n_i \text{ elements}\]

\[ r\sum_{i=1}^d n_i \text{ elements}\]

\[ \prod_{i=1}^d r_i + \sum_{i=1}^d r_i n_i \text{ elements}\]

\( \operatorname{argmin}_w \lVert y - Xw \rVert_2^2\)

\(\sum_{i=1}^k \sigma_i u_i v_i^T \)

Other Loss