Numerical Analysis I
Singular Value Decomposition (SVD)
\(A = U \Sigma V^T\)
where
\(A \in \mathbb{R}^{m \times n}\)
\(U \in \mathbb{R}^{m \times m}\) orthogonal
\(\Sigma \in \mathbb{R}^{m \times n}\) diagonal
\(V \in \mathbb{R}^{n \times n}\) orthogonal
Singular Values
\(\theta_i \in \mathbb{R}\), the diagonal entries of \(\Sigma\)
\(\begin{bmatrix}
\theta_1 & 0 & 0 & ... & 0\\
0 & \theta_2 & 0 & ... & 0\\
...\\
0 & 0 & 0 & ... & \theta_n
\end{bmatrix}\)
\(\theta_1 \geq \theta_2 \geq ... \geq \theta_n \geq 0\)
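A minimal numpy sketch (matrix size and variable names are my own) checking the factorization and the ordering of the singular values:

    import numpy as np

    A = np.random.randn(5, 3)                    # A is m x n with m = 5, n = 3
    U, s, Vt = np.linalg.svd(A)                  # full SVD: U is 5x5, Vt is 3x3
    Sigma = np.zeros((5, 3))
    Sigma[:3, :3] = np.diag(s)                   # singular values on the diagonal of an m x n matrix
    assert np.allclose(A, U @ Sigma @ Vt)        # A = U Sigma V^T
    assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)   # theta_1 >= ... >= theta_n >= 0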
Eigenvalue Decomposition
Let \(A \in \mathbb{R}^{m \times n}\) and \(m \geq n\)
\(u_i = \) eigenvector of \(AA^T\)
\(\theta_i = \sqrt{\lambda_i}\)
\(v_i = \) eigenvector of \(A^TA\)
\(A^TA = (U \Sigma V^T)^T(U \Sigma V^T)\)
\(=(V \Sigma^T U^T)(U \Sigma V^T)\)
\(=V \Sigma^T \Sigma V^T\)
\(=V \begin{bmatrix}
\theta_1^2 & 0 & 0 & ... \\
0 & \theta_2^2 & 0 & ... \\
... \\
0 & 0 & ... & \theta_n^2
\end{bmatrix} V^T\)
\(AA^T = (U \Sigma V^T)(U \Sigma V^T)^T\)
\(=(U \Sigma V^T)(V \Sigma^T U^T)\)
\(=U \Sigma \Sigma^T U^T\)
\(=U \begin{bmatrix}
\theta_1^2 & 0 & 0 & 0 & 0 & ... \\
0 & \theta_2^2 & 0 & 0 & 0 & ... \\
... \\
0 & 0 & ... & \theta_n^2 & 0 & ...\\
0 & 0 & 0 & 0 & 0 & ... \\
... & ... & ... & ... & ... & 0
\end{bmatrix} U^T\)
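A quick numerical check (example matrix is mine) that the eigenvalues of \(A^TA\) are the squared singular values:

    import numpy as np

    A = np.random.randn(6, 4)
    theta = np.linalg.svd(A, compute_uv=False)     # singular values, descending
    lam = np.linalg.eigvalsh(A.T @ A)[::-1]        # eigenvalues of A^T A, sorted descending
    assert np.allclose(theta, np.sqrt(lam))        # theta_i = sqrt(lambda_i)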
Note
If \(A\) is SPD, then \(A^TA = AA^T = A^2\)
Then \(\theta_i = \lambda_i\)
Effect on Unit Ball (2 norm)
\(A = U \Sigma V^T \Rightarrow AV = U \Sigma\), i.e. \(Av_i = \theta_i u_i\)
\(||A||_2 = \theta_1\)
Note
Let \(A \in \mathbb{R}^{n \times n}\), then
\(A^{-1} = V \Sigma^{-1} U^T\)
\(= V \begin{bmatrix}
\frac{1}{\theta_1} & 0 & 0 & ...\\
0 & \frac{1}{\theta_2} & 0 & ...\\
...\\
0 & 0 & 0 & \frac{1}{\theta_n}
\end{bmatrix} U^T\)
so \(||A^{-1}||_2 = \frac{1}{\theta_n}\)
Condition Number
\(\kappa_2(A) = ||A||_2 * ||A^{-1}||_2 = \frac{\theta_1}{\theta_n}\)
The larger \(\kappa_2(A)\) is, the more the unit ball is stretched
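A small sketch (example is mine) of \(||A||_2\), \(||A^{-1}||_2\) and \(\kappa_2\) in terms of the singular values:

    import numpy as np

    A = np.random.randn(4, 4)
    theta = np.linalg.svd(A, compute_uv=False)
    assert np.isclose(np.linalg.norm(A, 2), theta[0])                        # ||A||_2 = theta_1
    assert np.isclose(np.linalg.norm(np.linalg.inv(A), 2), 1 / theta[-1])    # ||A^{-1}||_2 = 1/theta_n
    assert np.isclose(np.linalg.cond(A, 2), theta[0] / theta[-1])            # kappa_2(A) = theta_1/theta_n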
Thin SVD
\(M \in \mathbb{R}^{m \times n}\)
\(U \in \mathbb{R}^{m \times n}\) with orthonormal columns
\(\Sigma \in \mathbb{R}^{n \times n}\) diagonal
\(V \in \mathbb{R}^{n \times n}\) orthogonal
\(\Sigma_n\) Matrix
Singular Values
The \(\theta_i^2\) are the eigenvalues of the (unnormalized) covariance matrix \(M^TM\)
\(\theta_i^2 / \sum_j \theta_j^2\) gives the percentage of variance captured
Regular scores \(U_r\) (vs. the non-scaled coordinates in \(U\))
Scale-recovered coordinates of the data in the principal space
Rows are called PC-scores
\(U_r = U\Sigma = MV\)
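A sketch (data and names are mine) of the thin SVD and the PC-scores \(U_r = U\Sigma = MV\):

    import numpy as np

    M = np.random.randn(100, 3)                        # 100 observations, 3 variables (for PCA, center the columns first)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)   # thin SVD: U is 100x3, Vt is 3x3
    Ur = U * s                                         # U @ diag(s): one row of PC-scores per observation
    assert np.allclose(Ur, M @ Vt.T)                   # U Sigma = M V
    var_pct = s**2 / np.sum(s**2)                      # fraction of (unnormalized) variance captured per component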
Solving \(Ax = b\)
- Suppose \(A \in \mathbb{R}^{n \times n}\) is non-singular and \(b \in \mathbb{R}^n\); then a unique solution \(x\) exists
Direct Method
"exact solution" (including errors, but correct up to finite # of digits)
LU decomposition
Gaussian Elimination
A multiple of the pivot row is subtracted from each row below it, so that the entries of the pivot column below the pivot become zero
Row Operation
\(A_{i.} \Leftarrow A_{i.} + c_{ij} * A_{j.}\)
\(j\) is the pivot row and \(c_{ij} = -\frac{A_{ij}}{A_{jj}}\)
Each operation can be expressed as \(E_{ij} = \begin{bmatrix}
1 & 0 & 0 & 0 & ... \\
c_{ij} & 1 & 0 & 0 & ... \\
0 & 0 & 1 & 0 & ... \\
0 & 0 & 0 & 1 & ...
\end{bmatrix}
\)
Find L and U
\((E_n ... E_2 E_1) A x = (E_n ... E_2 E_1) b\)
\(Ux = E_n ... E_2 E_1 b\)
\(L = E_1^{-1} E_2^{-1} ... E_n^{-1}\), so \(A = LU\)
Strokes of luck
\(E_{i_oj_o}^{-1} = \begin{bmatrix}
1 & 0 & 0 & 0 & ... \\
-c_{i_oj_o} & 1 & 0 & 0 & ... \\
0 & 0 & 1 & 0 & ... \\
0 & 0 & 0 & 1 & ...
\end{bmatrix}
\)
\(E_{21}^{-1} * E_{31}^{-1} * ... * E_{ij}^{-1} = \begin{bmatrix}
1 & 0 & 0 & 0 & ... \\
-c_{21} & 1 & 0 & 0 & ... \\
-c_{31} & -c_{32} & 1 & 0 & ... \\
-c_{41} & -c_{42} & -c_{43} & 1 & ...
\end{bmatrix}
\)
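A minimal Gaussian-elimination sketch (my own code, assuming no zero pivots arise) that stores the multipliers \(-c_{ij}\) directly in \(L\), as the strokes of luck suggest:

    import numpy as np

    def lu_no_pivot(A):
        """Return L, U with A = L U, assuming every pivot is nonzero."""
        n = A.shape[0]
        U = A.astype(float).copy()
        L = np.eye(n)
        for j in range(n - 1):              # pivot column j
            for i in range(j + 1, n):       # rows below the pivot
                c = -U[i, j] / U[j, j]      # c_ij = -A_ij / A_jj
                U[i, :] += c * U[j, :]      # A_i. <- A_i. + c_ij * A_j.
                L[i, j] = -c                # L collects -c_ij
        return L, U

    A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
    L, U = lu_no_pivot(A)
    assert np.allclose(A, L @ U)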
Computational Efficiency
Each row (except the first) does the following calculation:
\(A_{i.} \Leftarrow A_{i.} + c_{ij} * A_{j.}\)
- one division for \(c\)
- n multiplications \(c * A_{j.}\)
- n additions \(A_{i.} + c * A_{j.}\)
Total: 2n + 1 flops per row
So an \(n \times n\) matrix takes \((n-1)(2n+1)\) flops for the first step
Generalized
Step k requires:
\((n-k)(2(n-k)+3)\)
\(=2(n-k)^2 + 3(n-k)\)
So the total is:
\(\sum^{n-1}_{k=1} 2(n-k)^2 + 3(n-k)\)
\(= \frac{2}{3}n^3 + O(n^2)\)
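A quick sanity check (my own script) that the summed flop count grows like \(\frac{2}{3}n^3\):

    # ratio of the exact count to the leading term approaches 1 as n grows
    for n in (10, 100, 1000):
        exact = sum(2 * (n - k) ** 2 + 3 * (n - k) for k in range(1, n))
        print(n, exact, exact / (2 * n ** 3 / 3))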
LU With Pivoting
Gaussian Elimination with Partial Pivoting (GEPP)
- needed if the pivot is zero
- or if the pivot is small relative to the other entries (for stability)
\(M_3 P_3 M_2 P_2 M_1 P_1 A = U\)
\(P_i\) represents row swapping
\(M_i\) represents the elimination of column \(i\) (row operations)
\(P_3 P_2 P_1 A = \tilde{M_1}^{-1} \tilde{M_2}^{-1} \tilde{M_3}^{-1} U\)
Similar to the 2nd stroke of luck
\(\tilde{M_1}^{-1} \tilde{M_2}^{-1} \tilde{M_3}^{-1} = \begin{bmatrix}
1 & 0 & 0 & 0 & ... \\
-c_{21} & 1 & 0 & 0 & ... \\
-c_{31} & -c_{32} & 1 & 0 & ... \\
-c_{41} & -c_{42} & -c_{43} & 1 & ...
\end{bmatrix}
\)
Where \(\tilde{M_k} = (P_n P_{n-1} ... P_{k+1}) M_k (P_{k+1}^T ... P_n^T)\):
\(P_n P_{n-1} ... P_{k+1}\) swaps the rows
and
\(P_{k+1}^T ... P_n^T\) swaps the columns
of \(M_k\)
3rd stroke of luck
Define:
\(\tilde{M_3} = M_3\)
\(\tilde{M_2} = P_3 M_2 P_3^T \Rightarrow \tilde{M_2} P_3 = P_3 M_2\)
\(\tilde{M_1} = P_3 P_2 M_1 P_2^T P_3^T \Rightarrow \tilde{M_1} P_3 P_2 = P_3 P_2 M_1\)
Gives
\(\tilde{M_3} \tilde{M_2} (P_3 P_3^T) \tilde{M_1} P_3 P_2 P_1 A = U\), using \(P_3 M_2 = \tilde{M_2} P_3\) and \(P_3 P_2 M_1 = \tilde{M_1} P_3 P_2\); the inner \(P_3 P_3^T = I\) cancels
\(\tilde{M_3} \tilde{M_2} \tilde{M_1} P_3 P_2 P_1 A = U\)
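A small GEPP sketch (my own implementation, not the course code) that records the swaps in \(P\) and checks \(PA = LU\):

    import numpy as np

    def gepp(A):
        """Gaussian elimination with partial pivoting: P A = L U."""
        n = A.shape[0]
        U = A.astype(float).copy()
        L, P = np.eye(n), np.eye(n)
        for k in range(n - 1):
            p = k + np.argmax(np.abs(U[k:, k]))   # row with the largest pivot candidate
            U[[k, p], k:] = U[[p, k], k:]         # swap the active parts of the rows
            L[[k, p], :k] = L[[p, k], :k]         # swap the already-stored multipliers
            P[[k, p], :] = P[[p, k], :]
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]
                U[i, k:] -= L[i, k] * U[k, k:]
        return P, L, U

    A = np.array([[0., 2., 1.], [1., 1., 1.], [4., 3., 3.]])   # zero pivot in position (1,1) forces a swap
    P, L, U = gepp(A)
    assert np.allclose(P @ A, L @ U)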
If SPD
Cholesky Decomposition
SPD matrices do not require pivoting to ensure stability.
\(A = GG^T\), where \(G\) is a lower triangular matrix
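numpy exposes this factorization directly; a quick check (example matrix is mine):

    import numpy as np

    A = np.array([[4., 2.], [2., 3.]])     # symmetric positive definite
    G = np.linalg.cholesky(A)              # lower-triangular factor
    assert np.allclose(A, G @ G.T)         # A = G G^T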
QR Decomposition
We want \(Ax\) to be the projection of \(b\) onto \(Col(A)\), so the residual \(Ax - b\) is orthogonal to \(Col(A)\)
\(\Rightarrow Ax - b \perp Col(A)\)
\(\Rightarrow A^T(Ax - b) = 0\)
Gram-Schmidt Process
Builds the orthonormal basis one vector at a time, from lower to higher dimension
\(q_1 = \frac{a_1}{||a_1||}\)
\(e_2 = a_2 - (a_2^Tq_1)q_1\), \(q_2 = \frac{e_2}{||e_2||}\)
...
\(e_n = a_n - proj_{span(q_1, q_2, ..., q_{n-1})} a_n\), \(q_n = \frac{e_n}{||e_n||}\)
This process can be expressed with matrix multiplication (QR) (Week7Notes 2 pg.2)
- It is more computationally expensive
- But it is stable \(\kappa_2(Q) = 1 \Rightarrow \kappa_2(R) = \kappa_2(A)\)
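A minimal classical Gram-Schmidt QR sketch (my own code; the modified variant is preferred numerically):

    import numpy as np

    def gram_schmidt_qr(A):
        """Classical Gram-Schmidt: A = Q R, Q has orthonormal columns."""
        m, n = A.shape
        Q, R = np.zeros((m, n)), np.zeros((n, n))
        for k in range(n):
            R[:k, k] = Q[:, :k].T @ A[:, k]     # components along q_1, ..., q_{k-1}
            e = A[:, k] - Q[:, :k] @ R[:k, k]   # subtract the projection onto span(q_1, ..., q_{k-1})
            R[k, k] = np.linalg.norm(e)
            Q[:, k] = e / R[k, k]
        return Q, R

    A = np.random.randn(5, 3)
    Q, R = gram_schmidt_qr(A)
    assert np.allclose(A, Q @ R) and np.allclose(Q.T @ Q, np.eye(3))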
Householder Reflectors
Opposite of Gram-Schmidt. From higher dimension to lower dimension
Project \(a\) onto the \(n - 1\) dimensional hyperplane orthogonal to unit vector \(u \in \mathbb{R}^n\)
\(a - (u^Ta)u = a - u(u^Ta) = a - (uu^T)a\)
Then go twice as far, to reflect \(a\) to the opposite side of the \(n - 1\) dimensional hyperplane orthogonal to the unit vector \(u\)
\(a - 2(uu^T)a = (I - 2uu^T)a\)
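A sketch (my own example) of one Householder reflector chosen to send a vector \(a\) to a multiple of \(e_1\), which is how QR uses reflectors to zero out a column:

    import numpy as np

    a = np.array([3., 4., 0.])
    v = a.copy()
    v[0] += np.sign(a[0]) * np.linalg.norm(a)   # v = a + sign(a_1) ||a|| e_1 (sign choice avoids cancellation)
    u = v / np.linalg.norm(v)                   # unit normal of the reflecting hyperplane
    H = np.eye(3) - 2 * np.outer(u, u)          # reflector I - 2 u u^T
    print(H @ a)                                # [-5, 0, 0]: a is reflected onto the e_1 axis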
Iterative Method
A sequence of values \(x^{(0)}, x^{(1)}, ... , x^{(k)}\) that converges to the true solution \(x\) as \(k \rightarrow \infty\)
Setup
Let \(A = M - N\)
Then
\((M-N)x = b\)
\(Mx - Nx = b\)
\(Mx = Nx + b\)
\(x = M^{-1}Nx + M^{-1}b\)
\(\Rightarrow\)
\(x = Tx + c\), where \(T = M^{-1}N\) and \(c = M^{-1}b\)
Jacobi
\(M = D\), where \(D\) is the main diagonal of \(A\)
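A minimal Jacobi sketch (my own example) using \(x^{(k+1)} = D^{-1}\big(b + (D - A)x^{(k)}\big)\):

    import numpy as np

    A = np.array([[4., 1., 0.], [1., 5., 1.], [0., 1., 3.]])   # diagonally dominant, so Jacobi converges
    b = np.array([1., 2., 3.])
    d = np.diag(A)                                             # M = D, the diagonal of A
    x = np.zeros(3)
    for _ in range(50):
        x = (b - (A @ x - d * x)) / d                          # x <- D^{-1}(b - (A - D)x)
    assert np.allclose(A @ x, b)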
Jacobi and Gauss-Seidel can converge slowly
Successive over-relaxation (SOR) is used to speed this up
Week8Note 2
Why avoid \(A^{-1}\)?
- applying \(A^{-1}\) as a matrix-vector product still takes \(2n^2\) flops
- computing \(A^{-1}\) takes about 4x as much work as computing \(LU\)
- it is numerically unstable to compute
Conditioning of A
If A is nearly singular, it leads to poor conditioning
Upper bound for error amplification
\(Ax = b\) (true solution)
\(A\tilde{x} = \tilde{b}\) (perturbed solution)
Subtracting:
\(A(x - \tilde{x}) = b - \tilde{b}\)
\(x - \tilde{x} = A^{-1}(b - \tilde{b})\)
\(||x - \tilde{x}|| = ||A^{-1}(b - \tilde{b})|| \leq ||A^{-1}||* ||b - \tilde{b}||\)
Condition Number (kappa)
\(\frac{||x - \tilde{x}||}{||A|| * ||x||} \leq \frac{||x - \tilde{x}||}{||b||} = \frac{||x - \tilde{x}||}{||Ax||} \leq ||A^{-1}|| \frac{||b - \tilde{b}||}{||b||}\)
\(\frac{||x - \tilde{x}||}{||A|| * ||x||} \leq ||A^{-1}|| \frac{||b - \tilde{b}||}{||b||}\)
\(\frac{||x - \tilde{x}||}{||x||} \leq ||A||*||A^{-1}|| \frac{||b - \tilde{b}||}{||b||}\)
So \(cond(A)\) or \(\kappa(A) = ||A||*||A^{-1}||\)
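A small demo (matrix and perturbation are mine) of the bound: the relative error in \(x\) stays below \(\kappa(A)\) times the relative error in \(b\):

    import numpy as np

    A = np.array([[1., 1.], [1., 1.0001]])    # nearly singular, so kappa is large (~4e4)
    b = np.array([2., 2.0001])
    db = np.array([0., 1e-6])                 # small perturbation of b
    x = np.linalg.solve(A, b)
    x_pert = np.linalg.solve(A, b + db)
    rel_x = np.linalg.norm(x - x_pert) / np.linalg.norm(x)
    bound = np.linalg.cond(A) * np.linalg.norm(db) / np.linalg.norm(b)
    print(rel_x, bound)                       # the error in x is hugely amplified, but still below the bound
    assert rel_x <= bound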
Round-off Error
\(fl(x)\)
Machine Epsilon \(\eta \)
Half the distance from 1 to the next number.
- It bounds the relative error of rounding; it is not the smallest positive floating-point number
- \(\tilde{x} = x(1 + z),\ |z| < \eta \), where \(\tilde{x} = fl(x)\) is the floating-point representation of \(x\)
- Re-written as \(|\frac{\tilde{x} - x}{x}| = |z| < \eta\), this gives a bound on the relative error of representing \(x\)
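A quick look at the double-precision values (note numpy's eps is the full gap from 1 to the next float, so the \(\eta\) above is eps/2):

    import numpy as np

    gap = np.finfo(float).eps        # distance from 1.0 to the next double, 2**-52
    eta = gap / 2                    # the eta defined above: half that distance, 2**-53
    print(gap, eta)
    print(1.0 + gap > 1.0)           # True: a full gap is representable
    print(1.0 + eta / 2 == 1.0)      # True: perturbations well below eta round away
    print(np.finfo(float).tiny)      # smallest normal positive double -- much smaller than eta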
Cancellation Error
Multiplication
\(\tilde{x_1}\tilde{x_2}\)
\(= fl(fl(x_1)fl(x_2))\)
\(=[x_1(1+z_1)x_2(1+z_2)](1+z_3), |z_i| < \eta\)
\(= x_1x_2(1+z_1+z_2+z_3+z_1z_2+z_2z_3+z_1z_3+z_1z_2z_3)\)
Notice that \(z_1z_2+z_2z_3+z_1z_3+z_1z_2z_3\) is very small, so these terms can be ignored
\(\frac{\tilde{x_1}\tilde{x_2} - x_1x_2}{x_1x_2} \approx z_1+z_2+z_3\)
Addition
\(\tilde{x_1} + \tilde{x_2}\)
\(=fl(fl(x_1) + fl(x_2))\)
\(=[x_1(1+z_1) + x_2(1+z_2)](1+z_3)\)
\(=x_1+x_2+x_1z_1+x_2z_2+x_1z_3+x_2z_3+...\)
\(\frac{\tilde{x_1} + \tilde{x_2} - (x_1 + x_2)}{x_1 + x_2} \approx z_3 + \frac{x_1z_1 + x_2z_2}{x_1+x_2}\)
Notice that the error gets large as \(x_1 + x_2 \approx 0\),
i.e. when subtracting two very similar numbers
ex) \(\sqrt{x+1} - \sqrt{x}\): as \(x\) gets larger, the two terms get close to each other, causing large numerical error
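A short demo (my own) of this cancellation and of an algebraically equivalent form that avoids it:

    import math

    x = 1e16
    naive = math.sqrt(x + 1) - math.sqrt(x)          # subtracts two nearly equal numbers
    stable = 1 / (math.sqrt(x + 1) + math.sqrt(x))   # same quantity, no cancellation
    print(naive, stable)                             # naive gives 0.0 here; the true value is about 5e-9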
Stability/Conditioning
Conditioning
Property of the problem.
- Well conditioned: Small change in input -> small change in output
Stability
Property of an algorithm.
- Stable: small errors introduced during the computation lead to small errors in the output
ex) \(\sqrt{x+1} - \sqrt{x}\) for \(x > N\), for some big \(N\): the error keeps growing quickly due to cancellation error (the algorithm as written is unstable)
Diagonalizable Matrix
\(A = SDS^{-1}\)
where
\(D = \begin{bmatrix}
\lambda_1 & 0 & 0 & 0 & ... \\
0 & \lambda_2 & 0 & 0 & ...\\
...
\end{bmatrix}\), \(S = \begin{bmatrix}
x_1 & x_2 & ...
\end{bmatrix}\)
And \(Ax_1 = \lambda_1 x_1\), \(Ax_2 = \lambda_2 x_2\), ...
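A quick numpy check (example matrix is mine) of \(A = SDS^{-1}\):

    import numpy as np

    A = np.array([[2., 1.], [1., 2.]])
    lam, S = np.linalg.eig(A)                           # eigenvalues and eigenvectors (columns of S)
    D = np.diag(lam)
    assert np.allclose(A, S @ D @ np.linalg.inv(S))     # A = S D S^{-1}
    assert np.allclose(A @ S[:, 0], lam[0] * S[:, 0])   # A x_1 = lambda_1 x_1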