#education

Power Tower

source: YouTube, Graph

Figure 1: Power Tower for \(b = 1.52\)

Figure 2: Power Tower for \(b = 2^{1/2}\)
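The tower is evaluated by iterating \(x \mapsto b^x\), which converges exactly when \(e^{-e} \le b \le e^{1/e} \approx 1.4447\); so \(b = 2^{1/2} \approx 1.414\) converges (to 2, a fixed point since \(\sqrt{2}^2 = 2\)) while \(b = 1.52\) diverges. A minimal sketch (the function name is mine):

```python
import math

def power_tower(b, n_iters=100):
    """Iterate x -> b**x starting from x = b.

    The infinite tower converges exactly for e^-e <= b <= e^(1/e)."""
    x = b
    for _ in range(n_iters):
        x = b ** x
    return x

# b = sqrt(2) is inside the convergence range; the limit is 2.
assert abs(power_tower(math.sqrt(2)) - 2) < 1e-9
```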

New Linear Algebra

source: videos, slides

New way to teach linear algebra.

\[ \begin{align*} Ax = \begin{bmatrix} 1 & 4 & 5 \\ 3 & 2 & 5 \\ 2 & 1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} x_1 + \begin{bmatrix} 4 \\ 2 \\ 1 \end{bmatrix} x_2 + \begin{bmatrix} 5 \\ 5 \\ 3 \end{bmatrix} x_3 \end{align*} \]

\(A = CR\)

\[ \begin{align*} A = CR = \begin{bmatrix} 1 & 4 \\ 3 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \end{align*} \]

  • take the independent columns of \(A\): they form a basis for the column space \(C(A)\)
    • clearly, you can always do this
    • it’s just that sometimes it’s not interesting (e.g. if the columns already form a basis)
  • the rows of \(R\) form a basis for the row space; \(R\) also has rank 2
  • number of independent columns = \(r\) = number of independent rows
  • \(\dim C(A) = \dim R(A)\), i.e. column rank equals row rank
  • \(Ax = 0\) always has the trivial solution \(x = 0\), and there are \(n-r\) independent solutions spanning the nullspace
  • \(R\) turns out to be the row reduced echelon form of \(A\) (with the zero rows removed)
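A quick numerical check of the \(A = CR\) example above, using NumPy (not from the source):

```python
import numpy as np

A = np.array([[1, 4, 5], [3, 2, 5], [2, 1, 3]])

# C holds the independent columns of A; R is rref(A) with its zero row removed.
C = np.array([[1, 4], [3, 2], [2, 1]])
R = np.array([[1, 0, 1], [0, 1, 1]])

assert (A == C @ R).all()
assert np.linalg.matrix_rank(A) == 2   # r = 2 independent columns = independent rows
```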

\(Ax = 0\)

\[ \begin{align*} \begin{bmatrix} \text{row } 1 \\ \vdots \\ \text{row } m \end{bmatrix} x = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \end{align*} \]

  • 0 on the right means that the rows must all be orthogonal to \(x\)
  • i.e. every \(x\) in the nullspace \(N(A)\) must be orthogonal to the row space.
  • if you take transposes you get the equivalent statement for column space
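For the example matrix above, the nullspace is easy to exhibit and check against the rows (a NumPy sketch; the particular \(x\) is chosen by me):

```python
import numpy as np

A = np.array([[1, 4, 5], [3, 2, 5], [2, 1, 3]])

# rank 2, so the nullspace has n - r = 3 - 2 = 1 independent solution.
# x = (1, 1, -1) works because column 3 = column 1 + column 2.
x = np.array([1, 1, -1])
assert (A @ x == 0).all()

# every row of A is orthogonal to x: the row space is perpendicular to N(A)
for row in A:
    assert row @ x == 0
```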

Figure 1: Big Picture Linear Algebra


\(A = BC\)

\[ \begin{align*} A = BC = b_1 c_1^* + b_2 c_2^* + \ldots \end{align*} \]

  • here, \(b_i\) are the column vectors of \(B\), while \(c_i^*\) are the row vectors of \(C\).
  • you can also think of matrix multiplication as column-times-row multiplication
    • it’s more like building blocks there: adding up rank-1 matrices
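The column-times-row view can be checked numerically; here I reuse the \(C\) and \(R\) from the \(A = CR\) example (a NumPy sketch, not from the source):

```python
import numpy as np

C = np.array([[1, 4], [3, 2], [2, 1]])
R = np.array([[1, 0, 1], [0, 1, 1]])

# CR as a sum of rank-1 building blocks: column b_i times row c_i*.
A = sum(np.outer(C[:, i], R[i, :]) for i in range(2))
assert (A == C @ R).all()

# each building block really is rank 1
assert all(np.linalg.matrix_rank(np.outer(C[:, i], R[i, :])) == 1 for i in range(2))
```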

\(A = LU\)

  • you can rewrite \(A\) as a product of lower-triangular and upper-triangular matrices.
  • that’s what you’re doing when you’re solving simultaneous equations by elimination
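A minimal elimination sketch that records the multipliers in \(L\) (no row exchanges, so it assumes nonzero pivots; this is an illustration, not the author's code):

```python
import numpy as np

A = np.array([[1.0, 4, 5], [3, 2, 5], [2, 1, 3]])

# eliminate below each pivot; the multiplier used for each step is stored in L
n = len(A)
L, U = np.eye(n), A.copy()
for j in range(n - 1):
    for i in range(j + 1, n):
        L[i, j] = U[i, j] / U[j, j]   # multiplier that zeroes out U[i, j]
        U[i] -= L[i, j] * U[j]

assert np.allclose(L @ U, A)          # A = LU recovered
```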

\(Q^\top Q\)

  • \(Q\) has orthonormal column vectors, so that \(Q^\top Q = I\)
  • not necessarily true that \(QQ^\top = I\), if not square
  • \(QQ^\top\) is like projection:
    • \(QQ^\top QQ^\top = QQ^\top\)
  • if square, then \(QQ^\top = I\) and \(Q^\top = Q^{-1}\)
  • length is preserved by \(Q\)
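These \(Q\) facts are easy to verify numerically (a NumPy sketch with a randomly generated tall \(Q\); the data is mine):

```python
import numpy as np

# a tall Q with orthonormal columns, obtained from QR of a random 5x2 matrix
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))

assert np.allclose(Q.T @ Q, np.eye(2))        # Q^T Q = I always
assert not np.allclose(Q @ Q.T, np.eye(5))    # QQ^T != I since Q isn't square

P = Q @ Q.T
assert np.allclose(P @ P, P)                  # QQ^T QQ^T = QQ^T: a projection

x = rng.standard_normal(2)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))  # length preserved
```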

\(A=QR\)

  • Gram-Schmidt: orthogonalize the columns of \(A\).
  • assuming \(A\) has full column rank (independent columns)
  • \(Q^\top A = Q^\top Q R = R\)
  • Major application: Least Squares
    • want to minimize \(\|b - Ax\|^2 = \|e\|^2\)
    • normal equation for the minimizing solution (best \(\hat{x}\)): \(A^\top e = 0\), i.e. \(A^\top A \hat{x} = A^\top b\)
      • if \(A = QR\), then you get \(R \hat{x} = Q^\top b\)
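A least-squares sketch via \(QR\), checked against NumPy's own solver (the random data is mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))   # tall, full column rank (with probability 1)
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A)
x_hat = np.linalg.solve(R, Q.T @ b)   # solve R x_hat = Q^T b

# agrees with the library least-squares solver
assert np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0])

# the error e = b - A x_hat is orthogonal to C(A): A^T e = 0
assert np.allclose(A.T @ (b - A @ x_hat), 0)
```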

Figure 2: Least Squares


\(S = Q \Lambda Q^\top\)

  • symmetric matrices \(S=S^\top\) have orthogonal eigenvectors
  • how to show \(x^\top y = 0\) for eigenvectors \(x, y\) with distinct eigenvalues \(\lambda \neq \alpha\)
    • \(Sx=\lambda x \implies x^\top S^\top = \lambda x^\top\); since \(S = S^\top\), \(x^\top S y = \lambda x^\top y\)
    • \(Sy=\alpha y \implies x^\top S y = \alpha x^\top y\)
    • hence, \(\lambda x^\top y = \alpha x^\top y\), but \(\lambda \neq \alpha\) by assumption
      • hence, \(x^\top y = 0\)
  • things are not as nice when \(A\) isn’t square
    • but \(A^\top A\) is always square, symmetric, and positive semidefinite
    • same holds for \(AA^\top\)
      • same nonzero eigenvalues
      • \(x \mapsto Ax\) converts eigenvectors: if \(A^\top A x = \lambda x\), then \(AA^\top (Ax) = \lambda (Ax)\)
  • from definition of eigenvalues: \(AX = X \Lambda\)
    • if you don’t have symmetry, then you can’t just hit it with \(X^\top\) and get things to cancel
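A numerical check of \(S = Q \Lambda Q^\top\) (NumPy's `eigh` returns orthonormal eigenvectors for symmetric matrices; the random \(S\) is mine):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
S = M + M.T                          # symmetric by construction

lam, Q = np.linalg.eigh(S)           # eigh: eigenvalues and orthonormal eigenvectors
assert np.allclose(Q.T @ Q, np.eye(4))           # eigenvectors are orthonormal
assert np.allclose(Q @ np.diag(lam) @ Q.T, S)    # S = Q Lambda Q^T
```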

\(A = U \Sigma V^\top\)

  • \(U^\top U = I, V^\top V = I\)
  • \(AV = U \Sigma\) (like the eigen-decomposition, but not quite)
    • \(Av_i = \sigma_i u_i\)
  • \(U, V\) are rotations, \(\Sigma\) stretch.
  • how to choose orthonormal \(v_i\) in the row space of \(A\)
    • \(v_i\) are eigenvectors of \(A^\top A\) (which are orthonormal, from above)
    • \(A^\top A v_i = \lambda_i v_i = \sigma_i^2 v_i\)
    • so then \(A^\top A V = V \Sigma^2\)
  • how to choose \(u_i\)
    • \(u_i = \frac{1}{\sigma_i} A v_i\)
    • check orthonormal: \(u_i^\top u_j = \frac{1}{\sigma_i \sigma_j} v_i^\top A^\top A v_j = \frac{\sigma_j}{\sigma_i} v_i^\top v_j = \delta_{ij}\)
  • combining, \(A^\top A v_i = \sigma_i^2 v_i \implies A^\top u_i = \sigma_i v_i\)
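These SVD relations can be verified on the example matrix from earlier (a NumPy sketch, not from the source):

```python
import numpy as np

A = np.array([[1.0, 4, 5], [3, 2, 5], [2, 1, 3]])
U, s, Vt = np.linalg.svd(A)     # rows of Vt are the v_i, columns of U the u_i

for i in range(len(s)):
    # A v_i = sigma_i u_i  and  A^T u_i = sigma_i v_i
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])
    assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])
    # v_i is an eigenvector of A^T A with eigenvalue sigma_i^2
    assert np.allclose(A.T @ A @ Vt[i], s[i] ** 2 * Vt[i])
```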

Figure 3: Singular Value Decomposition
