Multivariate linear regression before linear algebra:
$$ \begin{align*} y^{(1)} &\approx \theta_0 + \theta_1 x_1^{(1)} + \theta_2 x_2^{(1)} + \ldots + \theta_n x_n^{(1)} \\ y^{(2)} &\approx \theta_0 + \theta_1 x_1^{(2)} + \theta_2 x_2^{(2)} + \ldots + \theta_n x_n^{(2)} \\ &\vdots \\ y^{(m)} &\approx \theta_0 + \theta_1 x_1^{(m)} + \theta_2 x_2^{(m)} + \ldots + \theta_n x_n^{(m)} \end{align*} $$
Multivariate linear regression after linear algebra:
$$ \mathbf{y} \approx X \boldsymbol{\theta} $$
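As a minimal sketch of the matrix form (the numbers below are made up for illustration): stack the $m$ examples as rows of a design matrix $X$ whose first column is all ones, so that $\theta_0$ acts as the intercept, and all $m$ predictions come out of one matrix-vector product.
import numpy as np

# Toy data: m = 3 examples, n = 2 features (illustrative values only)
X_features = np.array([[2.0, 1.0],
                       [1.0, 3.0],
                       [4.0, 0.5]])
m = X_features.shape[0]

# Prepend a column of ones so theta_0 serves as the intercept
X = np.hstack([np.ones((m, 1)), X_features])

theta = np.array([0.5, 2.0, -1.0])  # [theta_0, theta_1, theta_2]

print(X.dot(theta))  # all m predictions at once: y ~ X theta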
user | Moonlight | The Shape of Water | Frozen | Moana |
---|---|---|---|---|
Alice | 5 | 4 | 1 | |
Bob | 5 | 2 | | |
Carol | 5 | | | |
David | 5 | 5 | | |
Eve | 5 | 4 | | |
Predicting the missing ratings is a matrix completion problem, commonly approached with matrix factorization.
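The idea behind matrix factorization, as a sketch: approximate the ratings matrix by a product of two small factor matrices, so that every missing entry gets a prediction. The rank k and the random factors below are illustrative assumptions, not a fitted model.
import numpy as np

rng = np.random.default_rng(0)

# Approximate an m x n ratings matrix R by U V^T, where U is m x k and
# V is n x k for a small rank k. In practice U and V are *learned* from
# the observed ratings; here they are random, for illustration only.
m, n, k = 5, 4, 2            # 5 users, 4 movies, rank-2 factors
U = rng.normal(size=(m, k))  # one k-dimensional taste vector per user
V = rng.normal(size=(n, k))  # one k-dimensional profile per movie

R_hat = np.dot(U, V.T)       # predicted rating for every (user, movie) pair
print(R_hat.shape)           # (5, 4): no entries are missing in U V^T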
A matrix is a rectangular array of numbers:
$$ A = \left[ \begin{array}{cc} 101 & 10 \\ 54 & 13 \\ 10 & 47 \end{array} \right] $$
When $A$ has $m$ rows and $n$ columns, we say that $A \in \mathbb{R}^{m \times n}$.
The entry in row $i$ and column $j$ is denoted $A_{ij}$.
import numpy as np
# Pass a list of lists (rows) to the np.array constructor
A = np.array([[101, 10],
              [54, 13],
              [10, 47]])
print(A)
m, n = A.shape
print("A has %d rows and %d columns" % (m, n))
Note that Python is zero-indexed, while the math notation above is one-indexed.
A = np.array([[101, 10],
              [54, 13],
              [10, 47]])
print(A[0, 0])  # A_11 in math
print(A[2, 1])  # A_32 in math
print(A[1, 1])  # A_22 in math
print(A[1, 2])  # A_23 in math. ERROR: A has only 2 columns!
A vector is an $n \times 1$ matrix. Consider $\mathbf{x} \in \mathbb{R}^4$ defined as
$$ \mathbf{x} = \left[ \begin{array}{c} 8 \\ 2.4 \\ 1 \\ -10 \end{array} \right] $$
x = np.array([8, 2.4, 1, -10]) # in numpy a vector is a 1d array
print(x[0])
print(x[3])
If two matrices have the same size, we can add them by adding corresponding elements
$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 3 & 5 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 7 \\ 2 & 4 \end{bmatrix} $$
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[3, 5],
              [-1, 0]])
print(A + B)
A = np.array([[1, 2],
              [3, 4]])
C = np.array([[1, 2, 3]])
print(A + C)  # ERROR: shapes (2, 2) and (1, 3) cannot be broadcast together
# Beware: broadcasting. This will work, and is a nice feature, but
# is *not* an accepted linear algebra operation
A = np.array([[1, 2],
              [3, 4]])
D = np.array([10, 20])
print(A + D)
# Do this to broadcast a column vector
A = np.array([[1, 2],
              [3, 4]])
D = np.array([[10], [20]])  # a 2x1 matrix, i.e., a "column vector"
print(A + D)
A scalar $x \in \mathbb{R}$ is a real number (i.e., not a vector)
$$\text{e.g., } 2,\, 3,\, \pi,\, \sqrt{2},\, 1.843,\, \ldots$$
Scalar times a matrix:
$$ 2 \cdot \begin{bmatrix} 1 & 3 \\ -2 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ -4 & 0 \end{bmatrix} $$
(multiply each entry by the scalar)
B = 2 * np.array([[1,3], [-2,0]])
print(B)
Let $\mathbf{x}, \mathbf{y}$ be vectors of the same size ($\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$).
Their dot product is $$ \begin{align*} \mathbf{x}^T \mathbf{y} &= \sum_{i=1}^n x_i y_i \\ &= \begin{bmatrix}x_1 & x_2 & \ldots & x_n \end{bmatrix} \begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \end{align*} $$
x = np.array([1,2,3])
y = np.array([2,4,5])
print(np.dot(x, y))
We can multiply two matrices if their inner dimensions match:
$$ A \in \mathbb{R}^{m \times n}, \quad B \in \mathbb{R}^{n \times p} $$
$$ C = AB \in \mathbb{R}^{m \times p} $$
The product has entries
$$ C_{ij} = \sum_{k=1}^n A_{ik} B_{kj} $$
Dot product of $i$th row of $A$ and $j$th column of $B$.
$\newcommand{\r}{\mathbf}$
$$ \begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & \r{c_{32}} & c_{33}\\ c_{41} & c_{42} & c_{43} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \r{a_{31}} & \r{a_{32}} \\ a_{41} & a_{42} \\ \end{bmatrix} \begin{bmatrix} b_{11} & \r{b_{12}} & b_{13} \\ b_{21} & \r{b_{22}} & b_{23} \end{bmatrix} $$
$$ c_{32} = a_{31}b_{12} + a_{32}b_{22} $$
A = np.array([[1, -1], [0, 3]])
B = np.array([[3, 2], [-1, 0]])
print(A * B) # NOT matrix multiplication
# OH NO! A*B gives elementwise multiplication!
# Use np.dot for matrix multiplication
print(np.dot(A, B))
print(A.dot(B))
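NumPy also accepts the @ operator for matrix multiplication (Python 3.5+), and the defining formula $C_{ij} = \sum_k A_{ik} B_{kj}$ can be spelled out with explicit loops. The loop version below is purely illustrative; np.dot or @ is what you would use in practice.
import numpy as np

A = np.array([[1, -1], [0, 3]])
B = np.array([[3, 2], [-1, 0]])

# The @ operator is equivalent to np.dot for 2-d arrays
print(A @ B)

# The definition C_ij = sum_k A_ik B_kj, written out with explicit loops
m, n = A.shape
p = B.shape[1]
C = np.zeros((m, p))
for i in range(m):
    for j in range(p):
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]
print(C)  # same result as np.dot(A, B)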
A (worthy) special case of matrix-matrix multiplication:
$$ A \in \mathbb{R}^{m \times n}, \quad \mathbf{x} \in \mathbb{R}^n $$
$$ \mathbf{y} = A\mathbf{x} \in \mathbb{R}^{m} $$
Definition:
$$ y_i = \sum_{j=1}^n A_{ij} x_j $$
A = np.array([[1, -1], [0, 3]])
x = np.array([1, -1])
z = np.array([8, 1.5])
print(np.dot(A, x))
print(np.dot(A, z))
A = np.array([[1, -1], [0, 3]])
x = np.array([1, -1])
print(np.dot(x, A))  # a 1d array on the left acts as a row vector: x^T A
A = np.array([[1, -1], [0, 3]])
# A vector can also be represented as a 2D array. In this case
# you must be careful about whether it is a row or column vector
x_rowvec = np.array([[1, -1]])    # shape (1, 2): a row vector
x_colvec = np.array([[1], [-1]])  # shape (2, 1): a column vector
print(np.dot(A, x_rowvec))   # ERROR: inner dimensions (2 and 1) do not match
#print(np.dot(A, x_colvec))  # works: (2, 2) times (2, 1) gives (2, 1)
#print(np.dot(x_rowvec, A))  # works: (1, 2) times (2, 2) gives (1, 2)
Transposition of a matrix swaps the rows and columns $$ A = \begin{bmatrix}1 & -1 \\ 0 & 3 \end{bmatrix}, \quad A^T = \begin{bmatrix}1 & 0 \\ -1 & 3 \end{bmatrix}. $$
Definition: $(A^T)_{ij} = A_{ji}$
$\newcommand{\b}{\mathbf}$
$$ A = \begin{bmatrix} 3 & 2 \\ -1 & 0 \\ 1 & 4 \end{bmatrix}\qquad A^T = \begin{bmatrix} 3 & -1 & 1 \\ 2 & 0 & 4 \end{bmatrix} $$
$$ \b{x} = \begin{bmatrix} 1 \\ -3 \\ 2\end{bmatrix} \qquad \b{x}^T = \begin{bmatrix} 1 & -3 & 2\end{bmatrix} $$
A = np.array([[3, 2], [-1, 0], [1, 4]])
print(A.T) # numpy array has a transpose property
The norm of a vector is $$ \begin{align*} \| \b{x} \| &= \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2} \\ &= \sqrt{\b{x}^T \b{x}} \end{align*} $$
Geometric interpretation: length of the vector
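Both forms of the definition are easy to check in NumPy; np.linalg.norm computes the same quantity.
import numpy as np

x = np.array([3.0, 4.0])
print(np.sqrt(np.dot(x, x)))  # from the definition: sqrt(x^T x)
print(np.linalg.norm(x))      # numpy's built-in norm; both print 5.0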
Transpose of transpose
$$ (A^T)^T = A $$
Transpose of sum
$$ (A+B)^T = A^T + B^T $$
Transpose of product
$$ (AB)^T = B^T A^T $$
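A quick numerical check of the product rule, with small matrices chosen here only for illustration:
import numpy as np

A = np.array([[3, 2], [-1, 0], [1, 4]])
B = np.array([[1, 2, 0], [0, 1, 3]])
# (AB)^T and B^T A^T are the same matrix
print(np.dot(A, B).T)
print(np.dot(B.T, A.T))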
The identity matrix $I \in \mathbb{R}^{n \times n}$ has entries
$$ I_{ij} = \begin{cases}1 & i=j \\ 0 & i \neq j\end{cases} $$
$$ I_{1 \times 1} = [1], \qquad I_{2 \times 2} = \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}, \qquad I_{3 \times 3} = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}. $$
For any $A, B$ of appropriate dimensions
$$ \begin{align*} IA &= A \\ BI &= B \end{align*} $$
# Use numpy.eye to create identity matrices of different dimensions
I = np.eye(1)
print(I)
I = np.eye(2)
print(I)
I = np.eye(3)
print(I)
I = np.eye(2)
A = np.array([[1,2], [3,4]])
print(A)
print(np.dot(A, I))
print(np.dot(I, A))
The inverse $A^{-1} \in \mathbb{R}^{n \times n}$ of a square matrix $A \in \mathbb{R}^{n \times n}$ satisfies $$ AA^{-1} = I = A^{-1}A $$
Compare to division of scalars $$ x x^{-1} = 1 = x^{-1} x $$
Not all matrices are invertible (a square matrix with no inverse is called singular).
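For example, a matrix whose rows are linearly dependent has no inverse; the matrix below is an illustrative singular example, and np.linalg.inv raises an error on it.
import numpy as np

S = np.array([[1, 2],
              [2, 4]])  # second row is twice the first: S is singular
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as err:
    print("Not invertible:", err)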
Is $B$ the inverse of $A$? Verify on your own by checking whether $AB = I$ and $BA = I$.
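One way to carry out the check in NumPy; the matrices here are hypothetical stand-ins, since the original pair $A$, $B$ is not reproduced above.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # hypothetical example
B = np.array([[-2.0, 1.0], [1.5, -0.5]]) # candidate inverse
print(np.allclose(np.dot(A, B), np.eye(2)))  # True iff AB = I
print(np.allclose(np.dot(B, A), np.eye(2)))  # True iff BA = I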
A = np.array([[1, -1], [0, 3]])
# Use numpy.linalg.inv to invert a matrix
A_inv = np.linalg.inv(A)
print(A_inv)
print(np.dot(A, A_inv))  # recovers the identity, up to floating-point error
Inverse of inverse
$$ (A^{-1})^{-1} = A $$
Inverse of product
$$(AB)^{-1} = B^{-1}A^{-1} $$
Inverse of transpose
$$ (A^{-1})^T = (A^T)^{-1} := A^{-T} $$
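A quick numerical check of the transpose rule, reusing the invertible matrix from the example above:
import numpy as np

A = np.array([[1, -1], [0, 3]])
# (A^{-1})^T and (A^T)^{-1} agree
print(np.linalg.inv(A).T)
print(np.linalg.inv(A.T))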