Matrix Calculus
Differentiating Scalars
Let \(f : \mathbb{R}^n \rightarrow \mathbb{R}\).
With respect to vectors
Let \(\mathbf{x} \in \mathbb{R}^n\). Then \(\frac{\partial f}{\partial \mathbf{x}} = \left[\frac{\partial f}{\partial x_1}, \cdots, \frac{\partial f}{\partial x_n}\right]\), a row vector that is the transpose of the gradient, i.e. \(\frac{\partial f}{\partial \mathbf{x}} = (\nabla f)^T\).
Some common examples are given below, where \(f'\) denotes the gradient (written as a column vector) and \(f''\) the Hessian:
- If \(f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}\) then \(f'(\mathbf{x})=\frac{1}{2}(\mathbf{A}+\mathbf{A}^T)\mathbf{x}\) and \(f''(\mathbf{x})=\frac{1}{2}(\mathbf{A}+\mathbf{A}^T)\).
- If \(f(\mathbf{w}) = \frac{1}{2}\|\mathbf{y}-\mathbf{X}\mathbf{w}\|^2\) then \(f'(\mathbf{w})=\mathbf{X}^T(\mathbf{X}\mathbf{w}-\mathbf{y})\) and \(f''(\mathbf{w})=\mathbf{X}^T\mathbf{X}\).
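Both gradient formulas above can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `numerical_gradient`, the random test data, and the tolerance are illustrative choices, not part of the notes.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
n = 4

# Quadratic form f(x) = 1/2 x^T A x, with A deliberately non-symmetric.
A = rng.normal(size=(n, n))
x = rng.normal(size=n)
quad = lambda v: 0.5 * v @ A @ v
quad_grad = 0.5 * (A + A.T) @ x  # closed form from the first bullet

# Least squares f(w) = 1/2 ||y - X w||^2.
X = rng.normal(size=(6, n))
y = rng.normal(size=6)
w = rng.normal(size=n)
lsq = lambda v: 0.5 * np.sum((y - X @ v) ** 2)
lsq_grad = X.T @ (X @ w - y)  # closed form from the second bullet

quad_ok = np.allclose(numerical_gradient(quad, x), quad_grad, atol=1e-6)
lsq_ok = np.allclose(numerical_gradient(lsq, w), lsq_grad, atol=1e-6)
print(quad_ok, lsq_ok)
```

Using a non-symmetric \(\mathbf{A}\) matters here: for symmetric \(\mathbf{A}\) the gradient reduces to \(\mathbf{A}\mathbf{x}\), so the check would not distinguish it from \(\frac{1}{2}(\mathbf{A}+\mathbf{A}^T)\mathbf{x}\).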
With respect to matrices
Let \(\mathbf{W} \in \mathbb{R}^{m \times n}\) and treat \(f\) as a scalar function of the entries of \(\mathbf{W}\). Then \(\frac{\partial f}{\partial \mathbf{W}}\) is a matrix of the same shape as \(\mathbf{W}\): \(\frac{\partial f}{\partial \mathbf{W}} = \begin{bmatrix} \frac{\partial f}{\partial \mathbf{W}_{11}} & \dots & \frac{\partial f}{\partial \mathbf{W}_{1n}}\\ \vdots & \ddots & \vdots\\ \frac{\partial f}{\partial \mathbf{W}_{m1}} & \dots & \frac{\partial f}{\partial \mathbf{W}_{mn}} \end{bmatrix}\).
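As an illustration of the entrywise definition, consider the scalar function \(f(\mathbf{W}) = \mathbf{a}^T\mathbf{W}\mathbf{b}\), whose derivative with respect to \(\mathbf{W}\) is the outer product \(\mathbf{a}\mathbf{b}^T\). This example function is an assumption chosen for the sketch, not one taken from the notes; the helper name and random data are likewise hypothetical.

```python
import numpy as np

def numerical_matrix_gradient(f, W, eps=1e-6):
    """Approximate df/dW entry by entry with central differences."""
    grad = np.zeros_like(W, dtype=float)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            step = np.zeros_like(W, dtype=float)
            step[i, j] = eps
            grad[i, j] = (f(W + step) - f(W - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
m, n = 3, 5
a = rng.normal(size=m)
b = rng.normal(size=n)
W = rng.normal(size=(m, n))

f = lambda M: a @ M @ b       # scalar function of the matrix M
analytic = np.outer(a, b)     # entry (i, j) is a_i * b_j
numeric = numerical_matrix_gradient(f, W)

# The derivative has the same m x n shape as W and matches the closed form.
matrix_grad_ok = numeric.shape == W.shape and np.allclose(numeric, analytic, atol=1e-6)
print(matrix_grad_ok)
```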
Differentiating Vectors
Let \(\mathbf{f} : \mathbb{R}^m \rightarrow \mathbb{R}^n\).
With respect to vectors
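Let \(\mathbf{x} \in \mathbb{R}^m\). Then \(\frac{\partial \mathbf{f}}{\partial \mathbf{x}} \in \mathbb{R}^{n \times m}\) is the Jacobian, whose \((i, j)\) entry is \(\frac{\partial \mathbf{f}_i}{\partial x_j}\).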
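A small numerical sketch of the Jacobian of \(\mathbf{f} : \mathbb{R}^m \rightarrow \mathbb{R}^n\), taken here as the \(n \times m\) matrix of partial derivatives \(\frac{\partial \mathbf{f}_i}{\partial x_j}\). The function \(\mathbf{f}(\mathbf{x}) = \tanh(\mathbf{A}\mathbf{x})\) is an assumed example, with Jacobian \(\mathrm{diag}(1 - \tanh^2(\mathbf{A}\mathbf{x}))\,\mathbf{A}\); the helper name is hypothetical.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian: column j holds d f / d x_j."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.normal(size=(n, m))
x = rng.normal(size=m)

f = lambda v: np.tanh(A @ v)  # maps R^m to R^n

# Chain rule: row i of A is scaled by 1 - tanh^2 of the i-th pre-activation.
analytic = (1.0 - np.tanh(A @ x) ** 2)[:, None] * A
numeric = numerical_jacobian(f, x)

jacobian_ok = numeric.shape == (n, m) and np.allclose(numeric, analytic, atol=1e-6)
print(jacobian_ok)
```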
With respect to matrices
Let \(\mathbf{W} \in \mathbb{R}^{n\times m}\). Then \(\frac{\partial \mathbf{f}}{\partial \mathbf{W}} \in \mathbb{R}^{n \times m \times n}\) is a third-order tensor. In practice, we compute each entry \(\frac{\partial \mathbf{f}_k}{\partial \mathbf{W}_{ij}}\) separately.
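To make the tensor shape concrete, the sketch below assumes the linear map \(\mathbf{f}(\mathbf{W}) = \mathbf{W}\mathbf{x}\) with a fixed \(\mathbf{x} \in \mathbb{R}^m\), for which \(\frac{\partial \mathbf{f}_k}{\partial \mathbf{W}_{ij}} = \delta_{ki}\,x_j\). Both the choice of \(\mathbf{f}\) and the index order \((i, j, k)\) used to store the tensor are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2, 3
W = rng.normal(size=(n, m))
x = rng.normal(size=m)

f = lambda M: M @ x  # vector-valued function of the matrix M, output in R^n

# Analytic entries: T[i, j, k] = d f_k / d W_ij = delta_{ki} * x_j.
analytic = np.einsum("ik,j->ijk", np.eye(n), x)

# Numerical entries: perturb one W_ij at a time and record the change in all f_k.
eps = 1e-6
numeric = np.zeros((n, m, n))
for i in range(n):
    for j in range(m):
        step = np.zeros_like(W)
        step[i, j] = eps
        numeric[i, j, :] = (f(W + step) - f(W - step)) / (2 * eps)

tensor_ok = numeric.shape == (n, m, n) and np.allclose(numeric, analytic, atol=1e-6)
print(tensor_ok)
```

Most entries of this tensor are zero, since \(\mathbf{f}_k\) depends only on row \(k\) of \(\mathbf{W}\), which is one reason to work with the individual entries rather than materialize the full tensor.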