Gradient

Given a scalar field \(f : \mathbb{R}^n \rightarrow \mathbb{R}\), its gradient, denoted \(\nabla f\), is the vector of partial derivatives \[ \nabla f = \begin{bmatrix} \partial_1 f\\ \vdots\\ \partial_n f \end{bmatrix} \] where \(\partial_i f = \frac{\partial f}{\partial x_i}\) denotes the partial derivative of \(f\) with respect to the coordinate \(x_i\), formally defined as \[ \frac{\partial f}{\partial x_i} = \lim_{\epsilon \rightarrow 0} \frac{f(x_1, \cdots, x_i + \epsilon, \cdots, x_n) - f(x_1, \cdots, x_n)}{\epsilon} \]
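
The limit above can be approximated numerically by using a small but finite \(\epsilon\). The following is a minimal sketch in plain Python, not a production implementation: the names `partial` and `numerical_gradient`, the step size `eps`, and the example field are all illustrative assumptions, and a forward difference like this is only accurate to first order in `eps`.

```python
def partial(f, x, i, eps=1e-6):
    """Forward-difference approximation of the i-th partial derivative of f at x."""
    shifted = list(x)
    shifted[i] += eps  # perturb only coordinate i; all others are held constant
    return (f(shifted) - f(x)) / eps


def numerical_gradient(f, x, eps=1e-6):
    """Collect the approximate partial derivatives into a list (the gradient at x)."""
    return [partial(f, x, i, eps) for i in range(len(x))]


# Hypothetical example field f(x, y) = x^2 + 3y, whose exact gradient is (2x, 3).
def f(p):
    x, y = p
    return x**2 + 3 * y


g = numerical_gradient(f, [1.0, 2.0])  # close to the exact value [2, 3]
```

Shrinking `eps` reduces the truncation error only up to a point, after which floating-point round-off starts to dominate.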

To emphasize which variables are held constant, the notation \(\big( \frac{\partial f}{\partial x_i} \big)_{x_1,\cdots,x_{i-1},x_{i+1},\cdots,x_n}\) is sometimes used.

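In this form, the gradient can be rewritten as \(\nabla f = \partial_i f \, \mathbf{e}_i\), with summation over the repeated index \(i\) implied and \(\mathbf{e}_i\) denoting the \(i\)-th standard basis vector. The above definition thus implicitly chooses the standard Euclidean basis.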
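
As a small worked illustration (the field \(f(x_1, x_2) = x_1^2 x_2\) is an arbitrary example chosen here), summing over the repeated index \(i\) gives \[ \nabla f = \partial_1 f \, \mathbf{e}_1 + \partial_2 f \, \mathbf{e}_2 = 2 x_1 x_2 \, \mathbf{e}_1 + x_1^2 \, \mathbf{e}_2 = \begin{bmatrix} 2 x_1 x_2 \\ x_1^2 \end{bmatrix} \]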

The symbol \(\nabla\) can also be thought of as an operator. In type notation this would be \(\nabla :: (\mathbb{R}^n \rightarrow \mathbb{R}) \rightarrow (\mathbb{R}^n \rightarrow \mathbb{R}^n)\), with \(\nabla = \mathbf{e}_i \partial_i\).
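
Viewing \(\nabla\) as an operator corresponds naturally to a higher-order function: it takes a scalar field and hands back something that can be evaluated at each point. The sketch below assumes a central-difference approximation; the names `nabla`, `ScalarField`, `VectorField` and the example field `h` are hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Sequence

ScalarField = Callable[[Sequence[float]], float]
VectorField = Callable[[Sequence[float]], List[float]]


def nabla(f: ScalarField, eps: float = 1e-6) -> VectorField:
    """Take a scalar field f and return a function that evaluates its gradient."""

    def grad_f(x: Sequence[float]) -> List[float]:
        components = []
        for i in range(len(x)):
            forward, backward = list(x), list(x)
            forward[i] += eps
            backward[i] -= eps
            # Central difference: approximates the i-th partial derivative at x.
            components.append((f(forward) - f(backward)) / (2 * eps))
        return components

    return grad_f


# Hypothetical example field h(x, y, z) = x*y + z^2, with exact gradient (y, x, 2z).
def h(p: Sequence[float]) -> float:
    x, y, z = p
    return x * y + z**2


grad_h = nabla(h)                    # nabla applied to a field gives a new function
value = grad_h([1.0, 2.0, 3.0])      # close to the exact value [2, 1, 6]
```

Here `nabla(h)` is itself a function, so the same `grad_h` can be evaluated at as many points as needed.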

Note that the gradient returns a vector field: \(\nabla f\) assigns a vector to each point of \(\mathbb{R}^n\). The gradient can be used to rewrite the directional derivative of \(f\) in the direction of the unit vector \(\mathbf{\hat{n}}\) as \(D_{\mathbf{\hat{n}}}(f) = \nabla f \cdot \mathbf{\hat{n}} = |\nabla f||\mathbf{\hat{n}}| \cos(\theta)\), where \(\theta\) is the angle between \(\nabla f\) and \(\mathbf{\hat{n}}\). This dot product measures the rate of change of \(f\) in the direction of \(\mathbf{\hat{n}}\). Since \(|\mathbf{\hat{n}}| = 1\), it is maximized when \(\theta=0\), i.e. when \(\mathbf{\hat{n}}\) points in the same direction as \(\nabla f\).
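
Both claims can be checked numerically on a simple case. The sketch below is an illustration under stated assumptions: the field \(f(x, y) = x^2 + 2y^2\), the point \((1, 1)\), the step `eps`, the angular resolution of the scan, and the helper name `directional_derivative` are all arbitrary choices made for this example.

```python
import math


def f(x, y):
    # Hypothetical example field; its exact gradient is (2x, 4y).
    return x**2 + 2 * y**2


def grad_f(x, y):
    return (2 * x, 4 * y)


def directional_derivative(f, p, n, eps=1e-6):
    """Central-difference rate of change of f at p along the unit vector n."""
    (x, y), (nx, ny) = p, n
    return (f(x + eps * nx, y + eps * ny) - f(x - eps * nx, y - eps * ny)) / (2 * eps)


p = (1.0, 1.0)
g = grad_f(*p)
g_norm = math.hypot(*g)

# Claim 1: the finite-difference rate along n agrees with the dot product grad f . n.
n = (math.cos(0.3), math.sin(0.3))
numeric = directional_derivative(f, p, n)
dot = g[0] * n[0] + g[1] * n[1]

# Claim 2: among unit vectors, the rate is largest along the gradient direction.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=lambda a: g[0] * math.cos(a) + g[1] * math.sin(a))
best_dir = (math.cos(best), math.sin(best))
unit_grad = (g[0] / g_norm, g[1] / g_norm)
```

Up to the resolution of the scan, `best_dir` matches `unit_grad`, and the largest rate found is approximately \(|\nabla f|\).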