16.2 Derivatives of vector-valued functions
In a single variable, defining higher-order derivatives is easy. We simply have to keep repeating differentiation:
f′′(x) = (f′)′(x), f′′′(x) = (f′′)′(x),
and so on. However, this is not as straightforward for multivariable functions.
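To make the single-variable case concrete, here is a minimal sketch of repeated differentiation. The helper `derivative` and the test function `cube` are illustrative names, not part of the text, and the derivative is only approximated with a central difference of assumed step size `h`.

```python
def derivative(f, h=1e-3):
    """Return a function that approximates f' with a central difference."""
    def df(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    return df


def cube(x):
    """Example function f(x) = x^3 (an illustrative choice)."""
    return x ** 3


# Higher-order derivatives: apply the same operation again and again.
first = derivative(cube)     # approximates f'(x)   = 3x^2
second = derivative(first)   # approximates f''(x)  = 6x
third = derivative(second)   # approximates f'''(x) = 6

print(first(2.0), second(2.0), third(2.0))
```

Because the derivative of a single-variable function is again a single-variable function, the same `derivative` helper can be applied to its own output. This is exactly the property that fails for vector-scalar functions.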
So far, we have only talked about gradients, the generalization of the derivative for vector-scalar functions.
As ∇f(a) is a column vector, the gradient is itself a vector-vector function ∇f : ℝⁿ → ℝⁿ. So far, we only know how to differentiate vector-scalar functions. It’s time to change that!
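The following sketch illustrates why the gradient is a vector-vector function: it takes a point with n coordinates and returns n partial derivatives. The function `f` and the helper `numerical_gradient` are hypothetical examples, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def f(x):
    """An illustrative vector-scalar function f : R^2 -> R."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        # Central difference for the i-th partial derivative.
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


a = [1.0, 2.0]
g = numerical_gradient(f, a)

# Input and output have the same length: a point in R^2 maps to a vector in R^2.
print(len(a), len(g), g)
```

For this choice of `f`, the exact gradient is (2x + 3y, 3x), so the values returned at `a` should be close to (8, 3).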
16.2.1 The derivatives of curves
Curves, often describing the solutions of dynamical systems, are one of the most important objects in mathematics. We don’t explicitly use them in machine learning, but they underlie algorithms such as gradient descent, where we traverse a discretized curve leading to a local minimum.
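As a rough illustration of gradient descent tracing a discretized curve, the sketch below records every iterate as a point in the plane. The objective f(x, y) = x² + 2y², the starting point, the learning rate, and the names `grad_f` and `gradient_descent_path` are all assumptions made for this example.

```python
def grad_f(x, y):
    """Gradient of the illustrative objective f(x, y) = x^2 + 2*y^2."""
    return 2.0 * x, 4.0 * y


def gradient_descent_path(start, learning_rate=0.1, n_steps=50):
    """Run gradient descent and return all visited points, in order."""
    path = [start]
    x, y = start
    for _ in range(n_steps):
        gx, gy = grad_f(x, y)
        # Step against the gradient; each step adds one point to the curve.
        x, y = x - learning_rate * gx, y - learning_rate * gy
        path.append((x, y))
    return path


path = gradient_descent_path((1.0, 1.0))

# The list of points is a sampled curve that starts at (1, 1)
# and approaches the minimum at the origin.
print(path[0], path[-1])
```

The list `path` plays the role of a curve sampled at discrete time steps: index k corresponds to the k-th step, and the point stored there is the position at that step.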
Formally, a curve – that is, a scalar-vector function – is given by a function...