Derivatives, Differentials, and the Chain Rule
This note covers derivatives and the chain rule at the level needed for multivariable calculus (MIT 18.02). It treats the differential $df$ as a linear approximation: a machine that tells you how much $f$ changes when you make a small displacement. The deeper interpretation of $df$ as a 1-form (a linear map on vectors, living in the dual space) is covered in 1-Forms and the Dual Basis.
Derivatives: rates of change
A derivative measures how fast a function’s output changes as its input changes.
Single variable
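For a function $f: \mathbb{R} \to \mathbb{R}$, the derivative at a point $a$ is

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.$$

The output is a number: the slope of the tangent line at $a$. Alternative notation: $\frac{df}{dx}(a)$.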
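The limit in this definition can be watched numerically. The sketch below is only an illustration: the helper name `difference_quotient` and the choice $f(x) = \sin x$ at $a = 1$ are assumptions made here, not part of the note. It evaluates the secant slope for shrinking step sizes and compares it with the known derivative $\cos a$.

```python
import math

def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h

# Illustrative choice (an assumption): f(x) = sin(x), whose derivative is cos(x).
a = 1.0
for h in (1e-1, 1e-3, 1e-5):
    approx = difference_quotient(math.sin, a, h)
    error = abs(approx - math.cos(a))
    print(f"h = {h:g}: quotient = {approx:.8f}, error = {error:.2e}")
```

The error should shrink roughly in proportion to the step size, which is the numerical face of the secant slopes converging to the tangent slope.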
Several variables: partial derivatives
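For a function $f: \mathbb{R}^n \to \mathbb{R}$, the partial derivative with respect to $x_i$ at a point $\mathbf{a}$ is

$$\frac{\partial f}{\partial x_i}(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\,\mathbf{e}_i) - f(\mathbf{a})}{h},$$

where $\mathbf{e}_i$ is the $i$-th standard basis vector (all zeros except a 1 in position $i$). This holds all other variables fixed and differentiates with respect to $x_i$ alone. The output is again a number.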
Example. For :
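As one concrete illustration of partial differentiation, here is a short worked computation. The function $f(x, y) = x^2 y + y^3$ is an assumption chosen for this sketch. Each partial derivative treats the other variable as a constant.

```latex
f(x, y) = x^2 y + y^3
\qquad\Longrightarrow\qquad
\frac{\partial f}{\partial x} = 2xy,
\qquad
\frac{\partial f}{\partial y} = x^2 + 3y^2 .
```

At the point $(1, 2)$, for instance, these evaluate to $\partial f/\partial x = 4$ and $\partial f/\partial y = 13$.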
Vector-valued functions
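A vector-valued function $\mathbf{r}: \mathbb{R} \to \mathbb{R}^n$ (such as a parametrized curve) has a derivative that is computed component by component:

$$\mathbf{r}'(t) = \big(x_1'(t), \dots, x_n'(t)\big), \quad \text{where } \mathbf{r}(t) = \big(x_1(t), \dots, x_n(t)\big).$$

The derivative is a vector: the velocity vector of the curve at parameter $t$. See Arc-Length Parametrization for how this velocity vector relates to speed and arc length.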
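A short worked case may help. The helix below is an assumption chosen for illustration, not a curve taken from this note. Differentiating each component separately gives the velocity, and its length gives the speed.

```latex
\mathbf{r}(t) = (\cos t,\ \sin t,\ t)
\qquad\Longrightarrow\qquad
\mathbf{r}'(t) = (-\sin t,\ \cos t,\ 1),
\qquad
|\mathbf{r}'(t)| = \sqrt{\sin^2 t + \cos^2 t + 1} = \sqrt{2}.
```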
The differential: linear approximation
The differential $df$ captures the best linear approximation to how $f$ changes near a point.
Single variable
For $f: \mathbb{R} \to \mathbb{R}$, the differential at $a$ is

$$df = f'(a)\,dx,$$

where $dx$ represents a small change in the input. The meaning: if you change $x$ by a small amount $dx$, then $f$ changes by approximately $f'(a)\,dx$. This is the tangent-line approximation.
Example. If , then . At , a change of gives . The actual change is , so the linear approximation is very close.
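The comparison between the differential and the actual change can be checked numerically. The sketch below uses assumed values chosen for illustration ($f(x) = x^2$, base point $3$, displacement $0.1$), not necessarily the ones in the example above. It computes the tangent-line estimate and the true change side by side.

```python
def f(x):
    return x ** 2

def f_prime(x):
    return 2 * x

# Assumed illustrative values: base point a = 3, displacement dx = 0.1.
a, dx = 3.0, 0.1
df = f_prime(a) * dx         # linear (tangent-line) estimate of the change
actual = f(a + dx) - f(a)    # true change in f
print(f"df = {df:.4f}, actual = {actual:.4f}, gap = {actual - df:.4f}")
```

For this particular $f$ the gap is exactly $dx^2$, so halving the displacement cuts the error by a factor of four.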
Several variables
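For $f: \mathbb{R}^n \to \mathbb{R}$, the differential is

$$df = \frac{\partial f}{\partial x_1}\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i.$$

Here each $dx_i$ represents a small change in the $i$-th coordinate. The differential tells you: if you make small changes $dx_1, \dots, dx_n$ simultaneously, the total change in $f$ is approximately $df$.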
Example. For :
At the point with displacements , :
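The same check works in several variables. The sketch below is an illustration under assumed choices: $f(x, y) = x^2 y$, the point $(1, 2)$, and displacements $dx = 0.01$, $dy = -0.02$ are picked here and may differ from the example above. The hypothetical helper `differential` hard-codes the partials $2xy$ and $x^2$ of that $f$.

```python
def f(x, y):
    return x ** 2 * y

def differential(x, y, dx, dy):
    """df = (df/dx) dx + (df/dy) dy for f = x^2 y, with partials 2xy and x^2."""
    return 2 * x * y * dx + x ** 2 * dy

# Assumed illustrative point and displacements.
x0, y0 = 1.0, 2.0
dx, dy = 0.01, -0.02
df = differential(x0, y0, dx, dy)
actual = f(x0 + dx, y0 + dy) - f(x0, y0)
print(f"df = {df:.6f}, actual change = {actual:.6f}")
```

The two contributions partly cancel here: the increase in $x$ raises $f$, while the decrease in $y$ lowers it, and the differential adds the two effects linearly.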
Differential vs. derivative
A derivative ($f'(a)$ or $\frac{\partial f}{\partial x_i}$) is a number, a rate of change. The differential ($df$) is a linear expression in the displacements: it tells you how much $f$ changes for a given small displacement. The partial derivatives appear as coefficients in the differential. In the notation $df = \sum_i \frac{\partial f}{\partial x_i}\,dx_i$, the $\frac{\partial f}{\partial x_i}$ are the coefficients and the $dx_i$ are the displacement variables.
A deeper view exists
At the 18.02 level, $dx$ and $dy$ are “small increments.” But they have a precise identity: they are basis elements of the dual space, linear functions that extract components from vectors. The differential $df$ is then a linear map on vectors, not just a bookkeeping device. This deeper interpretation is covered in 1-Forms and the Dual Basis and is not needed for the chain rule or arc-length computations.
The chain rule
The chain rule tells you how to differentiate a composition of functions. It is the key tool for reparametrization (changing variables).
Single variable
If $y = f(g(x))$, then

$$\frac{dy}{dx} = f'(g(x))\,g'(x).$$
The derivative of a composition is the derivative of the outer function (evaluated at the inner function) times the derivative of the inner function.
Example. Let . Here and , so:
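A worked instance of the outer-times-inner pattern follows. The composite $\sin(x^2)$ is an assumption chosen for this sketch. The outer function is differentiated first and evaluated at the inner function, then multiplied by the derivative of the inner function.

```latex
y = \sin(x^2), \qquad f(u) = \sin u, \qquad g(x) = x^2,
\\[4pt]
f'(u) = \cos u, \qquad g'(x) = 2x,
\\[4pt]
\frac{dy}{dx} = f'(g(x))\, g'(x) = \cos(x^2) \cdot 2x = 2x \cos(x^2).
```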
Several variables
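If $f(x_1, \dots, x_n)$ is a function of $n$ variables, and each $x_i$ depends on a parameter $t$, then

$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}.$$

Each term accounts for how $f$ changes because $x_i$ changes with $t$. The total rate of change is the sum of all these contributions.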
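The sum-of-contributions formula can be sanity-checked numerically. In the sketch below, the function $f(x, y) = x^2 y$ and the path $x = \cos t$, $y = \sin t$ are assumptions chosen for illustration, and the helper names are hypothetical. One routine assembles $df/dt$ from the partials and the path derivatives; the other differentiates the composite directly with a central difference.

```python
import math

def f(x, y):
    return x ** 2 * y

def chain_rule_rate(t):
    """df/dt assembled from the partials of f and the derivatives of x(t), y(t)."""
    x, y = math.cos(t), math.sin(t)
    df_dx, df_dy = 2 * x * y, x ** 2          # partial derivatives of f
    dx_dt, dy_dt = -math.sin(t), math.cos(t)  # derivatives of the path
    return df_dx * dx_dt + df_dy * dy_dt

def direct_rate(t, h=1e-6):
    """Central-difference derivative of the composite t -> f(x(t), y(t))."""
    F = lambda u: f(math.cos(u), math.sin(u))
    return (F(t + h) - F(t - h)) / (2 * h)

t0 = 0.7
print(chain_rule_rate(t0), direct_rate(t0))
```

The two printed values should agree to many decimal places; any remaining difference comes from the finite step in the numerical derivative.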
Vector-valued functions
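The chain rule applies component by component to vector-valued functions. If $\mathbf{r} = \mathbf{r}(t)$ and $t = t(s)$ is a change of parameter, then

$$\frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}}{dt}\,\frac{dt}{ds}.$$

This is used in Arc-Length Parametrization to prove the unit-speed property: the arc-length function $s(t)$ satisfies $\frac{ds}{dt} = |\mathbf{r}'(t)|$, so by the inverse function theorem $\frac{dt}{ds} = \frac{1}{|\mathbf{r}'(t)|}$, and the chain rule gives

$$\frac{d\mathbf{r}}{ds} = \frac{\mathbf{r}'(t)}{|\mathbf{r}'(t)|},$$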
which has magnitude 1.
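The unit-speed property can be observed on a concrete curve. The sketch below assumes the helix $\mathbf{r}(t) = (\cos t, \sin t, t)$, chosen here for illustration, whose speed is the constant $\sqrt{2}$, so that arc length measured from $t = 0$ is $s = \sqrt{2}\,t$ and $t(s) = s/\sqrt{2}$. The helper names are hypothetical.

```python
import math

SPEED = math.sqrt(2.0)  # |r'(t)| for the assumed helix r(t) = (cos t, sin t, t)

def r(t):
    return (math.cos(t), math.sin(t), t)

def r_by_arc_length(s):
    """Helix reparametrized by arc length, using t(s) = s / sqrt(2)."""
    return r(s / SPEED)

def speed(curve, u, h=1e-6):
    """Magnitude of the central-difference velocity of `curve` at parameter u."""
    p, q = curve(u + h), curve(u - h)
    return math.sqrt(sum(((a - b) / (2 * h)) ** 2 for a, b in zip(p, q)))

print(speed(r, 1.3), speed(r_by_arc_length, 1.3))
```

The first value should be close to $\sqrt{2}$ and the second close to 1: rescaling the parameter by $dt/ds = 1/\sqrt{2}$ is exactly what normalizes the velocity.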
The chain rule for differentials
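The chain rule has a clean expression in differential notation. If $f = f(x_1, \dots, x_n)$ where $x_i = x_i(t)$, then:

$$df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}\,dt.$$

But also, computing $df$ directly, with $f$ regarded as a function of $t$:

$$df = \frac{df}{dt}\,dt.$$

Setting these equal:

$$\frac{df}{dt}\,dt = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}\,dt,$$

which gives

$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}.$$

The equation before $dt$ is cancelled is an equality of differentials, not of numbers. It holds for any displacement $dt$. This kind of manipulation, applying $d$ to both sides of an equation and using linearity, is a powerful technique for deriving relationships between differentials without going through partial derivatives explicitly.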
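As a small illustration of applying $d$ to both sides of an equation, consider a point constrained to the unit circle. The constraint is an assumption chosen for this sketch. Linearity splits the differential of the sum, and each term follows from the single-variable rule.

```latex
x^2 + y^2 = 1
\\[4pt]
d(x^2 + y^2) = d(1)
\\[4pt]
d(x^2) + d(y^2) = 0
\\[4pt]
2x\,dx + 2y\,dy = 0
\qquad\Longrightarrow\qquad
dy = -\frac{x}{y}\,dx \quad (y \neq 0).
```

The result relates the displacements $dx$ and $dy$ that keep the point on the circle, obtained without first solving for $y$ as a function of $x$.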
See also
- Arc-Length Parametrization --- uses the chain rule to derive the unit-speed property
- Position Vectors and Coordinate-Free Geometry --- position vectors and affine combinations, prerequisite for parametrized curves
- Directional Derivatives and the Gradient --- directional derivatives as a special case of the chain rule
- 1-Forms and the Dual Basis --- the deeper interpretation of $dx_i$ and $df$ as linear maps on vectors (enrichment, beyond 18.02)