Derivatives, Differentials, and the Chain Rule

This note covers derivatives and the chain rule at the level needed for multivariable calculus (MIT 18.02). It treats the differential $df$ as a linear approximation: a machine that tells you how much $f$ changes when you make a small displacement. The deeper interpretation of $df$ as a 1-form (a linear map on vectors, living in the dual space) is covered in 1-Forms and the Dual Basis.

Derivatives: rates of change

A derivative measures how fast a function’s output changes as its input changes.

Single variable

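For a function $f : \mathbb{R} \to \mathbb{R}$, the derivative at a point $a$ is

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.$$

The output is a number: the slope of the tangent line at $x = a$. Alternative notation: $\frac{df}{dx}(a)$.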
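As a rough numerical illustration of the limit definition, the sketch below evaluates the difference quotient for shrinking step sizes. The helper name `difference_quotient` and the test function $x^3$ are choices made here for illustration; they do not come from the note.

```python
def difference_quotient(f, a, h):
    """Return (f(a + h) - f(a)) / h, the quotient inside the limit."""
    return (f(a + h) - f(a)) / h


def cube(x):
    return x ** 3


# The exact derivative of x^3 is 3x^2, so the slope at a = 2 is 12.
# The quotients should move toward 12 as h shrinks.
quotients = [difference_quotient(cube, 2.0, h) for h in (1e-1, 1e-3, 1e-5)]
for q in quotients:
    print(q)
```

For this function the quotient works out to $12 + 6h + h^2$, so the gap to the true slope shrinks roughly in proportion to $h$.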

Several variables: partial derivatives

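For a function $f : \mathbb{R}^n \to \mathbb{R}$, the partial derivative with respect to $x_i$ at a point $\mathbf{a}$ is

$$\frac{\partial f}{\partial x_i}(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\,\mathbf{e}_i) - f(\mathbf{a})}{h},$$

where $\mathbf{e}_i$ is the $i$-th standard basis vector (all zeros except a 1 in position $i$). This holds all other variables fixed and differentiates with respect to $x_i$ alone. The output is again a number.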
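The same quotient can be sketched in code by nudging a single coordinate. The function $f(x, y) = x^2 y + y^3$ and the helper name `partial_derivative` are assumptions made for this illustration, and the coordinate index is 0-based as is usual in Python.

```python
def partial_derivative(f, point, i, h=1e-6):
    """Difference quotient along the i-th standard basis direction."""
    shifted = list(point)
    shifted[i] += h  # point + h * e_i: only coordinate i moves
    return (f(shifted) - f(point)) / h


def f(p):
    x, y = p
    return x ** 2 * y + y ** 3


# Exact partials: df/dx = 2xy and df/dy = x^2 + 3y^2.
# At (1, 2) these are 4 and 13.
point = [1.0, 2.0]
df_dx = partial_derivative(f, point, 0)
df_dy = partial_derivative(f, point, 1)
print(df_dx, df_dy)
```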

Example. For :

Vector-valued functions

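A vector-valued function $\mathbf{r} : \mathbb{R} \to \mathbb{R}^n$ (such as a parametrized curve) has a derivative that is computed component by component:

$$\mathbf{r}'(t) = \bigl(x_1'(t), \dots, x_n'(t)\bigr) \quad \text{for} \quad \mathbf{r}(t) = \bigl(x_1(t), \dots, x_n(t)\bigr).$$

The derivative is a vector: the velocity vector of the curve at parameter $t$. See Arc-Length Parametrization for how this velocity vector relates to speed and arc length.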

The differential: linear approximation

The differential $df$ captures the best linear approximation to how $f$ changes near a point.

Single variable

For $f : \mathbb{R} \to \mathbb{R}$, the differential at $x$ is

$$df = f'(x)\,dx,$$

where $dx$ represents a small change in the input. The meaning: if you change $x$ by a small amount $dx$, then $f$ changes by approximately $f'(x)\,dx$. This is the tangent-line approximation.

Example. If , then . At , a change of gives . The actual change is , so the linear approximation is very close.
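A small numerical comparison of the differential with the true change may help here. The function $f(x) = x^2$, the point $x = 3$, and the step $dx = 0.1$ are values chosen for this sketch, not necessarily the ones used in the example above.

```python
def f(x):
    return x ** 2


def f_prime(x):
    return 2 * x


a, dx = 3.0, 0.1
df = f_prime(a) * dx       # differential: linear estimate of the change
actual = f(a + dx) - f(a)  # true change in f
error = actual - df        # for this f the gap is exactly dx**2

print(df, actual, error)
```

With these values the linear estimate is about 0.6 and the true change about 0.61, so the gap is of order $dx^2$, which is what makes the approximation useful for small displacements.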

Several variables

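For $f : \mathbb{R}^n \to \mathbb{R}$, the differential is

$$df = \frac{\partial f}{\partial x_1}\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i.$$

Here each $dx_i$ represents a small change in the $i$-th coordinate. The differential tells you: if you make small changes $dx_1, \dots, dx_n$ simultaneously, the total change in $f$ is approximately $df$.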
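The sum of partial-derivative contributions can be checked against the true change in a few lines. The function $f(x, y) = x^2 y$, the base point $(1, 2)$, and the displacements are assumptions picked for this sketch.

```python
def f(x, y):
    return x ** 2 * y


def f_x(x, y):
    return 2 * x * y  # partial derivative with respect to x


def f_y(x, y):
    return x ** 2     # partial derivative with respect to y


x0, y0 = 1.0, 2.0
dx, dy = 0.01, -0.02

# Total differential: each partial weights its own displacement.
df = f_x(x0, y0) * dx + f_y(x0, y0) * dy
actual = f(x0 + dx, y0 + dy) - f(x0, y0)

print(df, actual)
```

Here the differential comes out to about 0.02 while the true change is about 0.019798; the difference is second order in the displacements.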

Example. For :

At the point with displacements , :

Differential vs. derivative

A derivative ($f'(x)$ or $\partial f / \partial x_i$) is a number: a rate of change. The differential ($df$) is a linear expression in the displacements $dx_i$; it tells you how much $f$ changes for a given small displacement. The partial derivatives appear as coefficients in the differential. In the notation $df = \sum_i \frac{\partial f}{\partial x_i}\,dx_i$, the $\partial f / \partial x_i$ are the coefficients and the $dx_i$ are the displacement variables.

A deeper view exists

At the 18.02 level, $dx$ and $dy$ are “small increments.” But they have a precise identity: they are basis elements of the dual space, that is, linear functions that extract components from vectors. The differential $df$ is then a linear map on vectors, not just a bookkeeping device. This deeper interpretation is covered in 1-Forms and the Dual Basis and is not needed for the chain rule or arc-length computations.

The chain rule

The chain rule tells you how to differentiate a composition of functions. It is the key tool for reparametrization (changing variables).

Single variable

If $h(x) = f(g(x))$, then

$$h'(x) = f'(g(x))\,g'(x).$$

The derivative of a composition is the derivative of the outer function (evaluated at the inner function) times the derivative of the inner function.
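One way to sanity-check the rule is to compare it with a numerical derivative of the composition. The outer function $\sin$, the inner function $x^2$, and the helper names below are assumptions for this sketch; the numerical derivative uses a symmetric difference quotient rather than the one-sided limit.

```python
import math


def g(x):
    return x ** 2        # inner function


def f(u):
    return math.sin(u)   # outer function


def composition(x):
    return f(g(x))


def chain_rule_derivative(x):
    # Outer derivative evaluated at the inner function, times inner derivative.
    return math.cos(g(x)) * (2 * x)


def numerical_derivative(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)


x = 1.5
exact = chain_rule_derivative(x)
approx = numerical_derivative(composition, x)
print(exact, approx)
```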

Example. Let . Here and , so:

Several variables

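If $f(x_1, \dots, x_n)$ is a function of $n$ variables, and each $x_i$ depends on a parameter $t$, then

$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}.$$

Each term accounts for how $f$ changes because $x_i$ changes with $t$. The total rate of change is the sum of all these contributions.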
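The sum of contributions can be compared against a direct numerical derivative along a path. The function $f(x, y) = x^2 y$ and the path $x = \cos t$, $y = \sin t$ are assumptions chosen for this sketch.

```python
import math


def f(x, y):
    return x ** 2 * y


def along_path(t):
    # f restricted to the path x = cos t, y = sin t.
    return f(math.cos(t), math.sin(t))


def chain_rule_rate(t):
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    df_dx = 2 * x * y
    df_dy = x ** 2
    # One term per variable: (partial of f) * (rate of that variable).
    return df_dx * dx_dt + df_dy * dy_dt


def numerical_derivative(func, t, h=1e-6):
    return (func(t + h) - func(t - h)) / (2 * h)


t = 0.7
exact = chain_rule_rate(t)
approx = numerical_derivative(along_path, t)
print(exact, approx)
```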

Vector-valued functions

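The chain rule applies component by component to vector-valued functions. If $\mathbf{r}(t)$ is a curve and $t = t(s)$ is a change of parameter, then

$$\frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}}{dt}\,\frac{dt}{ds}.$$

This is used in Arc-Length Parametrization to prove the unit-speed property: the arc-length function $s(t)$ satisfies $\frac{ds}{dt} = |\mathbf{r}'(t)|$, so by the inverse function theorem $\frac{dt}{ds} = \frac{1}{|\mathbf{r}'(t)|}$, and the chain rule gives

$$\frac{d\mathbf{r}}{ds} = \frac{\mathbf{r}'(t)}{|\mathbf{r}'(t)|},$$

which has magnitude 1.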
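The unit-speed claim can be checked on a concrete curve. The helix $\mathbf{r}(t) = (\cos t, \sin t, t)$ is an assumption chosen for this sketch; its speed is $\sqrt{2}$ everywhere, so dividing the velocity by the speed should give a vector of length 1.

```python
import math


def r_prime(t):
    # Velocity of the helix r(t) = (cos t, sin t, t).
    return (-math.sin(t), math.cos(t), 1.0)


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def unit_speed_velocity(t):
    speed = norm(r_prime(t))  # ds/dt
    # Chain rule: dr/ds = r'(t) * dt/ds = r'(t) / speed.
    return tuple(c / speed for c in r_prime(t))


t = 0.7
speed = norm(r_prime(t))
magnitude = norm(unit_speed_velocity(t))
print(speed, magnitude)
```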

The chain rule for differentials

The chain rule has a clean expression in differential notation. If $y = f(u)$ where $u = g(x)$, then:

$$dy = f'(u)\,du.$$

But also, computing $dy$ directly as a function of $x$:

$$dy = \frac{dy}{dx}\,dx.$$

Setting these equal:

$$\frac{dy}{dx}\,dx = f'(u)\,du,$$

which gives, since $du = g'(x)\,dx$,

$$\frac{dy}{dx}\,dx = f'(g(x))\,g'(x)\,dx.$$

This is an equality of differentials, not of numbers. It holds for any displacement $dx$. This kind of manipulation, applying $d$ to both sides of an equation and using linearity, is a powerful technique for deriving relationships between differentials without going through partial derivatives explicitly.

See also