Critical Points and the Hessian

Critical points from the gradient

A critical point of a differentiable function $f (x, y)$ is a point $(x_{0}, y_{0})$ where the gradient vanishes:

\nabla f (x_{0}, y_{0}) = (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}) = (0, 0) .

See Directional Derivatives and the Gradient for the full derivation of $\nabla f$ and the formula $D_{u} f = \nabla f \cdot u$ . That formula has an immediate consequence here: if $\nabla f = 0$ , then $D_{u} f = 0 \cdot u = 0$ for every unit vector $u$ . The directional derivative is zero in every direction. Geometrically, the surface $z = f (x, y)$ is momentarily flat at $(x_{0}, y_{0})$ --- it has a horizontal tangent plane.
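The claim that a vanishing gradient forces every directional derivative to vanish can be checked numerically. The following is only a sketch under stated assumptions: the sample function `f`, the helper names `grad` and `directional_derivative`, and the step size `eps` are illustrative choices, not part of these notes. It estimates the gradient by central differences and evaluates $D_{u} f = \nabla f \cdot u$ for eight unit vectors, once at a critical point and once away from it.

```python
import math

def f(x, y):
    # Hypothetical sample function (an assumption); its critical point is (1, -2).
    return (x - 1.0) ** 2 + 3.0 * (y + 2.0) ** 2

def grad(func, x, y, eps=1e-6):
    # Central-difference estimate of (f_x, f_y).
    fx = (func(x + eps, y) - func(x - eps, y)) / (2.0 * eps)
    fy = (func(x, y + eps) - func(x, y - eps)) / (2.0 * eps)
    return fx, fy

def directional_derivative(func, x, y, theta):
    # D_u f = grad f . u for the unit vector u = (cos theta, sin theta).
    fx, fy = grad(func, x, y)
    return fx * math.cos(theta) + fy * math.sin(theta)

angles = [i * math.pi / 4.0 for i in range(8)]
at_critical = [directional_derivative(f, 1.0, -2.0, t) for t in angles]
away = [directional_derivative(f, 2.0, -2.0, t) for t in angles]

print("largest |D_u f| at the critical point:", max(abs(d) for d in at_critical))
print("largest D_u f away from it:", max(away))
```

At the critical point every sampled directional derivative is zero up to rounding, while at the other point the value depends on the direction.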

But “flat” is ambiguous. A hilltop is flat at the summit (local maximum). A valley floor is flat at the bottom (local minimum). A mountain pass is flat at the saddle --- it curves up in one direction and down in another (saddle point). The gradient alone cannot distinguish these cases because it only captures first-order behaviour. To classify a critical point, we need second-order information: the second partial derivatives of $f$ .

Second-order Taylor expansion at a critical point

The tool for extracting second-order behaviour is the Taylor expansion --- the approximation of a function by a polynomial that matches the function’s value and derivatives at a point. For a function of two variables, the second-order Taylor expansion of $f$ around the point $(x_{0}, y_{0})$ is

f (x_{0} + h, y_{0} + k) \approx f (x_{0}, y_{0}) + f_{x} h + f_{y} k + \frac{1}{2} (f_{xx} h^{2} + 2 f_{x y} hk + f_{yy} k^{2}),

where all partial derivatives are evaluated at $(x_{0}, y_{0})$ , $h = x - x_{0}$ is the displacement in $x$ , and $k = y - y_{0}$ is the displacement in $y$ . The notation $f_{xx} = \frac{\partial ^{2} f}{\partial x ^{2}}$ , $f_{x y} = \frac{\partial ^{2} f}{\partial x \partial y}$ , and $f_{yy} = \frac{\partial ^{2} f}{\partial y ^{2}}$ denotes the second partial derivatives --- derivatives of derivatives.

At a critical point, $f_{x} = 0$ and $f_{y} = 0$ , so the linear terms vanish. Subtracting the constant $f (x_{0}, y_{0})$ , the change in function value is controlled entirely by the quadratic form --- a homogeneous polynomial of degree 2 in $h$ and $k$ :

Δ f \approx \frac{1}{2} (A h^{2} + 2 B hk + C k^{2}),

where we introduce the shorthand

A = f_{xx} (x_{0}, y_{0}), B = f_{x y} (x_{0}, y_{0}), C = f_{yy} (x_{0}, y_{0}) .

A quadratic form is an expression of the type $α h^{2} + β hk + γ k^{2}$ --- a polynomial where every term has total degree exactly 2. Quadratic forms are the two-variable analogue of $a x^{2}$ in single-variable calculus: they capture the curvature of the function.
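To see how well the quadratic form tracks the actual change in $f$ , one can compare the two for shrinking displacements. The sketch below uses a hypothetical function chosen for illustration (it is not one of the examples in these notes) whose critical point is the origin and whose second partials there, computed by hand, are $A = 2$ , $B = 3$ , $C = 2$ .

```python
def f(x, y):
    # Hypothetical example: f_x = 2x + 3y + 3x^2 y and f_y = 3x + 2y + x^3
    # both vanish at the origin, so (0, 0) is a critical point.
    return x**2 + 3.0 * x * y + y**2 + x**3 * y

# Second partials at (0, 0), computed by hand: f_xx = 2 + 6xy, f_xy = 3 + 3x^2, f_yy = 2.
A, B, C = 2.0, 3.0, 2.0

def quadratic_approx(h, k):
    # (1/2)(A h^2 + 2 B h k + C k^2)
    return 0.5 * (A * h * h + 2.0 * B * h * k + C * k * k)

for step in (0.1, 0.01, 0.001):
    h, k = step, -0.5 * step
    exact = f(h, k) - f(0.0, 0.0)
    approx = quadratic_approx(h, k)
    print(step, exact, approx, abs(exact - approx))
```

The printed error shrinks much faster than the quadratic term itself as the step decreases, which is what makes the quadratic form the dominant contribution near the critical point.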

The key insight

Near a critical point, the function’s behaviour is determined by the sign of the quadratic form $A h^{2} + 2 B hk + C k^{2}$ . If this form is positive for every displacement $(h, k) \neq (0, 0)$ , the critical point is a local minimum. If it is negative for every such displacement, the point is a local maximum. If it changes sign depending on direction, the point is a saddle point.
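The sign behaviour described here can be explored by sampling. Scaling $(h, k)$ by $t$ multiplies the form by $t^{2}$ , so it is enough to look at unit vectors. The helper `form_signs` below is a hypothetical name, and sampling is only a numerical illustration, not a proof of definiteness.

```python
import math

def form_signs(A, B, C, samples=360):
    # Evaluate A h^2 + 2 B h k + C k^2 on unit vectors (h, k)
    # and collect the signs that occur.
    signs = set()
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        h, k = math.cos(theta), math.sin(theta)
        q = A * h * h + 2.0 * B * h * k + C * k * k
        if q > 1e-12:
            signs.add("+")
        elif q < -1e-12:
            signs.add("-")
        else:
            signs.add("0")
    return signs

print(form_signs(2.0, 0.0, 2.0))    # only positive values: minimum-like
print(form_signs(-2.0, 0.0, -2.0))  # only negative values: maximum-like
print(form_signs(2.0, 0.0, -2.0))   # both signs occur: saddle-like
```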

Quadratic form as matrix product: the Hessian

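The quadratic form $A h^{2} + 2 B hk + C k^{2}$ can be written as a matrix product. Define the column vector $v = \begin{pmatrix} h \\ k \end{pmatrix}$ and the $2 \times 2$ matrix

H = \begin{pmatrix} A & B \\ B & C \end{pmatrix} = \begin{pmatrix} f_{xx} & f_{x y} \\ f_{x y} & f_{yy} \end{pmatrix} .

Then

v^{T} H v = \begin{pmatrix} h & k \end{pmatrix} \begin{pmatrix} A & B \\ B & C \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} = A h^{2} + 2 B hk + C k^{2},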

and so $Δ f \approx \frac{1}{2} v^{T} H v$ .
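The identity between the matrix product and the expanded polynomial can be confirmed on concrete numbers. This is a small sketch with arbitrarily chosen values; the function names are illustrative.

```python
def quad_form_matrix(H, v):
    # v^T H v, written out as a matrix-vector product followed by a dot product.
    Hv = [H[0][0] * v[0] + H[0][1] * v[1],
          H[1][0] * v[0] + H[1][1] * v[1]]
    return v[0] * Hv[0] + v[1] * Hv[1]

def quad_form_expanded(A, B, C, h, k):
    # A h^2 + 2 B h k + C k^2
    return A * h * h + 2.0 * B * h * k + C * k * k

A, B, C = 2.0, 3.0, -1.0   # arbitrary sample entries
H = [[A, B], [B, C]]
h, k = 0.5, -2.0

print(quad_form_matrix(H, [h, k]), quad_form_expanded(A, B, C, h, k))
```

Both expressions evaluate to the same number; the factor $2$ in front of $B hk$ comes from the two off-diagonal entries of $H$ .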

The matrix $H$ is the Hessian matrix of $f$ at the critical point: the matrix of all second partial derivatives. It is named after the German mathematician Ludwig Otto Hesse (1811–1874), who introduced it in the context of algebraic geometry.

Notice that $B = f_{x y}$ appears in both off-diagonal positions. This is because of Clairaut’s theorem (also called Schwarz’s theorem): if the second partial derivatives $f_{x y}$ and $f_{y x}$ are both continuous, then $f_{x y} = f_{y x}$ --- the order of differentiation does not matter. This symmetry $f_{x y} = f_{y x}$ is what makes $H$ a symmetric matrix (a matrix equal to its own transpose: $H = H^{T}$ ).
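The equality of mixed partials can be illustrated numerically. The sketch below assumes a sample function, $f (x, y) = x^{3} y^{2} + \sin (x y)$ , that does not appear elsewhere in these notes. Its first partials are written out by hand, and each is then differentiated in the other variable by central differences.

```python
import math

def f_x(x, y):
    # Hand-computed partial of f(x, y) = x^3 y^2 + sin(x y) with respect to x.
    return 3.0 * x**2 * y**2 + y * math.cos(x * y)

def f_y(x, y):
    # Hand-computed partial of the same f with respect to y.
    return 2.0 * x**3 * y + x * math.cos(x * y)

def mixed_partials(x, y, eps=1e-5):
    # Differentiate f_x in y, and f_y in x, by central differences.
    d_fx_dy = (f_x(x, y + eps) - f_x(x, y - eps)) / (2.0 * eps)
    d_fy_dx = (f_y(x + eps, y) - f_y(x - eps, y)) / (2.0 * eps)
    return d_fx_dy, d_fy_dx

x0, y0 = 0.7, -1.3   # arbitrary test point
d_fx_dy, d_fy_dx = mixed_partials(x0, y0)
# Closed form of the mixed partial, for comparison.
exact = 6.0 * x0**2 * y0 + math.cos(x0 * y0) - x0 * y0 * math.sin(x0 * y0)

print(d_fx_dy, d_fy_dx, exact)
```

The two numerical estimates agree with each other and with the closed form up to discretisation error, as Clairaut’s theorem predicts for a function with continuous second partials.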

Notation bridge

MIT 18.02 uses $A = f_{xx}$ , $B = f_{x y}$ , $C = f_{yy}$ and checks $A C - B^{2}$ . Politecnico di Torino teaches this as the Hessian matrix $H$ , checking $det (H)$ and eigenvalues. Same object, same test, different packaging.

The second-derivative test: classification table

The sign behaviour of the quadratic form $v^{T} H v$ determines the nature of the critical point. The classification depends on the determinant of the Hessian, $det (H) = A C - B^{2}$ , and the sign of $A = f_{xx}$ :

$det (H) = A C - B^{2}$	$A$	Classification
$> 0$	$> 0$	Local minimum
$> 0$	$< 0$	Local maximum
$< 0$	any	Saddle point
$= 0$	any	Degenerate (test inconclusive)
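The table translates almost line for line into a small function. This is a minimal sketch: the name `classify` and the tolerance `tol`, used to decide when the determinant counts as zero in floating point, are assumptions made for illustration.

```python
def classify(A, B, C, tol=1e-12):
    # Second-derivative test for H = [[A, B], [B, C]] at a critical point.
    det = A * C - B * B
    if det > tol:
        # det > 0 implies A*C > B^2 >= 0, so A is non-zero here.
        return "local minimum" if A > 0 else "local maximum"
    if det < -tol:
        return "saddle point"
    return "degenerate (test inconclusive)"

print(classify(2.0, 0.0, 2.0))
print(classify(-2.0, 0.0, -2.0))
print(classify(2.0, 0.0, -2.0))
print(classify(0.0, 0.0, 0.0))
```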

Why these conditions work

When $det (H) > 0$ and $A > 0$ : both eigenvalues are positive (see next section), so the quadratic form is always positive --- the surface curves upward in every direction. Local minimum.

When $det (H) > 0$ and $A < 0$ : both eigenvalues are negative, so the quadratic form is always negative --- the surface curves downward in every direction. Local maximum.

When $det (H) < 0$ : the eigenvalues have opposite signs, so the quadratic form is positive in some directions and negative in others --- the surface curves up one way and down another. Saddle point.

When $det (H) = 0$ : at least one eigenvalue is zero, and the second-order information is insufficient. The point could be a minimum, maximum, saddle, or something more exotic. Higher-order derivatives are needed. This case is called degenerate.

A quadratic form that is always positive (for all non-zero $v$ ) is called positive definite. Always negative: negative definite. Takes both signs: indefinite. Vanishes for some non-zero $v$ but never changes sign: semi-definite (positive semi-definite if it is otherwise non-negative, negative semi-definite if it is otherwise non-positive).

The diagram below shows the contour patterns for each case: elliptical contours closing around a minimum (left), hyperbolic contours at a saddle with eigenvector arrows showing the up/down directions (center), and elliptical contours around a maximum (right).

Eigenvalue connection

The classification table above is really a statement about the eigenvalues of $H$ . An eigenvalue $λ$ of a matrix $M$ is a scalar such that $M w = λ w$ for some non-zero vector $w$ (called an eigenvector): the matrix acts on $w$ by simply scaling it. The eigenvalues of a $2 \times 2$ matrix are the roots of its characteristic polynomial, that is, the solutions of $det (M - λ I) = 0$ .

Because $H$ is symmetric, a powerful result from linear algebra guarantees that its eigenvalues are well-behaved:

Spectral Theorem (for real symmetric matrices). Every real symmetric matrix has all real eigenvalues and can be diagonalized by an orthogonal matrix (a matrix whose columns are mutually perpendicular unit vectors). In other words, symmetric matrices have no complex eigenvalues, and their eigenvectors can be chosen to be orthogonal.

For our $2 \times 2$ Hessian with eigenvalues $λ_{1}$ and $λ_{2}$ , two standard facts from linear algebra connect eigenvalues to the determinant and trace:

det (H) = λ_{1} λ_{2}, tr (H) = λ_{1} + λ_{2} = A + C .

The trace $tr (H)$ is the sum of the diagonal entries. Now the classification table translates directly into eigenvalue language:

Eigenvalue condition	Quadratic form	Critical point
$λ_{1} > 0, λ_{2} > 0$	Positive definite	Local minimum
$λ_{1} < 0, λ_{2} < 0$	Negative definite	Local maximum
$λ_{1}$ and $λ_{2}$ have opposite signs	Indefinite	Saddle point
$λ_{1} = 0$ or $λ_{2} = 0$	Semi-definite	Degenerate

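When both eigenvalues are positive, their product $det (H) = λ_{1} λ_{2} > 0$ and $A = f_{xx} > 0$ (because $A$ is the $(1, 1)$ entry of a positive-definite matrix). When both are negative, the product is still positive but $A < 0$ . When they have opposite signs, the product is negative. Conversely, $det (H) > 0$ forces $A C > B^{2} \geq 0$ , so $A$ and $C$ share a sign, and the trace $A + C = λ_{1} + λ_{2}$ then shows that both eigenvalues carry the sign of $A$ . This is exactly the $A C - B^{2}$ test in the previous section.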
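For a $2 \times 2$ symmetric matrix the eigenvalues follow from the trace and determinant by the quadratic formula applied to $λ^{2} - tr (H) λ + det (H) = 0$ . The sketch below (function names are illustrative assumptions) computes them this way and classifies by their signs. The discriminant equals $(A - C)^{2} + 4 B^{2}$ , which is never negative, consistent with the Spectral Theorem.

```python
import math

def eigenvalues(A, B, C):
    # Roots of lambda^2 - (A + C) lambda + (A C - B^2) = 0 for H = [[A, B], [B, C]].
    tr = A + C
    # tr^2 - 4 det simplifies to (A - C)^2 + 4 B^2 >= 0, so both roots are real.
    root = math.sqrt((A - C) ** 2 + 4.0 * B * B)
    return (tr - root) / 2.0, (tr + root) / 2.0

def classify_by_eigenvalues(A, B, C, tol=1e-12):
    lo, hi = eigenvalues(A, B, C)   # lo <= hi
    if lo > tol:
        return "local minimum"
    if hi < -tol:
        return "local maximum"
    if lo < -tol and hi > tol:
        return "saddle point"
    return "degenerate"

print(eigenvalues(2.0, 1.0, 2.0), classify_by_eigenvalues(2.0, 1.0, 2.0))
print(eigenvalues(2.0, 0.0, -2.0), classify_by_eigenvalues(2.0, 0.0, -2.0))
print(eigenvalues(1.0, 1.0, 1.0), classify_by_eigenvalues(1.0, 1.0, 1.0))
```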

The eigenvalue perspective also reveals the principal directions of curvature: the eigenvectors of $H$ point in the directions along which the surface curves the most and the least. The corresponding eigenvalues are the curvatures in those directions.
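The principal directions can be computed explicitly in the $2 \times 2$ case. The following sketch assumes a sample Hessian with entries $A = 2$ , $B = 1$ , $C = 2$ (chosen for illustration only) and a hypothetical helper `principal_directions`. For $B \neq 0$ it uses the fact that $(B, λ - A)$ solves the first row of $(H - λ I) w = 0$ .

```python
import math

def principal_directions(A, B, C):
    # Eigenpairs (lambda, unit eigenvector) of H = [[A, B], [B, C]], smallest first.
    if abs(B) < 1e-12:
        # Diagonal matrix: the coordinate axes are the eigenvectors.
        pairs = [(A, (1.0, 0.0)), (C, (0.0, 1.0))]
        return sorted(pairs, key=lambda p: p[0])
    root = math.sqrt((A - C) ** 2 + 4.0 * B * B)
    pairs = []
    for lam in ((A + C - root) / 2.0, (A + C + root) / 2.0):
        w0, w1 = B, lam - A            # satisfies (A - lam) w0 + B w1 = 0
        norm = math.hypot(w0, w1)
        pairs.append((lam, (w0 / norm, w1 / norm)))
    return pairs

A, B, C = 2.0, 1.0, 2.0   # sample Hessian entries (an assumption)
(lam1, w1), (lam2, w2) = principal_directions(A, B, C)

def curvature_along(w):
    # Value of the quadratic form w^T H w for a unit vector w.
    return A * w[0] ** 2 + 2.0 * B * w[0] * w[1] + C * w[1] ** 2

print(lam1, w1, curvature_along(w1))
print(lam2, w2, curvature_along(w2))
print("dot product of the two directions:", w1[0] * w2[0] + w1[1] * w2[1])
```

The two directions come out perpendicular, and the quadratic form evaluated along each unit eigenvector returns the corresponding eigenvalue.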

Worked examples

Example 1: $f (x, y) = x^{2} + y^{2}$ (local minimum)

Gradient.

\nabla f = (2 x, 2 y) .

Setting $\nabla f = (0, 0)$ gives the critical point $(0, 0)$ .

Hessian.

f_{xx} = 2, f_{x y} = 0, f_{yy} = 2, H = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} .

Determinant and classification.

det (H) = 2 \cdot 2 - 0^{2} = 4 > 0, A = 2 > 0.

$det (H) > 0$ and $A > 0$ : local minimum.

Eigenvalues. $λ_{1} = 2$ , $λ_{2} = 2$ . Both positive, confirming positive definiteness. The surface is a paraboloid opening upward.

Example 2: $f (x, y) = - x^{2} - y^{2}$ (local maximum)

Gradient.

\nabla f = (- 2 x, - 2 y) .

Critical point: $(0, 0)$ .

Hessian.

f_{xx} = - 2, f_{x y} = 0, f_{yy} = - 2, H = \begin{pmatrix} - 2 & 0 \\ 0 & - 2 \end{pmatrix} .

Determinant and classification.

det (H) = (- 2) (- 2) - 0^{2} = 4 > 0, A = - 2 < 0.

$det (H) > 0$ and $A < 0$ : local maximum.

Eigenvalues. $λ_{1} = - 2$ , $λ_{2} = - 2$ . Both negative, confirming negative definiteness. The surface is an inverted paraboloid.

Example 3: $f (x, y) = x^{2} - y^{2}$ (saddle point)

Gradient.

\nabla f = (2 x, - 2 y) .

Critical point: $(0, 0)$ .

Hessian.

f_{xx} = 2, f_{x y} = 0, f_{yy} = - 2, H = \begin{pmatrix} 2 & 0 \\ 0 & - 2 \end{pmatrix} .

Determinant and classification.

det (H) = 2 \cdot (- 2) - 0^{2} = - 4 < 0.

$det (H) < 0$ : saddle point, regardless of $A$ .

Eigenvalues. $λ_{1} = 2$ , $λ_{2} = - 2$ . Opposite signs confirm indefiniteness. Along the $x$ -axis the surface curves up ( $f (x, 0) = x^{2}$ ); along the $y$ -axis it curves down ( $f (0, y) = - y^{2}$ ). The surface is a hyperbolic paraboloid, the classic saddle shape.
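Examples 1 to 3 can be cross-checked without any hand differentiation by estimating the Hessian entries with central differences and applying the determinant test. This is a sketch only: the helper names, the step size `eps`, and the tolerance `tol` are assumptions, and finite differences are exact here only up to rounding because the functions are quadratic.

```python
def hessian_entries(func, x, y, eps=1e-4):
    # Central-difference estimates of (f_xx, f_xy, f_yy) at (x, y).
    f0 = func(x, y)
    fxx = (func(x + eps, y) - 2.0 * f0 + func(x - eps, y)) / eps**2
    fyy = (func(x, y + eps) - 2.0 * f0 + func(x, y - eps)) / eps**2
    fxy = (func(x + eps, y + eps) - func(x + eps, y - eps)
           - func(x - eps, y + eps) + func(x - eps, y - eps)) / (4.0 * eps**2)
    return fxx, fxy, fyy

def classify(A, B, C, tol=1e-6):
    # Determinant test; tol absorbs finite-difference rounding.
    det = A * C - B * B
    if det > tol:
        return "local minimum" if A > 0 else "local maximum"
    if det < -tol:
        return "saddle point"
    return "degenerate (test inconclusive)"

examples = {
    "x^2 + y^2": lambda x, y: x**2 + y**2,
    "-x^2 - y^2": lambda x, y: -x**2 - y**2,
    "x^2 - y^2": lambda x, y: x**2 - y**2,
}

results = {}
for name, func in examples.items():
    A, B, C = hessian_entries(func, 0.0, 0.0)
    results[name] = classify(A, B, C)
    print(name, (A, B, C), results[name])
```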

Example 4: $f (x, y) = x^{3}$ (degenerate)

Gradient.

\nabla f = (3 x^{2}, 0) .

Setting $\nabla f = (0, 0)$ : the equation $3 x^{2} = 0$ gives $x = 0$ , while $f_{y} = 0$ holds everywhere and places no constraint on $y$ . The entire $y$ -axis therefore consists of critical points. Take $(0, 0)$ .

Hessian.

f_{xx} = 6 x, f_{x y} = 0, f_{yy} = 0, H_{(0, 0)} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} .

Determinant and classification.

det (H) = 0 \cdot 0 - 0^{2} = 0.

$det (H) = 0$ : degenerate --- the second-derivative test is inconclusive.

Eigenvalues. $λ_{1} = 0$ , $λ_{2} = 0$ . Both zero: the Hessian carries no curvature information at all.

What actually happens. Along the $x$ -axis, $f (x, 0) = x^{3}$ is negative for $x < 0$ and positive for $x > 0$ , so the origin is neither a local minimum nor a local maximum. It is an inflection point of the single-variable slice $x \mapsto x^{3}$ . To resolve degenerate cases, one must examine third- or higher-order derivatives, or analyse the function directly. The second-derivative test simply cannot help here.
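Analysing the function directly can be as simple as sampling it near the critical point. The sketch below compares Example 4 with a second function, $x^{4} + y^{4}$ , which is an added illustration and not part of these notes. Both have a zero Hessian at the origin, yet they behave differently there, which is why the test has to stay silent.

```python
import math

def range_near_origin(func, radius=1e-2, samples=16):
    # Smallest and largest value of f - f(0, 0) on a small circle around the origin.
    base = func(0.0, 0.0)
    values = []
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        values.append(func(x, y) - base)
    return min(values), max(values)

def cubic(x, y):
    # Example 4 from the text.
    return x**3

def quartic(x, y):
    # Hypothetical comparison function (an assumption, not from the text).
    return x**4 + y**4

lo_c, hi_c = range_near_origin(cubic)
lo_q, hi_q = range_near_origin(quartic)
print("x^3:", lo_c, hi_c)        # takes both signs: neither minimum nor maximum
print("x^4 + y^4:", lo_q, hi_q)  # positive on the sampled circle: consistent with a minimum
```

Sampling one circle is only suggestive, not a proof, but it shows the two possibilities that a zero Hessian leaves open.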
