Confidence Levels in Risk Management

Why this note exists

The settlement-and-clearing note says CCPs calibrate initial margin to “a 99% confidence level.” The aig-and-the-cds-crisis note shows what happens when there is no margin at all. But what does “99% confidence” actually mean? How is it computed? Why does LCH use 99.7% while NSCC uses 99%? And why does the Gaussian assumption — which underlies the simplest models — systematically underestimate risk in the tail?

Prerequisites

  • settlement-and-clearing — CCP margin mechanics
  • Basic probability (what a distribution, mean, and standard deviation are). No formal statistics background assumed — we build from scratch.

Sources

BIS/CPMI-IOSCO, Principles for Financial Market Infrastructures (PFMI, 2012), Principle 6. BCBS-IOSCO, Margin Requirements for Non-Centrally Cleared Derivatives (2015, rev. 2019). NSCC Quantitative Disclosures (DTCC, 2025). LCH CPMI-IOSCO Self-Assessment (2022). CME SPAN 2 Methodology. Basel FRTB (BCBS d457, 2019).

Notation

  • $L$: portfolio loss (positive = money lost)
  • $\alpha$: confidence level (e.g., 0.99)
  • $\sigma$: standard deviation of returns (volatility)
  • $z_\alpha$: the $\alpha$-th quantile of the standard normal distribution (e.g., $z_{0.99} \approx 2.326$)
  • $T$: time horizon in days
  • $\nu$ (nu): degrees of freedom for the Student-t distribution (lower = fatter tails)

What Is Value at Risk (VaR)?

VaR answers one question: what is the loss threshold over a given time horizon such that the probability of exceeding it is no more than $1 - \alpha$?

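Formally, for a portfolio with loss random variable $L$:

$$\text{VaR}_\alpha(L) = \inf\{\ell \in \mathbb{R} : P(L > \ell) \le 1 - \alpha\}$$

This is the $\alpha$-quantile of the loss distribution. At $\alpha = 0.99$, you’re finding the 99th percentile: there is a 1% chance of losing more than $\text{VaR}_{0.99}$ over the horizon.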

Concrete example. A portfolio with $\text{VaR}_{99\%}^{1\text{-day}} = \$10\text{M}$ means that on 99 out of 100 trading days, losses should not exceed \$10M. On the remaining 1 day in 100 (roughly 2.5 days per year), losses could be larger; VaR says nothing about how much larger.

Three Methods to Compute VaR

Method 1: Parametric (Variance-Covariance)

Assumption: portfolio returns are normally distributed.

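For a single position with portfolio value $V$ and daily return standard deviation $\sigma$:

$$\text{VaR}_\alpha^{1\text{-day}} = V \cdot \sigma \cdot z_\alpha$$

For a multi-asset portfolio with position vector $w$ (in currency units) and covariance matrix $\Sigma$ of daily returns (estimated from historical returns using EWMA, the Exponentially Weighted Moving Average, or equally-weighted windows):

$$\text{VaR}_\alpha^{1\text{-day}} = z_\alpha \sqrt{w^\top \Sigma \, w}$$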

This is NSCC’s core approach — they compute both an EWMA estimate (more reactive to recent volatility) and an equally-weighted estimate, then take the higher of the two.

Pros: fast, analytically tractable, easy to decompose risk by factor. Cons: assumes normality (fat tails!), assumes linear positions (fails for options), covariance estimation is fragile in high dimensions.
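A minimal sketch of the parametric calculation, for a single position and for a two-asset portfolio, using only the Python standard library. The function names, position sizes, volatilities, and the 0.3 correlation are illustrative assumptions, not figures from any CCP, and returns are assumed to have zero mean.

```python
from math import sqrt
from statistics import NormalDist


def parametric_var(value, sigma, alpha=0.99):
    """1-day parametric VaR of a single position: V * sigma * z_alpha."""
    z = NormalDist().inv_cdf(alpha)  # standard normal quantile, about 2.326 at 99%
    return value * sigma * z


def portfolio_parametric_var(w, cov, alpha=0.99):
    """1-day parametric VaR of a portfolio: z_alpha * sqrt(w' Sigma w)."""
    z = NormalDist().inv_cdf(alpha)
    n = len(w)
    variance = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return z * sqrt(variance)


# Hypothetical two-asset book: $60M and $40M, daily vols 1% and 2%, correlation 0.3.
w = [60e6, 40e6]
vols = [0.01, 0.02]
rho = 0.3
cov = [
    [vols[0] ** 2, rho * vols[0] * vols[1]],
    [rho * vols[0] * vols[1], vols[1] ** 2],
]

standalone = sum(parametric_var(w[i], vols[i]) for i in range(2))
combined = portfolio_parametric_var(w, cov)
print(f"sum of standalone VaRs: {standalone:,.0f}")
print(f"portfolio VaR:          {combined:,.0f}")
```

Because the assumed correlation is below 1, the portfolio figure comes out below the sum of the standalone VaRs; that gap is the diversification benefit the covariance matrix encodes.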

Method 2: Historical Simulation

No distributional assumption. Take actual historical returns and replay them on today’s portfolio.

With 500 days of history at 99% confidence, the VaR is approximately the 5th-worst loss ($500 \times 0.01 = 5$). This is the approach at the heart of LCH’s PRISMA methodology.
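The "k-th worst loss" rule can be sketched in a few lines. The rounding convention for k is an assumption (production systems often interpolate between order statistics), and the 500 returns below are randomly generated placeholders standing in for real history.

```python
import random


def historical_var(pnl, alpha=0.99):
    """Historical-simulation VaR: the k-th worst loss, k = round(n * (1 - alpha)).

    `pnl` holds one replayed P&L per historical day (negative = loss).
    """
    losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
    k = max(1, round(len(losses) * (1 - alpha)))
    return losses[k - 1]


# Hypothetical inputs: 500 made-up daily returns replayed on a $100M portfolio.
rng = random.Random(42)
portfolio_value = 100e6
returns = [rng.gauss(0.0, 0.01) for _ in range(500)]
pnl = [portfolio_value * r for r in returns]

print(f"99% 1-day historical VaR: {historical_var(pnl):,.0f}")
```

With 500 observations the function returns the 5th-largest loss, matching the arithmetic in the text.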

Volatility scaling (important nuance): LCH doesn’t use raw historical returns. They scale each historical return by the ratio of current volatility to the volatility prevailing on that historical date:

$$\tilde{r}_t = r_t \cdot \frac{\sigma_{\text{today}}}{\sigma_t}$$

This makes historical scenarios reflect today’s market regime while preserving the shape of the empirical distribution: a best-of-both-worlds approach that captures fat tails without being locked to a single volatility level.
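A sketch of the scaling step, under stated assumptions: the per-date volatility is estimated here with an EWMA using a decay factor of 0.94, which is a common textbook choice and not a parameter taken from LCH. The function names and the two-regime return series are hypothetical.

```python
from math import sqrt


def ewma_vols(returns, lam=0.94):
    """EWMA volatility estimate as of each date (same length as `returns`)."""
    var = sum(r * r for r in returns) / len(returns)  # seed with the full-sample mean square
    vols = []
    for r in returns:
        var = lam * var + (1 - lam) * r * r
        vols.append(sqrt(var))
    return vols


def vol_scaled_returns(returns, lam=0.94):
    """Rescale each historical return by sigma_today / sigma_t."""
    vols = ewma_vols(returns, lam)
    sigma_today = vols[-1]
    return [r * sigma_today / v for r, v in zip(returns, vols)]


# Hypothetical history: a calm regime (1% moves) followed by a stressed one (3% moves).
returns = [0.01, -0.01] * 50 + [0.03, -0.03] * 50
scaled = vol_scaled_returns(returns)

print(f"calm-period return, raw:    {returns[99]:+.4f}")
print(f"calm-period return, scaled: {scaled[99]:+.4f}")
```

Returns observed in the calm regime are inflated to today's stressed volatility, while the most recent return is left essentially unchanged.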

Pros: no distributional assumption, captures fat tails naturally. Cons: limited by the historical window (can’t simulate what hasn’t happened), sensitive to window length.

Method 3: Monte Carlo Simulation

Specify a stochastic model for risk factors, simulate many paths, compute the portfolio P&L for each, and take the quantile.

For multi-asset portfolios with options, this means: simulate correlated risk factors (using Cholesky decomposition of the correlation matrix), reprice the full portfolio at each simulated state including nonlinear instruments, and take the quantile of the resulting P&L distribution.

This is the direction CME’s SPAN 2 has taken — replacing the old grid-based SPAN (Standard Portfolio Analysis of Risk, a system that evaluated 16 predefined price/volatility scenarios) with a full VaR framework using Monte Carlo.

Pros: handles any instrument (options, exotics), any distribution. Cons: computationally expensive, model-dependent (garbage in, garbage out).
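The simulate-reprice-quantile loop can be sketched as follows. This is a toy under loud assumptions: risk factors are jointly Gaussian, positions are linear (so "repricing" is a dot product), and all inputs are made up. It is not a description of SPAN 2 or of any CCP's model.

```python
import random
from math import sqrt


def cholesky(a):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def monte_carlo_var(positions, vols, corr, alpha=0.99, n_paths=20_000, seed=0):
    """Simulate correlated factor shocks, revalue the book, take the loss quantile."""
    rng = random.Random(seed)
    low = cholesky(corr)
    n = len(positions)
    losses = []
    for _ in range(n_paths):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the independent draws with the Cholesky factor.
        shocks = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        # Linear repricing; a book with options would be fully revalued here instead.
        pnl = sum(positions[i] * vols[i] * shocks[i] for i in range(n))
        losses.append(-pnl)
    losses.sort(reverse=True)
    k = max(1, round(n_paths * (1 - alpha)))
    return losses[k - 1]


# Hypothetical two-asset book: $60M and $40M, daily vols 1% and 2%, correlation 0.3.
mc = monte_carlo_var([60e6, 40e6], [0.01, 0.02], [[1.0, 0.3], [0.3, 1.0]])
print(f"99% 1-day Monte Carlo VaR: {mc:,.0f}")
```

For a linear Gaussian book like this one, the result should land close to the parametric answer; the method earns its cost only once the repricing step or the factor model becomes nonlinear or non-Gaussian.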

Interactive exploration

Open notebook to compute VaR using all three methods on the same data, toggle between normal and fat-tailed distributions, and see how the answers diverge at high confidence levels.

“Confidence Level” vs “Confidence Interval”

These terms share vocabulary but mean different things.

| | VaR Confidence Level | Statistical Confidence Interval |
| --- | --- | --- |
| Distribution of what? | Future P&L outcomes | Sampling distribution of an estimator |
| What’s random? | The future realization | The interval (parameter is fixed) |
| Statement type | Predictive / probabilistic | Inferential / epistemic |
| “99% confident” means | “99% of outcomes won’t exceed $\text{VaR}_{0.99}$” | “This procedure captures the truth 99% of the time” |

The mathematical connection: both involve quantiles of probability distributions. The 99th percentile is the 99th percentile whether you compute it for a loss distribution or a sampling distribution. The underlying math (inverse CDF) is identical.

The conceptual difference: VaR says “according to our model, 99% of future realizations fall within this bound.” A confidence interval says “if we repeated this estimation procedure many times, 99% of the resulting intervals would contain the true parameter.” VaR is about outcomes; confidence intervals are about estimation uncertainty.

The meta-level connection: you can construct a confidence interval around your VaR estimate: “Our 99% VaR estimate is $50M +/- $5M at 95% confidence.” CCPs doing backtesting (testing whether their VaR model’s 99th percentile is actually the 99th percentile) are implicitly working with this — a statistical inference problem about the accuracy of a predictive model.
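One simple way to frame the backtesting question: if the 99% VaR is correct and breaches are independent (an assumption that volatility clustering tends to violate), the number of breaches in n days is Binomial(n, 0.01). The sketch below computes how surprising a given breach count would be; the 250-day window is an assumed stand-in for one trading year.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed via the complement."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


n_days = 250     # roughly one trading year (assumed)
p_breach = 0.01  # breach probability implied by a correct 99% VaR

for k in (1, 5, 10):
    prob = prob_at_least(k, n_days, p_breach)
    print(f"P(at least {k:2d} breaches in {n_days} days) = {prob:.4f}")
```

A small probability for the observed breach count is evidence that the model's stated 99th percentile is not the true 99th percentile.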

How CCPs Actually Implement Confidence Levels

The Regulatory Floor: PFMI Principle 6

“A CCP should adopt initial margin models and parameters that are risk-based and generate margin requirements sufficient to cover its potential future exposure to participants in the interval between the last margin collection and the close out of positions following a participant default. Initial margin should meet an established single-tailed confidence level of at least 99 percent.” — BIS/CPMI-IOSCO, PFMI (2012), Principle 6, Key Consideration 3

Key phrase: “at least 99 percent.” Many CCPs choose to exceed this.

CCP-Specific Implementations

| CCP | Method | Confidence Level | Liquidation Horizon | Look-back |
| --- | --- | --- | --- | --- |
| NSCC (DTCC) | Parametric VaR (EWMA + equal-weight) | 99% | ~3 days | Rolling window |
| LCH (PRISMA) | Historical simulation + vol scaling | 99.7% | 5 days (house) / 7 days (client) | 10 years |
| CME (SPAN 2) | Monte Carlo VaR | 99% | 1-5 days (product-dependent) | Calibrated to stress periods |
| Bilateral OTC (BCBS-IOSCO) | Model-dependent | 99% | 10 days | Must include stress period |

Why does LCH use 99.7% while NSCC uses 99%? Two factors:

  1. Product characteristics. LCH clears interest rate swaps — large notionals, long-dated positions, illiquid compared to equities. NSCC clears equities — smaller, more liquid, faster to liquidate. Higher confidence compensates for harder-to-exit positions.
  2. Liquidation horizon. NSCC assumes ~3-day close-out (T+1 settlement + buffer). LCH assumes 5-7 days. Longer horizons mean more price uncertainty, so a higher confidence level provides additional cushion.

NSCC Margin Components

NSCC’s margin (“Required Fund Deposit”) isn’t just VaR — it’s a sum:

  • Volatility Charge (the VaR component): dual EWMA/equal-weight calculation, takes the higher
  • Mark-to-Market Charge: covers unrealized losses
  • Gap Risk Measure: add-on for concentrated positions susceptible to jump risk (e.g., earnings, M&A)
  • Margin Floor: proportional to portfolio market value (prevents margin from going implausibly low in calm markets)
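The "take the higher of two volatility estimates" idea behind the Volatility Charge can be sketched as below. The decay factor of 0.94, the zero-mean assumption, and the function names are illustrative assumptions; NSCC's actual parameters and add-ons are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def equal_weight_vol(returns):
    """Equally weighted volatility, assuming a zero mean return."""
    return sqrt(sum(r * r for r in returns) / len(returns))


def ewma_vol(returns, lam=0.94):
    """EWMA volatility; a lower lam puts more weight on recent observations."""
    var = returns[0] ** 2
    for r in returns[1:]:
        var = lam * var + (1 - lam) * r * r
    return sqrt(var)


def volatility_charge(value, returns, alpha=0.99, lam=0.94):
    """Parametric VaR using the higher of the two volatility estimates."""
    sigma = max(ewma_vol(returns, lam), equal_weight_vol(returns))
    return value * sigma * NormalDist().inv_cdf(alpha)


# Hypothetical histories: a late shock (EWMA reacts) and an early shock (EWMA forgets).
late_shock = [0.01] * 99 + [0.05]
early_shock = [0.05] + [0.01] * 99
for name, hist in (("late shock", late_shock), ("early shock", early_shock)):
    print(
        f"{name}: ewma={ewma_vol(hist):.5f} "
        f"equal={equal_weight_vol(hist):.5f} "
        f"charge={volatility_charge(100e6, hist):,.0f}"
    )
```

Taking the maximum means the charge rises quickly after a recent shock (EWMA dominates) but does not collapse as soon as the shock ages out of the EWMA's effective memory (the equal-weight estimate dominates).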

Fat Tails: Why the Gaussian Lies

Figure: Normal vs Student-t distribution. At 99% VaR (blue dashed), the two distributions diverge modestly. At 99.7% (purple dashed), the gap is substantial. The shaded area represents probability mass the Gaussian misses: extreme events that are far more likely under fat-tailed distributions.

Empirical financial returns exhibit:

  • Excess kurtosis (fat tails): extreme events are far more frequent than a normal distribution predicts.
  • Skewness: crashes are larger than rallies (especially equities).
  • Volatility clustering: large moves cluster together (GARCH effects).

The higher the confidence level, the deeper into the tail you go, and the more the Gaussian underestimates risk:

| Event | Normal Distribution | Empirical (typical equity) |
| --- | --- | --- |
| 3$\sigma$ move | 0.13% (1 in 740) | ~0.5-1% (1 in 100-200) |
| 4$\sigma$ move | 0.003% (1 in 31,574) | ~0.1% (1 in 1,000) |
| 5$\sigma$ move | 0.00003% (1 in 3.5M) | ~0.01-0.03% (1 in 3,000-10,000) |
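The Normal column can be reproduced directly from the standard normal CDF (one-sided tail). For contrast, the sketch also simulates a Student-t rescaled to unit variance. The choice of $\nu = 4$ is purely illustrative, not a fitted value, and the simulated figures are not meant to match the empirical column.

```python
import random
from math import sqrt
from statistics import NormalDist


def normal_tail(k):
    """One-sided probability that a standard normal exceeds k standard deviations."""
    return NormalDist().cdf(-k)


def student_t_tail(k, nu=4, n=100_000, seed=1):
    """Simulated one-sided tail of a Student-t rescaled to unit variance (needs nu > 2)."""
    rng = random.Random(seed)
    scale = sqrt((nu - 2) / nu)  # a t variate has variance nu / (nu - 2)
    hits = 0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(nu / 2, 2.0)  # chi-square with nu degrees of freedom
        if scale * z / sqrt(chi2 / nu) > k:
            hits += 1
    return hits / n


for k in (3, 4, 5):
    p_norm = normal_tail(k)
    p_t = student_t_tail(k)
    print(f"{k} sigma: normal 1 in {1 / p_norm:,.0f}, simulated t(4) frequency {p_t:.5f}")
```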

This isn’t abstract. The 2008 financial crisis, the 2010 Flash Crash, the 2015 CHF de-peg, the 2020 COVID crash, and the 2021 GameStop episode all produced moves that were extreme outliers under Gaussian assumptions but uncomfortably plausible under fat-tailed models.

What CCPs Do to Compensate

  1. Use historical simulation instead of parametric VaR. Let the data speak rather than imposing a distributional assumption. LCH’s PRISMA is the exemplar.

  2. Stress testing. PFMI Principle 6 (KC5) requires CCPs to conduct stress tests using “extreme but plausible market conditions”: historical scenarios (Lehman, GFC, COVID), hypothetical scenarios, and reverse stress tests (“what scenario would exhaust the default fund?”). Results inform default fund sizing.

  3. Expected Shortfall (ES) / CVaR. While most CCPs still use VaR for margin, the Basel FRTB (Fundamental Review of the Trading Book) has shifted bank capital requirements to Expected Shortfall at 97.5%:

$$\text{ES}_\alpha = \mathbb{E}\left[L \mid L \ge \text{VaR}_\alpha\right]$$

ES answers: “Given that we’re in the worst $100(1-\alpha)$% of outcomes, what is the average loss?” VaR tells you the threshold; ES tells you how bad it gets beyond the threshold. ES is also coherent (technically: sub-additive, meaning the risk of a combined portfolio never exceeds the sum of its parts, which VaR does not guarantee).

The Basel Committee chose 97.5% ES specifically because it roughly equals 99% VaR under normality but is strictly more conservative for fat-tailed distributions.
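The near-equivalence under normality can be checked with the closed-form ES of a zero-mean normal loss, $\sigma \, \varphi(z_\alpha) / (1 - \alpha)$, where $\varphi$ is the standard normal density. The helper names are hypothetical, and the empirical estimator uses an assumed rounding convention for the tail size.

```python
from statistics import NormalDist

std_normal = NormalDist()


def normal_var(alpha, sigma=1.0):
    """VaR of a zero-mean normal loss, in units of sigma."""
    return sigma * std_normal.inv_cdf(alpha)


def normal_es(alpha, sigma=1.0):
    """Closed-form ES of a zero-mean normal loss: sigma * pdf(z_alpha) / (1 - alpha)."""
    z = std_normal.inv_cdf(alpha)
    return sigma * std_normal.pdf(z) / (1 - alpha)


def empirical_es(losses, alpha):
    """Average of the worst round(n * (1 - alpha)) losses."""
    worst_first = sorted(losses, reverse=True)
    k = max(1, round(len(worst_first) * (1 - alpha)))
    return sum(worst_first[:k]) / k


print(f"99%   VaR (normal): {normal_var(0.99):.4f} sigma")
print(f"97.5% ES  (normal): {normal_es(0.975):.4f} sigma")
```

Both figures come out near 2.33 standard deviations, which is the sense in which the two measures roughly coincide for a Gaussian loss.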

  4. Procyclicality buffers. During calm markets, historical VaR drops, margins fall, and everyone leverages up. Then when a crisis hits, margins spike and forced liquidation amplifies the crash. PFMI and EMIR require anti-procyclicality measures: margin floors, stressed VaR lookback periods that always include a stress period, and buffer add-ons (e.g., 25% above model output).
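A deliberately crude sketch of how a buffer add-on and a floor interact. The function name, the static (non-releasable) buffer, and all numbers are assumptions for illustration; actual anti-procyclicality rules are more detailed than this.

```python
def margin_with_apc(model_margin, floor, buffer_rate=0.25):
    """Apply a buffer add-on above model output, then enforce a floor."""
    return max(floor, model_margin * (1 + buffer_rate))


# Hypothetical model outputs (in $M): calm markets versus a stress episode.
calm_model, stress_model, floor = 4.0, 10.0, 6.0

calm = margin_with_apc(calm_model, floor)
stress = margin_with_apc(stress_model, floor)
print(f"raw model jump:      {stress_model / calm_model:.2f}x")
print(f"jump with APC tools: {stress / calm:.2f}x")
```

Holding margin higher in calm markets is what shrinks the relative jump when stress arrives.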

The $\sqrt{T}$ Scaling Rule

Derivation

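Under a random walk assumption, daily log-returns are independent and identically distributed (i.i.d.):

$$r_t \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)$$

The cumulative return over $T$ days is $R_T = \sum_{t=1}^{T} r_t$. Since the $r_t$ are independent:

$$\text{Var}(R_T) = \sum_{t=1}^{T} \text{Var}(r_t) = T\sigma^2 \quad \Longrightarrow \quad \sigma_T = \sigma\sqrt{T}$$

Therefore:

$$\text{VaR}_\alpha^{T\text{-day}} = \text{VaR}_\alpha^{1\text{-day}} \cdot \sqrt{T}$$

This is why the BCBS-IOSCO 10-day horizon produces margin approximately $\sqrt{10} \approx 3.16$ times the 1-day VaR.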
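A quick numerical check of the square-root rule under its own assumptions: simulate i.i.d. Gaussian daily returns, sum them into non-overlapping 10-day blocks, and compare the volatility ratio with $\sqrt{10}$. The 1% daily volatility and the sample size are arbitrary choices.

```python
import random
from math import sqrt
from statistics import pstdev


def scale_var(var_1d, horizon_days):
    """Square-root-of-time scaling of a 1-day VaR."""
    return var_1d * sqrt(horizon_days)


rng = random.Random(7)
horizon = 10
daily = [rng.gauss(0.0, 0.01) for _ in range(horizon * 5000)]
# Non-overlapping 10-day cumulative returns.
blocks = [sum(daily[i:i + horizon]) for i in range(0, len(daily), horizon)]

ratio = pstdev(blocks) / pstdev(daily)
print(f"simulated 10-day / 1-day volatility ratio: {ratio:.3f}")
print(f"sqrt(10):                                  {sqrt(horizon):.3f}")
```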

When It Breaks Down

The $\sqrt{T}$ rule relies on three assumptions, each of which fails:

  1. Independence of returns. Financial returns exhibit volatility clustering (GARCH effects) and momentum/mean-reversion patterns. If returns are positively autocorrelated (momentum), risk grows faster than $\sqrt{T}$. If mean-reverting, slower.

  2. Constant volatility. Volatility is stochastic. During stress, it spikes and stays elevated. The $\sqrt{T}$ rule using today’s $\sigma$ underestimates multi-day risk if vol is rising.

  3. Continuous price paths. The $\sqrt{T}$ rule assumes diffusion (no jumps). Prices can gap overnight (earnings, central bank decisions, geopolitical shocks). A 3-day liquidation horizon doesn’t help if the gap happens in one tick.

In practice, CCPs address these by: computing VaR directly at the target horizon using overlapping multi-day returns (rather than scaling up 1-day VaR), adding gap risk charges, and using concentration add-ons for large positions where market impact is material.
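The difference between scaling a 1-day VaR and measuring VaR directly on overlapping multi-day returns can be seen in a toy AR(1) simulation. Everything here is an assumption for illustration: the AR(1) model itself, the coefficient of 0.2, the 5-day horizon, and the helper names.

```python
import random
from math import sqrt


def historical_var(pnl, alpha=0.99):
    """k-th worst loss, with k = round(n * (1 - alpha))."""
    losses = sorted((-x for x in pnl), reverse=True)
    k = max(1, round(len(losses) * (1 - alpha)))
    return losses[k - 1]


def simulate_ar1(n, phi, sigma=0.01, seed=3):
    """Daily returns r_t = phi * r_{t-1} + eps_t with Gaussian eps_t."""
    rng = random.Random(seed)
    r, out = 0.0, []
    for _ in range(n):
        r = phi * r + rng.gauss(0.0, sigma)
        out.append(r)
    return out


def overlapping_sums(returns, horizon):
    """Cumulative return over every overlapping `horizon`-day window."""
    prefix = [0.0]
    for r in returns:
        prefix.append(prefix[-1] + r)
    return [prefix[i + horizon] - prefix[i] for i in range(len(returns) - horizon + 1)]


def direct_vs_scaled(phi, horizon=5, n=100_000):
    """Return (VaR measured on multi-day returns, sqrt(T)-scaled 1-day VaR)."""
    returns = simulate_ar1(n, phi)
    direct = historical_var(overlapping_sums(returns, horizon))
    scaled = historical_var(returns) * sqrt(horizon)
    return direct, scaled


for phi in (0.0, 0.2):
    direct, scaled = direct_vs_scaled(phi)
    print(f"phi={phi}: direct 5-day VaR / sqrt(T)-scaled VaR = {direct / scaled:.3f}")
```

With no autocorrelation the two numbers roughly agree; with positive autocorrelation the directly measured multi-day VaR exceeds the scaled one, which is the momentum case described in the first assumption above.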

See it

The companion notebook Section 3 lets you toggle autocorrelation on and off and watch the $\sqrt{T}$ prediction diverge from the actual multi-day VaR.


Questions to sit with:

  1. LCH uses 99.7% confidence with historical simulation over a 10-year lookback. NSCC uses 99% parametric VaR with a rolling window. If both meet the PFMI Principle 6 floor, which approach is more robust to a “never-seen-before” crisis? Historical simulation can only replay history; parametric models can extrapolate beyond the sample. But parametric models assume a distribution shape. Is there a principled way to choose between “I’ll replay what actually happened” and “I’ll model what might happen”?

  2. Expected Shortfall is sub-additive (diversification always helps), while VaR is not — you can construct portfolios where combined VaR exceeds the sum of individual VaRs. Construct an intuitive example of this failure. (Hint: think of two positions that are individually safe but jointly catastrophic — each has a small probability of a large loss, and the losses are mutually exclusive.)

  3. The $\sqrt{T}$ scaling rule says margin should scale with the square root of the liquidation horizon. But during the 2020 COVID crash, CCPs raised margins dramatically, not because the horizon changed, but because $\sigma$ spiked. If both $\sigma$ and $T$ affect margin, and both are uncertain, how should a CCP set margin during a crisis without triggering the procyclicality spiral (higher margin → forced liquidation → more volatility → even higher margin)?

See also