Confidence Levels in Risk Management
Why this note exists
The settlement-and-clearing note says CCPs calibrate initial margin to “a 99% confidence level.” The aig-and-the-cds-crisis note shows what happens when there is no margin at all. But what does “99% confidence” actually mean? How is it computed? Why does LCH use 99.7% while NSCC uses 99%? And why does the Gaussian assumption — which underlies the simplest models — systematically underestimate risk in the tail?
Prerequisites
- settlement-and-clearing — CCP margin mechanics
- Basic probability (what a distribution, mean, and standard deviation are). No formal statistics background assumed — we build from scratch.
Sources
BIS/CPMI-IOSCO, Principles for Financial Market Infrastructures (PFMI, 2012), Principle 6. BCBS-IOSCO, Margin Requirements for Non-Centrally Cleared Derivatives (2015, rev. 2019). NSCC Quantitative Disclosures (DTCC, 2025). LCH CPMI-IOSCO Self-Assessment (2022). CME SPAN 2 Methodology. Basel FRTB (BCBS d457, 2019).
Notation
- $L$: portfolio loss (positive = money lost)
- $\alpha$: confidence level (e.g., 0.99)
- $\sigma$: standard deviation of returns (volatility)
- $z_\alpha$: the $\alpha$-th quantile of the standard normal distribution (e.g., $z_{0.99} \approx 2.326$)
- $T$: time horizon in days
- $\nu$ (nu): degrees of freedom for the Student-t distribution (lower = fatter tails)
What Is Value at Risk (VaR)?
VaR answers one question: What is the maximum loss over a given time horizon such that the probability of exceeding that loss is no more than $1 - \alpha$?
Formally, for a portfolio with loss random variable $L$:
$$\text{VaR}_\alpha(L) = \inf\{\ell \in \mathbb{R} : P(L > \ell) \le 1 - \alpha\}$$
This is the $\alpha$-quantile of the loss distribution. At $\alpha = 0.99$, you’re finding the 99th percentile: there is a 1% chance of losing more than $\text{VaR}_{0.99}$ over the horizon.
Concrete example. A portfolio with $\text{VaR}_{99\%}^{1\text{-day}} = \$10\text{M}$ should, on 99 out of 100 trading days, lose no more than \$10M. On the remaining 1 day in 100 (roughly 2.5 trading days per year), losses could be larger, and VaR says nothing about how much larger.
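To make the quantile definition concrete, here is a minimal sketch that reads VaR off a sample of losses. The function name `empirical_var` and the 100-point sample are assumptions made for illustration only.

```python
import math

def empirical_var(losses, alpha):
    """Smallest observed loss level l such that the share of losses above l
    is at most 1 - alpha (the empirical alpha-quantile)."""
    ordered = sorted(losses)
    n = len(ordered)
    # 1-based rank of the alpha-quantile; round() guards against float noise
    # in alpha * n before taking the ceiling.
    rank = max(math.ceil(round(alpha * n, 9)), 1)
    return ordered[rank - 1]

# Hypothetical sample: 100 daily losses of 1, 2, ..., 100 (in $M).
sample_losses = list(range(1, 101))
var_99 = empirical_var(sample_losses, 0.99)
exceedance_share = sum(x > var_99 for x in sample_losses) / len(sample_losses)
print(var_99, exceedance_share)  # 99 0.01
```

Only one observation (the loss of 100) exceeds the 99% VaR of 99, and the function reports nothing about how large that one exceedance is.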
Three Methods to Compute VaR
Method 1: Parametric (Variance-Covariance)
Assumption: portfolio returns are normally distributed.
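For a single position with portfolio value $V$ and daily return standard deviation $\sigma$, assuming a zero mean return, the 1-day VaR is:
$$\text{VaR}_\alpha = V \cdot \sigma \cdot z_\alpha$$
For a multi-asset portfolio with position vector $w$ (in currency units) and return covariance matrix $\Sigma$ (estimated from historical returns using EWMA, the Exponentially Weighted Moving Average, or equally-weighted windows):
$$\text{VaR}_\alpha = z_\alpha \sqrt{w^\top \Sigma\, w}$$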
This is NSCC’s core approach — they compute both an EWMA estimate (more reactive to recent volatility) and an equally-weighted estimate, then take the higher of the two.
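The "higher of two estimates" idea can be sketched as follows. This is an illustration, not NSCC's production model: the decay factor `lam = 0.94` (the classic RiskMetrics daily value), the seeding choice, and the synthetic return history are all assumptions.

```python
import math

def equal_weight_vol(returns):
    """Root mean square of the returns (zero mean assumed)."""
    return math.sqrt(sum(r * r for r in returns) / len(returns))

def ewma_vol(returns, lam=0.94):
    """EWMA volatility: var_t = lam * var_{t-1} + (1 - lam) * r_t^2.
    lam is an assumed decay factor, not a documented NSCC parameter."""
    var = returns[0] ** 2  # seed the recursion with the first squared return
    for r in returns[1:]:
        var = lam * var + (1.0 - lam) * r * r
    return math.sqrt(var)

# Hypothetical history: 200 calm days (+/-0.5%) then 20 stressed days (+/-3%).
calm = [0.005 if i % 2 == 0 else -0.005 for i in range(200)]
stressed = [0.03 if i % 2 == 0 else -0.03 for i in range(20)]
history = calm + stressed

vol_eq = equal_weight_vol(history)
vol_ewma = ewma_vol(history)
vol_used = max(vol_eq, vol_ewma)  # take the more conservative estimate
print(round(vol_eq, 4), round(vol_ewma, 4), round(vol_used, 4))
```

With stress at the end of the window, the EWMA estimate reacts faster and becomes the binding one. If the order is reversed (stress first, calm afterwards), the equal-weighted estimate stays higher for longer, which is the motivation for taking the maximum.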
Pros: fast, analytically tractable, easy to decompose risk by factor. Cons: assumes normality (fat tails!), assumes linear positions (fails for options), covariance estimation is fragile in high dimensions.
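As a rough sketch of the variance-covariance calculation described in this method, the snippet below evaluates $z_\alpha \sqrt{w^\top \Sigma w}$ for a two-asset book. The positions, volatilities, and correlation are hypothetical, and a zero mean return is assumed.

```python
import math
from statistics import NormalDist

def parametric_var(positions, cov, alpha):
    """z_alpha * sqrt(w' Sigma w) for currency positions w and a daily
    return covariance matrix Sigma (zero mean return assumed)."""
    n = len(positions)
    variance = sum(positions[i] * cov[i][j] * positions[j]
                   for i in range(n) for j in range(n))
    return NormalDist().inv_cdf(alpha) * math.sqrt(variance)

# Hypothetical book: $60M and $40M in two assets with daily vols of 1% and 2%
# and a return correlation of 0.3.
w = [60e6, 40e6]
vols = [0.01, 0.02]
rho = 0.3
cov = [[vols[0] ** 2, rho * vols[0] * vols[1]],
       [rho * vols[0] * vols[1], vols[1] ** 2]]

single_var = parametric_var([100e6], [[0.01 ** 2]], 0.99)
portfolio_var = parametric_var(w, cov, 0.99)
standalone_sum = sum(parametric_var([w[i]], [[vols[i] ** 2]], 0.99)
                     for i in range(2))
print(round(single_var), round(portfolio_var), round(standalone_sum))
```

The portfolio figure comes out below the sum of the standalone figures because the assumed correlation is less than one. That diversification benefit depends entirely on the estimated covariance matrix.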
Method 2: Historical Simulation
No distributional assumption. Take actual historical returns and replay them on today’s portfolio.
With 500 days of history at 99% confidence, the VaR is approximately the 5th-worst loss ($500 \times (1 - 0.99) = 5$). This is the approach at the heart of LCH’s PRISMA methodology.
Volatility scaling (important nuance): LCH doesn’t use raw historical returns. They scale each historical return by the ratio of current volatility to the volatility prevailing on that historical date:
$$\tilde{r}_t = r_t \cdot \frac{\sigma_{\text{today}}}{\sigma_t}$$
This makes historical scenarios reflect today’s market regime while preserving the shape of the empirical distribution: a best-of-both-worlds approach that captures fat tails without being locked to a single volatility level.
Pros: no distributional assumption, captures fat tails naturally. Cons: limited by the historical window (can’t simulate what hasn’t happened), sensitive to window length.
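A minimal sketch of historical simulation with volatility scaling follows. It is not LCH's implementation: the synthetic returns, the per-date volatilities in `vols_then` (which in practice would themselves be estimated, for example with EWMA), and the function names are assumptions.

```python
import random

def historical_var(returns, alpha, portfolio_value):
    """k-th worst loss from replaying returns on today's portfolio,
    with k = floor(n * (1 - alpha)), at least 1."""
    losses = sorted((-r * portfolio_value for r in returns), reverse=True)
    k = max(int(round(len(losses) * (1.0 - alpha), 9)), 1)
    return losses[k - 1]

def vol_scale(returns, vols_then, vol_now):
    """Rescale each historical return by vol_now / vol on that date."""
    return [r * vol_now / v for r, v in zip(returns, vols_then)]

# Hypothetical 500-day history: a calm regime (1% daily vol) followed by a
# stressed regime (2% daily vol), which is also the current regime.
rng = random.Random(42)
vols_then = [0.01] * 250 + [0.02] * 250
returns = [rng.gauss(0.0, v) for v in vols_then]
vol_now = vols_then[-1]

raw_var = historical_var(returns, 0.99, 100e6)
scaled_var = historical_var(vol_scale(returns, vols_then, vol_now), 0.99, 100e6)
print(round(raw_var), round(scaled_var))
```

Scaling the calm-period returns up to today's volatility can only raise the tail losses in this setup, so the scaled VaR is at least as large as the raw one.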
Method 3: Monte Carlo Simulation
Specify a stochastic model for risk factors, simulate many paths, compute the portfolio P&L for each, and take the quantile.
For multi-asset portfolios with options, this means: simulate correlated risk factors (using Cholesky decomposition of the correlation matrix), reprice the full portfolio at each simulated state including nonlinear instruments, and take the quantile of the resulting P&L distribution.
This is the direction CME’s SPAN 2 has taken — replacing the old grid-based SPAN (Standard Portfolio Analysis of Risk, a system that evaluated 16 predefined price/volatility scenarios) with a full VaR framework using Monte Carlo.
Pros: handles any instrument (options, exotics), any distribution. Cons: computationally expensive, model-dependent (garbage in, garbage out).
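The three steps (correlate the factors, reprice, take the quantile) can be sketched as below. This is a toy under stated assumptions, not CME's model: the two risk factors (spot return and implied-volatility change), their -0.7 correlation, the daily shock sizes, and the short-call portfolio repriced with Black-Scholes are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def bs_call(spot, strike, vol, t, rate=0.0):
    """Black-Scholes price of a European call."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    cdf = NormalDist().cdf
    return spot * cdf(d1) - strike * math.exp(-rate * t) * cdf(d2)

CORR = [[1.0, -0.7], [-0.7, 1.0]]  # assumed spot/implied-vol correlation
chol = cholesky(CORR)

def simulate_losses(n_paths, seed=0):
    """1-day losses on a short at-the-money call under correlated shocks."""
    rng = random.Random(seed)
    spot0, vol0, strike, t = 100.0, 0.20, 100.0, 0.25
    spot_shock = vol0 / math.sqrt(252)  # daily std dev of the log spot return
    vol_shock = 0.01                    # assumed daily std dev of implied vol
    value0 = -bs_call(spot0, strike, vol0, t)
    losses = []
    for _ in range(n_paths):
        z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        x = [sum(chol[i][k] * z[k] for k in range(2)) for i in range(2)]
        spot1 = spot0 * math.exp(spot_shock * x[0])
        vol1 = max(vol0 + vol_shock * x[1], 0.01)
        value1 = -bs_call(spot1, strike, vol1, t - 1.0 / 252)  # full reprice
        losses.append(value0 - value1)
    return losses

def quantile(values, alpha):
    ordered = sorted(values)
    return ordered[max(math.ceil(round(alpha * len(ordered), 9)), 1) - 1]

losses = simulate_losses(20000)
var_99 = quantile(losses, 0.99)
var_997 = quantile(losses, 0.997)
print(round(var_99, 3), round(var_997, 3))
```

Because the option is repriced in full on every path, the nonlinearity in spot and volatility is carried into the loss distribution instead of being flattened into a linear sensitivity.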
Interactive exploration
Open notebook to compute VaR using all three methods on the same data, toggle between normal and fat-tailed distributions, and see how the answers diverge at high confidence levels.
“Confidence Level” vs “Confidence Interval”
These terms share vocabulary but mean different things.
| | VaR Confidence Level | Statistical Confidence Interval |
|---|---|---|
| Distribution of what? | Future P&L outcomes | Sampling distribution of an estimator |
| What’s random? | The future realization | The interval (parameter is fixed) |
| Statement type | Predictive / probabilistic | Inferential / epistemic |
| “99% confident” means | “99% of outcomes won’t exceed $\text{VaR}_{0.99}$” | “This procedure captures the truth 99% of the time” |
The mathematical connection: both involve quantiles of probability distributions. The 99th percentile is the 99th percentile whether you compute it for a loss distribution or a sampling distribution. The underlying math (inverse CDF) is identical.
The conceptual difference: VaR says “according to our model, 99% of future realizations fall within this bound.” A confidence interval says “if we repeated this estimation procedure many times, 99% of the resulting intervals would contain the true parameter.” VaR is about outcomes; confidence intervals are about estimation uncertainty.
The meta-level connection: you can construct a confidence interval around your VaR estimate, for example “Our 99% VaR estimate is \$50M ± \$5M at 95% confidence.” CCPs doing backtesting (testing whether their VaR model’s 99th percentile is actually the 99th percentile) are implicitly working with this: a statistical inference problem about the accuracy of a predictive model.
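One hedged way to see backtesting as an inference problem: if the model's 99% VaR is correct and exceedances are independent, the number of exceedances in $n$ days is Binomial($n$, 0.01). The sketch below computes how surprising a given count would be. The 250-day window and the helper name `binom_tail` are assumptions for illustration.

```python
import math

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))

n_days = 250    # assumed backtest window (about one trading year)
p_exceed = 0.01 # exceedance probability implied by a correct 99% VaR

expected = n_days * p_exceed
for observed in (3, 5, 10):
    print(observed, round(binom_tail(n_days, p_exceed, observed), 4))
```

Under these assumptions about 2.5 exceedances are expected. Five or more would occur roughly 11% of the time even with a correct model, while ten or more would be very unlikely and would point to a model that understates the tail.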
How CCPs Actually Implement Confidence Levels
The Regulatory Floor: PFMI Principle 6
“A CCP should adopt initial margin models and parameters that are risk-based and generate margin requirements sufficient to cover its potential future exposure to participants in the interval between the last margin collection and the close out of positions following a participant default. Initial margin should meet an established single-tailed confidence level of at least 99 percent.” — BIS/CPMI-IOSCO, PFMI (2012), Principle 6, Key Consideration 3
Key phrase: “at least 99 percent.” Many CCPs choose to exceed this.
CCP-Specific Implementations
| CCP | Method | Confidence Level | Liquidation Horizon | Look-back |
|---|---|---|---|---|
| NSCC (DTCC) | Parametric VaR (EWMA + equal-weight) | 99% | ~3 days | Rolling window |
| LCH (PRISMA) | Historical simulation + vol scaling | 99.7% | 5 days (house) / 7 days (client) | 10 years |
| CME (SPAN 2) | Monte Carlo VaR | 99% | 1-5 days (product-dependent) | Calibrated to stress periods |
| Bilateral OTC (BCBS-IOSCO) | Model-dependent | 99% | 10 days | Must include stress period |
Why does LCH use 99.7% while NSCC uses 99%? Two factors:
- Product characteristics. LCH clears interest rate swaps — large notionals, long-dated positions, illiquid compared to equities. NSCC clears equities — smaller, more liquid, faster to liquidate. Higher confidence compensates for harder-to-exit positions.
- Liquidation horizon. NSCC assumes ~3-day close-out (T+1 settlement + buffer). LCH assumes 5-7 days. Longer horizons mean more price uncertainty, so a higher confidence level provides additional cushion.
NSCC Margin Components
NSCC’s margin (“Required Fund Deposit”) isn’t just VaR — it’s a sum:
- Volatility Charge (the VaR component): dual EWMA/equal-weight calculation, takes the higher
- Mark-to-Market Charge: covers unrealized losses
- Gap Risk Measure: add-on for concentrated positions susceptible to jump risk (e.g., earnings, M&A)
- Margin Floor: proportional to portfolio market value (prevents margin from going implausibly low in calm markets)
Fat Tails: Why the Gaussian Lies
Figure: Normal vs Student-t distribution. At 99% VaR (blue dashed), the two distributions diverge modestly. At 99.7% (purple dashed), the gap is substantial. The shaded area represents probability mass the Gaussian misses: extreme events that are far more likely under fat-tailed distributions.
Empirical financial returns exhibit:
- Excess kurtosis (fat tails): extreme events are far more frequent than a normal distribution predicts.
- Skewness: crashes are larger than rallies (especially equities).
- Volatility clustering: large moves cluster together (GARCH effects).
The higher the confidence level, the deeper into the tail you go, and the more the Gaussian underestimates risk:
| Event | Normal Distribution | Empirical (typical equity) |
|---|---|---|
| $3\sigma$ move | 0.13% (1 in 740) | ~0.5-1% (1 in 100-200) |
| $4\sigma$ move | 0.003% (1 in 31,574) | ~0.1% (1 in 1,000) |
| $5\sigma$ move | 0.00003% (1 in 3.5M) | ~0.01-0.03% (1 in 3,000-10,000) |
This isn’t abstract. The 2008 financial crisis, the 2010 Flash Crash, the 2015 CHF de-peg, the 2020 COVID crash, and the 2021 GameStop episode all produced moves that were extreme outliers under Gaussian assumptions but uncomfortably plausible under fat-tailed models.
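The Gaussian column of the table can be reproduced directly, and a fat-tailed comparison is easy to sketch. The Student-t with $\nu = 3$ used here is an illustrative assumption (chosen because its CDF has a closed form), not a calibration to equity data, so its probabilities will not match the empirical column exactly.

```python
import math
from statistics import NormalDist

def normal_tail(k):
    """One-sided P(Z > k) for a standard normal Z."""
    return 1.0 - NormalDist().cdf(k)

def student_t3_tail(k):
    """One-sided P(X > k standard deviations) for a Student-t with nu = 3.
    The variance of t_3 is 3, so k sigma corresponds to t = k * sqrt(3),
    and the closed-form t_3 CDF is evaluated at x = t / sqrt(3) = k."""
    x = float(k)
    cdf = 0.5 + (math.atan(x) + x / (1.0 + x * x)) / math.pi
    return 1.0 - cdf

for k in (3, 4, 5):
    n_tail = normal_tail(k)
    t_tail = student_t3_tail(k)
    print(f"{k} sigma: normal 1 in {1 / n_tail:,.0f}, "
          f"t(3) 1 in {1 / t_tail:,.0f}, ratio {t_tail / n_tail:,.0f}x")
```

The ratio between the two tail probabilities grows rapidly with the size of the move, which is the sense in which a higher confidence level makes the Gaussian assumption more dangerous.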
What CCPs Do to Compensate
- Use historical simulation instead of parametric VaR. Let the data speak rather than imposing a distributional assumption. LCH’s PRISMA is the exemplar.
- Stress testing. PFMI Principle 4 (KC5) requires CCPs to conduct stress tests using “extreme but plausible market conditions”: historical scenarios (Lehman, GFC, COVID), hypothetical scenarios, and reverse stress tests (“what scenario would exhaust the default fund?”). Results inform default fund sizing.
- Expected Shortfall (ES) / CVaR. While most CCPs still use VaR for margin, the Basel FRTB (Fundamental Review of the Trading Book) has shifted bank capital requirements to Expected Shortfall at 97.5%:
  $$\text{ES}_\alpha = E[\,L \mid L \ge \text{VaR}_\alpha\,]$$
  ES answers: “Given that we’re in the worst $(1-\alpha)$ share of outcomes, what is the average loss?” VaR tells you the threshold; ES tells you how bad it gets beyond the threshold. ES is also coherent (in particular sub-additive: the ES of a combined portfolio never exceeds the sum of the standalone ES figures, which VaR does not guarantee).
  The Basel Committee chose 97.5% ES specifically because it roughly equals 99% VaR under normality but is strictly more conservative for fat-tailed distributions.
- Procyclicality buffers. During calm markets, historical VaR drops, margins fall, everyone leverages up — then when a crisis hits, margins spike, forced liquidation amplifies the crash. PFMI and EMIR require anti-procyclicality measures: margin floors, stressed VaR lookback periods that always include a stress period, and buffer add-ons (e.g., 25% above model output).
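The claim in the Expected Shortfall item that 97.5% ES roughly equals 99% VaR under normality can be checked with the closed form $\text{ES}_\alpha = \sigma\,\varphi(z_\alpha)/(1-\alpha)$ for a zero-mean normal loss. The sketch below does that and adds a sample-based ES. The function names are assumptions for illustration.

```python
from statistics import NormalDist

def normal_var(alpha, sigma=1.0):
    """VaR of a zero-mean normal loss: sigma * z_alpha."""
    return sigma * NormalDist().inv_cdf(alpha)

def normal_es(alpha, sigma=1.0):
    """E[L | L >= VaR_alpha] for L ~ N(0, sigma^2):
    sigma * pdf(z_alpha) / (1 - alpha)."""
    z = NormalDist().inv_cdf(alpha)
    return sigma * NormalDist().pdf(z) / (1.0 - alpha)

def empirical_es(losses, alpha):
    """Average of the worst (1 - alpha) share of sampled losses."""
    ordered = sorted(losses, reverse=True)
    k = max(int(round(len(ordered) * (1.0 - alpha), 9)), 1)
    return sum(ordered[:k]) / k

print(round(normal_var(0.99), 4))   # 2.3263
print(round(normal_es(0.975), 4))   # 2.3378
print(empirical_es(list(range(1, 101)), 0.95))  # 98.0
```

In units of $\sigma$ the two Gaussian numbers differ by about half a percent. On a fat-tailed sample the gap widens in favour of ES, because the average beyond the threshold grows faster than the threshold itself.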
The $\sqrt{T}$ Scaling Rule
Derivation
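Under a random walk assumption, daily log-returns are independent and identically distributed (i.i.d.):
$$r_t \overset{\text{i.i.d.}}{\sim} (0, \sigma^2)$$
The cumulative return over $T$ days is $R_T = \sum_{t=1}^{T} r_t$. Since the $r_t$ are independent:
$$\text{Var}(R_T) = \sum_{t=1}^{T} \text{Var}(r_t) = T \sigma^2$$
Therefore, for a VaR that is proportional to the standard deviation:
$$\sigma_T = \sigma \sqrt{T} \quad\Longrightarrow\quad \text{VaR}_\alpha^{T\text{-day}} = \sqrt{T} \cdot \text{VaR}_\alpha^{1\text{-day}}$$
This is why the BCBS-IOSCO 10-day horizon produces margin approximately $\sqrt{10} \approx 3.16$ times the 1-day VaR.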
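As a quick worked illustration of the square-root-of-time scaling, the snippet below applies it to the liquidation horizons mentioned in this note. The 1-day VaR of $10M is a hypothetical input, and the i.i.d. assumption from the derivation is taken at face value.

```python
import math

def scale_var(one_day_var, horizon_days):
    """Scale a 1-day VaR to a T-day horizon under the i.i.d. assumption."""
    return one_day_var * math.sqrt(horizon_days)

one_day_var = 10e6  # hypothetical 1-day VaR in dollars
for days in (1, 3, 5, 7, 10):
    multiplier = math.sqrt(days)
    print(days, round(multiplier, 2), round(scale_var(one_day_var, days)))
```

Moving from a 3-day to a 10-day horizon raises the multiplier from about 1.73 to about 3.16, so the horizon alone accounts for a large share of the difference between cleared and bilateral margin.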
When It Breaks Down
The $\sqrt{T}$ rule relies on three assumptions, each of which fails in practice:
- Independence of returns. Financial returns exhibit volatility clustering (GARCH effects) and momentum/mean-reversion patterns. If returns are positively autocorrelated (momentum), risk grows faster than $\sqrt{T}$. If mean-reverting, slower.
- Constant volatility. Volatility is stochastic. During stress, it spikes and stays elevated. The $\sqrt{T}$ rule using today’s $\sigma$ underestimates multi-day risk if vol is rising.
- Continuous price paths. The rule assumes diffusion (no jumps). Prices can gap overnight (earnings, central bank decisions, geopolitical shocks). A 3-day liquidation horizon doesn’t help if the gap happens in one tick.
In practice, CCPs address these by: computing VaR directly at the target horizon using overlapping multi-day returns (rather than scaling up 1-day VaR), adding gap risk charges, and using concentration add-ons for large positions where market impact is material.
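The effect of the first broken assumption can be quantified with a small sketch. It assumes daily returns follow a stationary AR(1) process with lag-1 autocorrelation `rho`, which is a modelling choice made here for illustration. In that case $\text{Var}(R_T) = \sigma^2 \left[ T + 2 \sum_{k=1}^{T-1} (T-k)\rho^k \right]$.

```python
import math

def horizon_vol_ar1(sigma, rho, horizon_days):
    """Std dev of the sum of T daily returns from a stationary AR(1) process
    with daily std dev sigma and lag-1 autocorrelation rho."""
    t = horizon_days
    cross = sum((t - k) * rho ** k for k in range(1, t))
    return sigma * math.sqrt(t + 2.0 * cross)

def ratio_to_sqrt_rule(rho, horizon_days):
    """True multi-day vol divided by the sqrt(T) prediction."""
    return horizon_vol_ar1(1.0, rho, horizon_days) / math.sqrt(horizon_days)

for rho in (-0.2, 0.0, 0.2):
    print(rho, round(ratio_to_sqrt_rule(rho, 10), 3))
```

With no autocorrelation the ratio is exactly 1. Under this model a modest positive autocorrelation of 0.2 makes the 10-day risk roughly 20% larger than the square-root rule predicts, and mean reversion of the same size makes it noticeably smaller.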
See it
The companion notebook Section 3 lets you toggle autocorrelation on and off and watch the $\sqrt{T}$ prediction diverge from the actual multi-day VaR.
Questions to sit with:
- LCH uses 99.7% confidence with historical simulation over a 10-year lookback. NSCC uses 99% parametric VaR with a rolling window. If both meet the PFMI Principle 6 floor, which approach is more robust to a “never-seen-before” crisis? Historical simulation can only replay history; parametric models can extrapolate beyond the sample. But parametric models assume a distribution shape. Is there a principled way to choose between “I’ll replay what actually happened” and “I’ll model what might happen”?
- Expected Shortfall is sub-additive (diversification always helps), while VaR is not: you can construct portfolios where combined VaR exceeds the sum of individual VaRs. Construct an intuitive example of this failure. (Hint: think of two positions that are individually safe but jointly catastrophic. Each has a small probability of a large loss, and the losses are mutually exclusive.)
- The $\sqrt{T}$ scaling rule says margin should scale with the square root of the liquidation horizon. But during the 2020 COVID crash, CCPs raised margins dramatically, not because the horizon changed, but because $\sigma$ spiked. If both $\sigma$ and $T$ affect margin, and both are uncertain, how should a CCP set margin during a crisis without triggering the procyclicality spiral (higher margin → forced liquidation → more volatility → even higher margin)?
See also
- settlement-and-clearing — CCP margin mechanics, PFMI Principle 6
- aig-and-the-cds-crisis — what happens with no margin at all
- volatility — realized vs implied vol, volatility clustering