Vanilla Conformal Prediction

What goes wrong with a naive approach

You have a predictor $\hat{y}$ for a quantity you care about (frame time, row height, allocation deltas), and you want an upper bound so the consumer can preallocate, gate, or alert safely. The textbook answers all fail in a TUI:

  • Parametric $\hat{y} + 2\sigma$: assumes Gaussian residuals. Real residuals are heavy-tailed (one pathological frame, a single multi-line row, a flushed writer) and the Gaussian bound under-covers exactly when you need it most.
  • Empirical max: uses the worst-observed residual. Over-covers forever after one outlier; the bound never tightens.
  • Hand-tuned percentile: “use the 95th percentile of the last N samples”. Works, but “finite-sample coverage” is a hope, not a guarantee, and no one in the codebase knows why it’s 95 instead of 90.

Conformal prediction (Vovk et al., 2005) gives you the hand-tuned percentile with a proof. The only assumption is exchangeability of calibration points, which is much weaker than i.i.d., and the coverage guarantee is finite-sample: it holds after $n = 10$ observations, not “in the limit”.

Mental model

Keep a rolling buffer of calibration residuals $R_i = |y_i - \hat{y}_i|$. On each prediction, quote the bound:

$$\hat{y}^+ = \hat{y} + q, \qquad q = \text{Quantile}_{\lceil (1-\alpha)(n+1) \rceil / n} (R_{1..n})$$

With probability at least $1 - \alpha$ over the next observation, $y_{n+1} \le \hat{y}^+$. No distributional assumption beyond exchangeability. If residuals drift, the bound drifts with them.
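The bound can be sketched in a few lines of Rust. This is a minimal illustration rather than the FrankenTUI implementation: the names `conformal_rank` and `conformal_upper_bound` are hypothetical, and returning `None` when the rank exceeds the number of residuals is an assumption made here for clarity.

```rust
/// Rank k = ceil((1 - alpha) * (n + 1)) of the order statistic used as q.
/// (Hypothetical helper, not part of any published API.)
fn conformal_rank(n: usize, alpha: f64) -> usize {
    // The small epsilon keeps floating-point noise from pushing an
    // exactly-integer product up to the next rank.
    ((1.0 - alpha) * (n as f64 + 1.0) - 1e-9).ceil() as usize
}

/// Returns y_hat + q, where q is the k-th smallest residual.
/// Returns None when the window is too small for rank k to exist.
fn conformal_upper_bound(residuals: &[f64], y_hat: f64, alpha: f64) -> Option<f64> {
    let n = residuals.len();
    let k = conformal_rank(n, alpha);
    if k == 0 || k > n {
        return None;
    }
    // Sort a copy so the caller's buffer keeps its arrival order.
    let mut sorted = residuals.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    Some(y_hat + sorted[k - 1]) // order statistics are 1-indexed
}
```

Sorting a copy on every call costs O(n log n); for a window of a few hundred residuals that is usually acceptable, and an order-statistic structure could replace it if profiling says otherwise.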

Conformal prediction is the reverse of parametric uncertainty. Instead of asserting a noise model and computing a bound, you compute an empirical quantile and borrow the coverage guarantee from exchangeability. It is the most honest uncertainty estimator available for TUI workloads.

The math

Calibration and the lifted quantile

Given $n$ calibration residuals, the conformal quantile is the $\lceil (1-\alpha)(n+1) \rceil / n \cdot 100\%$ empirical quantile. The $(n+1)$ lift is what secures the finite-sample guarantee: it accounts for the test point as if it were also in the calibration set.

$$q_n = R_{(\lceil (1-\alpha)(n+1) \rceil)}$$

(where $R_{(k)}$ is the $k$-th order statistic).

Coverage theorem

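If $R_1, \ldots, R_{n+1}$ are exchangeable, then:

$$P(R_{n+1} \le q_n) \ge 1 - \alpha$$

Not asymptotic: the inequality holds for every finite $n$, provided $q_n$ is read as $+\infty$ whenever the rank $\lceil (1-\alpha)(n+1) \rceil$ exceeds $n$.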
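The argument behind the theorem is short. As a sketch, assume there are no ties among the residuals and write $k = \lceil (1-\alpha)(n+1) \rceil \le n$. By exchangeability, the rank of $R_{n+1}$ among all $n+1$ residuals is uniform on $\{1, \ldots, n+1\}$, and $R_{n+1}$ falls at or below the $k$-th smallest calibration residual exactly when that rank is at most $k$:

$$
\begin{aligned}
P\big(R_{n+1} \le R_{(k)}\big) &= P\big(\operatorname{rank}(R_{n+1}) \le k\big) = \frac{k}{n+1} \\
&= \frac{\lceil (1-\alpha)(n+1) \rceil}{n+1} \ge 1 - \alpha .
\end{aligned}
$$

With ties the first equality weakens to $\ge$, so the guarantee survives. The same line shows the bound is not wasteful: without ties, coverage is below $1 - \alpha + 1/(n+1)$.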

Hysteresis factor

To stop the bound from flapping around a threshold, FrankenTUI applies a multiplicative hysteresis $\eta$ (default $1.1$):

$$\hat{y}^{++} = \eta \cdot (\hat{y} + q)$$

This widens the bound by 10%, trading minuscule over-coverage for stability.

E-process layer

For anytime-valid alerts on top of the conformal bound, pair with an e-process over normalised residuals $z_t = (R_t - q) / \sigma_0$:

$$e_t = \exp\!\left(\lambda z_t - \tfrac{1}{2}\lambda^2 \sigma_0^2\right)$$

Alert when $\prod_s e_s \ge 1/\alpha$. See e-processes for the theory.
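The running product is easiest to track in log space, where it becomes a sum compared against $\ln(1/\alpha)$. The sketch below applies the formula exactly as written above; the `EProcess` type and its fields are hypothetical names chosen for illustration, and treating $\sigma_0$ as a fixed, caller-supplied scale is an assumption.

```rust
/// Running product of e-values, stored as a log-sum to avoid overflow.
/// (Hypothetical type for illustration only.)
struct EProcess {
    lambda: f64,
    sigma0: f64,        // assumed fixed residual scale
    log_threshold: f64, // ln(1 / alpha)
    log_wealth: f64,    // sum over s of ln(e_s)
}

impl EProcess {
    fn new(alpha: f64, lambda: f64, sigma0: f64) -> Self {
        Self {
            lambda,
            sigma0,
            log_threshold: (1.0 / alpha).ln(),
            log_wealth: 0.0,
        }
    }

    /// Folds in one residual and returns true once prod e_s >= 1 / alpha.
    fn update(&mut self, residual: f64, q: f64) -> bool {
        let z = (residual - q) / self.sigma0;
        // ln(e_t) = lambda * z_t - 0.5 * lambda^2 * sigma0^2
        self.log_wealth += self.lambda * z - 0.5 * self.lambda.powi(2) * self.sigma0.powi(2);
        self.log_wealth >= self.log_threshold
    }
}
```

With $\alpha = 0.05$, $\lambda = 0.5$ and $\sigma_0 = 1$, a residual sitting two units above $q$ adds $0.875$ to the log-sum on every step, so the threshold $\ln 20 \approx 3.0$ is crossed on the fourth such observation. Residuals that sit at $q$ subtract $0.125$ per step and never trigger.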

Defaults

| Parameter | Default |
|---|---|
| $\alpha$ | 0.05 |
| $\lambda$ | 0.5 |
| min calibration size | 10 |
| max calibration size | 500 |
| hysteresis $\eta$ | 1.1 |

Worked example — one-liner alert gate

    // Sliding window of residuals |y_t - ŷ_t|.
    let q = conformal.quantile(alpha); // 95th percentile if α = 0.05
    if observed > eta * (y_hat + q) {
        raise_alert();
    }

On a stream of 200 frame-time residuals with $\alpha = 0.05$:

$$k = \lceil 0.95 \cdot 201 \rceil = 191, \qquad q = R_{(191)}$$

If the next frame exceeds $\eta(\hat{y} + q)$, the alert is valid in the conformal sense: the false-alarm probability under $H_0$ (the residuals are exchangeable) is at most $\alpha$.

Rust interface

crates/ftui-runtime/src/conformal_alert.rs
    use ftui_runtime::conformal_alert::{ConformalAlert, ConformalConfig};

    let mut alert = ConformalAlert::new(ConformalConfig {
        alpha: 0.05,
        lambda: 0.5,
        min_calibration: 10,
        max_calibration: 500,
        hysteresis: 1.1,
    });

    alert.observe(residual); // push into calibration window
    let bound = alert.upper_bound(y_hat);
    if observed > bound {
        // conformal threshold breached
    }

How to debug

The alert emits conformal_alert lines:

{"schema":"conformal_alert","y_hat":18.2, "q":4.1,"bound":24.53,"hysteresis":1.1, "observed":29.8,"triggered":true, "calibration_size":200,"alpha":0.05}
    FTUI_EVIDENCE_SINK=/tmp/ftui.jsonl cargo run -p ftui-demo-showcase

    # Empirical coverage over the run, printed as inside/total:
    jq -c 'select(.schema=="conformal_alert") | {bound, observed, inside: (.observed <= .bound)}' \
      /tmp/ftui.jsonl | jq -rs '"\(map(select(.inside)) | length)/\(length)"'

You should see roughly $1-\alpha$ coverage. If it’s much lower, the calibration window is too small or stale; if it’s 100%, the bound is too loose and $\alpha$ can be raised.

Pitfalls

Exchangeability is not i.i.d., but it isn’t free either. If the residual distribution drifts (new hardware, new workload), old residuals pollute the quantile. Keep the window short (max_calibration=500) and/or bucket by context — see Mondrian conformal for the bucketed version.
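A bounded window is what keeps stale residuals from lingering. The following is a minimal sketch of such a buffer, assuming oldest-first eviction; `CalibrationWindow` and its methods are hypothetical names, not the crate's actual types.

```rust
use std::collections::VecDeque;

/// Fixed-capacity residual buffer that evicts the oldest entry first.
/// (Hypothetical type for illustration only.)
struct CalibrationWindow {
    residuals: VecDeque<f64>,
    max_len: usize,
}

impl CalibrationWindow {
    fn new(max_len: usize) -> Self {
        assert!(max_len > 0, "window must hold at least one residual");
        Self {
            residuals: VecDeque::with_capacity(max_len),
            max_len,
        }
    }

    /// Records a residual, dropping the oldest one once the window is full.
    fn observe(&mut self, residual: f64) {
        if self.residuals.len() == self.max_len {
            self.residuals.pop_front();
        }
        self.residuals.push_back(residual);
    }

    fn len(&self) -> usize {
        self.residuals.len()
    }

    /// The oldest residual still influencing the quantile, if any.
    fn oldest(&self) -> Option<f64> {
        self.residuals.front().copied()
    }
}
```

A smaller `max_len` forgets an old regime faster, at the price of a noisier quantile; the same structure can be instantiated once per context bucket.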

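Cold-start bounds. With $n = 10$ the lifted rank is $\lceil 0.95 \cdot 11 \rceil = 11$, one more than the 10 observations available, so the bound falls back to the maximum of the 10 residuals. The maximum is the widest bound the window can supply, yet its guaranteed coverage is only $10/11 \approx 90.9\%$; the rank first fits inside the window at $n = 19$. The wide early bound is the correct behaviour, but users can mistake it for a bug.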
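The cold-start arithmetic can be checked directly. The sketch below uses hypothetical helper names and finds the smallest window for which the lifted rank fits inside the window, plus the coverage that the maximum of $n$ residuals guarantees under exchangeability, $n/(n+1)$.

```rust
/// Rank k = ceil((1 - alpha) * (n + 1)); the epsilon absorbs float noise.
/// (Hypothetical helper, not part of any published API.)
fn conformal_rank(n: usize, alpha: f64) -> usize {
    ((1.0 - alpha) * (n as f64 + 1.0) - 1e-9).ceil() as usize
}

/// Smallest n for which the rank does not exceed the window size.
fn min_calibration_for_finite_bound(alpha: f64) -> usize {
    assert!(alpha > 0.0 && alpha < 1.0, "alpha must lie in (0, 1)");
    (1usize..)
        .find(|&n| conformal_rank(n, alpha) <= n)
        .expect("a finite n exists for every alpha in (0, 1)")
}

/// Coverage guaranteed when the bound is the maximum of n residuals.
fn max_fallback_coverage(n: usize) -> f64 {
    n as f64 / (n as f64 + 1.0)
}
```

For $\alpha = 0.05$ the search returns 19, while a window of 10 residuals has rank 11 and a fallback coverage of $10/11$.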
