SLO schema (`slo.yaml`)

slo.yaml at the repo root is FrankenTUI’s machine-readable service level objective definition. Every budget the kernel honours — frame render p99, layout compute p99, Bayesian posterior update latency, heap RSS — is named and bounded here. CI validates the schema on every push and runs a deterministic replay that exercises safe-mode when breaches are injected.

Source: /slo.yaml.

Why a single SLO file

The benchmark gate provides the enforcement mechanism. The SLO file provides the intention: what the kernel promises its users. Keeping the two aligned in one place means:

Budgets are reviewable in a single PR.
A CI breach maps one-to-one to a documented promise.
The runtime’s degradation cascade can key off the same metric names that appear in tests.

Top-level layout


# Global thresholds
regression_threshold: 0.10
noise_tolerance: 0.05
safe_mode_breach_count: 3
safe_mode_error_rate: 0.10
 
metrics:
  render_frame_p99_us:
    metric_type: latency
    max_value: 4000.0
    max_ratio: 1.25
    safe_mode_trigger: true
 
  # …many more metrics…

Global thresholds

| Field | Meaning |
| --- | --- |
| `regression_threshold` | Fractional overage above baseline that counts as a regression (default 10%). |
| `noise_tolerance` | Measurement variance absorbed before alerting (default 5%). |
| `safe_mode_breach_count` | How many `safe_mode_trigger: true` metrics must breach before the runtime enters safe mode. |
| `safe_mode_error_rate` | Error-rate metric above which safe mode engages independently of latency. |

Per-metric fields

| Field | Type | Meaning |
| --- | --- | --- |
| `metric_type` | `latency` \| `memory` \| `error_rate` | Unit family. `latency` is microseconds; `memory` is bytes or count; `error_rate` is a fraction 0–1. |
| `max_value` | `f64` | Absolute ceiling. Exceeding it is a breach. |
| `max_ratio` | `f64` | Max ratio vs baseline. Exceeding it is a breach. Optional. |
| `safe_mode_trigger` | `bool` | When `true`, breaching this metric counts toward `safe_mode_breach_count`. |

Breach semantics: a metric breaches if value > max_value or value / baseline > max_ratio.
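The rule above can be sketched as a small function. This is an illustration, not the runtime's actual code: the `Breach` enum and `evaluate` function are hypothetical names, and the sketch assumes the ratio check is skipped whenever `max_ratio` or a baseline is missing (or the baseline is not positive).

```rust
/// Which of the two checks failed. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Breach {
    OverMaxValue,
    OverMaxRatio,
}

/// Returns the first failing check, or `None` if the value is within budget.
/// Assumption: the ratio check only runs when both `max_ratio` and a positive
/// `baseline` are available.
pub fn evaluate(
    value: f64,
    max_value: f64,
    max_ratio: Option<f64>,
    baseline: Option<f64>,
) -> Option<Breach> {
    // Absolute ceiling: value > max_value.
    if value > max_value {
        return Some(Breach::OverMaxValue);
    }
    // Relative ceiling: value / baseline > max_ratio.
    if let (Some(ratio), Some(base)) = (max_ratio, baseline) {
        if base > 0.0 && value / base > ratio {
            return Some(Breach::OverMaxRatio);
        }
    }
    None
}
```

With `max_value: 4000.0` and `max_ratio: 1.25`, a measurement of 3900 µs against a 3000 µs baseline stays under the absolute ceiling but breaches on ratio (1.3 > 1.25).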

Metric categories

The schema groups metrics into two planes.

Data plane — frame rendering pipeline

Budgets on the hot path: every render has to meet these to hit the 60 Hz target.


render_frame_p50_us:    { max_value: 500.0,  max_ratio: 1.15 }
render_frame_p95_us:    { max_value: 2000.0, max_ratio: 1.20 }
render_frame_p99_us:    { max_value: 4000.0, max_ratio: 1.25, safe_mode_trigger: true }
render_frame_p999_us:   { max_value: 8000.0, max_ratio: 1.50, safe_mode_trigger: true }
 
layout_compute_p50_us:  { max_value: 200.0 }
layout_compute_p95_us:  { max_value: 800.0 }
layout_compute_p99_us:  { max_value: 1500.0 }
 
diff_strategy_p50_us:   { max_value: 100.0 }
diff_strategy_p95_us:   { max_value: 500.0 }
diff_strategy_p99_us:   { max_value: 1000.0 }
 
ansi_present_p50_us:    { max_value: 150.0 }
ansi_present_p95_us:    { max_value: 600.0 }
ansi_present_p99_us:    { max_value: 1200.0 }

Memory and error budgets on the same plane:


heap_rss_bytes:                       { max_value: 104857600.0, max_ratio: 1.50, safe_mode_trigger: true }
allocations_per_frame:                { max_value: 500.0, max_ratio: 1.30 }
false_positive_strategy_switch_rate:  { max_value: 0.05, safe_mode_trigger: true }
malformed_ansi_rate:                  { max_value: 0.01 }

Decision plane — intelligence layer

Budgets on the statistical kernels behind the runtime’s adaptive behaviour.


posterior_update_p99_us:  { max_value: 500.0, safe_mode_trigger: true }
voi_computation_p99_us:   { max_value: 400.0 }
conformal_predict_p95_us: { max_value: 100.0 }
eprocess_update_p50_us:   { max_value: 10.0 }
bocpd_update_p50_us:      { max_value: 25.0 }
cascade_decision_p99_us:  { max_value: 100.0 }

The decision plane’s budgets are deliberately tight — a sluggish posterior-update hurts every diff decision that follows. See intelligence overview for what these kernels actually do.

`BreachResult`

When the runtime evaluates a metric against the SLO, it produces a BreachResult:


pub struct BreachResult {
    pub metric_name: String,
    pub metric_type: MetricType,   // Latency | Memory | ErrorRate
    pub value: f64,
    pub max_value: f64,
    pub max_ratio: Option<f64>,
    pub baseline: Option<f64>,
    pub breached: bool,
    pub safe_mode_trigger: bool,
    pub reason: BreachReason,      // OverMaxValue | OverMaxRatio | None
}

Breach results are emitted as events on the evidence sink and counted toward safe_mode_breach_count. Reaching that count flips the runtime into safe mode — see frame budget for the degradation cascade that kicks in.
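A minimal sketch of the counting step follows. `BreachSummary` and `should_enter_safe_mode` are hypothetical names, and the sketch assumes that only results with both `breached` and `safe_mode_trigger` set are counted, and that safe mode engages when the count reaches (rather than exceeds) `safe_mode_breach_count`. Whether the real runtime counts within one evaluation pass or accumulates over time is not specified here.

```rust
/// Trimmed-down stand-in for `BreachResult`, keeping only what counting needs.
pub struct BreachSummary {
    pub metric_name: String,
    pub breached: bool,
    pub safe_mode_trigger: bool,
}

/// Returns true once enough trigger metrics have breached.
/// Assumption: the comparison against `safe_mode_breach_count` is `>=`.
pub fn should_enter_safe_mode(results: &[BreachSummary], safe_mode_breach_count: usize) -> bool {
    let triggering = results
        .iter()
        .filter(|r| r.breached && r.safe_mode_trigger)
        .count();
    triggering >= safe_mode_breach_count
}
```

A breached metric without `safe_mode_trigger: true` still produces a breach result, but in this sketch it does not move the runtime toward safe mode.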

CI validation

CI runs two gates against this file on every push:

Schema validation. Every metric must declare a known metric_type and a numeric max_value, and safe_mode_trigger must be a bool. Unknown keys fail.
Deterministic safe-mode replay. A fixture injects breaches on the safe_mode_trigger: true metrics and asserts the runtime transitions into safe mode. If the cascade doesn’t fire, CI fails.
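The schema-validation gate could look roughly like the sketch below. It is not the project's validator: `Scalar` and `validate_metric` are hypothetical, the input is assumed to be already parsed from YAML into key/value pairs, and `max_ratio` and `safe_mode_trigger` are treated as optional.

```rust
/// Hypothetical parsed scalar; a real loader would get this from a YAML parser.
pub enum Scalar {
    Str(String),
    Num(f64),
    Bool(bool),
}

/// Checks one metric entry against the rules listed above.
pub fn validate_metric(fields: &[(&str, Scalar)]) -> Result<(), String> {
    let mut has_type = false;
    let mut has_max = false;
    for (key, value) in fields {
        match (*key, value) {
            // metric_type must be one of the three known families.
            ("metric_type", Scalar::Str(s))
                if matches!(s.as_str(), "latency" | "memory" | "error_rate") =>
            {
                has_type = true
            }
            ("max_value", Scalar::Num(_)) => has_max = true,
            ("max_ratio", Scalar::Num(_)) => {}
            ("safe_mode_trigger", Scalar::Bool(_)) => {}
            // Known key, but the value has the wrong type or is not allowed.
            ("metric_type" | "max_value" | "max_ratio" | "safe_mode_trigger", _) => {
                return Err(format!("field `{key}` has the wrong type or value"));
            }
            // Unknown keys fail.
            (other, _) => return Err(format!("unknown key `{other}`")),
        }
    }
    if !has_type {
        return Err("missing `metric_type`".to_string());
    }
    if !has_max {
        return Err("missing `max_value`".to_string());
    }
    Ok(())
}
```

Rejecting unknown keys means a misspelled field such as `max_vlaue` fails loudly instead of silently removing a ceiling.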

Relationship to the benchmark gate

SLO (slo.yaml) is the promise — the outermost ceiling.
Benchmark gate (tests/baseline.json) is the enforcement — the per-benchmark measured baseline with tolerance.

The gate’s budgets should always be ≤ the SLO’s max_value. If a benchmark’s baseline creeps up past an SLO ceiling, either the SLO must widen (deliberate promise change) or the gate has to fail.
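The invariant that gate budgets stay at or below the SLO ceiling can be checked mechanically. The sketch below assumes both files have already been loaded into name-to-number maps; the on-disk layout of `tests/baseline.json` is not described here, so `gate_exceeds_slo` and its inputs are hypothetical.

```rust
use std::collections::HashMap;

/// Returns the names of metrics whose gate budget exceeds the SLO ceiling,
/// sorted for stable output. Metrics present in only one map are ignored.
pub fn gate_exceeds_slo(
    slo_max: &HashMap<String, f64>,
    gate_budget: &HashMap<String, f64>,
) -> Vec<String> {
    let mut offenders: Vec<String> = gate_budget
        .iter()
        .filter(|(name, budget)| {
            slo_max
                .get(*name)
                .map_or(false, |ceiling| **budget > *ceiling)
        })
        .map(|(name, _)| name.clone())
        .collect();
    offenders.sort();
    offenders
}
```

An empty result means every shared metric respects its ceiling; a non-empty one lists the metrics where either the SLO has to widen deliberately or the gate has to fail.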

See benchmark gate for the mechanics and telemetry events for the metric names in their canonical runtime form.

Adding a new metric

Pick a name consistent with existing conventions

Latency metrics end in _p{50,95,99,999}_us. Memory metrics end in either _bytes or _per_frame. Error rates end in _rate.
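The suffix conventions are simple enough to check in code. The following is a sketch only; `NameKind` and `classify_name` are hypothetical helpers, not part of the project.

```rust
/// Unit family implied by a metric name's suffix. Hypothetical type.
#[derive(Debug, PartialEq)]
pub enum NameKind {
    Latency,
    Memory,
    ErrorRate,
}

/// Maps a metric name to its family using the suffix conventions above,
/// or returns `None` if the name follows none of them.
pub fn classify_name(name: &str) -> Option<NameKind> {
    const LATENCY_SUFFIXES: [&str; 4] = ["_p50_us", "_p95_us", "_p99_us", "_p999_us"];
    if LATENCY_SUFFIXES.iter().any(|s| name.ends_with(*s)) {
        Some(NameKind::Latency)
    } else if name.ends_with("_bytes") || name.ends_with("_per_frame") {
        Some(NameKind::Memory)
    } else if name.ends_with("_rate") {
        Some(NameKind::ErrorRate)
    } else {
        None
    }
}
```

A check like this could also confirm that the family implied by the name agrees with the declared `metric_type`.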

Decide whether it should trigger safe mode

A metric should set safe_mode_trigger: true only if breaching it means the kernel is genuinely unsafe for interactive use. A slow VOI computation is annoying; a frame render whose p99 exceeds 4 ms is user-visible.

Add the metric and budget


metrics:
  my_new_metric_p99_us:
    metric_type: latency
    max_value: 750.0
    max_ratio: 1.30
    safe_mode_trigger: false

Wire a benchmark

Add a Criterion benchmark that emits the same metric name, and ensure the percentile budget in tests/baseline.json stays within the SLO ceiling.

Run the gates


./scripts/perf_regression_gate.sh --check-only
./scripts/bench_budget.sh --check-only

Confirm both pass at the new budget.

Pitfalls

Don’t raise an SLO to hide a regression. The SLO is a promise. Document a relaxation in the PR description and in the commit history; reviewers should push back on silent widening.

safe_mode_trigger cascades. Flipping a metric to true without understanding the degradation cascade may cause the runtime to enter safe mode more eagerly than intended. Test with the deterministic safe-mode replay before landing.

Percentile choice is load-bearing. If the SLO promises p99 and the benchmark gate measures p95, the two numbers cannot be compared, and a passing gate says nothing about the promise. Keep the percentile consistent across SLO, gate, and telemetry.

