Shadow-run comparison
A shadow-run is a two-lane, seed-identical execution of the same
Model and event sequence, with frame checksums compared at the end.
It is FrankenTUI’s primary mechanism for proving that swapping the
runtime underneath (threading → Asupersync, old diff → new diff, legacy
tick strategy → predictive) is behaviour-preserving.
Source: crates/ftui-harness/src/shadow_run.rs + lab_integration.rs.
Mental model
ShadowRunConfig { seed, viewport, time_step, labels }
        │
        ├──▶ Lab "baseline"  ─▶ LabSession ─▶ frames + checksums
        │                           ▲
        │                   scenario closure
        │                           ▼
        └──▶ Lab "candidate" ─▶ LabSession ─▶ frames + checksums
                                                      │
                                                zip & compare ─▶ ShadowVerdict

Both lanes start from the same seed, render at the same viewport, and
tick with the same time_step_ms. The scenario closure is called
twice — once per lane, on two independently constructed Models — and
the resulting FrameRecords (index, timestamp_ms, FNV-1a checksum) are
zipped and compared. Any divergent checksum fails the test.
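
The checksum-and-zip step can be sketched in isolation. The block below is a minimal illustration, not the harness's code: `fnv1a_64` is the standard 64-bit FNV-1a over raw bytes (how FrankenTUI serialises a frame before hashing is not shown in this document), and `first_divergence` is a hypothetical free function that mirrors the zipped comparison.

```rust
/// Standard 64-bit FNV-1a over a byte slice.
/// Assumption: the harness hashes some byte encoding of the frame; the
/// exact encoding is not described here.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b); // xor first, then multiply: that order is what makes it FNV-1a
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: zip two lanes of checksums and return the index of
/// the first mismatch. Like `Iterator::zip`, it stops at the shorter lane,
/// so a length mismatch has to be checked separately.
pub fn first_divergence(baseline: &[u64], candidate: &[u64]) -> Option<usize> {
    baseline.iter().zip(candidate).position(|(b, c)| b != c)
}
```

Because the comparison is on checksums rather than full frames, a mismatch says which frame diverged but not which cell; the per-lane records are what you inspect next.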
API at a glance
ShadowRunConfig
pub struct ShadowRunConfig {
    pub prefix: String,
    pub scenario_name: String,
    pub seed: u64,
    pub viewport_width: u16,
    pub viewport_height: u16,
    pub time_step_ms: u64,
    pub baseline_label: String,   // default "baseline"
    pub candidate_label: String,  // default "candidate"
}

Builder defaults: 80×24 viewport, 16 ms time step.
| Method | Purpose |
|---|---|
| ShadowRunConfig::new(prefix, scenario_name, seed) | Construct with defaults. |
| .viewport(w, h) | Override frame capture size. |
| .time_step_ms(ms) | Override deterministic tick cadence. |
| .lane_labels(baseline, candidate) | Rename the lanes in JSONL output. |
ShadowVerdict
pub enum ShadowVerdict {
    Match,     // every frame checksum matched
    Diverged,  // at least one frame checksum differed
}

ShadowRunResult
| Field | Type | Meaning |
|---|---|---|
| verdict | ShadowVerdict | Overall outcome. |
| scenario_name | String | Label reused in JSONL. |
| seed | u64 | Seed used for both lanes. |
| frame_comparisons | Vec<FrameComparison> | Per-frame (index, baseline_checksum, candidate_checksum, matched). |
| first_divergence | Option<usize> | Index of the first mismatched frame. |
| frames_compared | usize | Count of zipped frames. |
| baseline / candidate | LabOutput | Full per-lane record (frames, events, anomalies). |
| baseline_label / candidate_label | String | Human-readable lane names. |
| run_total | u64 | Monotonic counter from shadow_runs_total(). |
Helpers: diverged_count(), match_ratio().
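
As a rough sketch of what these two helpers plausibly compute from frame_comparisons: the `FrameComparison` struct below is a hypothetical mirror of the fields listed in the table, and treating an empty comparison as a ratio of 1.0 is an assumption, not documented behaviour.

```rust
/// Hypothetical mirror of the per-frame record listed in the table above.
#[derive(Debug, Clone, PartialEq)]
pub struct FrameComparison {
    pub index: usize,
    pub baseline_checksum: u64,
    pub candidate_checksum: u64,
    pub matched: bool,
}

/// Number of compared frames whose checksums differed.
pub fn diverged_count(frames: &[FrameComparison]) -> usize {
    frames.iter().filter(|f| !f.matched).count()
}

/// Fraction of compared frames that matched, in 0.0..=1.0.
/// Assumption: with nothing to compare, the ratio is reported as 1.0.
pub fn match_ratio(frames: &[FrameComparison]) -> f64 {
    if frames.is_empty() {
        return 1.0;
    }
    let matched = frames.len() - diverged_count(frames);
    matched as f64 / frames.len() as f64
}
```

A ratio is mostly useful for triage and for thresholds in aggregate reports; for a behaviour-preservation claim, the verdict itself is the signal that matters.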
ShadowRun::compare
ShadowRun::compare(config, model_factory, scenario_fn) -> ShadowRunResult

- model_factory (impl Fn() -> M): called twice, once per lane, so each LabSession owns an independent model instance.
- scenario_fn (impl Fn(&mut LabSession<M>)): drives the session. Must be deterministic; no wall-clock reads, no RNG without a seeded source.

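
One way to satisfy the "seeded source" requirement is to derive any scripted randomness from the run's seed. The sketch below uses SplitMix64 as a stand-in generator; `SplitMix64` and `scripted_inputs` are illustrative names, not part of ftui-harness, and any seeded PRNG would serve equally well.

```rust
/// SplitMix64: a small deterministic PRNG, used here only as an example
/// of a seeded source.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical scenario helper: a reproducible script of lowercase key
/// codes derived purely from the seed.
pub fn scripted_inputs(seed: u64, count: usize) -> Vec<u8> {
    let mut rng = SplitMix64::new(seed);
    (0..count)
        .map(|_| b'a' + (rng.next_u64() % 26) as u8)
        .collect()
}
```

Since the scenario closure runs once per lane, constructing the generator inside the closure from the same seed gives both lanes the same input script.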
All comparison evidence is emitted as JSONL via TestJsonlLogger.
Worked example: certifying a runtime migration
use ftui_harness::shadow_run::{ShadowRun, ShadowRunConfig, ShadowVerdict};
use ftui_runtime::program::{Cmd, Model};
use ftui_render::frame::Frame;
use ftui_core::event::Event;
use ftui_core::geometry::Rect;
use ftui_widgets::paragraph::Paragraph;

#[derive(Clone, Debug)]
enum Msg { Tick }

impl From<Event> for Msg {
    fn from(_: Event) -> Self { Msg::Tick }
}

#[derive(Default)]
struct TickCounter { ticks: u64 }

impl Model for TickCounter {
    type Message = Msg;

    fn update(&mut self, _: Msg) -> Cmd<Msg> {
        self.ticks += 1;
        Cmd::none()
    }

    fn view(&self, frame: &mut Frame) {
        let s = format!("ticks={}", self.ticks);
        let area = Rect::new(0, 0, frame.width(), 1);
        Paragraph::new(s.as_str()).render(area, frame);
    }
}

#[test]
fn runtime_migration_preserves_frames() {
    let config = ShadowRunConfig::new("migration_test", "tick_counter", 42)
        .viewport(80, 24)
        .time_step_ms(16)
        .lane_labels("threading", "asupersync");

    let result = ShadowRun::compare(
        config,
        TickCounter::default,
        |session| {
            session.init();
            for _ in 0..30 {
                session.tick();
                session.capture_frame();
            }
        },
    );

    assert_eq!(result.verdict, ShadowVerdict::Match);
    assert_eq!(result.first_divergence, None);
    assert!(result.match_ratio() > 0.999);
}

On a divergence:
match result.verdict {
    ShadowVerdict::Match => {}
    ShadowVerdict::Diverged => {
        let idx = result.first_divergence.unwrap();
        let fc = &result.frame_comparisons[idx];
        panic!(
            "divergence at frame {}: baseline={:016x} candidate={:016x}",
            fc.index, fc.baseline_checksum, fc.candidate_checksum,
        );
    }
}

LabSession methods used inside the scenario
| Method | Effect |
|---|---|
| init() | Calls Model::init, logs viewport to JSONL. |
| send(msg) | Dispatch a message; records to event log. |
| inject_event(evt) / inject_events(&evts) | Real Events through From<Event>. |
| tick() | Inject Event::Tick, advance deterministic clock by time_step_ms. |
| capture_frame() | Render at the configured viewport, store FNV-1a checksum. |
| capture_frame_at(w, h) | Same, at custom dimensions. |
| frame_records() / event_log() | Inspect the record for this lane. |
| now_ms() | Deterministic milliseconds since session start. |
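
The relationship between tick() and now_ms() can be pictured with a toy clock. `StepClock` below is a hypothetical stand-in, not the harness's type; it only illustrates why a clock advanced by a fixed step reads the same in both lanes regardless of how fast the host machine runs.

```rust
/// Hypothetical stand-in for a session's deterministic clock: time only
/// moves when `tick` is called, and always by the same step.
pub struct StepClock {
    now_ms: u64,
    time_step_ms: u64,
}

impl StepClock {
    pub fn new(time_step_ms: u64) -> Self {
        Self { now_ms: 0, time_step_ms }
    }

    /// Advance by exactly one configured step.
    pub fn tick(&mut self) {
        self.now_ms += self.time_step_ms;
    }

    /// Milliseconds elapsed since construction, independent of wall time.
    pub fn now_ms(&self) -> u64 {
        self.now_ms
    }
}
```

Under this model, 30 ticks at a 16 ms step always read 480 ms, which is what makes timestamps comparable across lanes.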
Aggregating shadow runs for rollout decisions
ShadowRun is the raw primitive. For release-gating — combining
multiple scenarios with an optional benchmark gate — wrap the results in
a RolloutScorecard:
use ftui_harness::rollout_scorecard::{
    RolloutScorecard, RolloutScorecardConfig, RolloutVerdict,
};

let mut scorecard = RolloutScorecard::new(
    RolloutScorecardConfig::default()
        .min_shadow_scenarios(3)
        .min_match_ratio(1.0)
        .require_benchmark_pass(true),
);
scorecard.add_shadow_result(result_a);
scorecard.add_shadow_result(result_b);
scorecard.add_shadow_result(result_c);
scorecard.set_benchmark_gate(gate_result);

assert_eq!(scorecard.evaluate(), RolloutVerdict::Go);

Pitfalls
- The scenario closure must be pure. Reading the system clock, using
  rand::thread_rng(), or touching global state will make the two
  lanes diverge. Route randomness through a seeded source and time
  through session.now_ms().
- Frame counts must match. If the scenario captures 30 frames in
  one lane and 29 in the other (e.g. an early Cmd::Quit in candidate
  only), frames_compared drops and the mismatch is a divergence.
  Prefer scenarios that run to a fixed iteration count.
- Don’t share a Model across lanes. Use the model_factory
  closure; constructing it once and cloning will leak cross-lane state
  for anything behind Arc or Rc.
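
The last pitfall is easy to reproduce outside the harness. In the sketch below, `SharedCounter`, `lanes_by_clone`, and `lanes_by_factory` are hypothetical names chosen for illustration; the point is only that `Clone` on a struct holding an `Arc` copies the pointer, not the state behind it.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical model whose counter lives behind an `Arc`.
#[derive(Clone, Default)]
pub struct SharedCounter {
    ticks: Arc<Mutex<u64>>,
}

impl SharedCounter {
    pub fn tick(&self) {
        *self.ticks.lock().unwrap() += 1;
    }

    pub fn ticks(&self) -> u64 {
        *self.ticks.lock().unwrap()
    }
}

/// Construct once and clone: both lanes point at the same counter.
pub fn lanes_by_clone() -> (SharedCounter, SharedCounter) {
    let model = SharedCounter::default();
    (model.clone(), model)
}

/// Call a factory once per lane: each lane owns a fresh counter.
pub fn lanes_by_factory(factory: impl Fn() -> SharedCounter) -> (SharedCounter, SharedCounter) {
    (factory(), factory())
}
```

With the cloned pair, ticking one lane is visible from the other; with the factory pair, each lane counts independently, which is the isolation a shadow run depends on.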