Gestures

GestureRecognizer is the layer that turns the flat Event stream into semantic intent: a Click, a DoubleClick, a DragStart, a LongPress, or a Chord. Widgets never look at raw mouse coordinates or modifier bitflags — they listen for SemanticEvents and react to verbs, which keeps their logic short and their tests trivial.

The recognizer is small (a few hundred lines, crates/ftui-core/src/gesture.rs:L100-L300) and deterministic. Every input that changes its output carries an explicit Instant so the state machine has no hidden clock; ftui-harness exploits this to drive exact timings in snapshot tests.

This page documents the state machine, the timing and distance constants, the dead-zone that makes single-cell mouse jitter tolerable, and how focus loss interacts with in-flight gestures.

Motivation

Raw mouse events are ambiguous. Down → Up → Down → Up in the same cell could be two single clicks if they’re seconds apart, a double-click if they’re within 300 ms, or a drag-start-plus-release if the middle movement exceeded the threshold. Pushing that disambiguation into every widget is both error-prone and wasteful — every list, button, and table would re-implement the same timer logic. A dedicated recognizer owns the problem once.

State machine


                ┌──────────┐
                │   Idle   │
                └────┬─────┘
     MouseDown       │
     ───────────▶  DOWN(pos, button, t0)
                    │ start DragTracker(started=false)
                    │ start long-press timer (500 ms)
                    │
          MouseDrag                 MouseUp (d ≤ 1 cell)
          (|Δ| ≥ 3 cells)           ──────────────────▶
          ──────────▶ DRAGGING       if t < 300 ms of last click
          emit DragStart             and same button, same pos:
                    │                   count += 1
          MouseDrag │                emit Click/DoubleClick/TripleClick
          emit DragMove              ──────────────────▶ Idle
                    │
          MouseUp   │                Escape | FocusLost
          emit      │                ──────────────────▶
          DragEnd   │                if drag.started: emit DragCancel
          ──────────▶ Idle           reset all state
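
The mouse half of the diagram can be sketched as a small transition function. This is a simplified illustration rather than the code in gesture.rs: the names `Machine`, `State`, `In`, and `Out` are hypothetical, timing (multi-click and long-press) is left out, and any release before the drag threshold is treated as a plain Click.

```rust
/// Outputs of the toy machine (a subset of the documented verbs).
#[derive(Debug, PartialEq, Eq)]
enum Out { Click, DragStart, DragMove, DragEnd, DragCancel }

/// Inputs, reduced to what the transitions need.
#[derive(Clone, Copy)]
enum In { Down(u16, u16), Drag(u16, u16), Up(u16, u16), FocusLost }

#[derive(Clone, Copy)]
enum State { Idle, Down { origin: (u16, u16) }, Dragging }

struct Machine { state: State, drag_threshold: u16 }

impl Machine {
    /// Advance by one input and return the emitted verb, if any.
    fn step(&mut self, input: In) -> Option<Out> {
        let (next, out) = match (self.state, input) {
            (State::Idle, In::Down(x, y)) => (State::Down { origin: (x, y) }, None),
            (State::Down { origin }, In::Drag(x, y)) => {
                // Manhattan distance from the press position, in cells.
                let d = origin.0.abs_diff(x).saturating_add(origin.1.abs_diff(y));
                if d >= self.drag_threshold {
                    (State::Dragging, Some(Out::DragStart))
                } else {
                    (self.state, None)
                }
            }
            (State::Down { .. }, In::Up(..)) => (State::Idle, Some(Out::Click)),
            (State::Dragging, In::Drag(..)) => (State::Dragging, Some(Out::DragMove)),
            (State::Dragging, In::Up(..)) => (State::Idle, Some(Out::DragEnd)),
            // Focus loss cancels an in-flight drag and resets everything else.
            (State::Dragging, In::FocusLost) => (State::Idle, Some(Out::DragCancel)),
            (_, In::FocusLost) => (State::Idle, None),
            // Anything else is ignored in the current state.
            (s, _) => (s, None),
        };
        self.state = next;
        out
    }
}
```

Once the machine enters `Dragging`, no path leads back to `Click` for the same press, which is the mutual-exclusion property the recognizer relies on.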

Keyboard chords run in parallel: any press with CTRL | ALT | SUPER accumulates into chord_buffer. A sequence of length ≥ 2 (e.g. Ctrl-K Ctrl-C) emits a Chord; an unmodified keypress or an expired chord_timeout clears the buffer.

Timing constants

GestureConfig::default() applies the following (crates/ftui-core/src/gesture.rs:L63-L73):

Field	Default	Role
`multi_click_timeout`	300 ms	Max gap between clicks to form a double/triple.
`long_press_threshold`	500 ms	Stationary hold that promotes to `LongPress`.
`drag_threshold`	3 cells	Manhattan distance before `DragStart` fires.
`chord_timeout`	1000 ms	Max span for a multi-key chord sequence.
`swipe_velocity_threshold`	50 cells/s	Min velocity for `Swipe`.
`click_tolerance`	1 cell	Dead-zone for multi-click position matching.

All are tunable; the defaults have settled after months of showcase use and match the timings typical desktop stacks converge on (macOS HIG and GNOME use similar values).
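
As a rough illustration of tuning one value while keeping the rest, the sketch below uses a stand-in struct. `ToyGestureConfig` and `relaxed` are hypothetical names, and the field types (Duration, u16, f32) are assumptions; only the field names and default values come from the table above.

```rust
use std::time::Duration;

/// Stand-in mirroring the documented fields; the types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct ToyGestureConfig {
    multi_click_timeout: Duration,
    long_press_threshold: Duration,
    drag_threshold: u16,           // cells
    chord_timeout: Duration,
    swipe_velocity_threshold: f32, // cells per second
    click_tolerance: u16,          // cells
}

impl Default for ToyGestureConfig {
    fn default() -> Self {
        Self {
            multi_click_timeout: Duration::from_millis(300),
            long_press_threshold: Duration::from_millis(500),
            drag_threshold: 3,
            chord_timeout: Duration::from_millis(1000),
            swipe_velocity_threshold: 50.0,
            click_tolerance: 1,
        }
    }
}

/// Example override: a slower double-click window, everything else default.
fn relaxed() -> ToyGestureConfig {
    ToyGestureConfig {
        multi_click_timeout: Duration::from_millis(450),
        ..ToyGestureConfig::default()
    }
}
```

Struct update syntax (`..Default::default()`) keeps an override like this short and leaves untouched fields tracking the defaults.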

The dead-zone

Terminal cells are ~9×18 px, so a hand tremor over a single physical click often jumps one row or column between press and release. Without a tolerance, the recognizer would classify a ~90%-accurate click as a 1-cell drag and suppress the Click entirely.

Setting click_tolerance to 1 resolves this: during multi-click accumulation, the new event’s position counts as “the same” as the previous click if the Manhattan distance is ≤ 1. The tolerance does not relax drag_threshold; a deliberate drag still fires after 3 cells of movement.
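
The position check can be sketched as follows. This is a minimal illustration, not the code in gesture.rs: `manhattan` and `same_spot` are hypothetical helper names, and positions are assumed to be (u16, u16) cell coordinates.

```rust
/// Manhattan distance between two cell positions (hypothetical helper).
fn manhattan(a: (u16, u16), b: (u16, u16)) -> u16 {
    a.0.abs_diff(b.0).saturating_add(a.1.abs_diff(b.1))
}

/// True when `pos` lies within `tolerance` cells of the previous click,
/// so the two clicks may be merged into a double or triple click.
fn same_spot(prev: (u16, u16), pos: (u16, u16), tolerance: u16) -> bool {
    manhattan(prev, pos) <= tolerance
}
```

With a tolerance of 1, a click one row or one column away still matches, while a diagonal neighbour (distance 2) does not.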

Semantic events

crates/ftui-core/src/semantic_event.rs


// Abbreviated listing: most field types are elided.
pub enum SemanticEvent {
    Click       { pos, button },
    DoubleClick { pos, button },
    TripleClick { pos, button },
    LongPress   { pos, duration },
 
    DragStart { pos, button },
    DragMove  { start, current, delta: (i16, i16) },
    DragEnd   { start, end },
    DragCancel,
 
    Chord { sequence: Vec<ChordKey> },        // non-empty
    Swipe { direction, distance, velocity },
}

is_drag() and is_click() helpers let widgets filter without exhaustive matching.
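
To show the filtering pattern, here is a sketch against a local stand-in enum. `Verb` and `clicks_only` are hypothetical, payloads are stripped, and the exact set of variants each helper covers is an assumption; the real helpers live on SemanticEvent.

```rust
/// Local stand-in for SemanticEvent with payloads stripped (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verb {
    Click, DoubleClick, TripleClick, LongPress,
    DragStart, DragMove, DragEnd, DragCancel,
}

impl Verb {
    /// Assumed grouping: the three click multiplicities.
    fn is_click(self) -> bool {
        matches!(self, Verb::Click | Verb::DoubleClick | Verb::TripleClick)
    }

    /// Assumed grouping: every stage of a drag, including cancellation.
    fn is_drag(self) -> bool {
        matches!(
            self,
            Verb::DragStart | Verb::DragMove | Verb::DragEnd | Verb::DragCancel
        )
    }
}

/// A widget that only cares about clicks can drop everything else up front.
fn clicks_only(events: &[Verb]) -> Vec<Verb> {
    events.iter().copied().filter(|e| e.is_click()).collect()
}
```

A button would typically use the click filter, while a scrollbar or splitter would branch on the drag filter and ignore clicks.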

Worked example

examples/gesture.rs


use std::time::{Duration, Instant};
use ftui_core::event::{Event, MouseButton, MouseEvent, MouseEventKind};
use ftui_core::gesture::{GestureConfig, GestureRecognizer};
use ftui_core::semantic_event::SemanticEvent;
 
let mut gr = GestureRecognizer::new(GestureConfig::default());
let t0 = Instant::now();
 
// Mouse-down at (10, 5)
let down = Event::Mouse(MouseEvent {
    x: 10, y: 5,
    kind: MouseEventKind::Down(MouseButton::Left),
    modifiers: Default::default(),
});
let _ = gr.process(&down, t0);
 
// Mouse-up at (10, 5) 80 ms later — emits Click.
let up = Event::Mouse(MouseEvent {
    x: 10, y: 5,
    kind: MouseEventKind::Up(MouseButton::Left),
    modifiers: Default::default(),
});
let events = gr.process(&up, t0 + Duration::from_millis(80));
assert!(matches!(events.as_slice(), [SemanticEvent::Click { .. }]));
 
// Second press-release within 300 ms promotes to DoubleClick.
let t1 = t0 + Duration::from_millis(180);
let _ = gr.process(&down, t1);
let events = gr.process(&up, t1 + Duration::from_millis(60));
assert!(matches!(events.as_slice(), [SemanticEvent::DoubleClick { .. }]));

Long-press is pump-driven

LongPress is not produced by a raw event at all: there is no “long press” byte on the wire. Instead, the runtime calls GestureRecognizer::check_long_press(now) on every tick, and the first tick after a stationary mouse-down has been held for long_press_threshold emits the event.

This is why long-press only works on a live runtime with a heartbeat; headless replays must feed synthetic tick events at the desired cadence. The runtime’s default tick is 16 ms, so a 500 ms threshold fires at most one tick (~16 ms) late.
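
The pump can be sketched with a toy timer. `LongPressTimer` and `first_fire` are hypothetical names used only for illustration; the real check is GestureRecognizer::check_long_press(now), whose return type is not shown on this page.

```rust
use std::time::{Duration, Instant};

/// Toy stand-in for the recognizer's long-press state (hypothetical type).
struct LongPressTimer {
    down_at: Option<Instant>,
    threshold: Duration,
}

impl LongPressTimer {
    fn new(threshold: Duration) -> Self {
        Self { down_at: None, threshold }
    }

    /// Record a mouse-down; the timer starts here.
    fn press(&mut self, now: Instant) {
        self.down_at = Some(now);
    }

    /// Called once per tick. Returns the held duration the first time the
    /// threshold has been reached, then disarms so it fires only once.
    fn check(&mut self, now: Instant) -> Option<Duration> {
        let held = now.saturating_duration_since(self.down_at?);
        if held >= self.threshold {
            self.down_at = None;
            Some(held)
        } else {
            None
        }
    }
}

/// Pump ticks at a fixed cadence (assumes `tick` is non-zero) and report
/// how long the button had been held when the long press fired.
fn first_fire(tick: Duration, threshold: Duration) -> Duration {
    let t0 = Instant::now();
    let mut timer = LongPressTimer::new(threshold);
    timer.press(t0);
    let mut now = t0;
    loop {
        now += tick;
        if let Some(held) = timer.check(now) {
            return held;
        }
    }
}
```

With a 16 ms tick aligned to the press and a 500 ms threshold, the tick at 496 ms is too early and the event fires on the following tick at 512 ms. Because every call takes an explicit Instant, a test can drive the same loop without sleeping.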

Invariants

Click and Drag are mutually exclusive for a single down→up pair. Once the drag threshold is crossed, the recognizer commits to the drag path and suppresses the click.
Click multiplicity is monotone: Click → DoubleClick → TripleClick within the window. No event skips or regresses.
Chord sequences are always non-empty — the type invariant on SemanticEvent::Chord { sequence } is enforced by construction.
reset() clears everything to idle. Use this on focus loss or when the application suspends.
Focus loss cancels drags. Event::Focus(false) emits DragCancel if a drag was in flight and clears the long-press timer; see gesture.rs:L228-L238.

Call reset() after regaining focus from a suspend (e.g. Ctrl-Z / fg). The recognizer already resets on Focus(false), but terminals differ wildly in whether they emit a focus event at suspend time — some do not. A defensive reset() on resume is cheap insurance against a ghost DragMove appearing 30 seconds after the user let go.

Chord sequences


use ftui_core::event::{Event, KeyCode, KeyEvent, KeyEventKind, Modifiers};
use ftui_core::gesture::{GestureConfig, GestureRecognizer};
use std::time::Instant;
 
let mut gr = GestureRecognizer::new(GestureConfig::default());
let t = Instant::now();
 
// Ctrl-K
let k = Event::Key(KeyEvent {
    code: KeyCode::Char('k'),
    modifiers: Modifiers::CTRL,
    kind: KeyEventKind::Press,
});
let _ = gr.process(&k, t);                       // buffered, no emit
 
// Ctrl-C within chord_timeout — emits the full sequence
let c = Event::Key(KeyEvent {
    code: KeyCode::Char('c'),
    modifiers: Modifiers::CTRL,
    kind: KeyEventKind::Press,
});
let events = gr.process(&c, t + std::time::Duration::from_millis(200));
// events == [SemanticEvent::Chord { sequence: [Ctrl-K, Ctrl-C] }]

An unmodified keypress between the two steps (e.g. pressing j) clears the buffer — chords are reserved for “held modifier + key” sequences.
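
The buffering rules can be sketched with a toy buffer. `Step` and `ChordBuffer` are hypothetical names, the key type is simplified to a character plus a "modifier held" flag, and emitting as soon as the buffer reaches two steps is an assumption made to keep the sketch short.

```rust
use std::time::{Duration, Instant};

/// One step of a chord, e.g. Ctrl-K (hypothetical simplified key type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Step {
    ch: char,
    modified: bool, // true when CTRL, ALT, or SUPER was held
}

/// Toy chord buffer, not the recognizer's actual internal state.
struct ChordBuffer {
    steps: Vec<Step>,
    started: Option<Instant>,
    timeout: Duration,
}

impl ChordBuffer {
    fn new(timeout: Duration) -> Self {
        Self { steps: Vec::new(), started: None, timeout }
    }

    fn clear(&mut self) {
        self.steps.clear();
        self.started = None;
    }

    /// Feed one key press. Returns the sequence once it holds two modified
    /// steps; otherwise returns None.
    fn press(&mut self, step: Step, now: Instant) -> Option<Vec<Step>> {
        // Timeout: drop a stale prefix before considering the new key.
        if let Some(t0) = self.started {
            if now.saturating_duration_since(t0) > self.timeout {
                self.clear();
            }
        }
        // An unmodified key never takes part in a chord and clears the buffer.
        if !step.modified {
            self.clear();
            return None;
        }
        if self.steps.is_empty() {
            self.started = Some(now);
        }
        self.steps.push(step);
        if self.steps.len() >= 2 {
            self.started = None;
            return Some(std::mem::take(&mut self.steps));
        }
        None
    }
}
```

In this sketch the timeout is measured from the first step, matching the "max span" wording in the timing table, and a modified key that arrives after a clear simply starts a new sequence.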

Cross-references

Events & input — where raw Events come from.
Terminal session — flipping the mouse and focus modes that feed the recognizer.
Model trait — how the runtime dispatches semantic events to update().
Frame API — where hit tests land after a Click.
Screen modes — focus-loss ergonomics for inline mode.
