One clock.
Live and replay.
FerroReplay routes every timestamp and timer in an engine through one injected Clock — a monotonic SystemClock in production, a deadline-ordered VirtualClock in replay. Live and replay run the identical code path and produce byte-identical state.
Primitives
One trait, two clocks, grid intervals
A small surface, drawn tight. The timeline is i64 nanoseconds since the UNIX epoch — never a tokio Instant — so a virtual clock owns its timeline as a first-class implementation rather than a hack.
The Clock trait
The contract every component reads time through. An async trait on an epoch-nanosecond timeline — no ambient access to the wall clock anywhere else in the codebase.
- now_ns(): non-decreasing within a process
- sleep_until(deadline_ns): past deadlines resolve immediately
- sleep(duration): relative to now
- Cancel-safe: every caller sits in a biased select!
- Send + Sync + Debug + 'static, behind Arc<dyn Clock>
SystemClock & VirtualClock
Two implementations of the one trait. The live clock anchors the epoch to a monotonic instant; the replay clock owns and advances its own timeline deterministically.
- SystemClock: one wall-clock read at construction
- Monotonic across NTP steps · no per-call syscall
- The sole sanctioned wall-clock site in any codebase
- VirtualClock: advance_to(t_ns) drives replay
- Deadline-ordered wakes · O(1) dead-waiter discard
Grid-aligned intervals
interval(clock, period) ticks on a fixed grid and returns the scheduled stamp, not the observed time — so publication cadences and heartbeats replay exactly.
- First tick at now + period
- tick() returns the grid-aligned timestamp
- Skip-not-burst: matches tokio MissedTickBehavior::Skip
- A late deadline fires once, then realigns past now
- Cancel-safe: a dropped tick re-arms the same deadline
Architecture
Inject once, replay forever
Four moves. Time is injected at the boundary, read live through the monotonic clock, re-driven through the virtual clock on replay, and held byte-identical by the determinism laws and the wallclock gate.
Inject
Every timestamp and timer flows through one Arc<dyn Clock>. No component reaches for SystemTime or Instant directly — the clock is a dependency, not an ambient.
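To make the dependency-not-ambient idea concrete, here is a minimal sketch using a simplified, synchronous stand-in for the trait. `FixedClock` and `Stamper` are hypothetical names invented for this illustration, and the crate's real `Clock` also carries the async sleep methods, which are omitted here.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified, synchronous stand-in for the crate's `Clock` trait.
/// Assumption: only `now_ns` is modelled; the async sleeps are left out.
pub trait Clock: Send + Sync + Debug + 'static {
    fn now_ns(&self) -> i64;
}

/// Hypothetical test clock that always reports the same instant.
#[derive(Debug)]
pub struct FixedClock(pub i64);

impl Clock for FixedClock {
    fn now_ns(&self) -> i64 {
        self.0
    }
}

/// Hypothetical component: it never reads time itself, only through
/// the clock handle it was constructed with.
pub struct Stamper {
    clock: Arc<dyn Clock>,
    pub last_stamp_ns: Option<i64>,
}

impl Stamper {
    pub fn new(clock: Arc<dyn Clock>) -> Self {
        Self { clock, last_stamp_ns: None }
    }

    /// Stamps an event with whatever the injected clock reports.
    pub fn stamp(&mut self) -> i64 {
        let ts = self.clock.now_ns();
        self.last_stamp_ns = Some(ts);
        ts
    }
}
```

Because `Stamper` only sees `Arc<dyn Clock>`, swapping the time source changes the constructor argument and nothing else.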
Run live
SystemClock anchors the epoch timeline to a monotonic instant once at construction, so now_ns never goes backwards across NTP steps and costs no syscall per call.
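The anchoring idea can be sketched with the standard library alone. This is an illustration of the technique under stated assumptions, not the crate's `SystemClock`: `AnchoredClock` is a hypothetical name, and the real implementation may differ in detail.

```rust
use std::time::{Instant, SystemTime, UNIX_EPOCH};

/// Illustrative stand-in for an epoch timeline anchored to a monotonic
/// instant. Hypothetical type, not the crate's `SystemClock`.
#[derive(Debug)]
pub struct AnchoredClock {
    /// Wall-clock reading in epoch nanoseconds, taken once at construction.
    epoch_anchor_ns: i64,
    /// Monotonic reading taken at the same moment.
    mono_anchor: Instant,
}

impl AnchoredClock {
    pub fn new() -> Self {
        let wall = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system time should be after the UNIX epoch");
        Self {
            epoch_anchor_ns: wall.as_nanos() as i64,
            mono_anchor: Instant::now(),
        }
    }

    /// Epoch anchor plus monotonic elapsed time. A later wall-clock step
    /// cannot move this value backwards, because the wall clock is never
    /// read again after construction.
    pub fn now_ns(&self) -> i64 {
        self.epoch_anchor_ns + self.mono_anchor.elapsed().as_nanos() as i64
    }
}
```

The trade-off is that the timeline keeps whatever offset the wall clock had at construction, which is the price of never stepping backwards.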
Replay
The replay producer swaps in a VirtualClock and advances it to each recorded event's timestamp before emitting: the same engine code path, driven on a timeline it owns.
Verify
The determinism laws — monotonicity, wakes-before-further-advance — plus the wallclock gate keep live and replay byte-identical. A regression fails CI before it ships.
Use Cases
Four workflows on one timeline
Entry points across the primitive. Each is a short snippet that takes the workflow from injected clock to deterministic result. The deeper guides live in the docs.
Inject the clock into an engine
The clock is constructed once at the composition root and threaded through the engine as Arc<dyn Clock>. Every timestamp read and every timer wait goes through it — which is precisely what makes the later swap to a virtual timeline a no-op for the rest of the code.
    use std::sync::Arc;
    use std::time::Duration;
    use ferro_replay::{Clock, SystemClock};

    // One time source for the whole engine: the only
    // component allowed to observe the wall clock.
    let clock: Arc<dyn Clock> = Arc::new(SystemClock::new());
    let started_ns = clock.now_ns();

    // …on the I/O-timeout race / retry-backoff path:
    clock.sleep(Duration::from_millis(250)).await;
Drive virtual time in a test
In a test the same engine runs on a VirtualClock. Advancing time to a deadline wakes the sleeper, which observes now equal to its own deadline — never a moment later. Advancing before a sleeper registers still wakes it, so tests are race-free without sleeps or polling.
    use std::sync::Arc;
    use ferro_replay::{Clock, VirtualClock};

    let clock = Arc::new(VirtualClock::new(1_000));
    let c = clock.clone();
    let task = tokio::spawn(async move {
        c.sleep_until(2_000).await;
        c.now_ns() // observes 2_000, its own deadline
    });

    // Race-free: advancing wakes the sleeper either way.
    clock.advance_to(2_000).await;
    assert_eq!(task.await.unwrap(), 2_000);
Tick a grid-aligned publication cadence
A 1-second publication interval ticks on the grid and hands back the scheduled stamp. When a consumer stalls, the armed deadline fires late exactly once — a prompt catch-up — then the schedule realigns to the first grid point past now. Missed ticks are skipped, never bursted, so a stall can't trigger a thundering catch-up.
    use std::time::Duration;
    use ferro_replay::interval;

    let mut publish = interval(clock.clone(), Duration::from_secs(1));
    loop {
        let tick_ns = publish.tick().await;
        // scheduled grid stamp: replay reproduces
        // tick_ns exactly, byte for byte
        publish_snapshot(tick_ns);
    }
Replay a session, byte for byte
The replay producer is the one advancer. For each recorded event it advances virtual time to the event's timestamp, then feeds it to the engine, the identical code path that ran live. Because the clock fires timers in deadline order and wakes each before time runs further, the rebuilt state is byte-identical to the run that produced the recording.
    use ferro_replay::VirtualClock;

    let clock = VirtualClock::new(session_start_ns);

    // The sole advancer. Same engine.handle() as live.
    for event in recorded_session {
        clock.advance_to(event.local_ns).await;
        engine.handle(event).await;
    }
    // State hash == the live run that recorded it.
Consumer Discipline
The wallclock gate
FerroReplay is only as deterministic as its consumers are disciplined. Every adopting repo enforces no direct time APIs outside FerroReplay — and the gate ships with the crate, as reference copies you wire into your own CI.
clippy disallowed-methods
Copy the crate's clippy.toml into your repo: it disallows SystemTime::now, Instant::now, the tokio::time timer set, and chrono::{Utc,Local}::now, with CI running cargo clippy -- -D warnings.
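For orientation, a clippy.toml along these lines would express that policy. This is a sketch, not the file shipped with the crate: the exact tokio::time entries and the reason strings are assumptions, so copy the crate's own file for the authoritative list.

```toml
# clippy.toml (illustrative sketch; the crate's reference copy is authoritative)
disallowed-methods = [
    { path = "std::time::SystemTime::now", reason = "read time through the injected Clock" },
    { path = "std::time::Instant::now", reason = "read time through the injected Clock" },
    # Assumed members of the tokio::time timer set:
    { path = "tokio::time::sleep", reason = "use Clock::sleep" },
    { path = "tokio::time::sleep_until", reason = "use Clock::sleep_until" },
    { path = "tokio::time::interval", reason = "use the crate's interval(clock, period)" },
    { path = "tokio::time::timeout", reason = "race against Clock::sleep instead" },
    { path = "chrono::Utc::now", reason = "read time through the injected Clock" },
    { path = "chrono::Local::now", reason = "read time through the injected Clock" },
]
```

With `-D warnings`, any hit from the `clippy::disallowed_methods` lint becomes a hard CI failure rather than a warning.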
textual gate
Wire scripts/check-wallclock.sh as a CI job. In a consumer the ALLOWED list is empty — the crate owns the sanctioned sites, so any match in your src/ is a bug. Convention: never write the gated tokens even in comments.
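The logic of a textual gate is simple enough to sketch in a few lines. The version below is written in Rust purely for illustration; it is not the shipped shell script, and `GATED_TOKENS` is an assumed token list. It matches comments too, which is why the convention above forbids writing the tokens anywhere.

```rust
/// Assumed token list for illustration; the script shipped with the
/// crate defines the real one.
const GATED_TOKENS: [&str; 4] = [
    "SystemTime::now",
    "Instant::now",
    "Utc::now",
    "Local::now",
];

/// Returns `(1-based line number, token)` for every gated token that
/// appears in `source`. The scan is purely textual, so a token inside
/// a comment or string literal counts as a hit.
pub fn wallclock_hits(source: &str) -> Vec<(usize, &'static str)> {
    let mut hits = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        for &token in GATED_TOKENS.iter() {
            if line.contains(token) {
                hits.push((idx + 1, token));
            }
        }
    }
    hits
}
```

A CI job built on this idea fails whenever the hit list for a consumer's src/ is non-empty.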
drop chrono's clock feature
Build chrono without the clock feature so Utc::now doesn't even exist to call. The cheapest gate is the one the compiler enforces.
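In Cargo.toml this usually means switching off chrono's default features, since `clock` is part of the default set. A minimal sketch, assuming chrono 0.4 and that `std` is the only feature you need back:

```toml
[dependencies]
# `clock` is enabled by default, so defaults are turned off and only
# the features actually required are added back.
chrono = { version = "0.4", default-features = false, features = ["std"] }
```

Cargo unifies features across the dependency graph, so this only holds if no other crate in the build enables chrono's `clock` feature.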
a worked reference
gamma-engine is the reference adopter — its .gitlab-ci.yml runs the wallclock-gate job over a real engine. Start from it rather than from scratch.
Guarantees
The determinism laws are load-bearing
Every consumer's replay correctness rests on a handful of contracts, pinned by the unit suite that travelled with the code. Breaking any of them is a major version bump.
Monotonicity
now_ns never goes backwards within a process. SystemClock anchors to a monotonic instant at construction, so an NTP step can't rewind the timeline; a virtual clock advances explicitly and only forward.
Wakes before advance
advance_to pops due waiters in (deadline, seq) order, sets now to each waiter's deadline, wakes it, and yields so the woken task runs while now equals its deadline. Time never runs ahead of the timers it fires.
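The ordering rule can be modelled without an async runtime. The sketch below is a synchronous toy, not the crate's `VirtualClock`: `ToyTimeline` is a hypothetical type, and a "wake" is modelled as recording the value of now that the waiter would observe.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Synchronous toy model of a virtual timeline (hypothetical type).
pub struct ToyTimeline {
    now_ns: i64,
    next_seq: u64,
    /// Min-heap on (deadline, seq, name): earliest deadline first,
    /// ties broken by registration order.
    waiters: BinaryHeap<Reverse<(i64, u64, &'static str)>>,
}

impl ToyTimeline {
    pub fn new(start_ns: i64) -> Self {
        Self { now_ns: start_ns, next_seq: 0, waiters: BinaryHeap::new() }
    }

    pub fn now_ns(&self) -> i64 {
        self.now_ns
    }

    /// Registers a named waiter for `deadline_ns`.
    pub fn sleep_until(&mut self, deadline_ns: i64, name: &'static str) {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.waiters.push(Reverse((deadline_ns, seq, name)));
    }

    /// Pops every due waiter in (deadline, seq) order. Before each wake,
    /// `now` is set to that waiter's deadline (never moving backwards);
    /// only after all due waiters have fired does `now` reach `target_ns`.
    /// Returns `(name, now observed at wake)` in wake order.
    pub fn advance_to(&mut self, target_ns: i64) -> Vec<(&'static str, i64)> {
        let mut woken = Vec::new();
        while self.waiters.peek().map_or(false, |top| (top.0).0 <= target_ns) {
            if let Some(Reverse((deadline_ns, _seq, name))) = self.waiters.pop() {
                self.now_ns = self.now_ns.max(deadline_ns);
                woken.push((name, self.now_ns));
            }
        }
        self.now_ns = self.now_ns.max(target_ns);
        woken
    }
}
```

Registering waiters out of order and advancing past several of them yields wakes sorted by deadline, with same-deadline waiters in registration order, and each waiter sees now equal to its own deadline.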
Scheduled, not observed
interval ticks carry the grid-aligned scheduled stamp, not the wall-clock moment of firing — so 1-second publications and heartbeat ts values replay exactly, with missed ticks skipped rather than bursted.
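The skip-not-burst realignment is plain integer arithmetic on the grid. Here is a minimal sketch of that rule, assuming the next deadline is computed at the moment a tick fires; `next_deadline` is a hypothetical helper, not part of the crate's API.

```rust
/// Given the scheduled deadline that just fired, the clock reading at
/// fire time, and the period, returns the next deadline on the grid.
/// Hypothetical helper for illustration.
pub fn next_deadline(fired_ns: i64, now_ns: i64, period_ns: i64) -> i64 {
    assert!(period_ns > 0, "period must be positive");
    let next = fired_ns + period_ns;
    if next > now_ns {
        // On time: the following grid point is still in the future.
        next
    } else {
        // Late: skip every missed grid point and land on the first one
        // strictly after `now_ns`, staying on the original grid.
        let whole_periods_elapsed = (now_ns - fired_ns) / period_ns;
        fired_ns + (whole_periods_elapsed + 1) * period_ns
    }
}
```

For example, with a 1 000 ns period, a tick scheduled for 2 000 that fires at 5 500 is followed by a deadline of 6 000: the grid points 3 000, 4 000 and 5 000 are skipped rather than fired in a burst.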
O(1) dead-waiter discard
A cancelled sleep — a dropped receiver from a select-cancelled interval tick — is discarded on pop without a wake or yield. Dead waiters cost O(1) each, not a scheduler round-trip, even when they accumulate one-per-event on a busy replay.
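A rough model of the discard, using std channels in place of whatever channel type the crate uses internally (an assumption): a sleeper cancels by dropping its receiver, and the pop path notices because the send fails. `Waiter`, `register`, and `fire_due` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical waiter record: a deadline plus the sending half of a
/// channel whose receiver is held by the sleeping task.
pub struct Waiter {
    deadline_ns: i64,
    wake: Sender<i64>,
}

/// Registers a waiter and hands back the receiver the sleeper holds.
/// Dropping that receiver models a cancelled sleep.
pub fn register(queue: &mut Vec<Waiter>, deadline_ns: i64) -> Receiver<i64> {
    let (tx, rx) = channel();
    queue.push(Waiter { deadline_ns, wake: tx });
    rx
}

/// Fires every waiter due at or before `target_ns` and returns how many
/// were still alive. Only live waiters would cost a yield in an async
/// runtime; a waiter whose receiver is gone is dropped on the spot.
pub fn fire_due(queue: &mut Vec<Waiter>, target_ns: i64) -> usize {
    let mut live_wakes = 0;
    let mut pending = Vec::new();
    for waiter in queue.drain(..) {
        if waiter.deadline_ns > target_ns {
            pending.push(waiter); // not due yet, keep it queued
        } else if waiter.wake.send(waiter.deadline_ns).is_ok() {
            live_wakes += 1; // receiver still alive: a real wake
        }
        // A failed send means the receiver was dropped: discard silently.
    }
    *queue = pending;
    live_wakes
}
```

The point of the design is that cancelled waiters are filtered out by a constant-time check at pop time instead of being tracked and removed eagerly when the sleep is dropped.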
One advancer, current-thread
Strict ordering holds on a current-thread runtime; the multi-thread scheduler would reintroduce wake-order races. In production exactly one component — the replay producer — calls advance_to.
Pinned by the suite
The unit suite travelled verbatim with the code from gamma-engine — deadline-ordering, race-free pre-registration, cancel-safety, and skip-not-burst are all asserted, so a regression fails CI before it can ship.
Code Quality
Small surface, no surprises
A determinism primitive earns trust by being legible. FerroReplay is a tiny, dependency-lean crate with no unsafe in its own code and a discipline it enforces on itself first.
Lean by design
Two runtime dependencies — async-trait for the trait and tokio (time, sync, rt) for timers and channels. No hidden transitive graph, no math DSL, nothing to audit that isn't on the surface.
Two sanctioned sites
The only wall-clock reads in the crate live in SystemClock, behind two targeted #[allow(clippy::disallowed_methods)] blocks. The crate passes its own gate — the discipline starts at home.
Pinned by git tag
Consumed by semver tag, never by path. Consumers pin a tag and upgrade deliberately — the ferro-risk / ferro-wave precedent — so a FerroReplay change never silently alters a downstream engine's replay behavior.
API Surface
The whole trait fits on one screen
A taste below — the guides walk each workflow end to end, and the source is small enough to read in a sitting.
One async trait on an epoch-nanosecond timeline, two implementations behind it
Components depend on Arc<dyn Clock> and read time through now_ns / sleep_until / sleep. SystemClock serves live; VirtualClock serves replay and tests. Swapping one for the other is the only change the rest of the code ever sees.
interval(clock, period) builds a ClockInterval whose tick() returns the grid-aligned scheduled stamp — the same value live and on replay.
    use std::fmt::Debug;
    use std::time::Duration;
    use async_trait::async_trait;

    // the contract every component reads time through
    #[async_trait]
    pub trait Clock: Send + Sync + Debug + 'static {
        fn now_ns(&self) -> i64;
        async fn sleep_until(&self, deadline_ns: i64);
        async fn sleep(&self, dur: Duration) {
            // default: relative to now_ns()
        }
    }
Talk to us
Reach out about adopting the wallclock gate in your engine, wiring deterministic replay, or driving virtual time in your test suite.
hello@morphiqlabs.com
Tell us about your setup
- Runtime: tokio current-thread vs multi-thread, and where timers live today
- Replay goals: deterministic tests, session replay, state-rebuild on restart
- Existing time APIs: how many direct SystemTime/Instant sites the gate has to retire
- Platform layer: standalone engine, or building on Anvil