Testing a fix
You have a diagnosed failure and an idea for a fix — a new prompt, a tool-schema correction, a policy update, a code change. Replay & repair lets you prove the fix against the exact trace that broke, without shipping anything.
The loop
- Reproduce. Lumni replays the bundle unmodified and confirms the original failure reproduces.
- Apply the candidate fix. Point replay at your changed prompt / schema / policy / agent version.
- Re-run in a sandbox. The failing trace runs again with the change, using the bundle’s captured tool responses so it’s safe and deterministic.
- Compare. Lumni diffs the new outcome against the failure — did the detector stop firing? Did the run reach the intended outcome?
- Check for collateral damage. The same fix is run against other traces in the failure’s cluster, so a fix that resolves one case but breaks a sibling is caught here, not in production.
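The steps above can be sketched in a few lines of Python. This is an illustrative model only, not Lumni's API: `Bundle`, `replay`, `check_fix`, the two toy agents, and the detector are all hypothetical names invented for this example. It shows the core idea that replay serves tool calls from the bundle's captured responses, so the original and the candidate run against identical inputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a captured trace bundle (not Lumni's real format).
@dataclass(frozen=True)
class Bundle:
    trace_id: str
    user_input: str
    tool_responses: Dict[str, str]  # tool name -> response captured at failure time

ToolCall = Callable[[str], str]
Agent = Callable[[str, ToolCall], str]   # (input, tool lookup) -> final output
Detector = Callable[[str], bool]         # True when the failure is detected


def replay(bundle: Bundle, agent: Agent) -> str:
    """Run an agent against captured tool responses only; no live calls are made."""
    def call_tool(name: str) -> str:
        return bundle.tool_responses[name]
    return agent(bundle.user_input, call_tool)


def check_fix(bundle: Bundle, siblings: List[Bundle], original: Agent,
              candidate: Agent, detector: Detector) -> dict:
    # Reproduce: the unmodified agent must still trip the detector.
    if not detector(replay(bundle, original)):
        return {"reproduced": False, "detector_cleared": None, "sibling_failures": []}
    # Apply the candidate, re-run, and compare against the failure.
    detector_cleared = not detector(replay(bundle, candidate))
    # Collateral damage: run the same candidate over the rest of the cluster.
    sibling_failures = [s.trace_id for s in siblings
                        if detector(replay(s, candidate))]
    return {"reproduced": True, "detector_cleared": detector_cleared,
            "sibling_failures": sibling_failures}


# Toy agents: the original ignores the tool response, the candidate uses it.
def original_agent(user_input: str, call_tool: ToolCall) -> str:
    return "I could not find your order."

def candidate_agent(user_input: str, call_tool: ToolCall) -> str:
    return "Your order status is: " + call_tool("lookup_order")

def detector(output: str) -> bool:
    return "could not find" in output


failing = Bundle("trace-1", "Where is my order?", {"lookup_order": "shipped"})
sibling = Bundle("trace-2", "Any update on my order?", {"lookup_order": "delayed"})
result = check_fix(failing, [sibling], original_agent, candidate_agent, detector)
print(result)
```

Because every tool call is answered from the bundle, running the sketch twice gives the same result, which is what makes the before/after comparison meaningful.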
Reading the result
Replay returns one of three verdicts (Fix Verified, Fix Regresses, or Uncertain) along with the side-by-side comparison and the evidence behind the call. The verdict, and the decision to ship, is recorded in the evidence ledger.
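As a rough illustration of how a team might act on a verdict, the sketch below maps each of the three labels to a shipping action. Only the verdict names come from this page; the `Verdict` enum, the `ship_decision` function, and the policy itself (ship on Fix Verified, block on Fix Regresses, send Uncertain to a human) are assumptions for the example, not behavior Lumni enforces.

```python
from enum import Enum


class Verdict(Enum):
    # Labels as they appear in the replay result.
    FIX_VERIFIED = "Fix Verified"
    FIX_REGRESSES = "Fix Regresses"
    UNCERTAIN = "Uncertain"


def ship_decision(verdict_label: str) -> str:
    """Map a verdict label to an action under one hypothetical team policy."""
    verdict = Verdict(verdict_label)  # raises ValueError for an unknown label
    if verdict is Verdict.FIX_VERIFIED:
        return "ship"
    if verdict is Verdict.FIX_REGRESSES:
        return "block"
    # Uncertain: neither confirmed nor refuted, so defer to a human reviewer.
    return "needs-review"


for label in ("Fix Verified", "Fix Regresses", "Uncertain"):
    print(label, "->", ship_decision(label))
```

Treating Uncertain as a review step rather than a pass or fail is a conservative choice; a team with different risk tolerance could map it differently.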