Replay

Backup replay — re-apply the erasures a database restore resurrected.

Restoring a backup brings back every subject erased after the backup point. The append-only audit trail records which erasures were committed in that window; this package derives a ReplayPlan from a surviving copy of the trail (the restored database’s own trail lost the window) and re-runs each erasure through the existing ErasurePlanner — a mechanism for converging after a restore, never a determination that anything is compliant. Semantics are pinned in ADR 0023.

class Replayer:
def __init__(planner: ErasurePlanner, audit_sink: AuditSink, *, refs_for: Callable[[str], tuple[SubjectRef, ...]] | None = None) -> None

Replays the erasures a backup restore resurrected (ADR 0023).

A restore brings back every subject whose erasure was committed after the backup point. The surviving audit trail says exactly which those are; plan classifies it and replay re-runs the wired ErasurePlanner per subject — no second erasure engine, so ADR 0007/0008/0009 semantics apply verbatim and each replayed erasure appends its full audit sequence.

Replay is a mechanism for converging after a restore, never a determination that the restore — or the deployment — is compliant.

def plan(events: Sequence[AuditEvent], *, backup_taken_at: datetime) -> ReplayPlan

Classify a surviving trail against the backup point.

Delegates to ReplayPlan.derive, a pure function; see there for the classification rules.

Args:

  • events (Sequence[AuditEvent]): The surviving trail (external sink, replica, or pre-restore dump; see ReplaySource).
  • backup_taken_at (datetime): When the restored backup was taken (timezone-aware; the boundary is inclusive).

Returns:

  • ReplayPlan — The plan — inspect it before executing.

Raises:

  • ConfigurationError — If backup_taken_at is timezone-naive.
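The timezone requirement on backup_taken_at can be illustrated with a small guard. This is a sketch only: require_aware is a hypothetical helper, and the ConfigurationError class here is a local stand-in, not the package's own definition. It relies on the standard-library rule that a datetime is aware only when tzinfo is set and utcoffset() returns a value.

```python
from datetime import datetime, timezone


class ConfigurationError(ValueError):
    """Local stand-in for the package's ConfigurationError (assumption)."""


def require_aware(value: datetime, name: str) -> datetime:
    """Reject a timezone-naive datetime instead of guessing its zone."""
    # Aware means: tzinfo is set AND utcoffset() yields an actual offset.
    if value.tzinfo is None or value.utcoffset() is None:
        raise ConfigurationError(f"{name} must be timezone-aware")
    return value


# An aware cutoff passes through unchanged.
aware = require_aware(
    datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc), "backup_taken_at"
)

# A naive cutoff is refused rather than silently compared against UTC events.
try:
    require_aware(datetime(2024, 5, 1, 12, 0), "backup_taken_at")
    naive_rejected = False
except ConfigurationError:
    naive_rejected = True
```

Refusing the naive value up front keeps the replay window from shifting by the local UTC offset without anyone noticing.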
def replay(session: Session, plan: ReplayPlan) -> tuple[ErasureResult, ...]

Re-apply every replayable erasure in the plan.

Per entry, in plan order: one additive ERASURE_REPLAYED event is appended before any mutation (ADR 0015’s ordering rule — if the sink is down, nothing changes), then the planner’s erase_subject re-runs the erasure, appending its full ADR 0009 sequence. Subjects listed under indeterminate or failed_only are never executed.

Runs in the caller’s open session and never commits (ADR 0006). Fail-fast: the first failure re-raises and later entries are not started — erase_subject’s contract forbids committing after it raises, so continuing in the same session would be unsound. The caller rolls back; independently committed audit events persist (duplicates possible, missing never), and re-running the replay converges — a replay of a replay is a no-op success.
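The ordering and fail-fast behaviour described above can be sketched with toy stand-ins. Nothing here is the package's API: ListSink, SinkDown, replay_sketch and the erase callback are hypothetical, sessions and commits are left out, and event types are plain strings for brevity.

```python
class SinkDown(RuntimeError):
    """Hypothetical error raised when the audit sink cannot accept a write."""


class ListSink:
    """Toy audit sink that records (event_type, subject_id) pairs in memory."""

    def __init__(self, up=True):
        self.up = up
        self.events = []

    def append(self, event_type, subject_id):
        if not self.up:
            raise SinkDown("audit sink unavailable")
        self.events.append((event_type, subject_id))


def replay_sketch(subject_ids, sink, erase_subject):
    """Audit first, then mutate; stop at the first failure."""
    results = []
    for subject_id in subject_ids:  # plan order
        # The audit append comes first, so a dead sink raises before any mutation.
        sink.append("ERASURE_REPLAYED", subject_id)
        # Any exception here propagates: later entries are never started.
        results.append(erase_subject(subject_id))
    return tuple(results)


erased = []


def erase(subject_id):
    """Toy erasure that fails for one subject."""
    if subject_id == "bob":
        raise RuntimeError("step failed")
    erased.append(subject_id)
    return subject_id


sink = ListSink()
try:
    replay_sketch(["alice", "bob", "carol"], sink, erase)
except RuntimeError:
    pass  # the caller would roll back its session here

down = ListSink(up=False)
touched = []
try:
    replay_sketch(["alice"], down, touched.append)
except SinkDown:
    pass  # nothing was mutated because the audit append failed first
```

In this sketch the failure on "bob" leaves "carol" untouched, while the audit entries for "alice" and "bob" remain in the sink, which mirrors the "duplicates possible, missing never" property.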

Args:

  • session (Session): An open database session; commit or roll back the whole replay as one unit.
  • plan (ReplayPlan): The derived plan to execute.

Returns:

  • tuple[ErasureResult, ...]: One ErasureResult per replayed subject, in plan order.

Raises:

  • ConfigurationError — If the planner is not wired for execution.
  • ResolverError — If refs_for returns a ref whose kind matches no registered resolver.
class ReplayPlan(BaseModel):
backup_taken_at: datetime
entries: tuple[ReplayPlanEntry, ...] = ()
failed_only: tuple[str, ...] = ()
indeterminate: tuple[str, ...] = ()

What one surviving trail says must be replayed after a restore.

Derived purely from audit events (ADR 0023): same events in, equal plan out — no clock, no database. Subjects whose post-backup window shows a committed local erasure are replayable; everything the trail cannot settle is surfaced, never guessed, in the same counted-never-guessed posture as the retention sweep (ADR 0012).

Fields:

  • backup_taken_at (datetime): The cutoff instant the plan was derived against (timezone-aware; the boundary is inclusive).
  • entries (tuple[ReplayPlanEntry, ...]): Replayable subjects, ordered by (last_completed_at, subject_id) for deterministic execution.
  • failed_only (tuple[str, ...]): Subjects whose post-cutoff attempts all failed — those erasures rolled back, so the restore resurrected nothing of them. Listed for completeness, never executed.
  • indeterminate (tuple[str, ...]): Subjects with an interrupted post-cutoff attempt (ERASURE_REQUESTED with no terminal event) — the trail does not show whether anything was committed. Operator’s call.
def derive(events: Sequence[AuditEvent], *, backup_taken_at: datetime) -> ReplayPlan

Classify a surviving trail against a backup point.

A pure function: no I/O, no clock, and any ordering of the same events yields an equal plan. Per subject, looking only at events with occurred_at >= backup_taken_at (inclusive — whether a commit at the backup instant made the backup is unknowable, and over-replay is a convergent no-op): any ERASURE_LOCAL_COMPLETED makes the subject replayable; otherwise any ERASURE_STEP_FAILED lists it under failed_only; otherwise any ERASURE_REQUESTED lists it under indeterminate. All other event types are ignored.
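The classification rules can be sketched as a standalone function. This is an illustration under assumptions, not the package's implementation: Event is a stand-in for AuditEvent with only the fields the rules need, event types are plain strings, classify is a hypothetical name, and breaking ties between simultaneous completions by event id is a choice made here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Event:
    """Stand-in for AuditEvent (assumed field names)."""
    event_id: UUID
    subject_ref: str
    event_type: str
    occurred_at: datetime


def classify(events, backup_taken_at):
    """Return (replayable, failed_only, indeterminate) for the post-backup window."""
    if backup_taken_at.tzinfo is None or backup_taken_at.utcoffset() is None:
        raise ValueError("backup_taken_at must be timezone-aware")

    # Keep only events at or after the cutoff (inclusive), grouped by subject.
    by_subject = {}
    for ev in events:
        if ev.occurred_at >= backup_taken_at:
            by_subject.setdefault(ev.subject_ref, []).append(ev)

    replayable, failed_only, indeterminate = [], [], []
    for subject, evs in by_subject.items():
        types = {e.event_type for e in evs}
        done = [e for e in evs if e.event_type == "ERASURE_LOCAL_COMPLETED"]
        if done:
            # Cite the latest completion as evidence; tie-break is an assumption.
            latest = max(done, key=lambda e: (e.occurred_at, str(e.event_id)))
            replayable.append(
                (latest.occurred_at, subject, len(done), latest.event_id)
            )
        elif "ERASURE_STEP_FAILED" in types:
            failed_only.append(subject)
        elif "ERASURE_REQUESTED" in types:
            indeterminate.append(subject)
        # Subjects with only other event types are ignored.

    # Sorting makes the result independent of the input ordering.
    return sorted(replayable), sorted(failed_only), sorted(indeterminate)


cutoff = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)


def ev(subject, kind, minutes):
    return Event(uuid4(), subject, kind, cutoff + timedelta(minutes=minutes))


trail = [
    ev("alice", "ERASURE_REQUESTED", 0),
    ev("alice", "ERASURE_LOCAL_COMPLETED", 0),  # exactly at the cutoff: included
    ev("bob", "ERASURE_REQUESTED", 5),
    ev("bob", "ERASURE_STEP_FAILED", 6),
    ev("carol", "ERASURE_REQUESTED", 7),        # no terminal event
    ev("dave", "ERASURE_LOCAL_COMPLETED", -1),  # before the cutoff: ignored
]
replayable, failed_only, indeterminate = classify(trail, cutoff)
```

Here "alice" is replayable, "bob" lands in failed_only, "carol" in indeterminate, and "dave" does not appear at all; feeding the same events in reverse order gives an equal result.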

Args:

  • events (Sequence[AuditEvent]): The surviving trail — from an external sink, a replica, or a pre-restore dump. The restored database’s own trail lost the post-backup window and cannot serve here.
  • backup_taken_at (datetime): When the restored backup was taken. Must be timezone-aware; the trail’s timestamps are UTC.

Returns:

  • ReplayPlan — The plan: replayable entries plus the surfaced remainder.

Raises:

  • ConfigurationError — If backup_taken_at is timezone-naive — comparing it against the trail’s UTC timestamps would be a silent lie.
class ReplayPlanEntry(BaseModel):
completions: int = Field(ge=1)
last_completed_at: datetime
source_event_id: UUID
subject_id: str = Field(min_length=1)

One subject whose committed erasure the restore resurrected.

The entry is evidence, not just a work item: it counts the qualifying ERASURE_LOCAL_COMPLETED events and cites the latest one, so the decision to replay is traceable back to the surviving trail.

Fields:

  • completions (int): How many qualifying completions the window holds.
  • last_completed_at (datetime): When the latest qualifying completion occurred.
  • source_event_id (UUID): The latest qualifying completion event — the evidence the replay decision rests on.
  • subject_id (str): The subject identifier (the events’ subject_ref).

Protocol — implement these members in your own class; do not subclass.

class ReplaySource(Protocol):
...

Anything that can read the whole trail from an instant onward.

ReplayPlan.derive consumes plain event sequences, so any surviving record works; this protocol is the convenience shape for the common case of pointing at a database that still holds the post-backup window.

It is deliberately not part of AuditSink: adding a required method there would break every existing custom sink’s isinstance check. This is a standalone capability (the RectifyingResolver pattern, looser still — a dump-file loader can be a replay source without being a sink at all). DatabaseAuditSink implements it.
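The isinstance concern can be reproduced with typing.Protocol and runtime_checkable from the standard library. The protocols and classes below are hypothetical stand-ins (the real AuditSink members are not shown in this page; write is an assumed name), meant only to show why a separate protocol leaves existing sinks valid.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SinkV1(Protocol):
    """Stand-in for the existing sink protocol (assumed single method)."""
    def write(self, event: object) -> None: ...


@runtime_checkable
class SinkV2(Protocol):
    """The same protocol after adding a required read method."""
    def write(self, event: object) -> None: ...
    def read_since(self, since: object) -> list: ...


@runtime_checkable
class Source(Protocol):
    """Standalone read capability, kept apart from the sink protocol."""
    def read_since(self, since: object) -> list: ...


class LegacySink:
    """A custom sink written before any read method existed."""
    def write(self, event: object) -> None:
        pass


class DumpLoader:
    """A dump-file reader: a source, but not a sink."""
    def read_since(self, since: object) -> list:
        return []


still_a_sink = isinstance(LegacySink(), SinkV1)      # True
broken_by_v2 = isinstance(LegacySink(), SinkV2)      # False: read_since is missing
loader_is_source = isinstance(DumpLoader(), Source)  # True, no subclassing needed
loader_is_sink = isinstance(DumpLoader(), SinkV1)    # False
```

A runtime-checkable protocol only tests that the named members are present, so widening the sink protocol would fail every class that lacks the new method, while a separate protocol does not affect them.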

def read_since(since: datetime) -> Sequence[AuditEvent]

Read every subject’s events from since onward, oldest first.

The boundary is inclusive (occurred_at >= since), matching the replay window rule (ADR 0023); ordering ties in occurred_at resolve by event_id so repeated reads agree. since must be timezone-aware — implementations reject a naive bound rather than let it silently shift the window.
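A minimal in-memory source shows the contract in a few lines. It is a sketch under assumptions: Event is a stand-in for AuditEvent, InMemorySource is hypothetical, ValueError stands in for whatever error a real implementation raises, and comparing event_id values as UUIDs is one possible tie-break (the actual ordering of ids is not specified here).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from uuid import UUID


@dataclass(frozen=True)
class Event:
    """Stand-in for AuditEvent (assumed field names)."""
    event_id: UUID
    subject_ref: str
    occurred_at: datetime


class InMemorySource:
    """Toy replay source backed by a list of events."""

    def __init__(self, events):
        self._events = list(events)

    def read_since(self, since):
        # Reject a naive bound rather than let it shift the window.
        if since.tzinfo is None or since.utcoffset() is None:
            raise ValueError("since must be timezone-aware")
        # Inclusive lower bound, across all subjects.
        kept = [e for e in self._events if e.occurred_at >= since]
        # Oldest first; ties in occurred_at resolve by event_id.
        return sorted(kept, key=lambda e: (e.occurred_at, e.event_id))


t0 = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
a = Event(UUID(int=2), "alice", t0)
b = Event(UUID(int=1), "bob", t0)                           # same instant as a
c = Event(UUID(int=3), "carol", t0 - timedelta(seconds=1))  # before the bound
d = Event(UUID(int=4), "dave", t0 + timedelta(minutes=1))

source = InMemorySource([d, a, c, b])
window = source.read_since(t0)
```

With these inputs the window holds bob, alice, dave in that order: carol falls before the bound, and the two events at t0 are ordered by their ids, so repeated reads return the same sequence.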

Args:

  • since (datetime): The instant to read from, inclusive (timezone-aware).

Returns:

  • Sequence[AuditEvent] — All events at or after since, across all subjects.