Erasure
Art. 17 erasure — atomic locally, saga-driven externally.
ErasurePlan
class ErasurePlan(BaseModel):
    external_steps: tuple[ErasureStep, ...]
    local_steps: tuple[ErasureStep, ...]
    refs: tuple[SubjectRef, ...] = ()
    steps: tuple[ErasureStep, ...] = ()
    subject_id: ValidatedSubjectId

The full programme for erasing one subject.
Local steps run inside one atomic transaction in FK-safe order; external steps are enqueued durably and fanned out afterwards. Plans are inspectable so callers (and tests) can assert exactly what an erasure will touch before anything happens.
Fields:
- external_steps (tuple[ErasureStep, ...]): Steps that run through the saga/outbox after commit.
- local_steps (tuple[ErasureStep, ...]): Steps that run inside the local database transaction.
- refs (tuple[SubjectRef, ...]): The external-system references the erasure will hand to resolvers, recorded for inspectability; the executor matches them to resolvers at execution time.
- steps (tuple[ErasureStep, ...]): All steps in execution order (local first, then external).
- subject_id (ValidatedSubjectId): The subject being erased, either a single-column str or a composite CompositeSubjectId, echoed back from the call.
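The inspectability described above can be illustrated with a minimal sketch. It uses stand-in dataclasses rather than the library's pydantic models; the names Step, Plan, and touched_targets, the table names, the "crm" resolver, and the strategy given to the external step are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


# Stand-in types for illustration only; the real ErasureStep and ErasurePlan
# are pydantic models with more fields and validation.
@dataclass(frozen=True)
class Step:
    target: str
    strategy: str          # "delete", "anonymize", or "retain"
    external: bool = False


@dataclass(frozen=True)
class Plan:
    subject_id: str
    local_steps: tuple[Step, ...]
    external_steps: tuple[Step, ...]

    @property
    def steps(self) -> tuple[Step, ...]:
        # Execution order: local first, then external.
        return self.local_steps + self.external_steps


def touched_targets(plan: Plan) -> list[str]:
    """List every table or resolver the plan would touch, in order."""
    return [step.target for step in plan.steps]


# Hypothetical plan: two local tables and one external resolver.
plan = Plan(
    subject_id="user-123",
    local_steps=(Step("orders", "delete"), Step("users", "anonymize")),
    external_steps=(Step("crm", "delete", external=True),),
)
```

A test could then assert on touched_targets(plan) before any erasure runs, which is the kind of check an inspectable plan is meant to allow.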
ErasurePlanner
class ErasurePlanner:
    def __init__(
        data_map: DataMap,
        graph: SubjectGraph,
        registry: ResolverRegistry | None = None,
        *,
        executor: StepExecutor | None = None,
        outbox: Outbox | None = None,
        audit_sink: AuditSink | None = None,
    ) -> None

Plans and executes subject erasure across local and external data.
The local deletion runs in one atomic transaction in FK-safe order, honouring per-field strategies (delete / anonymize / retain). External calls cannot join that transaction, so they are enqueued durably in the same transaction and fanned out afterwards by the saga runner — the system is always in a known, recorded state, even on partial failure.
The row-level semantics (when a whole row is deleted versus anonymized
in place) are defined in ADR 0007; ref→resolver routing in ADR 0008;
the execution and audit semantics of erase_subject in ADR
0009. Changing any of them changes what gets deleted and is MAJOR
under widened SemVer.
ErasurePlanner.erase_subject
def erase_subject(session: Session, subject_id: SubjectIdentifier, *, refs: tuple[SubjectRef, ...] = ()) -> ErasureResult

Execute the plan: atomic local phase + durable external enqueue.
Local steps run in FK-safe order and the external steps’ outbox entries are written through the same session, so the caller’s commit makes the whole erasure durable at once — and a rollback undoes every row change and every outbox entry together. This method never commits or rolls back the session itself; after it raises, do not commit the session.
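The commit-together, roll-back-together behaviour can be sketched with the standard-library sqlite3 module. This is not the library's code: the users and outbox tables, the erase_locally helper, and the "crm" resolver name are assumptions made for the example. It only shows why writing outbox entries through the caller's transaction keeps row changes and enqueued work in step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE outbox (resolver TEXT, subject_id TEXT)")
conn.execute("INSERT INTO users VALUES ('user-123', 'a@example.com')")
conn.commit()


def erase_locally(conn, subject_id, resolvers, fail=False):
    """Delete the subject's row and enqueue outbox entries; never commit."""
    conn.execute("DELETE FROM users WHERE id = ?", (subject_id,))
    for name in resolvers:
        conn.execute("INSERT INTO outbox VALUES (?, ?)", (name, subject_id))
    if fail:
        raise RuntimeError("simulated step failure")


def count(conn, table):
    """Count rows in one of the two fixed tables used by this sketch."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Failure path: the caller rolls back, undoing the row change and the
# outbox entry together.
try:
    erase_locally(conn, "user-123", ["crm"], fail=True)
    conn.commit()
except RuntimeError:
    conn.rollback()
after_rollback = (count(conn, "users"), count(conn, "outbox"))

# Success path: a single commit makes both durable at once.
erase_locally(conn, "user-123", ["crm"])
conn.commit()
after_commit = (count(conn, "users"), count(conn, "outbox"))
```

In both paths the helper leaves the commit decision to the caller, mirroring the rule that erase_subject never commits or rolls back the session itself.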
Audit semantics (ADR 0009): ERASURE_REQUESTED is appended
before the first step, one ERASURE_STEP_SUCCEEDED after each
local step (RETAIN included — the retention decision is the
record; the append is part of the step, so a step whose outcome
cannot be recorded audits as failed), ERASURE_STEP_FAILED on
the first failure (then the original exception re-raises), and
ERASURE_LOCAL_COMPLETED last. With the default
DatabaseAuditSink each event commits
independently of the caller’s transaction, so the attempt stays
recorded even when the erasure rolls back. Validation failures
raise before any event — a malformed call never became a
data-subject request.
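The event ordering above can be traced with a small simulation. This is a sketch, not the ADR 0009 implementation: run_local_phase, the in-memory list standing in for an audit sink, and the "EVENT:table" label format are hypothetical. The list lives outside any transaction, which loosely mirrors a sink that commits independently of the caller.

```python
from typing import Callable


def run_local_phase(
    steps: list[tuple[str, Callable[[], None]]],
    audit: list[str],
) -> None:
    """Run steps in order, appending audit events in the documented sequence."""
    audit.append("ERASURE_REQUESTED")
    for name, action in steps:
        try:
            action()
            # The append is part of the step: if recording the outcome
            # raised, the step would be audited as failed below.
            audit.append(f"ERASURE_STEP_SUCCEEDED:{name}")
        except Exception:
            audit.append(f"ERASURE_STEP_FAILED:{name}")
            raise  # the original exception propagates to the caller
    audit.append("ERASURE_LOCAL_COMPLETED")


def boom() -> None:
    raise ValueError("simulated constraint violation")


# All steps succeed.
ok_log: list[str] = []
run_local_phase([("orders", lambda: None), ("users", lambda: None)], ok_log)

# The second step fails; the attempt stays recorded up to the failure.
failed_log: list[str] = []
try:
    run_local_phase([("orders", lambda: None), ("users", boom)], failed_log)
except ValueError:
    pass
```

After the failing run, failed_log ends with the failure event and contains no ERASURE_LOCAL_COMPLETED entry.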
Each ref is routed to the resolver whose name equals the
ref’s kind (ADR 0008). A registered resolver with no matching
ref is skipped — recorded in the completion payload’s
skipped_resolvers, absent from enqueued_external — and a
ref kind matching no resolver fails loudly.
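The routing rule can be sketched as a pure function over names. route_refs and UnknownRefKind are hypothetical stand-ins (the library raises ResolverError), and the resolver names "crm" and "mailing" are invented for the example.

```python
class UnknownRefKind(Exception):
    """Stand-in for the library's ResolverError."""


def route_refs(
    ref_kinds: list[str],
    resolver_names: list[str],
) -> tuple[list[str], list[str]]:
    """Return (enqueued, skipped) resolver names for the given ref kinds."""
    unknown = [kind for kind in ref_kinds if kind not in resolver_names]
    if unknown:
        # Fail loudly: a typo must not silently drop an external system.
        raise UnknownRefKind(f"no resolver registered for kind(s): {unknown}")
    enqueued = [name for name in resolver_names if name in ref_kinds]
    skipped = [name for name in resolver_names if name not in ref_kinds]
    return enqueued, skipped


# One ref of kind "crm"; the "mailing" resolver has no matching ref.
enqueued, skipped = route_refs(["crm"], ["crm", "mailing"])
```

Here enqueued corresponds to what would appear in enqueued_external and skipped to the completion payload's skipped_resolvers.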
Re-running for an already-erased subject is a no-op success: row-deleting tables report zero, surviving rows (anonymized in place or retained) re-match by subject id and are reported again, and matched external work is re-enqueued under fresh idempotency keys — resolvers treat “already gone” as success, so duplicates converge.
Args:
- session (Session): An open database session; the local phase commits or rolls back as one unit together with the outbox entries.
- subject_id (SubjectIdentifier): Identifier on the subject table.
- refs (tuple[SubjectRef, ...]): External-system references, routed by kind (ADR 0008).
Returns:
- ErasureResult: The local-phase outcome with per-table counts. A surviving row anonymized in some columns and retained in others counts in both anonymized and retained. External outcomes land in the audit trail asynchronously.
Raises:
- ConfigurationError: If the planner was built without an executor, outbox, or audit sink.
- ResolverError: If a ref’s kind matches no registered resolver; a typo must not silently drop an external system from the erasure.
ErasurePlanner.plan
def plan(subject_id: SubjectIdentifier, *, refs: tuple[SubjectRef, ...] = ()) -> ErasurePlan

Compute the erasure programme without executing anything.
A pure function of the manifest and refs: no session, no I/O,
and calling it twice yields equal plans.
Args:
- subject_id (SubjectIdentifier): The subject identifier, either a single-column str or a composite CompositeSubjectId; echoed back unchanged on the plan.
- refs (tuple[SubjectRef, ...]): External-system references, recorded on the plan for the resolver steps.
Returns:
- ErasurePlan: The ordered, inspectable plan (local steps first, FK-safe).
Raises:
- RetentionViolationError: If a table must keep rows under a retention duty while a table on its path to the subject is planned for row deletion.
- ManifestError: If a table survives erasure only because the manifest declares nothing erasable on it, while a table on its path to the subject is planned for row deletion.
ErasureResult
class ErasureResult(BaseModel):
    anonymized: dict[str, int] = Field(default_factory=dict)
    completed_at: datetime
    deleted: dict[str, int] = Field(default_factory=dict)
    enqueued_external: tuple[str, ...] = ()
    retained: dict[str, int] = Field(default_factory=dict)
    subject_id: ValidatedSubjectId

Outcome of the local phase of an erasure.
External steps complete asynchronously; their outcomes land in the audit trail as the saga runner processes the outbox.
Fields:
- anonymized (dict[str, int]): Record counts anonymized, by table.
- completed_at (datetime): When the local phase finished (UTC); durable once the caller commits.
- deleted (dict[str, int]): Record counts deleted, by table.
- enqueued_external (tuple[str, ...]): Resolver names whose erasure was enqueued.
- retained (dict[str, int]): Record counts left in place under a retention duty, by table.
- subject_id (ValidatedSubjectId): The subject that was erased, echoed back from the call (single-column str or composite CompositeSubjectId).
ErasureStep
class ErasureStep(BaseModel):
    columns: tuple[str, ...] = ()
    external: bool = False
    strategy: ErasureStrategy
    target: str = Field(min_length=1)

One action the erasure will take, in execution order.
Local DELETE steps remove whole rows; local ANONYMIZE and
RETAIN steps are column-level and must name the columns they
touch (or, for RETAIN, deliberately leave untouched). External
steps address a whole subject in a resolver, never columns — a
validator makes any other shape unrepresentable.
Fields:
- columns (tuple[str, ...]): The column names the step touches; empty for whole-row deletion and for external steps.
- external (bool): True when the step is a resolver call that runs through the saga/outbox after the local transaction commits.
- strategy (ErasureStrategy): What happens to the matched records.
- target (str): Table name (local) or resolver name (external).
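The shape rules for a step can be sketched as a validating dataclass. This is an illustration under stated assumptions, not the library's validator: Strategy stands in for ErasureStrategy, StepSketch and is_valid are hypothetical names, ValueError replaces whatever error the real model raises, and giving an external step the DELETE strategy is a guess made only so the example runs.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):  # stand-in for ErasureStrategy
    DELETE = "delete"
    ANONYMIZE = "anonymize"
    RETAIN = "retain"


@dataclass(frozen=True)
class StepSketch:
    target: str
    strategy: Strategy
    columns: tuple[str, ...] = ()
    external: bool = False

    def __post_init__(self) -> None:
        if not self.target:
            raise ValueError("target must be non-empty")
        # External steps and local DELETE steps address whole subjects/rows.
        whole = self.external or self.strategy is Strategy.DELETE
        if whole and self.columns:
            raise ValueError("whole-row and external steps must not name columns")
        if not whole and not self.columns:
            raise ValueError("column-level steps must name their columns")


def is_valid(**kwargs) -> bool:
    """Report whether the given field values form a representable step."""
    try:
        StepSketch(**kwargs)
        return True
    except ValueError:
        return False
```

Rejecting invalid shapes at construction time means downstream code such as an executor never has to handle, say, an external step that names columns.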
ErasureVerification
class ErasureVerification(BaseModel):
    residual: dict[str, int] = Field(default_factory=dict)
    subject_id: ValidatedSubjectId
    surviving: dict[str, int] = Field(default_factory=dict)
    verified: bool
    verified_at: datetime

The verdict of reading the annotated surface back after an erasure.
A verification re-runs the manifest’s subject-scoping as a read and counts what is left for the subject, per table. It proves execution fidelity — that a caller trigger, an FK cascade, an ORM event, or a partial commit did not resurrect rows the plan deleted — and nothing wider. Three boundaries are load-bearing and deliberately not covered:
- It re-reads the same annotated surface the plan was built from, so PII that was never annotated is invisible by construction; this is not a discovery-completeness check.
- A row orphaned off the subject’s path (reachable by no hop chain to the subject) is unreachable by the scoping predicate, so it is invisible here too.
- Anonymized cell values are not verified — surrogates are random, never NULL, so without a before-state a reader cannot distinguish a surrogate from an original. Confirming a value was rewritten needs a before-state and is out of scope.
The hard assertion is therefore narrower than “everything is gone”:
verified is true iff every row-deleted table holds zero
subject-scoped rows. surviving is informational only — anonymize
and retain tables keep rows by design — and never flips verified.
Fields:
- residual (dict[str, int]): Per-table leftover subject-scoped row counts for tables the plan whole-row deletes; verified is the claim that all of these are zero.
- subject_id (ValidatedSubjectId): The subject whose surface was read back, echoed back from the call (single-column str or composite CompositeSubjectId).
- surviving (dict[str, int]): Per-table subject-scoped row counts for tables the plan anonymizes in place or retains; expected to be non-zero, reported for the record, never a failure.
- verified (bool): True iff every row-deleted table is empty for this subject (residual is all zero).
- verified_at (datetime): When the read-back ran (UTC).
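The relationship between residual, surviving, and verified can be written as a one-line predicate. The function name verdict and the table names are hypothetical; the logic follows directly from the definition above.

```python
def verdict(residual: dict[str, int], surviving: dict[str, int]) -> bool:
    """True iff every row-deleted table holds zero subject-scoped rows."""
    # `surviving` is deliberately ignored: anonymized and retained tables
    # keep rows by design, so their counts never affect the verdict.
    return all(count == 0 for count in residual.values())


# Deleted table is empty; the anonymized table still holds a row by design.
clean = verdict({"orders": 0}, {"users": 1})

# Two rows reappeared in a table the plan deleted, e.g. via a trigger.
resurrected = verdict({"orders": 2}, {"users": 1})
```

Only the second case fails, which matches the narrow claim of execution fidelity rather than "everything is gone".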
StepExecutor
Protocol: implement these members in your own class; do not subclass.
class StepExecutor(Protocol): ...

Anything that can execute one local erasure step for one subject.
The storage-specific half of erase_subject: it turns a step plus the subject graph’s hop chains
into subject-scoped statements in the caller’s open transaction. The
SQLAlchemy implementation is
ErasureExecutor.
StepExecutor.execute
def execute(session: Session, graph: SubjectGraph, step: ErasureStep, subject_id: SubjectIdentifier) -> int

Run one local step scoped to one subject.
Implementations must never commit or roll back the session — the
step is durable exactly when the caller’s transaction is. DELETE
removes the matched rows, ANONYMIZE rewrites the step’s columns
with irreversible surrogates, and RETAIN touches nothing and
only counts what stays.
Args:
- session (Session): The caller’s open session.
- graph (SubjectGraph): Resolved hop chains from each table to the subject.
- step (ErasureStep): The local step to run.
- subject_id (SubjectIdentifier): The subject identifier (single-column str or composite CompositeSubjectId).
Returns:
- int: The number of rows the step covered (deleted, anonymized, or counted as retained).
Raises:
- ConfigurationError: If the step is external; resolver calls never run inside the local transaction.
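A structurally typed executor can be sketched with typing.Protocol and an in-memory store. Everything here is a simplification for illustration: ExecutorLike is a loosely typed stand-in for StepExecutor, the "session" is a plain dict of table name to row dicts, the step is a dict, the graph is ignored (real scoping follows hop chains), and ValueError stands in for ConfigurationError.

```python
import secrets
from typing import Any, Protocol


class ExecutorLike(Protocol):
    """Stand-in for StepExecutor with loosely typed parameters."""

    def execute(self, session: Any, graph: Any, step: Any, subject_id: Any) -> int: ...


class InMemoryExecutor:
    """Satisfies ExecutorLike structurally; it does not subclass it."""

    def execute(self, session, graph, step, subject_id) -> int:
        if step["external"]:
            raise ValueError("external steps never run in the local transaction")
        rows = session[step["target"]]
        matched = [row for row in rows if row["subject_id"] == subject_id]
        if step["strategy"] == "delete":
            # Keep only rows that belong to other subjects.
            session[step["target"]] = [
                row for row in rows if row["subject_id"] != subject_id
            ]
        elif step["strategy"] == "anonymize":
            for row in matched:
                for column in step["columns"]:
                    row[column] = secrets.token_hex(8)  # random, never None
        # "retain" changes nothing and only counts what stays.
        return len(matched)


db = {"users": [{"subject_id": "user-123", "email": "a@example.com"}]}
executor: ExecutorLike = InMemoryExecutor()
anonymize_step = {
    "target": "users",
    "strategy": "anonymize",
    "columns": ("email",),
    "external": False,
}
covered = executor.execute(db, None, anonymize_step, "user-123")
```

The executor mutates only what it is handed and has no notion of commit, so durability stays with whoever owns the store, as the protocol requires of the session.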