The manifest
This content is for 0.1. Switch to the latest version for up-to-date documentation.
The manifest is effaced’s answer to “where is the personal data?”: a
complete, versioned DataMap of every annotated store. It is derived,
never authored — adapters walk your models and build it from the
annotations they find, so it cannot drift from the
schema the way a hand-maintained config file would:
data_map = collect_data_map(Base.metadata)
Structure
- DataMap: the whole manifest, a tuple of tables plus its schema_version.
- TableEntry: one store, with its name, its subject_link (how rows reach the subject; None until declared, and graph resolution refuses a PII-holding table without one), and its annotated columns.
- ColumnEntry: one annotated field, with its name and the full PiiSpec attached to it.
collect_data_map includes only tables carrying at least one annotation
(a pii() column or a subject_link()). Everything is frozen pydantic —
manifests are values you can diff, snapshot, and test against. The exact
complement — what your schema holds that the manifest does not cover —
is what the completeness linter reports.
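To make "manifests are values" concrete, here is a minimal sketch that uses frozen dataclasses as stand-ins for the real pydantic models. The class names (ColumnSketch, TableSketch, DataMapSketch), the category field, and the uncovered helper are all assumptions made for illustration; the actual PiiSpec is richer, and the real completeness linter may well apply different rules than a plain set difference.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass


# Hypothetical stand-ins for illustration; not effaced's real classes.
@dataclass(frozen=True)
class ColumnSketch:
    name: str
    category: str  # stands in for the full PiiSpec


@dataclass(frozen=True)
class TableSketch:
    name: str
    subject_link: str | None
    columns: tuple[ColumnSketch, ...]


@dataclass(frozen=True)
class DataMapSketch:
    schema_version: int
    tables: tuple[TableSketch, ...]


def uncovered(schema: dict[str, set[str]], data_map: DataMapSketch) -> dict[str, set[str]]:
    """Naive complement: schema columns that no manifest entry mentions."""
    annotated = {t.name: {c.name for c in t.columns} for t in data_map.tables}
    gaps = {}
    for table, columns in schema.items():
        missing = columns - annotated.get(table, set())
        if missing:
            gaps[table] = missing
    return gaps


def build() -> DataMapSketch:
    orders = TableSketch("orders", "user", (ColumnSketch("shipping_address", "contact"),))
    return DataMapSketch(1, (orders,))


snapshot, current = build(), build()
same = snapshot == current  # value equality: two builds of the same schema compare equal

try:
    current.tables[0].columns[0].name = "other"  # frozen: assignment is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

# Hypothetical schema, including a table the manifest does not mention at all.
schema = {
    "orders": {"id", "user_id", "shipping_address", "gift_note"},
    "audit_log": {"id", "ip"},
}
gaps = uncovered(schema, current)
```

Because the values are immutable and compare by content, a snapshot test reduces to an equality assertion, and the gap report is just the part of the schema the manifest never names.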
Versioning: old manifests are never rejected
Serialize with data_map.to_payload() (for audit snapshots, diffing,
tooling), load with DataMap.from_payload(...). Every payload carries the
MANIFEST_SCHEMA_VERSION it was written under, and the loading rules are
strict in one direction only:
- Any change to the serialized format bumps MANIFEST_SCHEMA_VERSION and ships an explicit forward migration, which makes it a MAJOR release.
- Old manifests are migrated forward, never rejected. A payload you snapshotted years ago must load on every future effaced.
- A manifest newer than the installed library fails loudly with upgrade guidance (ManifestError): guessing at a format you don’t understand is not an option.
The enum vocabularies (PiiCategory, LegalBasis, ErasureStrategy)
are part of the format: adding members is MINOR, removing or renaming
them is MAJOR.
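The one-directional loading rule can be sketched as a chain of forward migrations. This is an illustration only, not how DataMap.from_payload is implemented: the payload key schema_version, the version numbers, the two format changes, and the ManifestError stand-in defined here are all hypothetical.

```python
from typing import Any, Callable

Payload = dict[str, Any]

CURRENT_VERSION = 3  # hypothetical; plays the role of MANIFEST_SCHEMA_VERSION


class ManifestError(Exception):
    """Local stand-in for the library's error type."""


def _v1_to_v2(payload: Payload) -> Payload:
    # Hypothetical format change: the "stores" key was renamed to "tables".
    out = dict(payload)
    out["tables"] = out.pop("stores")
    out["schema_version"] = 2
    return out


def _v2_to_v3(payload: Payload) -> Payload:
    # Hypothetical format change: every table carries an explicit subject_link.
    out = dict(payload)
    out["tables"] = [{**t, "subject_link": t.get("subject_link")} for t in out["tables"]]
    out["schema_version"] = 3
    return out


# One migration per historical version, each moving exactly one step forward.
MIGRATIONS: dict[int, Callable[[Payload], Payload]] = {1: _v1_to_v2, 2: _v2_to_v3}


def load_payload(payload: Payload) -> Payload:
    version = payload["schema_version"]
    if version > CURRENT_VERSION:
        # Newer than this code understands: refuse rather than guess.
        raise ManifestError(
            f"manifest schema v{version} is newer than supported "
            f"v{CURRENT_VERSION}; upgrade the library"
        )
    while version < CURRENT_VERSION:
        payload = MIGRATIONS[version](payload)
        version = payload["schema_version"]
    return payload


old_snapshot = {"schema_version": 1, "stores": [{"name": "users"}]}
loaded = load_payload(old_snapshot)
```

Keeping one migration per version step means an old snapshot is upgraded by replaying every step in order, so no historical format ever needs to be rejected.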
The subject graph: resolved at runtime, never serialized
The manifest records dotted relationship paths ("order.user"); the
engines need join columns. That resolution happens at runtime, against
the ORM mappers:
graph = resolve_subject_graph(data_map, Base.registry)
The resulting SubjectGraph holds one TableAccessPlan per table: its
hop chain of foreign-key column pairs down to the subject, and whether the
table is fully PII-owned (every physical column annotated, primary-key,
or foreign-key — the precondition for whole-row deletion, see
erasure). The graph’s deletion_order is FK-safe: children
before parents, the subject table last.
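An FK-safe order of this kind is a topological sort over the foreign-key edges. The sketch below uses the standard library's graphlib on a hypothetical four-table schema; it illustrates the ordering property only and makes no claim about how deletion_order is actually computed.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the parent tables its foreign keys reference.
fk_parents = {
    "order_items": {"orders"},
    "orders": {"users"},
    "addresses": {"users"},
    "users": set(),
}


def fk_safe_order(fk_parents: dict[str, set[str]]) -> list[str]:
    # TopologicalSorter yields a node only after all of its predecessors.
    # Registering each parent's children as its predecessors therefore
    # emits children first and parents later.
    children_of: dict[str, set[str]] = {table: set() for table in fk_parents}
    for child, parents in fk_parents.items():
        for parent in parents:
            children_of.setdefault(parent, set()).add(child)
    # static_order() raises graphlib.CycleError if the foreign keys form a cycle.
    return list(TopologicalSorter(children_of).static_order())


order = fk_safe_order(fk_parents)
```

In this schema every other table reaches users through its foreign keys, so users necessarily comes last; the relative order of unrelated siblings such as addresses and orders is left unspecified.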
Two properties make the split worth understanding:
- The graph is runtime-only and never serialized. Join columns are a property of the current schema; persisting them would freeze a stale view. The manifest persists; the graph is recomputed.
- Incoherent graphs are unrepresentable. Validators reject duplicate
accesses, chains that don’t terminate at the subject table, and graphs
with no subject — consumers (the planner, the exporter) never re-check.
resolve_subject_graph itself fails loudly on unmapped tables, paths through many-to-many secondary tables, FK cycles, and a missing or ambiguous subject_link("").
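One way to read "unrepresentable" is that validation runs at construction, so an invalid value can never exist for a consumer to receive. The sketch below shows that pattern with frozen dataclasses. The names (Hop, AccessPlanSketch, SubjectGraphSketch), the field layout, and the reading of "no subject" as "no plan for the subject table" are assumptions; the real validators live in the library's own models.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical stand-ins for illustration; not effaced's real classes.
@dataclass(frozen=True)
class Hop:
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str


@dataclass(frozen=True)
class AccessPlanSketch:
    table: str
    hops: tuple[Hop, ...]  # empty for the subject table itself


@dataclass(frozen=True)
class SubjectGraphSketch:
    subject_table: str
    plans: tuple[AccessPlanSketch, ...]

    def __post_init__(self) -> None:
        names = [plan.table for plan in self.plans]
        if len(names) != len(set(names)):
            raise ValueError("duplicate access plan for a table")
        if self.subject_table not in names:
            raise ValueError("graph has no subject")
        for plan in self.plans:
            # A chain ends where its last hop points; no hops means the table itself.
            end = plan.hops[-1].parent_table if plan.hops else plan.table
            if end != self.subject_table:
                raise ValueError(f"chain for {plan.table!r} does not reach the subject")


users = AccessPlanSketch("users", ())
orders = AccessPlanSketch("orders", (Hop("orders", "user_id", "users", "id"),))
graph = SubjectGraphSketch("users", (users, orders))

try:
    # The chain stops at "orders" and never reaches "users": rejected at construction.
    stray = AccessPlanSketch("order_items", (Hop("order_items", "order_id", "orders", "id"),))
    SubjectGraphSketch("users", (users, stray))
    rejected = False
except ValueError:
    rejected = True
```

Since every instance has already passed these checks, downstream code can iterate over the plans without repeating them.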
Both the exporter and the erasure planner verify at construction that
their data map and graph describe the same set of tables — disagreement
is a ManifestError, not a silent partial answer.
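The construction-time check amounts to comparing two sets of table names and refusing to continue if they differ. A minimal sketch, assuming both sides can be reduced to plain name sets; the function name ensure_same_tables and the local ManifestError class are hypothetical stand-ins, not the library's API.

```python
class ManifestError(Exception):
    """Local stand-in for the library's error type."""


def ensure_same_tables(data_map_tables: set[str], graph_tables: set[str]) -> None:
    # Report both directions of disagreement so the message is actionable.
    only_in_map = data_map_tables - graph_tables
    only_in_graph = graph_tables - data_map_tables
    if only_in_map or only_in_graph:
        raise ManifestError(
            f"data map and graph disagree: "
            f"only in data map={sorted(only_in_map)}, "
            f"only in graph={sorted(only_in_graph)}"
        )


ensure_same_tables({"users", "orders"}, {"orders", "users"})  # same set: passes silently

try:
    ensure_same_tables({"users", "orders"}, {"users"})
    message = ""
except ManifestError as exc:
    message = str(exc)
```

Failing here, before any export or erasure runs, is what rules out an answer that silently covers only some of the tables.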
Full signatures: API reference.