The manifest
The manifest is effaced’s answer to “where is the personal data?”: a
complete, versioned DataMap of every annotated store. It is derived or
authored. The derived path has adapters walk your models and build it from
the annotations they find, so it cannot drift from the
schema the way a hand-maintained config file would:
data_map = collect_data_map(Base.metadata)
The authored path loads a hand-written or externally-generated payload with
DataMap.from_payload(...) — the serialized manifest is a supported
authoring/import format, not only a derived snapshot. Both produce the same
shape and drive the engines identically; an authored manifest reaches
execution through resolve_subject_graph_from_fk(data_map, metadata) over a
reflected or hand-built MetaData, no ORM registry required.
Structure
- DataMap: the whole manifest, a tuple of tables plus its schema_version.
- TableEntry: one store, with its name, its subject_link (how rows reach the
subject; None until declared, and graph resolution refuses a PII-holding
table without one), and its annotated columns.
- ColumnEntry: one annotated field, with its name and the full PiiSpec
attached to it.
collect_data_map includes only tables carrying at least one annotation
(a pii() column or a subject_link()). Everything is frozen pydantic —
manifests are values you can diff, snapshot, and test against. The exact
complement — what your schema holds that the manifest does not cover —
is what the completeness linter reports.
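To make the "manifests are values" point concrete, the following is a minimal sketch that mirrors the three-level shape with frozen standard-library dataclasses. It is a stand-in, not effaced's own classes: the real entries are frozen pydantic models, and every name here (DataMapSketch, TableSketch, ColumnSketch, changed_tables, and the category string standing in for a full PiiSpec) is hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional, Set, Tuple


# Stand-ins for illustration only. effaced's real models are frozen pydantic
# classes, and a real ColumnEntry carries a full PiiSpec, not a bare string.
@dataclass(frozen=True)
class ColumnSketch:
    name: str
    category: str  # hypothetical placeholder for the PiiSpec


@dataclass(frozen=True)
class TableSketch:
    name: str
    subject_link: Optional[str]  # None until declared
    columns: Tuple[ColumnSketch, ...]


@dataclass(frozen=True)
class DataMapSketch:
    schema_version: int
    tables: Tuple[TableSketch, ...]


def changed_tables(old: DataMapSketch, new: DataMapSketch) -> Set[str]:
    """Names of tables added, removed, or altered between two manifests."""
    before = {t.name: t for t in old.tables}
    after = {t.name: t for t in new.tables}
    return {
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    }


# Hypothetical tables; "order.user" echoes the dotted-path form used for links.
shipments = TableSketch("shipments", "order.user", (ColumnSketch("address", "contact"),))
orders_v1 = TableSketch("orders", "user", (ColumnSketch("note", "free_text"),))
orders_v2 = TableSketch(
    "orders",
    "user",
    (ColumnSketch("note", "free_text"), ColumnSketch("phone", "contact")),
)

snapshot = DataMapSketch(1, (shipments, orders_v1))
current = DataMapSketch(1, (shipments, orders_v2))
drifted = changed_tables(snapshot, current)  # only "orders" differs
```

Because every level is frozen and compared by value, a stored snapshot can be checked against the current manifest with plain equality in a test, and attempts to mutate either one raise instead of silently changing the record.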
Versioning: old manifests are never rejected
Serialize with data_map.to_payload() (for audit snapshots, diffing,
tooling), load with DataMap.from_payload(...). Every payload carries the
MANIFEST_SCHEMA_VERSION it was written under, and the loading rules are
strict in one direction only:
- Any change to the serialized format bumps MANIFEST_SCHEMA_VERSION and ships
an explicit forward migration, which makes it a MAJOR release.
- Old manifests are migrated forward, never rejected. A payload you
snapshotted years ago must load on every future effaced.
- A manifest newer than the installed library fails loudly with upgrade
guidance (ManifestError); guessing at a format you don’t understand is not
an option.
The enum vocabularies (PiiCategory, LegalBasis, ErasureStrategy)
are part of the format: adding members is MINOR, removing or renaming
them is MAJOR.
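The one-directional loading rules can be sketched as a chain of forward migrations. This is an illustration of the policy, not effaced's loader: CURRENT_VERSION, ManifestErrorSketch, MIGRATIONS, load_payload, the payload keys, and both format changes are invented for the example.

```python
from typing import Callable, Dict

# Hypothetical names throughout. The library's real constant is
# MANIFEST_SCHEMA_VERSION and its real error is ManifestError; the payload
# keys and the two format changes below are invented for illustration.
CURRENT_VERSION = 3


class ManifestErrorSketch(Exception):
    """Stand-in for the library's ManifestError."""


def _v1_to_v2(payload: dict) -> dict:
    # Invented change: v2 renamed the "stores" key to "tables".
    upgraded = {k: v for k, v in payload.items() if k != "stores"}
    upgraded["tables"] = payload.get("stores", [])
    upgraded["schema_version"] = 2
    return upgraded


def _v2_to_v3(payload: dict) -> dict:
    # Invented change: v3 expects an explicit subject_link on every table.
    tables = [{"subject_link": None, **table} for table in payload["tables"]]
    return {**payload, "tables": tables, "schema_version": 3}


# One explicit forward step per version bump, keyed by the source version.
MIGRATIONS: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2, 2: _v2_to_v3}


def load_payload(payload: dict) -> dict:
    """Migrate an old payload forward; refuse one from the future."""
    version = payload["schema_version"]
    if version > CURRENT_VERSION:
        raise ManifestErrorSketch(
            f"manifest version {version} is newer than supported version "
            f"{CURRENT_VERSION}; upgrade the library to load it"
        )
    while version < CURRENT_VERSION:
        payload = MIGRATIONS[version](payload)
        version = payload["schema_version"]
    return payload


old_snapshot = {"schema_version": 1, "stores": [{"name": "orders"}]}
loaded = load_payload(old_snapshot)
```

Each migration returns a new dictionary rather than editing its input, so the archived snapshot stays byte-for-byte what was written at the time.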
The subject graph: resolved at runtime, never serialized
The manifest records dotted relationship paths ("order.user"); the
engines need join columns. That resolution happens at runtime, against
the ORM mappers:
graph = resolve_subject_graph(data_map, Base.registry)
The resulting SubjectGraph holds one TableAccessPlan per table: its
hop chain of foreign-key column pairs down to the subject, and whether the
table is fully PII-owned (every physical column annotated, primary-key,
or foreign-key — the precondition for whole-row deletion, see
erasure). The graph’s deletion_order is FK-safe: children
before parents, the subject table last.
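An FK-safe ordering of this kind is a topological sort over the foreign-key edges. The sketch below shows the idea with the standard library's graphlib; it is not how effaced computes deletion_order, and the deletion_order function, the parents_of mapping, and the three-table schema are assumptions made for the example.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def deletion_order(parents_of: Dict[str, Set[str]]) -> List[str]:
    """Order tables so every child precedes the tables it references.

    `parents_of` maps each table to the tables its foreign keys point at.
    """
    # TopologicalSorter expects node -> predecessors (what must come first).
    # For deletion the children come first, so invert the FK direction.
    children_of: Dict[str, Set[str]] = {table: set() for table in parents_of}
    for child, parents in parents_of.items():
        for parent in parents:
            children_of.setdefault(parent, set()).add(child)
    # static_order() raises CycleError if the foreign keys form a cycle.
    return list(TopologicalSorter(children_of).static_order())


# Hypothetical schema: shipments -> orders -> users (the subject table).
fks = {"users": set(), "orders": {"users"}, "shipments": {"orders"}}
order = deletion_order(fks)  # shipments, then orders, then users
```

Since every table in the graph chains down to the subject, the subject table is an ancestor of all the others and therefore lands last. A cycle has no valid ordering at all, which is one way to see why resolution has to fail on it rather than pick an arbitrary order.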
Two properties make the split worth understanding:
- The graph is runtime-only and never serialized. Join columns are a property of the current schema; persisting them would freeze a stale view. The manifest persists; the graph is recomputed.
- Incoherent graphs are unrepresentable. Validators reject duplicate
accesses, chains that don’t terminate at the subject table, and graphs
with no subject — consumers (the planner, the exporter) never re-check.
resolve_subject_graph itself fails loudly on unmapped tables, paths through
many-to-many secondary tables, FK cycles, and a missing or ambiguous
subject_link("").
Both the exporter and the erasure planner verify at construction that
their data map and graph describe the same set of tables — disagreement
is a ManifestError, not a silent partial answer.
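The construction-time agreement check amounts to comparing two sets of table names and refusing to continue when they differ. A minimal sketch follows, assuming the check works on names alone; check_same_tables and ManifestErrorSketch are hypothetical stand-ins, not the library's API.

```python
from typing import Iterable


class ManifestErrorSketch(Exception):
    """Stand-in for the library's ManifestError."""


def check_same_tables(map_tables: Iterable[str], graph_tables: Iterable[str]) -> None:
    """Raise unless the data map and the graph cover exactly the same tables."""
    in_map, in_graph = set(map_tables), set(graph_tables)
    if in_map != in_graph:
        only_map = sorted(in_map - in_graph)
        only_graph = sorted(in_graph - in_map)
        raise ManifestErrorSketch(
            f"data map and subject graph disagree: "
            f"only in data map {only_map}, only in graph {only_graph}"
        )


# Same tables in a different order: accepted.
check_same_tables(["users", "orders"], ["orders", "users"])
```

Failing here, before any export or erasure runs, is what keeps a stale graph from producing an answer that quietly skips a table.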
Full signatures: API reference.