Manifest

This content is for 0.1. Switch to the latest version for up-to-date documentation.

The data map — a versioned manifest derived from model annotations.

class ColumnEntry(BaseModel):
    name: str = Field(min_length=1)
    spec: PiiSpec

One annotated field in the data map.

Fields:

  • name (str): Field name as it exists in the store.
  • spec (PiiSpec): The personal-data declaration attached to it.

class DataMap(BaseModel):
    schema_version: int = MANIFEST_SCHEMA_VERSION
    tables: tuple[TableEntry, ...] = ()

The complete, versioned manifest for one application.

The manifest is derived, never authored: adapters (e.g. effaced.adapters.sqlalchemy.collect_data_map) walk your models and build it from the annotations they find. Serialize with to_payload for audit snapshots, diffing, and tooling; load with from_payload, which migrates old versions forward.

Fields:

  • schema_version (int)
  • tables (tuple[TableEntry, ...])

def from_payload(data: dict[str, Any]) -> DataMap

Deserialize a manifest, migrating old schema versions forward.

Args:

  • data (dict[str, Any]): A payload produced by to_payload (any version).

Returns:

  • DataMap — The manifest, lifted to the current schema version.

Raises:

  • ManifestError — If the payload is structurally invalid or newer than this library understands.

def table(name: str) -> TableEntry

Return the entry for one store.

Raises:

  • ManifestError — If the store is not in the manifest.

def to_payload() -> dict[str, Any]

Serialize to a JSON-compatible payload.

def fk_safe_deletion_order(tables: Sequence[str], foreign_keys: Iterable[tuple[str, str]]) -> tuple[str, ...]

Order tables so deleting in sequence never violates a foreign key.

Children come before their parents: a (child, parent) edge means child holds a foreign key referencing parent, so the child’s rows must go first. Self-referential edges are ignored because one DELETE statement removes a table’s whole subject-row set and foreign keys are checked per statement, so parent and child rows within one table are deleted together. The result is deterministic for a given input order.

Args:

  • tables (Sequence[str]): The table names to order, in a stable caller-chosen order.
  • foreign_keys (Iterable[tuple[str, str]]): (child, parent) edges between those tables.

Returns:

  • tuple[str, ...] — The tables in FK-safe deletion order.

Raises:

  • SubjectResolutionError — If an edge references a table outside tables, or the edges form a cross-table cycle (no safe deletion order exists).
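
The ordering rule described above can be sketched as a small topological sort. This is only an illustration of the documented behaviour, not the library's implementation: the name deletion_order_sketch is hypothetical, and ValueError stands in for SubjectResolutionError so the block runs on its own.

```python
from collections.abc import Iterable, Sequence


def deletion_order_sketch(
    tables: Sequence[str], foreign_keys: Iterable[tuple[str, str]]
) -> tuple[str, ...]:
    """Hypothetical stand-in that mimics the documented contract."""
    known = set(tables)
    # children_of[parent] = tables holding a foreign key that references parent.
    children_of: dict[str, set[str]] = {name: set() for name in tables}
    for child, parent in foreign_keys:
        if child not in known or parent not in known:
            raise ValueError(f"edge ({child!r}, {parent!r}) references an unknown table")
        if child != parent:  # self-referential edges are ignored
            children_of[parent].add(child)

    order: list[str] = []
    remaining = list(tables)
    while remaining:
        pending = set(remaining)
        # A table can be deleted once no pending table still references it.
        ready = [name for name in remaining if not (children_of[name] & pending)]
        if not ready:
            raise ValueError(f"cross-table cycle among {sorted(pending)}")
        # Taking the first ready table keeps the result tied to the input order.
        order.append(ready[0])
        remaining.remove(ready[0])
    return tuple(order)


edges = [("orders", "users"), ("order_items", "orders"), ("users", "users")]
print(deletion_order_sketch(["users", "orders", "order_items"], edges))
# → ('order_items', 'orders', 'users')
```

Scanning the remaining tables in caller order on every pass is quadratic, which is acceptable for the handful of tables a schema usually has and keeps the determinism easy to see.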

class JoinHop(BaseModel):
    source_columns: tuple[str, ...] = Field(min_length=1)
    source_table: str = Field(min_length=1)
    target_columns: tuple[str, ...] = Field(min_length=1)
    target_table: str = Field(min_length=1)

One foreign-key hop on the path from a table toward the subject table.

Hops are pure column-pair data: source_columns[i] on source_table joins to target_columns[i] on target_table. Composite foreign keys are expressed as multiple paired columns. A self-referential hop (source_table == target_table) is valid.

Fields:

  • source_columns (tuple[str, ...])
  • source_table (str)
  • target_columns (tuple[str, ...])
  • target_table (str)
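
To make the column pairing concrete, the sketch below renders one hop as a join predicate. Hop is a plain dataclass stand-in with the same four fields as JoinHop, and join_predicate and its target_alias parameter are hypothetical helpers introduced for illustration, not part of the documented API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """Stand-in mirroring the four JoinHop fields (not the library class)."""

    source_table: str
    source_columns: tuple[str, ...]
    target_table: str
    target_columns: tuple[str, ...]


def join_predicate(hop: Hop, target_alias: str = "") -> str:
    """Pair source_columns[i] with target_columns[i] and AND the pairs together."""
    if len(hop.source_columns) != len(hop.target_columns):
        raise ValueError("source and target columns must pair up one-to-one")
    # A self-referential hop needs an alias to tell the two sides apart.
    target = target_alias or hop.target_table
    return " AND ".join(
        f"{hop.source_table}.{src} = {target}.{dst}"
        for src, dst in zip(hop.source_columns, hop.target_columns)
    )


composite = Hop("order_items", ("order_id", "tenant_id"), "orders", ("id", "tenant_id"))
print(join_predicate(composite))
# → order_items.order_id = orders.id AND order_items.tenant_id = orders.tenant_id

self_ref = Hop("employees", ("manager_id",), "employees", ("id",))
print(join_predicate(self_ref, target_alias="parent"))
# → employees.manager_id = parent.id
```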

MANIFEST_SCHEMA_VERSION = 1

Current manifest schema version. Bump on ANY format change, with a matching upgrade branch in migrate — this is a MAJOR release.
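
The "one upgrade branch per bump" rule might look like the sketch below. Everything in it is assumed for illustration: the documented schema version is still 1, so the sketch pretends a version 2 exists that renames a "stores" key to "tables". CURRENT_VERSION and migrate_sketch are hypothetical names, and ValueError stands in for ManifestError.

```python
from typing import Any

CURRENT_VERSION = 2  # hypothetical: pretend the format has been bumped once


def migrate_sketch(data: dict[str, Any]) -> dict[str, Any]:
    """Lift a payload to CURRENT_VERSION, one upgrade branch per version step."""
    version = data.get("schema_version")
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError("payload has no valid schema_version")
    if version > CURRENT_VERSION:
        raise ValueError(f"payload version {version} is newer than {CURRENT_VERSION}")

    payload = dict(data)  # shallow copy so the caller's payload is not mutated
    if payload["schema_version"] == 1:
        # Invented v1 -> v2 change: the "stores" key becomes "tables".
        payload["tables"] = payload.pop("stores", [])
        payload["schema_version"] = 2
    # A future bump to 3 would add another `if payload["schema_version"] == 2:` here.
    return payload


old = {"schema_version": 1, "stores": [{"name": "users"}]}
print(migrate_sketch(old))
# → {'schema_version': 2, 'tables': [{'name': 'users'}]}
```

Chaining the branches with plain if statements rather than elif lets a payload several versions behind pass through every step in a single call.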

class SubjectGraph(BaseModel):
    accesses: tuple[TableAccessPlan, ...] = Field(min_length=1)
    deletion_order: tuple[str, ...]
    subject_id_column: str = Field(min_length=1)
    subject_table: str = Field(min_length=1)

Resolved subject reachability for every subject-linked table.

accesses is ordered FK-safely for deletion: children before parents, the subject table last. Validators make incoherent graphs unrepresentable — exactly one access is the subject table and every other chain terminates at it — so consumers (the erasure planner, the exporter) need not re-check.

Fields:

  • accesses (tuple[TableAccessPlan, ...])
  • deletion_order (tuple[str, ...]): Table names in FK-safe deletion order (children first).
  • subject_id_column (str)
  • subject_table (str)

def access(table: str) -> TableAccessPlan

Return one table’s access plan.

Args:

  • table (str): The table name to look up.

Returns:

  • TableAccessPlan — The access plan for that table.

Raises:

  • SubjectResolutionError — If the table is not in the graph.

class TableAccessPlan(BaseModel):
    fully_pii_owned: bool = False
    hops: tuple[JoinHop, ...] = ()
    is_subject_table: bool
    table: str = Field(min_length=1)

How one table’s rows are reached from a subject identifier.

The hop chain walks from table toward the subject table; an engine turns it into a correlated join (“select/delete this table’s rows whose chain ends at the given subject id”). An empty chain means the table is the subject table.
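
As a rough picture of how an engine could consume a hop chain, the sketch below renders a chain as SQL text. It is an assumption-laden simplification: it emits a plain JOIN rather than a correlated subquery, represents each hop as a bare tuple, does not alias self-referential hops, and select_for_subject is a hypothetical helper, not a library function.

```python
# Each hop is (source_table, source_columns, target_table, target_columns).
HopTuple = tuple[str, tuple[str, ...], str, tuple[str, ...]]


def select_for_subject(
    table: str, hops: tuple[HopTuple, ...], subject_id_column: str
) -> str:
    """Render "rows of `table` whose chain ends at the given subject id"."""
    joins = []
    for source_table, source_cols, target_table, target_cols in hops:
        on = " AND ".join(
            f"{source_table}.{src} = {target_table}.{dst}"
            for src, dst in zip(source_cols, target_cols)
        )
        joins.append(f"JOIN {target_table} ON {on}")
    # An empty chain means `table` is itself the subject table.
    subject_table = hops[-1][2] if hops else table
    where = f"WHERE {subject_table}.{subject_id_column} = :subject_id"
    return " ".join([f"SELECT {table}.* FROM {table}", *joins, where])


chain = (
    ("order_items", ("order_id",), "orders", ("id",)),
    ("orders", ("user_id",), "users", ("id",)),
)
print(select_for_subject("order_items", chain, "id"))
# → SELECT order_items.* FROM order_items JOIN orders ON order_items.order_id = orders.id JOIN users ON orders.user_id = users.id WHERE users.id = :subject_id
```

The subject id is left as a bind parameter (:subject_id) so the value never has to be interpolated into the SQL string.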

fully_pii_owned reports whether every physical column of the table is PII-annotated, a primary-key member, or a foreign-key member — i.e. the row carries nothing but personal data and structural keys, so deleting the whole row is the faithful erasure. The conservative default is False: rows are anonymized in place unless an adapter (or a hand-built graph) explicitly establishes full ownership.

Fields:

  • fully_pii_owned (bool)
  • hops (tuple[JoinHop, ...])
  • is_subject_table (bool): Whether this table is the subject table itself.
  • table (str)
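
The ownership rule for fully_pii_owned reduces to a per-column test. The sketch below is a hedged illustration only: is_fully_pii_owned and erasure_action are hypothetical helpers, and the column sets are passed in by hand, whereas an adapter would derive them from the table's metadata.

```python
from collections.abc import Iterable


def is_fully_pii_owned(
    columns: Iterable[str],
    pii_columns: set[str],
    primary_key: set[str],
    foreign_key_columns: set[str],
) -> bool:
    """True when every column is PII-annotated or a structural key member."""
    structural = primary_key | foreign_key_columns
    return all(col in pii_columns or col in structural for col in columns)


def erasure_action(fully_pii_owned: bool) -> str:
    """Whole-row deletion only when ownership is established, else anonymize."""
    return "delete_row" if fully_pii_owned else "anonymize_in_place"


# addresses: nothing but keys and personal data, so the row can go entirely.
addresses = is_fully_pii_owned(
    ["id", "user_id", "street", "city"], {"street", "city"}, {"id"}, {"user_id"}
)
print(erasure_action(addresses))  # → delete_row

# orders: `total` is neither PII nor a key, so the row must be kept.
orders = is_fully_pii_owned(
    ["id", "user_id", "shipping_name", "total"], {"shipping_name"}, {"id"}, {"user_id"}
)
print(erasure_action(orders))  # → anonymize_in_place
```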

class TableEntry(BaseModel):
    columns: tuple[ColumnEntry, ...] = ()
    name: str = Field(min_length=1)
    subject_link: SubjectLink | None = None

One data store (table/collection) that holds personal data.

Fields:

  • columns (tuple[ColumnEntry, ...]): The annotated fields, in declaration order.
  • name (str): Store name.
  • subject_link (SubjectLink | None): How records reach the data subject. None until the store declares one — graph resolution refuses a PII-holding store without it.