Annotations
Core annotation models — the data map vocabulary, storage-agnostic.
Authoring helpers that attach these to concrete ORMs live in
effaced.adapters (SQLAlchemy first). The models here are pure data:
they validate, serialize, and never import a database library.
canonical_subject_id
def canonical_subject_id(identifier: SubjectIdentifier) -> str
Serialize a subject identifier to its canonical storage string.
A bare str is returned unchanged — it is already the one-element
canonical form, so single-column subjects are byte-identical in storage,
audit references, and SQL to how they were before composite keys existed.
A CompositeSubjectId joins its escaped element values on
a reserved separator. The escaping guarantees the result is collision-
free: distinct keys (including keys that differ only in where a separator-
like character falls, such as ("a", "b:c") versus ("a:b", "c"))
always serialize to distinct strings. Saga completion-grouping and cross-
subject isolation both depend on that distinctness.
Args:
- identifier (SubjectIdentifier): A single-column str or a multi-column CompositeSubjectId.
Returns:
str: The canonical string. Round-trips through parse_canonical for the composite case; a bare str parses back to itself.
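The escape-then-join behaviour described above can be sketched in a few lines. This is a minimal illustration, not effaced's implementation: the separator `:` and escape character `\` are assumptions chosen to match the `("a", "b:c")` example, and `sketch_escape` and `sketch_canonical` are hypothetical names.

```python
from __future__ import annotations

# Illustrative only: SEP and ESC are assumed characters, not necessarily
# the reserved separator or escape that effaced uses.
SEP = ":"
ESC = "\\"


def sketch_escape(value: str) -> str:
    """Escape the escape character first, then the separator."""
    return value.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP)


def sketch_canonical(identifier: str | tuple[str, ...]) -> str:
    """Return a bare str unchanged; join escaped tuple elements on SEP."""
    if isinstance(identifier, str):
        return identifier
    return SEP.join(sketch_escape(value) for value in identifier)


# Two keys that differ only in where the separator-like character falls.
left = sketch_canonical(("a", "b:c"))
right = sketch_canonical(("a:b", "c"))
print(left)   # a:b\:c
print(right)  # a\:b:c
```

Escaping the escape character before the separator matters: done the other way round, the backslashes introduced for separators would themselves be doubled.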
CompositeSubjectId
class CompositeSubjectId(BaseModel):
    values: tuple[str, ...] = Field(min_length=1)
A data subject identified by an ordered tuple of column values.
Subjects whose identity spans several columns — the common multi-tenant
(tenant_id, user_id) shape, or any natural composite key — carry one
of these instead of a bare str (ADR 0025). The values are
positional: their order aligns left-to-right with the columns the
manifest declares in
subject_id_columns. effaced always matches
the whole ordered key, never a partial one, so the arity of this tuple
must equal the number of declared columns at the call boundary.
A single-column subject is just a str — this model is only for the
multi-column case. See SubjectIdentifier for the union
every engine entry point accepts, and
canonical_subject_id for the deterministic,
collision-free serialization used in storage and the audit trail.
Fields:
- values (tuple[str, ...]): The subject’s key-column values, in declared column order; at least one, none empty.
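The whole-key rule can be illustrated with a small check that pairs values with declared columns by position. This is a sketch under stated assumptions: `check_arity` is a hypothetical helper, plain tuples stand in for `CompositeSubjectId`, and the exception type effaced raises on a mismatch is not documented here.

```python
from __future__ import annotations


def check_arity(
    identifier: str | tuple[str, ...],
    subject_id_columns: tuple[str, ...],
) -> dict[str, str]:
    """Pair each key value with its declared column, rejecting partial keys."""
    values = (identifier,) if isinstance(identifier, str) else tuple(identifier)
    if len(values) != len(subject_id_columns):
        # ValueError is an assumption made for this sketch.
        raise ValueError(
            f"expected {len(subject_id_columns)} key value(s), got {len(values)}"
        )
    # Positional alignment: the first value matches the first declared column.
    return dict(zip(subject_id_columns, values))


matched = check_arity(("acme", "u-42"), ("tenant_id", "user_id"))
print(matched)  # {'tenant_id': 'acme', 'user_id': 'u-42'}
```

Passing `("acme",)` against the same two columns raises instead of matching on `tenant_id` alone, which is the point of never accepting a partial key.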
Correction
class Correction(BaseModel):
    category: PiiCategory
    value: str | int | float | bool
One Art. 16 correction: a category and the value it should hold.
Corrections are keyed by PiiCategory, never by column
(ADR 0013): the category is the only vocabulary shared with external
resolvers, and a category-wide write keeps denormalized copies of the
same fact consistent. Values are JSON scalars so a correction
round-trips losslessly through the outbox payload.
The value is personal data. It lives transiently in outbox rows while external rectification is in flight — cleared the moment the entry reaches a terminal status — and never appears in any audit event.
Fields:
- category (PiiCategory): Which kind of personal data the correction targets.
- value (str | int | float | bool): The corrected value, applied to every matching field.
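Restricting values to JSON scalars is what makes the round trip lossless, as a quick check with the standard `json` module shows. The category names below are hypothetical placeholders (the real `PiiCategory` members may differ), and plain dicts stand in for `Correction` instances.

```python
import json

# Hypothetical category names, used only for illustration.
corrections = [
    {"category": "email", "value": "new@example.com"},
    {"category": "age", "value": 41},
    {"category": "height_m", "value": 1.75},
    {"category": "marketing_opt_in", "value": False},
]

# Serialize as an outbox payload might be stored, then read it back.
payload = json.dumps(corrections)
restored = json.loads(payload)

print(restored == corrections)  # True
```

A richer type such as `datetime` would not survive this trip without a custom encoder, since `json.dumps` rejects it with a `TypeError`.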
parse_canonical
def parse_canonical(serialized: str) -> CompositeSubjectId
Parse a canonical composite string back into its ordered values.
The exact inverse of canonical_subject_id for the composite case:
it splits on unescaped separators and unescapes each element, so a value
that itself contained the separator or escape character is restored
intact.
Args:
- serialized (str): A canonical string produced by canonical_subject_id from a CompositeSubjectId.
Returns:
CompositeSubjectId: The reconstructed CompositeSubjectId.
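Splitting on unescaped separators only can be sketched as a single pass over the characters. As before, this is an illustration rather than the library's code: `:` and `\` are assumed separator and escape characters, `sketch_parse` is a hypothetical name, and a plain tuple stands in for the returned `CompositeSubjectId`.

```python
# Illustrative only: SEP and ESC are assumed characters.
SEP = ":"
ESC = "\\"


def sketch_parse(serialized: str) -> tuple[str, ...]:
    """Split on unescaped SEP and drop the escape from escaped characters."""
    values: list[str] = []
    current: list[str] = []
    chars = iter(serialized)
    for ch in chars:
        if ch == ESC:
            # Take the next character literally, whatever it is.
            current.append(next(chars, ""))
        elif ch == SEP:
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    values.append("".join(current))
    return tuple(values)


first = sketch_parse("a:b\\:c")
second = sketch_parse("a\\:b:c")
print(first)   # ('a', 'b:c')
print(second)  # ('a:b', 'c')
```

The two inputs contain the same characters in a different arrangement, yet each parses back to a different key, because an escaped separator never ends an element.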
PiiSpec
class PiiSpec(BaseModel):
    category: PiiCategory
    description: str | None = None
    erasure: ErasureStrategy = ErasureStrategy.DELETE
    legal_basis: LegalBasis | None = None
    purpose: str | None = None
    retention: RetentionPolicy | None = None
Full declaration for one personal-data field.
Built by the adapter authoring helpers (e.g.
effaced.adapters.sqlalchemy.pii); read back by
effaced.manifest.DataMap.
Fields:
- category (PiiCategory): What kind of personal data this is.
- description (str | None): Optional human note for audits and the PII linter.
- erasure (ErasureStrategy): What happens on Art. 17 erasure. Defaults to DELETE.
- legal_basis (LegalBasis | None): Why the data is processed at all (Art. 15 metadata).
- purpose (str | None): Processing purpose, surfaced verbatim in export bundles.
- retention (RetentionPolicy | None): Required when erasure is RETAIN (and allowed with ANONYMIZE to document why the record itself survives).
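The coupling between `erasure` and `retention` can be pictured as a construction-time check. The sketch below uses stand-in types, not the real models: `SketchErasure` and `SketchPiiSpec` are hypothetical, a plain string replaces `RetentionPolicy`, and raising at construction is an assumption about where the rule is enforced.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SketchErasure(Enum):
    # Member names come from the field docs above; the values are placeholders.
    DELETE = "delete"
    ANONYMIZE = "anonymize"
    RETAIN = "retain"


@dataclass(frozen=True)
class SketchPiiSpec:
    category: str
    erasure: SketchErasure = SketchErasure.DELETE
    retention: str | None = None  # stand-in for a RetentionPolicy

    def __post_init__(self) -> None:
        # RETAIN without a documented reason is rejected.
        if self.erasure is SketchErasure.RETAIN and self.retention is None:
            raise ValueError("RETAIN requires a retention policy")


default_spec = SketchPiiSpec("email")
retained_spec = SketchPiiSpec(
    "invoice_address", SketchErasure.RETAIN, "§147 AO invoice retention"
)
```

Only the RETAIN requirement is modelled here. Whether a retention policy is rejected alongside DELETE is not stated above, so the sketch leaves that case open.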
RetentionPolicy
class RetentionPolicy(BaseModel):
    anchor: str | None = None
    basis: LegalBasis = LegalBasis.LEGAL_OBLIGATION
    duration: timedelta | None = None
    reason: str = Field(min_length=1)
Why and how long a value must outlive an erasure request.
A bounded duty needs a clock: duration is measured from the instant
stored in the anchor column. Without an anchor, a duration cannot be
evaluated — the retention sweep reports such columns as indeterminate,
never guessed (see effaced.retention.RetentionSweeper).
Fields:
- anchor (str | None): Name of a datetime column on the same table as the annotated column, holding the instant the retention clock starts (an invoiced_at, a closed_at). Cross-table anchors are out of scope. The SQLAlchemy adapter validates existence and datetime-ness at collection time (ADR 0012).
- basis (LegalBasis): The lawful basis that overrides erasure.
- duration (timedelta | None): How long the duty lasts, if bounded. None means indefinite / determined externally.
- reason (str): Human-readable legal duty (e.g. "§147 AO invoice retention").
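How `anchor` and `duration` interact can be sketched as a three-way decision. This is not the logic of `effaced.retention.RetentionSweeper`, only an illustration of the rules stated above: `sketch_retention_status` and its status strings are hypothetical, and `anchor_value` is assumed to be `None` when no anchor column is declared.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def sketch_retention_status(
    anchor_value: datetime | None,
    duration: timedelta | None,
    now: datetime,
) -> str:
    """Classify a retained value from its anchor instant and duration."""
    if duration is None:
        # No bound: the duty is indefinite or determined externally.
        return "indefinite"
    if anchor_value is None:
        # A duration with no clock start cannot be evaluated, so do not guess.
        return "indeterminate"
    return "expired" if now >= anchor_value + duration else "active"


invoiced_at = datetime(2015, 1, 1, tzinfo=timezone.utc)
ten_years = timedelta(days=3650)

in_2020 = sketch_retention_status(
    invoiced_at, ten_years, datetime(2020, 1, 1, tzinfo=timezone.utc)
)
in_2026 = sketch_retention_status(
    invoiced_at, ten_years, datetime(2026, 1, 1, tzinfo=timezone.utc)
)
no_anchor = sketch_retention_status(
    None, ten_years, datetime(2026, 1, 1, tzinfo=timezone.utc)
)
print(in_2020, in_2026, no_anchor)  # active expired indeterminate
```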
SubjectIdentifier
SubjectIdentifier = str | CompositeSubjectId
What every engine entry point accepts to name a data subject.
A bare str for a single-column subject, or a
CompositeSubjectId for a multi-column one. The bare-str
form is the one-element canonical case, so passing a string behaves exactly
as it always has (ADR 0025).
SubjectLink
class SubjectLink(BaseModel):
    is_subject_table: bool
    path: str
    subject_id_columns: tuple[str, ...] = Field(default=('id',), min_length=1)
How a table’s records reach the data subject.
A dotted relationship path from the annotated table to the subject
table, e.g. "order.user" for an order_items table whose records
belong to the user owning the parent order. The subject table itself
uses the empty path "".
Fields:
- is_subject_table (bool): Whether this link marks the subject table itself.
- path (str): Dotted relationship path; "" marks the subject table.
- subject_id_columns (tuple[str, ...]): Ordered identifier columns on the subject table that callers’ SubjectIdentifier aligns to. Defaults to ("id",): one column, the single-column case. A multi-column tuple declares a composite subject key (ADR 0025); effaced always matches the whole ordered key, never a partial one.
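The meaning of a dotted path is easiest to see on in-memory objects. The sketch below walks `"order.user"` attribute by attribute; it is purely illustrative, since a real adapter would presumably resolve the path through ORM relationships rather than `getattr`. `sketch_follow` and the sample objects are hypothetical.

```python
from functools import reduce
from types import SimpleNamespace


def sketch_follow(record: object, path: str) -> object:
    """Walk a dotted relationship path; "" returns the record itself."""
    if path == "":
        return record
    return reduce(getattr, path.split("."), record)


# An order item whose parent order belongs to a user.
user = SimpleNamespace(id="u-42")
order = SimpleNamespace(user=user)
item = SimpleNamespace(order=order)

owner = sketch_follow(item, "order.user")
print(owner.id)  # u-42
```

The empty path is the base case: on the subject table there is nothing to traverse, so the record is its own subject.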
SubjectRef
class SubjectRef(BaseModel):
    extra: dict[str, str] = Field(default_factory=dict)
    kind: str = Field(min_length=1, max_length=255)
    value: str = Field(min_length=1, max_length=255)
Opaque reference to one data subject, passed to resolvers.
Resolvers receive references (e.g. a Stripe customer id), never the subject’s rich PII — the library moves identifiers, not data.
Fields:
- extra (dict[str, str]): Additional identifiers a resolver may need (string-typed on purpose: refs must stay loggable and PII-light).
- kind (str): Namespace of the identifier ("stripe", "email"). Refs are routed to the resolver whose name equals the ref’s kind (ADR 0008).
- value (str): The identifier itself.
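Routing by `kind` amounts to a lookup keyed on the resolver name. The following sketch assumes resolvers are plain callables held in a dict; `SketchRef`, `sketch_route`, and both resolver functions are hypothetical, and what effaced does with a ref whose kind has no resolver is not covered here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class SketchRef:
    kind: str
    value: str
    extra: dict[str, str] = field(default_factory=dict)


def sketch_route(
    refs: list[SketchRef],
    resolvers: dict[str, Callable[[SketchRef], str]],
) -> list[str]:
    """Hand each ref to the resolver registered under the ref's kind."""
    # A missing kind raises KeyError in this sketch.
    return [resolvers[ref.kind](ref) for ref in refs]


def stripe_resolver(ref: SketchRef) -> str:
    return f"stripe:erase:{ref.value}"


def email_resolver(ref: SketchRef) -> str:
    return f"email:unsubscribe:{ref.value}"


resolvers = {"stripe": stripe_resolver, "email": email_resolver}
refs = [SketchRef("stripe", "cus_123"), SketchRef("email", "ref-9")]

results = sketch_route(refs, resolvers)
print(results)  # ['stripe:erase:cus_123', 'email:unsubscribe:ref-9']
```

Each resolver only ever sees the identifier and its string extras, which mirrors the rule that resolvers receive references rather than the subject's data.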