effaced-s3
effaced-s3 — first-party S3 resolver for effaced.
The resolver itself is S3Resolver. The object-store machinery it
rides on is public and stable, so S3-compatible stores (Supabase Storage,
MinIO, R2) can build their own resolvers on the same parts: the client
protocol S3ObjectClient, the prefix guard checked_prefix,
the export collector collect_object_records, the listing helpers
iter_current_objects and collect_version_identifiers, the
batched delete delete_in_batches, and the error taxonomy
(error_code, is_nonretryable, NONRETRYABLE_CODES).
checked_prefix
def checked_prefix(ref: SubjectRef) -> str
The ref’s key prefix, validated before any object-store call.
Object stores match a prefix literally against the start of each key,
so a prefix that is not delimiter-terminated also matches sibling
subjects (users/4 matches users/42/avatar.png). That is cross-subject
bleed, the one thing a resolver must never do. Two guards, non-blank
and "/"-terminated, both run before any object-store call.
Args:
- ref (SubjectRef): The subject reference whose value is the key prefix.
Returns:
str: The validated prefix, unchanged.
Raises:
ResolverError: The prefix is blank (it would address the whole bucket) or does not end with "/" (it would match sibling subjects).
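The two guards can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `check_prefix` and `PrefixError` are hypothetical stand-ins for `checked_prefix` and `ResolverError`, and the sketch takes the raw string where the real function takes a `SubjectRef`.

```python
class PrefixError(ValueError):
    """Hypothetical stand-in for the library's ResolverError."""


def check_prefix(value: str) -> str:
    """Return the prefix unchanged, or raise before any store call."""
    if not value.strip():
        # A blank prefix would address the whole bucket.
        raise PrefixError("prefix is blank")
    if not value.endswith("/"):
        # An unterminated prefix also matches sibling subjects.
        raise PrefixError("prefix must end with '/'")
    return value


# Why the terminator matters: prefix matching is literal.
sibling_key = "users/42/avatar.png"
bleeds = sibling_key.startswith("users/4")     # subject 4 would reach subject 42
isolated = sibling_key.startswith("users/4/")  # the terminated prefix does not
```

With the trailing delimiter in place, `users/4/` can no longer reach keys that belong to `users/42/`.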
collect_object_records
def collect_object_records(client: S3ObjectClient, bucket: str, prefix: str, *, source: str, include_content: bool, max_object_bytes: int | None) -> tuple[ExportRecord, ...]
Map every current object under the prefix; the size cap fails loudly.
Args:
- client (S3ObjectClient): The object-store client to list and fetch with.
- bucket (str): The bucket holding the subject’s objects.
- prefix (str): The subject’s key prefix.
- source (str): The ExportRecord.source label every produced record carries (the resolver’s name).
- include_content (bool): Fetch each object’s body (GET) or only its metadata (HEAD).
- max_object_bytes (int | None): Refuse (loudly) to export any object larger than this; None means no cap.
Returns:
tuple[ExportRecord, ...]: The records for every current object under the prefix, in listing order. Empty when nothing lives under the prefix.
Raises:
ResolverError: An object under the prefix exceeds max_object_bytes; the export fails whole, never as a silently thinned bundle.
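The fail-whole behaviour of the size cap can be illustrated with a small sketch. It is written under assumptions: `collect_records` and `SizeCapError` are hypothetical names, the record shape is invented for the example, and the input is a list of listing entries with the `Key` and `Size` fields an S3 listing returns.

```python
from typing import Any, Optional


class SizeCapError(Exception):
    """Hypothetical stand-in for the library's ResolverError."""


def collect_records(
    entries: list[dict[str, Any]],
    *,
    source: str,
    max_object_bytes: Optional[int],
) -> tuple[dict[str, Any], ...]:
    """Map every listing entry to a record, or fail the whole export."""
    records = []
    for entry in entries:
        if max_object_bytes is not None and entry["Size"] > max_object_bytes:
            # Raise instead of skipping: a thinned export would look complete.
            raise SizeCapError("an object exceeds max_object_bytes")
        records.append({"source": source, "key": entry["Key"], "size": entry["Size"]})
    return tuple(records)
```

Raising on the first oversized object means the caller never receives a bundle that silently omits data.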
collect_version_identifiers
def collect_version_identifiers(client: S3ObjectClient, bucket: str, prefix: str) -> list[ObjectIdentifierTypeDef]
Every (key, version) pair under the prefix, delete markers included.
Args:
- client (S3ObjectClient): The S3 client to list with.
- bucket (str): The bucket holding the subject’s objects.
- prefix (str): The subject’s key prefix.
Returns:
list[ObjectIdentifierTypeDef]: Identifiers for all object versions and delete markers, in listing order, ready for delete_objects batches.
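A paginated version listing might look like the sketch below. It is not the library's code: `collect_identifiers` and `FakeVersionedClient` are hypothetical, and within each page the sketch emits versions before delete markers, a simplification of true listing order. The response fields (`Versions`, `DeleteMarkers`, `IsTruncated`, `NextKeyMarker`, `NextVersionIdMarker`) are the ones S3 returns for `list_object_versions`.

```python
from typing import Any


class FakeVersionedClient:
    """Hypothetical in-memory client that serves canned listing pages."""

    def __init__(self, pages: list[dict[str, Any]]) -> None:
        self._pages = pages
        self.calls = 0

    def list_object_versions(self, **kwargs: Any) -> dict[str, Any]:
        page = self._pages[self.calls]
        self.calls += 1
        return page


def collect_identifiers(client: Any, bucket: str, prefix: str) -> list[dict[str, str]]:
    """Gather a (Key, VersionId) identifier for every version and delete marker."""
    identifiers: list[dict[str, str]] = []
    markers: dict[str, str] = {}
    while True:
        page = client.list_object_versions(Bucket=bucket, Prefix=prefix, **markers)
        # Delete markers are listed apart from versions; both must be deleted.
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            identifiers.append({"Key": entry["Key"], "VersionId": entry["VersionId"]})
        if not page.get("IsTruncated"):
            return identifiers
        # Resume the listing where the truncated page stopped.
        markers = {
            "KeyMarker": page["NextKeyMarker"],
            "VersionIdMarker": page["NextVersionIdMarker"],
        }
```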
delete_in_batches
def delete_in_batches(client: S3ObjectClient, bucket: str, identifiers: list[ObjectIdentifierTypeDef]) -> list[str]
Delete every identifier in bounded batches; collect per-key error codes.
Args:
- client (S3ObjectClient): The object-store client to delete with.
- bucket (str): The bucket holding the subject’s objects.
- identifiers (list[ObjectIdentifierTypeDef]): The (key, optional version) pairs to delete, ready for delete_objects batches.
Returns:
list[str]: The per-key error codes the store reported, across all batches; empty when every deletion succeeded. Batches keep running past failures, so the codes accumulate without aborting the rest.
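The batching and error accumulation described above can be sketched as follows. The names `delete_batched` and `FakeDeleteClient` are hypothetical, and the batch size of 1000 is taken from the `delete_objects` limit stated in the client protocol; the `Errors` list with a `Code` per entry is the shape S3 returns for partial failures.

```python
from typing import Any

BATCH_SIZE = 1000  # delete_objects accepts at most 1000 identifiers per call


class FakeDeleteClient:
    """Hypothetical client that fails every key named in `failing`."""

    def __init__(self, failing: dict[str, str]) -> None:
        self._failing = failing
        self.batch_sizes: list[int] = []

    def delete_objects(self, *, Bucket: str, Delete: dict[str, Any]) -> dict[str, Any]:
        objects = Delete["Objects"]
        self.batch_sizes.append(len(objects))
        errors = [
            {"Key": obj["Key"], "Code": self._failing[obj["Key"]]}
            for obj in objects
            if obj["Key"] in self._failing
        ]
        return {"Errors": errors} if errors else {}


def delete_batched(client: Any, bucket: str, identifiers: list[dict[str, str]]) -> list[str]:
    """Delete in slices of BATCH_SIZE and return every reported error code."""
    codes: list[str] = []
    for start in range(0, len(identifiers), BATCH_SIZE):
        batch = identifiers[start:start + BATCH_SIZE]
        response = client.delete_objects(
            Bucket=bucket, Delete={"Objects": batch, "Quiet": True}
        )
        # Record failures and continue: later batches still run.
        codes.extend(err.get("Code", "") for err in response.get("Errors", []))
    return codes
```

Because a failed key never stops the loop, one bad batch cannot strand the remaining identifiers.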
error_code
def error_code(error: ClientError) -> str
The S3 error code of a ClientError, or "" when absent.
Args:
- error (ClientError): The ClientError botocore raised.
Returns:
str: The Error.Code field of the error response body.
is_nonretryable
def is_nonretryable(error: ClientError) -> bool
Whether a ClientError should abandon instead of retry.
Args:
- error (ClientError): The ClientError botocore raised.
Returns:
bool: True for credential, permission, missing-bucket, and wrong-endpoint failures; False for everything else: throttling, server faults, and codes this taxonomy does not know.
iter_current_objects
def iter_current_objects(client: S3ObjectClient, bucket: str, prefix: str) -> Iterator[ObjectTypeDef]
The current (non-deleted) objects under the prefix, page by page.
Args:
- client (S3ObjectClient): The S3 client to list with.
- bucket (str): The bucket holding the subject’s objects.
- prefix (str): The subject’s key prefix.
Yields:
ObjectTypeDef: One listing entry per current object.
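Page-by-page iteration over `list_objects_v2` can be sketched like this. `iter_objects` and `FakeListingClient` are hypothetical names used only for illustration; the `Contents`, `IsTruncated`, and `NextContinuationToken` fields are the ones S3 returns.

```python
from typing import Any, Iterator, Optional


class FakeListingClient:
    """Hypothetical in-memory client that serves canned listing pages."""

    def __init__(self, pages: list[dict[str, Any]]) -> None:
        self._pages = pages
        self.tokens_seen: list[Optional[str]] = []

    def list_objects_v2(
        self, *, Bucket: str, Prefix: str, ContinuationToken: Optional[str] = None
    ) -> dict[str, Any]:
        self.tokens_seen.append(ContinuationToken)
        return self._pages[len(self.tokens_seen) - 1]


def iter_objects(client: Any, bucket: str, prefix: str) -> Iterator[dict[str, Any]]:
    """Yield each current object's listing entry, fetching pages lazily."""
    kwargs: dict[str, str] = {}
    while True:
        page = client.list_objects_v2(Bucket=bucket, Prefix=prefix, **kwargs)
        # "Contents" is absent when a page holds no objects.
        yield from page.get("Contents", [])
        if not page.get("IsTruncated"):
            return
        kwargs = {"ContinuationToken": page["NextContinuationToken"]}
```

Since this is a generator, the next page is requested only once the consumer has drained the current one.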
NONRETRYABLE_CODES
NONRETRYABLE_CODES = frozenset({'AccessDenied', 'AllAccessDisabled', 'AccountProblem', 'InvalidAccessKeyId', 'SignatureDoesNotMatch', 'InvalidBucketName', 'NoSuchBucket', 'PermanentRedirect'})
Error codes that can never succeed on retry; they abandon immediately.
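The error taxonomy reduces to a code lookup, which the sketch below illustrates. `FakeClientError`, `code_of`, and `should_abandon` are hypothetical stand-ins for botocore's `ClientError`, `error_code`, and `is_nonretryable`; the only thing assumed about the real exception is its `response` dict, which carries `Error.Code`.

```python
from typing import Any

NONRETRYABLE = frozenset({
    "AccessDenied", "AllAccessDisabled", "AccountProblem", "InvalidAccessKeyId",
    "SignatureDoesNotMatch", "InvalidBucketName", "NoSuchBucket", "PermanentRedirect",
})


class FakeClientError(Exception):
    """Hypothetical stand-in for botocore's ClientError; only .response is used."""

    def __init__(self, response: dict[str, Any]) -> None:
        super().__init__("client error")
        self.response = response


def code_of(error: Any) -> str:
    """The Error.Code field of the response, or "" when absent."""
    return error.response.get("Error", {}).get("Code", "")


def should_abandon(error: Any) -> bool:
    """True only for codes known never to succeed on retry."""
    return code_of(error) in NONRETRYABLE
```

An unknown or missing code falls through to `False`, so anything the taxonomy does not recognise is retried and never dropped.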
PartialEraseError
class PartialEraseError(Exception): ...
Some object versions under the prefix could not be deleted this attempt.
Deliberately not a ResolverError: the
saga runner retries any other exception, and a partial batch failure
is exactly that case — the keys that did delete stay deleted, the
survivors are re-listed and re-deleted on the next attempt, and
re-deleting an already-gone version is a no-op, so retries converge.
Messages carry counts and S3 error codes only — never keys or prefixes, which are user content.
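A message that carries counts and error codes only might be assembled as in this sketch. `summarize_failures` and the message wording are hypothetical; the point is that the function only ever sees the codes, so no key or prefix can leak into the text.

```python
from collections import Counter


def summarize_failures(codes: list[str]) -> str:
    """Build a message from error codes alone; keys never reach this function."""
    counts = Counter(codes)
    parts = ", ".join(f"{code} x{n}" for code, n in sorted(counts.items()))
    return f"{len(codes)} deletion(s) failed: {parts}"
```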
S3ObjectClient
Protocol: implement these members in your own class; do not subclass.
class S3ObjectClient(Protocol): ...
What the resolver requires of an S3 client (structural).
S3ObjectClient.delete_objects
def delete_objects(*, Bucket: str, Delete: DeleteTypeDef) -> DeleteObjectsOutputTypeDef
Batch-delete up to 1000 (key, version) pairs.
S3ObjectClient.get_object
def get_object(*, Bucket: str, Key: str) -> GetObjectOutputTypeDef
Fetch one object’s body and metadata.
S3ObjectClient.head_object
def head_object(*, Bucket: str, Key: str) -> HeadObjectOutputTypeDef
Fetch one object’s metadata without the body.
S3ObjectClient.list_object_versions
def list_object_versions(*, Bucket: str, Prefix: str, KeyMarker: str = ..., VersionIdMarker: str = ...) -> ListObjectVersionsOutputTypeDef
Page every object version and delete marker under a prefix.
S3ObjectClient.list_objects_v2
def list_objects_v2(*, Bucket: str, Prefix: str, ContinuationToken: str = ...) -> ListObjectsV2OutputTypeDef
Page the current objects under a prefix.
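Structural typing means a class satisfies the protocol by having the right methods, with no inheritance involved. The sketch below shows the idea on a one-method slice: `ObjectLister` and `InMemoryStore` are hypothetical, and a real implementation would need all five members listed above.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ObjectLister(Protocol):
    """Hypothetical one-method slice of the client protocol, for illustration."""

    def list_objects_v2(
        self, *, Bucket: str, Prefix: str, ContinuationToken: str = ...
    ) -> dict[str, Any]: ...


class InMemoryStore:
    """Satisfies ObjectLister by shape alone; note there is no base class."""

    def __init__(self, keys: list[str]) -> None:
        self._keys = sorted(keys)

    def list_objects_v2(
        self, *, Bucket: str, Prefix: str, ContinuationToken: str = ""
    ) -> dict[str, Any]:
        contents = [{"Key": key} for key in self._keys if key.startswith(Prefix)]
        return {"Contents": contents, "IsTruncated": False}


store = InMemoryStore(["users/42/avatar.png", "users/4/note.txt"])
# The runtime check looks at method presence only, not at signatures.
conforms = isinstance(store, ObjectLister)
```

A static type checker performs the stricter signature comparison; the runtime `isinstance` check is only a coarse sanity test.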
S3Resolver
class S3Resolver
def __init__(bucket: str, *, client: S3ObjectClient | None = None, region_name: str | None = None, include_content: bool = True, max_object_bytes: int | None = None) -> None
Exports and erases a subject’s objects held under an S3 key prefix.
Expects refs of kind "s3" (refs are routed to the resolver whose
name equals their kind — ADR 0008) whose value is the subject’s key
prefix, e.g. "users/42/"; the bucket is fixed at construction.
The prefix must be non-blank and end with "/" — anything else
raises ResolverError before any S3 call,
because an unterminated prefix also matches sibling subjects
(users/4 matches users/42/...) and a blank one is the whole
bucket.
Erasure deletes every object version and delete marker under the
prefix: a plain delete on a versioned bucket only hides data behind a
delete marker, which is not erasure. Unversioned buckets take the
same path (S3 reports their versions as "null"). Exports cover
current versions and, by default, include each object’s content
base64-encoded — for user-generated objects the bytes usually are
the personal data. include_content=False is appropriate only when
the controller provides the files through another complete channel.
Idempotency: a prefix under which S3 holds nothing yields
already_absent=True, which is success, never an error. A partially
failed batch delete keeps deleting the rest, then raises
PartialEraseError so the saga retries; re-deletes
are no-ops, so retries converge.
Error taxonomy (see effaced_s3.errors): credential,
permission, missing-bucket, and wrong-endpoint failures raise
ResolverError; throttling, connection
faults, S3-side errors, and unknown codes propagate so the saga
runner retries. SDK-internal retries are disabled — the saga runner
owns retry and backoff (ADR 0010).
Fields:
- covered_surface (CoveredSurface): The S3 object PII this resolver claims to reach (AttestingResolver). Returns: S3_COVERED_SURFACE, built from the exporter’s object-field tuple so it cannot drift.
- name (str): Stable resolver name recorded in manifests and audits.
S3Resolver.erase_subject
async def erase_subject(ref: SubjectRef) -> ResolverErasure
Delete every object version under the subject’s prefix (Art. 17).
Args:
- ref (SubjectRef): kind="s3", value=<key prefix>.
Returns:
ResolverErasure: The outcome; already_absent=True if S3 already held nothing under the prefix.
Raises:
ResolverError: The credentials are invalid or lack a permission, the bucket does not exist, the prefix is blank or not "/"-terminated, or S3 refused every failed deletion for non-retryable reasons; retrying cannot succeed.
PartialEraseError: Some versions failed transiently this attempt; propagates so the saga retries to convergence.
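The outcomes listed above can be read as a small decision procedure over the number of identifiers found and the error codes the batch delete returned. The sketch below is one reading of that contract, not the resolver's code: `classify_erasure`, `AbandonError`, and `RetryLaterError` are hypothetical, with the two exceptions standing in for `ResolverError` and `PartialEraseError`.

```python
NONRETRYABLE = frozenset({
    "AccessDenied", "AllAccessDisabled", "AccountProblem", "InvalidAccessKeyId",
    "SignatureDoesNotMatch", "InvalidBucketName", "NoSuchBucket", "PermanentRedirect",
})


class AbandonError(Exception):
    """Hypothetical stand-in for ResolverError: retrying cannot succeed."""


class RetryLaterError(Exception):
    """Hypothetical stand-in for PartialEraseError: the saga should retry."""


def classify_erasure(identifier_count: int, failure_codes: list[str]) -> dict[str, bool]:
    """Turn one erase attempt's results into an outcome or an exception."""
    if identifier_count == 0:
        # Nothing under the prefix: success, reported as already absent.
        return {"already_absent": True}
    if not failure_codes:
        return {"already_absent": False}
    if all(code in NONRETRYABLE for code in failure_codes):
        # Every failure is permanent, so a retry would fail the same way.
        raise AbandonError(f"{len(failure_codes)} deletion(s) refused permanently")
    # At least one failure may be transient; survivors are retried next attempt.
    raise RetryLaterError(f"{len(failure_codes)} deletion(s) failed this attempt")
```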
S3Resolver.export_subject
async def export_subject(ref: SubjectRef) -> ResolverExport
Collect the objects held under the subject’s prefix (Art. 15).
Args:
- ref (SubjectRef): kind="s3", value=<key prefix>.
Returns:
ResolverExport: Per object: key, size, content type, last-modified, user metadata, and (unless disabled) the base64-encoded body. Empty when nothing lives under the prefix.
Raises:
ResolverError: The credentials are invalid or lack a permission, the bucket does not exist, the prefix is blank or not "/"-terminated, or an object exceeds max_object_bytes; retrying cannot succeed.
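Base64-encoding the body lets arbitrary bytes travel inside a text export. A minimal sketch of one exported object follows; `build_record` and its field names are hypothetical and do not reflect the real `ExportRecord` layout.

```python
import base64
from typing import Any, Optional


def build_record(entry: dict[str, Any], body: Optional[bytes]) -> dict[str, Any]:
    """Describe one object; attach the body only when content was fetched."""
    record: dict[str, Any] = {"key": entry["Key"], "size": entry["Size"]}
    if body is not None:
        # Base64 keeps binary content intact inside a text-only bundle.
        record["content_base64"] = base64.b64encode(body).decode("ascii")
    return record
```

Passing `body=None` models the metadata-only case, where the record describes the object without carrying its bytes.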