feat(s3,gcs): namespace-scoped sync for multi-repo shared datastores (#533) #26
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "worktree-533"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Implements namespace-scoped sync in the S3 and GCS datastore extensions, completing the extension side of giga-swamp (swamp-club#525). When
options.namespaceis set, index operations scope to{namespace}/.datastore-index.jsonand data walks restrict to the namespace subtree. Three new methods support cross-namespace catalog exchange.Closes swamp-club#533.
What changed
Interfaces (
interfaces.tsin both extensions):DatastoreSyncOptions.namespace?: stringSyncCapabilities.namespacedSync?: booleanCatalogExportRow,CatalogExportEntrytypesDatastoreSyncService:exportCatalog,pullForeignCatalogs,fetchForeignContentSync service (
s3_cache_sync.ts,gcs_cache_sync.ts):bindNamespace()— stores namespace from firstpullChanged/pushChangedcall, asserts immutability for the instance lifetimeindexKey()— returns{namespace}/.datastore-index.jsonor global.datastore-index.json.datastore-index.jsonS3/GCS operations updated to useindexKey()discoverIndexFromBucketscoped tolistAllObjects({namespace}/)when namespace setexportCatalog— PUT catalog JSON to{namespace}/.catalog-export.jsonpullForeignCatalogs— GET foreign catalogs, skip missing/malformed JSON silentlyfetchForeignContent— GET single file from foreign namespace with path traversal validationcapabilities()returnsnamespacedSync: trueSolo mode is fully backward compatible — no namespace means global index, identical to before.
Why it's correct
Namespace immutability: A sync service instance is created once per command lifecycle.
bindNamespace()stores the namespace from the first call and throws on mismatch — prevents cross-contamination without needingMap<namespace, state>.Foreign methods bypass cached state:
exportCatalog,pullForeignCatalogs, andfetchForeignContentconstruct S3/GCS keys directly — they don't touch the index cache, sidecar, or namespace assertion. This is correct because foreign operations read from other namespaces, not the bound one.Discovery scoping:
discoverIndexFromBucketuseslistAllObjects({namespace}/)so synthesized indexes only contain the current namespace's keys — no cross-namespace entries.Path traversal defense:
fetchForeignContentrejects paths with../segments or absolute paths before constructing the storage key.Verification
Unit tests (179 total, 0 failed)
E2E via swamp CLI against MinIO (21 passed, 0 failed)
Files changed
Side finding
Filed swamp-club#534:
swamp datastore sync --pushwas not threading namespace intopushChanged()options, creating a spurious global index. Fixed in swamp20260603.013237.0.Co-Authored-By: stack72 paul@systeminit.com
Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com
When options.namespace is set, scope index operations to {namespace}/.datastore-index.json and restrict data walks to the namespace subtree. Add exportCatalog, pullForeignCatalogs, and fetchForeignContent methods for cross-namespace catalog exchange. Solo mode (no namespace) is fully backward compatible — all existing tests pass unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>Adversarial Review
Critical / High
CRITICAL -- SSH regression: trailing newline fix reverted (ssh/extensions/models/_lib/operations.ts:313, ssh/extensions/models/ssh.ts:93, ssh/manifest.yaml:3)
Origin/main contains fix 511/25 at SSH version 2026.06.03.1 which ensures identityContent ends with a trailing newline before writing to the temp file. This PR removes that logic, downgrades the version to 2026.06.01.2, removes the upgrade entry, and deletes the corresponding test.
How it breaks: OpenSSH rejects PEM private keys without a trailing newline with "Load key: invalid format". If a vault-stored key has its final newline stripped (common when secrets are stored or retrieved via APIs that trim whitespace), every exec, script, and copy call against that host will fail.
Concrete example: A vault stores a key ending with "-----END OPENSSH PRIVATE KEY-----" (no trailing newline). Before this PR, materializeTempKeys appended the newline. After this PR, the key is written as-is, and ssh fails with "Load key: invalid format".
Fix: Either revert the SSH changes entirely (they appear unrelated to the namespace-scoped sync feature), or provide a different fix for the underlying issue. The version downgrade from 2026.06.03.1 to 2026.06.01.2 is also a manifest regression that would confuse the publishing pipeline.
Medium
MEDIUM -- No validation on namespace parameter (s3_cache_sync.ts:479, gcs_cache_sync.ts:438, plus exportCatalog, pullForeignCatalogs, fetchForeignContent in both files)
The namespace string is interpolated directly into object keys with zero validation across all methods: bindNamespace, indexKey, exportCatalog, pullForeignCatalogs, and fetchForeignContent. While S3/GCS keys are opaque strings (no real path traversal), edge cases produce surprising behavior:
Empty string: bindNamespace with empty string sets namespaceBound=true with namespace="". Since indexKey uses a truthy check, empty string falls through to solo mode. But subsequent calls with namespace=undefined trigger "Namespace mismatch" because "" !== undefined. A caller passing namespace="" instead of omitting the field permanently poisons the service instance.
Slashes or special chars: namespace "a/b" produces keys like "a/b/.datastore-index.json" -- valid but may not match the design intent of a single path prefix.
Suggested fix: Add a validation check in bindNamespace that rejects empty strings, slashes, and ".." segments, matching the fetchForeignContent path validation pattern.
MEDIUM -- fetchForeignContent validates relPath but not namespace (s3_cache_sync.ts:1687-1692, gcs_cache_sync.ts:1497-1502)
fetchForeignContent has path traversal protection on relPath (checking for leading slash, backslash, and ".." segments) but applies no validation to the namespace parameter. A call like fetchForeignContent with a namespace containing path-traversal-like segments constructs unexpected keys. While this is harmless in object stores (literal key), it is inconsistent with the security posture on relPath -- both parameters are caller-supplied strings used to build keys.
Suggested fix: Apply the same traversal checks to namespace that are applied to relPath.
Low
LOW -- pullForeignCatalogs fetches sequentially (s3_cache_sync.ts:1662-1679, gcs_cache_sync.ts:1472-1489)
The for-of loop fetches each foreign catalog with a separate HTTP request, one at a time. With many foreign namespaces, this creates a serial bottleneck. The existing push/pull paths in the same file use Promise.allSettled with batched concurrency for similar fan-out patterns.
Concrete example: 10 foreign namespaces x 200ms network latency per GET = 2 seconds of serial waiting.
Suggested fix: Use Promise.allSettled with the existing concurrency batching pattern.
Verdict
FAIL -- The SSH trailing newline fix revert is a regression that will cause production SSH failures when vault-stored keys lack trailing newlines. The SSH changes appear unrelated to the namespace-scoped sync feature and should be reverted from this PR.
The datastore namespace feature itself is well-structured with good test coverage and a clean separation between namespaced and solo modes. The medium-severity namespace validation gaps are real but non-blocking.
Code Review
Blocking Issues
None.
Suggestions
Push test doesn't validate that data keys are namespace-prefixed (
gcs_cache_sync_test.ts:3441,s3_cache_sync_test.ts:3963): The "pushChanged with namespace writes per-namespace index" test seeds local files atcachePath/data/model/1/raw(no namespace) and only asserts the INDEX ends up atmy-ns/.datastore-index.json. It doesn't verify that data files land atmy-ns/data/model/1/rawin the bucket. The pull test andfetchForeignContentboth assume data IS at namespace-prefixed GCS paths (gcs.storage.set("my-ns/data/model/1/raw", ...)). In practice a push with no-namespace-prefixed local files would produce GCS keys likedata/model/1/raw, butfetchForeignContent("my-ns", "data/model/1/raw")would look formy-ns/data/model/1/raw— not found. The tests are internally consistent with themselves but do not together exercise the round-trip (pushnamespace →fetchForeignContentnamespace). Consider adding a round-trip test, or clarifying the docstring onDatastoreSyncOptions.namespaceto say whether namespace scopes data paths or only the index.namespaceparameter inexportCatalogandpullForeignCatalogsis unvalidated (gcs_cache_sync.ts:1463,s3_cache_sync.ts:1652): TherelPathparameter infetchForeignContentis validated for path traversal, but thenamespaceparameter is not. A namespace containing..would produce a GCS key like../evil/.catalog-export.json. Cloud storage APIs treat object names as opaque strings so there's no filesystem-level traversal risk, but a defensiveif (namespace.includes("..") || namespace.startsWith("/"))guard would be consistent withfetchForeignContent's treatment ofrelPathand prevent accidentally scoping to an unintended prefix. Since namespace is trust-internal this is low severity.writePartitionedIndextype inconsistency between GCS and S3 (gcs_cache_sync.ts:1384): GCS usesArray<Promise<unknown>> = []while S3 usesPromise<void>[]. Both work correctly because errors are swallowed, but usingPromise<void>[]in both (and keeping the.then(() => {})pattern like S3 does) would make the two implementations consistent with each other.SSH
manifest.yamldoesn't include a version bump rationale in this PR's context (ssh/manifest.yaml): The ssh extension version changed to2026.06.01.2alongside the s3/gcs changes. There are no functional ssh changes visible in the diff that would explain a version bump in the same PR as the datastore namespace feature. If the ssh version was bumped solely to pick up infrastructure changes (e.g. updated test scaffolding), a brief note in the PR description about what drove that bump would help future reviewers understand why ssh changed in a datastore PR. This is a documentation nit only — the code itself looks correct.e79026fe6fb4929f30d1Code Review
Blocking Issues
None.
Suggestions
isLazySkippableandpartitionKeyFromPathare namespace-unaware, silently breakingmetadataOnlyand scoped sync in namespace mode.Both functions check
parts[0] === "data", but in namespace mode the index entries carry the namespace prefix (e.g.my-ns/data/model/1/raw), soparts[0]ismy-ns. Consequently:isLazySkippable("my-ns/data/model/1/raw")returnsfalse→metadataOnly: truesilently downloads all files, defeating the lazy-hydration optimization.partitionKeyFromPath("my-ns/data/model/1/raw")returnsundefined→ scoped sync silently falls back to the monolithic index in namespace mode.capabilities()advertisesscopedSync: true,lazyHydration: true, andnamespacedSync: truesimultaneously, implying these can be composed. No existing test covers thenamespace + metadataOnlyornamespace + context.modelscombinations. If swamp-core combines them, behaviour will be silently wrong. This is consistent across the S3 and GCS implementations.Suggested fix: strip the namespace prefix before passing the relative path to
isLazySkippable/partitionKeyFromPath(or add namespace-aware variants), and add tests for the cross-feature combinations.isInternalCacheFiledoesn't exclude namespace-prefixed internal files fromdiscoverIndexFromBucket.discoverIndexFromBucketlists objects under${namespace}/, and then filters viaisInternalCacheFile. The function checks for the bare form (.datastore-index.json,.datastore-sync-state.json, etc.) but NOT the namespace-prefixed form (my-ns/.catalog-export.json). IfexportCataloghas written amy-ns/.catalog-export.jsonto GCS but the namespace index is later lost and re-discovered, the catalog file appears in the synthesized index entries. SubsequentpullChangedcalls would download it as a data file, polluting the local cache.The bug manifests in the fairly narrow window where the namespace index is absent but a catalog export exists. Suggested fix: extend
isInternalCacheFile(or thefilteredpredicate indiscoverIndexFromBucket) to strip any namespace prefix before pattern-matching, or add.catalog-export.jsonas a basename exclusion alongside_catalog.db.writePartitionedIndexwrites to the global_index/prefix rather than a namespace-scoped prefix.Partition files are written to
_index/{key}.jsonunconditionally, even in namespace mode. Two namespaces sharing a bucket and containing models with the same type+id would overwrite each other's partition files. Failures here are silently swallowed (.catch(() => {})), so the monolithic index remains correct and there's no data loss, but scoped-sync clients fall back to the monolithic path instead of the cheaper partition path. Worth noting as a follow-up once namespace + scoped-sync is supported (see suggestion 1).namespaceparameter is not validated inexportCatalog,pullForeignCatalogs, andfetchForeignContent.A caller could pass a value containing path separators or unusual characters. Because these values are used to construct cloud-storage object keys (not local filesystem paths), there's no local filesystem traversal risk, and the
assertSafePathguard covers the local side. However, an untrusted namespace value could create unexpected object-key shapes or collide with internal prefixes. Low severity, but worth adding a basic guard (e.g., reject values containing/or..) if namespaces are meant to be opaque identifiers.Minor type inconsistency in
writePartitionedIndex.GCS uses
const writes: Array<Promise<unknown>> = [];while S3 usesconst writes: Promise<void>[] = [];. Both appended values arePromise<void>after.catch(() => {}), soArray<Promise<void>>is more precise. Not functionally wrong, but breaks the intended symmetry between the two implementations.Adversarial Review
Medium
Partition index files are not namespace-scoped — cross-contamination risk on scoped pull
s3_cache_sync.ts:1567,s3_cache_sync.ts:1608-1609,gcs_cache_sync.ts:1392,gcs_cache_sync.ts:1422-1423writePartitionedIndexwrites to hardcoded_index/{key}.jsonandpullPartitionedIndexreads from_index/{key}.json, both without namespace prefix. In namespaced mode, the monolithic index correctly scopes to{namespace}/.datastore-index.json, but partition files remain global._index/data--aws--ec2--instance.jsoncontaining entries keyed asdata/aws/ec2/instance/.... A new namespaced client callspullChanged({ namespace: "team-a", context: { models: [...] } }). The scoped pull path callspullPartitionedIndex, which reads the global partition files, merges non-namespaced entries (data/aws/ec2/...) into the namespaced in-memory index, and on the nextpushChanged, writes them intoteam-a/.datastore-index.json— contaminating the namespace index with global entries.ns/data/...),partitionKeyFromPathreturnsundefinedbecause it checksrel.startsWith("data/"), so partition writes are no-ops and reads fall back to the correct monolithic index. The issue only manifests during solo→namespace migration on the same bucket with scoped sync active.this.namespace ? `${this.namespace}/_index/${key}.json` : `_index/${key}.json`. Or document that scoped sync is not supported in namespaced mode and skip thepullPartitionedIndexpath when namespace is set.pullForeignCatalogsswallowsAbortErrorfrom signal cancellations3_cache_sync.ts:1675,gcs_cache_sync.ts:1485catch {}in the namespace iteration loop catches all errors, includingAbortErrorthrown when the caller's signal fires. Once anAbortSignalis aborted it stays aborted, so every subsequentgetObjectalso throwsAbortError, which is also caught. The method returns an empty or partial array instead of propagating the abort.AbortSignalwith a 5-second timeout and requests catalogs from["ns-a", "ns-b", "ns-c"]. Network is slow; the signal fires during thens-afetch. The method silently skips all three namespaces and returns[]. The caller proceeds as if no foreign catalogs exist — incorrect but no error is raised.No validation on
namespaceparameter inexportCatalogandpullForeignCatalogss3_cache_sync.ts:1649-1654,s3_cache_sync.ts:1667-1668,gcs_cache_sync.ts:1459-1464,gcs_cache_sync.ts:1478-1479fetchForeignContentvalidatesrelPathfor path traversal (../, leading/, leading\), butexportCatalogandpullForeignCatalogsuse thenamespaceparameter directly in key construction (${namespace}/.catalog-export.json) with zero validation. The same namespace parameter is also used unvalidated infetchForeignContent's key construction and inbindNamespaceforpullChanged/pushChanged.exportCatalog("", rows)writes to key/.catalog-export.json.exportCatalog("ns/../../other", rows)writes tons/../../other/.catalog-export.json. While S3/GCS treats keys as opaque strings (no filesystem traversal), inconsistent validation across parameters in the same API surface is a code smell, and empty or malformed namespaces could cause confusing key layouts.assertValidNamespace(ns)that rejects empty strings, strings containing..segments, leading/trailing slashes, and backslashes. Apply it inbindNamespace,exportCatalog,pullForeignCatalogs, andfetchForeignContent.Low
s3_cache_sync.ts:466-469,gcs_cache_sync.ts:432-433.datastore-sync-state.json) is stored atjoin(cachePath, SYNC_STATE_FILE)regardless of namespace. If two different namespaces share a cachePath (e.g.,~/.cache/swamp/datastores/my-store/), the sidecar written by namespace A contains the ETag/generation fromns-a/.datastore-index.json. When namespace B starts and reads the sidecar, it compares againstns-b/.datastore-index.json— the ETags will almost certainly differ, so the fast path falls through to the slow path. The only risk is a vanishingly unlikely ETag collision causing the fast path to incorrectly return 0.bindNamespacecheck prevents within-instance switching.Verdict
PASS — The namespace-scoped sync feature is well-structured with good test coverage (14 new tests across S3 and GCS). The
bindNamespaceimmutability guard,indexKey()scoping,discoverIndexFromBucketprefix filtering, andfetchForeignContentpath traversal check are all solid. The medium findings are edge cases in mixed solo↔namespace migration or abort semantics — none represent data loss or security risk in the primary use case (pure namespaced or pure solo mode). The code is backward compatible for solo mode as claimed.