S3-First Architecture
Using S3 as the durable source of truth for user data, with a relational database as a query-optimized read layer.
Last updated: 2026-04-12
Overview
The pattern: all writes go to S3. A Lambda function syncs changes to PostgreSQL. List queries hit the database (fast). Single-item reads go to S3 (freshest data).
Writes → S3 (source of truth)
↓
Lambda trigger
↓
PostgreSQL (fs_files table)
↓
Reads ← Fast queries
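The routing above can be sketched with in-memory stubs standing in for boto3 and the PostgreSQL client (the stub names and interfaces are assumptions, not the real system's API): writes and single-item reads go to the S3 side, list queries go to the database side.

```python
class S3Stub:
    """Stand-in for the S3 source of truth (real system: boto3)."""
    def __init__(self):
        self.objects = {}
    def put(self, key, body):
        self.objects[key] = body
    def get(self, key):
        return self.objects[key]

class DBStub:
    """Stand-in for the PostgreSQL fs_files read layer."""
    def __init__(self):
        self.rows = set()
    def upsert(self, key):
        self.rows.add(key)
    def list_prefix(self, prefix):
        return sorted(k for k in self.rows if k.startswith(prefix))

s3, db = S3Stub(), DBStub()

def write(key, body):
    # All writes land in S3 first; the sync Lambda (simulated inline
    # here) mirrors the metadata into the database afterwards.
    s3.put(key, body)
    db.upsert(key)

def read(key):
    # Single-item reads go straight to S3 for the freshest data.
    return s3.get(key)

def list_files(prefix):
    # List queries hit the database, which is indexed for this.
    return db.list_prefix(prefix)

write("private/memories/UserMemories.md", "# Memories\n")
print(list_files("private/"))
print(read("private/memories/UserMemories.md"))
```

The point of the split is that each backend serves the query shape it is good at: S3 is durable and fresh per object but slow to enumerate, while the database enumerates cheaply but is only eventually consistent with S3.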
This inverts the typical assumption that a database is always the right primary store for user data.
Key Points
- S3 durability: 11 nines. A typical Postgres instance doesn’t come close
- Versioning for free: S3 versioning gives an automatic audit trail — every write is tracked without any application-layer logging
- Human-readable debugging: YAML and markdown files can be inspected with `cat`. No DB client, no query needed
- Cost: S3 storage is significantly cheaper than database storage at scale
- Sync architecture: two Lambda functions keep S3 and PostgreSQL in sync
  - fs-sync: triggered by S3 upload/delete events via SNS → real-time upsert/delete in `fs_files`
  - fs-reconcile: EventBridge every 3 hours → full S3 vs DB scan, fixes any discrepancies from cold starts or network blips
  - Both use upsert with timestamp guards so newer data always wins
- User memories: each user has `/private/memories/UserMemories.md` in S3. Plain markdown, editable in the UI. Loaded and injected as context on every conversation — no schema migrations needed
- Skills and watchlists follow the same pattern: YAML files in S3, queried via PostgreSQL
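The timestamp guard mentioned above can be expressed as a conditional upsert. A minimal sketch, using in-memory SQLite as a stand-in for PostgreSQL (the column names are assumptions; Postgres uses the same `ON CONFLICT ... DO UPDATE ... WHERE` shape):

```python
import sqlite3

# In-memory stand-in for the synced fs_files table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fs_files (
        key TEXT PRIMARY KEY,
        size INTEGER,
        last_modified TEXT  -- ISO-8601 timestamps order correctly as text
    )
""")

def upsert_file(key, size, last_modified):
    # Timestamp guard: the update applies only when the incoming event is
    # newer, so replayed or late-arriving events cannot clobber fresh data.
    conn.execute(
        """
        INSERT INTO fs_files (key, size, last_modified)
        VALUES (?, ?, ?)
        ON CONFLICT(key) DO UPDATE SET
            size = excluded.size,
            last_modified = excluded.last_modified
        WHERE excluded.last_modified > fs_files.last_modified
        """,
        (key, size, last_modified),
    )

upsert_file("private/memories/UserMemories.md", 120, "2026-04-12T10:00:00Z")
upsert_file("private/memories/UserMemories.md", 999, "2026-04-11T09:00:00Z")  # stale event

size, ts = conn.execute(
    "SELECT size, last_modified FROM fs_files WHERE key = ?",
    ("private/memories/UserMemories.md",),
).fetchone()
print(size, ts)
```

Because both fs-sync and fs-reconcile write through the same guard, it does not matter which one processes an event first — the row always converges to the newest S3 state.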
Connections
- agent-sandbox — the sandbox mount system exposes S3 prefixes as filesystem paths to the agent
- claude-code-skills — skills are stored in S3, discovered via SQL query against the synced `fs_files` table
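Skill discovery via the synced table might look like the sketch below — again with SQLite standing in for PostgreSQL, and with the `skills/` key prefix and column names assumed for illustration:

```python
import sqlite3

# Stand-in fs_files table populated with a few synced S3 keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fs_files (key TEXT PRIMARY KEY, last_modified TEXT)")
conn.executemany(
    "INSERT INTO fs_files VALUES (?, ?)",
    [
        ("skills/summarize.yaml", "2026-04-10T00:00:00Z"),
        ("skills/watchlist-check.yaml", "2026-04-11T00:00:00Z"),
        ("private/memories/UserMemories.md", "2026-04-12T00:00:00Z"),
    ],
)

def discover_skills():
    # A prefix filter on the indexed key column enumerates skills without
    # paging through S3's slower ListObjects API.
    return [row[0] for row in conn.execute(
        "SELECT key FROM fs_files WHERE key LIKE 'skills/%' ORDER BY key"
    )]

print(discover_skills())
```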
Sources
- Lessons from Building AI Agents for Financial Services — added 2026-04-12