Storage Architecture
ICE uses three distinct storage interaction patterns. Each serves a different function and requires separate configuration.
1. Active Semantic Ledger (Hot Storage)
The active memory store. All session history, context fragments, and vector embeddings live here during their active lifespan.
Technology: PostgreSQL with the pgvector extension.
Why not object storage (S3)? The retrieval path requires sub-second vector similarity queries with per-tenant row-level security (RLS) enforcement. Object storage adds an HTTP round trip and serialization overhead to every read, which is incompatible with this access pattern. PostgreSQL on NVMe-backed storage is the correct substrate.
Configuration:
| Variable | Description | Required (default) |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string. Must have pgvector installed. | Yes |
| REDIS_URL | Redis connection string for the Hot-Cache (session sliding window). | Yes |
| ICE_MEMORY_CAP_GB | Hard RAM ceiling for the ICE process. Engine terminates cleanly if exceeded. | No (8) |
DATABASE_URL="postgresql://user:pass@db-host:5432/ice_db"
REDIS_URL="redis://cache-host:6379"
ICE_MEMORY_CAP_GB=16
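A deployment script may want to check these variables before starting ICE. The following is a minimal sketch of that idea, not ICE's own startup logic: `HotStorageConfig` and `load_hot_storage_config` are hypothetical names, and treating `ICE_MEMORY_CAP_GB` as a positive integer is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HotStorageConfig:
    """Hypothetical container for the three variables in the table above."""
    database_url: str
    redis_url: str
    memory_cap_gb: int


def load_hot_storage_config(env: Optional[Mapping[str, str]] = None) -> HotStorageConfig:
    """Read the variables and fail fast if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in ("DATABASE_URL", "REDIS_URL") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    # ICE_MEMORY_CAP_GB is optional; the table lists 8 as its default.
    # Parsing it as an integer is an assumption of this sketch.
    cap = int(env.get("ICE_MEMORY_CAP_GB", "8"))
    if cap <= 0:
        raise ValueError("ICE_MEMORY_CAP_GB must be a positive integer")
    return HotStorageConfig(env["DATABASE_URL"], env["REDIS_URL"], cap)
```

Calling `load_hot_storage_config()` with no argument reads from the process environment; passing a plain dict makes the check easy to unit test.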
2. Cold Storage Delegation
This pattern covers data retention and compliance archival. ICE does not write to S3 or any other object store directly. Instead, it delegates expired records to a webhook endpoint you control.
Delegation pipeline:
- Data lives in the active ledger for ICE_RETENTION_DAYS.
- Every 24 hours, the ICE retention_purge_loop identifies expired records.
- ICE POSTs each expired record as a JSON payload to ICE_PRE_PURGE_WEBHOOK_URL.
- Your webhook receives the payload and writes it to your object storage of choice (AWS S3, GCS, Azure Blob, Glacier, etc.).
- ICE permanently deletes the local record only after receiving a 2xx response from the webhook.
This design keeps cold storage configuration inside your infrastructure boundary. ICE has no direct knowledge of your bucket provider, credentials, or retention policy.
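The delete-only-after-2xx rule in the pipeline above can be sketched as follows. This is an illustration of the described behavior, not ICE's actual retention_purge_loop: `delegate_expired` and its two callables are hypothetical stand-ins for the webhook POST and the ledger delete.

```python
from typing import Callable, Dict, Iterable, List


def delegate_expired(
    expired: Iterable[Dict],
    post_to_webhook: Callable[[Dict], int],
    delete_local: Callable[[Dict], None],
) -> List[Dict]:
    """Archive each expired record via the webhook; delete locally only on 2xx.

    post_to_webhook returns the HTTP status code of the webhook response.
    Returns the records that were kept because the webhook did not confirm.
    """
    kept = []
    for record in expired:
        status = post_to_webhook(record)
        if 200 <= status < 300:
            delete_local(record)  # archive confirmed, safe to purge
        else:
            kept.append(record)   # no confirmation, record stays in the ledger
    return kept
```

The point of the ordering is that a record never exists in zero places: it is either still in the ledger or acknowledged by your archive endpoint.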
Configuration:
| Variable | Description | Required (default) |
|---|---|---|
| ICE_RETENTION_DAYS | Days a record stays in the active ledger before delegation. | No (30) |
| ICE_PRE_PURGE_WEBHOOK_URL | Endpoint that receives expired records for archival. | No |
ICE_RETENTION_DAYS=90
ICE_PRE_PURGE_WEBHOOK_URL="https://internal.yourcompany.com/webhooks/ice-archive"
Webhook payload (JSON):
{
  "user_id": "user-alice",
  "session_id": "project-alpha",
  "expired_at": "2026-05-10T00:00:00Z",
  "chunks": [
    { "chunk_id": "c_001", "text": "...", "embedding_model": "text-embedding-3-small" }
  ]
}
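On the receiving side, a webhook handler might validate the payload and persist it before answering. The sketch below assumes the top-level fields shown in the example payload; `handle_archive_request`, the in-memory `store`, and the object key layout are hypothetical, and the function would need to be wired into whatever web framework you use.

```python
import json
from typing import Dict, Tuple

# Top-level fields taken from the example payload above.
REQUIRED_FIELDS = ("user_id", "session_id", "expired_at", "chunks")


def handle_archive_request(body: bytes, store: Dict[str, bytes]) -> Tuple[int, str]:
    """Validate one payload and persist it; return (HTTP status, message)."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, "body is not valid JSON"
    if not isinstance(payload, dict):
        return 400, "payload must be a JSON object"
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return 400, "missing fields: " + ", ".join(missing)
    # Hypothetical key layout: one object per user, session, and expiry time.
    key = f"{payload['user_id']}/{payload['session_id']}/{payload['expired_at']}.json"
    store[key] = body  # stand-in for a durable write to S3, GCS, Azure Blob, etc.
    return 200, key
```

Because ICE deletes the local record once it sees a 2xx, a real handler should return success only after the write to object storage has been confirmed.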
ICE retries the webhook up to 3 times with exponential backoff before logging a permanent failure. Records are not deleted locally if all retries fail.
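The retry behavior can be approximated as below, for example to simulate ICE against your endpoint in tests. This is a sketch under stated assumptions: it reads "retries up to 3 times" as one initial attempt plus three retries, and the 1-second base delay with doubling is hypothetical, since the actual schedule is not specified here. `post_with_retry` is not an ICE function.

```python
import time
from typing import Callable


def post_with_retry(
    post: Callable[[], int],
    max_retries: int = 3,
    base_delay_s: float = 1.0,  # assumed value, not documented by ICE
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once a 2xx arrives, False after the final retry fails."""
    for attempt in range(max_retries + 1):  # first attempt plus retries
        try:
            status = post()
        except OSError:
            status = None  # a network failure counts as a failed attempt
        if status is not None and 200 <= status < 300:
            return True
        if attempt < max_retries:
            # Exponential backoff: the delay doubles after each failure.
            sleep(base_delay_s * 2 ** attempt)
    return False
```

A `False` result corresponds to the permanent-failure case, in which the record remains in the active ledger.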
3. Document Ingestion (Reading from Storage)
A separate pipeline for feeding external documents into the Semantic Ledger. This is not related to how ICE stores its internal session memory.
Local File System
Files must be in the sandboxed ICE_UPLOAD_DIR. See Multimodal Ingest for format details.
ice.ingest(
    file_path="annual_report_2026.pdf",  # Relative to ICE_UPLOAD_DIR
    x_user_id="user-alice",
    x_session_id="finance-analysis"
)
| Variable | Description | Default |
|---|---|---|
| ICE_UPLOAD_DIR | Sandboxed directory for local file ingestion. | /tmp/ice/uploads |
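To illustrate what "sandboxed" implies for a relative `file_path`, here is one way a caller could pre-check a path before handing it to ICE. It is a sketch of a common containment check, not ICE's internal validation; `resolve_in_upload_dir` is a hypothetical helper.

```python
from pathlib import Path


def resolve_in_upload_dir(file_path: str, upload_dir: str = "/tmp/ice/uploads") -> Path:
    """Resolve file_path against upload_dir and reject anything that escapes it."""
    base = Path(upload_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # attempts surface as paths outside the base directory.
    candidate = (base / file_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"{file_path!r} resolves outside {upload_dir}") from None
    return candidate
```

With the default directory, `"annual_report_2026.pdf"` is accepted, while `"../secrets.txt"` and absolute paths such as `"/etc/passwd"` raise `ValueError`.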
Object Storage (Cloud URI)
For ingesting documents directly from cloud object storage, pass a cloud URI. ICE handles the download, parsing, chunking, and vectorization internally.
AWS S3
ice.ingest(
    uri="s3://my-enterprise-bucket/project_alpha_docs/",
    x_user_id="user-alice",
    x_session_id="project-alpha"
)
Credentials are resolved from the standard AWS credential chain (environment variables, instance profile, or ECS task role).
Google Cloud Storage
ice.ingest(
    uri="gs://my-gcp-bucket/project_alpha_docs/",
    x_user_id="user-alice",
    x_session_id="project-alpha"
)
Credentials are resolved from the standard GCP credential chain (Application Default Credentials or a service account key).
No ICE-specific variable is required for either provider. Credentials are handled by the host environment.
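Both URI forms encode the provider in the scheme, followed by a bucket name and a key prefix. The sketch below shows how a caller might validate a URI before passing it to `ice.ingest`. It uses only the standard library; `CloudLocation` and `parse_cloud_uri` are hypothetical names, and this is not a description of how ICE parses URIs internally.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class CloudLocation(NamedTuple):
    provider: str
    bucket: str
    prefix: str


# The two schemes shown in the examples above.
_PROVIDERS = {"s3": "AWS S3", "gs": "Google Cloud Storage"}


def parse_cloud_uri(uri: str) -> CloudLocation:
    """Split a cloud URI into provider, bucket, and key prefix."""
    parsed = urlparse(uri)
    if parsed.scheme not in _PROVIDERS:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.netloc:
        raise ValueError("URI has no bucket name")
    # The path starts with "/", which is not part of the object key prefix.
    return CloudLocation(_PROVIDERS[parsed.scheme], parsed.netloc, parsed.path.lstrip("/"))
```

For example, `parse_cloud_uri("s3://my-enterprise-bucket/project_alpha_docs/")` yields the bucket `my-enterprise-bucket` and the prefix `project_alpha_docs/`.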
Storage Pattern Summary
| Pattern | Technology | ICE Touches Object Storage? | Configuration |
|---|---|---|---|
| Active Semantic Ledger | PostgreSQL + Redis | No | DATABASE_URL, REDIS_URL |
| Cold Storage Delegation | Your webhook → Your bucket | No (delegates) | ICE_RETENTION_DAYS, ICE_PRE_PURGE_WEBHOOK_URL |
| Document Ingestion (local) | Local filesystem | No | ICE_UPLOAD_DIR |
| Document Ingestion (S3) | AWS S3 (read-only) | Yes (read) | uri="s3://..." in SDK call |
| Document Ingestion (GCS) | Google Cloud Storage (read-only) | Yes (read) | uri="gs://..." in SDK call |