API Reference
ICE exposes an OpenAI-compatible REST API. If you already use an OpenAI, Anthropic, or Ollama SDK, point base_url at your ICE instance and add the two memory headers — nothing else changes.
Headers
These two headers are how ICE scopes memory. Every request to /v1/chat/completions and /v1/ingest should include them.
| Header | Required | Description |
|---|---|---|
| X-Session-Id | Yes | Identifies the conversation or workspace. All context is stored and retrieved under this ID. |
| X-User-Id | No (default: default-user) | Identifies the user. Enforces per-user data isolation via Row-Level Security at the database layer. |
X-Session-Id is required for memory: if it is omitted, ICE operates statelessly, and no memory is stored or retrieved.
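The header rules above can be captured in a small helper. This is a minimal sketch, not part of the ICE SDK: `memory_headers` is a hypothetical name, and it assumes that leaving out X-User-Id lets the server apply its documented default.

```python
from typing import Dict, Optional


def memory_headers(session_id: Optional[str] = None,
                   user_id: Optional[str] = None) -> Dict[str, str]:
    """Build the ICE memory headers (hypothetical helper, not an ICE API)."""
    headers: Dict[str, str] = {}
    if session_id:
        # With a session ID, context is stored and retrieved under it.
        headers["X-Session-Id"] = session_id
    if user_id:
        # When omitted, the server falls back to its default, "default-user".
        headers["X-User-Id"] = user_id
    return headers


print(memory_headers("project-alpha", "alice"))
print(memory_headers("project-alpha"))
print(memory_headers())  # no headers: the request is handled statelessly
```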
Endpoints
POST /v1/chat/completions
Drop-in replacement for the OpenAI chat completions endpoint. ICE retrieves relevant context from the session ledger and injects it into the prompt before forwarding to the upstream model.
Request
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Session-Id: project-alpha" \
  -H "X-User-Id: alice" \
  -d '{
    "model": "gpt-4o",
    "messages": [{ "role": "user", "content": "Summarise what we discussed yesterday." }],
    "stream": false
  }'
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name to route to (gpt-4o, claude-3-5-sonnet, llama3, etc.) |
| messages | array | Yes | Standard OpenAI messages array. |
| stream | boolean | No | true for SSE streaming, false for a single response object. |
| tools | array | No | Standard tool definitions. ICE persists tool call state across turns automatically. |
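For readers who prefer Python to curl, the request above can be assembled with the standard library alone. This is a sketch under assumptions: `build_chat_request` and `ICE_URL` are hypothetical names introduced here, and the instance is assumed to run at the same local address as in the curl example. Nothing is sent over the network; the function only builds the URL, headers, and JSON body.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

ICE_URL = "http://localhost:8000"  # assumed local instance, as in the curl example


def build_chat_request(
    model: str,
    messages: List[Dict[str, Any]],
    session_id: str,
    user_id: Optional[str] = None,
    stream: bool = False,
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for POST /v1/chat/completions."""
    headers = {"Content-Type": "application/json", "X-Session-Id": session_id}
    if user_id is not None:
        headers["X-User-Id"] = user_id
    payload: Dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    if tools is not None:
        payload["tools"] = tools  # optional field, sent only when provided
    body = json.dumps(payload).encode("utf-8")
    return f"{ICE_URL}/v1/chat/completions", headers, body


url, headers, body = build_chat_request(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise what we discussed yesterday."}],
    session_id="project-alpha",
    user_id="alice",
)
print(url)
print(headers)
print(json.loads(body)["stream"])
```

The returned tuple can be handed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")`.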
Response (non-streaming)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Yesterday we discussed..." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 210, "completion_tokens": 48, "total_tokens": 258 }
}
Response (streaming)
Standard SSE chunks, terminated with data: [DONE]. Format is identical to the OpenAI streaming spec.
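A client consuming the stream can reassemble the reply by reading each `data:` line until the `[DONE]` sentinel. The sketch below assumes the chunk shape of the OpenAI streaming format (`choices[].delta.content`); `iter_stream_text` is a hypothetical helper, and the sample lines are made up for illustration rather than captured from ICE.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative sample only; real chunks carry more fields (id, model, ...).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Yesterday "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "we discussed..."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```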
POST /v1/ingest
Loads a document into the session's memory ledger. After ingestion, content is available for retrieval in all subsequent chat completions under the same X-Session-Id.
Request
curl -X POST "http://localhost:8000/v1/ingest?file_path=annual_report.pdf" \
  -H "X-Session-Id: finance-q3" \
  -H "X-User-Id: alice"
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | query string | Yes | Path relative to ICE_UPLOAD_DIR. |
For cloud storage ingestion, use the SDK ingest() method with a uri parameter (s3://... or gs://...). See Storage Architecture.
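For the local-file request, `file_path` travels in the query string, so paths containing spaces or slashes need URL encoding. A minimal sketch using only the standard library; `ingest_url` and `ICE_URL` are hypothetical names, and the second path is an invented example.

```python
from urllib.parse import urlencode

ICE_URL = "http://localhost:8000"  # assumed local instance


def ingest_url(file_path: str) -> str:
    """Build the POST /v1/ingest URL with file_path safely URL-encoded."""
    return f"{ICE_URL}/v1/ingest?{urlencode({'file_path': file_path})}"


print(ingest_url("annual_report.pdf"))
print(ingest_url("reports/q3 summary.pdf"))  # slash and space get encoded
```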
Response
{
  "status": "success",
  "message": "File annual_report.pdf ingested successfully.",
  "tokens_processed": 128400
}
GET /health
Returns the engine status and connectivity of backing services.
curl http://localhost:8000/health
{
  "status": "online",
  "engine": "ICE v2.7.755",
  "ledger_status": "connected",
  "cache_status": "connected"
}
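A readiness probe can reduce this payload to a single boolean. The sketch below assumes that "online" and "connected" are the only healthy values, since those are the only ones shown here; `is_healthy` is a hypothetical helper, and any other value is treated as unhealthy.

```python
import json


def is_healthy(payload: dict) -> bool:
    """True only if the engine is online and both backing services are connected."""
    return (
        payload.get("status") == "online"
        and payload.get("ledger_status") == "connected"
        and payload.get("cache_status") == "connected"
    )


# The sample response from above, parsed as JSON.
sample_health = json.loads("""
{
  "status": "online",
  "engine": "ICE v2.7.755",
  "ledger_status": "connected",
  "cache_status": "connected"
}
""")
print(is_healthy(sample_health))
```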
SDK Usage
Examples are shown for Python first, then Node.js / TypeScript.
from ice.sdk import ICEClient

ice = ICEClient(api_url="http://localhost:8000")

# Chat with memory
response = ice.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did we cover last session?"}],
    x_session_id="project-alpha",
    x_user_id="alice"
)

# Ingest a local file
ice.ingest(
    file_path="spec.pdf",
    x_session_id="project-alpha",
    x_user_id="alice"
)

# Ingest from cloud storage
ice.ingest(
    uri="s3://my-bucket/docs/",
    x_session_id="project-alpha",
    x_user_id="alice"
)
import { IceClient } from '@dopove/ice-engine';

const ice = new IceClient({ licenseJwt: process.env.ICE_LICENSE_JWT });

// Chat with memory
const response = await ice.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What did we cover last session?" }],
  sessionId: "project-alpha",
  userId: "alice"
});

// Tool call: tool results are pinned across turns automatically
const toolResponse = await ice.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "user", content: "Check the server logs." },
    { role: "assistant", content: null, tool_calls: [/* ... */] },
    { role: "tool", content: '{"error": "timeout"}', name: "get_logs" }
  ],
  sessionId: "ops-debug-01"
});
Using Existing SDKs
Because ICE is OpenAI-compatible, you can use the official openai Python or JS library with no code changes beyond base_url and the two headers.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-used"  # ICE does not use an API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the Q3 report."}],
    extra_headers={
        "X-Session-Id": "finance-q3",
        "X-User-Id": "alice"
    }
)