API Reference

ICE exposes an OpenAI-compatible REST API. If you already use an OpenAI, Anthropic, or Ollama SDK, point base_url at your ICE instance and add the two memory headers — nothing else changes.

Headers

These two headers are how ICE scopes memory. Every request to /v1/chat/completions and /v1/ingest should include them.

| Header | Required | Description |
| --- | --- | --- |
| `X-Session-Id` | Yes | Identifies the conversation or workspace. All context is stored and retrieved under this ID. |
| `X-User-Id` | No (default: `default-user`) | Identifies the user. Enforces per-user data isolation via Row-Level Security at the database layer. |

Without X-Session-Id, ICE operates statelessly — no memory is stored or retrieved.
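As a small illustration of the scoping rules above, the two headers can be assembled in one place. This is only a sketch: `ice_headers` is a hypothetical helper name, not part of any ICE SDK, and it simply omits a header when no value is given so that the documented defaults apply.

```python
from typing import Dict, Optional


def ice_headers(session_id: Optional[str], user_id: Optional[str] = None) -> Dict[str, str]:
    """Build the ICE memory headers (hypothetical helper, not an SDK function)."""
    headers: Dict[str, str] = {}
    if session_id:
        # Without X-Session-Id the request is handled statelessly.
        headers["X-Session-Id"] = session_id
    if user_id:
        # When X-User-Id is omitted, ICE falls back to "default-user".
        headers["X-User-Id"] = user_id
    return headers


print(ice_headers("project-alpha", "alice"))
```

Passing the resulting dictionary as the headers of any HTTP client keeps the session and user scoping consistent across calls.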

Endpoints

`POST /v1/chat/completions`

Drop-in replacement for the OpenAI chat completions endpoint. ICE retrieves relevant context from the session ledger and injects it into the prompt before forwarding to the upstream model.

Request

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Session-Id: project-alpha" \
  -H "X-User-Id: alice" \
  -d '{
    "model": "gpt-4o",
    "messages": [{ "role": "user", "content": "Summarise what we discussed yesterday." }],
    "stream": false
  }'

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | string | Yes | Model name to route to (`gpt-4o`, `claude-3-5-sonnet`, `llama3`, etc.) |
| `messages` | array | Yes | Standard OpenAI messages array. |
| `stream` | boolean | No | `true` for SSE streaming, `false` for a single response object. |
| `tools` | array | No | Standard tool definitions. ICE persists tool call state across turns automatically. |
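For readers who prefer Python to curl, the same request can be sketched with the standard library alone. The function name `build_chat_request` is an illustrative assumption. The URL, headers, and body fields mirror the curl example. The request is only constructed here; sending it needs a running ICE instance.

```python
import json
import urllib.request


def build_chat_request(base_url, session_id, user_id, model, messages, stream=False):
    """Return an unsent urllib Request for POST /v1/chat/completions."""
    body = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Session-Id": session_id,
            "X-User-Id": user_id,
        },
        method="POST",
    )


req = build_chat_request(
    "http://localhost:8000",
    "project-alpha",
    "alice",
    "gpt-4o",
    [{"role": "user", "content": "Summarise what we discussed yesterday."}],
)
print(req.get_method(), req.full_url)

# To send it against a running instance:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)
```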

Response (non-streaming)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Yesterday we discussed..." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 210, "completion_tokens": 48, "total_tokens": 258 }
}
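A minimal sketch of reading this response in Python follows. `extract_reply` is a hypothetical helper, and the sample payload is copied from the example above rather than captured from a live server.

```python
import json

SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Yesterday we discussed..." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 210, "completion_tokens": 48, "total_tokens": 258 }
}
"""


def extract_reply(payload: dict) -> tuple:
    """Return (assistant text, finish reason, total tokens) from a completion payload."""
    choice = payload["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        payload["usage"]["total_tokens"],
    )


text, finish_reason, total_tokens = extract_reply(json.loads(SAMPLE_RESPONSE))
print(text, finish_reason, total_tokens)
```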

Response (streaming)

Standard SSE chunks, terminated with data: [DONE]. Format is identical to the OpenAI streaming spec.
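Since the stream follows the OpenAI format, each `data:` line carries a JSON chunk whose text sits in `choices[].delta.content`. The sketch below reassembles the text from such lines. The function name and the sample lines are illustrative assumptions, not output captured from ICE.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:
                yield fragment


# Illustrative sample of what a short stream might look like.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Yesterday "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "we discussed..."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample_lines)))
```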

`POST /v1/ingest`

Loads a document into the session's memory ledger. After ingestion, content is available for retrieval in all subsequent chat completions under the same X-Session-Id.

Request

curl -X POST "http://localhost:8000/v1/ingest?file_path=annual_report.pdf" \
  -H "X-Session-Id: finance-q3" \
  -H "X-User-Id: alice"

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `file_path` | query string | Yes | Path relative to `ICE_UPLOAD_DIR`. |

For cloud storage ingestion, use the SDK ingest() method with a uri parameter (s3://... or gs://...). See Storage Architecture.

Response

{
  "status": "success",
  "message": "File annual_report.pdf ingested successfully.",
  "tokens_processed": 128400
}
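The ingest call can be sketched in Python as well. The helper names below are hypothetical. URL-encoding `file_path` matters once paths contain spaces or slashes. Treating any `status` other than `"success"` as a failure is an assumption, since only the success payload is documented here.

```python
import urllib.parse
import urllib.request


def build_ingest_request(base_url, file_path, session_id, user_id):
    """Return an unsent urllib Request for POST /v1/ingest."""
    # file_path travels as a query parameter, so it must be URL-encoded.
    query = urllib.parse.urlencode({"file_path": file_path})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/ingest?{query}",
        headers={"X-Session-Id": session_id, "X-User-Id": user_id},
        method="POST",
    )


def ingest_succeeded(payload: dict) -> bool:
    """Assumption: any status other than "success" counts as a failure."""
    return payload.get("status") == "success"


ingest_req = build_ingest_request(
    "http://localhost:8000", "annual_report.pdf", "finance-q3", "alice"
)
print(ingest_req.full_url)
```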

`GET /health`

Returns the engine status and connectivity of backing services.

curl http://localhost:8000/health

{
  "status": "online",
  "engine": "ICE v2.7.755",
  "ledger_status": "connected",
  "cache_status": "connected"
}
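A readiness check built on this payload might look like the sketch below. It assumes that any value other than `"online"` or `"connected"` means the corresponding component is unavailable. Only the healthy values are documented here, so treat that as a guess. `is_healthy` is a hypothetical helper name.

```python
def is_healthy(payload: dict) -> bool:
    """Return True only if the engine and both backing services report healthy values."""
    # Assumption: values other than those shown in the example indicate a problem.
    return (
        payload.get("status") == "online"
        and payload.get("ledger_status") == "connected"
        and payload.get("cache_status") == "connected"
    )


sample_health = {
    "status": "online",
    "engine": "ICE v2.7.755",
    "ledger_status": "connected",
    "cache_status": "connected",
}
print(is_healthy(sample_health))
```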

SDK Usage

Python

from ice.sdk import ICEClient

ice = ICEClient(api_url="http://localhost:8000")

# Chat with memory
response = ice.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did we cover last session?"}],
    x_session_id="project-alpha",
    x_user_id="alice"
)

# Ingest a local file
ice.ingest(
    file_path="spec.pdf",
    x_session_id="project-alpha",
    x_user_id="alice"
)

# Ingest from cloud storage
ice.ingest(
    uri="s3://my-bucket/docs/",
    x_session_id="project-alpha",
    x_user_id="alice"
)

Node.js / TypeScript

import { IceClient } from '@dopove/ice-engine';

const ice = new IceClient({ licenseJwt: process.env.ICE_LICENSE_JWT });

// Chat with memory
const response = await ice.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "What did we cover last session?" }],
    sessionId: "project-alpha",
    userId: "alice"
});

// Tool call — tool results are pinned across turns automatically
const toolResponse = await ice.chat.completions.create({
    model: "gpt-4o",
    messages: [
        { role: "user", content: "Check the server logs." },
        { role: "assistant", content: null, tool_calls: [/* ... */] },
        { role: "tool", content: '{"error": "timeout"}', name: "get_logs" }
    ],
    sessionId: "ops-debug-01"
});

Using Existing SDKs

Because ICE is OpenAI-compatible, you can use the official openai Python or JS library with no code changes beyond base_url and the two headers.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-used"   # ICE does not use an API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the Q3 report."}],
    extra_headers={
        "X-Session-Id": "finance-q3",
        "X-User-Id": "alice"
    }
)
