MGPT API · v1
Build with MGPT
MGPT is the stateful model boundary behind Continual MI. It exposes a Responses-style API where each conversation lives inside a server-owned universe: send the universe id and the next user input, and MGPT loads the prompt snapshot, conversation buffer, and output contract from durable state before calling the model.
Introduction
The public surface lives at https://mdl.continualmi.com/api/mgpt/. The same Python MGPT service runs locally on http://127.0.0.1:8090/api/mgpt/ for development. There are three endpoints:
- POST /api/mgpt/universes — create a persistent universe from a server-side stored prompt.
- POST /api/mgpt/responses — generate the next assistant turn for a persisted MGPT API universe.
- POST /api/mgpt/images — generate an image for a generated scene (used by MDL gameplay).
The text generation aliases are currently mdl-1-lite-frozen, emw-1-lite-frozen, and cmi-frozen-overridable. New aliases will land as continual-learning checkpoints freeze.
A stateful Responses API
MGPT is intentionally not a thin chat-completions proxy. With chat completions the client must keep the entire history and resend the full prompt envelope every turn. With MGPT, history and configuration live on the server. The client identifies the universe by id and sends only the next input or the current turn's tool results.
Why this matters. Continual learning needs a durable place to attach per-universe state: KV-cache continuity, retrieval indexes, and later fast-weight or LoRA adapters. A stateless wrapper has nowhere to put that state. The Responses boundary lets MGPT own it without leaking implementation detail to the client.
The contract is small:
// Stateless chat completions (what we are NOT)
POST /v1/chat/completions
{
"model": "...",
"messages": [ /* entire history every request */ ],
"tools": [ /* sent every request */ ]
}
// Stateful MGPT Responses
POST /api/mgpt/responses
{
"model": "mdl-1-lite-frozen",
"universe": "813671db-dd71-4108-a1e7-76e9f3e2bc98",
"userInput": "Continue from the current state."
}

On every request MGPT loads the stored universe and transcript, rebuilds the provider prompt from MGPT-owned state, and calls the underlying provider. The client never sees the system prompt snapshot or provider routing policy.
Universes
A universe is the stateful container for one conversation. It is identified by a UUID and owned by a single user. The universe document holds:
- the prompt snapshot resolved at creation time;
- the transcript — user turns, assistant tool calls, tool results, and final assistant text;
- MGPT-owned metadata for model execution and future continual state.
Public MGPT API universes are persisted in Postgres under the mgpt schema. MDL gameplay universes remain separate under mdl.universes.
Universes are created from server-side stored prompts. The prompt text is copied into the universe's snapshot and never returned to the client. Updating a stored prompt file is a state change for future universes — existing universes keep their frozen snapshot.
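The freeze semantics can be modeled in a few lines. This is an illustrative sketch, not the server implementation: the in-memory dicts and field names (promptSnapshot, transcript) are stand-ins for the Postgres-backed stored-prompt files and universe rows.

```python
# Illustrative model of prompt-snapshot freezing: the prompt text is
# copied into the universe at creation time, so later edits to the
# stored prompt only affect universes created afterwards.
import uuid

stored_prompts = {"emwaver-prompt": "You are the EMWaver agent. v1"}
universes = {}

def create_universe(stored_prompt: str, display_name: str) -> str:
    """Resolve the stored prompt and copy its text into the new universe."""
    universe_id = str(uuid.uuid4())
    universes[universe_id] = {
        "promptSnapshot": stored_prompts[stored_prompt],  # frozen copy
        "displayName": display_name,
        "transcript": [],
    }
    return universe_id

uid_v1 = create_universe("emwaver-prompt", "EMWaver Agent")

# Updating the stored prompt is a state change for future universes only.
stored_prompts["emwaver-prompt"] = "You are the EMWaver agent. v2"
uid_v2 = create_universe("emwaver-prompt", "EMWaver Agent (new)")

assert universes[uid_v1]["promptSnapshot"].endswith("v1")  # frozen
assert universes[uid_v2]["promptSnapshot"].endswith("v2")  # picks up edit
```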
Authentication
Every endpoint authenticates with a bearer token. Keys are issued from the MGPT API console and stored hashed in core.api_keys. The current key format is cmi_live_<public>.<secret>.
Authorization: Bearer cmi_live_<public>.<secret>
Content-Type: application/json

Required scopes per route:
- mgpt.responses:create — for POST /universes and POST /responses.
- mgpt.images:create — for POST /images.
- mgpt.logs:read — for log inspection.
The platform gateway derives the calling user from the key owner. Do not send userId or usageForward from external clients — those fields are platform-owned.
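A small client helper for these headers. The key grammar is only loosely specified above, so the alphanumeric pattern here is an assumption; treat it as a local sanity check, not the authoritative key format.

```python
# Build the two required request headers from an MGPT API key.
# The cmi_live_<public>.<secret> shape comes from the docs; the
# assumption that both parts are alphanumeric is ours.
import re

KEY_PATTERN = re.compile(r"^cmi_live_[A-Za-z0-9]+\.[A-Za-z0-9]+$")

def auth_headers(api_key: str) -> dict:
    if not KEY_PATTERN.match(api_key):
        raise ValueError("expected a key of the form cmi_live_<public>.<secret>")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers("cmi_live_abc123.s3cr3t")
```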
Quickstart
Create a universe from a stored prompt, then send the first user input to that universe. The public API persists the transcript on the MGPT side.
export CONTINUAL_API_KEY="cmi_live_..."
UNIVERSE_ID=$(curl -s https://mdl.continualmi.com/api/mgpt/universes \
-H "Authorization: Bearer $CONTINUAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"storedPrompt": "emwaver-prompt",
"displayName": "EMWaver Agent"
}' | jq -r .universe)
curl https://mdl.continualmi.com/api/mgpt/responses \
-H "Authorization: Bearer $CONTINUAL_API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"emw-1-lite-frozen\",
\"universe\": \"$UNIVERSE_ID\",
\"userInput\": \"Help me inspect the connected CC1101.\"
}"Public Responses calls always run against stored MGPT API universes. MDL gameplay uses a separate internal backend route.
Creating a universe
Universes are created from server-side stored prompts. Pass the stored-prompt name and an optional display name. The prompt body is loaded from backend-api/prompts/<name>.md (or .txt), copied into the universe snapshot, and not echoed back.
curl https://mdl.continualmi.com/api/mgpt/universes \
-H "Authorization: Bearer $CONTINUAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"storedPrompt": "emwaver-prompt",
"displayName": "EMWaver Agent"
}'

The response returns the new universe id:
{
"universe": "813671db-dd71-4108-a1e7-76e9f3e2bc98",
"userId": "eaa0d5f0-81e9-4c79-b6f2-cf7db0cb5d63",
"prompt": {
"storedPrompt": "emwaver-prompt",
"stored": true
}
}

Stored universe requests
The normal generation path: send the universe id and the next user input. MGPT loads the universe, rebuilds the prompt from the persisted transcript, and calls the configured provider.
curl https://mdl.continualmi.com/api/mgpt/responses \
-H "Authorization: Bearer $CONTINUAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "mdl-1-lite-frozen",
"universe": "813671db-dd71-4108-a1e7-76e9f3e2bc98",
"userInput": "Continue from the current state.",
"requestId": "req_quickstart_001"
}'

MGPT stores the user message immediately. If the provider returns tool calls, those assistant tool-call messages are stored before the response is returned.
Tool continuation
When a response returns toolCalls, execute them locally and send the cumulative toolResults for that pending turn. MGPT appends the tool results to the same transcript before asking the provider for the final answer.
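Concretely, the continuation loop can be sketched as follows. Here send posts to /api/mgpt/responses and run_tool is a hypothetical local dispatcher; the toolCalls item shape is inferred from the examples in this section, and the arguments field in particular is an assumption. The continuation request body this loop produces matches the JSON example that follows.

```python
# Client-side tool-continuation loop: resend the same userInput with
# cumulative toolResults until the provider returns final text instead
# of more tool calls.
def drive_turn(send, run_tool, universe: str, user_input: str) -> str:
    tool_results = []  # cumulative for the pending turn
    while True:
        body = {
            "model": "emw-1-lite-frozen",
            "universe": universe,
            "userInput": user_input,
        }
        if tool_results:
            body["toolResults"] = tool_results
        resp = send(body)
        calls = resp.get("toolCalls") or []
        if not calls:
            return resp["assistantRaw"]  # final assistant text
        for call in calls:
            # `arguments` is an assumed field name for the call payload.
            output = run_tool(call["name"], call.get("arguments"))
            tool_results.append({
                "callId": call["callId"],
                "name": call["name"],
                "output": output,
                "ok": True,
            })
```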
{
"model": "emw-1-lite-frozen",
"universe": "813671db-dd71-4108-a1e7-76e9f3e2bc98",
"userInput": "Help me inspect the connected CC1101.",
"toolResults": [
{
"callId": "call_read_script",
"name": "read_script",
"output": { "ok": true, "result": { "bytes": 1200 } },
"ok": true
}
]
}

Request fields
| Field | Required | Notes |
|---|---|---|
| model | yes | Public alias. Currently mdl-1-lite-frozen, emw-1-lite-frozen, or cmi-frozen-overridable. |
| modelOverride | no | OpenRouter provider model override. Supported only with cmi-frozen-overridable; accepts any OpenRouter model id. |
| userInput | yes | The next user message. Already serialized by the caller. |
| universe | yes | MGPT API universe id created by POST /api/mgpt/universes. |
| requestId | no | Caller correlation id, surfaced in logs and diagnostics. |
| tools / toolChoice | no | Function-calling tools forwarded to the provider. |
Platform-owned fields (userId, usageForward) are filled in by the gateway and must not be sent from external clients.
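A client-side guard that mirrors the table above can catch mistakes before the gateway rejects them. The field sets are taken from this page (toolResults from the tool-continuation section); anything outside them is treated as unknown.

```python
# Validate a /responses payload against the documented field table.
# Rejects the platform-owned fields the gateway fills in itself.
REQUIRED = {"model", "userInput", "universe"}
OPTIONAL = {"modelOverride", "requestId", "tools", "toolChoice", "toolResults"}
PLATFORM_OWNED = {"userId", "usageForward"}

def validate_responses_payload(body: dict) -> None:
    missing = REQUIRED - body.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    forbidden = PLATFORM_OWNED & body.keys()
    if forbidden:
        raise ValueError(f"platform-owned fields must not be sent: {sorted(forbidden)}")
    unknown = body.keys() - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
```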
Response shape
MGPT returns the raw assistant text plus generation metadata. Consumer products own parse / apply / persist after the handoff.
{
"assistantRaw": "...",
"previewText": "...",
"promptFingerprint": "sha256:...",
"nextUniverseTime": "2026-05-06T12:00:00Z",
"requestLog": { "id": "log_..." },
"usageSettlement": { /* tokens, billing */ },
"timingBreakdown": { "totalMs": 612, "providerMs": 540 },
"diagnostics": {
"summary": { /* ... */ },
"serverStages": [ /* per-stage timings */ ],
"storage": { /* universe load info */ },
"prompt": { /* window selection */ },
"provider": { /* model + provider routing */ }
}
}

assistantRaw is the canonical handoff. Everything else is optional metadata — most clients only need assistantRaw and promptFingerprint.
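A minimal consumer of that handoff might look like this. It assumes only the declared sha256:&lt;hex&gt; shape of promptFingerprint; what MGPT actually hashes is server-internal, so only the format is checked.

```python
# Consume a /responses result: take assistantRaw as canonical and
# sanity-check the "sha256:<64 hex chars>" fingerprint format.
import re

FINGERPRINT = re.compile(r"^sha256:[0-9a-f]{64}$")

def handoff(resp: dict) -> tuple[str, str]:
    text = resp["assistantRaw"]
    fp = resp.get("promptFingerprint", "")
    if not FINGERPRINT.match(fp):
        raise ValueError(f"unexpected fingerprint format: {fp!r}")
    return text, fp
```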
Transcript persistence
The public Responses API persists every message MGPT produces or receives for a universe: user input, assistant tool calls, tool results, and final assistant text. Tool continuation calls append only new tool results for the pending turn, so callers can retry with cumulative results without duplicating the previous messages.
The internal MDL gameplay backend keeps its existing statePointer and block-window contract on /backend-api/mgpt/responses. That route is separate from the public MGPT API transcript path.
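The retry-safe append can be modeled as deduplication by callId. This is an illustrative client-side model of the documented behavior, not the server code:

```python
# Illustrative model of the append-only transcript: a retry carrying
# cumulative toolResults only appends results whose callId has not been
# stored yet, so duplicates are never written.
def append_tool_results(transcript: list, tool_results: list) -> None:
    seen = {m["callId"] for m in transcript if m.get("role") == "tool"}
    for result in tool_results:
        if result["callId"] in seen:
            continue  # already persisted on a prior call
        transcript.append({"role": "tool", **result})

transcript = []
first = [{"callId": "call_a", "name": "read_script",
          "output": {"ok": True}, "ok": True}]
append_tool_results(transcript, first)

# Retry with cumulative results: call_a again plus a new call_b.
retry = first + [{"callId": "call_b", "name": "write_script",
                  "output": {"ok": True}, "ok": True}]
append_tool_results(transcript, retry)
assert [m["callId"] for m in transcript] == ["call_a", "call_b"]
```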
Diagnostics
Every response carries a diagnostics object with MGPT-owned server-side detail. Persist these stage names verbatim — they are the audit trail for generation-time decisions.
- summary — top-line totals (tokens, latency, model alias resolved).
- serverStages — per-stage timings (auth, load, truncate, prompt build, provider).
- storage — how the universe was loaded (Redis hit, Postgres fallback, age).
- prompt — window selection, block alignment, structured-output mode.
- provider — routed provider, model id, response format, retry attempts.
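A sketch of persisting those stage names verbatim as flat log records. The serverStages item shape used here (a name plus a millisecond timing) is an assumption for illustration; the doc only specifies that the names must be stored unmodified.

```python
# Flatten a diagnostics payload into log records, keeping the server's
# stage names verbatim as the audit trail requires. The per-stage item
# shape {"name": ..., "ms": ...} is assumed, not documented.
def stage_records(request_id: str, diagnostics: dict) -> list:
    return [
        {
            "requestId": request_id,
            "stage": stage["name"],  # verbatim server-side stage name
            "ms": stage.get("ms"),
        }
        for stage in diagnostics.get("serverStages", [])
    ]
```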
Errors
All errors return JSON with a detail field and an HTTP status code.
{ "detail": "Universe not found" }400— validation error (missing field, invalid tool result payload, unsupported model override).401— missing or invalid bearer token.403— token lacks required scope, or universe not owned by the calling user.404— universe id does not exist.5xx— provider or storage failure. Safe to retry with the samerequestId; retries are de-duplicated.