Chat completions

The chat completions endpoint is OpenAI-compatible: requests, responses, and errors use the OpenAI chat completions format. Call it directly over HTTP or through any OpenAI SDK.

Create a chat completion

POST /v1/chat/completions

Request body

Field	Type	Required	Default	Description
`model`	string	Yes	—	Model ID (e.g. `pearl-1`)
`messages`	array	Yes	—	Conversation messages
`stream`	boolean	No	`true`	Whether to stream the response
`temperature`	number	No	—	Sampling temperature (0–2)
`max_tokens`	number	No	—	Maximum tokens to generate
`max_completion_tokens`	number	No	—	Upper bound for output tokens
`top_p`	number	No	—	Nucleus sampling threshold
`stop`	string | string[]	No	—	Stop sequences
`frequency_penalty`	number	No	`0`	Penalize tokens by frequency (−2 to 2)
`presence_penalty`	number	No	`0`	Penalize tokens by presence (−2 to 2)
`seed`	integer	No	—	Seed for deterministic sampling
`n`	integer	No	`1`	Number of choices to generate
`tools`	array	No	—	Tool/function definitions (max 128)
`tool_choice`	string | object	No	—	`"none"`, `"auto"`, `"required"`, or an object naming a specific function
`response_format`	object	No	—	`{"type":"text"}`, `{"type":"json_object"}`, or `{"type":"json_schema",...}`

Use GET /v1/models to discover available models.
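To make the request body concrete, here is a minimal Python sketch that assembles the JSON payload and the HTTP headers without sending anything. The helper name `build_chat_request` is hypothetical and not part of any SDK. The URL and the `MINDFORGE_API_KEY` variable are the ones used in the curl examples on this page.

```python
import json
import os
import urllib.request

# URL used in this page's curl examples.
API_URL = "https://api.mindforge.ai/v1/chat/completions"


def build_chat_request(model, messages, api_key, **options):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Hypothetical helper for illustration. Optional fields such as
    temperature or max_tokens go in **options; fields set to None are
    omitted so the server applies its own defaults.
    """
    body = {"model": model, "messages": messages}
    body.update({key: value for key, value in options.items() if value is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "pearl-1",
    [{"role": "user", "content": "Hello!"}],
    api_key=os.environ.get("MINDFORGE_API_KEY", "test-key"),
    stream=False,     # streaming is the default, so disable it explicitly
    temperature=0.7,
    stop=None,        # None values are dropped from the payload
)
payload = json.loads(request.data)
```

Passing `request` to `urllib.request.urlopen` would send it. Building the request separately keeps the payload easy to inspect and test.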

Message format

json

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there! How can I help?" },
    { "role": "user", "content": "What's the weather like?" }
  ]
}

Supported roles: system, user, assistant, tool.
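A client can catch an unsupported role before sending a request. The sketch below is a hypothetical client-side check, not the server's validation logic. The rule that an empty `messages` array is rejected is an assumption based on the field being required.

```python
# Roles listed in this section.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def check_messages(messages):
    """Return a list of problems found in a messages array (empty if none).

    Hypothetical pre-flight check; the server's own validation may differ.
    """
    problems = []
    if not messages:
        # Assumption: a required array must contain at least one message.
        problems.append("messages must contain at least one entry")
    for position, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"messages[{position}]: unsupported role {role!r}")
    return problems


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
ok = check_messages(conversation)
bad = check_messages(conversation + [{"role": "narrator", "content": "Meanwhile..."}])
```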

Non-streaming response

Set stream: false to get a single JSON response.

curl

curl https://api.mindforge.ai/v1/chat/completions \
  -H "Authorization: Bearer $MINDFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pearl-1",
    "messages": [{ "role": "user", "content": "Hello!" }],
    "stream": false
  }'

Response

json

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714000000,
  "model": "pearl-1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 14,
    "total_tokens": 39
  }
}
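Reading the reply out of a non-streaming response is a matter of indexing into `choices`. The following Python sketch parses an abridged copy of the response above. The helper name `first_reply` is hypothetical.

```python
import json

# Abridged copy of the example response shown above.
RESPONSE_TEXT = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "pearl-1",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 14, "total_tokens": 39}
}
"""


def first_reply(response):
    """Return (content, finish_reason) for the first choice of a parsed response."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


response = json.loads(RESPONSE_TEXT)
text, finish_reason = first_reply(response)
usage = response["usage"]
```

When `n` is greater than 1, `choices` holds one entry per generated choice, so a caller would iterate over the list instead of taking index 0.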

Streaming response

Streaming is the default: omit stream or set it to true, and the response arrives as a stream of server-sent events (SSE).

curl

curl https://api.mindforge.ai/v1/chat/completions \
  -H "Authorization: Bearer $MINDFORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pearl-1",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Stream format

Each event is a single line that starts with data: followed by a JSON chunk. Events are separated by a blank line:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1714000000,"model":"pearl-1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1714000000,"model":"pearl-1","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1714000000,"model":"pearl-1","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1714000000,"model":"pearl-1","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]

- The first chunk includes delta.role: "assistant".
- Content chunks include delta.content with the text fragment.
- The final chunk has an empty delta and finish_reason: "stop".
- The stream ends with data: [DONE].
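The stream format above can be consumed with a small line-oriented parser. This is a minimal Python sketch that works on any iterable of text lines, such as a decoded HTTP response body. The function names `iter_chunks` and `collect_text` are hypothetical, and the sample chunks are abridged versions of the ones shown above.

```python
import json


def iter_chunks(lines):
    """Yield parsed JSON chunks from an iterable of SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines):
    """Join the delta.content fragments of choice 0; return (text, finish_reason)."""
    parts, finish_reason = [], None
    for chunk in iter_chunks(lines):
        for choice in chunk["choices"]:
            if choice["index"] != 0:
                continue
            parts.append(choice["delta"].get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


SAMPLE_STREAM = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text, finish_reason = collect_text(SAMPLE_STREAM)
```

Treating a missing `delta.content` as an empty string lets the same loop handle the role-only first chunk and the empty final chunk.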

Tool calls

Pass tools in the request to enable function calling. The model may respond with tool calls instead of (or in addition to) text.

Non-streaming tool call

When the model calls a tool, finish_reason is "tool_calls" and function.arguments holds the call arguments as a JSON-encoded string:

json

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\":\"San Francisco\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
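Handling a tool call on the client side means decoding `arguments`, running the matching function, and preparing a reply for the next request. The Python sketch below shows one way to do that. The `LOCAL_TOOLS` registry and the stub `get_weather` implementation are hypothetical. The shape of the tool-role reply (with `tool_call_id`) follows the OpenAI convention and is an assumption, since this page does not spell it out.

```python
import json


def get_weather(location):
    """Stand-in tool; a real implementation would query a weather service."""
    return {"location": location, "forecast": "unknown"}


# Hypothetical registry mapping tool names to local callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message):
    """Run every tool call in an assistant message and build tool-role replies.

    Assumption: replies use role "tool" plus tool_call_id, as in the
    OpenAI format.
    """
    replies = []
    for call in message.get("tool_calls") or []:
        func = LOCAL_TOOLS[call["function"]["name"]]
        arguments = json.loads(call["function"]["arguments"])  # arguments is a JSON string
        result = func(**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return replies


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": '{"location":"San Francisco"}',
            },
        }
    ],
}
tool_messages = run_tool_calls(assistant_message)
```

Under that assumption, the next request would append `assistant_message` and then `tool_messages` to the existing `messages` array.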

Streaming tool call

Tool calls appear as delta.tool_calls in the stream:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1714000000,"model":"pearl-1","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"get_weather","arguments":"{\"location\":\"San Francisco\"}"}}]},"logprobs":null,"finish_reason":null}]}
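The example above delivers the whole call in one chunk. In the OpenAI streaming format, a call's `arguments` string can also be split across several chunks that share the same `index`, so a defensive client accumulates fragments before decoding. The Python sketch below assumes that behavior applies here. The function name `merge_tool_call_deltas` and the two-chunk sample are hypothetical.

```python
import json


def merge_tool_call_deltas(chunks):
    """Combine delta.tool_calls fragments into complete tool calls, keyed by index."""
    calls = {}
    for chunk in chunks:
        for choice in chunk["choices"]:
            for delta in choice["delta"].get("tool_calls") or []:
                call = calls.setdefault(
                    delta["index"],
                    {"id": None, "type": None, "function": {"name": "", "arguments": ""}},
                )
                call["id"] = delta.get("id") or call["id"]
                call["type"] = delta.get("type") or call["type"]
                function = delta.get("function") or {}
                # Name and arguments are concatenated in arrival order.
                call["function"]["name"] += function.get("name") or ""
                call["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


# Hypothetical stream in which the arguments arrive in two pieces.
SAMPLE_CHUNKS = [
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "id": "call_abc123", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"location":'}}
    ]}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"San Francisco"}'}}
    ]}, "finish_reason": None}]},
]
merged = merge_tool_call_deltas(SAMPLE_CHUNKS)
arguments = json.loads(merged[0]["function"]["arguments"])
```

The same function handles the single-chunk case shown above, because a lone fragment is already a complete call.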

Error responses

Errors follow the OpenAI error format:

json

{
  "error": {
    "message": "Model 'nonexistent' not found.",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}

Status	Meaning
`400`	Invalid request (bad parameters, missing fields)
`401`	Unauthorized — invalid or missing API key
`404`	Model not found
`502`	Inference backend unavailable
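Client code usually turns an error response into an exception that carries the status and the fields of the `error` object. The following Python sketch is one hypothetical way to do so. The `ChatAPIError` class and the choice to treat only `502` as retryable are assumptions, not documented behavior.

```python
import json

# Meanings from the status table above.
STATUS_MEANINGS = {
    400: "Invalid request",
    401: "Unauthorized",
    404: "Model not found",
    502: "Inference backend unavailable",
}
RETRYABLE_STATUSES = {502}  # assumption: only backend outages are worth retrying


class ChatAPIError(Exception):
    """Hypothetical exception carrying the fields of an error response."""

    def __init__(self, status, message, error_type=None, code=None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.error_type = error_type
        self.code = code
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status, body_text):
    """Build a ChatAPIError from an HTTP status and a response body."""
    try:
        error = json.loads(body_text)["error"]
    except (ValueError, KeyError, TypeError):
        error = {}  # body was not in the documented error format
    message = error.get("message") or STATUS_MEANINGS.get(status, "Unknown error")
    return ChatAPIError(status, message, error.get("type"), error.get("code"))


not_found = parse_error(
    404,
    '{"error": {"message": "Model \'nonexistent\' not found.", '
    '"type": "invalid_request_error", "code": "model_not_found"}}',
)
outage = parse_error(502, "<html>Bad Gateway</html>")
```

Falling back to the status table when the body is not valid JSON keeps the error readable even if a proxy in front of the API returns an HTML page.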
