cURL
curl --request POST \
  --url https://external.nebius.tensormesh.ai/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --header 'X-User-Id: <x-user-id>' \
  --data '{
    "model": "openai-gpt-oss-120b-gpu-type-h200x1_8nic16",
    "input": "Say hello."
  }'
{
  "id": "resp_123",
  "object": "response",
  "model": "<string>",
  "output": [
    {
      "id": "out_123",
      "type": "message",
      "role": "<string>",
      "status": "<string>",
      "content": [
        {
          "type": "output_text",
          "text": "hello",
          "annotations": [{}]
        }
      ]
    }
  ],
  "created_at": 123,
  "status": "<string>"
}
POST https://external.nebius.tensormesh.ai/v1/responses

Creates a response on the selected Tensormesh On-Demand host.
Headers

Authorization: Bearer <API_KEY>
  Bearer authentication using your On-Demand API key. Format: Bearer <API_KEY>

X-User-Id: <uuid>
  Tensormesh user id used for attribution and routing.

Body parameters

model
  On-Demand served model name to use. Example: "openai-gpt-oss-120b-gpu-type-h200x1_8nic16"

input
  Input passed to the responses endpoint.

The request body also accepts an optional limit on the number of generated output tokens.
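The cURL call above can also be made from Python. The sketch below uses only the standard library and the headers and body shown on this page; the function names (`build_request`, `create_response`) and the placeholder token, user id, and model values are illustrative, not part of the API.

```python
import json
import urllib.request

API_URL = "https://external.nebius.tensormesh.ai/v1/responses"


def build_request(token: str, user_id: str, model: str, input_text: str) -> urllib.request.Request:
    """Build the POST request with the Authorization, Content-Type,
    and X-User-Id headers documented above."""
    body = json.dumps({"model": model, "input": input_text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "X-User-Id": user_id,
        },
        method="POST",
    )


def create_response(token: str, user_id: str, model: str, input_text: str) -> dict:
    """Send the request and decode the JSON response body."""
    req = build_request(token, user_id, model, input_text)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A call such as `create_response("<API_KEY>", "<uuid>", "openai-gpt-oss-120b-gpu-type-h200x1_8nic16", "Say hello.")` should return a dict shaped like the example response on this page.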
Successful Response

id
  Example: "resp_123"

object
  Example: "response"
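Because the generated text sits several levels deep in the response (`output` → message → `content` → `output_text` part), a small helper can collect it. This is a sketch against the example shape shown on this page; the `output_text` helper is a name invented here, not part of the API.

```python
import json

# Example response shape from this page; "<string>" placeholders stand in
# for server-supplied values.
raw = """
{
  "id": "resp_123",
  "object": "response",
  "model": "<string>",
  "output": [
    {
      "id": "out_123",
      "type": "message",
      "role": "<string>",
      "status": "<string>",
      "content": [
        {"type": "output_text", "text": "hello", "annotations": [{}]}
      ]
    }
  ],
  "created_at": 123,
  "status": "<string>"
}
"""


def output_text(response: dict) -> str:
    """Concatenate the text of every output_text part across all
    message items in the output array."""
    return "".join(
        part["text"]
        for message in response.get("output", [])
        if message.get("type") == "message"
        for part in message.get("content", [])
        if part.get("type") == "output_text"
    )


print(output_text(json.loads(raw)))  # prints "hello"
```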