Both inference surfaces expose `chat.completions`, `models`, `completions`, `responses`, `tokenize`, `detokenize`, `health`, and `version`.
The SDK resolves configuration in this order:
- constructor arguments
- environment variables
- built-in defaults
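That precedence can be sketched as a small resolver; the helper below is illustrative and not the SDK's actual internals, but it shows the order: an explicit constructor argument wins, then the environment variable, then the built-in default.

```python
import os

def resolve(ctor_value, env_var, default):
    """Illustrative precedence: constructor arg > env var > default."""
    if ctor_value is not None:
        return ctor_value
    env_value = os.environ.get(env_var)
    if env_value is not None:
        return env_value
    return default

# Example: no constructor argument supplied, so the env var wins.
os.environ["TENSORMESH_TIMEOUT_SECONDS"] = "30"
timeout = float(resolve(None, "TENSORMESH_TIMEOUT_SECONDS", "60"))
```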
## Surface Boundaries

- Control Plane
  - auth: bearer token
  - client namespace: `client.control_plane`
- Serverless inference
  - auth: inference API key for POST routes; the default public host also serves `models`, `health`, and `version` without one
  - client namespace: `client.inference.serverless`
  - extra namespaces: `models`, `completions`, `responses`, `tokenize`, `detokenize`, `health`, and `version`
  - model value: serverless model name
- On-Demand inference
  - auth: inference API key
  - extra header: `X-User-Id`
  - client namespace: `client.inference.on_demand`
  - extra namespaces: `models`, `completions`, `responses`, `tokenize`, `detokenize`, `health`, and `version`
  - model value: served gateway model name, not the Control Plane `modelId` UUID
  - config naming note: the CLI stores this served model name under `gateway_model_id`, which remains a compatibility key; that string is the value you pass as `model`
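The auth differences above can be sketched as plain request headers. The functions below are an illustration only: `Authorization: Bearer` is a standard HTTP convention, but whether the inference API key travels that way on the wire is an assumption, not the SDK's documented format.

```python
def control_plane_headers(bearer_token: str) -> dict:
    # Control Plane: bearer token auth.
    return {"Authorization": f"Bearer {bearer_token}"}

def serverless_headers(inference_api_key: str) -> dict:
    # Serverless: inference API key on POST routes.
    return {"Authorization": f"Bearer {inference_api_key}"}

def on_demand_headers(inference_api_key: str, user_id: str) -> dict:
    # On-Demand: inference API key plus the extra X-User-Id header.
    return {
        "Authorization": f"Bearer {inference_api_key}",
        "X-User-Id": user_id,
    }
```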
## Get Credentials

- For a Control Plane bearer token, use CLI Authentication. The CLI browser flow stores the token locally, and `tm auth print-token --yes-i-know` can print it for a controlled SDK setup when you need it outside the CLI.
- For an inference API key, either use the key your Tensormesh environment already issued to you, or create one through the authenticated workflow.
- If you only have inference credentials, you can still use the serverless SDK surface without Control Plane login.

`gateway_api_key` is the stored inference API key, used by the SDK as `inference_api_key`. `gateway_model_id` remains a config compatibility key; its value is the served model name string you pass as `model`.
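That key mapping can be sketched as a small translation step. The config dict below is invented for illustration; only the key names (`gateway_api_key`, `gateway_model_id`) come from the text above.

```python
def from_cli_config(cli_config: dict) -> dict:
    """Map CLI-stored compatibility keys to the names the SDK uses."""
    return {
        # gateway_api_key holds the inference API key.
        "inference_api_key": cli_config.get("gateway_api_key"),
        # gateway_model_id holds the served model name you pass as `model`.
        "model": cli_config.get("gateway_model_id"),
    }

settings = from_cli_config({
    "gateway_api_key": "tm-key-example",    # illustrative value
    "gateway_model_id": "my-served-model",  # served model name, not a UUID
})
```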
## Environment Variables

The SDK supports these environment variables:

- `TENSORMESH_CONTROL_PLANE_TOKEN`
- `TENSORMESH_CONTROL_PLANE_BASE_URL`
- `TENSORMESH_INFERENCE_API_KEY`
- `TENSORMESH_SERVERLESS_BASE_URL`
- `TENSORMESH_ON_DEMAND_BASE_URL`
- `TENSORMESH_ON_DEMAND_USER_ID`
- `TENSORMESH_TIMEOUT_SECONDS`
- `TENSORMESH_MAX_RETRIES`
- `TENSORMESH_CA_BUNDLE`
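Environment variables are always strings, so the numeric ones (`TENSORMESH_TIMEOUT_SECONDS`, `TENSORMESH_MAX_RETRIES`) need conversion. A hedged sketch of reading them, where the fallback defaults are placeholders rather than the SDK's documented defaults:

```python
import os

def read_env_settings() -> dict:
    """Read the documented variables; the defaults here are placeholders."""
    env = os.environ
    return {
        "control_plane_token": env.get("TENSORMESH_CONTROL_PLANE_TOKEN"),
        "inference_api_key": env.get("TENSORMESH_INFERENCE_API_KEY"),
        "serverless_base_url": env.get("TENSORMESH_SERVERLESS_BASE_URL"),
        "on_demand_base_url": env.get("TENSORMESH_ON_DEMAND_BASE_URL"),
        "on_demand_user_id": env.get("TENSORMESH_ON_DEMAND_USER_ID"),
        # Numeric values arrive as strings and must be converted.
        "timeout_seconds": float(env.get("TENSORMESH_TIMEOUT_SECONDS", "60")),
        "max_retries": int(env.get("TENSORMESH_MAX_RETRIES", "2")),
        "ca_bundle": env.get("TENSORMESH_CA_BUNDLE"),
    }
```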
## Constructor-Based Configuration

`max_retries` applies to idempotent HTTP methods. The main inference calls on this SDK surface are POST requests such as `/v1/chat/completions`, `/v1/completions`, `/v1/responses`, `/tokenize`, and `/detokenize`, so those requests are not retried automatically.
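The stated retry policy amounts to a gate on the HTTP method. The sketch below illustrates that policy with the idempotent method set from RFC 7231; it is not the SDK's actual code.

```python
# Methods RFC 7231 defines as idempotent and therefore safe to retry.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}

def retries_allowed(method: str, max_retries: int) -> int:
    """POST routes like /v1/chat/completions get no automatic retries."""
    return max_retries if method.upper() in IDEMPOTENT_METHODS else 0
```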
## Environment-Based Configuration
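A minimal shell sketch of configuring the SDK purely through the environment; the values are placeholders, and only the variable names come from the list above.

```shell
# Placeholder values; the variable names are the ones the SDK reads.
export TENSORMESH_INFERENCE_API_KEY="tm-key-example"
export TENSORMESH_SERVERLESS_BASE_URL="https://inference.example.com"
export TENSORMESH_TIMEOUT_SECONDS="30"
export TENSORMESH_MAX_RETRIES="2"

# The SDK picks these up when no constructor arguments override them.
echo "configured: ${TENSORMESH_INFERENCE_API_KEY:+yes}"
```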
## When To Use CLI Login

The SDK does not require `tm auth login`.

- Production deployments and CI environments: supply credentials through environment variables (`TENSORMESH_INFERENCE_API_KEY`, `TENSORMESH_CONTROL_PLANE_TOKEN`, etc.). No browser interaction is required.
- Local development: use `tm auth login` for the browser-based Control Plane auth flow when you want the CLI to store and manage the token locally.
Hostnames such as `external.nebius.tensormesh.ai` are provider-specific examples, not universal defaults.
## Common Mistakes

- trying to use a Control Plane bearer token for inference
- forgetting that on-demand inference requires both `on_demand_base_url` and `on_demand_user_id`
- using a Control Plane `modelId` UUID where the gateway expects a served model name
- assuming the SDK reads `~/.config/tensormesh/` automatically
- mixing serverless and on-demand base URLs
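Several of these mistakes can be caught before any request is made. The pre-flight check below is an illustrative sketch, not an SDK feature; the settings dict shape is assumed.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)

def validate_on_demand(settings: dict) -> list:
    """Illustrative pre-flight validation for the pitfalls listed above."""
    problems = []
    # On-demand needs both the base URL and the user id.
    if not settings.get("on_demand_base_url"):
        problems.append("missing on_demand_base_url")
    if not settings.get("on_demand_user_id"):
        problems.append("missing on_demand_user_id")
    # A UUID-shaped model value suggests a Control Plane modelId,
    # not the served gateway model name the gateway expects.
    if UUID_RE.fullmatch(settings.get("model", "")):
        problems.append("model looks like a Control Plane modelId UUID")
    return problems
```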

