The SDK keeps auth and routing explicit and does not read CLI config files by default. Both public inference surfaces expose chat.completions, models, completions, responses, tokenize, detokenize, health, and version. The SDK resolves configuration in this order:
  1. constructor arguments
  2. environment variables
  3. built-in defaults
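
This precedence can be sketched in plain Python. This is an illustration of the lookup order, not the SDK's actual internals; `resolve` is a hypothetical helper, and `TENSORMESH_TIMEOUT_SECONDS` is just the example variable:

```python
import os

def resolve(ctor_value, env_name, default):
    # Illustrative precedence: constructor argument, then environment
    # variable, then built-in default. A blank env value counts as unset.
    if ctor_value is not None:
        return ctor_value
    env_value = os.environ.get(env_name, "")
    if env_value:
        return env_value
    return default

os.environ["TENSORMESH_TIMEOUT_SECONDS"] = "45"
print(resolve(30, "TENSORMESH_TIMEOUT_SECONDS", 60))    # constructor wins: 30
print(resolve(None, "TENSORMESH_TIMEOUT_SECONDS", 60))  # env var wins: "45"
os.environ["TENSORMESH_TIMEOUT_SECONDS"] = ""
print(resolve(None, "TENSORMESH_TIMEOUT_SECONDS", 60))  # blank env ignored: 60
```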

Surface Boundaries

  • Control Plane
    • auth: bearer token
    • client namespace: client.control_plane
  • Serverless inference
    • auth: inference API key for POST routes; the default public host also serves models, health, and version without one
    • client namespace: client.inference.serverless
    • extra namespaces: models, completions, responses, tokenize, detokenize, health, and version
    • model value: serverless model name
  • On-Demand inference
    • auth: inference API key
    • extra header: X-User-Id
    • client namespace: client.inference.on_demand
    • extra namespaces: models, completions, responses, tokenize, detokenize, health, and version
    • model value: served gateway model name, not the Control Plane modelId UUID
    • config naming note: the CLI stores this served model name under gateway_model_id, which remains a compatibility key; that string is the value you pass as model
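
The two namespaces can be compared side by side. A hedged sketch: the placeholder model names are not real identifiers, and the OpenAI-style create(model=..., messages=...) call shape is an assumption this page does not spell out:

```python
from tensormesh import Tensormesh

client = Tensormesh(
    inference_api_key="YOUR_INFERENCE_API_KEY",
    on_demand_base_url="https://YOUR_ON_DEMAND_BASE_URL",
    on_demand_user_id="00000000-0000-0000-0000-000000000000",
)

# Serverless: model is the serverless model name.
serverless_reply = client.inference.serverless.chat.completions.create(
    model="YOUR_SERVERLESS_MODEL_NAME",
    messages=[{"role": "user", "content": "Hello"}],
)

# On-Demand: model is the served gateway model name (the string the CLI
# stores under gateway_model_id), NOT the Control Plane modelId UUID.
on_demand_reply = client.inference.on_demand.chat.completions.create(
    model="YOUR_SERVED_MODEL_NAME",
    messages=[{"role": "user", "content": "Hello"}],
)
```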

Get Credentials

  • For a Control Plane bearer token, use CLI Authentication. The CLI browser flow stores the token locally; tm auth print-token --yes-i-know can print it when you need the token outside the CLI for a controlled SDK setup.
  • For an inference API key, either use the key your Tensormesh environment already issued to you, or create one through the authenticated workflow:
tm auth login
USER_ID="$(tm --output json auth whoami | python3 -c 'import json,sys; print(json.load(sys.stdin)["user"]["id"])')"
tm users api-keys create --user-id "$USER_ID" --name sdk-key --yes
If your environment does not expose self-serve API key creation, ask your operator or admin for the exact inference API key to use.
  • If you only have inference credentials, you can still use the serverless SDK surface without Control Plane login.
If you are coming from the CLI-managed flow: gateway_api_key is the stored inference API key, which the SDK uses as inference_api_key. gateway_model_id remains a config compatibility key, and its value is the served model name string you pass as model.
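
An inference-only setup for the serverless surface can then be as small as the sketch below. The models.list() call shape is an assumption about the models namespace, not a documented signature:

```python
from tensormesh import Tensormesh

# No Control Plane token: only the serverless inference surface is usable.
client = Tensormesh(inference_api_key="YOUR_INFERENCE_API_KEY")

# models, health, and version are served by the default public host even
# without a key; the POST inference routes require the inference API key.
print(client.inference.serverless.models.list())
```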

Environment Variables

The SDK supports these environment variables:
  • TENSORMESH_CONTROL_PLANE_TOKEN
  • TENSORMESH_CONTROL_PLANE_BASE_URL
  • TENSORMESH_INFERENCE_API_KEY
  • TENSORMESH_SERVERLESS_BASE_URL
  • TENSORMESH_ON_DEMAND_BASE_URL
  • TENSORMESH_ON_DEMAND_USER_ID
  • TENSORMESH_TIMEOUT_SECONDS
  • TENSORMESH_MAX_RETRIES
  • TENSORMESH_CA_BUNDLE
Blank environment-variable values are treated as unset. Explicit constructor arguments still need to be valid non-empty values.
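
In CI it can help to fail fast when a variable is unset or blank. A minimal stdlib sketch: missing_env is a hypothetical helper, not part of the SDK, and the required set depends on which surface you use:

```python
import os

def missing_env(names):
    # A blank value counts as unset, matching how the SDK treats blank
    # environment variables.
    return [n for n in names if not os.environ.get(n, "").strip()]

required = [
    "TENSORMESH_INFERENCE_API_KEY",
    "TENSORMESH_ON_DEMAND_BASE_URL",
    "TENSORMESH_ON_DEMAND_USER_ID",
]
os.environ["TENSORMESH_INFERENCE_API_KEY"] = "example-key"
os.environ["TENSORMESH_ON_DEMAND_BASE_URL"] = ""  # blank -> still missing
print(missing_env(required))
```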

Constructor-Based Configuration

from tensormesh import Tensormesh

client = Tensormesh(
    control_plane_token="YOUR_CONTROL_PLANE_TOKEN",
    inference_api_key="YOUR_INFERENCE_API_KEY",
    on_demand_base_url="https://YOUR_ON_DEMAND_BASE_URL",
    on_demand_user_id="00000000-0000-0000-0000-000000000000",
    timeout=30,
    max_retries=2,
)
max_retries applies to idempotent HTTP methods. The main inference calls on this SDK surface are POST requests such as /v1/chat/completions, /v1/completions, /v1/responses, /tokenize, and /detokenize, so those requests are not retried automatically.
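
If you know a specific POST request is safe to repeat, you can opt in to retries yourself. A minimal stdlib sketch: retry_post is a hypothetical wrapper, and the exception types your deployment raises may differ:

```python
import time

def retry_post(call, max_retries=2, base_delay=1.0):
    # Opt-in retry for a non-idempotent call you know is safe to repeat.
    # Retries only transport-level failures, with exponential backoff.
    last_exc = None
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ConnectionError as exc:
            last_exc = exc
            if attempt < max_retries:
                time.sleep(base_delay * (2 ** attempt))
    raise last_exc
```

Usage would look like retry_post(lambda: client.inference.serverless.chat.completions.create(...)), assuming that call shape.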

Environment-Based Configuration

export TENSORMESH_CONTROL_PLANE_TOKEN="YOUR_CONTROL_PLANE_TOKEN"
export TENSORMESH_INFERENCE_API_KEY="YOUR_INFERENCE_API_KEY"
export TENSORMESH_ON_DEMAND_BASE_URL="https://YOUR_ON_DEMAND_BASE_URL"
export TENSORMESH_ON_DEMAND_USER_ID="00000000-0000-0000-0000-000000000000"

from tensormesh import Tensormesh

client = Tensormesh()

When To Use CLI Login

The SDK does not require tm auth login.
  • Production deployments and CI environments: supply credentials through environment variables (TENSORMESH_INFERENCE_API_KEY, TENSORMESH_CONTROL_PLANE_TOKEN, etc.). No browser interaction is required.
  • Local development: use tm auth login for the browser-based Control Plane auth flow when you want the CLI to store and manage the token locally.
When you do configure On-Demand directly, treat the base URL as environment-specific. Values such as external.nebius.tensormesh.ai are provider-specific examples, not universal defaults.

Common Mistakes

  • trying to use a control-plane bearer token for inference
  • forgetting that on-demand inference requires both on_demand_base_url and on_demand_user_id
  • using a Control Plane modelId UUID where the gateway expects a served model name
  • assuming the SDK reads ~/.config/tensormesh/ automatically
  • mixing serverless and on-demand base URLs
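
One of these mistakes is cheap to guard against: Control Plane modelIds are UUIDs, while served gateway model names are not, so a UUID-shaped model value headed for the gateway is almost certainly wrong. A hypothetical helper sketch:

```python
import uuid

def looks_like_modelid_uuid(model: str) -> bool:
    # Control Plane modelIds are UUIDs; served gateway model names are not.
    try:
        uuid.UUID(model)
        return True
    except ValueError:
        return False

print(looks_like_modelid_uuid("00000000-0000-0000-0000-000000000000"))  # True
print(looks_like_modelid_uuid("my-org/served-model-name"))              # False
```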