The SDK keeps auth and routing explicit. It does not read CLI config files by default. The public inference surface exposes chat.completions, models, completions, responses, tokenize, detokenize, health, and version on both surfaces.
The SDK resolves configuration in this order:
- constructor arguments
- environment variables
- built-in defaults
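The resolution order above can be sketched as a small lookup. This is an illustration only, not the SDK's implementation: resolve_setting and DEFAULTS are hypothetical names, the default values are placeholders, and treating an empty environment variable as unset is an assumption. The TENSORMESH_ prefix matches the environment variables this page lists.

```python
import os

# Placeholder defaults for illustration; the SDK's real defaults are not stated here.
DEFAULTS = {"timeout_seconds": 30.0, "max_retries": 2}


def resolve_setting(name, explicit=None, env=None, cast=str):
    """Resolve one setting: constructor argument, then env var, then default."""
    env = os.environ if env is None else env

    # 1. An explicit constructor argument always wins.
    if explicit is not None:
        return explicit

    # 2. Otherwise look for TENSORMESH_<NAME> in the environment.
    raw = env.get("TENSORMESH_" + name.upper())
    if raw:  # assumption: an empty string counts as unset
        return cast(raw)

    # 3. Fall back to the built-in default (None when there is none).
    return DEFAULTS.get(name)
```

For example, resolve_setting("max_retries", explicit=5, cast=int) returns 5 even when TENSORMESH_MAX_RETRIES is set, because the constructor argument takes precedence.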
Surface Boundaries
- Control Plane
  - auth: bearer token
  - client namespace: client.control_plane
- Serverless inference
  - auth: inference API key for POST routes; the default public host also serves models, health, and version without one
  - client namespace: client.inference.serverless
  - extra namespaces: models, completions, responses, tokenize, detokenize, health, and version
  - model value: serverless model name
- On-Demand inference
  - auth: inference API key
  - extra header: X-User-Id
  - client namespace: client.inference.on_demand
  - extra namespaces: models, completions, responses, tokenize, detokenize, health, and version
  - model value: served gateway model name, not the Control Plane modelId UUID
  - config naming note: the CLI stores this served model name under gateway_model_id, which remains a compatibility key; that string is the value you pass as model
Get Credentials
- For a Control Plane bearer token, use CLI Authentication. The CLI browser flow stores the token locally, and tm auth print-token --yes-i-know can print it for a controlled SDK setup when you need it outside the CLI.
- For an inference API key, either use the key your Tensormesh environment already issued to you, or create one through the authenticated workflow:
- If you only have inference credentials, you can still use the serverless SDK surface without Control Plane login.
gateway_api_key is the stored inference API key used by the SDK as inference_api_key. gateway_model_id remains a config compatibility key, and its value is the served model name string you pass as model.
Environment Variables
The SDK supports these environment variables:
- TENSORMESH_CONTROL_PLANE_TOKEN
- TENSORMESH_CONTROL_PLANE_BASE_URL
- TENSORMESH_INFERENCE_API_KEY
- TENSORMESH_SERVERLESS_BASE_URL
- TENSORMESH_ON_DEMAND_BASE_URL
- TENSORMESH_ON_DEMAND_USER_ID
- TENSORMESH_TIMEOUT_SECONDS
- TENSORMESH_MAX_RETRIES
- TENSORMESH_CA_BUNDLE
Constructor-Based Configuration
max_retries applies to idempotent HTTP methods. The main inference calls on this SDK surface are POST requests such as /v1/chat/completions, /v1/completions, /v1/responses, /tokenize, and /detokenize, so those requests are not retried automatically.
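To make the retry rule concrete, here is a minimal sketch of an idempotency check. It assumes the SDK follows the standard HTTP definition of idempotent methods (RFC 9110); the exact set the SDK uses is not stated on this page, and is_retried_automatically is a hypothetical helper, not an SDK function.

```python
# Methods that HTTP semantics (RFC 9110) define as idempotent.
# Assumption: the SDK's own list may differ.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def is_retried_automatically(method: str, max_retries: int) -> bool:
    """Return True when a request with this method would be eligible for retry."""
    return max_retries > 0 and method.upper() in IDEMPOTENT_METHODS
```

Under this assumption, a GET to a models or health route is eligible for retry, while a POST to /v1/chat/completions is not, so callers that want retries on inference calls have to add their own handling.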
Environment-Based Configuration
When To Use CLI Login
The SDK does not require tm auth login.
- Production deployments and CI environments: supply credentials through environment variables (TENSORMESH_INFERENCE_API_KEY, TENSORMESH_CONTROL_PLANE_TOKEN, etc.). No browser interaction is required.
- Local development: use tm auth login for the browser-based Control Plane auth flow when you want the CLI to store and manage the token locally.
external.nebius.tensormesh.ai are provider-specific examples, not universal defaults.
Common Mistakes
- trying to use a Control Plane bearer token for inference
- forgetting that on-demand inference requires both on_demand_base_url and on_demand_user_id
- using a Control Plane modelId UUID where the gateway expects a served model name
- assuming the SDK reads ~/.config/tensormesh/ automatically
- mixing serverless and on-demand base URLs
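Two of these mistakes can be caught before any request is sent. The sketch below is a hypothetical pre-flight check, not part of the SDK: check_on_demand_config and looks_like_uuid are illustrative names, and the UUID test is a heuristic that would also flag a served model name that happens to parse as a UUID.

```python
import uuid


def looks_like_uuid(value) -> bool:
    """Return True when the value parses as a UUID (heuristic only)."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def check_on_demand_config(on_demand_base_url, on_demand_user_id, model):
    """Collect human-readable problems with an on-demand configuration."""
    problems = []
    # On-demand inference needs both the base URL and the user ID.
    if not on_demand_base_url:
        problems.append("on_demand_base_url is missing")
    if not on_demand_user_id:
        problems.append("on_demand_user_id is missing")
    # The gateway expects a served model name, not a Control Plane modelId UUID.
    if looks_like_uuid(model):
        problems.append("model looks like a Control Plane modelId UUID")
    return problems
```

An empty list means none of the checked mistakes were found; it does not confirm that the credentials or URLs are valid.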

