This is the shortest SDK-first path to a working request. The public inference surface exposesDocumentation Index
Fetch the complete documentation index at: https://docs.tensormesh.ai/llms.txt
Use this file to discover all available pages before exploring further.
chat.completions, models, completions, responses, tokenize, detokenize, health, and version.
1. Install The Package
Prerequisite: Python3.12 or newer.
For a published release:
2. Pick A Surface
- Serverless inference: use
client.inference.serverless - Control Plane: use
client.control_plane
models, health, and version also work without one. Control Plane uses a bearer token.
For a first successful SDK request, start with serverless inference.
Model naming: serverless examples expect a serverless model name.
If you are coming from the CLI-managed flow, gateway_api_key is the stored inference API key used by the SDK as inference_api_key.
3. Get Credentials
- For a Control Plane bearer token, use the browser login flow in CLI Authentication, then use
tm auth print-token --yes-i-knowonly in a controlled shell when you need to pass that token into SDK code. - For an inference API key, either use the key your Tensormesh environment already issued to you, or create one through the authenticated Control Plane flow:
4. Choose A Model Name
- Pass a serverless model name that is valid for the selected serverless host.
- If you have Control Plane access for the same Tensormesh environment, discover published serverless models with
tm billing pricing serverless list. - Use the returned
pricing[].modelvalue as themodelargument. - If you only have inference credentials, or you are targeting a different serverless host override, ask your operator or admin for the exact serverless
modelstring for that host before sending the request.
tm billing pricing serverless list for the same Tensormesh environment, or asking your operator or admin for the exact serverless model string.
5. First Sync Request
6. First Async Request
7. First Control-Plane Request
Next Steps
- If you are deciding which credentials and base URLs to use, continue with Auth And Config.
- If you want chat completions plus the other verified serverless endpoints, continue with Inference.
- If you are migrating an existing OpenAI or Fireworks chat integration, continue with Migration From OpenAI And Fireworks.
- If you want models, billing, users, or support examples, continue with Control Plane.
- If you want the CLI operator path for Control Plane tasks, continue with Control Plane Workflows.

