Use this section when you are calling Tensormesh inference directly over HTTP instead of through the Python SDK or the tm CLI.
Direct inference callers should expect 429 rate limits on busy surfaces, honor Retry-After when present, and avoid automatic retries around non-idempotent writes unless duplicate effects are acceptable.
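The retry guidance above can be sketched as a small client-side helper. This is a minimal illustration, not Tensormesh's own client logic: the names `retry_delay` and `call_with_retry`, the exponential backoff fallback, and the `send` callable that returns a `(status, headers, body)` tuple are all assumptions made for the example.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(headers, attempt, base=1.0, cap=30.0):
    """Return how many seconds to wait before the next attempt.

    Honors Retry-After (delta-seconds or HTTP-date) when present;
    otherwise falls back to capped exponential backoff (an assumption,
    not documented Tensormesh behavior).
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        value = value.strip()
        if value.isdigit():
            return float(value)
        try:
            when = parsedate_to_datetime(value)
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            remaining = (when - datetime.now(timezone.utc)).total_seconds()
            return max(0.0, remaining)
        except (TypeError, ValueError):
            pass  # Unparseable header: fall through to backoff.
    return min(cap, base * (2 ** attempt))


def call_with_retry(send, idempotent, max_attempts=4, sleep=time.sleep):
    """Call send() and retry only on 429, and only for idempotent calls.

    `send` is a hypothetical zero-argument callable returning
    (status, headers, body). Non-idempotent writes are never retried
    automatically, so a 429 is handed back to the caller as-is.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        last_attempt = attempt == max_attempts - 1
        if status != 429 or not idempotent or last_attempt:
            return status, headers, body
        sleep(retry_delay(headers, attempt))
```

Pass `idempotent=False` for any write where a duplicate effect would be a problem; the helper then returns the 429 response to the caller instead of retrying.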
Surface
Serverless: OpenAI-compatible chat completions plus verified models, completions, responses, tokenize, detokenize, health, and version endpoints on the public serverless host.
Auth: Authorization: Bearer <API_KEY> for POST routes; GET /v1/models, /health, and /version also work on the public host without auth.
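As a rough sketch of the auth rule above, the following builds (but does not send) one unauthenticated GET and one authenticated POST with the standard library. `BASE_URL` is a placeholder because this section does not state the public serverless host, and the `/v1/chat/completions` path and request body shape are assumed from the OpenAI-compatible convention rather than confirmed here; the helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder: substitute the real public serverless host.
BASE_URL = "https://serverless-host.invalid"


def build_get(path):
    """Build a GET request; /v1/models, /health, and /version need no auth."""
    return urllib.request.Request(BASE_URL + path, method="GET")


def build_chat_request(api_key, model, messages):
    """Build an authenticated POST for chat completions.

    The path and JSON body follow the OpenAI-compatible convention
    (an assumption for this sketch).
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Either request can then be sent with `urllib.request.urlopen(request, timeout=30)`; see Choose A Serverless Model Name for the value to pass as `model`.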
Start Here
- Serverless Chat Completions
- Serverless Models
- Serverless Responses
- API Quickstart
- Choose A Serverless Model Name

