Use this section when you are calling Tensormesh inference directly over HTTP instead of through the Python SDK or the tm CLI. Direct inference callers should expect 429 rate limits on busy surfaces, honor Retry-After when present, and avoid automatic retries around non-idempotent writes unless duplicate effects are acceptable.
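The retry guidance above can be sketched as a small decision helper. This is a minimal illustration, not Tensormesh's own client logic: the function names, the `duplicates_ok` flag, and the exponential fallback (1 s base, 30 s cap) are assumptions made for the example. Only the 429 status, the Retry-After header, and the caution about non-idempotent writes come from the text.

```python
import email.utils
import time
from datetime import timezone
from typing import Mapping, Optional


def parse_retry_after(value: Optional[str], now: Optional[float] = None) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or unparseable.

    Retry-After may be a whole number of seconds or an HTTP-date.
    """
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an explicit zone as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = time.time() if now is None else now
    return max(0.0, when.timestamp() - current)


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    idempotent: bool,
    duplicates_ok: bool = False,  # hypothetical flag: caller accepts duplicate effects
    base: float = 1.0,            # assumed fallback backoff base, in seconds
    cap: float = 30.0,            # assumed fallback backoff ceiling, in seconds
) -> Optional[float]:
    """Return how long to wait before retrying, or None if the call should not be retried."""
    if status != 429:
        return None
    if not idempotent and not duplicates_ok:
        # Do not auto-retry writes that could apply twice.
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    hinted = parse_retry_after(lowered.get("retry-after"))
    if hinted is not None:
        # Honor the server's hint as given.
        return hinted
    # No hint: fall back to capped exponential backoff.
    return min(cap, base * (2 ** attempt))
```

A caller would sleep for the returned number of seconds and then resend, and treat `None` as "surface the error to the user". Keeping the decision separate from the HTTP call makes the policy easy to test without network access.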

Surface

Serverless: OpenAI-compatible chat completions, plus verified models, completions, responses, tokenize, detokenize, health, and version endpoints on the public serverless host.

Auth: Authorization: Bearer <API_KEY> for POST routes. GET /v1/models, /health, and /version also work on the public host without auth.
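As a rough sketch of the auth rule above, the helper below builds (but does not send) requests with Python's standard library. The base URL is a placeholder, since the actual serverless host is not given here, and the `/v1/chat/completions` path and the model name in the usage note are assumptions based on the usual OpenAI-compatible layout. Only the three unauthenticated GET paths and the Bearer header format come from the text.

```python
import json
import urllib.request
from typing import Any, Mapping, Optional

# Placeholder host: substitute the real public serverless host.
BASE_URL = "https://serverless.example.invalid"

# GET routes described as working without auth on the public host.
PUBLIC_GET_PATHS = {"/v1/models", "/health", "/version"}


def build_request(
    path: str,
    api_key: Optional[str] = None,
    payload: Optional[Mapping[str, Any]] = None,
) -> urllib.request.Request:
    """Build a GET (no payload) or JSON POST (with payload) request."""
    needs_key = payload is not None or path not in PUBLIC_GET_PATHS
    if needs_key and not api_key:
        raise ValueError(f"{path} requires an API key")

    headers = {}
    data = None
    method = "GET"
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
        method = "POST"
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
```

For example, `build_request("/health")` yields an unauthenticated GET, while `build_request("/v1/chat/completions", api_key=key, payload={"model": "my-model", "messages": [...]})` yields an authenticated POST. Passing the result to `urllib.request.urlopen` would send it.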

Start Here

If you need management APIs for users, models, billing, support, logs, or metrics, use the Control Plane API tab instead of this section.