The Tensormesh Python SDK is the main application-integration surface for Tensormesh. Use it when you want to:
  • call serverless and on-demand inference endpoints from Python, including chat completions, models, completions, responses, tokenize, detokenize, health, and version
  • work with Tensormesh control-plane resources such as models, users, billing, and support
  • choose between synchronous and asynchronous clients without changing the overall API shape
The SDK ships in the tensormesh Python distribution, alongside the tm CLI; the SDK is the better starting point when you are building application code or services. Both the serverless and on-demand namespaces expose the same inference endpoints: chat.completions, models, completions, responses, tokenize, detokenize, health, and version. On the default public serverless host, models, health, and version also work without an inference API key. Embeddings and audio endpoints are not currently exposed on this SDK surface.

For the shortest first-success path, start with serverless inference and an inference API key. Switch to on-demand inference once you have the routing values for your deployment.
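As a first-success sketch, the following shows the serverless path described above. The client name and the inference.serverless namespace come from this page; the api_key constructor argument, the placeholder model id, and the OpenAI-style create(...) call shape are assumptions, not confirmed API details, so check the generated reference before relying on them. The import is guarded so the sketch degrades gracefully when the package is absent.

```python
# Hypothetical quickstart sketch. The client name and namespace paths come
# from this page; the constructor argument, model id, and create(...) call
# shape are assumptions modeled on common inference SDKs.
payload = {
    "model": "example-model",  # placeholder model id (assumption)
    "messages": [{"role": "user", "content": "Hello, Tensormesh!"}],
}

try:
    from tensormesh import Tensormesh  # ships in the tensormesh distribution

    client = Tensormesh(api_key="YOUR_INFERENCE_API_KEY")  # assumed parameter name

    # Per this page: on the default public serverless host, models, health,
    # and version work without an inference API key.
    print(client.inference.serverless.health())
    print(client.inference.serverless.version())

    # Chat completions require an inference API key.
    response = client.inference.serverless.chat.completions.create(**payload)
    print(response)
except Exception as exc:  # not installed, or no network/credentials in this sketch
    print(f"sketch not executed: {exc}")
```

The same attribute path with on_demand in place of serverless targets an on-demand deployment once you have its routing values.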

Prerequisites

Python 3.12 or newer.

Install

pip install tensormesh

Main Client Surfaces

  • Tensormesh: synchronous client for scripts, notebooks, and request/response applications
  • AsyncTensormesh: asynchronous client for services and async application stacks
Both clients expose the same top-level namespaces:
  • client.inference.serverless
  • client.inference.on_demand
  • client.control_plane
On the inference side, both namespaces expose chat.completions, models, completions, responses, tokenize, detokenize, health, and version.
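Because both clients keep the same API shape, switching between them is mostly a matter of awaiting calls. A minimal sketch of that parity, assuming a no-argument constructor and that version() is a plain call on the serverless namespace (both assumptions, not confirmed by this page):

```python
import asyncio

# Sketch of sync/async shape parity. Client names and namespace paths come
# from this page; constructor arguments and call shapes are assumptions.

def sync_version():
    from tensormesh import Tensormesh
    client = Tensormesh()
    return client.inference.serverless.version()

async def async_version():
    from tensormesh import AsyncTensormesh
    client = AsyncTensormesh()
    # Same attribute path as the sync client; the call is awaited instead.
    return await client.inference.serverless.version()

if __name__ == "__main__":
    try:
        print(sync_version())
        print(asyncio.run(async_version()))
    except Exception as exc:  # not installed, or no network in this sketch
        print(f"sketch not executed: {exc}")
```

Prefer AsyncTensormesh inside services that already run an event loop; mixing the synchronous client into async code blocks the loop for the duration of each request.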

Choose A Guide

Use the Control Plane API tab in the docs navigation for generated management API reference.