On the inference side, the current Tensormesh SDK covers only a narrow migration path:
  • serverless chat completions are supported
  • a serverless Responses client is supported
  • there is no embeddings client

Fastest Serverless Migration

If your existing app already uses chat completions, serverless is the closest fit.
from tensormesh import Tensormesh
from tensormesh.types import ChatMessage

# Pass the inference API key explicitly; the SDK does not read CLI login state.
client = Tensormesh(inference_api_key="YOUR_INFERENCE_API_KEY")

# Replace the placeholder with a valid serverless model name.
serverless_model_name = "YOUR_SERVERLESS_MODEL_NAME"
completion = client.inference.serverless.chat.completions.create(
    model=serverless_model_name,
    messages=[ChatMessage(role="user", content="Say hello.")],
)

# Print the text of the first returned choice.
print(completion.choices[0].message.content)
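
The example above passes the key as a literal placeholder. In application code the key is often read from the environment and checked before the client is created. The following is a minimal sketch of that pattern, not part of the SDK: the helper `load_inference_api_key` and the variable name `MY_APP_INFERENCE_API_KEY` are hypothetical, application-defined names, and they are not the SDK's documented environment variables.

```python
import os
from typing import Mapping, Optional


def load_inference_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "MY_APP_INFERENCE_API_KEY",  # hypothetical, app-defined name
) -> str:
    """Return the inference API key from the environment, or raise if it is unset."""
    source = os.environ if env is None else env
    # Treat a whitespace-only value the same as a missing one.
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the Tensormesh client.")
    return key


# Usage, with the constructor argument shown in the example above:
#   client = Tensormesh(inference_api_key=load_inference_api_key())
```

Failing at startup with a clear message tends to be easier to debug than an authentication error on the first request.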

What Changes From OpenAI Or Fireworks

  • Serverless uses client.inference.serverless.chat.completions.create(...), not client.chat.completions.create(...).
  • Serverless also exposes client.inference.serverless.responses.create(...) when you want the verified responses surface.
  • This SDK does not currently expose embeddings.
  • Message content is text-oriented in this SDK surface; multimodal content-part request shapes are not modeled here.
  • Structured output is limited to response_format={"type": "json_object"} or ResponseFormat(type="json_object"). JSON Schema-style json_schema response formats are not supported on this surface.
  • CLI login state is not read automatically by the Python SDK. Application code must pass credentials explicitly or via the documented SDK environment variables.
  • Use serverless for an OpenAI-style chat flow on this SDK surface.
  • If you do not already know a valid serverless model name, start with "Choose A Serverless Model Name" before copying the serverless example.
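
Two of the differences above, text-only message content and `json_object`-only structured output, usually call for small adapters in code migrated from an OpenAI-style client. The sketch below shows one possible shape for them. It assumes the existing messages follow the OpenAI convention, where `content` is either a string or a list of parts such as `{"type": "text", "text": "..."}`. The helper names `flatten_text_content` and `parse_json_object` are hypothetical, and neither helper calls the Tensormesh SDK.

```python
import json
from typing import Any, Dict, List, Sequence, Union

# OpenAI-style message content: a plain string or a list of content parts.
Content = Union[str, List[Dict[str, Any]]]


def flatten_text_content(content: Content) -> str:
    """Collapse OpenAI-style content parts into a single text string.

    Raises ValueError on any non-text part (for example an image part),
    since multimodal parts are not modeled on this SDK surface.
    """
    if isinstance(content, str):
        return content
    texts = []
    for part in content:
        if part.get("type") != "text":
            raise ValueError(f"Unsupported content part type: {part.get('type')!r}")
        texts.append(part["text"])
    return "\n".join(texts)


def parse_json_object(reply_text: str, required_keys: Sequence[str] = ()) -> Dict[str, Any]:
    """Parse a json_object-mode reply and check required keys locally.

    This stands in for the server-side checks that a json_schema
    response format would otherwise provide.
    """
    data = json.loads(reply_text)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"Reply is missing keys: {missing}")
    return data
```

The flattened string can be passed as `ChatMessage(role=..., content=...)`. `parse_json_object` can be applied to `completion.choices[0].message.content` from a request made with `response_format={"type": "json_object"}`.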