When To Use This

Use tm infer chat when you want to send a real chat inference request through the CLI.
  • Choose --surface serverless when you already know the serverless model name you want.
  • Use the default On-Demand flow after tm auth login and tm init --sync when you want the CLI to reuse the synced managed gateway settings.

Usage

tm infer chat [OPTIONS]

Examples

Send a non-streaming On-Demand chat request after tm auth login and tm init --sync.

tm infer chat --json '[{"role":"user","content":"Say hello."}]'

Send a non-streaming Serverless request with an explicit inference API key.

tm infer chat --surface serverless --api-key YOUR_INFERENCE_API_KEY --model YOUR_SERVERLESS_MODEL_NAME --json '[{"role":"user","content":"Say hello."}]'

Stream tokens from the selected surface after the usual On-Demand setup, and fail a stalled stream after 30 seconds of upstream silence.

echo '[{"role":"user","content":"Stream tokens."}]' | tm infer chat --stream --stream-idle-timeout 30

Options

| Name | Type | Required | Default | Details |
| --- | --- | --- | --- | --- |
| `--surface` | `choice[on-demand\|serverless]` | no | `"on-demand"` | Inference surface to use. |
| `--model` | text | no | | Model name to use. |
| `--user-id` | text | no | | `X-User-Id` header (UUID). Only used for `--surface on-demand`. |
| `--api-key` | text | no | | Inference API key (`Authorization: Bearer …`). |
| `--base-url` | text | no | | Override the base URL for the selected surface. |
| `--stream` | boolean | no | `false` | Stream tokens via SSE. Boolean flag. |
| `--stream-idle-timeout` | float | no | `300.0` | Maximum idle read timeout in seconds for `--stream` responses. |
| `--json` | text | no | | JSON payload or `@file.json` (object or messages array). When omitted, reads piped stdin if available. |
| `--file` | path | no | | Read JSON payload/messages from a file. |
| `--timeout` | float | no | | HTTP connect timeout in seconds for the inference request. |
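The `--json @file.json` and `--file` forms accept the same payload. A minimal sketch of preparing and validating a messages file before passing it to the CLI (the tm invocations are commented out because a live run needs the credentials described under Prerequisites; `python3` is assumed available for validation):

```shell
# Build a messages payload for tm infer chat.
cat > payload.json <<'EOF'
[{"role":"user","content":"Say hello."}]
EOF

# Confirm the file parses as a JSON messages array before sending it.
python3 -c "import json; msgs = json.load(open('payload.json')); print(len(msgs))"

# Either form below passes the same payload:
# tm infer chat --json @payload.json
# tm infer chat --file payload.json
```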

Inherited Global Options

| Name | Type | Required | Default | Details |
| --- | --- | --- | --- | --- |
| `--version`, `-V` | boolean | no | `false` | Show the version and exit. Boolean flag. |
| `--config` | path | no | `"~/.config/tensormesh/config.toml"` | Path to the config TOML file. |
| `--output` | `choice[text\|json\|yaml\|raw\|table]` | no | `"text"` | Output format (`text` is human-readable; `json` is machine-friendly). |
| `--quiet` | boolean | no | `false` | Suppress non-essential output. Boolean flag. |
| `--debug` | boolean | no | `false` | Print debug logs to stderr (secrets redacted). Boolean flag. |
| `--ca-bundle` | path | no | | Path to a PEM CA bundle for TLS verification (overrides `TENSORMESH_CA_BUNDLE`). |
| `--max-retries` | integer | no | | Max retries for idempotent HTTP requests on transient errors (overrides `TENSORMESH_MAX_RETRIES`; subcommands may override). |
| `--controlplane-base` | text | no | | Override the Control Plane base URL. |
| `--gateway-provider` | text | no | | Inference Gateway provider for built-in host selection (`nebius`, `lambda`, `yotta`). |
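For scripting, `--output json` with `--quiet` keeps stdout machine-friendly. A hedged sketch of that pattern (the tm call is commented out because it needs credentials, and the response JSON written below is an assumed OpenAI-style stand-in, not a shape this page documents):

```shell
# Request machine-friendly output, then parse the response:
# tm infer chat --output json --quiet \
#   --json '[{"role":"user","content":"Say hello."}]' > reply.json

# Stand-in response so the parsing step below is runnable; the real
# shape may differ.
printf '%s' '{"choices":[{"message":{"content":"hello"}}]}' > reply.json

python3 -c "import json; print(json.load(open('reply.json'))['choices'][0]['message']['content'])"
```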

Auth Scope

  • inference-api-key

Prerequisites

  • For the default On-Demand flow, run tm auth login and tm init --sync first so the CLI has the synced inference API key, X-User-Id, and served model name.
  • Provide --model when using --surface serverless.
  • If you have Control Plane access for the same Tensormesh environment, discover published serverless model names with tm billing pricing serverless list, then pass the returned pricing[].model value with --model.
  • If you only have inference credentials, or you are targeting a different serverless host override, get the model name from your Tensormesh environment before using --surface serverless.
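The discovery step above can be sketched end to end. The tm calls are commented out (they need Control Plane access and inference credentials), and the sample JSON only mimics the `pricing[].model` shape named above; confirm the real shape against your environment:

```shell
# Discover published serverless model names (commented; needs Control Plane access):
# tm billing pricing serverless list --output json > pricing.json

# Assumed sample matching the documented pricing[].model path:
cat > pricing.json <<'EOF'
{"pricing":[{"model":"example-serverless-model"}]}
EOF

# Extract the first model name and reuse it with --model:
MODEL=$(python3 -c "import json; print(json.load(open('pricing.json'))['pricing'][0]['model'])")
echo "$MODEL"
# tm infer chat --surface serverless --model "$MODEL" \
#   --json '[{"role":"user","content":"Say hello."}]'
```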

Caveats

  • --surface on-demand requires an inference API key and X-User-Id; the standard way to populate both is tm auth login followed by tm init --sync.
  • --surface serverless does not send X-User-Id, defaults to https://serverless.tensormesh.ai, and reuses gateway_api_key as the shared inference API key when --api-key is omitted.
  • gateway_api_key is the stored inference API key used by the SDK as inference_api_key.
  • tm billing pricing serverless list helps discover published serverless model names for the current Tensormesh Control Plane environment. If you are targeting a different serverless host override, confirm the model name for that host separately.
  • --stream currently supports only --output text.
  • --model @latest is only supported for --surface on-demand.

Parent Command