Documentation Index
Fetch the complete documentation index at: https://docs.tensormesh.ai/llms.txt
Use this file to discover all available pages before exploring further.
When To Use This
Usetm infer chat when you want to send an actual serverless inference request through the CLI.
Usage
Examples
Send a non-streaming serverless request with an explicit inference API key.
Stream tokens over SSE and fail a stalled stream after 30 seconds of upstream silence.
Options
| Name | Type | Required | Default | Details |
|---|---|---|---|---|
--surface | choice[serverless] | no | "serverless" | Inference surface to use. |
--model | text | no | Model name to use. | |
--api-key | text | no | Inference API key (Authorization: Bearer …). | |
--base-url | text | no | Override the base URL for the selected surface. | |
--stream | boolean | no | false | Stream tokens via SSE. Boolean flag. |
--stream-idle-timeout | float | no | 300.0 | Maximum idle read timeout in seconds for —stream responses. |
--json | text | no | JSON payload or @file.json (object or messages array). When omitted, reads piped stdin if available. | |
--file | path | no | Read JSON payload/messages from file. | |
--timeout | float | no | HTTP connect timeout in seconds for the inference request. |
Inherited Global Options
| Name | Type | Required | Default | Details | ||||
|---|---|---|---|---|---|---|---|---|
--version, -V | boolean | no | false | Show the version and exit. Boolean flag. | ||||
--config | path | no | "~/.config/tensormesh/config.toml" | Path to config TOML file | ||||
--output | `choice[text | json | yaml | raw | table]` | no | "text" | Output format (text is human-readable; json is machine-friendly). |
--quiet | boolean | no | false | Suppress non-essential output. Boolean flag. | ||||
--debug | boolean | no | false | Print debug logs to stderr (secrets redacted). Boolean flag. | ||||
--ca-bundle | path | no | Path to a PEM CA bundle for TLS verification (overrides TENSORMESH_CA_BUNDLE). | |||||
--max-retries | integer | no | Max retries for idempotent HTTP requests on transient errors (overrides TENSORMESH_MAX_RETRIES; subcommands may override). | |||||
--controlplane-base | text | no | Override the Control Plane base URL. |
Auth Scope
- inference-api-key
Prerequisites
- Provide
--modelexplicitly or includemodelin the JSON request body. - If you have Control Plane access for the same Tensormesh environment, discover published serverless model names with
tm billing pricing serverless list, then pass the returnedpricing[].modelvalue with--model. - If you only have inference credentials, or you are targeting a different serverless host override, get the model name from your Tensormesh environment before sending the request.
Caveats
- Reuses
gateway_api_keyfromconfig.tomlas the inference API key when--api-keyis omitted. gateway_api_keyis the stored inference API key used by the SDK asinference_api_key.tm billing pricing serverless listhelps discover published serverless model names for the current Tensormesh Control Plane environment. If you are targeting a different serverless host override, confirm the model name for that host separately.--streamcurrently supports only--output text.
Related Commands
tm billing pricing serverless listtm models listtm doctor

