This guide documents the CLI’s actual scripting behavior today.

Output Modes

Use machine-readable output whenever another program will parse the result:
tm --output json version
tm --output json models list
tm --output json infer doctor
Available root output modes are:
  • text: human-readable default
  • json: structured scripting output
  • yaml: structured scripting output
  • raw: unmodified body/text when available, otherwise a compact one-line serialization of local command data
  • table: human-readable tabular output
For automation, prefer --output json. --output raw is most useful when you explicitly want passthrough text from an upstream response.
Current caveat: tm infer chat --stream, tm infer completions --stream, and tm infer responses --stream only support --output text. For streaming automation, use the SDK or consume the upstream SSE endpoint directly.

Readiness Checks And Exit Codes

For automation, use the readiness commands with --exit-status:
tm auth status --exit-status
tm doctor --exit-status
tm infer doctor --exit-status
Use the surface-specific commands when you need strict gating for one workflow:
  • tm auth status --exit-status for the local Control Plane token plus gateway credential prerequisites
  • tm infer doctor --exit-status for direct gateway inference prerequisites
  • tm doctor --exit-status for a stricter combined readiness check
Without --exit-status, those commands still print useful diagnostics, but they are not strict shell gates.
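A fail-fast gate built on these commands can be sketched as a small wrapper. The helper name gate is hypothetical and not part of the CLI; it only assumes that the wrapped command signals failure through a non-zero exit status, which is what --exit-status is for.

```shell
# gate: hypothetical helper (not part of the tm CLI).
# Runs the given command, discards its stdout, and returns its exit status.
# On failure it names the failing command on stderr.
gate() {
  status=0
  "$@" >/dev/null || status=$?
  if [ "$status" -ne 0 ]; then
    echo "readiness gate failed (exit $status): $*" >&2
  fi
  return "$status"
}

# Intended use in a CI step, with the commands listed above:
#   gate tm auth status --exit-status &&
#   gate tm infer doctor --exit-status || exit 1
```

Stdout is discarded here because the gate only cares about the exit status; drop the redirect if the diagnostics should stay in the job log.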

Timeouts

Global request timeout:
tm --timeout 20 models list
Environment default:
export TENSORMESH_TIMEOUT_SECONDS=20
Streaming inference uses:
  • --timeout for the connect timeout
  • --stream-idle-timeout for idle SSE reads
Example:
tm infer responses --surface serverless --model MiniMaxAI/MiniMax-M2.5 \
  --stream --stream-idle-timeout 60 \
  --json '{"input":"hi"}'

Retries And Rate Limits

Global retry setting:
tm --max-retries 2 models list
Environment default:
export TENSORMESH_MAX_RETRIES=2
Current CLI behavior:
  • only idempotent methods are retried automatically: GET, HEAD, OPTIONS, PUT, DELETE
  • automatic retries happen for network failures and HTTP 429, 500, 502, 503, 504
  • when a retryable HTTP response includes Retry-After, the CLI waits for that server-provided delay before retrying, capped at 8s
  • otherwise backoff is exponential: 0.5s, 1s, 2s, 4s, 8s max
  • non-idempotent POST requests are not retried automatically
tm infer chat uses POST, so setting --max-retries has no effect on inference requests. --max-retries applies to management commands such as tm models list and tm activities list.
Retry-After may be either delta-seconds or an HTTP date; both are honored when present, up to the same 8s maximum delay.
Operational advice:
  • if the service is rate-limiting you, reduce concurrency before raising --max-retries
  • remember that the wait before each retry is capped at 8s, even if the server asks for longer
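To estimate worst-case wait time for a given --max-retries, the delays described above can be modeled in a few lines. This is an illustrative sketch of the documented schedule, not the CLI's implementation. The function name retry_delay_ms is hypothetical, it assumes the 0.5s step applies to the first retry, and it handles only the delta-seconds form of Retry-After.

```shell
# retry_delay_ms: illustrative model of the documented delays (hypothetical name).
#   $1 = retry number (1 for the first retry; assumed to map to 0.5s)
#   $2 = optional Retry-After value in whole seconds (delta-seconds form only)
# Prints the wait in milliseconds.
retry_delay_ms() {
  attempt=$1
  retry_after=${2:-}
  cap_ms=8000
  if [ -n "$retry_after" ]; then
    # Server-provided delay, converted to milliseconds.
    delay_ms=$((retry_after * 1000))
  else
    # Exponential backoff: 500ms doubled per retry.
    # Clamp the exponent so large retry numbers cannot overflow the shift.
    exp=$((attempt - 1))
    if [ "$exp" -gt 4 ]; then
      exp=4
    fi
    delay_ms=$((500 << exp))
  fi
  # Both paths share the same 8s ceiling.
  if [ "$delay_ms" -gt "$cap_ms" ]; then
    delay_ms=$cap_ms
  fi
  echo "$delay_ms"
}
```

Summing retry_delay_ms over 1..N gives an upper bound on the time spent waiting between retries for --max-retries N, excluding the requests themselves.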

Pagination Patterns

Control Plane commands use the pagination shape exposed by each endpoint family.
Page/size style examples:
tm --output json activities list --page 2 --size 50
tm --output json tickets list --status TICKET_STATUS_OPEN --page 1 --size 20
Page-token style example:
tm --output json billing transactions list --page-size 25 --page-token "$NEXT_PAGE_TOKEN"
When you are scripting, prefer --output json and pass the next token or page values explicitly rather than scraping table output.
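For page-token endpoints, the loop around the command can be sketched as follows. Both walk_pages and fetch_page are hypothetical names. This guide does not document which response field carries the next token, so extracting it is left to the caller-supplied fetch_page function, which would wrap a command such as the billing transactions example above.

```shell
# walk_pages: hypothetical helper for page-token endpoints.
# It relies on a caller-supplied function fetch_page that
#   - receives the current token as $1 (empty for the first page),
#   - processes the page's items itself, and
#   - sets NEXT_PAGE_TOKEN to the next token, or "" on the last page.
# MAX_PAGES (default 1000) bounds the loop in case the token never runs out.
walk_pages() {
  token=""
  pages=0
  while [ "$pages" -lt "${MAX_PAGES:-1000}" ]; do
    NEXT_PAGE_TOKEN=""
    fetch_page "$token" || return 1
    pages=$((pages + 1))
    if [ -z "$NEXT_PAGE_TOKEN" ]; then
      return 0
    fi
    token=$NEXT_PAGE_TOKEN
  done
  echo "walk_pages: stopped after $pages pages" >&2
  return 1
}
```

fetch_page must be called directly rather than inside a command substitution, because it hands the next token back through a shell variable.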

Error Surface

The CLI distinguishes a few failure classes:
  • usage/config errors: missing required args, invalid UUIDs, invalid timeout values
  • HTTP errors: include HTTP <status> and may include request_id
  • network errors: surfaced as connection or timeout failures
Example shell pattern:
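# Create a temp file; fall back to an explicit template where mktemp -t differs.
tm_auth_json="$(
  mktemp -t tm-auth.XXXXXX 2>/dev/null || \
  mktemp "${TMPDIR:-/tmp}/tm-auth.XXXXXX"
)"
# Remove the temp file when the script exits, whatever the outcome.
trap 'rm -f "$tm_auth_json"' EXIT

# Capture stdout and stderr together so the failure output can be shown in full.
if ! tm --output json auth status --exit-status >"$tm_auth_json" 2>&1; then
  cat "$tm_auth_json"
  exit 1
fi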
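When a wrapper needs to react differently to HTTP failures, the captured output can be inspected for the HTTP <status> marker. The sketch below assumes only that the status appears as "HTTP " followed by three digits; the exact message layout is not specified in this guide, and both function names are hypothetical.

```shell
# http_status_from_output: hypothetical helper.
# Prints a 3-digit status taken from the first line of $1 that contains
# "HTTP <3 digits>", or prints nothing when no such line exists.
http_status_from_output() {
  printf '%s\n' "$1" |
    sed -n 's/.*HTTP \([0-9][0-9][0-9]\).*/\1/p' |
    head -n 1
}

# is_retryable_status: true for the statuses this guide lists as
# automatically retried (429, 500, 502, 503, 504).
is_retryable_status() {
  case "$1" in
    429|500|502|503|504) return 0 ;;
    *) return 1 ;;
  esac
}
```

An empty result from http_status_from_output points to a usage, config, or network failure rather than an HTTP error.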

Idempotency And Mutations

Mutation commands such as create, deploy, upsert, message, and add do not get automatic retries. That is intentional: if you need retry logic around non-idempotent operations, add it in your wrapper with explicit safeguards around duplicate effects.
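One possible shape for such a wrapper is sketched below. retry_mutation is a hypothetical helper, not a CLI feature. It assumes you can write a check function that returns 0 once the mutation's effect exists (for example by listing the resource), and it consults that check before every re-attempt so a request that succeeded server-side but failed client-side is not repeated.

```shell
# retry_mutation: hypothetical wrapper, not part of the tm CLI.
#   $1    = maximum number of attempts
#   $2    = name of a check function that returns 0 once the effect exists
#   $3... = the mutation command to run
# RETRY_SLEEP_SECONDS (default 2) is the pause between attempts.
retry_mutation() {
  max_attempts=$1
  already_applied=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Before every re-attempt, confirm the earlier try did not already land.
    if [ "$attempt" -gt 1 ] && "$already_applied"; then
      return 0
    fi
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "${RETRY_SLEEP_SECONDS:-2}"
    fi
  done
  return 1
}
```

The check only narrows the window for duplicates; if the effect becomes visible with a delay, a longer RETRY_SLEEP_SECONDS or a lower attempt count is the safer choice.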

What To Avoid

  • do not parse text or table output in CI
  • do not treat tm auth status, tm doctor, or tm infer doctor as strict health checks unless you add --exit-status
  • do not assume POST mutations are retried for you
  • do not set --max-retries so high that repeated retryable failures add unacceptable wait time to CI jobs

Headless And CI Authentication

For CI or remote environments where a browser cannot be opened:
tm auth login --no-open-browser --max-wait-seconds 60
The default wait ceiling is 300 seconds. Lower it for CI jobs with tight timeouts. See the Authentication Guide for the full token and refresh workflow.