Once your model is deployed, it can be managed from the Deployment Details dashboard. This page provides full visibility into the deployment’s status, configuration, performance, and operational activity.

Accessing Deployment Details

Navigate to Operations → Dashboard → Select a Deployment to view the deployment management interface. Each deployment includes four main tabs: Overview, Configuration, Metrics, and Logs & Events.

Overview Tab

The Overview tab serves as the primary control and status panel for your deployment, displaying high-level, real-time information.

Key Information

Status

The current lifecycle state of the deployment (e.g., Active, Initializing, Stopped, Error)

Cloud Provider

The cloud provider and region where the deployment is running (e.g., Nebius – EU North 1)

Run Time

The elapsed time since the deployment was created and started

Deployment ID

Unique identifier for tracking and referencing this deployment

API Endpoint

The Overview tab provides direct access to your deployment’s API credentials:
  • Endpoint URL: The base URL for making API requests to your deployed model
  • X-User-ID Header: Authentication header for user identification
  • API Key: Secure authentication token (keep this confidential)

Keep your API key secure. Never share it publicly or expose it in client-side code.

cURL Example

A ready-to-use cURL command is provided for quick testing:
curl https://external.nebius.tensormesh.ai/v1/chat/completions \
  --no-buffer \
  -H "Content-Type: application/json" \
  -H "X-User-Id: <your-x-user-id>" \
  -H "Authorization: Bearer **********************" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Write a haiku about cloud compute." }
    ],
    "temperature": 0.7
  }'
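
To keep the API key out of scripts and version control, the same request can read its credentials from environment variables. The following is a minimal sketch, not a platform-provided tool: the variable names `TM_CHAT_URL`, `TM_USER_ID`, and `TM_API_KEY` and the helper functions are hypothetical, and `TM_CHAT_URL` is assumed to hold the full request URL used in the command above.

```shell
# Hypothetical variable names (not defined by the platform):
#   TM_CHAT_URL  full request URL, as in the cURL command above
#   TM_USER_ID   value for the X-User-Id header
#   TM_API_KEY   API key copied from the Overview tab

# Return non-zero and name each variable that is unset or empty.
check_credentials() {
  missing=0
  for name in TM_CHAT_URL TM_USER_ID TM_API_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Send the same test request, reading credentials from the environment.
send_test_request() {
  check_credentials || return 1
  curl "$TM_CHAT_URL" \
    --no-buffer \
    -H "Content-Type: application/json" \
    -H "X-User-Id: $TM_USER_ID" \
    -H "Authorization: Bearer $TM_API_KEY" \
    -d '{
      "model": "openai/gpt-oss-20b",
      "messages": [
        { "role": "user", "content": "Write a haiku about cloud compute." }
      ],
      "temperature": 0.7
    }'
}
```

After exporting the three variables in the current shell session, call `send_test_request`. If any variable is missing, the function reports it and exits before contacting the endpoint.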

Deployment Information

Additional deployment information includes:
  • Model: The specific model being deployed (e.g., openai/gpt-oss-20b)
  • Created: Timestamp indicating when the deployment was initially created
  • Last Updated: Most recent modification timestamp

Configuration Tab

The Configuration tab displays the static setup parameters of your deployment.

Configuration Details

This includes:
  • Model source and path
  • Cloud provider and region selection
  • GPU type and count
  • Number of replicas
  • Memory and compute allocations
  • Network and storage settings

Metrics Tab

The Metrics tab provides real-time and historical performance insights for your LLM deployment.

Available Metrics

The dashboard displays seven key performance metrics:
  • Cache Hit Rate: Percentage of requests served from cache
  • Time to First Token (TTFT): Latency from request to first token
  • Inter-Token Latency (ITL): Time between consecutive tokens
  • Input Throughput: Request processing rate (QPS)
  • Output Throughput: Token generation rate
  • GPU Compute Utilization: GPU resource usage percentage
  • KV Cache Usage Ratio: Cache capacity utilization
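
As a rough illustration of how the two latency metrics are defined, the sketch below computes TTFT and mean ITL from client-side timestamps. The function name `latency_stats` and its inputs (a request timestamp followed by token arrival timestamps, all in milliseconds) are assumptions made for this example; the dashboard computes its own values and may measure them differently.

```shell
# Usage: latency_stats REQUEST_MS TOKEN1_MS [TOKEN2_MS ...]
# TTFT     = first token arrival time minus request time
# mean ITL = average gap between consecutive token arrivals
latency_stats() {
  if [ "$#" -lt 2 ]; then
    echo "usage: latency_stats REQUEST_MS TOKEN1_MS [TOKEN2_MS ...]" >&2
    return 1
  fi
  request_ms=$1
  shift
  printf '%s\n' "$@" | awk -v req="$request_ms" '
    NR == 1 { ttft = $1 - req; prev = $1; next }
    { gap_sum += $1 - prev; prev = $1 }
    END {
      itl = (NR > 1) ? gap_sum / (NR - 1) : 0
      printf "ttft_ms=%d mean_itl_ms=%.1f\n", ttft, itl
    }'
}

# Request sent at t=1000 ms; tokens arrive at 1250, 1280, 1310, and 1350 ms.
latency_stats 1000 1250 1280 1310 1350
# prints: ttft_ms=250 mean_itl_ms=33.3
```

In this example the first token arrives 250 ms after the request, and the three gaps between tokens (30, 30, and 40 ms) average to 33.3 ms.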

Time Range Selection

View metrics across different time windows:
  • Last Hour: Real-time monitoring
  • Last Day: Daily trends
  • Last Week: Long-term patterns

For detailed explanations of each metric, optimization strategies, and troubleshooting guidance, see the Monitoring Performance guide.

Logs & Events Tab

The Logs & Events tab displays a chronological activity log of system-generated deployment events.

Event Types

Common events include:
  • Initialization: Deployment startup progress and container launches
  • Status Changes: Transitions between deployment states (Starting, Running, Stopped)
  • Errors: Failure messages and error codes for troubleshooting
  • Scaling Events: Replica additions or removals
  • Configuration Updates: Changes to deployment settings
  • Health Checks: Regular system health verification results

Using Logs for Troubleshooting

Logs help you:
  • Diagnose deployment failures and error conditions
  • Track deployment lifecycle and state transitions
  • Identify performance degradation causes
  • Audit deployment activity and changes
  • Debug integration and API connection issues