
Overview

The Model Chatbot interface lets you chat with your deployed models in real time and run load tests against them.

Getting Started

  1. Navigate to Operations → Chatbot from the sidebar
  2. Select your active deployment from the dropdown menu
  3. Start chatting with your model or configure performance tests

Interface Tabs

The Model Chatbot interface includes four main tabs:
  • Chat - Interactive conversation with your deployed model
  • Performance - Load testing and performance metrics (BETA)
  • Test Results - View historical test results (BETA)
  • Metrics - Detailed performance analytics

Chat Interface

Older models do not always support chat-style requests. Before using the chatbot, verify that your LLM supports chat.
  1. Ensure your deployment status shows as Active (green indicator)
  2. Type your message in the input field at the bottom
  3. Click the send button or press Enter
  4. View the model’s response in the chat area

Performance Testing

BETA FEATURE: Performance testing is currently in beta. Features and metrics may change based on user feedback and ongoing improvements.

Test Configuration Parameters

Test Name (Optional) — Provide a descriptive name or leave empty for auto-generated ID
Concurrent Users — Range: 1–500 users, simulates parallel requests (default: 100)
Test Duration — Range: 10–150 seconds of continuous testing (default: 20)
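
If you prepare test settings outside the UI (for example in a planning script), the documented ranges and defaults can be captured in a small validator. This is only a sketch: the class and field names below are hypothetical and are not part of the product, and only the ranges and defaults come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PerformanceTestConfig:
    """Hypothetical container mirroring the three form fields above."""

    concurrent_users: int = 100       # documented default
    duration_seconds: int = 20        # documented default
    test_name: Optional[str] = None   # None stands for "leave empty, ID is auto-generated"

    def __post_init__(self) -> None:
        # Ranges taken from the parameter list: 1-500 users, 10-150 seconds.
        if not 1 <= self.concurrent_users <= 500:
            raise ValueError("concurrent_users must be between 1 and 500")
        if not 10 <= self.duration_seconds <= 150:
            raise ValueError("duration_seconds must be between 10 and 150")


# Defaults are accepted; out-of-range values raise ValueError.
config = PerformanceTestConfig(concurrent_users=50, duration_seconds=60)
```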

Running a Performance Test

  1. Navigate to the Performance tab
  2. Configure your test parameters
  3. Click Start Performance Test (green button)
  4. Monitor live metrics in the right panel
Each test runs continuously for the configured duration while holding the specified concurrency level. The total number of requests is not fixed in advance: it depends on how quickly your model responds.
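
As a rough illustration of why the request count varies, the sketch below estimates it from the test settings and an average response time. It assumes a closed-loop load generator, where each simulated user sends its next request as soon as the previous one completes. The documentation does not state how requests are scheduled, so treat this as an approximation; the function name is hypothetical.

```python
def estimate_total_requests(concurrent_users: int,
                            duration_seconds: float,
                            avg_response_seconds: float) -> int:
    """Estimate total requests under an assumed closed-loop load model."""
    if avg_response_seconds <= 0:
        raise ValueError("avg_response_seconds must be positive")
    # Each user completes roughly duration / response_time requests.
    requests_per_user = duration_seconds / avg_response_seconds
    return int(concurrent_users * requests_per_user)


# Same settings, different model speed:
print(estimate_total_requests(100, 20, 2.0))  # 1000
print(estimate_total_requests(100, 20, 0.5))  # 4000
```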

Understanding Live Metrics

During test execution, the Live Metrics panel displays:
  • TTFT (Time To First Token) - Latency until the first token is generated
  • ITL (Inter-Token Latency) - Time between subsequent tokens
  • Cache Hit Rate - Percentage of requests served from cache
  • Throughput - Requests processed per second
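
To make these definitions concrete, here is a minimal sketch of how such metrics could be computed from raw timings. The exact formulas the platform uses are not documented here, so the averaging choices and function names below are assumptions.

```python
from statistics import mean
from typing import Sequence


def ttft(request_sent: float, token_times: Sequence[float]) -> float:
    """Time from sending the request until the first token arrives."""
    return token_times[0] - request_sent


def mean_itl(token_times: Sequence[float]) -> float:
    """Average gap between consecutive tokens (0.0 if fewer than two tokens)."""
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return mean(gaps) if gaps else 0.0


def cache_hit_rate(cache_hits: int, total_requests: int) -> float:
    """Percentage of requests served from cache."""
    return 100.0 * cache_hits / total_requests if total_requests else 0.0


def throughput(completed_requests: int, elapsed_seconds: float) -> float:
    """Requests processed per second."""
    return completed_requests / elapsed_seconds


# One request sent at t=0.0 s, with tokens arriving at 0.5 s, 0.75 s and 1.0 s.
tokens = [0.5, 0.75, 1.0]
print(ttft(0.0, tokens))        # 0.5
print(mean_itl(tokens))         # 0.25
print(cache_hit_rate(25, 100))  # 25.0
print(throughput(200, 20.0))    # 10.0
```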

Test Results

BETA FEATURE: Test Results tracking is currently in beta. The format and available metrics may evolve as we refine this feature.
The Test Results tab stores historical performance test data, allowing you to:
  • Compare performance across different test runs
  • Track improvements or regressions
  • Analyze trends over time
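
If you copy metrics out of two test runs, a comparison like the following can flag regressions. This is a sketch only: the metric keys, the 10% tolerance, and the sample numbers are all hypothetical, and the Test Results tab is not documented here as exposing data in this form.

```python
from typing import Dict, List

# Hypothetical metric keys. For latencies a larger value is worse;
# for every other metric a larger value is treated as better.
LOWER_IS_BETTER = {"ttft_ms", "itl_ms"}


def find_regressions(baseline: Dict[str, float],
                     candidate: Dict[str, float],
                     tolerance: float = 0.10) -> List[str]:
    """Return metrics that got worse by more than `tolerance` (relative change)."""
    regressions = []
    for name, base in baseline.items():
        if name not in candidate or base == 0:
            continue  # cannot compute a relative change
        change = (candidate[name] - base) / base
        if name in LOWER_IS_BETTER:
            worse = change > tolerance
        else:
            worse = change < -tolerance
        if worse:
            regressions.append(name)
    return regressions


# Made-up numbers for illustration only.
last_week = {"ttft_ms": 200.0, "itl_ms": 20.0, "throughput_rps": 50.0}
today = {"ttft_ms": 260.0, "itl_ms": 21.0, "throughput_rps": 49.0}
print(find_regressions(last_week, today))  # ['ttft_ms']
```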