Run live demos against serverless models to visualize how KV caching accelerates inference. Navigate to Demos from the sidebar to get started.

How It Works

1. Choose a Demo
   Browse the demo catalog and select one to run.

2. Select a Model
   Pick from the available serverless models. No deployment setup required.

3. Run the Demo
   Configure and click Run. The demo streams results in real time.

Available Demos

Ask the Document: Sends 20 questions against a shared long-document prefix to demonstrate KV cache speedup. The first request is cold and must prefill the entire document; follow-up requests reuse the cached state for the shared prefix, so only the new question tokens need to be prefilled.
Try the same demo across different models to compare how architecture and model size affect cache speedup.
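
To make the cold-versus-warm difference concrete, here is a minimal toy sketch in Python. It is not how the demo or any serving engine is implemented; it only counts how many tokens would need prefill when requests share a prefix. The names `ToyPrefixCache` and `tokens_to_prefill`, and the document and question sizes, are hypothetical and chosen for illustration.

```python
def shared_prefix_len(a, b):
    """Length of the common leading run of two token lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class ToyPrefixCache:
    """Toy stand-in for a KV cache (hypothetical, for illustration only).

    It remembers earlier requests and assumes the KV state of any
    shared leading tokens can be reused instead of recomputed.
    """

    def __init__(self):
        self.cached = []  # token sequences seen so far

    def tokens_to_prefill(self, tokens):
        # Longest prefix shared with any earlier request is treated as reusable.
        reused = max((shared_prefix_len(tokens, c) for c in self.cached), default=0)
        self.cached.append(list(tokens))
        return len(tokens) - reused


# Assumed sizes: a 2000-token document and 20 two-token questions.
document = [f"doc{i}" for i in range(2000)]
questions = [[f"q{k}", "?"] for k in range(20)]

cache = ToyPrefixCache()
costs = [cache.tokens_to_prefill(document + q) for q in questions]

total_with_cache = sum(costs)
total_without_cache = sum(len(document + q) for q in questions)

print("first (cold) request prefills:", costs[0], "tokens")
print("each follow-up prefills:", costs[1], "tokens")
print("total with cache:", total_with_cache, "vs without:", total_without_cache)
```

With these made-up sizes, the cold request prefills all 2002 tokens, while each follow-up prefills only its 2 question tokens. Real speedups depend on the model, the hardware, and how the cache is managed, which is what running the demo across different models lets you observe.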