External Storage gives your serverless models a persistent KV cache bucket, so context is remembered across requests and sessions, not just within a single call. Navigate to Operations → Storage to view plans and subscribe.
Why It Matters
By default, Tensormesh’s KV cache is in-memory and scoped to a single request. Every new session starts cold: tokens that were cached in a previous session must be recomputed and are billed as regular input tokens. With an External Storage bucket, those tokens are persisted and reused across sessions.

- Persistent Cache: KV cache entries survive beyond a single request window. Long system prompts and repeated context are stored across sessions.
- More $0 Cached Tokens: A higher fraction of requests hit the cache, meaning more tokens served at $0 and a lower effective cost per call.
- Faster Responses: Requests that share context with previous sessions skip recomputation of that shared context, reducing time-to-first-token.
- No API Changes: Once your bucket is active, caching is handled automatically. No changes to your existing API calls are required.
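
To make the cost effect concrete, the following sketch computes the input cost of one call when part of its input is served from cache at $0. The per-token price and the token counts are made-up illustration values, not Tensormesh pricing, and `effective_input_cost` is a hypothetical helper rather than part of any API.

```python
def effective_input_cost(input_tokens: int, cached_tokens: int, price_per_million: float) -> float:
    """Cost of one call's input when cached tokens are billed at $0."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    billable = input_tokens - cached_tokens  # only uncached tokens are billed
    return billable * price_per_million / 1_000_000


# Hypothetical numbers: 10,000 input tokens at $1.00 per million tokens.
PRICE = 1.00
cold = effective_input_cost(10_000, 0, PRICE)      # new session, nothing cached
warm = effective_input_cost(10_000, 8_000, PRICE)  # 8,000-token prefix reused

print(f"cold: ${cold:.4f}, warm: ${warm:.4f}")
```

Under these assumed numbers, the warm call is billed for 2,000 input tokens instead of 10,000. The flat subscription fee is separate and is not modeled here.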
Storage Plans
Plans are tiered by bucket size. Subscribe or change plans anytime from Operations → Storage; billing adjusts immediately and no data is lost on upgrade.

| Plan | Best For |
|---|---|
| Bronze | Getting started — low to moderate request volume |
| Silver | Agentic developers — more headroom for parallel workloads |
| Gold | Production-scale inference — high volume and large system prompts |
External Storage is a flat monthly subscription billed separately from token usage. See Pricing Overview for how it interacts with cached token pricing.
Monitoring Your Usage
The Operations → Storage page shows:

- Live usage bar: Your current bucket fill level, so you know how much capacity you’re using.
- Per-model KV cache usage table: A breakdown of how much storage each model is consuming, so you can see exactly where your bucket is being used.
What Gets Cached
External Storage extends the same KV cache that already powers $0 cached tokens; it just makes those cached entries persist beyond a single request:

- System messages and instructions
- Shared conversation prefixes and history
- Long document contexts passed repeatedly
- Common prompt templates shared across sessions
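
Since the items above are all shared prefixes, a prompt is likely to benefit most when its stable content comes first and its variable content comes last. The sketch below illustrates this with a toy measure: it counts how many leading words two prompts have in common. Whitespace splitting stands in for real tokenization, and `shared_prefix_len` and `build_prompt` are hypothetical helpers, not part of any Tensormesh API.

```python
from itertools import takewhile


def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Number of leading items that two token lists have in common."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b)))


def build_prompt(system: str, document: str, question: str) -> list[str]:
    """Stable parts first (system, document), variable part last (question)."""
    return f"{system} {document} {question}".split()


SYSTEM = "You are a support assistant for Acme."
DOCUMENT = "Refunds are accepted within 30 days of purchase."

first = build_prompt(SYSTEM, DOCUMENT, "Can I return an opened item?")
second = build_prompt(SYSTEM, DOCUMENT, "How long do refunds take?")

reusable = shared_prefix_len(first, second)
print(f"{reusable} of {len(second)} words are a shared prefix")
```

If the question were placed before the document instead, the two prompts would diverge at the first word and the shared prefix in this toy measure would drop to zero.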
Cross-Session vs In-Request Caching
| | In-Memory Cache (default) | External Storage |
|---|---|---|
| Scope | Single request window | Across requests and sessions |
| Cost | Free (included) | Flat monthly subscription |
| Setup | None | Subscribe at Operations → Storage |
| Cache hit rate | Lower (cold on every new session) | Higher (warm on returning sessions) |
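
The hit-rate row can be illustrated with a toy model. The sketch below replays the same sessions twice: once with a cache that is emptied at the start of every session, and once with a cache that persists across sessions. It is a simplified illustration with made-up workloads, not a description of how Tensormesh implements either cache, and `hit_rate` is a hypothetical helper.

```python
def hit_rate(sessions: list[list[str]], persistent: bool) -> float:
    """Fraction of requested blocks that were already in the cache."""
    cache: set[str] = set()
    hits = total = 0
    for session in sessions:
        if not persistent:
            cache.clear()  # toy model of the default: every session starts cold
        for block in session:
            total += 1
            if block in cache:
                hits += 1
            else:
                cache.add(block)
    return hits / total if total else 0.0


# Three sessions; each sends the same system prompt twice with unique questions.
sessions = [
    ["system-prompt", "question-1", "system-prompt", "question-2"],
    ["system-prompt", "question-3", "system-prompt", "question-4"],
    ["system-prompt", "question-5", "system-prompt", "question-6"],
]

print(f"per-session cache: {hit_rate(sessions, persistent=False):.2f}")
print(f"persistent cache:  {hit_rate(sessions, persistent=True):.2f}")
```

In this toy workload the persistent cache scores higher because the system prompt is already warm when the second and third sessions begin.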

