External Storage gives your serverless models a persistent KV cache bucket, so cached context is reused across requests and sessions, not just within a single call. Navigate to Operations → Storage to view plans and subscribe.

Why It Matters

By default, Tensormesh’s KV cache is in-memory and scoped to a single request window. Every new session starts cold: tokens that were cached in a previous session must be recomputed and are billed as regular input tokens. With an External Storage bucket, those cached entries are persisted and reused across sessions.

Persistent Cache

KV cache entries survive beyond a single request window. Long system prompts and repeated context are stored across sessions.

More $0 Cached Tokens

A higher fraction of requests hit the cache, so more tokens are served at $0 and the effective cost per call is lower.
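
To make the cost effect concrete, here is a minimal arithmetic sketch. It assumes only what this page states (cached tokens are billed at $0); the per-token price and the token counts are hypothetical placeholders, not real Tensormesh rates.

```python
def effective_input_cost(total_tokens: int, cached_tokens: int,
                         price_per_million: float) -> float:
    """Input cost of one call when cached tokens are billed at $0."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    billable = total_tokens - cached_tokens
    return billable * price_per_million / 1_000_000


# Hypothetical price in USD per 1M input tokens (illustration only).
PRICE = 2.00

cold = effective_input_cost(10_000, 0, PRICE)      # nothing cached
warm = effective_input_cost(10_000, 8_000, PRICE)  # 8,000-token prefix cached

print(f"cold: ${cold:.4f}  warm: ${warm:.4f}")
```

With these placeholder numbers, the warm call bills only the 2,000 uncached tokens. Substitute the rates from Pricing Overview to estimate real costs.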

Faster Responses

Requests that share context with previous sessions skip recomputation entirely — reducing time-to-first-token.

No API Changes

Once your bucket is active, caching is handled automatically. No changes to your existing API calls are required.

Storage Plans

Plans are tiered by bucket size. Subscribe or change plans anytime from Operations → Storage — billing adjusts immediately and no data is lost on upgrade.

Plan | Best For
Bronze | Getting started: low to moderate request volume
Silver | Agentic developers: more headroom for parallel workloads
Gold | Production-scale inference: high volume and large system prompts

External Storage is a flat monthly subscription billed separately from token usage. See Pricing Overview for how it interacts with cached token pricing.
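
Because the subscription is a flat fee while cached tokens cost $0, one rough way to size a plan is to estimate how many additional cached tokens per month would offset the fee. The sketch below is only an illustration: the monthly fee and the input-token price are made-up values, and the real figures are on the Pricing Overview page.

```python
def break_even_tokens(monthly_fee: float, price_per_million: float) -> float:
    """Extra cached input tokens per month needed to offset a flat fee."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return monthly_fee / price_per_million * 1_000_000


# Both values are hypothetical, chosen only to show the arithmetic.
MONTHLY_FEE = 50.00  # USD per month for a storage plan
PRICE = 2.00         # USD per 1M regular input tokens

needed = break_even_tokens(MONTHLY_FEE, PRICE)
print(f"{needed:,.0f} extra cached tokens per month to break even")
```

This ignores latency benefits and counts only tokens that would otherwise have been billed as regular input.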

Monitoring Your Usage

The Operations → Storage page shows:
  • Live usage bar — Your current bucket fill level so you know how much capacity you’re using.
  • Per-model KV cache usage table — A breakdown of how much storage each model is consuming, so you can see exactly where your bucket is being used.

What Gets Cached

External Storage extends the same KV cache that already powers $0 cached tokens — it just makes those cached entries persist beyond a single request:
  • System messages and instructions
  • Shared conversation prefixes and history
  • Long document contexts passed repeatedly
  • Common prompt templates shared across sessions
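
The items above are all content that repeats at the start of a prompt. KV caches are typically reused by matching a request's leading tokens against earlier requests; the sketch below assumes Tensormesh behaves this way, which this page implies ("shared conversation prefixes") but does not specify. Words stand in for tokens, and `shared_prefix_len` is a hypothetical helper, not part of any Tensormesh API.

```python
def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Number of leading tokens that two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


system = ["You", "are", "a", "support", "agent", "."]
document = ["<long", "reference", "document>"]
question_1 = ["Reset", "password", "?"]
question_2 = ["Cancel", "order", "?"]

# Stable content first: the requests diverge only at the question.
stable_first = shared_prefix_len(system + document + question_1,
                                 system + document + question_2)

# Variable content first: the requests diverge at the first token.
variable_first = shared_prefix_len(question_1 + system + document,
                                   question_2 + system + document)

print(stable_first, variable_first)
```

Under this assumption, placing system messages, documents, and templates before the per-request content keeps the reusable prefix as long as possible.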

Cross-Session vs In-Request Caching

Aspect | In-Memory Cache (default) | External Storage
Scope | Single request window | Across requests and sessions
Cost | Free (included) | Flat monthly subscription
Setup | None | Subscribe at Operations → Storage
Cache hit rate | Lower (cold on every new session) | Higher (warm on returning sessions)

Even without External Storage, cached tokens are always $0. External Storage increases the fraction of requests that hit the cache.
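
The difference in hit rate can be illustrated with a toy model. This is a simplified sketch, not a description of Tensormesh internals: it assumes the default cache is emptied at the start of each session (as the "cold on every new session" row suggests), that a persistent bucket never evicts entries, and that each request consists of one shared prefix plus some new tokens. The `Request` type and all token counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    prefix_id: str      # identifies the shared context, e.g. a system prompt
    prefix_tokens: int  # tokens in the shared prefix
    new_tokens: int     # tokens unique to this request


def cached_fraction(sessions: list[list[Request]], persistent: bool) -> float:
    """Fraction of input tokens served from cache under the toy model."""
    seen: set[str] = set()
    total = cached = 0
    for session in sessions:
        if not persistent:
            seen.clear()  # modeled default: every new session starts cold
        for req in session:
            total += req.prefix_tokens + req.new_tokens
            if req.prefix_id in seen:
                cached += req.prefix_tokens
            seen.add(req.prefix_id)
    return cached / total if total else 0.0


call = Request("support-prompt", prefix_tokens=900, new_tokens=100)
sessions = [[call, call], [call, call], [call, call]]  # 3 sessions, 2 calls each

in_memory = cached_fraction(sessions, persistent=False)
external = cached_fraction(sessions, persistent=True)
print(f"in-memory: {in_memory:.2f}  external: {external:.2f}")
```

In this model the in-memory cache misses on the first call of every session, while the persistent bucket misses only on the very first call overall. Real hit rates depend on bucket size, eviction, and traffic patterns.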