Tensormesh uses a transparent, usage-based pricing model designed to align with your success. You only pay for what you use, and our pricing reflects the value you receive.

Beta Phase Pricing

Current Beta Pricing: During the beta period, you are charged only for the direct, pass-through cost of the GPU instances you allocate.

What You Pay During Beta

During beta, billing is straightforward:
  • GPU Instance Costs: You pay only the provider’s base cost for GPU hours used
  • Caching Benefits: Free during beta, with no charges for Tensormesh’s caching technology
  • Savings Reports: View estimated savings from caching in your dashboard

Why Beta Pricing Matters

While Tensormesh’s caching technology actively reduces your computational load, you are not charged for this added value. Estimated savings are displayed as a report, allowing you to experience the full benefit of Tensormesh at no extra cost.
Use the beta period to understand your workload’s cache hit rates and potential savings before the value-share model takes effect.

Post-Beta Pricing Model

Following the beta phase, pricing will be tied to the savings you realize through caching technology: a “value-share” model ensuring Tensormesh succeeds only when you save money.

Value-Share Philosophy

Our post-beta pricing model is built on a simple principle: we only earn when you save. Because we share in the value created through caching efficiency, our success is directly aligned with yours.

Pricing Formula

Tensormesh Pricing

Pricing = (GPU × GPH) + (GPU × GPH × EST × 0.3)
Where:
  • GPU = Number of GPU hours consumed
  • GPH = Price per GPU hour for the chosen cloud provider
  • EST = Estimated savings based on cache hit rate (expressed as a decimal)
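As a minimal sketch, the pricing formula can be written as a small Python function. The name `post_beta_cost`, its parameter names, and the check that EST lies between 0 and 1 are illustrative assumptions, not part of any Tensormesh API; only the formula and the 0.3 share come from this page.

```python
VALUE_SHARE = 0.3  # Tensormesh's share of the caching value, per the formula


def post_beta_cost(gpu_hours: float, price_per_gpu_hour: float, est: float) -> float:
    """Pricing = (GPU × GPH) + (GPU × GPH × EST × 0.3).

    Hypothetical helper for illustration only.
    """
    # Assumption: EST is a fraction, so values outside [0, 1] are rejected.
    if not 0.0 <= est <= 1.0:
        raise ValueError("est must be a decimal between 0 and 1")
    base = gpu_hours * price_per_gpu_hour   # pass-through GPU cost
    value_share = base * est * VALUE_SHARE  # share of the estimated caching value
    return base + value_share


# 100 GPU hours at $2.00 per hour with EST = 0.6
print(f"${post_beta_cost(100, 2.00, 0.6):.2f}")  # $236.00
```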

Baseline Comparison

Baseline = (GPU × GPH) + (GPU × GPH × EST)
The baseline represents what you would pay without Tensormesh for equivalent workload capacity. This is the cost of running the same workload without caching optimization.
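The baseline can be sketched the same way. The function name `baseline_cost` and its parameter names are hypothetical and chosen only for illustration.

```python
def baseline_cost(gpu_hours: float, price_per_gpu_hour: float, est: float) -> float:
    """Baseline = (GPU × GPH) + (GPU × GPH × EST).

    Hypothetical helper: the estimated cost of the same workload without caching.
    """
    base = gpu_hours * price_per_gpu_hour  # GPU cost actually incurred
    extra_compute = base * est             # compute that caching avoided
    return base + extra_compute


# 100 GPU hours at $2.00 per hour with EST = 0.6
print(f"${baseline_cost(100, 2.00, 0.6):.2f}")  # $320.00
```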

Understanding the Formula

The formula has three parts:
  • Base GPU cost (GPU × GPH): This is the pass-through cost of the GPU instances from your cloud provider. You pay this regardless of Tensormesh usage.
  • Caching value (GPU × GPH × EST): This represents the additional compute capacity you would need without caching. Your cache hit rate directly reduces the computational load required to serve requests.
  • Value-share (× 0.3): Tensormesh takes 30% of the value created through caching. This means you keep 70% of all caching savings.
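To make the three parts concrete, the sketch below splits a bill into its components. `CostBreakdown` and `break_down` are hypothetical names introduced for this example; only the arithmetic comes from the formulas on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostBreakdown:
    """Hypothetical container for the parts of the post-beta formula."""
    base_gpu_cost: float     # GPU × GPH, pass-through provider cost
    caching_value: float     # GPU × GPH × EST, compute avoided by caching
    tensormesh_share: float  # 30% of caching_value

    @property
    def total(self) -> float:
        # What you pay: base cost plus the value-share
        return self.base_gpu_cost + self.tensormesh_share

    @property
    def customer_savings(self) -> float:
        # What you keep: the remaining 70% of the caching value
        return self.caching_value - self.tensormesh_share


def break_down(gpu_hours: float, price_per_gpu_hour: float, est: float,
               share: float = 0.3) -> CostBreakdown:
    base = gpu_hours * price_per_gpu_hour
    value = base * est
    return CostBreakdown(base, value, value * share)


b = break_down(100, 2.00, 0.6)
print(f"base={b.base_gpu_cost:.2f} share={b.tensormesh_share:.2f} "
      f"total={b.total:.2f} kept={b.customer_savings:.2f}")
```

Keeping the components separate, rather than returning only a total, makes it easy to check that the value-share and the retained savings always add up to the full caching value.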

Example Calculation

Let’s walk through an example to see how pricing works.

Assumptions

  • GPU Hours: 100 hours
  • GPU Hourly Rate: $2.00 per hour
  • Cache Hit Rate: 60% (0.6 as decimal)
  • Estimated Savings: 60% reduced compute

Cost With Tensormesh

Cost = (GPU × GPH) + (GPU × GPH × EST × 0.3)
     = (100 × $2.00) + (100 × $2.00 × 0.6 × 0.3)
     = $200 + $36
     = $236
Breakdown:
  • Base GPU cost: $200
  • Tensormesh value-share: $36 (30% of the $120 in caching value)
  • Total: $236

Cost Without Tensormesh (Baseline)

Cost = (GPU × GPH) + (GPU × GPH × EST)
     = (100 × $2.00) + (100 × $2.00 × 0.6)
     = $200 + $120
     = $320
Breakdown:
  • Base GPU cost: $200
  • Additional compute needed: $120 (to handle the same workload without caching)
  • Total: $320

Your Savings Summary

  • Your Cost (with Tensormesh): $236
  • Baseline Cost (without Tensormesh): $320
  • Net Savings (money saved): $84 (26%)

Value Distribution:
  • Your savings: $84 (70% of the $120 saved through caching)
  • Tensormesh share: $36 (30% of the $120 saved through caching)
  • Total value created: $120 from caching efficiency
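These figures follow from subtracting the pricing formula from the baseline. The short derivation below uses the symbols defined on this page; the closed-form expression for relative savings is derived here for illustration and is not quoted from Tensormesh.

```latex
\begin{align*}
\text{Net savings} &= \text{Baseline} - \text{Pricing} \\
  &= (\text{GPU} \times \text{GPH})(1 + \text{EST})
   - (\text{GPU} \times \text{GPH})(1 + 0.3\,\text{EST}) \\
  &= 0.7 \times \text{GPU} \times \text{GPH} \times \text{EST} \\[4pt]
\frac{\text{Net savings}}{\text{Baseline}} &= \frac{0.7\,\text{EST}}{1 + \text{EST}}
\end{align*}
```

With GPU × GPH = $200 and EST = 0.6, net savings are 0.7 × 200 × 0.6 = $84, and relative savings are 0.42 / 1.6 = 0.2625, which rounds to 26%.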

Maximizing Your Savings

The more effectively you use Tensormesh’s caching, the more you save. Here are strategies to optimize your costs:
Structure your prompts with consistent prefixes and system messages to maximize cache reuse. Higher cache hit rates mean greater savings. Impact: Under the post-beta formula, you keep 70% of the caching value (GPU × GPH × EST), so every 10-point increase in estimated savings (EST) adds net savings equal to 7% of your base GPU cost.
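The relationship between estimated savings and net savings can be explored with a short script. This is a sketch under the post-beta formulas on this page; `net_savings` is a hypothetical helper, and the inputs (100 GPU hours at $2.00 per hour) are the example values used elsewhere in this document.

```python
def net_savings(gpu_hours: float, price_per_gpu_hour: float, est: float,
                share: float = 0.3) -> float:
    """Baseline minus post-beta cost, using the formulas on this page."""
    base = gpu_hours * price_per_gpu_hour
    baseline = base + base * est      # cost without caching
    cost = base + base * est * share  # cost with Tensormesh
    return baseline - cost


# Print net savings for EST from 0% to 100% in 20-point steps.
for pct in range(0, 101, 20):
    est = pct / 100
    print(f"EST {pct:>3}%: net savings ${net_savings(100, 2.00, est):.2f}")
```

Because net savings are linear in EST, each step in the loop adds the same dollar amount.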
Design prompts that encourage caching:
  • Use consistent system messages
  • Standardize common request patterns
  • Implement prompt templates for frequent queries
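One way to apply these guidelines is to keep all fixed text at the start of every prompt and append the variable part last. The sketch below assumes, as the advice above suggests, that cache reuse benefits from requests sharing a consistent prefix; how Tensormesh matches cached content is not specified on this page. The system message, template, and function names are hypothetical.

```python
import os

# Hypothetical fixed text, shared verbatim by every request.
SYSTEM_MESSAGE = "You are a support assistant. Answer concisely and cite the docs."
INSTRUCTIONS = "Use only the product documentation when answering."


def build_prompt(question: str) -> str:
    """Place fixed text first and the variable question last."""
    return f"{SYSTEM_MESSAGE}\n\n{INSTRUCTIONS}\n\nUser question: {question}"


def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    return len(os.path.commonprefix([a, b]))


p1 = build_prompt("How do I rotate my API key?")
p2 = build_prompt("What GPU types are supported?")
print(shared_prefix_length(p1, p2), "shared leading characters")
```

Putting per-request details such as timestamps or user names at the top of the prompt would shorten the shared prefix, which is why the template keeps them out of the fixed section.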
Regularly review your cache hit rates and savings reports in the dashboard. Identify patterns and optimize accordingly.
Select GPU types and replica counts that match your actual workload needs. Avoid over-provisioning while maintaining performance.

Pricing Transparency

What’s Included

Your Tensormesh pricing includes:
  • GPU instance costs (pass-through from provider)
  • Intelligent KV cache management
  • Request routing and load balancing
  • Real-time performance metrics
  • API access and authentication
  • Dashboard and monitoring tools
  • Technical support during beta

Billing Details

  • Billing Frequency: Monthly billing cycles with detailed usage reports
  • Cost Tracking: Real-time cost visibility in your dashboard
  • Savings Reports: Detailed reports showing cache efficiency and cost savings
  • Usage Analytics: Granular breakdown of GPU hours, cache hits, and costs

Frequently Asked Questions

Value-share pricing will be introduced after the beta phase concludes. Current beta users will receive advance notice and detailed transition information.
If caching provides no benefit (EST = 0), you only pay the base GPU cost with no additional charges. The formula becomes: (GPU × GPH) + (GPU × GPH × 0 × 0.3) = GPU × GPH.
EST is calculated based on your cache hit rate and represents the proportion of compute that would otherwise be required without caching. A 60% cache hit rate typically translates to approximately 60% estimated savings.
You can estimate your costs in advance. Use the pricing formula with your expected GPU hours, provider rates, and estimated cache hit rate. The dashboard also provides cost estimates during deployment configuration.
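A rough estimator along these lines might look like the following. It is a sketch, not a Tensormesh tool: `estimate_costs` is a hypothetical name, and it assumes EST equals the cache hit rate, which this page describes only as an approximation.

```python
def estimate_costs(gpu_hours: float, price_per_gpu_hour: float,
                   cache_hit_rate: float, share: float = 0.3) -> dict:
    """Estimate beta, post-beta, and baseline costs for one billing period."""
    est = cache_hit_rate  # assumption: EST is approximately the cache hit rate
    base = gpu_hours * price_per_gpu_hour
    return {
        "beta": base,                            # GPU pass-through cost only
        "post_beta": base + base * est * share,  # GPU cost plus value-share
        "baseline": base + base * est,           # same workload without caching
    }


# 100 GPU hours at $2.00 per hour with a 60% cache hit rate
for label, amount in estimate_costs(100, 2.00, 0.6).items():
    print(f"{label:>9}: ${amount:.2f}")
```

Running it with a few different hit rates gives a range of likely bills rather than a single figure, which is useful while your workload's real hit rate is still unknown.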
There are no hidden fees. Tensormesh pricing is completely transparent. You pay only GPU costs (during beta) or GPU costs plus the value-share (post-beta). No setup fees, no minimum commitments.
Pricing automatically adjusts to your actual usage and cache hit rates. You’re never locked into a pricing tier or committed to specific usage levels.