Beta Phase Pricing
Current Beta Pricing: During the beta period, you are charged only for the direct, pass-through cost of the GPU instances you allocate.
What You Pay During Beta
During beta, billing is straightforward:
- GPU Instance Costs: You pay only the provider’s base cost for GPU hours used
- Caching Benefits: Free during beta, with no charges for Tensormesh’s caching technology
- Savings Reports: View estimated savings from caching in your dashboard
Why Beta Pricing Matters
While Tensormesh’s caching technology actively reduces your computational load, you are not charged for this added value. Estimated savings are displayed as a report, allowing you to experience the full benefit of Tensormesh at no extra cost.
Post-Beta Pricing Model
Following the beta phase, pricing will be tied to the savings you realize through caching technology: a “value-share” model ensuring Tensormesh succeeds only when you save money.
Value-Share Philosophy
Our post-beta pricing model is built on a simple principle: we only earn when you save. By sharing in the value created through caching efficiency, our success is directly aligned with yours.
Pricing Formula
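Tensormesh Pricing
Total cost = (GPU × GPH) + (GPU × GPH × EST × 0.3)
GPU = Number of GPU hours used
GPH = Price per GPU hour for the chosen cloud provider
EST = Estimated savings based on cache hit rate (expressed as a decimal)
Baseline Comparison
Baseline cost = (GPU × GPH) + (GPU × GPH × EST)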
Understanding the Formula
Base GPU Cost: (GPU × GPH)
This is the pass-through cost of the GPU instances from your cloud provider. You pay this regardless of Tensormesh usage.
Caching Value: (GPU × GPH × EST)
This represents the additional compute capacity you would need without caching. Your cache hit rate directly reduces the computational load required to serve requests.
Tensormesh Share: (GPU × GPH × EST × 0.3)
This is the 30% of the caching value that Tensormesh charges. It is the only amount added on top of the base GPU cost.
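The three components above can be sketched in a few lines of Python. This is only an illustration of the formula as written on this page, not Tensormesh's billing code: the names `CostBreakdown` and `cost_breakdown` are hypothetical, and the sketch assumes that GPU means GPU hours used and that the 0.3 share rate is fixed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostBreakdown:
    base_gpu_cost: float     # GPU x GPH
    caching_value: float     # GPU x GPH x EST
    tensormesh_share: float  # GPU x GPH x EST x 0.3

    @property
    def total(self) -> float:
        # Post-beta charge: base GPU cost plus the value share.
        return self.base_gpu_cost + self.tensormesh_share

    @property
    def baseline(self) -> float:
        # Estimated cost of the same workload without caching.
        return self.base_gpu_cost + self.caching_value


def cost_breakdown(gpu_hours: float, gph: float, est: float,
                   share_rate: float = 0.3) -> CostBreakdown:
    """Split a bill into the components of the pricing formula."""
    if not 0.0 <= est <= 1.0:
        raise ValueError("est must be a decimal between 0 and 1")
    base = gpu_hours * gph
    value = base * est
    return CostBreakdown(base, value, value * share_rate)


# Inputs taken from the example calculation in this document.
example = cost_breakdown(gpu_hours=100, gph=2.00, est=0.6)
print(f"Total with Tensormesh: ${example.total:.2f}")
print(f"Baseline without caching: ${example.baseline:.2f}")
```

With EST set to 0, the caching value and the share both drop to zero and the total equals the base GPU cost.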
Example Calculation
Let’s walk through a real-world example to see how pricing works.
Assumptions
- GPU Hours: 100 hours
- GPU Hourly Rate: $2.00 per hour
- Cache Hit Rate: 60% (0.6 as decimal)
- Estimated Savings: 60% reduced compute
Cost With Tensormesh
- Base GPU cost: $200
- Tensormesh value-share: $36 (30% of the $120 in caching value)
- Total: $236
Cost Without Tensormesh (Baseline)
- Base GPU cost: $200
- Additional compute needed: $120 (to handle the same workload without caching)
- Total: $320
Your Savings Summary
- Your Cost (with Tensormesh): $236
- Baseline Cost (without Tensormesh): $320
- Net Savings: $84 (26%)
- Tensormesh share: $36 (30% of the $120 saved through caching)
- Total value created: $120 from caching efficiency
Maximizing Your Savings
The more effectively you use Tensormesh’s caching, the more you save. Here are strategies to optimize your costs:
Increase Cache Hit Rates
Structure your prompts with consistent prefixes and system messages to maximize cache reuse. Higher cache hit rates mean greater savings.
Impact: Under the pricing formula, each 10-percentage-point increase in estimated savings (EST) raises your net savings by 7% of the base GPU cost, because you keep 70% of the caching value.
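To see how net savings scale with cache hit rate, the following sketch sweeps the hit rate from 0% to 100% for a fixed base GPU cost. It assumes EST equals the cache hit rate, which this page describes as only an approximation, and the helper name `net_savings` is hypothetical.

```python
SHARE_RATE = 0.3  # Tensormesh's share of the caching value, per the pricing formula


def net_savings(base_gpu_cost: float, est: float) -> float:
    """Baseline cost minus cost with Tensormesh for a given EST."""
    caching_value = base_gpu_cost * est
    baseline = base_gpu_cost + caching_value
    with_tensormesh = base_gpu_cost + caching_value * SHARE_RATE
    return baseline - with_tensormesh


base = 200.0  # illustrative: 100 GPU hours at $2.00 per hour
for hit_rate_pct in range(0, 101, 10):
    est = hit_rate_pct / 100  # assumption: EST equals the cache hit rate
    print(f"{hit_rate_pct:3d}% hit rate -> net savings ${net_savings(base, est):.2f}")
```

Because the relationship is linear, each step in the table adds the same amount to net savings.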
Optimize Prompt Engineering
Design prompts that encourage caching:
- Use consistent system messages
- Standardize common request patterns
- Implement prompt templates for frequent queries
Monitor Performance Metrics
Regularly review your cache hit rates and savings reports in the dashboard. Identify patterns and optimize accordingly.
Right-Size GPU Allocation
Select GPU types and replica counts that match your actual workload needs. Avoid over-provisioning while maintaining performance.
Pricing Transparency
What’s Included
Your Tensormesh pricing includes:
- GPU instance costs (pass-through from provider)
- Intelligent KV cache management
- Request routing and load balancing
- Real-time performance metrics
- API access and authentication
- Dashboard and monitoring tools
- Technical support during beta
Billing Details
- Billing Frequency: Monthly billing cycles with detailed usage reports
- Cost Tracking: Real-time cost visibility in your dashboard
- Savings Reports: Detailed reports showing cache efficiency and cost savings
- Usage Analytics: Granular breakdown of GPU hours, cache hits, and costs
Frequently Asked Questions
When does the value-share pricing take effect?
What happens if my cache hit rate is 0%?
If caching provides no benefit (EST = 0), you only pay the base GPU cost with no additional charges. The formula becomes: (GPU × GPH) + (GPU × GPH × 0 × 0.3) = GPU × GPH.
How is the estimated savings (EST) calculated?
EST is calculated based on your cache hit rate and represents the proportion of compute that would otherwise be required without caching. A 60% cache hit rate typically translates to approximately 60% estimated savings.
Can I predict my costs before deploying?
Yes! Use the pricing formula with your expected GPU hours, provider rates, and estimated cache hit rate. The dashboard also provides cost estimates during deployment configuration.
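As a rough sketch of such a prediction, the snippet below applies the pricing formula to a planned deployment under a few hit-rate scenarios. The function `estimate_cost`, the plan values, and the scenario hit rates are all hypothetical. It also assumes one GPU per replica, so that GPU hours equal replicas times hours per replica; adjust that if your instances differ.

```python
def estimate_cost(replicas: int, hours_per_replica: float, gph: float,
                  est: float, share_rate: float = 0.3) -> float:
    """Estimated post-beta cost: (GPU x GPH) + (GPU x GPH x EST x 0.3)."""
    gpu_hours = replicas * hours_per_replica  # assumption: one GPU per replica
    base = gpu_hours * gph
    return base + base * est * share_rate


# Hypothetical plan: 2 replicas running 720 hours each at $2.00 per GPU hour.
plan = {"replicas": 2, "hours_per_replica": 720, "gph": 2.00}

# Illustrative hit-rate scenarios, not measured values.
for label, est in [("pessimistic", 0.2), ("expected", 0.5), ("optimistic", 0.8)]:
    print(f"{label:>11}: ${estimate_cost(est=est, **plan):,.2f}")
```

Running several scenarios gives a cost range rather than a single figure, which is useful because the real cache hit rate is not known until the workload is live.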
Are there any hidden fees?
What if my usage patterns change?
Pricing automatically adjusts to your actual usage and cache hit rates. You’re never locked into a pricing tier or committed to specific usage levels.

