Enable Prompt Caching for Claude to make this model 10x cheaper

Currently, Venice AI does not seem to support Anthropic's Prompt Caching feature for Claude models. This is a significant issue for users who want to use Claude Opus 4.5 for anything beyond simple one-off questions.

The problem is simple: Without prompt caching, users pay the full input token price on every single message, even though most of the conversation context stays the same. With prompt caching enabled, cached input tokens cost only 10% of the regular price.

That's a 90% cost reduction.

For Claude Opus 4.5 specifically:

  • Regular input price: $5.00 per million tokens

  • Cached input price: $0.50 per million tokens

Most people use Claude for programming and development work. This means long conversations where you build up context over time - discussing code, debugging, iterating on solutions. A typical coding session can easily involve 50,000 to 200,000 tokens of context that gets sent with every message.

Right now, this makes Claude via Venice extremely expensive for any serious use. You end up paying 10 times more than you would if caching was enabled. The only scenario where the current implementation makes sense is if you just have one or two quick questions. But you cannot really work with the model this way.

Anyone who uses Claude more intensively will simply switch to the direct Anthropic API, Amazon Bedrock, or Google Vertex AI, because all of these support prompt caching. Venice loses these users not because of any problem with the platform itself, but purely because of unnecessary cost inefficiency.

The implementation should be straightforward since Anthropic handles the caching on their end. Venice would just need to support the cache_control parameter in requests and pass through the caching-related fields in the response.

Enabling prompt caching would make Venice a viable option for developers and power users who care about both privacy and cost efficiency. Without it, Claude on Venice remains a premium curiosity rather than a practical tool for daily work.

Please authenticate to join the conversation.

Upvoters
Status

New Submission

Board
💡

Feature Requests

Tags

API

Date

4 months ago

Author

Jakob

Subscribe to post

Get notified by email when there are changes.