Anthropic
Automatic compression for Anthropic Claude API calls.
The with_compression() wrapper automatically compresses the system prompt and all user messages before sending them to Claude. Your existing code stays the same.
Setup
from anthropic import Anthropic
from thetokencompany.anthropic import with_compression
client = with_compression(
    Anthropic(),
    compression_api_key="ttc-...",
)
# Use Anthropic exactly as before - compression happens automatically
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Your long prompt text here..."},
    ],
)
print(response.content[0].text)
Info
The wrapper compresses the system parameter and all user messages. Assistant messages pass through unchanged, so the LLM cache is fully preserved. tool_result content blocks are compressed when the "tool" role has aggressiveness set.
Per-role aggressiveness
Set different compression levels per message role. Roles not in the dictionary are not compressed.
client = with_compression(
    Anthropic(),
    compression_api_key="ttc-...",
    aggressiveness={
        "system": 0.1,  # light - preserve system prompt
        "user": 0.4,    # moderate - compress user messages
        "tool": 0.6,    # aggressive - compress tool results
    },
)
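To make the dictionary semantics concrete, here is a minimal sketch of how a per-role lookup could behave. It is not the library's implementation: classify_block and pick_aggressiveness are hypothetical helpers, and the tool_use_id value is a placeholder. The only behavior taken from this page is that roles missing from the dictionary are not compressed and that tool_result blocks are governed by the "tool" key.

```python
from typing import Optional

# Hypothetical configuration, mirroring the example above.
AGGRESSIVENESS = {"system": 0.1, "user": 0.4, "tool": 0.6}


def classify_block(message_role: str, block: dict) -> str:
    """Map a content block to the key used for lookup (assumed rule)."""
    # In the Anthropic Messages API, tool results arrive as "tool_result"
    # blocks inside a user message, so they get their own key here.
    if block.get("type") == "tool_result":
        return "tool"
    return message_role


def pick_aggressiveness(role: str, config: dict) -> Optional[float]:
    """Return the level for a role, or None when the role is not compressed."""
    return config.get(role)


user_text = {"type": "text", "text": "Summarize the report."}
tool_output = {"type": "tool_result", "tool_use_id": "toolu_01", "content": "..."}

print(pick_aggressiveness(classify_block("user", user_text), AGGRESSIVENESS))
print(pick_aggressiveness(classify_block("user", tool_output), AGGRESSIVENESS))
print(pick_aggressiveness("assistant", AGGRESSIVENESS))
```

The first two calls resolve to the "user" and "tool" levels. The last returns None because "assistant" is absent from the dictionary, which in this sketch means the content is left untouched.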
How it works
- You call client.messages.create() as normal
- The wrapper compresses the system prompt and user messages
- Compressed messages are sent to Claude
- You receive the response as usual
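The steps above can be sketched as a small preprocessing function. This is an illustration under stated assumptions, not the wrapper's actual code: compress_request is a hypothetical name, and toy_compress is a stand-in that only collapses whitespace, since the real compression call is not described on this page.

```python
import re
from typing import Callable


def toy_compress(text: str) -> str:
    """Stand-in for the real compression call: collapses runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def compress_request(system: str, messages: list, compress: Callable[[str], str]) -> dict:
    """Compress the system prompt and user messages; leave the rest untouched."""
    prepared = []
    for message in messages:
        content = message["content"]
        # Only plain-string user content is handled in this sketch.
        if message["role"] == "user" and isinstance(content, str):
            prepared.append({**message, "content": compress(content)})
        else:
            # Assistant messages pass through unchanged.
            prepared.append(message)
    return {"system": compress(system), "messages": prepared}


request = compress_request(
    system="You are   a helpful assistant.",
    messages=[
        {"role": "user", "content": "Your   long\n prompt text here..."},
        {"role": "assistant", "content": "Earlier   reply, kept verbatim."},
    ],
    compress=toy_compress,
)
print(request["system"])
print(request["messages"][0]["content"])
print(request["messages"][1]["content"])
```

The returned dictionary holds the system and messages values that would then be passed on to client.messages.create(). The assistant turn keeps its original spacing, which matches the pass-through behavior described above.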