Anthropic

Automatic compression for Anthropic Claude API calls.

The with_compression() wrapper compresses the system prompt and all user messages before they are sent to Claude. Your existing calls to the client stay the same.

Setup

from anthropic import Anthropic
from thetokencompany.anthropic import with_compression

client = with_compression(
    Anthropic(api_key="YOUR_ANTHROPIC_API_KEY"),
    compression_api_key="ttc-...",
)

# Use Anthropic exactly as before - compression happens automatically
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Your long prompt text here..."},
    ],
)

print(response.content[0].text)

Info

The wrapper compresses the system parameter and all user messages. Assistant messages pass through unchanged, so the LLM cache is fully preserved. tool_result content blocks are compressed only when the aggressiveness dictionary includes a "tool" entry.

Per-role aggressiveness

Pass an aggressiveness dictionary to set a different compression level for each message role. Roles missing from the dictionary are not compressed.

client = with_compression(
    Anthropic(api_key="YOUR_ANTHROPIC_API_KEY"),
    compression_api_key="ttc-...",
    aggressiveness={
        "system": 0.1,   # light - preserve system prompt
        "user": 0.4,     # moderate - compress user messages
        "tool": 0.6,     # aggressive - compress tool results
    },
)
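To make the role lookup concrete, here is a minimal sketch of how a per-role dictionary could be resolved against a request. It is not the library's implementation: plan_compression is a hypothetical helper, and skipping non-text blocks such as images is an assumption. It relies only on the documented rules (missing roles are not compressed, assistant messages pass through) and on the fact that the Anthropic Messages API carries tool_result blocks inside user messages.

```python
from typing import Any, Optional


def plan_compression(
    system: Optional[str],
    messages: list[dict[str, Any]],
    aggressiveness: dict[str, float],
) -> list[tuple[str, Optional[float]]]:
    """Hypothetical helper: list each request part with the level that would apply.

    A level of None means the part is passed through unchanged.
    """
    plan: list[tuple[str, Optional[float]]] = []
    if system is not None:
        plan.append(("system", aggressiveness.get("system")))
    for i, msg in enumerate(messages):
        role, content = msg["role"], msg["content"]
        if role != "user":
            # Assistant messages are never compressed.
            plan.append((f"messages[{i}] {role}", None))
        elif isinstance(content, str):
            plan.append((f"messages[{i}] user", aggressiveness.get("user")))
        else:
            for j, block in enumerate(content):
                kind = block.get("type")
                if kind == "tool_result":
                    key = "tool"   # tool results follow the "tool" entry
                elif kind == "text":
                    key = "user"
                else:
                    key = None     # assumption: other block types are left alone
                level = aggressiveness.get(key) if key else None
                plan.append((f"messages[{i}].content[{j}] {key or kind}", level))
    return plan


if __name__ == "__main__":
    demo = plan_compression(
        "You are a helpful assistant.",
        [{"role": "user", "content": "Your long prompt text here..."}],
        {"system": 0.1, "user": 0.4, "tool": 0.6},
    )
    for label, level in demo:
        print(label, level)
```

Dropping the "user" key from the dictionary in this sketch makes every plain user message resolve to None, which mirrors the rule that unlisted roles are left uncompressed.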

How it works

1. You call client.messages.create() as normal.
2. The wrapper compresses the system prompt and user messages.
3. The compressed messages are sent to Claude.
4. You receive the response as usual.
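The four steps above can be sketched as a thin proxy around client.messages. This is an illustration of the wrapper pattern only, under loud assumptions: CompressedMessages, EchoMessages, and compress_text are hypothetical names, compress_text merely collapses whitespace as a stand-in for the real compression service, and the fake backend echoes the request so the example runs without network access or API keys.

```python
import re
from types import SimpleNamespace


def compress_text(text: str) -> str:
    """Stand-in compressor (assumption): collapse whitespace runs into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


class CompressedMessages:
    """Hypothetical proxy for client.messages that rewrites the request, then delegates."""

    def __init__(self, inner):
        self._inner = inner

    def create(self, **kwargs):
        # Step 2: compress the system prompt and string-valued user messages.
        if isinstance(kwargs.get("system"), str):
            kwargs["system"] = compress_text(kwargs["system"])
        kwargs["messages"] = [
            {**m, "content": compress_text(m["content"])}
            if m["role"] == "user" and isinstance(m["content"], str)
            else m  # assistant messages (and non-string content) pass through unchanged
            for m in kwargs.get("messages", [])
        ]
        # Steps 3 and 4: forward the rewritten request and return the response untouched.
        return self._inner.create(**kwargs)


class EchoMessages:
    """Fake backend for this demo only: returns the request it received."""

    def create(self, **kwargs):
        return kwargs


# Step 1: the caller uses the same create() signature as before.
client = SimpleNamespace(messages=CompressedMessages(EchoMessages()))
sent = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="You are   a helpful\nassistant.",
    messages=[
        {"role": "user", "content": "Your   long prompt\n\ntext here..."},
        {"role": "assistant", "content": "Kept   as is."},
    ],
)
print(sent["system"])
print(sent["messages"])
```

Because the proxy exposes the same create() signature as the object it wraps, calling code does not need to change, which is the property the wrapper relies on.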