Advanced Aggressiveness

Set different compression levels per message role.

Per-role aggressiveness

When using withCompression(), you can pass a dictionary instead of a single number. Each key is a message role, and the value is the aggressiveness for that role.

from openai import OpenAI
from thetokencompany.openai import with_compression

client = with_compression(
OpenAI(),
compression_api_key="ttc-...",
aggressiveness={
"system": 0.1, # light - preserve instructions carefully
"user": 0.4, # moderate - compress user messages
"tool": 0.6, # aggressive - compress tool/function results
},
)

# Roles not in the dict are NOT compressed
# Assistant messages always pass through unchanged

Available roles

RoleRecommendedDescription
system0.1System prompts with instructions. Use light compression to preserve intent.
user0.3–0.5User messages. Moderate compression works well for most content.
tool0.5–0.7Tool/function results. Often verbose - higher compression is safe.
Info
Assistant messages are never compressed. They always pass through unchanged so the LLM cache is fully preserved.

Uniform aggressiveness

Pass a single number to apply the same level to all roles:

# Single number applies to all roles equally
client = with_compression(
OpenAI(),
compression_api_key="ttc-...",
aggressiveness=0.3, # all roles at 0.3
)

Vercel AI SDK

Per-role aggressiveness works the same way with the Vercel AI SDK wrapper:

import { openai } from "@ai-sdk/openai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4"), {
compressionApiKey: "ttc-...",
aggressiveness: {
system: 0.1,
user: 0.3,
tool: 0.5,
},
});

When to use per-role

  • RAG pipelines - light on system prompt (0.1), moderate on user query (0.3), aggressive on retrieved documents (0.7)
  • Agentic workflows - preserve tool call instructions (system 0.1), compress tool results aggressively (tool 0.6)
  • Chat applications - compress long user messages (0.4) while keeping system instructions intact (0.1)