About Compression

How compression works, and how to choose the right level for your use case.

Choosing a compression level

The aggressiveness parameter controls how much content is removed. Choose a level based on your use case.

Light (0.05 - 0.15)

Financial reports & legal contracts
Medical records & clinical notes

Moderate (0.15 - 0.4)

Meeting transcripts & call recordings
Web-scraped page content

Aggressive (0.4 - 0.9)

Chat history summarization
Replacing Claude compact

from thetokencompany import TheTokenCompany

client = TheTokenCompany(api_key="ttc-...")

result = client.compress("Your text here...", aggressiveness=0.15)
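
If the level is chosen in code rather than by hand, the three ranges listed above can be kept in a small lookup table. This is only a sketch: the names `LEVELS` and `pick_aggressiveness` and the `position` parameter are illustrative and not part of the SDK. Only the numeric ranges come from this page.

```python
# Ranges copied from the level descriptions on this page.
# LEVELS and pick_aggressiveness are hypothetical helper names, not SDK API.
LEVELS = {
    "light": (0.05, 0.15),       # financial, legal, medical text
    "moderate": (0.15, 0.4),     # transcripts, web-scraped content
    "aggressive": (0.4, 0.9),    # chat history summarization
}


def pick_aggressiveness(level: str, position: float = 0.5) -> float:
    """Return a value inside the named range.

    position 0.0 gives the low end of the range, 1.0 the high end.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    low, high = LEVELS[level]
    return round(low + (high - low) * position, 4)


if __name__ == "__main__":
    for name in LEVELS:
        print(name, pick_aggressiveness(name))
```

The returned number can then be passed as the `aggressiveness` argument shown in the example above.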

Models

Model	Description	Status
`bear-2`	Most accurate compression. Best quality preservation.	Recommended
`bear-1.2`	Faster compression. Lower latency per request.	Available

For enterprise users, we fine-tune custom models on your specific data and use case. Contact us to learn more.

from thetokencompany import TheTokenCompany

client = TheTokenCompany(api_key="ttc-...")

result = client.compress("Your text here...", model="bear-2")

For current pricing across models, see the pricing page.

How it works

Send your prompt to the TTC compression API
The model analyzes token importance in context
Low-signal tokens are removed based on your aggressiveness setting
You receive the compressed text and pass it to any LLM
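
The flow above amounts to placing one extra call in front of the LLM request. The sketch below shows only that shape and takes both steps as plain callables, so it runs without network access. `compress_then_ask`, `compress`, and `ask_llm` are hypothetical names. In practice `compress` would wrap `client.compress(...)` and return the compressed text; how that text is read from the result object is not shown on this page.

```python
from typing import Callable


def compress_then_ask(
    prompt: str,
    compress: Callable[[str], str],
    ask_llm: Callable[[str], str],
) -> str:
    """Compress the prompt first, then hand the shorter text to the LLM."""
    compressed = compress(prompt)   # steps 1-3 happen inside the compression API
    return ask_llm(compressed)      # step 4: any LLM can consume the result


if __name__ == "__main__":
    # Stand-ins for illustration only; neither reflects real model behavior.
    def fake_compress(text: str) -> str:
        return " ".join(text.split())          # merely collapses whitespace

    def fake_llm(text: str) -> str:
        return f"(answer based on {len(text)} characters)"

    print(compress_then_ask("Summarize   this    report.", fake_compress, fake_llm))
```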

Deterministic

Compression is fully deterministic. The same input with the same aggressiveness setting will always produce the same output. This means you can safely cache compressed results and rely on consistent behavior across requests.
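
Because the output is deterministic, a cache can be keyed on the inputs alone. The following is a minimal in-memory sketch, not an SDK feature: `CompressionCache` and `cache_key` are hypothetical names, and `compress_fn` stands in for whatever function calls the API and returns the compressed text. Including the model name in the key is a conservative assumption, since this page states determinism only for the same input and aggressiveness.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(text: str, aggressiveness: float, model: str) -> str:
    """Build a stable key from everything assumed to affect the output."""
    payload = json.dumps(
        {"text": text, "aggressiveness": aggressiveness, "model": model},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CompressionCache:
    """Remember compressed results so repeated inputs skip the API call."""

    def __init__(self, compress_fn: Callable[[str, float, str], str]) -> None:
        self._compress_fn = compress_fn
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times compress_fn was actually called

    def compress(self, text: str, aggressiveness: float, model: str = "bear-2") -> str:
        key = cache_key(text, aggressiveness, model)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compress_fn(text, aggressiveness, model)
        return self._store[key]
```

A persistent store such as a database or key-value service could replace the dictionary, using the same key function.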

Advanced features

Advanced aggressiveness - set different compression levels per message role (system, user, tool)
Protect text - wrap content in <ttc_safe> tags to exclude it from compression
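
As an illustration of the protect-text feature, the sketch below wraps one part of a prompt in the tags before the prompt is sent for compression. The helper name `protect` is hypothetical, and the closing tag form `</ttc_safe>` is an assumption based on ordinary tag syntax; this page shows only the opening tag.

```python
def protect(text: str) -> str:
    """Wrap text so it is excluded from compression.

    Assumes the closing tag is </ttc_safe>; only <ttc_safe> appears on this page.
    """
    return f"<ttc_safe>{text}</ttc_safe>"


if __name__ == "__main__":
    instructions = "Answer using the contract excerpt below."
    clause = "The Supplier shall deliver within 30 days of the Effective Date."
    # The clause is kept verbatim; the surrounding text may be compressed.
    prompt = f"{instructions}\n\n{protect(clause)}"
    print(prompt)
```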