About Compression

How compression works, and how to choose the right level for your use case.

Choosing a compression level

The aggressiveness parameter controls how much content is removed. Choose a level based on your use case.

Light (0.05 - 0.15)
  • Financial reports & legal contracts
  • Medical records & clinical notes
Moderate (0.15 - 0.4)
  • Meeting transcripts & call recordings
  • Web-scraped page content
Aggressive (0.4 - 0.9)
  • Chat history summarization
  • Replacing Claude compact
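from thetokencompany import TheTokenCompany

# Create a client authenticated with your API key.
client = TheTokenCompany(api_key="ttc-...")

# 0.15 sits at the boundary between the Light and Moderate ranges.
result = client.compress("Your text here...", aggressiveness=0.15)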
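
If you would rather pick a level by name than remember the numbers, the ranges listed above can be wrapped in a small helper. This is only a sketch: the names AGGRESSIVENESS_RANGES and pick_aggressiveness are hypothetical and not part of the thetokencompany package. The returned value is meant to be passed as the aggressiveness argument of client.compress.

```python
# Ranges copied from the levels described above.
# The dictionary and helper names are hypothetical, not part of the SDK.
AGGRESSIVENESS_RANGES = {
    "light": (0.05, 0.15),
    "moderate": (0.15, 0.4),
    "aggressive": (0.4, 0.9),
}


def pick_aggressiveness(level: str, position: float = 0.5) -> float:
    """Return a value inside the documented range for `level`.

    `position` is 0.0 for the low end of the range and 1.0 for the high end.
    """
    if level not in AGGRESSIVENESS_RANGES:
        raise ValueError(f"unknown level: {level!r}")
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    low, high = AGGRESSIVENESS_RANGES[level]
    # Linear interpolation between the two ends of the range.
    return round(low + (high - low) * position, 3)
```

For example, pick_aggressiveness("light", 0.0) selects the most conservative documented setting, which suits contracts or clinical notes where almost nothing should be dropped.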

Models

Model    | Description                                           | Status
bear-2   | Most accurate compression. Best quality preservation. | Recommended
bear-1.2 | Faster compression. Lower latency per request.        | Available

For enterprise users, we fine-tune custom models on your specific data and use case. Contact us to learn more.

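from thetokencompany import TheTokenCompany

# Create a client authenticated with your API key.
client = TheTokenCompany(api_key="ttc-...")

# Select the model by name; bear-2 is the recommended model.
result = client.compress("Your text here...", model="bear-2")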

For current pricing across models, see the pricing page.

How it works

  1. Send your prompt to the TTC compression API
  2. The model analyzes token importance in context
  3. Low-signal tokens are removed based on your aggressiveness setting
  4. You receive the compressed text and pass it to any LLM
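
To make steps 2 and 3 concrete, here is a toy illustration of importance-based removal. It is not the TTC model: the scoring rule (short, frequently repeated words count as low-signal) is a stand-in assumption chosen only so the example runs locally, and toy_compress is a hypothetical name. The real service scores tokens in context with a trained model.

```python
from collections import Counter


def toy_compress(text: str, aggressiveness: float) -> str:
    """Drop roughly `aggressiveness` of the words, least informative first.

    Stand-in scoring (an assumption, not the TTC model): a word that is
    short or repeated often in the text is treated as low-signal.
    """
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be between 0.0 and 1.0")
    words = text.split()
    if not words:
        return text
    counts = Counter(w.lower() for w in words)
    # Lower score means less important.
    scores = [len(w) / counts[w.lower()] for w in words]
    n_drop = int(len(words) * aggressiveness)
    # Sort by (score, position) so ties are broken the same way every time.
    order = sorted(range(len(words)), key=lambda i: (scores[i], i))
    drop = set(order[:n_drop])
    # Keep the surviving words in their original order.
    return " ".join(w for i, w in enumerate(words) if i not in drop)
```

Raising the aggressiveness argument removes more of the lowest-scoring words, which mirrors how the setting trades fidelity for size in the real API.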

Deterministic

Compression is fully deterministic. The same input with the same aggressiveness setting will always produce the same output. This means you can safely cache compressed results and rely on consistent behavior across requests.
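
Because identical requests return identical output, a simple cache keyed on the request parameters is enough to avoid repeat calls. The sketch below is one way to do that, under stated assumptions: CompressionCache and cache_key are hypothetical names, and compress_fn is a wrapper you supply around client.compress that returns the compressed text as a string. The model name is included in the key as a precaution, since determinism is only stated for the same input and aggressiveness.

```python
import hashlib
import json
from typing import Callable, Dict


def cache_key(text: str, aggressiveness: float, model: str) -> str:
    """Build a stable key from everything that can change the output."""
    payload = json.dumps(
        {"text": text, "aggressiveness": aggressiveness, "model": model},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CompressionCache:
    """In-memory cache in front of a compression function.

    `compress_fn` is a caller-supplied wrapper (hypothetical) that takes
    (text, aggressiveness, model) and returns the compressed text.
    """

    def __init__(self, compress_fn: Callable[[str, float, str], str]) -> None:
        self._compress_fn = compress_fn
        self._store: Dict[str, str] = {}

    def compress(self, text: str, aggressiveness: float, model: str = "bear-2") -> str:
        key = cache_key(text, aggressiveness, model)
        if key not in self._store:
            # Only call the API the first time this exact request is seen.
            self._store[key] = self._compress_fn(text, aggressiveness, model)
        return self._store[key]
```

The in-memory dictionary can be swapped for any persistent key-value store; the key function stays the same.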

Advanced features