About Compression
How compression works - choose the right level for your use case.
Choosing a compression level
The aggressiveness parameter controls how much content is removed: higher values remove more. Choose a level based on how much loss your use case can tolerate.
Light (0.05 - 0.15)
- Financial reports & legal contracts
- Medical records & clinical notes
Moderate (0.15 - 0.4)
- Meeting transcripts & call recordings
- Web-scraped page content
Aggressive (0.4 - 0.9)
- Chat history summarization
- Replacing Claude compact
from thetokencompany import TheTokenCompany
client = TheTokenCompany(api_key="ttc-...")
result = client.compress("Your text here...", aggressiveness=0.15)
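The ranges above can be kept in a small lookup so that callers pick a level by name instead of hard-coding a number. This is a minimal sketch: `LEVELS` and `pick_aggressiveness` are hypothetical names, not part of the SDK, and defaulting to the midpoint of each range is an arbitrary choice.

```python
# Hypothetical helper, not part of the thetokencompany SDK.
# The ranges are copied from the level guide above.
LEVELS = {
    "light": (0.05, 0.15),
    "moderate": (0.15, 0.4),
    "aggressive": (0.4, 0.9),
}


def pick_aggressiveness(level: str, position: float = 0.5) -> float:
    """Return a value inside the named range.

    position=0.0 gives the low end of the range, 1.0 the high end,
    and 0.5 (the default) the midpoint.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    low, high = LEVELS[level]
    # Round to hide floating-point noise such as 0.09999999999999999.
    return round(low + (high - low) * position, 4)
```

The returned value can be passed as `aggressiveness=` in the `client.compress` call shown above.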
Models
| Model | Description | Status |
|---|---|---|
| bear-2 | Most accurate compression. Best quality preservation. | Recommended |
| bear-1.2 | Faster compression. Lower latency per request. | Available |
For enterprise users, we custom fine-tune models to your specific data and use case. Contact us to learn more.
from thetokencompany import TheTokenCompany
client = TheTokenCompany(api_key="ttc-...")
result = client.compress("Your text here...", model="bear-2")
For current pricing across models, see the pricing page.
How it works
- Send your prompt to the TTC compression API
- The model analyzes token importance in context
- Low-signal tokens are removed based on your aggressiveness setting
- You receive the compressed text and pass it to any LLM
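The score-then-remove idea in the steps above can be illustrated with a toy. This is not how the hosted models work: a fixed stopword list stands in for the learned importance score, and aggressiveness is assumed to mean the maximum fraction of tokens that may be dropped. `toy_compress` and `LOW_SIGNAL` are hypothetical names used only for illustration.

```python
# Toy illustration only. A stopword list stands in for the learned
# importance model, and aggressiveness is ASSUMED to be the maximum
# fraction of tokens to drop. The real service may behave differently.
LOW_SIGNAL = {"a", "an", "the", "of", "to", "that", "is", "are",
              "very", "just", "really"}


def toy_compress(text: str, aggressiveness: float) -> str:
    tokens = text.split()
    # Score each token: 0.0 = low signal, 1.0 = high signal.
    scores = [
        0.0 if t.lower().strip(".,!?") in LOW_SIGNAL else 1.0
        for t in tokens
    ]
    # Allow at most this many tokens to be removed.
    budget = int(len(tokens) * aggressiveness)
    # Lowest scores first; sorted() is stable, so ties keep text order.
    order = sorted(range(len(tokens)), key=lambda i: scores[i])
    # Only low-signal tokens are ever dropped, even if budget remains.
    drop = {i for i in order[:budget] if scores[i] == 0.0}
    # Keep the surviving tokens in their original order.
    return " ".join(t for i, t in enumerate(tokens) if i not in drop)
```

Because the toy has no randomness, calling it twice with the same text and setting returns the same string, which is the property the API also guarantees.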
Deterministic
Compression is fully deterministic. The same input with the same aggressiveness setting will always produce the same output. This means you can safely cache compressed results and rely on consistent behavior across requests.
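Since identical inputs give identical outputs, a cache keyed on the inputs never goes stale. The sketch below shows one way to do that with an in-memory dictionary. `CachedCompressor` and `cache_key` are hypothetical names, `compress_fn` is a placeholder for whatever function calls the API, and its `(text, aggressiveness, model) -> str` signature is an assumption. Including the model in the key is a cautious choice, since different models may produce different output.

```python
import hashlib
from typing import Callable, Dict


def cache_key(text: str, aggressiveness: float, model: str) -> str:
    """Build a stable key from every input assumed to affect the output."""
    payload = f"{model}\x00{aggressiveness!r}\x00{text}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedCompressor:
    """Wrap a compress function with an in-memory cache.

    `compress_fn` is a placeholder for a call into the compression API;
    its signature here is an assumption made for this sketch.
    """

    def __init__(self, compress_fn: Callable[[str, float, str], str]) -> None:
        self._compress_fn = compress_fn
        self._cache: Dict[str, str] = {}

    def compress(self, text: str, aggressiveness: float,
                 model: str = "bear-2") -> str:
        key = cache_key(text, aggressiveness, model)
        # Call the underlying function only on a cache miss.
        if key not in self._cache:
            self._cache[key] = self._compress_fn(text, aggressiveness, model)
        return self._cache[key]
```

For a cache shared across processes, the same key could be used with an external store; the dictionary here only keeps the example self-contained.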
Advanced features
- Advanced aggressiveness - set different compression levels per message role (system, user, tool)
- Protect text - wrap content in <ttc_safe> tags to exclude it from compression
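As an illustration of the protect-text feature, the sketch below wraps the part of a prompt that must survive verbatim and leaves the rest compressible. `protect` and `build_prompt` are hypothetical helpers, not SDK functions, and the closing-tag form `</ttc_safe>` is an assumption based on the phrase "wrap content in tags".

```python
# Hypothetical helpers; the closing tag </ttc_safe> is an assumption.
def protect(text: str) -> str:
    """Mark text as excluded from compression."""
    return f"<ttc_safe>{text}</ttc_safe>"


def build_prompt(instructions: str, context: str) -> str:
    """Protect exact-wording instructions; leave bulky context compressible."""
    return f"{protect(instructions)}\n\n{context}"
```

The resulting string would then be sent to the compression API in place of the raw prompt.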