Backed by Y Combinator
Supercharge LLM performance by removing redundant tokens
The Token Company offers a compression model that removes the least significant tokens from your prompts to improve accuracy and clarity.
Simple Integration
Integrate with an API call
Add compression to your existing LLM workflow with just a few lines of code. No model changes, no infrastructure complexity.
With compression:

from openai import OpenAI
import requests

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

text = "Please explain me photosynthesis in one sentence, thanks"

# Strip low-information tokens from the prompt before it reaches the LLM
compressed_res = requests.post(
    "https://api.thetokencompany.com/v1/compress",
    json={
        "input": text,
        "model": "bear-1",
        "compression_settings": {"aggressiveness": 0.8},
    },
    headers={"Authorization": "Bearer API_KEY"},
).json()
compressed_text = compressed_res["output"]

# Send the compressed prompt to the model as usual
res = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": compressed_text}],
)
Without compression, the same request goes straight to the model:

from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

text = "Please explain me photosynthesis in one sentence, thanks"

res = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": text}],
)
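The only tuning knob in the example above is aggressiveness. A simple way to choose a value is to sweep it and compare the outputs. The loop below is an illustrative sketch that reuses only the endpoint and fields shown above; the 0.2/0.5/0.8 values and the TTC_API_KEY placeholder are assumptions, not documented recommendations:

import requests

TTC_API_KEY = "API_KEY"  # hypothetical placeholder for your Token Company key
text = "Please explain me photosynthesis in one sentence, thanks"

# Compress the same prompt at several aggressiveness levels and compare results
for aggressiveness in (0.2, 0.5, 0.8):
    out = requests.post(
        "https://api.thetokencompany.com/v1/compress",
        json={
            "input": text,
            "model": "bear-1",
            "compression_settings": {"aggressiveness": aggressiveness},
        },
        headers={"Authorization": f"Bearer {TTC_API_KEY}"},
    ).json()["output"]
    print(f"aggressiveness={aggressiveness}: {out}")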
I am ready to compress!
Create an account to access the compression model.