Backed by Y Combinator

Optimize LLM context
by removing input bloat

Bear-1.2 compression removes low signal tokens from your prompts before they hit your LLM.

Backed by people behind

Hugging Face
Silo
Wolt
Y Combinator
Supercell
SVA

Save tokens and improve accuracy on your agent's background knowledge

Bear-1.2 compresses your agent's background knowledge before it enters the context window.


Intelligent semantic processing

The bear-1 and bear-1.2 models process tokens based on context and semantic intent. Compression is deterministic and low-latency.

In its most fundamental sense, compression is the process of encoding
information using fewer bits or resources than the original representation,
by identifying and eliminating statistical redundancies or irrelevant data
within a dataset. Whether applied to digital media, text, or the high-
dimensional vector spaces of Large Language Models, compression relies on
the principle that most raw information contains noise or repeating patterns
that contribute no new meaning. By applying an algorithm, or in this case
an ML-based model, that maps the input into a more compact form, you
distill the signal from the noise. For LLM inputs, this means transforming
long-form text into a dense, token-efficient representation that preserves
the original semantic intent and logical relationships while significantly
reducing the token count, thereby allowing a system to process more
information within the same fixed context window or budget.
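The redundancy-elimination principle above can be seen in miniature with run-length encoding, a classic toy compressor that collapses repeated characters. This is only an illustration of the principle; Bear models are ML-based and operate on semantics, not character runs.

```python
def rle_encode(text: str) -> str:
    """Collapse runs of repeated characters into char+count pairs.

    A toy illustration of removing statistical redundancy, e.g.
    "aaaabbbcca" -> "a4b3c2a1".
    """
    if not text:
        return ""
    out = []
    prev, count = text[0], 1
    for ch in text[1:]:
        if ch == prev:
            count += 1
        else:
            out.append(f"{prev}{count}")
            prev, count = ch, 1
    out.append(f"{prev}{count}")
    return "".join(out)

print(rle_encode("aaaabbbcca"))  # a4b3c2a1
```

Repetitive input shrinks; input with no repeats does not. Semantic compression applies the same idea at the level of meaning rather than characters.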

One API call

Send text in, get compressed text back. Drop it in before your LLM call. That's the entire integration.

POST api.thetokencompany.com/v1/compress
{
  "model": "bear-1.1",
  "input": "Your long text to compress..."
}

response
{
  "output": "Compressed text...",
  "original_input_tokens": 1284,
  "output_tokens": 436
}
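Based on the request and response shapes above, here is a minimal Python sketch using only the standard library. The Bearer auth header is an assumption (check the docs for the real scheme); the `savings` helper just derives the token reduction from the documented response fields.

```python
import json
import urllib.request

API_URL = "https://api.thetokencompany.com/v1/compress"

def compress(text: str, api_key: str, model: str = "bear-1.1") -> dict:
    """POST text to the compression endpoint and return the JSON response.

    The Authorization header format is an assumption, not from the docs.
    """
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"model": model, "input": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def savings(resp: dict) -> float:
    """Fraction of tokens removed, computed from the response fields."""
    return 1 - resp["output_tokens"] / resp["original_input_tokens"]

# Applied to the sample response shown above (1284 -> 436 tokens):
sample = {"original_input_tokens": 1284, "output_tokens": 436}
print(f"{savings(sample):.0%}")  # 66%
```

Call `compress()` on your background knowledge, then pass `resp["output"]` to your LLM in place of the original text.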
Read the docs

Use cases

LLM Entertainment & Gaming

Longer memories, richer worlds, same budget.

Meeting Transcription

Distill hours of calls into signal-dense context.

Web Scraping

Strip boilerplate from crawled pages before ingest.

Document Analysis

Fit more PDFs and reports into one context window.

Ready to compress?

Access the compression API.