February 2026

Introducing bear-1.1: Improved LLM Compression

bear-1.1 is an improved version of the bear-1 compression model. It delivers better accuracy preservation and faster compression speeds while maintaining the same simple API integration.

What's improved in bear-1.1?

bear-1.1 builds on everything that made bear-1 effective and improves on it. The core approach is the same — semantic compression that removes redundant tokens before they reach your LLM — but bear-1.1 does it better.

Better accuracy preservation

bear-1.1 is more precise about which tokens it removes. It tracks context dependencies across long inputs more reliably, resulting in higher accuracy on downstream LLM tasks than bear-1.

Faster compression speeds

An optimized inference pipeline reduces compression latency. You get the same level of token reduction with less overhead added to your LLM pipeline.

Same simple API

bear-1.1 is a drop-in replacement for bear-1. Change the model parameter from bear-1 to bear-1.1 in your API call and you're done.

Performance

bear-1.1 maintains the same compression ratios as bear-1 while improving output quality. The key metrics:

  • 66% token reduction
  • 3x cost reduction
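Assuming cost scales linearly with the number of tokens processed (a simplification, since pricing details vary), the two figures are consistent: removing 66% of tokens leaves 34%, which is roughly a 3x cost reduction. A quick sanity check:

```python
# Assumes cost is proportional to tokens processed (illustrative numbers).
original_tokens = 100_000
reduction = 0.66                                   # 66% token reduction
compressed_tokens = original_tokens * (1 - reduction)

cost_factor = original_tokens / compressed_tokens
print(f"{cost_factor:.1f}x cost reduction")        # prints "2.9x cost reduction"
```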

With bear-1.1, compressed prompts preserve more of the semantic signal that matters. This means your LLM sees cleaner input, leading to more accurate and relevant responses.

Upgrading from bear-1

If you're already using bear-1, upgrading to bear-1.1 is a one-line change. Update the model parameter in your compression API call:

- "model": "bear-1"

+ "model": "bear-1.1"

The request and response formats are identical; no other changes are needed. bear-1 remains available for existing integrations.
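As a sketch of what the upgrade looks like in client code, here is a minimal Python helper that builds the request body. The field names (`model`, `input`) follow the diff above, but the overall payload shape is an illustrative assumption rather than the documented schema:

```python
import json


def build_request(prompt: str, model: str = "bear-1.1") -> str:
    """Build a JSON request body for the compression API.

    The payload shape here is an illustrative assumption; consult the
    API reference for the documented schema.
    """
    return json.dumps({"model": model, "input": prompt})


# Upgrading from bear-1 is just the model parameter changing:
body = build_request("Your long conversation history...")
print(json.loads(body)["model"])  # prints "bear-1.1"
```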

When to use bear-1.1

We recommend bear-1.1 for all new integrations. It's the better model — improved accuracy preservation, lower latency, and the same cost. Use bear-1.1 when you need:

  • Chat applications with long conversation histories
  • Document processing and RAG pipelines
  • Any LLM workflow where you want to reduce token costs

Get started with bear-1.1

Create an account and start compressing with the latest model.