Vercel AI SDK

Compression middleware for any Vercel AI SDK provider.

Wrap any Vercel AI SDK language model with automatic prompt compression. The wrapper works with any provider, including OpenAI, Anthropic, and Google.

Quick start

import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
});

const { text } = await generateText({
  model,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Your long prompt text here..." },
  ],
});

console.log(text);

Middleware

Use compressionMiddleware to compose with other middleware:

import { wrapLanguageModel } from "ai";
import { openai } from "@ai-sdk/openai";
import { compressionMiddleware } from "the-token-company/ai-sdk";

const model = wrapLanguageModel({
  model: openai("gpt-5.4"),
  middleware: compressionMiddleware({
    compressionApiKey: "ttc-...",
  }),
});
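
When several middleware are combined, their order matters. The sketch below is a simplified, self-contained model of that idea, not the AI SDK's implementation: it assumes each middleware transforms the call parameters in array order, so anything placed after compression sees the compressed prompt. The `CallParams` and `Middleware` types, `applyTransforms`, `fakeCompression`, and `recordLength` are all hypothetical names, and the hooks are synchronous here for brevity (the real `transformParams` hook is asynchronous).

```typescript
// Simplified stand-ins for illustration; the real AI SDK types are richer.
interface CallParams {
  prompt: string;
}

interface Middleware {
  // Synchronous here for brevity; the real hook returns a promise.
  transformParams?: (params: CallParams) => CallParams;
}

// Run each middleware's transformParams in array order, feeding the
// output of one into the next.
function applyTransforms(middleware: Middleware[], params: CallParams): CallParams {
  return middleware.reduce(
    (current, m) => (m.transformParams ? m.transformParams(current) : current),
    params,
  );
}

// Stand-in for compression: collapses whitespace runs. The real middleware
// calls a remote compression service instead.
const fakeCompression: Middleware = {
  transformParams: (p) => ({ ...p, prompt: p.prompt.replace(/\s+/g, " ").trim() }),
};

// Hypothetical logging middleware: records the prompt length it receives.
const seenLengths: number[] = [];
const recordLength: Middleware = {
  transformParams: (p) => {
    seenLengths.push(p.prompt.length);
    return p;
  },
};

// recordLength runs second, so it records the length of the compressed prompt.
const result = applyTransforms([fakeCompression, recordLength], {
  prompt: "  hello    world  ",
});
console.log(result.prompt, seenLengths);
```

Swapping the two entries would make the logger record the length of the original, uncompressed prompt instead.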

Streaming

The wrapped model also works with streamText, for example in a Next.js route handler:

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model,
    messages,
  });

  return result.toDataStreamResponse();
}

Per-role aggressiveness

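Pass an object to aggressiveness to set a separate compression level for each message role. Higher values compress more aggressively:

const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
  aggressiveness: {
    system: 0.1, // light - preserve instructions
    user: 0.4, // moderate - compress user messages
    tool: 0.6, // aggressive - compress tool results
  },
});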
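
To make the per-role lookup concrete, here is a minimal sketch of how a level might be chosen for each message. It is an illustration under assumptions, not the package's internals: the `Role` and `PerRoleAggressiveness` types, the `levelFor` helper, and the explicit `fallback` argument for roles missing from the map are all hypothetical.

```typescript
// Hypothetical types for illustration; not the package's actual API surface.
type Role = "system" | "user" | "assistant" | "tool";
type PerRoleAggressiveness = Partial<Record<Exclude<Role, "assistant">, number>>;

// Pick the compression level for one message role.
// Returns null for assistant messages, which are not compressed.
function levelFor(
  role: Role,
  config: PerRoleAggressiveness,
  fallback: number, // assumed behavior for roles missing from the map
): number | null {
  if (role === "assistant") return null;
  return config[role] ?? fallback;
}

const perRole: PerRoleAggressiveness = { system: 0.1, user: 0.4, tool: 0.6 };

console.log(levelFor("system", perRole, 0.5)); // 0.1
console.log(levelFor("tool", perRole, 0.5)); // 0.6
console.log(levelFor("assistant", perRole, 0.5)); // null
```

Keeping the system level low reflects the comments in the example above: instructions are usually short and sensitive to wording, while tool results tend to be long and repetitive.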

Any provider

The wrapper works with any AI SDK provider. Here's an example with Anthropic:

import { anthropic } from "@ai-sdk/anthropic";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(anthropic("claude-sonnet-4-6"), {
  compressionApiKey: "ttc-...",
});

const { text } = await generateText({
  model,
  prompt: "Your long prompt text here...",
});

Note: The middleware compresses system, user, and tool messages. Assistant messages pass through unchanged.
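
The role rule above can be sketched as a plain function over a message list. This is only an illustration of which messages are touched, under assumptions: the `Message` type and `compressMessages` helper are hypothetical, and `collapseWhitespace` is a local stand-in for the real compression step, which runs remotely.

```typescript
// Hypothetical message shape for illustration.
type Role = "system" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
}

// Stand-in compressor: collapses whitespace runs and trims the ends.
const collapseWhitespace = (text: string): string =>
  text.replace(/\s+/g, " ").trim();

// Compress system, user, and tool messages; return assistant messages as-is.
function compressMessages(
  messages: Message[],
  compress: (text: string) => string,
): Message[] {
  return messages.map((m) =>
    m.role === "assistant" ? m : { ...m, content: compress(m.content) },
  );
}

const compressed = compressMessages(
  [
    { role: "system", content: "You  are a   helpful assistant." },
    { role: "user", content: "Summarize   this   text." },
    { role: "assistant", content: "Sure,   here   it is." },
  ],
  collapseWhitespace,
);
console.log(compressed);
```

In this sketch the assistant message keeps its original spacing, while the system and user messages come back with single spaces.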