Vercel AI SDK
Middleware compression for any Vercel AI SDK provider.
Wrap any Vercel AI SDK language model with automatic compression. Works with all providers - OpenAI, Anthropic, Google, and others.
Quick start
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
});

const { text } = await generateText({
  model,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Your long prompt text here..." },
  ],
});

console.log(text);

Middleware
Use compressionMiddleware to compose with other middleware:
import { wrapLanguageModel } from "ai";
import { openai } from "@ai-sdk/openai";
import { compressionMiddleware } from "the-token-company/ai-sdk";

const model = wrapLanguageModel({
  model: openai("gpt-5.4"),
  middleware: compressionMiddleware({
    compressionApiKey: "ttc-...",
  }),
});

Streaming
Works with streamText in Next.js API routes:
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model,
    messages,
  });
  return result.toDataStreamResponse();
}

Per-role aggressiveness
const model = withCompression(openai("gpt-5.4"), {
  compressionApiKey: "ttc-...",
  aggressiveness: {
    system: 0.1, // light - preserve instructions
    user: 0.4, // moderate - compress user messages
    tool: 0.6, // aggressive - compress tool results
  },
});

Any provider
The wrapper works with any AI SDK provider. Here's an example with Anthropic:
import { anthropic } from "@ai-sdk/anthropic";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(anthropic("claude-sonnet-4-6"), {
  compressionApiKey: "ttc-...",
});

const { text } = await generateText({
  model,
  prompt: "Your long prompt text here...",
});

Info
The middleware compresses system, user, and tool messages. Assistant messages pass through unchanged.
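
To make the role rule above concrete, here is a minimal sketch of what a per-role pass could look like. It is not the library's implementation: `Message`, `Aggressiveness`, `compressByRole`, and `truncate` are hypothetical names introduced for illustration, and the stand-in compressor only truncates text so the example runs without network access. The sketch also assumes that aggressiveness is a value between 0 and 1 and that a role with no configured level is left untouched; neither is confirmed by this page.

```typescript
// Hypothetical message shape for illustration; the AI SDK's real prompt types are richer.
type Role = "system" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
}

// Per-role levels, mirroring the `aggressiveness` option shown earlier.
type Aggressiveness = Partial<Record<Exclude<Role, "assistant">, number>>;

// Stand-in compressor (assumption): keeps the leading (1 - level) share of the text.
// A real compressor would call the compression service instead.
function truncate(text: string, level: number): string {
  const keep = Math.ceil(text.length * (1 - level));
  return text.slice(0, keep);
}

// Applies `compress` to system, user, and tool messages and returns
// assistant messages as-is. Roles without a configured level are also
// returned as-is in this sketch.
function compressByRole(
  messages: Message[],
  levels: Aggressiveness,
  compress: (text: string, level: number) => string = truncate,
): Message[] {
  return messages.map((message) => {
    const role = message.role;
    if (role === "assistant") return message;
    const level = levels[role];
    if (level === undefined) return message;
    return { ...message, content: compress(message.content, level) };
  });
}

const input: Message[] = [
  { role: "system", content: "0123456789" },
  { role: "user", content: "0123456789" },
  { role: "assistant", content: "0123456789" },
  { role: "tool", content: "0123456789" },
];

const output = compressByRole(input, { system: 0.1, user: 0.4, tool: 0.6 });
```

In this sketch the assistant entry is returned by reference, so earlier model output reaches the provider exactly as it was produced, while the other roles shrink in proportion to their configured level.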