Vercel AI SDK

Middleware compression for any Vercel AI SDK provider.

Wrap any Vercel AI SDK language model with automatic compression. The wrapper works with every AI SDK provider, including OpenAI, Anthropic, and Google.

Quick start

import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4-mini"), {
    compressionApiKey: "ttc-...",
});

const { text } = await generateText({
    model,
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Your long prompt text here..." },
    ],
});

console.log(text);

Middleware

To combine compression with other middleware, pass compressionMiddleware to the AI SDK's wrapLanguageModel:

import { wrapLanguageModel } from "ai";
import { openai } from "@ai-sdk/openai";
import { compressionMiddleware } from "the-token-company/ai-sdk";

const model = wrapLanguageModel({
    model: openai("gpt-5.4-mini"),
    middleware: compressionMiddleware({
        compressionApiKey: "ttc-...",
    }),
});

Streaming

The wrapped model also works with streamText, for example in a Next.js API route:

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(openai("gpt-5.4-mini"), {
    compressionApiKey: "ttc-...",
});

export async function POST(req: Request) {
    const { messages } = await req.json();

    const result = streamText({
        model,
        messages,
    });

    return result.toDataStreamResponse();
}

Per-role aggressiveness

Set a separate aggressiveness value for each message role. Higher values compress more heavily:

const model = withCompression(openai("gpt-5.4-mini"), {
    compressionApiKey: "ttc-...",
    aggressiveness: {
        system: 0.1,   // light - preserve instructions
        user: 0.4,     // moderate - compress user messages
        tool: 0.6,     // aggressive - compress tool results
    },
});
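
The values above suggest a scale on which higher numbers compress more. As a minimal sketch, assuming the scale runs from 0 to 1 (this page does not state the bounds), a per-role config could be sanity-checked before it is passed to withCompression. The `Role` and `AggressivenessMap` types and the `validateAggressiveness` helper are hypothetical and are not exported by the package.

```typescript
// Hypothetical types for illustration; not part of the package's API.
type Role = "system" | "user" | "tool";
type AggressivenessMap = Partial<Record<Role, number>>;

// Assumption: valid values lie in the range [0, 1].
// Returns a list of human-readable problems; an empty list means the map looks fine.
function validateAggressiveness(map: AggressivenessMap): string[] {
    const problems: string[] = [];
    for (const [role, value] of Object.entries(map)) {
        const valid =
            typeof value === "number" && !Number.isNaN(value) && value >= 0 && value <= 1;
        if (!valid) {
            problems.push(`${role}: expected a number between 0 and 1, got ${value}`);
        }
    }
    return problems;
}

const configProblems = validateAggressiveness({ system: 0.1, user: 0.4, tool: 0.6 });
console.log(configProblems.length === 0 ? "config ok" : configProblems.join("\n"));
```

A check like this is only a local guard against typos such as `4` instead of `0.4`; the compression service may apply its own rules.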

Any provider

The wrapper works with any AI SDK provider. Here's an example with Anthropic:

import { anthropic } from "@ai-sdk/anthropic";
import { generateText } from "ai";
import { withCompression } from "the-token-company/ai-sdk";

const model = withCompression(anthropic("claude-sonnet-4-5"), {
    compressionApiKey: "ttc-...",
});

const { text } = await generateText({
    model,
    prompt: "Your long prompt text here...",
});

Info

The middleware compresses system, user, and tool messages. Assistant messages pass through unchanged.
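
The role rule can be pictured with a small sketch. This is not the package's implementation: the `ChatMessage` type, the `compressibleRoles` set, and the `applyToCompressible` helper are hypothetical, and the `compress` callback stands in for the real compression call, which this page does not describe.

```typescript
// Hypothetical message shape, simplified to plain-text content.
type MessageRole = "system" | "user" | "assistant" | "tool";
interface ChatMessage {
    role: MessageRole;
    content: string;
}

// Roles that are compressed, per the note above; assistant is deliberately absent.
const compressibleRoles: ReadonlySet<MessageRole> = new Set<MessageRole>([
    "system",
    "user",
    "tool",
]);

// Hypothetical helper: run `compress` on compressible roles and
// return every other message unchanged.
function applyToCompressible(
    messages: ChatMessage[],
    compress: (text: string, role: MessageRole) => string,
): ChatMessage[] {
    return messages.map((message) =>
        compressibleRoles.has(message.role)
            ? { ...message, content: compress(message.content, message.role) }
            : message,
    );
}

// Stand-in "compressor" that only collapses whitespace, for demonstration.
const collapseWhitespace = (text: string): string => text.replace(/\s+/g, " ").trim();

const processed = applyToCompressible(
    [
        { role: "system", content: "You   are a helpful   assistant." },
        { role: "assistant", content: "Sure,   happy to help." },
    ],
    collapseWhitespace,
);
console.log(processed);
```

In this sketch the system message has its whitespace collapsed while the assistant message is returned as the same object, which mirrors the pass-through behavior described above.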