
Setup

Create a new Cloudflare Worker project and install the OpenAI SDK:
npm create cloudflare@latest kyma-worker
cd kyma-worker
npm install openai
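The scaffold also generates a Wrangler configuration. A minimal sketch for this project is below; keep whatever values the scaffold actually produced — the entry point and compatibility date here are placeholders:

```jsonc
// wrangler.jsonc — minimal sketch; prefer the file the scaffold generated
{
  "name": "kyma-worker",
  "main": "src/index.ts",
  "compatibility_date": "2024-01-01"
}
```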

Worker Code

import OpenAI from "openai";

// Bound via `wrangler secret put KYMA_API_KEY` (see below)
interface Env {
  KYMA_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("POST with { message, model? }", { status: 405 });
    }

    // Point the OpenAI SDK at the Kyma endpoint instead of api.openai.com
    const client = new OpenAI({
      baseURL: "https://kymaapi.com/v1",
      apiKey: env.KYMA_API_KEY,
    });

    const { message, model } = (await request.json()) as {
      message: string;
      model?: string;
    };

    const response = await client.chat.completions.create({
      model: model || "qwen-3.6-plus",
      messages: [{ role: "user", content: message }],
    });

    return new Response(
      JSON.stringify({ reply: response.choices[0].message.content }),
      { headers: { "Content-Type": "application/json" } }
    );
  },
};
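The handler above trusts the request body as-is. A small validation helper — a hypothetical addition, not part of the Worker template — keeps malformed requests from ever reaching the API:

```typescript
// Hypothetical helper: validate the parsed JSON body before calling the model.
// Throws on a missing or empty `message`; falls back to a default model.
function parseChatBody(body: unknown): { message: string; model: string } {
  const b = body as { message?: unknown; model?: unknown };
  if (typeof b?.message !== "string" || b.message.length === 0) {
    throw new Error("body must include a non-empty string `message`");
  }
  const model = typeof b.model === "string" ? b.model : "qwen-3.6-plus";
  return { message: b.message, model };
}
```

In the handler, wrap the parse in try/catch and return a 400 on failure instead of letting the Worker throw a 500.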

Add Your API Key

npx wrangler secret put KYMA_API_KEY
# Paste your Kyma API key from https://kymaapi.com
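For local development with `wrangler dev`, Wrangler reads secrets from a `.dev.vars` file in the project root (keep it out of version control — add it to `.gitignore`). The key value below is a placeholder:

```ini
# .dev.vars — local-only secrets picked up by `wrangler dev`
KYMA_API_KEY=sk-your-local-key
```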

Deploy

npx wrangler deploy
Your AI API is now live at https://kyma-worker.<your-subdomain>.workers.dev.

Test It

curl -X POST https://kyma-worker.<your-subdomain>.workers.dev \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!", "model": "qwen-3.6-plus"}'

Streaming

For streaming responses, forward the model's output to the client as it arrives:
const response = await client.chat.completions.create({
  model: model || "qwen-3.6-plus",
  messages: [{ role: "user", content: message }],
  stream: true,
});

// toReadableStream() emits the raw chunk objects as newline-delimited JSON
// (one JSON document per line), not SSE frames
return new Response(response.toReadableStream(), {
  headers: { "Content-Type": "application/x-ndjson" },
});
Cloudflare Workers cap CPU time (10 ms on the free plan, up to 30 s on paid plans), not wall-clock time — time spent awaiting the model doesn't count. Streaming still matters for long responses: it delivers tokens as soon as the model produces them, keeping the connection active and avoiding client or intermediary timeouts while a long completion is generated.
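Because `toReadableStream()` emits one JSON-encoded chunk per line rather than SSE frames, a client should split the stream on newlines and pull the token text out of each chunk. A minimal extractor for one line — a hypothetical helper, assuming the OpenAI chat-completion chunk shape (`choices[0].delta.content`) — might look like:

```typescript
// Hypothetical client-side helper: extract the token text from one
// newline-delimited JSON line produced by toReadableStream().
// Assumes the OpenAI chat.completion.chunk shape.
function tokenFromLine(line: string): string {
  if (line.trim() === "") return ""; // skip blank lines between chunks
  const chunk = JSON.parse(line) as {
    choices?: { delta?: { content?: string } }[];
  };
  return chunk.choices?.[0]?.delta?.content ?? "";
}
```

A client would read the response body with a `TextDecoder`, buffer partial lines, and call `tokenFromLine` on each complete line to append text to the UI.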