## Setup

Create a new Cloudflare Worker project and install the `openai` SDK, which the Worker code below imports:

```sh
npm create cloudflare@latest kyma-worker
cd kyma-worker
npm install openai
```
## Worker Code

Replace the generated `src/index.ts` with:

```ts
import OpenAI from "openai";

interface Env {
  KYMA_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("POST /chat with { message, model? }", { status: 405 });
    }

    // Point the OpenAI SDK at the Kyma endpoint.
    const client = new OpenAI({
      baseURL: "https://kymaapi.com/v1",
      apiKey: env.KYMA_API_KEY,
    });

    const { message, model } = (await request.json()) as {
      message: string;
      model?: string;
    };

    const response = await client.chat.completions.create({
      model: model || "qwen-3.6-plus",
      messages: [{ role: "user", content: message }],
    });

    return new Response(
      JSON.stringify({ reply: response.choices[0].message.content }),
      { headers: { "Content-Type": "application/json" } }
    );
  },
};
```
## Add Your API Key

```sh
npx wrangler secret put KYMA_API_KEY
# Paste your Kyma API key from https://kymaapi.com
```
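Secrets set with `wrangler secret put` are only available to the deployed Worker. For local development with `npx wrangler dev`, Wrangler reads secrets from a `.dev.vars` file in the project root instead (keep it out of version control):

```ini
# .dev.vars — local-only secrets for `wrangler dev`; do not commit
KYMA_API_KEY=sk-your-local-key
```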
## Deploy

```sh
npx wrangler deploy
```

Your AI API is now live at `https://kyma-worker.<your-subdomain>.workers.dev`.
## Test It

```sh
curl -X POST https://kyma-worker.<your-subdomain>.workers.dev \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!", "model": "qwen-3.6-plus"}'
```
## Streaming

For streaming, request a streamed completion and forward it to the client. Note that the openai SDK parses the upstream SSE stream for you; `toReadableStream()` re-serializes each chunk as one JSON object per line, so what the client receives is newline-delimited JSON rather than raw SSE:

```ts
const response = await client.chat.completions.create({
  model: model || "qwen-3.6-plus",
  messages: [{ role: "user", content: message }],
  stream: true,
});

// Each line of this stream is a JSON-encoded chat-completion chunk.
return new Response(response.toReadableStream(), {
  headers: { "Content-Type": "application/x-ndjson" },
});
```
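The stream produced by the SDK's `toReadableStream()` is newline-delimited JSON, one chat-completion chunk per line, so a client can reassemble the reply without any SDK. A sketch, where `readDeltas` is an illustrative helper name:

```ts
// Hypothetical client-side reader: accumulates delta.content from
// a stream of newline-delimited JSON chat-completion chunks.
async function readDeltas(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    let nl: number;
    while ((nl = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, nl).trim();
      buffer = buffer.slice(nl + 1);
      if (!line) continue;
      const chunk = JSON.parse(line);
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  // Handle a final chunk without a trailing newline.
  const rest = buffer.trim();
  if (rest) text += JSON.parse(rest).choices?.[0]?.delta?.content ?? "";
  return text;
}
```

Usage against the Worker: `const text = await readDeltas((await fetch(url, { method: "POST", body })).body!);`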
Cloudflare Workers limit CPU time, not wall-clock time (30 seconds by default on the paid plan, far less on the free plan). A Worker that is waiting on the upstream model consumes almost no CPU, so streaming keeps long responses within the limit and delivers tokens to the client immediately instead of buffering the whole reply.