Quickstart

Five minutes from zero to your first request. echotokens speaks the OpenAI API — point your existing SDK at our base URL and the rest of your code keeps working.

1. Get an API key

Open the keys page in the portal, click New key, and copy the sk-echo-... token. Store it in your environment (.env, secret manager, whatever you use for other API keys). The token is shown only once, when the key is created, and cannot be retrieved later.

Treat keys like passwords

Anyone holding your key can spend your wallet. If a key leaks, revoke it from the keys page and rotate every dependent service before the next deploy.
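As a minimal sketch of the "store it in your environment" advice, the helper below reads the key at startup and fails fast if it is missing. The variable name ECHOTOKENS_API_KEY is borrowed from the Node.js example in step 3, and load_api_key is a hypothetical helper, not part of any SDK.

```python
import os

ENV_VAR = "ECHOTOKENS_API_KEY"  # assumed name; any variable name works
KEY_PREFIX = "sk-echo-"         # prefix shown on the keys page


def load_api_key(environ=None):
    """Return the key from the environment, failing fast if it is missing or malformed."""
    environ = os.environ if environ is None else environ
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set")
    if not key.startswith(KEY_PREFIX):
        raise RuntimeError(f"{ENV_VAR} does not look like an echotokens key")
    return key
```

Calling load_api_key() once at process start surfaces a missing or mistyped key before the first request instead of as an authentication error later.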

2. Install the OpenAI SDK

You don't need a new SDK — the official OpenAI client works as-is. If you don't have it yet:

pip install openai
# or
npm install openai

3. Make your first call

The only difference from a vanilla OpenAI setup is the base_url. Pick your language:

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.echotokens.com/v1",
    # Read the key stored in step 1 rather than hardcoding it in source.
    api_key=os.environ["ECHOTOKENS_API_KEY"],
)

response = client.chat.completions.create(
    model="claude-opus-4.7",
    messages=[{"role": "user", "content": "Write a haiku about TCP."}],
)

print(response.choices[0].message.content)
print(f"Cost: {response.cost_usd_cents} cents")

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.echotokens.com/v1",
  apiKey: process.env.ECHOTOKENS_API_KEY,
});

const res = await client.chat.completions.create({
  model: "gpt-5.5-pro",
  messages: [{ role: "user", content: "Write a haiku about TCP." }],
});

console.log(res.choices[0].message.content);
console.log(`Cost: ${res.cost_usd_cents} cents`);

curl https://api.echotokens.com/v1/chat/completions \
  -H "Authorization: Bearer sk-echo-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{ "role": "user", "content": "Write a haiku about TCP." }]
  }'
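For environments where installing the SDK is not an option, the curl call can be reproduced with Python's standard library alone. This is a sketch under that assumption: build_chat_request and send are hypothetical helper names, and the request shape simply mirrors the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.echotokens.com/v1"


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) the same POST the curl example makes."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping construction separate from sending makes the request easy to inspect or log before any money is spent; send(build_chat_request(key, model, prompt)) performs the actual call.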

4. Read the cost field

Every successful response includes a cost_usd_cents field: an integer count of US cents charged to your wallet for that request. There's no token math to do and no model multiplier to look up. The amount is exactly what the upstream provider charged us, expressed in US cents.

res = client.chat.completions.create(model="claude-opus-4.7", messages=[...])
print(res.cost_usd_cents)
# an integer number of cents, e.g. 1  → the request cost one cent
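Because the cost arrives on every response, a running total takes only a few lines. The sketch below is one way to do it; SpendTracker is a hypothetical helper, and the stand-in responses exist only so the example runs without a network call.

```python
from types import SimpleNamespace


class SpendTracker:
    """Accumulate cost_usd_cents across responses."""

    def __init__(self):
        self.total_cents = 0
        self.requests = 0

    def record(self, response):
        # getattr with a default keeps this safe if a response lacks the field.
        cents = getattr(response, "cost_usd_cents", None)
        if cents is not None:
            self.total_cents += cents
            self.requests += 1
        return cents

    def total_usd(self):
        """Format the running total as dollars."""
        return f"${self.total_cents / 100:.2f}"


# Stand-in responses; in real code, pass the objects returned by the SDK.
tracker = SpendTracker()
for fake in (SimpleNamespace(cost_usd_cents=3), SimpleNamespace(cost_usd_cents=2)):
    tracker.record(fake)
print(tracker.total_usd())  # $0.05
```

In an application, calling tracker.record(response) right after each client.chat.completions.create(...) call gives per-process spend without any token arithmetic.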

Picking a model

Pass any model id from our catalog as the model parameter. A few good starting points:

  • General-purpose chat: claude-opus-4.7 (long-context reasoning) or gpt-5.5-pro (broad capability).
  • Fast & cheap: gemini-3-flash-preview, gpt-5-mini, haiku-4-fast.
  • Vision: any frontier model — pass image content blocks per the OpenAI spec.
  • Specialty: flux-2-pro (images), veo-3.1-quality (video), deepgram-nova-3 (transcription).

The full catalog appears in the portal's model picker; the gateway tracks upstream releases so new model IDs work the day they land.
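One way to keep model choices out of call sites is a small lookup keyed by task. The sketch below uses model IDs from the list above; the task labels and the pick_model helper are illustrative assumptions, not part of the gateway.

```python
# Task labels are hypothetical; the model IDs come from the starting points above.
STARTER_MODELS = {
    "chat": "claude-opus-4.7",
    "fast": "gemini-3-flash-preview",
    "image": "flux-2-pro",
    "video": "veo-3.1-quality",
    "transcription": "deepgram-nova-3",
}


def pick_model(task, override=None):
    """Return a model ID for a task, letting callers override the default."""
    if override:
        return override
    try:
        return STARTER_MODELS[task]
    except KeyError:
        known = ", ".join(sorted(STARTER_MODELS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```

The result goes straight into the model parameter, for example model=pick_model("fast"), so swapping a default later is a one-line change.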

Ship before you optimize

The OpenAI SDK's defaults (retries, timeouts, structured outputs, streaming) all work unchanged against echotokens. Get your code running first, then explore streaming, images, or embeddings for surface-specific patterns.
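As an illustration of streaming working unchanged, the sketch below wraps the SDK's standard stream=True mode in a generator. stream_text is a hypothetical helper; it expects a client built as in step 3 and assumes the gateway returns chunks in the usual OpenAI shape.

```python
def stream_text(client, model, prompt):
    """Yield text fragments as they arrive from a streaming chat completion."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            yield piece
```

With the client from step 3, a loop such as for piece in stream_text(client, "claude-opus-4.7", "Write a haiku about TCP."): print(piece, end="", flush=True) prints the reply as it is generated.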