Everything AI can do — one endpoint.
Hover any tile to see the top models we recommend.Tap any tile to see the top models for each capability. Swap the model string — keep your code, prompts, tools.
Chat & reasoning
Frontier chat with function-calling, tool use, structured output.
Vision
Images, PDFs, screenshots → structured understanding.
Image generation
Concept art, product mocks, brand assets. One endpoint.
Speech → text
Diarization, timestamps, 99+ languages. Realtime or batch.
Text → speech
Natural voices in 30+ languages. Stream or batch hours of audio.
Video
Generate, extend, image-to-video. Sora-class quality.
Realtime voice
Sub-300ms voice agents. Two-way audio with function calling.
Web search & enrichment
Live grounding, structured extraction, real-time facts.
300+ models.
Zero contracts.
Frontier, fast, open-weight, specialty — switch by changing one string.
You pay $15.
You get $100 of compute.
Our buying power becomes your pricing edge. Then every request bills at the upstream's actual cost — no markup, no padded tokens.
How $15 becomes $100.
You buy $100 of credit
Pay $15 in crypto or via redeem code. We seed your wallet with $100 of upstream credit immediately.
Every call bills at upstream cost
A chat request that costs us $0.0042 bills your wallet for $0.0042. Itemized in USD + tokens, no rounding tricks.
No expiry, no minimums after
Credit doesn't disappear. Re-top whenever — same 85%-off ratio applies.
launch pricing · while supplies last ·first 1,000 wallets
Three lines.
That's the migration.
Change the base URL. Change the API key. Swap the model name to anything in the catalog. Your code, your prompts, your tools — unchanged.
- 100% schema-compatible with the OpenAI SDK
- Streaming, function-calling, structured output
- Vision, audio, image — same endpoint shape
- Auto-retries, fallback models, request IDs
from openai import OpenAI
client = OpenAI(
base_url="https://api.echotokens.com/v1",
api_key="sk-echo-...",
)
response = client.chat.completions.create(
model="claude-opus-4.7",
messages=[{"role": "user", "content": "Hello, world."}],
)Already in your
favorite tool.
Anywhere the OpenAI SDK runs, echotokens runs. Paste your key, change the base URL — works in seconds across editors, agents, desktop apps, and self-hosted UIs.
Cursor
AI-first code editor
Claude Code
Anthropic's CLI agent
OpenCode
Open-source coding agent
Cline
Autonomous VS Code agent
Roo
Multi-mode AI dev assistant
Kilo Code
Open-source code agent
Aider
AI pair programmer in your terminal
Goose
On-machine AI agent
OpenClaw
Multi-model orchestrator
Claude Desktop
Native Anthropic client
OpenWebUI
Self-hosted ChatGPT alternative
JanitorAI
Roleplay & companion bots
Hermes
Personal AI dashboard
Shakespeare
Long-form writing studio
RxResume
AI-powered resumes
Engineered for production.
Priced like a deal.
Streaming. Fallback. Caching. Schemas. Spend limits. Webhooks. The infrastructure you'd build yourself — already built.
Real-time streaming
Server-Sent Events from every chat model. First-token p50 under 400ms.
Smart fallback
Upstream rate-limited? We retry on a sibling model. You set the priority list.
Prompt caching
Anthropic & OpenAI cache discounts honored. Repeated context bills at the discounted rate.
Structured output
JSON-schema mode and function-calling on every model that supports it — uniform error shape.
Spending limits
Cap a key at $5/day or $50/month. Hard ceilings, soft warnings, rolling windows.
Itemized usage
Every request: input + output tokens, USD cost, latency, status, request ID — exportable as CSV.
Webhooks
Long-running video and image jobs ping your endpoint when ready. No polling.
IP allowlisting
Lock keys to specific CIDR ranges. Per-key, per-environment, zero downtime to update.
Bring your own data
Self-host the dashboard on Firebase + your own Firestore. The portal is open-source.
Ship in five minutes.
Sign up, drop in a key, point your SDK. The next hundred dollars of compute is on us — at fifteen.