Documentation

Everything you need to integrate ApiFast into your application.

1. Getting Started

ApiFast provides access to AI models through an OpenAI-compatible API. No registration is required. Follow these steps:

Step 1: Buy credit. Visit the buy page and deposit crypto (USDT or BTC). Your deposit is multiplied into API credit; the current rate and the first-deposit bonus are shown on the buy page.

Step 2: Receive your key. After payment confirmation, you receive an API key starting with sk_kf_.

Step 3: Start making requests. Use the API exactly as you would use OpenAI's. Change the base URL and the API key; everything else stays the same.

2. Authentication

Include your API key in the Authorization header as a Bearer token:

Authorization: Bearer sk_kf_your_api_key_here

Never share your API key publicly. If it is compromised, anyone who holds it can spend your credit balance. A lost key cannot be recovered.
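
One way to keep the key out of source code is to read it from an environment variable and build the header from it. The sketch below is illustrative only: the variable name APIFAST_API_KEY and both helper functions are hypothetical names, not part of the ApiFast API. The only facts taken from this page are the Bearer scheme and the sk_kf_ prefix.

```python
import os

API_KEY_PREFIX = "sk_kf_"  # key prefix stated in this documentation


def load_api_key(env_var: str = "APIFAST_API_KEY") -> str:
    """Read the key from the environment (the variable name is an assumption)."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"set {env_var} before running")
    return key


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the headers for an authenticated JSON request."""
    if not api_key.startswith(API_KEY_PREFIX):
        raise ValueError(f"expected a key starting with {API_KEY_PREFIX!r}")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix locally catches a pasted key from another provider before any request is sent.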

3. Making Requests

Base URL

https://apifast.live/v1

cURL

    curl https://apifast.live/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer sk_kf_your_key" \
      -d '{
        "model": "deepseek-v4-pro",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is quantum computing?"}
        ],
        "temperature": 0.7,
        "max_tokens": 1000
      }'

Python (OpenAI SDK)

    from openai import OpenAI

    client = OpenAI(
        api_key="sk_kf_your_key",
        base_url="https://apifast.live/v1",
    )

    response = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is quantum computing?"},
        ],
        temperature=0.7,
        max_tokens=1000,
    )

    print(response.choices[0].message.content)

JavaScript / Node.js

    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: "sk_kf_your_key",
      baseURL: "https://apifast.live/v1",
    });

    const response = await client.chat.completions.create({
      model: "deepseek-v4-pro",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "What is quantum computing?" },
      ],
      temperature: 0.7,
      max_tokens: 1000,
    });

    console.log(response.choices[0].message.content);

Response Format

    {
      "id": "chatcmpl-abc123",
      "object": "chat.completion",
      "model": "deepseek-v4-pro",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Quantum computing is..."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 25,
        "completion_tokens": 150,
        "total_tokens": 175
      }
    }
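
If you call the endpoint without the SDK, the fields you usually need are the reply text, the finish reason, and the token counts. The following sketch parses the sample body shown above with the standard library. The helper name summarize_completion is hypothetical, and the code assumes real responses carry the same fields as the sample.

```python
import json

# Sample body copied from the "Response Format" example above.
SAMPLE_BODY = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "deepseek-v4-pro",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Quantum computing is..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
"""


def summarize_completion(body: str) -> dict:
    """Extract the reply text, finish reason, and token counts from a response body."""
    payload = json.loads(body)
    first_choice = payload["choices"][0]
    usage = payload.get("usage", {})
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }


summary = summarize_completion(SAMPLE_BODY)
print(summary["text"], summary["completion_tokens"])
```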

4. Available Models

One key works with every model; there is no model picker. You choose the model on each request by setting the model field to any Model ID from the table below. You can switch models at any time without a new key.

All prices are per 1 million tokens, updated daily from live market rates.

| Model ID | Display Name | Intelligence | Cache / 1M | Input / 1M | Output / 1M |
|---|---|---|---|---|---|
| kimi-k2.7-code | Kimi K2.7 Code | #1 55.0* | $0.15 | $0.74 | $3.50 |
| minimax-m3 | MiniMax M3 | #2 44.4 | $0.06 | $0.30 | $1.20 |
| deepseek-v4-pro | DeepSeek V4 Pro | #3 44.3 | $0.00 | $0.43 | $0.87 |
| kimi-k2.6 | Kimi K2.6 | #4 42.8 | $0.34 | $0.68 | $3.41 |
| mimo-v2.5-pro | MiMo-V2.5-Pro | #5 42.2 | $0.00 | $0.43 | $0.87 |
| deepseek-v4-flash | DeepSeek V4 Flash | #6 40.3 | $0.02 | $0.09 | $0.18 |
| glm-5.1 | GLM 5.1 | #7 40.2 | $0.18 | $0.98 | $3.08 |
| mimo-v2.5 | MiMo-V2.5 | #8 40.1 | $0.00 | $0.14 | $0.28 |
| glm-5 | GLM 5 | #9 39.5 | $0.12 | $0.60 | $1.92 |
| kimi-k2.5 | Kimi K2.5 | #10 38.1 | | $0.38 | $2.02 |
| minimax-m2.7 | MiniMax M2.7 | #11 38.1 | $0.05 | $0.25 | $1.00 |
| nemotron-3-ultra | Nemotron 3 Ultra | #12 37.8 | $0.10 | $0.50 | $2.20 |
| glm-4.7 | GLM 4.7 | #13 33.8 | $0.08 | $0.40 | $1.75 |
| qwen3.5:397b | Qwen3.5 397B A17B | #15 33.7 | | $0.39 | $2.45 |
| minimax-m2.5 | MiniMax M2.5 | #14 33.7 | $0.05 | $0.15 | $0.90 |
| minimax-m2.1 | MiniMax M2.1 | #16 31.4 | $0.03 | $0.29 | $0.95 |
| gemma4:31b | Gemma 4 31B | #17 29.4 | $0.09 | $0.12 | $0.35 |
| minimax-m2 | MiniMax M2 | #18 28.3 | $0.03 | $0.26 | $1.00 |
| gemini-3-flash-preview | Gemini 3 Flash Preview | #19 27.4 | $0.05 | $0.50 | $3.00 |
| nemotron-3-super | Nemotron 3 Super | #20 25.4 | | $0.09 | $0.45 |
| deepseek-v3.2 | DeepSeek V3.2 | #21 24.7 | | $0.23 | $0.34 |
| gpt-oss:120b | gpt-oss-120b | #22 23.8 | | $0.04 | $0.18 |
| deepseek-v3.1:671b | DeepSeek V3.1 Terminus | #23 21.4 | $0.13 | $0.27 | $0.95 |
| qwen3-coder-next | Qwen3 Coder Next | #24 21.2 | $0.07 | $0.11 | $0.80 |
| rnj-1:8b | Rnj 1 Instruct | #25 21.0* | | $0.15 | $0.15 |
| qwen3-coder:480b | Qwen3 Coder 480B A35B | #26 18.0 | | $0.22 | $1.80 |
| mistral-large-3:675b | Mistral Large | #27 16.2 | $0.20 | $2.00 | $6.00 |
| devstral-2:123b | Devstral 2 2512 | #28 15.5 | $0.04 | $0.40 | $2.00 |
| gpt-oss:20b | gpt-oss-20b | #29 14.9 | | $0.03 | $0.14 |
| devstral-small-2:24b | Devstral Small 2 24b | #30 13.1 | | | |
| ministral-3:14b | Ministral 3 14B 2512 | #31 10.0 | $0.02 | $0.20 | $0.20 |
| ministral-3:8b | Ministral 3 8B 2512 | #32 8.9 | $0.01 | $0.15 | $0.15 |
| nemotron-3-nano:30b | Nemotron 3 Nano 30B A3B | #33 7.4 | | $0.05 | $0.20 |
| ministral-3:3b | Ministral 3 3B 2512 | #34 5.6 | $0.01 | $0.10 | $0.10 |
| gemma3:27b | Gemma 3 27B | #35 4.8 | | $0.08 | $0.16 |
| gemma3:12b | Gemma 3 12B | #36 3.4 | | $0.05 | $0.15 |
| gemma3:4b | Gemma 3 4B | #37 1.1 | | $0.05 | $0.10 |
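
Because prices are quoted per 1 million tokens, the cost of a single request can be estimated from the usage block of its response. The sketch below shows the arithmetic for two models, with prices copied by hand from the table above. It is an approximation under stated assumptions: it ignores the Cache price, since this page does not say how cached tokens are counted, and the function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# (input price, output price) per 1M tokens, copied from the table above.
# Prices change daily, so treat these as a snapshot.
PRICES_PER_1M = {
    "deepseek-v4-pro": (Decimal("0.43"), Decimal("0.87")),
    "deepseek-v4-flash": (Decimal("0.09"), Decimal("0.18")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one request, ignoring any cache discount."""
    input_price, output_price = PRICES_PER_1M[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / TOKENS_PER_UNIT


# Usage figures from the sample response: 25 prompt tokens, 150 completion tokens.
print(estimate_cost("deepseek-v4-pro", 25, 150))  # 0.00014125
```

Decimal is used instead of float so that very small amounts are not distorted by binary rounding.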

List models programmatically

Fetch the live catalog (ids + prices) anytime — handy for building your own model selector:

    # OpenAI SDK
    models = client.models.list()

    # or plain HTTP
    curl https://apifast.live/v1/models -H "Authorization: Bearer sk_kf_your_key"

5. Image, Speech & Audio

Beyond chat, the same API key works with image generation, text-to-speech and transcription — each on its own OpenAI-compatible endpoint. These are billed per unit (per image, per 1M characters, or per minute of audio), not per token.

| Model ID | Type | Price | Endpoint |
|---|---|---|---|
| mimo-asr | asr | $0.006 / minute | /v1/audio/transcriptions |
| mimo-tts | tts | $30 / 1M chars | /v1/audio/speech |
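
Per-unit billing follows the same arithmetic as token pricing, with characters or minutes as the unit. The sketch below works through both prices in the table. It assumes linear proration with no minimum charge or rounding, and it counts characters with Python's len, which may differ from how the service counts them. The function names are hypothetical.

```python
from decimal import Decimal

TTS_PRICE_PER_1M_CHARS = Decimal("30")    # mimo-tts, from the table above
ASR_PRICE_PER_MINUTE = Decimal("0.006")   # mimo-asr, from the table above


def tts_cost(text: str) -> Decimal:
    """Estimated text-to-speech cost for the given input text."""
    return Decimal(len(text)) * TTS_PRICE_PER_1M_CHARS / Decimal(1_000_000)


def asr_cost(audio_seconds: int) -> Decimal:
    """Estimated transcription cost for audio of the given length in seconds."""
    return Decimal(audio_seconds) / Decimal(60) * ASR_PRICE_PER_MINUTE


print(tts_cost("Hello world"))  # 11 characters
print(asr_cost(90))             # 1.5 minutes
```

The actual charge for each call is reported by the service, so treat these figures as estimates only.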

Image generation

    curl https://apifast.live/v1/images/generations \
      -H "Authorization: Bearer sk_kf_your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "minimax-image-01",
        "prompt": "a neon fox",
        "n": 1
      }'
    # -> { "data": [{ "b64_json": "..." }], "x_keyforge": { "cost_usd": ..., "units": 1 } }
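
The image comes back as base64 text in data[0].b64_json, so it has to be decoded before it can be written to disk. The sketch below assumes the response shape shown in the comment above. The helper name decode_first_image is hypothetical, and the image file format is not stated on this page, so the code returns raw bytes and leaves the file extension to you.

```python
import base64


def decode_first_image(response: dict) -> bytes:
    """Decode the first generated image from a parsed JSON response."""
    encoded = response["data"][0]["b64_json"]
    return base64.b64decode(encoded)


def reported_cost(response: dict):
    """Return the cost reported in the x_keyforge block, or None if it is absent."""
    return response.get("x_keyforge", {}).get("cost_usd")


# Stand-in response with placeholder bytes instead of a real image.
fake_response = {
    "data": [{"b64_json": base64.b64encode(b"image-bytes").decode("ascii")}],
    "x_keyforge": {"cost_usd": 0.01, "units": 1},
}
image_bytes = decode_first_image(fake_response)
print(len(image_bytes), reported_cost(fake_response))
```

To save the result, write image_bytes to a file opened in binary mode.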

Text-to-speech

    curl https://apifast.live/v1/audio/speech \
      -H "Authorization: Bearer sk_kf_your_key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "minimax-speech-2.6-hd",
        "input": "Hello world",
        "voice": "default"
      }' \
      --output speech.mp3
    # audio bytes returned; cost + balance in x-keyforge-* response headers

Transcription (speech-to-text)

    curl https://apifast.live/v1/audio/transcriptions \
      -H "Authorization: Bearer sk_kf_your_key" \
      -F model="mimo-asr" \
      -F file=@audio.mp3
    # -> { "text": "...", "x_keyforge": { "cost_usd": ..., "units": <seconds> } }

6. Tiers & Rate Limits

Your throughput scales with your tier, which is set by your largest deposit (it never goes down). Limits apply per API key across all endpoints. Exceeding them returns a 429 — retry with exponential backoff.

| Tier | Unlocked at | Requests / min | Tokens / min |
|---|---|---|---|
| Tier 1 · Starter | ≥ $1 | 20 | 50,000 |
| Tier 2 · Pro | ≥ $20 | 60 | 200,000 |
| Tier 3 · Scale | ≥ $50 | 200 | 1,000,000 |
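
The tier rule can be read as a lookup on the largest single deposit, which is why the tier never drops. The sketch below encodes the table that way and derives an even request spacing from the per-minute cap. It assumes thresholds of $1, $20, and $50 with limits of 20, 60, and 200 requests per minute, and that deposits are compared individually, not summed. The function names are hypothetical.

```python
# (minimum largest deposit in USD, tier name, requests/min, tokens/min)
TIERS = [
    (50, "Tier 3 · Scale", 200, 1_000_000),
    (20, "Tier 2 · Pro", 60, 200_000),
    (1, "Tier 1 · Starter", 20, 50_000),
]


def tier_for(deposits_usd):
    """Return (name, requests_per_min, tokens_per_min) for the largest deposit, or None."""
    largest = max(deposits_usd, default=0)
    for minimum, name, requests_per_min, tokens_per_min in TIERS:
        if largest >= minimum:
            return name, requests_per_min, tokens_per_min
    return None


def min_seconds_between_requests(requests_per_min):
    """Even spacing that keeps a steady stream of calls under the per-minute cap."""
    return 60 / requests_per_min


print(tier_for([5, 25, 10]))   # largest deposit is 25
print(tier_for([15, 15]))      # two deposits of 15 are not added together
```

Under these assumptions, a Tier 2 key sending one request per second sits exactly at its requests-per-minute limit.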

Concurrency — each key runs ONE request at a time (1 agent per key): concurrent calls on the same key queue and are served one-by-one, never in parallel. To run multiple agents at once, use a separate key per agent. A global per-IP ceiling of 240 requests/min also applies before authentication.
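
The retry advice for 429 responses can be sketched as a small wrapper that doubles its wait after each rate-limited attempt. This is one possible approach, not ApiFast's prescribed client. RateLimitedError is a hypothetical stand-in for whatever your HTTP client raises on a 429, and the base delay, cap, and jitter range are arbitrary choices.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * 2 ** attempt)


def call_with_backoff(send, max_retries: int = 5, sleep=time.sleep, jitter=random.random):
    """Call send() and retry on RateLimitedError with exponential backoff.

    `sleep` and `jitter` are injectable so the schedule can be tested
    without real waiting. The jitter factor scales each delay into
    the range [0.5 * delay, delay].
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitedError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            sleep(backoff_delay(attempt) * (0.5 + 0.5 * jitter()))
```

Because each key serves one request at a time, retrying from several threads on the same key does not add throughput; it only lengthens the queue.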

7. Error Codes

| Status | Reason | Description |
|---|---|---|
| 400 | Bad Request | Invalid request body or missing required fields. |
| 401 | Unauthorized | Missing or invalid API key. |
| 402 | Insufficient Balance | Your credit balance is too low for this request. |
| 404 | Not Found | Model not found or endpoint does not exist. |
| 429 | Rate Limited | Too many requests. Slow down and retry. |
| 500 | Internal Error | Server error. Please retry or contact support. |

Error Response Format

    {
      "error": {
        "message": "Insufficient balance. Please recharge your key.",
        "type": "insufficient_quota",
        "code": "insufficient_quota"
      }
    }
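
A client can combine the status code with the error body to decide whether a retry is worthwhile. The sketch below treats 429 and 500 as retryable, following the descriptions in the table above, and everything else as a problem to fix before resending. The helper name describe_error is hypothetical, and pairing the sample body with status 402 is an inference from its message, not something this page states.

```python
import json

# Statuses whose descriptions above suggest retrying.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status: int, body: str) -> dict:
    """Summarize an error response; tolerate bodies that are not the documented JSON."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None
    error = parsed.get("error", {}) if isinstance(parsed, dict) else {}
    return {
        "status": status,
        "code": error.get("code"),
        "message": error.get("message", ""),
        "retryable": status in RETRYABLE_STATUSES,
    }


sample = (
    '{"error": {"message": "Insufficient balance. Please recharge your key.", '
    '"type": "insufficient_quota", "code": "insufficient_quota"}}'
)
print(describe_error(402, sample))
```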
