The Kaiko API enforces fair-use limits to ensure consistent performance for all developers. This page explains rate limits, error formats, and best practices for handling them.
| Endpoint Type | Requests per Minute (RPM) | Notes |
|---|---|---|
| /v1/chat/completions | 60 RPM | Higher cost per request due to LLM + emotion |
| /v1/emotions/:context_id/... | 120 RPM | Stateful; counts against per-org quota |
| /v1/emotions/analyse | 300 RPM | Designed for high-throughput, lightweight use cases |
Every response includes headers to help you track your usage:
```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 12
X-RateLimit-Reset: 1738479302
```

- X-RateLimit-Limit → Maximum allowed requests per minute
- X-RateLimit-Remaining → Requests left in the current window
- X-RateLimit-Reset → UNIX timestamp when your quota resets

All errors follow a consistent schema:
```json
{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_model",
    "message": "The requested model 'gpt-4.1-micro' does not exist.",
    "request_id": "req-7f65ab21d1"
  }
}
```

| Field | Description |
|---|---|
| type | Category of error (e.g., authentication_error, rate_limit_error, server_error) |
| code | Specific error code (invalid_model, context_not_found, etc.) |
| message | Human-readable explanation of the error |
| request_id | Unique ID for tracing/debugging. Include this when contacting support |
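
As a rough illustration of consuming this schema, the sketch below wraps a parsed error body in a JavaScript `Error` that carries the four documented fields. The names `KaikoApiError` and `toApiError` are hypothetical and chosen only for this example; they are not part of any official SDK. The HTTP status 400 used in the sample call and the `unknown_error` fallback are also assumptions.

```javascript
// Hypothetical error class for this example; not part of an official SDK.
class KaikoApiError extends Error {
  constructor(status, { type, code, message, request_id }) {
    super(message);
    this.name = 'KaikoApiError';
    this.status = status;        // HTTP status code of the response
    this.type = type;            // error category, e.g. "rate_limit_error"
    this.code = code;            // specific code, e.g. "too_many_requests"
    this.requestId = request_id; // include this when contacting support
  }
}

// Convert a parsed JSON body into a KaikoApiError.
// Falls back to a generic error if the body does not match the documented schema.
function toApiError(status, body) {
  if (body && typeof body.error === 'object' && body.error !== null) {
    return new KaikoApiError(status, body.error);
  }
  return new KaikoApiError(status, {
    type: 'unknown_error', // assumption: placeholder used only by this sketch
    code: 'unknown',
    message: `Unexpected error response (HTTP ${status})`,
    request_id: undefined,
  });
}

// Sample body copied from the schema example above.
const err = toApiError(400, {
  error: {
    type: 'invalid_request_error',
    code: 'invalid_model',
    message: "The requested model 'gpt-4.1-micro' does not exist.",
    request_id: 'req-7f65ab21d1',
  },
});
console.log(`${err.type}/${err.code} [${err.requestId}]: ${err.message}`);
```

Keeping `requestId` on the error object means it is available in logs when a support ticket needs it.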
| HTTP Code | Type | Example Code | Meaning | How to Fix |
|---|---|---|---|---|
| 400 | invalid_request_error | missing_field | Payload malformed | Check required fields like messages |
| 401 | authentication_error | invalid_api_key | Missing/invalid key | Ensure x-api-key header is set |
| 403 | forbidden_error | key_revoked | Key revoked or expired | Generate new key in console |
| 404 | not_found_error | context_not_found | Context ID doesn't exist or expired | Verify context_id |
| 409 | conflict_error | duplicate_id | Same external_id reused incorrectly | Use unique IDs per message |
| 413 | invalid_request_error | payload_too_large | Request too big | Reduce message size |
| 422 | invalid_request_error | invalid_role | Invalid role in messages | Allowed: user, assistant, system |
| 429 | rate_limit_error | too_many_requests | Quota exceeded | Retry with exponential backoff |
| 500 | server_error | internal | Unexpected error | Retry; if persistent, contact support |
| 503 | server_error | service_unavailable | Temporary outage | Retry with backoff |
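
One way to act on this table in client code is to separate errors worth retrying from errors that need a fix on the caller's side. The sketch below is an interpretation of the "How to Fix" column, not an official rule set, and the name `isRetryable` is hypothetical.

```javascript
// Status codes whose "How to Fix" entry in the table above says to retry.
const RETRYABLE_STATUSES = new Set([429, 500, 503]);

// Returns true when retrying the same request may succeed without changes.
function isRetryable(status) {
  return RETRYABLE_STATUSES.has(status);
}

// 400 and 401 need a corrected request or key, so only 429 and 503 remain.
console.log([400, 401, 429, 503].filter(isRetryable)); // [ 429, 503 ]
```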
For POST requests, supply an Idempotency-Key header to safely retry without duplicating work.
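
A minimal sketch of how this might look in practice: the key is generated once per logical request and reused on every retry. The helper `buildPostOptions`, the use of a UUID as the key value, the `Content-Type` header, and the `content` field in the sample payload are all assumptions made for illustration; only the `Idempotency-Key` and `x-api-key` header names come from this page.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical helper: builds the options object for a POST request.
// Pass the same idempotencyKey on every retry of the same logical request.
function buildPostOptions(apiKey, payload, idempotencyKey) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json', // assumption: JSON request bodies
      'x-api-key': apiKey,
      'Idempotency-Key': idempotencyKey,
    },
    body: JSON.stringify(payload),
  };
}

// Generate the key once, before the first attempt.
const idempotencyKey = randomUUID(); // assumption: any unique string is accepted
const payload = { messages: [{ role: 'user', content: 'Hello' }] };

// Reuse the key for the retry so both attempts count as the same request.
const firstAttempt = buildPostOptions('YOUR_API_KEY', payload, idempotencyKey);
const retryAttempt = buildPostOptions('YOUR_API_KEY', payload, idempotencyKey);
console.log(
  firstAttempt.headers['Idempotency-Key'] === retryAttempt.headers['Idempotency-Key']
);
```

Generating a fresh key inside the retry loop would defeat the purpose, since each attempt would then look like a new request.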
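Use exponential backoff with jitter. Example in JavaScript, assuming callApi is an async function that throws an error whose code field holds the API error code:

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry(callApi, maxAttempts = 5) {
  let delay = 1000; // 1s
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await callApi();
    } catch (e) {
      // Retry only on rate-limit errors, and give up after the last attempt.
      if (e.code !== 'too_many_requests' || attempt === maxAttempts) {
        throw e;
      }
      await sleep(delay + Math.random() * 200); // up to 200 ms of jitter
      delay *= 2; // 2s, 4s, 8s...
    }
  }
}
```

If a request keeps failing, check the following:

- Is x-api-key set correctly?
- Check that the messages array is correct
- Check X-RateLimit-Remaining
- request_id → Helps Kaiko support trace your error
- Contact support (include request_id and timestamp)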
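
For the X-RateLimit-Remaining check, the sketch below shows one way to read the rate-limit headers and work out how long to pause before the next request. The function name `msUntilQuotaAvailable` is hypothetical, and the code assumes X-RateLimit-Reset is a UNIX timestamp in seconds, which matches the sample value shown earlier on this page.

```javascript
// Hypothetical helper: returns how many milliseconds to wait before sending
// the next request (0 if quota remains or the headers are unavailable).
// `headers` is anything with a get(name) method, such as a fetch Headers object.
function msUntilQuotaAvailable(headers, nowMs = Date.now()) {
  const rawRemaining = headers.get('X-RateLimit-Remaining');
  const rawReset = headers.get('X-RateLimit-Reset');
  if (rawRemaining == null || rawReset == null) return 0; // headers absent

  const remaining = Number(rawRemaining);
  const resetSeconds = Number(rawReset);
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) return 0;

  if (remaining > 0) return 0; // quota left in the current window
  // Assumption: the reset value is in seconds, so convert to milliseconds.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}

// A Map stands in for real response headers in this example.
const headers = new Map([
  ['X-RateLimit-Limit', '60'],
  ['X-RateLimit-Remaining', '0'],
  ['X-RateLimit-Reset', '1738479302'],
]);

// Pretend "now" is 2 seconds before the reset time.
console.log(msUntilQuotaAvailable(headers, 1738479300 * 1000)); // 2000
```

Unlike a `Map`, a fetch `Headers` object matches header names case-insensitively, so the same function works with either as long as the `Map` keys use the casing shown here.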