Rate Limits & Error Handling

The Kaiko API enforces fair-use limits to ensure consistent performance for all developers. This page explains rate limits, error formats, and best practices for handling them.

1. Rate Limits

Default Limits
| Endpoint Type | Requests per Minute (RPM) | Notes |
|---|---|---|
| /v1/chat/completions | 60 RPM | Higher cost per request due to LLM + emotion |
| /v1/emotions/:context_id/... | 120 RPM | Stateful; counts against per-org quota |
| /v1/emotions/analyse | 300 RPM | Designed for high-throughput, lightweight use cases |
Response Headers

Every response includes headers to help you track your usage:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 12
X-RateLimit-Reset: 1738479302
  • X-RateLimit-Limit → Maximum allowed requests per minute
  • X-RateLimit-Remaining → Requests left in the current window
  • X-RateLimit-Reset → UNIX timestamp when your quota resets
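
As an illustration of how a client might use these headers, here is a minimal sketch. The helper names (parseRateLimit, msUntilAllowed) and the getHeader accessor are hypothetical, and the sketch assumes X-RateLimit-Reset is expressed in seconds since the UNIX epoch.

```javascript
// Read the three rate-limit headers through a lookup function.
// `getHeader` is a hypothetical accessor, e.g. (name) => response.headers.get(name) with fetch.
function parseRateLimit(getHeader) {
  return {
    limit: Number(getHeader('X-RateLimit-Limit')),
    remaining: Number(getHeader('X-RateLimit-Remaining')),
    resetAt: Number(getHeader('X-RateLimit-Reset')), // assumed to be in seconds
  };
}

// Milliseconds to wait before sending the next request:
// 0 while quota remains, otherwise the time left until the window resets.
function msUntilAllowed(rateLimit, nowMs = Date.now()) {
  if (rateLimit.remaining > 0) return 0;
  return Math.max(0, rateLimit.resetAt * 1000 - nowMs);
}

// Example using the header values shown above.
const exampleHeaders = new Map([
  ['X-RateLimit-Limit', '60'],
  ['X-RateLimit-Remaining', '12'],
  ['X-RateLimit-Reset', '1738479302'],
]);
const rateLimit = parseRateLimit((name) => exampleHeaders.get(name));
console.log(rateLimit.remaining, msUntilAllowed(rateLimit));
```

A client could check msUntilAllowed before each call and sleep for that long when the quota is exhausted, rather than waiting for a 429.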
Best Practices
  • Plan for retries: use exponential backoff (e.g., 1s, 2s, 4s...)
  • Batch where possible: analyze multiple messages in one call if supported
  • Separate workloads: use context API for chatbots, stateless for analytics

2. Error Response Format

All errors follow a consistent schema:

{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_model",
    "message": "The requested model 'gpt-4.1-micro' does not exist.",
    "request_id": "req-7f65ab21d1"
  }
}
Fields
| Field | Description |
|---|---|
| type | Category of error (e.g., authentication_error, rate_limit_error, server_error) |
| code | Specific error code (invalid_model, context_not_found, etc.) |
| message | Human-readable explanation of the error |
| request_id | Unique ID for tracing/debugging. Include this when contacting support |
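
One way to surface these fields in client code is to wrap the parsed error body in an Error subclass. The sketch below is illustrative only: the class name KaikoApiError and its property names are hypothetical, not part of the API.

```javascript
// Error subclass carrying the fields of the error schema.
// The class and property names are illustrative, not an official SDK type.
class KaikoApiError extends Error {
  constructor(body) {
    // Fall back to an empty object if the body does not follow the schema.
    const err = (body && body.error) || {};
    super(err.message || 'Unknown API error');
    this.name = 'KaikoApiError';
    this.type = err.type;
    this.code = err.code;
    this.requestId = err.request_id;
  }
}

// Example using the error body shown in this section.
const apiError = new KaikoApiError({
  error: {
    type: 'invalid_request_error',
    code: 'invalid_model',
    message: "The requested model 'gpt-4.1-micro' does not exist.",
    request_id: 'req-7f65ab21d1',
  },
});

// Log the request_id so it can be passed on to support.
console.error(`[${apiError.requestId}] ${apiError.type}/${apiError.code}: ${apiError.message}`);
```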

3. Common Errors

| HTTP Code | Type | Example Code | Meaning | How to Fix |
|---|---|---|---|---|
| 400 | invalid_request_error | missing_field | Payload malformed | Check required fields like messages |
| 401 | authentication_error | invalid_api_key | Missing/invalid key | Ensure x-api-key header is set |
| 403 | forbidden_error | key_revoked | Key revoked or expired | Generate new key in console |
| 404 | not_found_error | context_not_found | Context ID doesn't exist or expired | Verify context_id |
| 409 | conflict_error | duplicate_id | Same external_id reused incorrectly | Use unique IDs per message |
| 413 | invalid_request_error | payload_too_large | Request too big | Reduce message size |
| 422 | invalid_request_error | invalid_role | Invalid role in messages | Allowed: user, assistant, system |
| 429 | rate_limit_error | too_many_requests | Quota exceeded | Retry with exponential backoff |
| 500 | server_error | internal | Unexpected error | Retry; if persistent, contact support |
| 503 | server_error | service_unavailable | Temporary outage | Retry with backoff |
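
The "How to Fix" column suggests that only 429, 500, and 503 are worth retrying, while the other codes call for a change to the request. A minimal sketch of that split, using a hypothetical helper isRetryable:

```javascript
// Status codes whose suggested fix in the table above is a retry.
const RETRYABLE_STATUS = new Set([429, 500, 503]);

// Returns true when retrying the same request may succeed.
// All other listed codes require changing the request or the API key first.
function isRetryable(status) {
  return RETRYABLE_STATUS.has(status);
}

for (const status of [400, 401, 429, 503]) {
  console.log(status, isRetryable(status) ? 'retry with backoff' : 'fix the request');
}
```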

4. Retry Guidelines

Idempotency

For POST requests, supply an Idempotency-Key header to safely retry without duplicating work.
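
The important detail is that the key is generated once per logical request and reused on every retry. The sketch below only builds fetch-style request options and sends nothing; the helper name buildPostOptions, the UUID key format, the Content-Type header, and the content field of the message are assumptions made for illustration.

```javascript
const { randomUUID } = require('node:crypto');

// Build fetch-style options for a POST request.
// Pass the same idempotencyKey on every retry of the same logical request.
function buildPostOptions(apiKey, payload, idempotencyKey) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json', // assumed; the API accepts JSON payloads
      'x-api-key': apiKey,
      'Idempotency-Key': idempotencyKey,
    },
    body: JSON.stringify(payload),
  };
}

// Generate the key once, outside any retry loop. A UUID is an assumed format.
const idempotencyKey = randomUUID();
const payload = { messages: [{ role: 'user', content: 'Hello' }] };

const firstAttempt = buildPostOptions('YOUR_API_KEY', payload, idempotencyKey);
const retryAttempt = buildPostOptions('YOUR_API_KEY', payload, idempotencyKey);

// Both attempts carry the same key, so the retry can be recognised as a duplicate.
console.log(firstAttempt.headers['Idempotency-Key'] === retryAttempt.headers['Idempotency-Key']);
```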

Backoff

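Use exponential backoff with jitter. Example in JavaScript:

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// callApi is your own async function; it should throw an error
// carrying the API's error `code` when a request fails.
async function callWithRetry(callApi, maxAttempts = 5) {
  let delay = 1000; // 1s
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await callApi();
    } catch (e) {
      // Retry only rate-limit errors; rethrow anything else,
      // and give up once the last attempt has failed.
      if (e.code !== 'too_many_requests' || attempt === maxAttempts) throw e;
      await sleep(delay + Math.random() * 200); // up to 200 ms of jitter
      delay *= 2; // 2s, 4s, 8s...
    }
  }
}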

5. Debugging Checklist

  • 🔑 Check your key → Is x-api-key set correctly?
  • ✅ Validate payload → Run through JSON lint, ensure messages array is correct
  • 🎯 Confirm endpoint → Chat vs Context vs Non-Context APIs differ
  • ⏱ Watch limits → Inspect X-RateLimit-Remaining
  • 📝 Log request_id → Helps Kaiko support trace your error

6. Support

  • Status Page → status.kaiko.ai
  • Community → Discord / Forum
  • Direct Support → support@kaiko.ai (include request_id and timestamp)