Rate Limits

Kaiko enforces rate limits to ensure fair usage and platform stability. This page explains the V2 limits, how to monitor usage, and best practices.

V2 Rate Limits

Limit	Value	Notes
Requests per minute	1,000 RPM	Per API key
Tokens per month	100,000	Upgradeable (contact sales)
Concurrent requests	10	Per API key
Request timeout	30 seconds	For non-streaming requests
Batch size (batch-analysis)	100 messages	Per batch request
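
The concurrent-request limit in the table can be respected on the client side by capping the number of in-flight calls. The following is a minimal sketch, not part of any Kaiko SDK: createLimiter is a hypothetical helper name, and the default of 10 is taken from the table above.

```javascript
// Hypothetical helper (not a Kaiko API): runs at most `maxConcurrent`
// tasks at once and queues the rest in FIFO order.
function createLimiter(maxConcurrent = 10) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      // Free the slot and start the next queued task, if any.
      .finally(() => {
        active--;
        next();
      });
  };

  // `task` is any function that returns a promise (for example an API call).
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Each API call would then be wrapped as limit(() => yourRequest()), where limit is the function returned by createLimiter(10) and yourRequest stands in for whatever client call you use.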

Rate Limit Headers

API responses include headers to help you track your usage. Retry-After is sent only with 429 responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1735862400
Retry-After: 60

Header	Description
X-RateLimit-Limit	Your rate limit per minute
X-RateLimit-Remaining	Requests remaining in current window
X-RateLimit-Reset	Unix timestamp when limit resets
Retry-After	Seconds to wait (only on 429)
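
As an illustration of how these headers might be consumed, the sketch below parses them from a fetch-style headers object (anything with a get(name) method). The function name readRateLimit and the returned field names are assumptions made for this example. Retry-After is assumed to use the seconds form described in the table.

```javascript
// Hypothetical helper: extracts rate-limit information from response headers.
// `headers` is assumed to expose get(name), like the fetch Headers class.
function readRateLimit(headers, nowMs = Date.now()) {
  const num = (name) => {
    const raw = headers.get(name);
    return raw == null ? null : Number(raw); // null when the header is absent
  };

  const limit = num('X-RateLimit-Limit');
  const remaining = num('X-RateLimit-Remaining');
  const resetAt = num('X-RateLimit-Reset'); // Unix timestamp in seconds
  const retryAfter = num('Retry-After');    // seconds; present only on 429

  // Milliseconds until the current window resets (never negative).
  const msUntilReset =
    resetAt === null ? null : Math.max(0, resetAt * 1000 - nowMs);

  return { limit, remaining, resetAt, retryAfter, msUntilReset };
}
```

A client could, for example, pause new requests for msUntilReset milliseconds once remaining reaches 0, rather than waiting for a 429.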

Handling 429 Errors

When you exceed the rate limit, the API returns a 429 status code. Implement exponential backoff:

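// Calls fn, retrying on 429 and 5xx errors with exponential backoff.
// maxRetries is the total number of attempts, including the first.
async function callWithRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      const retryable = error.status === 429 || error.status >= 500;
      // Rethrow anything that a retry cannot fix (e.g. 400, 401).
      if (!retryable) throw error;
      // Do not sleep after the final attempt.
      if (i === maxRetries - 1) break;
      const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s, ...
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Max retries exceeded');
}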

Start with a 1-second delay, doubling each retry.
Respect the Retry-After header if present.
Add jitter (a random extra delay) so that many clients do not retry at the same instant (the thundering herd problem).
Monitor how often you receive 429 responses to determine whether you need higher limits.
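
The first three recommendations can be combined into a single delay calculation. The sketch below is one possible approach under stated assumptions: backoffDelayMs is a hypothetical name, the jitter range of up to one second is an arbitrary choice, and how you obtain the Retry-After value depends on how your HTTP client exposes response headers.

```javascript
// Hypothetical helper: computes how long to wait before retry number
// `attempt` (0-based). `retryAfterSeconds` is the parsed Retry-After
// header, or null when the header is absent.
function backoffDelayMs(attempt, retryAfterSeconds = null, random = Math.random) {
  const baseMs = 1000; // start with a 1-second delay

  // Prefer the wait time given by the server when it is usable.
  if (
    retryAfterSeconds !== null &&
    Number.isFinite(retryAfterSeconds) &&
    retryAfterSeconds >= 0
  ) {
    return retryAfterSeconds * 1000;
  }

  // Otherwise double the delay on each retry: 1s, 2s, 4s, ...
  const exponential = baseMs * 2 ** attempt;
  // Add up to 1 second of random jitter (assumed range) to spread out clients.
  const jitter = random() * baseMs;
  return exponential + jitter;
}
```

The random parameter exists only so the jitter can be made deterministic in tests. In normal use it is left at its default.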

Token Usage

V2 API responses include token usage in the response body:

Field	Description
usage.prompt_tokens	Tokens in the prompt (Chat API)
usage.completion_tokens	Tokens in the LLM response (Chat API)
usage.analyse_tokens	Tokens used for emotion analysis
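
To show how the usage object might be tracked against the monthly token limit, here is a minimal sketch. It assumes that all three fields count toward the 100,000-token quota, which this page does not state explicitly. createUsageTracker and its return shape are hypothetical names for this example.

```javascript
// Monthly quota from the limits table. Adjust if your plan differs.
const MONTHLY_TOKEN_QUOTA = 100000;

// Hypothetical helper: accumulates token usage across responses.
// ASSUMPTION: prompt, completion, and analyse tokens all count toward the quota.
function createUsageTracker(quota = MONTHLY_TOKEN_QUOTA) {
  let used = 0;
  return {
    // Pass the `usage` object from a response body. Missing fields count as 0.
    record(usage = {}) {
      used +=
        (usage.prompt_tokens ?? 0) +
        (usage.completion_tokens ?? 0) +
        (usage.analyse_tokens ?? 0);
      return this.snapshot();
    },
    snapshot() {
      return {
        used,
        remaining: Math.max(0, quota - used),
        fraction: used / quota,
      };
    },
  };
}
```

A locally tracked total is only an estimate and resets with the process, so the dashboard remains the authoritative source.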

Best Practices

Batch when possible: Use /v2/emotions/batch-analysis for multiple texts instead of individual calls.
Cache results: For repeated queries, cache emotion analysis results.
Use the stateless API for one-off requests: Context-based APIs have slightly higher overhead.
Monitor usage: Track token consumption via the usage object and the dashboard.
Request a limit increase: Contact sales@kaikostudios.xyz for enterprise quotas.
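
For the batching recommendation, inputs longer than the 100-message batch limit need to be split before sending. The sketch below covers only that splitting step. chunkForBatch is a hypothetical helper, and the request body expected by /v2/emotions/batch-analysis is not shown because it is not described on this page.

```javascript
// Batch size limit for batch-analysis, from the limits table.
const MAX_BATCH_SIZE = 100;

// Hypothetical helper: splits an array of texts into batches of at most `size`.
function chunkForBatch(texts, size = MAX_BATCH_SIZE) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < texts.length; i += size) {
    batches.push(texts.slice(i, i + size));
  }
  return batches;
}
```

With 250 texts this yields three batches of 100, 100, and 50 items, so three requests replace 250 individual calls.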

Enterprise Limits

For higher limits, contact our sales team:

Custom RPM and token quotas
Dedicated rate limit pools
Priority queue access
SLA guarantees

Email: sales@kaikostudios.xyz

Next: see Error Handling for all error codes and troubleshooting, or Authentication for security best practices.