Rate Limits

The Datagrid API enforces rate limits to protect the platform from abuse and ensure fair usage across all consumers.

Default rate limit

The default rate limit is 200 requests per 60-second sliding window, but individual endpoints may enforce their own limits. Check the X-RateLimit-Limit response header to see the effective limit for any given endpoint. The limit is scoped per teamspace, endpoint path, and HTTP method. For example, POST /v1/converse and GET /v1/agents maintain independent rate limit windows within the same teamspace.

Response headers

Every Datagrid API response includes rate limit headers so you can monitor your usage proactively:

Header	Description	Example
`X-RateLimit-Limit`	Maximum requests allowed in the current window	`200`
`X-RateLimit-Remaining`	Requests remaining in the current window	`195`
`X-RateLimit-Reset`	Unix epoch second when the current window resets	`1741275120`
`Retry-After`	Seconds until the window resets (present when `X-RateLimit-Remaining` is `0`, including on `429` responses)	`30`

429 Too Many Requests

When the rate limit is exceeded, the API returns a 429 status code with the following body:

{
  "status_code": 429,
  "statusCode": 429,
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded",
  "mitigation": "Implement exponential backoff and retry with delays",
  "retryable": true,
  "details": {
    "reason": "Rate limit exceeded. Please retry after the current window resets."
  }
}

statusCode is deprecated and will be removed in a future version. Use status_code instead.

The response also includes the Retry-After header indicating how many seconds to wait before retrying.

Batch prediction rate limits

Batch prediction endpoints also use 429 Too Many Requests, but they return RFC 9457 problem details (application/problem+json) instead of the legacy JSON shape above.

{
  "type": "https://api.datagrid.com/errors/rate_limit_exceeded",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "Too many batch prediction requests are already in progress for this teamspace."
}

Batch prediction 429 responses may also include:

Header	Description
`X-RateLimit-Limit`	The numeric limit for the rule that rejected the request.
`X-RateLimit-Remaining`	Remaining capacity for the matched rule, usually `0` on `429`.
`X-RateLimit-Reset`	Unix epoch second when capacity is expected to reset, when applicable.
`RateLimit-Policy`	The batch admission-control rule that rejected the request.
`RateLimit`	Current limit state for the matched policy.
`Retry-After`	Seconds to wait before retrying.

Batch create requests can be rejected by organization concurrency, enqueued-item, create-rate, or temporary global-capacity policies. Treat these responses as retryable and honor Retry-After before submitting more batch work.

Best practices

SDK automatic retries

The official Python and TypeScript/JavaScript SDKs automatically retry 429 responses up to 2 times with exponential backoff. You can configure this via the maxRetries option:

Python

from datagrid_ai import Datagrid

client = Datagrid(max_retries=5)  # default is 2; set to 0 to disable

TypeScript

import Datagrid from 'datagrid-ai';

const client = new Datagrid({ maxRetries: 5 }); // default is 2; set to 0 to disable

If you are using the SDK, you typically do not need to implement your own retry logic.

Use exponential backoff

If you are calling the API directly (without the SDK), implement retry logic yourself. When you receive a 429, wait for the number of seconds specified in the Retry-After header before retrying. If Retry-After is not available, use exponential backoff starting at 1 second, doubling with each retry up to a maximum of 60 seconds.

Python

import time
import httpx

def request_with_backoff(client, **kwargs):
    max_retries = 5
    delay = 1

    for attempt in range(max_retries):
        response = client.post(**kwargs)
        if response.status_code != 429:
            return response

        retry_after = response.headers.get("Retry-After")
        wait = int(retry_after) if retry_after else delay
        time.sleep(wait)
        delay = min(delay * 2, 60)

    return response

TypeScript

async function requestWithBackoff(
  fn: () => Promise<Response>,
  maxRetries = 5
): Promise<Response> {
  let delay = 1000;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fn();
    if (response.status !== 429) return response;

    const retryAfter = response.headers.get("Retry-After");
    const wait = retryAfter ? parseInt(retryAfter, 10) * 1000 : delay;
    await new Promise((resolve) => setTimeout(resolve, wait));
    delay = Math.min(delay * 2, 60_000);
  }

  return fn();
}

Monitor X-RateLimit-Remaining

Check the X-RateLimit-Remaining header on every response. If it drops below a threshold (e.g., 10% of the limit), slow down your request rate before hitting a 429.

Distribute requests across endpoints

Since rate limits are scoped per endpoint path and HTTP method, you can make concurrent requests to different endpoints without them counting against the same window.

Introduction

Converse

Voice

Identity

Agents

Knowledge

Search

Conversations

Messages

Connections

Connection Providers

Connectors

Files

Tools

Secrets

Webhooks

Memory

Pages

MCP Servers - Beta

Batch Predictions

Organization

Default rate limit

Response headers

429 Too Many Requests

Batch prediction rate limits

Best practices

Introduction

Converse

Voice

Identity

Agents

Knowledge

Search

Conversations

Messages

Connections

Connection Providers

Connectors

Files

Tools

Secrets

Webhooks

Memory

Pages

MCP Servers - Beta

Batch Predictions

Organization

Documentation Index

​Default rate limit

​Response headers

​429 Too Many Requests

​Batch prediction rate limits

​Best practices

Default rate limit

Response headers

429 Too Many Requests

Batch prediction rate limits

Best practices