The current rate limits

Token limits count both input and output tokens.

  • tokens per minute: 40 000
  • tokens per day: 1 000 000
  • requests per minute: 60
  • requests per day: 28 800

If you need higher limits, contact us.
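
The per-minute and per-day limits interact: running at the full per-minute rate uses up the daily allowance well before the day ends. The sketch below works through that arithmetic with the figures listed above. The function names are illustrative only, and it assumes the daily and per-minute limits are counted independently, which this page does not state explicitly.

```python
# Limits copied from the list above.
TOKENS_PER_MINUTE = 40_000
TOKENS_PER_DAY = 1_000_000
REQUESTS_PER_MINUTE = 60
REQUESTS_PER_DAY = 28_800
MINUTES_PER_DAY = 24 * 60


def tokens_charged(input_tokens: int, output_tokens: int) -> int:
    """Tokens counted against the token limits: input plus output."""
    return input_tokens + output_tokens


def minutes_until_daily_cap(per_minute: int, per_day: int) -> float:
    """Minutes of sustained full-rate use before the daily limit is reached."""
    return per_day / per_minute


def sustainable_per_minute(per_day: int) -> float:
    """Average per-minute rate that spreads the daily limit over 24 hours."""
    return per_day / MINUTES_PER_DAY


if __name__ == "__main__":
    print(tokens_charged(1_200, 300))                                      # 1500
    print(minutes_until_daily_cap(TOKENS_PER_MINUTE, TOKENS_PER_DAY))      # 25.0
    print(minutes_until_daily_cap(REQUESTS_PER_MINUTE, REQUESTS_PER_DAY))  # 480.0
    print(sustainable_per_minute(REQUESTS_PER_DAY))                        # 20.0
```

Under these assumptions, a client that needs to run all day should pace itself at about 20 requests and roughly 694 tokens per minute on average, rather than at the per-minute maximums.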

Rate limit headers

We set the following headers to inform you of the rate limits that currently apply to you (values are illustrative):

Header                          Value   Notes
retry-after                     2       Seconds to wait until retrying*
x-ratelimit-limit-requests      28800   Requests per day allowed
x-ratelimit-limit-tokens        40000   Tokens per minute allowed
x-ratelimit-remaining-requests  123     Requests remaining for the day
x-ratelimit-remaining-tokens    1337    Tokens remaining for this minute
x-ratelimit-reset-requests      1337s   Seconds until the daily rate limit resets
x-ratelimit-reset-tokens        1s      Seconds until the minute based token limit resets

* The retry-after header is only returned when the request was rate limited and the response status code is 429.
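
A client can read these headers to decide how long to wait before retrying. The following is a minimal sketch, not an official client: the function names are hypothetical, and it assumes header values always look like the illustrative ones above (a plain number of seconds, optionally followed by "s"). Any other format would raise a ValueError.

```python
from typing import Dict, Mapping, Optional


def parse_seconds(value: str) -> float:
    """Parse a header value such as "2" or "1337s" into seconds."""
    return float(value.strip().rstrip("s"))


def rate_limit_status(headers: Mapping[str, str]) -> Dict[str, float]:
    """Collect the x-ratelimit-* headers that are present as numbers."""
    status: Dict[str, float] = {}
    for name, value in headers.items():
        key = name.lower()  # HTTP header names are case-insensitive
        if key.startswith("x-ratelimit-"):
            status[key] = parse_seconds(value)
    return status


def retry_delay(status_code: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if no wait is signalled."""
    if status_code != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    if "retry-after" not in lowered:
        # Per the footnote, retry-after is only sent for rate-limited requests.
        return None
    return parse_seconds(lowered["retry-after"])


if __name__ == "__main__":
    # Example headers taken from the illustrative values in the table.
    example = {
        "retry-after": "2",
        "x-ratelimit-remaining-tokens": "1337",
        "x-ratelimit-reset-tokens": "1s",
    }
    print(retry_delay(429, example))                               # 2.0
    print(rate_limit_status(example)["x-ratelimit-reset-tokens"])  # 1.0
```

In a real client, the value returned by retry_delay would typically be passed to time.sleep before the request is sent again.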