Making the most of GitHub rate limits
Jul 26, 2022 · 4 minute read
The GitHub documentation has a lot of good advice about rate limits for its API, and how to make the most of them. However, since I started using the GitHub API, I’ve discovered some things that the documentation doesn’t cover, or doesn’t cover so well.
This topic is actually covered very well in the GitHub documentation. To summarise, all REST API requests will return ETag headers, and most will return Last-Modified. You can make use of these by making subsequent requests with the If-None-Match and If-Modified-Since headers respectively. If the resource hasn’t been modified, you’ll get back a 304 Not Modified response, and the request won’t count against your rate limit.
To show you what I mean, here’s a short example:
The first request consumes one request from my rate limit, taking it from 4994 to 4993. But the next two requests use If-None-Match and If-Modified-Since headers, so my rate limit is still 4993.
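As a rough sketch of the pattern (not the original example from this post), here is how a conditional request might look in Python using only the standard library. The function names and the token parameter are my own placeholders; the header names are the standard HTTP ones.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from validators saved from an earlier response."""
    headers = {"Accept": "application/vnd.github+json"}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def conditional_get(url, token, etag=None, last_modified=None):
    """Return (status, response headers, body); body is None on a 304."""
    headers = conditional_headers(etag, last_modified)
    headers["Authorization"] = f"Bearer {token}"  # token is a placeholder
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.headers, response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:  # urllib raises HTTPError for 304 Not Modified
            return 304, error.headers, None
        raise
```

Comparing the x-ratelimit-remaining response header between a first call and a follow-up call that passes the saved validators should show whether the second call was free.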
Unfortunately, conditional requests are only available for the REST API. HTTP caching over GraphQL is not a simple problem, and it’s unlikely that GitHub will ever support it.
The GitHub REST API documentation covers conditional requests pretty well. The reason I’m mentioning it? Well, the documentation says that you can use If-None-Match and If-Modified-Since interchangeably, but they’re not equivalent. Take a look at this example:
And if I make the same request a little bit later…
The ETag is different but the Last-Modified time is still the same as before. Based on this Stack Overflow question, it appears that this has been an issue for a while. So if a response has both an ETag and a Last-Modified time, I’d recommend using the Last-Modified time to make conditional requests.
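One way to apply that recommendation is a small helper that picks a single validator from the headers of a cached response, preferring Last-Modified and only falling back to the ETag. This is a minimal sketch; the function name is hypothetical and it assumes the cached headers are available as a plain dictionary.

```python
def validators_for_next_request(cached_headers):
    """Pick one validator from cached response headers, preferring Last-Modified."""
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in cached_headers.items()}
    if "last-modified" in lowered:
        return {"If-Modified-Since": lowered["last-modified"]}
    if "etag" in lowered:
        return {"If-None-Match": lowered["etag"]}
    return {}  # nothing to validate against, so make a normal request
```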
Both REST and GraphQL
Saying “rate limit” isn’t really accurate. What I actually mean is “rate limits”. GitHub has nine different rate limits. Some are for very specific use cases, like integration_manifest for the GitHub App Manifest code conversion endpoint. But the two that are most useful are core (AKA REST) and graphql.
If I make a request to the rate limit endpoint, you can see all the different rate limits.
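For illustration, here is a minimal sketch of reading that endpoint and pulling out the remaining budget for each limit. The function names and the token parameter are my own, and it assumes the response keeps its documented shape, with a top-level resources object keyed by limit name.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def fetch_rate_limits(token):
    """Fetch the raw rate limit payload (token is a placeholder)."""
    request = urllib.request.Request(
        RATE_LIMIT_URL,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def remaining_by_resource(payload):
    """Map each rate limit name (core, graphql, ...) to its remaining budget."""
    return {name: info["remaining"] for name, info in payload["resources"].items()}
```

Calling remaining_by_resource(fetch_rate_limits(token)) gives a dictionary you can check before deciding which API to use.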
Depending on what API calls you want to make, you can intelligently split them across the REST and GraphQL APIs to achieve a higher overall limit. For example, if a GraphQL call is going to cost a lower number of points than the number of REST calls required to get the same data, you should make those calls via the GraphQL API. You should also bear in mind that you can make conditional requests to the REST API, but not to the GraphQL API.
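That decision can be written down as a tiny helper. This is only a sketch of the rule of thumb above: the function name and parameters are hypothetical, and you would need to estimate the GraphQL point cost and the number of REST calls yourself.

```python
def pick_api(rest_calls, graphql_points, rest_remaining, graphql_remaining):
    """Return "graphql", "rest", or None if neither budget can cover the work."""
    rest_fits = rest_calls <= rest_remaining
    graphql_fits = graphql_points <= graphql_remaining
    # Prefer GraphQL when it is cheaper, or when REST has run out of budget.
    if graphql_fits and (graphql_points < rest_calls or not rest_fits):
        return "graphql"
    if rest_fits:
        return "rest"
    return None
```

It deliberately ignores conditional requests, so a REST call that is likely to return a 304 may still be the better choice even when this helper says otherwise.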
Maximise page size
Whenever you’re making a request to an endpoint with pagination, you should check what the maximum results per page are and set your query parameter to that size.
The default size for most endpoints is 30 results, but the maximum size is often 100. If you forget to set this, you might need to make up to four times as many requests to get the same number of results.
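To make the arithmetic concrete, here is a small sketch. The helper names are my own, and it assumes the endpoint uses the per_page and page query parameters and allows a maximum of 100, which you should check per endpoint.

```python
import math
from urllib.parse import urlencode

DEFAULT_PER_PAGE = 30
MAX_PER_PAGE = 100  # assumed maximum; check the endpoint's documentation


def requests_needed(total_results, per_page):
    """Number of paginated requests needed to fetch total_results items."""
    return math.ceil(total_results / per_page)


def page_url(base_url, page, per_page=MAX_PER_PAGE):
    """Build a paginated URL that always asks for the largest page size."""
    return f"{base_url}?{urlencode({'per_page': per_page, 'page': page})}"
```

Fetching 100 results takes requests_needed(100, DEFAULT_PER_PAGE) = 4 requests at the default size, but only 1 at the maximum.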
Most API calls allow you to sort results by a date field when querying an endpoint. If you use this, and do some caching on your end as well, you can avoid having to fetch every page of a request whenever you have a cache miss.
For example, if you need to fetch the most recently changed pull requests for a repository, you should be sorting by updated and storing a local cache of pull requests. That way a conditional request cache miss won’t require you to fetch all the pages of a request. You can compare each page to your local cache, and only fetch the next page if required.
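A minimal sketch of that comparison, assuming the pages arrive newest-first (for pull requests, something like sort=updated with direction=desc) and that each item carries id and updated_at fields. The function name and the shape of the cache are hypothetical.

```python
def fetch_changed(pages, cache):
    """Update cache from pages sorted by most recently updated, stopping early.

    pages: a lazy iterable of pages, each a list of dicts with "id" and
           "updated_at"; each step of the iteration stands for one API request.
    cache: dict mapping id -> item, kept from a previous run.
    Returns the number of pages that were actually read.
    """
    pages_read = 0
    for page in pages:
        pages_read += 1
        seen_unchanged = False
        for item in page:
            cached = cache.get(item["id"])
            if cached is not None and cached["updated_at"] == item["updated_at"]:
                # Everything after this item is older, so it is already cached.
                seen_unchanged = True
            else:
                cache[item["id"]] = item
        if seen_unchanged or not page:
            break  # no need to request the next page
    return pages_read
```

Passing a generator for pages means the next request is only made if the loop asks for it.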
This tip isn’t strictly about rate limits, but is useful when you’re eking out every last bit of performance. Nearly all GitHub REST API endpoints support HEAD requests, in addition to the other HTTP verbs. If you’re already using conditional requests, you can avoid having the body of a response sent over the wire by sending a HEAD request instead.
For example, here’s the header and body size for a GET request:
And here’s the header and body size for the HEAD request:
By making a HEAD request instead of a GET request, you can avoid being sent 137KB.
There is a trade-off, though. If you use conditional requests and have a cache miss, you’ll have to make the GET request anyway.
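Here is a rough sketch of that HEAD-then-GET flow using the standard library. The function names are my own, and authentication is left out for brevity, so treat it as an outline rather than a drop-in client.

```python
import urllib.error
import urllib.request


def build_request(url, method, last_modified=None):
    """Build a GET or HEAD request, optionally conditional on Last-Modified."""
    headers = {"Accept": "application/vnd.github+json"}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers, method=method)


def fetch_if_changed(url, last_modified=None):
    """Send a HEAD first; only GET the body when the resource has changed."""
    try:
        with urllib.request.urlopen(build_request(url, "HEAD", last_modified)):
            pass  # 200: the resource changed, so fall through to the GET
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None  # unchanged, and no body was transferred
        raise
    # Cache miss: this second request is the trade-off described above.
    with urllib.request.urlopen(build_request(url, "GET")) as response:
        return response.read()
```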
Using these methods, I’ve managed to eke out every bit of performance from the GitHub API for my integrations. Let me know what methods you use, or if there’s anything I’ve missed.