CloudFront
Edge cache with some compute, and some moderately bizarre limitations
Warning: This section is under construction
Caching GitHub API Responses
On this wiki itself, I leverage remote content from https://github.com/thiskevinwang/public-docs.
To construct the lefthand navigation sidebar, I leverage the structure of that repository's filesystem. But because it is a separate repository, I call the GitHub API to get that data.
`GET /repos/:owner/:repo/git/trees/:tree_sha?recursive=0`
gives me back a JSON blob like the following, and while not depicted, it contains everything under the `wiki` directory, which is what I ultimately want. Take note of the `sha` field. I use that later in blob requests.
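A minimal sketch of that tree call, assuming a `GITHUB_TOKEN` is available; the `filterWikiBlobs` helper is my own illustration, not something the API provides:

```javascript
// Keep only file entries under wiki/ — their `sha` values feed the
// later git/blobs requests.
function filterWikiBlobs(tree) {
  return tree.filter((t) => t.type === "blob" && t.path.startsWith("wiki/"));
}

// Fetch the recursive tree for a given tree sha (Node 18+ global fetch).
// Each entry in `tree` looks roughly like:
//   { path: "wiki/aws/cloudfront.mdx", type: "blob", sha: "3c5a..." }
async function fetchWikiTree(treeSha, token) {
  const res = await fetch(
    `https://api.github.com/repos/thiskevinwang/public-docs/git/trees/${treeSha}?recursive=0`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const { tree } = await res.json();
  return filterWikiBlobs(tree);
}
```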
This is 1 API call. Unfortunately it doesn't provide pretty text like titles or descriptions. It only returns paths, which I use as website slugs. The various MDX files have frontmatter with `name` and `description` keys, which are pleasing to the human eye.
I can go fire off N+1 or 1+N requests to `GET /repos/:owner/:repo/git/blobs/:file_sha`, passing in the `sha` from earlier to fetch individual files' string contents.

Note: I need to set the `{"Accept": "application/vnd.github.raw"}` header to fetch the raw string contents.

This way I have access to MDX frontmatter and can attach pretty titles to my previously ugly tree JSON data.
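A sketch of one such blob request, again assuming a token; the frontmatter parser here is a deliberately naive stand-in for a real library like gray-matter:

```javascript
// Naive frontmatter parser: pulls `key: value` pairs out of the
// leading --- block of an MDX file. Illustration only.
function parseFrontmatter(raw) {
  const match = raw.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return {};
  const data = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > -1) data[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return data;
}

async function fetchBlobFrontmatter(fileSha, token) {
  const res = await fetch(
    `https://api.github.com/repos/thiskevinwang/public-docs/git/blobs/${fileSha}`,
    {
      headers: {
        // Without this Accept header, GitHub returns base64-encoded
        // JSON instead of the raw file contents.
        Accept: "application/vnd.github.raw",
        Authorization: `Bearer ${token}`,
      },
    }
  );
  return parseFrontmatter(await res.text());
}
```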
However, with 50-100's of files, this quickly burns through the 5000-requests-per-hour rate limit that GitHub sets for authenticated calls.
So for this class of requests, which are conveniently unique via `:file_sha`, I can set up a CloudFront proxy to cache each of these items (which I know will be quite stable) for 1 year.
This significantly cuts down on my GitHub API consumption at page build time, whether that is at static build time or at on-demand generation time. The math is roughly `M * (1 + N)`, where:

- `M`: number of actual pages under `/wiki/[[...slug]]`
- `1`: the single call to `GET /repos/:owner/:repo/git/trees/:tree_sha?recursive=0`
- `N`: the multiple calls to `GET /repos/:owner/:repo/git/blobs/:file_sha`
  - (All of these `N` requests now go to CloudFront)
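Routing those `N` requests through the proxy is just a base-URL swap — a sketch, where `d1234example.cloudfront.net` is a placeholder for the real distribution domain whose origin is `api.github.com`:

```javascript
const GITHUB_API = "https://api.github.com";
// Hypothetical CloudFront distribution fronting api.github.com.
const CACHED_API = "https://d1234example.cloudfront.net";

// Blob requests are safe to cache (content-addressed by sha),
// so they default to the CloudFront domain.
function blobUrl(fileSha, { cached = true } = {}) {
  const base = cached ? CACHED_API : GITHUB_API;
  return `${base}/repos/thiskevinwang/public-docs/git/blobs/${fileSha}`;
}
```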
CloudFront Settings
For the CloudFront proxy, the notable settings I set are:
- Use legacy cache settings
- Include `Accept` and `Authorization` headers as part of the cache key
- Set custom cache TTLs (seconds)
  - Minimum: `31536000`
  - Maximum: `31536000`
  - Default: `31536000`
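In the CloudFront API, those legacy settings correspond roughly to this fragment of a distribution's `DefaultCacheBehavior` (the `ForwardedValues` block is what the console calls legacy cache settings):

```json
{
  "MinTTL": 31536000,
  "DefaultTTL": 31536000,
  "MaxTTL": 31536000,
  "ForwardedValues": {
    "QueryString": false,
    "Cookies": { "Forward": "none" },
    "Headers": { "Quantity": 2, "Items": ["Accept", "Authorization"] }
  }
}
```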
That's it. And it's ready in about 3 minutes.
The uniqueness of the GitHub API endpoint, `/repos/:owner/:repo/git/blobs/:file_sha`, lends itself nicely to an almost-zero-config CloudFront setup.
Rate-limit sanity check
Here's a Postman Tests script for handy rate-limit observability when making GitHub API calls. I know there's a dedicated `/rate_limit` endpoint, but I prefer not calling that, simply to avoid tabbing back and forth within Postman.
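A sketch of what that Tests-tab script can look like — it reads GitHub's `x-ratelimit-*` response headers, which come back on every API call:

```javascript
// Format GitHub's rate-limit state from a header-lookup function.
// Kept pure so it works outside the Postman sandbox too.
function describeRateLimit(getHeader) {
  const remaining = getHeader("x-ratelimit-remaining");
  const limit = getHeader("x-ratelimit-limit");
  const resetAt = new Date(Number(getHeader("x-ratelimit-reset")) * 1000);
  return `${remaining}/${limit} remaining, resets ${resetAt.toISOString()}`;
}

// Inside Postman's Tests tab, the sandbox provides `pm`:
//   console.log(describeRateLimit((h) => pm.response.headers.get(h)));
```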