Web Excursions 2022-01-07
How we handle 80TB and 5M page views a month for under $400
[Background: Poly Haven is a curated public asset library for visual effects artists and game designers, providing downloadable 3D assets for Blender.]
Cloudflare All the Way
Because of Argo, Cloudflare can cache the website's traffic [for static files] even more effectively than it caches the huge asset files, resulting in a cache ratio of about 93%.
Cloudflare’s Pro membership costs $20 per domain, and we have two domains: polyhaven.com for the main website, and polyhaven.org for the asset downloads.
Backblaze Bandwidth Alliance
As long as we use both services together, and pay for the $20 Cloudflare subscription, we don’t get charged for download traffic at all.
All we have to pay is the storage fee, some upload costs and API requests, which comes to around $11 per month.
Web “Server”
polyhaven.com is built with Next.js – a JavaScript framework created by Vercel.
While you can absolutely run a Next.js application on your own servers or half a dozen other cloud providers, Vercel offers a fairly straightforward and affordable service to deploy your web application with them directly.
Their base fee is $20 per month, with additional costs based on usage.
Since we use Cloudflare in front of Vercel and are super careful about what can be cached and what can’t (e.g. anything requiring user authentication), we generally don’t go over the included usage limits unless something goes terribly wrong.
In fact, we recently acquired sponsorship from Vercel, so they now cover our costs anyway.
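The caching rule mentioned above (never cache anything that requires user authentication) could be expressed as a choice of Cache-Control header per response. The following Python sketch is illustrative only: the function name, the one-hour lifetime and the policy itself are assumptions, not Poly Haven's actual configuration.

```python
def cache_control_for(requires_auth: bool, max_age_seconds: int = 3600) -> str:
    """Return a Cache-Control header value for a response (hypothetical policy)."""
    if requires_auth:
        # Per-user content must never be stored by a shared cache such as a CDN.
        return "private, no-store"
    # s-maxage applies to shared caches; max-age=0 makes browsers revalidate.
    return f"public, max-age=0, s-maxage={max_age_seconds}"


print(cache_control_for(requires_auth=True))
print(cache_control_for(requires_auth=False))
```

Keeping the decision in one small function makes it easier to audit which responses a shared cache is ever allowed to store.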
The Database
To avoid performance issues in the future, I decided to splurge a bit and go for a cloud solution where I wouldn’t have to worry about reliability, performance, scaling or integrity ever again: Google Firestore.
This is certainly not the cheapest option, at around $100 per month.
Every time a bit of data is fetched from the database, we pay a tiny amount. Per read it’s practically insignificant, at $0.0000006. But multiply this by the number of document reads you have to do per page view (e.g. on our library page, that’s one read per asset, 866 currently), multiplied by the number of page views (~5M per month) and it can get very expensive very quickly.
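To make the arithmetic concrete, here is a rough worst-case sketch in Python using the figures quoted above. It assumes, purely for illustration, that every page view is a library page and that nothing is cached, which overstates real usage. The function name is hypothetical.

```python
# Figures quoted in the text above.
COST_PER_READ_USD = 0.0000006   # price per document read
READS_PER_LIBRARY_PAGE = 866    # one read per asset
PAGE_VIEWS_PER_MONTH = 5_000_000


def monthly_read_cost(reads_per_view: int, views: int, cost_per_read: float) -> float:
    """Uncached cost: every page view triggers every document read."""
    return reads_per_view * views * cost_per_read


worst_case = monthly_read_cost(
    READS_PER_LIBRARY_PAGE, PAGE_VIEWS_PER_MONTH, COST_PER_READ_USD
)
print(f"Uncached worst case: ${worst_case:,.0f} per month")
```

Under these assumptions the bill would come to roughly $2,600 per month, far above the roughly $100 actually paid.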
To avoid database reads as much as possible, we cache as much as possible.
Our API
We have a separate $5 server (yes, seriously) on Vultr that runs our API.
The purpose of this API is to connect our front-end website (on Vercel) to our database. And because we cache everything so heavily (in order to reduce database costs), the API server can be extremely basic.
Argo
Argo is an optional extra service from Cloudflare that does two things:
Optimizes network routes to improve latency (this helps our site feel faster to users).
Adds an additional layer to the cache, so all global traffic goes through only a handful of their biggest data centers before splitting up to their hundreds of edge nodes.
This results in a much higher cache ratio (going from ~75% to 93% with our configuration), but comes at a cost: You now pay per GB of traffic going through the Argo network.
We have Argo enabled on the .com domain, which runs the website with relatively low bandwidth, and we leave Argo disabled on the .org domain, which serves the 80TB of download traffic.
Argo costs us about $160 per month currently, by far the biggest single expense. But remember it saves us a huge chunk of database usage fees, by my back-of-the-napkin math around $250.
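A quick calculation shows why the jump in cache ratio matters. The sketch below uses the approximate 75% and 93% ratios and the dollar estimates quoted above, and assumes for simplicity that each cache miss becomes exactly one request to the origin.

```python
def miss_fraction(cache_ratio: float) -> float:
    """Share of requests that miss the cache and fall through to the origin."""
    return 1.0 - cache_ratio


before = miss_fraction(0.75)  # without Argo (approximate)
after = miss_fraction(0.93)   # with Argo

reduction = 1.0 - after / before  # relative drop in origin requests
net_saving = 250 - 160            # estimated database savings minus Argo fee, USD/month

print(f"Origin requests drop by about {reduction:.0%}")
print(f"Net saving: about ${net_saving} per month")
```

Misses fall from about 25% of requests to about 7%, so the origin (and the database behind it) sees roughly 72% fewer requests, and on the author's estimate Argo still comes out about $90 per month ahead.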
Image Hosting
Bunny.net doesn’t just store our images though, they also have an optimization service which allows us to dynamically resize and compress images for the website.
This costs us about $27 per month, depending on traffic.
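Tallying the approximate monthly figures quoted in the sections above gives a sanity check on the "under $400" headline. This is only a sketch: the labels are shorthand, the amounts fluctuate with traffic, and the Vercel base fee is counted even though it is now covered by sponsorship.

```python
# Approximate monthly costs in USD, as quoted in the sections above.
monthly_costs = {
    "Cloudflare Pro (2 domains)": 2 * 20,
    "Backblaze storage and API": 11,
    "Vercel base fee": 20,  # now sponsored, counted here anyway
    "Google Firestore": 100,
    "Vultr API server": 5,
    "Cloudflare Argo": 160,
    "Bunny.net images": 27,
}

total = sum(monthly_costs.values())
for item, cost in monthly_costs.items():
    print(f"{item:<28} ${cost:>4}")
print(f"{'Total':<28} ${total:>4}")
```

The quoted figures sum to $363 per month, consistent with the headline.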
Future Thoughts and Contingencies
Our API server is currently our weakest single point of failure.
Google Firebase is nice and convenient, but it is quite expensive.
One of our biggest cost savings is with the Cloudflare + Backblaze Bandwidth Alliance [which may] cease to exist at some point in the future.