How we get fast global routing and caching
Platform to help users find, publish, and use R packages.
With or without CRAN.
R-universe runs on a self-hosted server in NYC, but all traffic is routed via the Cloudflare CDN (free).
Both URLs below point to the same server, and no caching is involved.
# Direct connection to NYC-3
curl -OL https://dev.opencpu.org/ubuntu-2404.iso
# Same backend server, but routed via Cloudflare
curl -OL https://proxy.opencpu.org/ubuntu-2404.iso
The farther you are from NYC, the slower the first download.
But Cloudflare routing can max out your bandwidth anywhere.
Conclusion: even without caching, clients can download from the r-universe server at high speed and low latency.
From GHA runners we download from r-universe at about 60 MB/s (roughly 480 Mbit/s).
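To reproduce this kind of comparison outside of curl, here is a minimal sketch for Node 18 or later. The helper names `throughput` and `timeDownload` are made up for illustration, and the numbers you get depend entirely on your own network.

```javascript
// Convert a byte count and elapsed time into MB/s and Mbit/s (1 MB = 1e6 bytes).
function throughput(bytes, elapsedMs) {
  const seconds = elapsedMs / 1000;
  const mbPerSec = bytes / 1e6 / seconds;
  return { mbPerSec, mbitPerSec: mbPerSec * 8 };
}

// Download a URL, discard the body chunk by chunk, and report the throughput.
// Relies on the global fetch() and async-iterable response bodies in Node 18+.
async function timeDownload(url) {
  const start = performance.now();
  const res = await fetch(url);
  let bytes = 0;
  for await (const chunk of res.body) bytes += chunk.byteLength;
  return throughput(bytes, performance.now() - start);
}
```

Calling `timeDownload()` once with the dev.opencpu.org URL and once with the proxy.opencpu.org URL gives the direct and the Cloudflare-routed figure for the same file.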
We also make Cloudflare cache responses using the Cache-Control HTTP response header.
cache-control: public, max-age=60,
Here public means the CDN may share cached responses between different clients, and max-age=60 sets the cache lifetime to 1 minute.
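To make the directives concrete, here is a rough sketch of how a shared cache might read such a header. It is a simplification, not Cloudflare's actual logic: it splits on commas (so quoted values containing commas are not handled) and ignores heuristic freshness. The function names are hypothetical.

```javascript
// Parse a Cache-Control header value into an object of directives.
// Valueless directives (e.g. "public") map to true; empty items are skipped.
function parseCacheControl(value) {
  const directives = {};
  for (const part of value.split(',')) {
    const item = part.trim();
    if (!item) continue;
    const eq = item.indexOf('=');
    if (eq === -1) {
      directives[item.toLowerCase()] = true;
    } else {
      const key = item.slice(0, eq).trim().toLowerCase();
      const raw = item.slice(eq + 1).trim().replace(/^"|"$/g, '');
      directives[key] = /^\d+$/.test(raw) ? Number(raw) : raw;
    }
  }
  return directives;
}

// May a shared cache (such as a CDN) serve a stored response of this age
// without contacting the origin? s-maxage wins over max-age for shared caches.
function sharedCacheMayServe(value, ageSeconds) {
  const cc = parseCacheControl(value);
  if (cc.private || cc['no-store'] || cc['no-cache']) return false;
  const maxAge = cc['s-maxage'] ?? cc['max-age'];
  return typeof maxAge === 'number' && ageSeconds < maxAge;
}
```

With `public, max-age=60`, a stored response that is 30 seconds old can be served straight from the cache, while one that is 60 seconds old or older has to be revalidated first.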
BUT: the big win is clever cache revalidation based on ETag or Last-Modified.
// Server Middleware:
// get_latest() looks up the last changed record for a given query from
// the database. This is cheap.
return get_latest(query).then(function(doc){
  res.set('Cache-Control', `public, max-age=60, stale-while-revalidate=30`);
  if(doc){
    const etag = `W/"${doc._id}"`;
    const date = doc._published.toUTCString();
    res.set('ETag', etag);
    res.set('Last-Modified', date);
    if(etag === req.header('If-None-Match') || date === req.header('If-Modified-Since')){
      res.status(304).send(); //DONE!
    } else {
      next(); //proceed to routing
    }
  }
...
The query determines which R packages, when updated, could change the requested HTML page.
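The 304 decision in the middleware can also be written as a standalone function, which makes it easy to test without a server. The sketch below is a variation, not the r-universe code: it accepts several entity tags in If-None-Match, compares them weakly, and lets If-None-Match take precedence over If-Modified-Since, as the HTTP specification describes. `notModified` and `opaqueTag` are hypothetical names, and header keys are assumed to be lowercase, as in Node's `req.headers`.

```javascript
// Strip the weak-validator prefix so W/"x" and "x" compare equal
// (the weak comparison that If-None-Match uses).
function opaqueTag(tag) {
  return tag.trim().replace(/^W\//, '');
}

// Return true when the client's cached copy is still current,
// i.e. the server may answer 304 Not Modified.
function notModified(reqHeaders, etag, lastModified) {
  const inm = reqHeaders['if-none-match'];
  if (inm !== undefined) {
    // When If-None-Match is present, If-Modified-Since is ignored.
    if (inm.trim() === '*') return true;
    return inm.split(',').some((t) => opaqueTag(t) === opaqueTag(etag));
  }
  const ims = reqHeaders['if-modified-since'];
  if (ims !== undefined) {
    const since = Date.parse(ims);
    // HTTP dates have one-second resolution, so truncate before comparing.
    const modified = Math.floor(lastModified.getTime() / 1000) * 1000;
    return !Number.isNaN(since) && modified <= since;
  }
  return false;
}
```

In an Express-style handler this would be used as `if (notModified(req.headers, etag, doc._published)) res.status(304).end(); else next();`, with `etag` and `doc` coming from the lookup shown above.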
The CDN server uses SHA256 content-addressed URLs.
Files downloaded by their hash are by definition immutable.
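As an illustration of why hash-addressed files can be cached forever, here is a small Node.js sketch. The URL layout (hash as the last path segment) and the exact header values are assumptions made for the example, not the real r-universe CDN scheme.

```javascript
const { createHash } = require('node:crypto');

// Hex-encoded SHA-256 digest of a file's bytes.
function sha256Hex(data) {
  return createHash('sha256').update(data).digest('hex');
}

// Build a content-addressed URL. The base URL and path layout are hypothetical.
function contentUrl(baseUrl, data) {
  return `${baseUrl}/${sha256Hex(data)}`;
}

// Cache headers for a hash-addressed file: the bytes behind this URL can
// never change, so caches may keep them for a long time without revalidating.
function immutableHeaders() {
  return { 'Cache-Control': 'public, max-age=31536000, immutable' };
}

// Check that downloaded bytes match the hash in the URL they came from.
function verifyDownload(url, data) {
  return url.split('/').pop() === sha256Hex(data);
}
```

Because the URL is derived from the content, changing a file produces a new URL rather than a new version behind the old one, so no revalidation is ever needed and a client can detect corruption by rehashing what it received.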
Read about this and many other r-universe topics at: https://ropensci.org/technotes/