The CDN is Dead, Long Live the CDN! - Cache Partitioning in Firefox and Chrome

Dan   

It's long been conventional wisdom that if you're using a third party library for your website, like jQuery or Bootstrap, you should load it from its high performance CDN. For example, jQuery has code.jquery.com, from which you can include any version of jQuery on your website, e.g.:

<script src="https://code.jquery.com/jquery-3.3.1.min.js"
          integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8="
          crossorigin="anonymous"></script>

This is supposed to give you the benefit of loading the jQuery library from their super fast global CDN. In addition, if some other site your visitor has been to also included jQuery in the same way, the file would already be in their browser cache. In that case the file could be served from the browser cache and you have an even faster website! Win-win.

In practice, website visitors rarely see any benefit. First, the browser has to open a separate connection to the third party CDN, which can negate any latency benefit the CDN provides. Second, there are usually so many different versions of major libraries in use that the chance of two websites using the same version AND both loading it from the same CDN is small.

Cache Partitioning in Chrome and Firefox

Until recently, Chrome and Firefox had a shared browser cache for all websites that a user visited. This means if you visited a website and it loaded the resource:

https://www.somesite.com/foo.js

and you then visited a second website, and it also included the same resource, then the resource would be loaded from the shared cache rather than being downloaded from the internet a second time. Cookies set by these resources would also be shared.

As of Firefox v85 and Chrome v86, the browser cache is partitioned. This means that the same resource included on two different sites has to be downloaded from the internet twice and is cached separately for each site.
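The difference between the two models can be sketched in a few lines of Python. This is a simplified illustration, not how either browser is implemented: the class names, the site-a.example and site-b.example hosts, and the choice of (top-level site, resource URL) as the partition key are assumptions made for the example. Real browsers use richer keys.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Return scheme://host for a URL (a simplified stand-in for a browser 'site')."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}"


class SharedCache:
    """Old model: cache entries are keyed by the resource URL only."""

    def __init__(self):
        self._entries = {}
        self.downloads = 0  # how many times we had to go to the network

    def _key(self, top_level_url, resource_url):
        return resource_url

    def fetch(self, top_level_url, resource_url):
        """Return 'cache' on a hit, otherwise 'network' (and store the entry)."""
        key = self._key(top_level_url, resource_url)
        if key in self._entries:
            return "cache"
        self.downloads += 1
        self._entries[key] = b"<response body>"
        return "network"


class PartitionedCache(SharedCache):
    """New model (assumed key): entries are keyed by (top-level site, resource URL)."""

    def _key(self, top_level_url, resource_url):
        return (site_of(top_level_url), resource_url)


if __name__ == "__main__":
    resource = "https://code.jquery.com/jquery-3.3.1.min.js"
    for cache in (SharedCache(), PartitionedCache()):
        first = cache.fetch("https://site-a.example/", resource)
        second = cache.fetch("https://site-b.example/", resource)
        print(type(cache).__name__, first, second, "downloads:", cache.downloads)
```

With the shared cache the second site gets a cache hit. With the partitioned cache both sites download the file, although a second page on the same site still hits the cache.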

Why are they doing this?

The primary reason is privacy. The shared cache in browsers has been used by unscrupulous operators to track users without their consent across different websites. They do this by utilising cache side channel attacks.

What Are Side Channel Attacks?

In computer systems a program or algorithm might not have any bugs and be perfectly secure, but its interaction with the computer might leave information that an attacker can exploit. A great analogy from the spy world is using a laser beam to measure the vibrations on a pane of glass to hear a conversation going on inside a room that you otherwise couldn't hear.

Side channel attacks can be ingenious, utilising techniques such as:

  • observing power consumption or electromagnetic radiation;
  • timing data movement in and out of memory;
  • timing how long a CPU takes to execute an instruction;
  • measuring sounds emitted by a hard drive.

With respect to the browser cache, an example attack starts with the user opening a malicious website. This malicious website then requests a resource (e.g. an image) from another site. By timing how long the browser takes to load that image, it can determine whether the image was served from the browser cache or had to be downloaded over the internet. According to Google this technique can be used to:

  • Detect if a user has visited a specific site.
  • Detect if an arbitrary string is in the user's search results by checking for 'no search results' images used by particular sites.
  • Track users across sites using the cache.
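The timing probe described above can be modelled with a small Python simulation. It is only a sketch under stated assumptions: the latency figures, the threshold, the class and function names, and the shop.example and evil.example hosts are all made up for illustration, and load times are simulated rather than measured.

```python
CACHE_HIT_MS = 2.0    # assumed time to load a resource from the browser cache
NETWORK_MS = 80.0     # assumed time to download the same resource
THRESHOLD_MS = 20.0   # attacker's cut-off between "cached" and "downloaded"


class SimulatedBrowser:
    """Toy browser that reports a simulated load time for each resource."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self._cache = set()

    def load(self, top_level_site, resource_url):
        """Load a resource for a page on top_level_site; return the load time in ms."""
        if self.partitioned:
            key = (top_level_site, resource_url)
        else:
            key = resource_url
        if key in self._cache:
            return CACHE_HIT_MS
        self._cache.add(key)  # note: the probe itself also populates the cache
        return NETWORK_MS


def probe_visited(browser, attacker_site, marker_url):
    """Attacker's page times a marker resource; a fast load implies it was cached."""
    return browser.load(attacker_site, marker_url) < THRESHOLD_MS


if __name__ == "__main__":
    logo = "https://shop.example/logo.png"
    for partitioned in (False, True):
        browser = SimulatedBrowser(partitioned)
        browser.load("https://shop.example", logo)  # the user visits the shop
        leaked = probe_visited(browser, "https://evil.example", logo)
        print("partitioned" if partitioned else "shared", "cache, visit detected:", leaked)
```

In this model the shared cache lets the attacker detect the earlier visit, while the partitioned cache makes the probe look like a first download.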

GitHub user terjanq has published a cache side channel attack that can gather all sorts of information from Google services. He states that a regular Google user could have the following information leaked:

  • search history
  • videos watched
  • the exact URLs visited
  • time frames of the activities
  • private book collection
  • books read / purchased / bookmarked / favorite / etc.
  • private emails
  • tokens / credit card numbers / phone numbers / etc.
  • contacts (including email addresses, names, phone numbers)
  • bookmarked websites
  • and more.

Perhaps it was easier to just implement cache partitioning in Chrome rather than patch these vulnerabilities...

What you should do

Google estimates that the changes in browser caching will have minimal impact, as little as a 0.3% difference in the FCP (First Contentful Paint). However, since there will no longer be any shared cache benefit to loading libraries from a third party CDN, you should:

  • Host all third party libraries on your own domain to eliminate the need for the browser to establish a second connection.
  • Utilise a modern CDN, like Peakhour, which transparently caches and serves your website assets on its global CDN.
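If you self-host a library and still want the integrity check used in the jQuery snippet earlier, the hash can be generated from your own copy of the file. The sketch below assumes SHA-256 and uses hypothetical helper names and a hypothetical /static/js/ path. It builds the integrity value as "sha256-" followed by the base64 encoding of the digest.

```python
import base64
import hashlib


def sri_sha256(data):
    """Return a Subresource Integrity value for the given file contents."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def script_tag(src, data):
    """Build a script tag for a self-hosted file with an integrity attribute."""
    return f'<script src="{src}" integrity="{sri_sha256(data)}"></script>'


if __name__ == "__main__":
    # Placeholder bytes; in practice read your local copy of the library,
    # e.g. open("static/js/jquery-3.3.1.min.js", "rb").read() (hypothetical path).
    contents = b"console.log('hello');"
    print(script_tag("/static/js/jquery-3.3.1.min.js", contents))
```

Because the file is served from your own domain, the crossorigin attribute from the CDN example is not needed here.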

Conclusion

There really are some third party resources that are shared across millions of websites, the most ubiquitous being Google Fonts. Under the new cache partitioning implementations these fonts will have to be downloaded for every site that uses them. Unfortunately, this will have some impact on site speed and data usage. It also has the side effect of improving Google's ability to track users: users now have to download the font for every site they visit, which reveals to Google where they have been.

Safari has been partitioning the HTTP cache since 2013, leaving Microsoft's Edge as the last major browser with a global HTTP cache. However, future versions will be based on Chromium (the open source version of Chrome), so they will logically get cache partitioning by default.

