Tracking people online is generally done with cookies: when your browser connects to a website like Google, you get a cookie, and every time you visit a site that uses anything made by Google (80% of the web, but not this site), your browser sends that cookie to Google and they know what you're visiting. But what if you delete your cookies? What if your browser rejects cookies altogether? Well, the web giants have another trick up their sleeves to identify you for their shady activities: fingerprinting.
Fingerprinting is the shady practice of using unique (or almost unique) characteristics of your browser to learn information about you or your machine and identify you against your will. A typical way of doing this consists of giving the browser some ambiguous instructions on how to do something and then checking the result: the same browser on the same machine will always give the same result, even if you delete your data, while other browsers and machines tend to give slightly different ones.
If this sounds fishy, you'll be surprised to know that it is scarily common. All the big tech companies do it: Google, Amazon, Facebook, eBay, YouTube, Microsoft, Intel and many others.
Traditionally, fingerprinting is done by combining stuff like IP address, user agent, screen resolution, system language(s), etc., but these factors don't really cut it anymore, so more sophisticated techniques are being used.
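As a rough illustration of this traditional approach, here is a minimal sketch that joins a few attributes into one string and hashes it. The function names (`fnv1a`, `basicFingerprint`, `collectBrowserAttributes`), the choice of attributes and the hash itself are assumptions made for this example, not code taken from any real tracker.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units. Non-cryptographic,
// picked here only because it is short.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Join the attributes in sorted key order, so the same values always
// produce the same identifier regardless of how the object was built.
function basicFingerprint(attrs) {
  const canonical = Object.keys(attrs)
    .sort()
    .map((key) => `${key}=${attrs[key]}`)
    .join("|");
  return fnv1a(canonical);
}

// Browser-only helper (hypothetical): gathers the client-side attributes.
// The IP address is only visible to the server, so it is left out here.
function collectBrowserAttributes() {
  return {
    userAgent: navigator.userAgent,
    screen: `${screen.width}x${screen.height}x${screen.colorDepth}`,
    languages: (navigator.languages || [navigator.language]).join(","),
    timezoneOffset: new Date().getTimezoneOffset(),
  };
}
```

In a browser, `basicFingerprint(collectBrowserAttributes())` would return an 8-character hex string. Since many people share the same user agent, resolution and language, an identifier like this is far from unique.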
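In this article, we'll go through some common browser fingerprinting techniques, including canvas rendering (with its WebGL variant), audio processing and installed font detection.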
All the code and techniques in this article are shown for educational purposes only. You are NOT allowed to copy, modify, redistribute or embed anything from this page in any way, shape or form.
Everything you see on this page should be considered malware; I am not responsible for what you do with it.
By running the examples below, you subject yourself to fingerprinting. Fingerprints are not stored by this site.
Canvas fingerprinting asks the browser to draw some text, an emoji and a few lines on a canvas, then reads the resulting image back.
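A minimal sketch of the idea, assuming a browser with 2D canvas support: draw a fixed scene, serialize it with `toDataURL()` and hash the result. The scene, the `canvasFingerprint` and `hashString` names and the hash are assumptions chosen for illustration. `doc` is passed in as a parameter only so that the function can also be exercised with a stub.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units (non-cryptographic).
function hashString(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Draw a fixed scene and hash the encoded pixels.
// `doc` is the page's `document` (or a stand-in with the same methods).
function canvasFingerprint(doc) {
  const canvas = doc.createElement("canvas");
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");

  // Solid background.
  ctx.fillStyle = "#f60";
  ctx.fillRect(0, 0, canvas.width, canvas.height);

  // Text and emoji: font rasterization and emoji glyphs vary per system.
  ctx.textBaseline = "top";
  ctx.font = "18px Arial, sans-serif";
  ctx.fillStyle = "#069";
  ctx.fillText("Fingerprint \u{1F603}", 4, 8);

  // A slanted line and a circle: anti-aliasing differs between renderers.
  ctx.strokeStyle = "rgba(102, 204, 0, 0.7)";
  ctx.lineWidth = 1.5;
  ctx.beginPath();
  ctx.moveTo(0, 55);
  ctx.lineTo(240, 35);
  ctx.stroke();
  ctx.beginPath();
  ctx.arc(200, 30, 20, 0, Math.PI * 2);
  ctx.stroke();

  // toDataURL() serializes the rendered pixels (PNG by default).
  return hashString(canvas.toDataURL());
}
```

In a page, `canvasFingerprint(document)` returns a short hex string that stays the same across visits on the same browser and machine.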
These two images were generated by the same code on my laptop and on my desktop.
As you can see, the images generated by the same code are different: the font and the emoji are clearly rendered differently, but if we XOR the images, we can see that the lines and the background are also slightly different.
In this case, the 2 machines had:
Simple, yet effective.
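The XOR comparison mentioned above can be sketched as follows, assuming both images are available as RGBA pixel buffers of the same size (for instance the `data` field returned by `ctx.getImageData()`). The helper names `xorPixels` and `countDifferentPixels` are made up for this example.

```javascript
// XOR two RGBA pixel buffers of equal length. Identical pixels come out
// black (0, 0, 0); any difference shows up as a non-zero color.
function xorPixels(a, b) {
  if (a.length !== b.length) {
    throw new Error("images must have the same dimensions");
  }
  const out = new Uint8ClampedArray(a.length);
  for (let i = 0; i < a.length; i += 4) {
    out[i] = a[i] ^ b[i];             // red
    out[i + 1] = a[i + 1] ^ b[i + 1]; // green
    out[i + 2] = a[i + 2] ^ b[i + 2]; // blue
    out[i + 3] = 255;                 // keep the result opaque
  }
  return out;
}

// Count the pixels whose color differs between the two source images.
function countDifferentPixels(xored) {
  let count = 0;
  for (let i = 0; i < xored.length; i += 4) {
    if (xored[i] | xored[i + 1] | xored[i + 2]) count++;
  }
  return count;
}
```

Even a tiny difference in anti-aliasing produces non-zero pixels here, which is why the lines and the background stand out once the images are XORed.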
A variant of this technique can be implemented using WebGL. It's the same principle, just a different technology.
Audio can be abused in the same way. In this first version, we use an oscillator and a compressor to generate a high-pitched sound, which different browsers render in slightly different ways, then we calculate a hash of the first 512 samples generated by the browser.
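Here is a minimal sketch of this first version, assuming a browser whose `OfflineAudioContext.startRendering()` returns a promise. The oscillator frequency, the compressor settings and the names `hashSamples` and `audioFingerprint` are assumptions picked for illustration; the text above only fixes the oscillator, the compressor and the 512 samples.

```javascript
// Hash the raw bytes of the first `count` samples with an FNV-1a-style
// 32-bit hash (non-cryptographic).
function hashSamples(samples, count = 512) {
  const n = Math.min(count, samples.length);
  const head = new Float32Array(n);
  for (let i = 0; i < n; i++) head[i] = samples[i];
  const bytes = new Uint8Array(head.buffer);

  let hash = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Browser-only: render one second of audio offline (nothing is played)
// and hash the beginning of the output.
async function audioFingerprint() {
  // 1 channel, 44100 frames, 44.1 kHz sample rate.
  const ctx = new OfflineAudioContext(1, 44100, 44100);

  // High-pitched tone (frequency and waveform are assumed values).
  const osc = ctx.createOscillator();
  osc.type = "triangle";
  osc.frequency.value = 10000;

  // The compressor is where implementations tend to differ numerically.
  const comp = ctx.createDynamicsCompressor();
  comp.threshold.value = -50;
  comp.knee.value = 40;
  comp.ratio.value = 12;
  comp.attack.value = 0;
  comp.release.value = 0.25;

  osc.connect(comp);
  comp.connect(ctx.destination);
  osc.start(0);

  const rendered = await ctx.startRendering();
  return hashSamples(rendered.getChannelData(0), 512);
}
```

In a page, `audioFingerprint().then(console.log)` prints the hash. Because the audio is rendered offline, no sound is played and no permission prompt appears.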
A second, nastier variant of this technique uses an FFT to analyze the output and then hashes the frequency bins. This variant is used in the wild and, as of November 2019, there are no extensions that prevent it.
This is so evil.
Another classic is the detection of installed fonts. The idea here is to measure the size of a string rendered with a known good font (like monospace), and then compare it with the size of the same string rendered with font-family: font_to_detect, monospace. If the sizes are different, the font is installed; otherwise the browser fell back to monospace, so it is not.
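The comparison described above can be sketched like this. The names `detectFonts` and `makeCanvasMeasure`, the test string and the font size are assumptions for this example, and measuring with canvas `measureText()` is just one possible way to get the width (an off-screen element's `offsetWidth` would work too).

```javascript
// Return the candidate fonts that appear to be installed.
// `measure(fontFamily)` must return the rendered width of a fixed test
// string for the given CSS font-family list.
function detectFonts(candidates, measure, fallback = "monospace") {
  const baseline = measure(fallback);
  return candidates.filter(
    // If the candidate is missing, the browser falls back to `fallback`
    // and the width matches the baseline.
    (font) => measure(`"${font}", ${fallback}`) !== baseline
  );
}

// Browser-side measure function built on canvas measureText().
// `doc` is the page's `document`.
function makeCanvasMeasure(doc, text = "mmmmmmmmmmlli", size = "72px") {
  const ctx = doc.createElement("canvas").getContext("2d");
  return (fontFamily) => {
    ctx.font = `${size} ${fontFamily}`;
    return ctx.measureText(text).width;
  };
}
```

In a page, `detectFonts(["Ubuntu", "Calibri"], makeCanvasMeasure(document))` returns the subset that rendered with a different width than the fallback. A font whose width happens to match the fallback for the test string would be missed, so scripts often repeat the check against more than one fallback font.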
With this trick, we can detect some installed apps on the user's machine, like Microsoft Office or the Adobe suite, and we can even find out if they're spoofing the user agent by testing system fonts (it's extremely unlikely that the Ubuntu font will be installed on Windows 10, right?).
In this example, we only test a short list of fonts, but much larger lists can be found on the internet.