How to Boost Your Server Performance with Varnish
Varnish Cache is an HTTP accelerator and reverse proxy developed by Danish consultant and FreeBSD core developer Poul-Henning Kamp, along with other developers at Norwegian Linpro AS. It was released in 2006.
According to Pingdom.com, a company focused on web performance, in 2012 Varnish was already famous among the world's top websites for its capacity to speed up web delivery, and it was being used by sites such as Wired, SlideShare, Zappos, SoundCloud, Weather.com, Business Insider, Answers.com, Urban Dictionary, MacRumors, DynDNS, OpenDNS, Lonely Planet, Technorati, ThinkGeek and Economist.com.
Although there are other solutions that also shine, Varnish is still a go-to solution that can dramatically improve website speed, reduce the strain on the web application server's CPU, and even serve as a protection layer from DDoS attacks. KeyCDN recommends deploying it on the origin server stack.
Varnish can sit on a dedicated machine in case of more demanding websites, and make sure that the origin servers aren't affected by the flood of requests.
At the time of this writing (November 2017), Varnish is at version 5.2.
How it Works
Caching in general works by keeping the pre-computed outputs of an application in memory, or on the disk, so that expensive computations don't have to be computed over and over on every request. Web Cache can be on the client (browser cache), or on the server. Varnish falls into the second category. It is usually configured so that it listens for requests on the standard HTTP port (80), and then serves the requested resource to the website visitor.
The first time a certain URL and path are requested, Varnish has to request it from the origin server in order to serve it to the visitor. This is called a CACHE MISS, which can be read in HTTP response headers, depending on the Varnish setup.
According to the docs,
when an object, any kind of content i.e. an image or a page, is not stored in the cache, then we have what is commonly known as a cache miss, in which case Varnish will go and fetch the content from the web server, store it and deliver a copy to the user and retain it in cache to serve in response to future requests.
When a particular URL or a resource is cached by Varnish and stored in memory, it can be served directly from server RAM; it doesn't need to be computed every time. Varnish will start delivering a CACHE HIT in a matter of microseconds.
This means that neither our origin server or our web application, including its database, are touched by future requests. They won't even be aware of the requests loaded on cached URLs.
Varnish is threaded. It's been reported that Varnish was able to handle over 200,000 requests per second on a single instance. If properly configured, the only bottlenecks of your web app will be network throughput and the amount of RAM. (This shouldn't be an unreasonable requirement, because it just needs to keep computed web pages in memory, so for most websites, a couple of gigabytes should be sufficient.)
Varnish is extendable via VMODS. These are modules that can use standard C libraries and extend Varnish functionality. There are community-contributed VMODS listed here. They range from header manipulation to Lua scripting, throttling of requests, authentication, and so on.
Varnish has its own domain-specific language, VCL. VCL provides comprehensive configurability. With a full-page caching server like Varnish, there are a lot of intricacies that need to be solved.
When we cache a dynamic website with dozens or hundreds of pages and paths, with GET query parameters, we'll want to exclude some of them from cache, or set different cache-expiration rules. Sometimes we'll want to cache certain Ajax requests, or exclude them from the cache. This varies from project to project, and can't be tailored in advance.
Sometimes we'll want Varnish to decide what to do with the request depending on request headers. Sometimes we'll want to pass requests directly to the back end with a certain cookie set.
To quote the Varnish book,
VCL provides subroutines that allow you to affect the handling of any single request almost anywhere in the execution chain.
Purging the cache often needs to be done dynamically --- triggered by publishing articles or updating the website. Purging also needs to be done as atomically as possible --- meaning it should target the smallest possible scope, like a single resource or path.
Varnish has a set of tools for monitoring and administering the server:
varnishtop, which lets us monitor requested URLs and their frequency.
varnishncsacan be used to print the Varnish Shared memory Log (VSL): it dumps everything pointing to a certain domain and subdomains.
varnishhistreads the VSL and presents a live histogram showing the distribution of the last number of requests, giving an overview of server and back-end performance.
varnishtestis used to test VCL configuration files and develop VMODS.
varnishstatdisplays statistics about our varnishd instance:
varnishlogis used to get data about specific clients and requests.
Varnish Software offers a set of commercial, paid solutions either built on top of Varnish cache, or extending its usage and helping with monitoring and management: Varnish Api Engine, Varnish Extend, Akamai Connector for Varnish, Varnish Administration Console (VAC), and Varnish Custom Statistics (VCS).