“You’re Doing It Wrong: Think You’ve Mastered the Art of Server Performance? Think Again.”, 2010-06-11 (; backlinks; similar):
[Discussion of optimization Poul-Henning Kamp (PHK) made to Varnish, switching to an obscurer data structure which has better cache performance. This switch singlehandedly made a highly-optimized piece of software a good third faster.
Varnish is a web server technology which is used to store millions or billions of web pages/files in RAM to serve them ultra-fast, instead of falling back to a regular web server which may need to pull a file off the disk or use laborious computations to create the necessary web page. This is a straightforward high-volume task which should be bottlenecked on network and CPU—copying parts of RAM to the network as fast as possible.
PHK observed that Varnish server CPUs were not a bottleneck, and almost entirely idle. Why? Because it was spending most of its time tracking what files were in RAM, to make sure the copies weren’t too old & obsolete. This tracking was conveniently & efficiently implemented as a binary heap.
Unfortunately, a Varnish server wants to spend all its memory on the actual files it’s there to send, forcing much of the binary heap to be stashed on disk. This is bad because the binary heap assumes that all of it is in memory and optimizes for number of queries, spreading itself across a lot of memory—except because it is getting pushed out to disk, that means that each lookup may require going out to disk. So while the binary heap requires the minimum number of queries to figure out what objects are obsolete, each query is a worst-case query, unexpectedly taking milliseconds or entire seconds!
The solution is to take a binary heap and rearrange it into a “B-heap” to keep nearby entries on the same part of disk, instead of scattered across the disk. That way, when Varnish is forced to go out to disk to check obsoleteness, it at least only has to do a few queries on a single part of the disk which can be read en masse, and can get back to useful work more quickly.]