Social World Wide Wait

Photo by Tijs Zwinkels, CC Attribution Share alike.

Been busy at work improving the performance of our main web application.

Meanwhile Dr Dave blogged a link to a page full of beautiful chess sets that took 48 seconds to load (even when all the big photos were already cached in memory on my own machine). I’ve also been tidying up another website which, whilst not particularly wasteful, still took a significant performance hit through failing to compress and cache everything it could.

The “basics” of making a website faster are fairly easy. Google Page Speed or Yahoo YSlow will both tell you if you’ve forgotten to enable compression of text, failed to set suitable caching headers, failed to compress your images suitably, or failed to compact your JavaScript, along with dozens of other checks (many of which need to be applied with caution, or may offer only modest benefits in many situations).
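To make the first two of those checks concrete, here is a minimal sketch in Python of the kind of test such tools run against a single response. It is not how Page Speed or YSlow are actually implemented; the function name `audit_headers` and the warning texts are my own invention, and the rules are deliberately crude.

```python
def audit_headers(headers):
    """Return a list of warnings about one HTTP response's headers.

    `headers` is a plain dict of header name -> value. This is a
    hypothetical helper for illustration, not part of any real tool.
    """
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    warnings = []

    # Only text-like content is worth compressing on the fly;
    # images and the like are already compressed.
    content_type = h.get("content-type", "").lower()
    is_text = (content_type.startswith("text/")
               or "javascript" in content_type
               or "json" in content_type)

    encoding = h.get("content-encoding", "").lower()
    if is_text and not any(tok in encoding for tok in ("gzip", "deflate", "br")):
        warnings.append("text response is not compressed")

    # Without Cache-Control or Expires the browser has to guess.
    cache_control = h.get("cache-control", "").lower()
    if not cache_control and "expires" not in h:
        warnings.append("no caching headers set")
    elif "no-store" in cache_control or "no-cache" in cache_control:
        warnings.append("response is marked as not cacheable")

    return warnings


if __name__ == "__main__":
    print(audit_headers({"Content-Type": "text/html"}))
```

Feed it the headers from each response on a page and anything it returns is worth a second look, with the usual caveat that some responses are uncacheable on purpose.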

The design and tuning of server side applications can be fiddlier. Today, for example, I switched a small Perl script from CGI to FastCGI. It was hard to measure the benefit through the vagaries of testing, but when used in anger the improvement was very visible (mostly by showing up the performance of similar items which are served up in a different manner around them). The tools on the server end are more diverse, and the environments offer more ways to go wrong and more things to try. Correspondingly the documentation on how to achieve things is thinner on the ground, the paths less well trodden.

However, after applying most of the recommended enhancements, parts of our application were still too slow. One evening in particular I noted things were painfully slow; indeed bits of the application were failing. Eventually an error message was spewed in the middle of the page, bleeding over the proper content, and all became clear. I was getting “Social World Wide Wait” and it was bad.

In the case of the error, the Facebook “like” button was inserting an iframe into the page, and Facebook’s servers were then unable to fulfil the request due to load, so the iframe was populated with the server’s failure message.

However, even in normal operation Facebook “like” and Google Analytics each have to make a single non-cacheable fetch. This is how they track your users for targeted advertising and, in the case of Google Analytics, how they can tell you so much about them. Even when Facebook and Google Analytics were working correctly, and returning prompt results, it still meant we were often doing 3 or 4 fetches per page load (one to our server, two to Facebook’s servers, one to Google’s servers), and the “social networking” was accounting for more than 50% of the delay in fetching items for the page.

Worse still, I established there was an interaction between one of the JavaScript libraries used and Google Analytics, which meant Google Analytics wasn’t running at the ideal moment. Downloading and starting JavaScript in a page as early as possible, whilst not delaying other things, and not updating the page before it is ready, is a bit of an art.

The practical upshot of all this was to ask ‘Do we need Facebook “like” buttons all over the place?’ and ‘Do we need Google Analytics after they have logged in?’. Whilst these features have some value to us, we decided they didn’t have enough to justify the delay.

The Facebook component could have anticipated receiving an error from Facebook (either by hiding it, and revealing it when it worked, or by testing for success). But we, like most folk, cut and paste the code snippet from Facebook and stuff it in. Much of their code is already compacted, so code review is hard, and it is hard to simulate all the kinds of failure you might get from their site. So with limited testing resource, and plenty of our own code to test, we accept our shared fate with the rest of Facebook’s users.
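The “testing for success” idea is simple enough to sketch. The real snippet runs as JavaScript in the browser, so the Python below is only an illustration of the decision, not anything Facebook supplies; `widget_html` and the `fetch_widget` callable are hypothetical names standing in for whatever actually talks to the third party.

```python
def widget_html(fetch_widget, placeholder=""):
    """Return third-party widget markup, or a placeholder if it fails.

    `fetch_widget` is any zero-argument callable that returns markup
    or raises on failure (timeout, overloaded server, and so on).
    Both names are hypothetical; no real Facebook API is used here.
    """
    try:
        html = fetch_widget()
    except Exception:
        # The third party is down or slow: show nothing rather than
        # letting its error message bleed into our page.
        return placeholder
    if not html or not html.strip():
        # An empty answer is treated the same as a failure.
        return placeholder
    return html


if __name__ == "__main__":
    def overloaded():
        raise TimeoutError("third party did not answer in time")

    print(repr(widget_html(overloaded)))
    print(widget_html(lambda: "<button>like</button>"))
```

The point is only that the failure path is decided by us in advance, instead of being whatever the other party’s server happens to send back.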

I like Facebook and Google+, and integrating them into your website can make the experience of surfing the web better. You can create accounts quickly and easily. But be sparing in building 3rd party dependencies into your site, and remember the mathematics of downtime is not kind. If you rely on two services that each have 99% availability you are already down to 98.01% before you factor in your own service availability.
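The arithmetic behind that figure is just multiplication, on the assumption that the services fail independently of each other and that the page needs all of them to work. A small sketch (the function name is mine):

```python
from math import prod


def combined_availability(*availabilities):
    """Availability of a page that needs every listed service up.

    Assumes the services fail independently, which is a simplification:
    real outages are often correlated.
    """
    return prod(availabilities)


if __name__ == "__main__":
    # Two third-party services at 99% each, before counting our own.
    both = combined_availability(0.99, 0.99)
    print(f"{both:.2%}")  # 98.01%
```

Pass your own service’s availability as a further argument and the number only goes down from there.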

If you load your website with Gravatar icons, Facebook comments, AddThis buttons and similar, such that it takes 48 seconds to load, I will spend more time wondering about your site’s construction than I will interacting with your content.