Optimizing site speed: a case study

One of our clients runs an eCommerce site, GSPRushfit, selling workout videos on behalf of Georges St-Pierre, the UFC fighter. The videos sell well, but the client is, quite rightly, always looking for ways to increase sales. A few weeks ago, we ran a study to see how fast the site loaded and how that affected conversion rates (sales); I wrote a post about how we measure that a couple of weeks ago. The result was that we could see pages loaded reasonably well, but not fantastically. Across all pages, the average load speed was 3.2 seconds. What was eye-opening was that pages that loaded in 1.5 seconds or less converted at about twice the rate of pages that loaded in 1.5–5 seconds, with a further dip between 5–10 seconds.

So, with this data in hand, I started looking for ways to increase page load speed. I came up with a laundry list of things to do, most of them suggested by YSlow:

  1. Remove inline JS/CSS: We didn’t have a lot, but there was some unnecessary inline scripting. This was moved into the existing CSS & JS files – I think I added about 50 lines of code. Not a lot, but helpful. There’s still some inline JS & CSS that’s written dynamically by ColdFusion, but all the ‘static’ code was moved out.
  2. Minify Files: This doesn’t do a lot, but it does shrink files slightly. I was able to reduce our JS file by about 30% and our CSS by 15%. Not a lot, but still helpful. I use an app called Smaller, which I’m a big fan of. While YSlow suggests you also combine files, I chose not to – for us, the reduction in requests didn’t offset the problems of doing so.
  3. Reduce Image Size: The site is fairly graphics-intensive – large background images and lots of alpha-transparency PNGs. When I started, the homepage was loading just under 1.2MB of images, either as CSS backgrounds or inline. Without (to my eye) any noticeable loss of quality, I was able to re-cut those same images down to about 700KB.
  4. Use a CDN: The site serves video, which we call from a CDN, but the static assets (CSS, JS, images) weren’t being served from it. This was originally because the CDN doesn’t support being called over SSL. But it only took a little scripting to load every image from the CDN when not on SSL, and from the local server when over SSL (see the first sketch after this list). This, as you’d expect, greatly improved response time – by about 0.2 seconds on average.
  5. Query Caching: This one is likely a no-brainer, but the effects were stunning. All the content is query-driven, generated by our CMS, but it doesn’t change terribly often. So I started caching all the queries. This alone dropped our page load time by nearly a full second on some pages. And to maintain the usefulness of a CMS, I wrote an update to clear specific queries from the cache when new content is published (more on this in the list below).
  6. GZip: Again, likely something I should already have been doing but, to be honest, I had no idea how to accomplish it on IIS. So I figured that out and set the server to gzip static assets (JS, CSS & HTML files); a config sketch follows this list.
  7. Far-future expires headers: Because very few of our images change frequently, I set an expiry date of 1 year in the future, and set a Cache-Control max-age for the same time frame, which should, in theory, reduce requests and let returning visitors lean on their local cache more (see the same config sketch below). I of course added a programmatic way to bust that cache for when we change content or edit the JS or whatever.
  8. Clean up markup: While I was doing all the rest, I also cleaned up the markup somewhat – not a lot, but again, we were aiming to eliminate every extraneous byte.
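
To make item 4 concrete: the real switching happens server-side in our ColdFusion templates, but the idea boils down to picking an asset base URL by protocol. Here’s a minimal JavaScript sketch of the same idea – cdn.example.com is a hypothetical stand-in for the real CDN hostname:

// Pick an asset base depending on whether the page was served over SSL.
// 'cdn.example.com' is a placeholder, not the actual CDN we use.
var secure = document.location.protocol === 'https:';
var assetBase = secure
    ? '/assets/'                         // SSL: serve locally, since the CDN can't do SSL
    : 'http://cdn.example.com/assets/';  // plain HTTP: offload to the CDN

function assetUrl(path) {
    return assetBase + path;
}

// Example: load an image from whichever host applies.
var hero = new Image();
hero.src = assetUrl('img/hero-background.png');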
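
And for items 6 and 7, here’s roughly what that looks like on IIS 7 or later, where both can be set in web.config. A minimal sketch, not our exact configuration:

<!-- web.config sketch: gzip static assets and send far-future cache headers -->
<configuration>
  <system.webServer>
    <!-- compress static files (JS, CSS, HTML) -->
    <urlCompression doStaticCompression="true" doDynamicCompression="false" />
    <!-- tell clients to cache static content for a year -->
    <staticContent>
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="365.00:00:00" />
    </staticContent>
  </system.webServer>
</configuration>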

So, as you can see, we threw the kitchen sink at it to see what stuck. In retrospect, I wish I had made these updates one at a time to measure what sort of impact (if any) each one had; there are only a couple where I can see clear before-and-after differences, which I mentioned above. So, for everyone out there: which of these were the most effective?

  1. Caching Dynamic Content: Even on a CMS-driven site, most content doesn’t change constantly, and if you can eliminate trips to the DB server, that’s time saved. Even if you only cache a recordset for a few minutes, or just a few seconds, on a high-traffic site you can see some really impressive gains. We cache queries on this site for 10 days, but can selectively update specific queries when a user makes an edit in the CMS (see the sketch after this list) – sort of a best of both worlds. This somewhat depends on having a powerful server, but hosting hardware & memory are pretty cheap these days; there’s no reason not to make use of them.
  2. Crushing images: When building the site, I did my best to optimize file size as I exported out of Photoshop, but with a few hours in Fireworks I was able to essentially cut the size of the images in half with no real visual impact. A hat-tip to Dave Shea for the suggestion of using Fireworks.
  3. Pushing Content to a CDN: This is the head-smacking no-brainer; I don’t know why it wasn’t already part of our standard workflow on all sites. As I wrote above, we gained about 0.2 seconds by doing this – which doesn’t sound like a lot, but it’s noticeable in practice.
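
To sketch the caching idea from item 1 (in JavaScript purely for illustration – the real version is server-side, in our ColdFusion CMS): cache each recordset under a key, check the cache before touching the database, and expose a way to clear a single key when the CMS publishes a change. runQuery here is a hypothetical stand-in for the actual database call.

// A sketch of the query cache: 10-day lifetime, selective invalidation.
var queryCache = {};
var TEN_DAYS = 10 * 24 * 60 * 60 * 1000; // cache lifetime in ms

function cachedQuery(key, runQuery) {
    var now = new Date().getTime();
    var entry = queryCache[key];
    if (entry && (now - entry.storedAt) < TEN_DAYS) {
        return entry.result;    // cache hit: no trip to the DB server
    }
    var result = runQuery();    // cache miss: one trip to the DB
    queryCache[key] = { result: result, storedAt: now };
    return result;
}

// Called when a user edits content in the CMS, so stale results
// don't linger for the full ten days.
function clearCachedQuery(key) {
    delete queryCache[key];
}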

The nice thing about this exercise was that it shook up how I build sites, how our internal CMS runs, and how I configure both our IIS and Apache servers, so everything runs slightly more efficiently. I suspect that I could eke out a few more milliseconds by playing more with the server settings themselves, but I’m satisfied for now with how this worked out.

Measuring page load speed with Google Analytics

UPDATE: Google now has a built-in tool for doing this. Simply add the following: _gaq.push(['_trackPageLoadTime']); and hey presto: page-load-speed tracking. See this page for more info, including an important caveat.

With Google now including page load speed in their page-ranking algorithm, many of our clients started asking us how fast their sites load. There are lots of developer tools that can help with this – I alternately use the Google Page Speed tool and Yahoo!’s YSlow during development to tweak page load times. But this doesn’t help our clients very much.

There’s a chart you can use on Google Webmaster Tools, but overall, I don’t find Webmaster Tools particularly end-user friendly. As it’s name implies, I’m not sure that it is supposed to be. A couple of our more technical clients use it, but most don’t use it, or don’t really get it. The chart that Google Webmaster Tools spits out can be found under “Labs”, then “Site Performance”. It looks like this:

Google Webmaster Tools Page Speed Chart

Which is nice, and gives a clear delineation of what’s good and what’s not. As you can see, this site rides right on the edge of being “fast”. When a client asked if I could tell him how fast individual pages were loading, it gave me the idea of using Google Analytics’ Event Tracking feature to log how fast each page loads. What’s nice about this is that most of our clients “get” Analytics, and are used to going there to see their stats, so additional stats there work for them.

With Google’s visual guidelines in place, I set about making this happen. I decided to name the Event Category “PageLoad”. I then added 5 labels to group the results:

  1. Fast (less than 1500 ms), because Google, according to Webmaster Tools, considers anything 1.5s or faster “fast”
  2. Acceptable (less than 3000 ms)
  3. Middling (less than 5000 ms)
  4. Slow (less than 10000 ms)
  5. Unacceptable (10000 ms or longer)

These groupings are completely arbitrary – I could have set the thresholds anywhere. These just seemed reasonable to me, given that most of our sites average a load time of about 2800 ms, according to our internal tools.

So then I had to track it. The code is as follows:

var pageLoadStart = new Date();

window.onload = function() {
    var pageLoadEnd = new Date();
    var pageLoadTime = pageLoadEnd.getTime() - pageLoadStart.getTime();
    var loadStatus;
    // bucket the load time into the segments described above
    if (pageLoadTime < 1500)
        loadStatus = 'Fast (less than 1500 ms)';
    else if (pageLoadTime < 3000)
        loadStatus = 'Acceptable (less than 3000 ms)';
    else if (pageLoadTime < 5000)
        loadStatus = 'Middling (less than 5000 ms)';
    else if (pageLoadTime < 10000)
        loadStatus = 'Slow (less than 10000 ms)';
    else
        loadStatus = 'Unacceptable (10000 ms or longer)';
    // pass the page's address along so we can see which pages are slow
    var myPath = document.location.pathname;
    if (document.location.search)
        myPath += document.location.search;
    // round the time to the nearest 10 ms
    pageLoadTime = Math.round(pageLoadTime / 10) * 10;
    // send the GA event; the empty catch keeps a missing _gaq from breaking the page
    try {
        _gaq.push(['_trackEvent', 'PageLoad', loadStatus, myPath, pageLoadTime]);
    } catch (err) {}
};

Some Notes:

  • I have the first line (var pageLoadStart = new Date();) at the very top of my header, right after the <head> tag. The window.onload handler sits in the footer.
  • I round the values to the nearest 10 ms – but this might produce too many distinct values, too much noise. You could round to the nearest 100 ms, or even the nearest second, if that worked better for you.
  • The document.location.pathname bit just passes the page’s address to Google, so I can see which pages are slow, and when particular pages are slow.
  • You can only send integers as the Event Tracking value.
  • The _gaq.push() line is where I send this to Analytics. You might have your Analytics tied to a different variable, in which case change _gaq to whatever you use. For more on how event tracking works, see the docs.

What my client then sees in Google Analytics is a chart like this (note: I’m using the new version of Analytics – yours might look different):

PageLoad chart

As you can see (this is for the home page of the site), the results are ALL over the place. Which is troubling, but figuring out why it sometimes loads sooooo sloooowwwwlyyyyy is a problem for another day. (We’re currently running an A/B test in which a video either auto-plays or doesn’t. My best guess is that the page doesn’t finish loading until after the video starts to play, which would explain the slow loads. But maybe not.)

Regardless of the actual performance of this particular page, you can see how this is a nice little chart for clients – nothing too technical (I could make it even less technical by dropping the ms numbers from the labels), but it gives them some easy insight into how their site is doing.

Caveat: This chart obviously doesn’t capture anything that happens before the page starts rendering – connection speed, server-side scripts taking a long time, etc. But it’s still pretty useful, I think.
