In bed with CSSEmbed – using data URIs and mhtml to reduce HTTP requests at paulcarvill.com, the home of Paul Carvill on the web

paulcarvill.com

Hi, I'm Paul Carvill, I'm a web developer. I'm currently working as Technical Lead at LBi, Europe's largest digital agency.

I also like walking, cooking, Bollywood and rock 'n' roll.

In bed with CSSEmbed – using data URIs and mhtml to reduce HTTP requests

posted: Friday, February 12th, 2010 at 12:27 pm

My team at LBi have added a pimped version of CSSEmbed to the continuous integration process of the project we’re working on. CSSEmbed is a tool by Yahoo’s Nicholas Zakas that takes CSS files as input, searches them for image URLs, locates each image file via its URL or file path, converts it to base64-encoded data and embeds that data back into the stylesheet as a data URI, replacing the original URL. This reduces the number of HTTP requests the browser has to make to load the images a page requires: every embedded image arrives in the same request as the stylesheet. Because the number of concurrent HTTP requests a browser can make is limited, fewer requests means less queuing, which improves the chances of all assets downloading quickly.
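To make the process concrete, here is a minimal Python sketch of the same idea. It is not CSSEmbed's code (CSSEmbed is a Java tool); the function names, the regular expression and the decision to skip remote and missing images are all assumptions made for illustration.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches url(...) with optional single or double quotes around the path.
URL_PATTERN = re.compile(r"""url\(\s*(['"]?)([^'")]+)\1\s*\)""")


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def embed_images(css: str, root: Path) -> str:
    """Replace each local image URL in `css` with an equivalent data URI."""

    def replace(match: re.Match) -> str:
        url = match.group(2).strip()
        if url.startswith(("data:", "http:", "https:", "//")):
            return match.group(0)  # already embedded, or remote: leave alone
        path = root / url.lstrip("/")
        if not path.is_file():
            return match.group(0)  # missing file: keep the original URL
        mime_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        return f"url({to_data_uri(path.read_bytes(), mime_type)})"

    return URL_PATTERN.sub(replace, css)
```

Calling `embed_images(css_text, Path("build"))` returns the stylesheet with every image it could find under `build` inlined; anything it cannot resolve is left untouched, so the output is never worse than the input.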

Performance

In practice, getting the best performance out of CSSEmbed is something of a balancing act, and takes some trial and error to get right. The CSS file bloats pretty quickly once you start embedding even relatively small images in it, not least because base64 encoding makes each image roughly a third larger than the original file. My first attempt increased one CSS file from around 80KB to almost 900KB! Now, while this file only takes one HTTP request to fetch, it also ties up that connection for the entire time it takes to come down the wire. After some experimentation we found the sweet spot was to embed only images smaller than 1KB, which gave a much better ratio of added CSS file size to HTTP requests saved.
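The trade-off is easy to put numbers on. The sketch below is an illustration rather than anything taken from CSSEmbed: the function names are made up, and it assumes "1KB" means 1024 bytes and that the limit is exclusive. It shows how many characters a base64-encoded image adds to the stylesheet and which images would pass a size limit.

```python
import math


def base64_length(raw_bytes: int) -> int:
    """Length in characters of the base64 encoding of `raw_bytes` bytes."""
    # Base64 turns every group of 3 bytes (padded if needed) into 4 characters.
    return 4 * math.ceil(raw_bytes / 3)


def should_embed(raw_bytes: int, max_kb: float = 1.0) -> bool:
    """Embed only images below the size limit (assumed: 1KB = 1024 bytes)."""
    return raw_bytes < max_kb * 1024


for size in (300, 1000, 20_000):
    growth = base64_length(size) - size
    print(f"{size} bytes: embed={should_embed(size)}, extra characters={growth}")
```

A 300 byte icon costs an extra 100 characters in the stylesheet and saves a whole request; a 20,000 byte photo costs several thousand and saves exactly the same one request, which is why a small limit pays off.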

We’ve tested this using PNG, GIF and JPEG in Firefox, Safari, Chrome and IE versions 6, 7 and 8.

Comparison

Here’s a comparison for a page I picked at random from our current project, with and without data URIs, and with and without gzipping of the transfers:

Configuration                       Page size (KB)  HTTP requests  Avg cold cache download time (ms)  Avg warm cache download time (ms)
Using data URIs                     540             45             792.30                             726.60
Using data URIs, gzipped            235             45             802.00                             748.30
Using original css/images, gzipped  194             56             882.60                             771.00
Using original css/images           483             56             849.00                             773.60

That’s 11 fewer HTTP requests (45 instead of 56), a reduction of almost 20%. Pretty impressive!
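For anyone who wants to check the arithmetic, here it is spelled out with the figures from the comparison table. The variable names are just for illustration.

```python
# Figures copied from the comparison table above.
requests_before, requests_after = 56, 45

saved = requests_before - requests_after
reduction = saved / requests_before * 100

print(f"{saved} fewer requests ({reduction:.1f}% reduction)")
# prints "11 fewer requests (19.6% reduction)"

# The price paid: the gzipped page is heavier with data URIs embedded.
gzipped_before_kb, gzipped_after_kb = 194, 235
print(f"{gzipped_after_kb - gzipped_before_kb}KB extra when gzipped")
# prints "41KB extra when gzipped"
```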

Pimping CSSEmbed

With some great help from my fellow LBi-ers Andrew and Raul, we amended CSSEmbed to take more arguments than the standard version, including a maximum image size to convert to data, a .txt file listing the CSS files to convert, a file extension to append to the output files and a directory to output to.

Continuous integration

Here’s a quick list of the steps we took to automate our CSSEmbed process:

1. Create a manifest of the stylesheets you want to embed image data into
2. Add an ant target that executes the cssembed jar twice: once to create files containing data URIs (for modern browsers) and again to create mhtml files (for older versions of IE). Supply argument lines as necessary. Our ant targets do the following:
a. Supply root URLs to be prepended to relative URLs in the stylesheet, so the cssembed jar can locate the image files
b. Specify output directories: /datauri and /mhtml, respectively
c. Specify a file extension for the mhtml files: mht
d. Specify the location of the manifest .txt file
3. Amend your page’s head to serve the appropriate stylesheets to each browser

Here’s one of our ant targets:

<java jar="tools/cssembed-0.3.5.jar" fork="true">
     <arg line="--root '${deploy.home}/${project}-build/'" />
     <arg line="-o '${deploy.home}/${project}-build/assets/css/mhtml'" />
     <arg line="--maxsize 1"/>
     <arg line="--mhtml --mhtmlroot http://${project}.build/assets/css/mhtml/" />
     <arg line="--outextension mht" />
     <arg line="'tools/css-manifest.txt'"/>
</java>
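The target above is the mhtml run. As a rough sketch of how the two runs differ, the Python below builds the argument list for each one without executing anything. The flags are the ones shown in the ant target, which belong to our amended jar rather than the standard CSSEmbed; the function name, the example paths, and the assumption that the data URI run takes the same --root, -o and --maxsize arguments are mine.

```python
from typing import List, Optional


def cssembed_args(build_dir: str, out_dir: str, manifest: str,
                  max_kb: int = 1, mhtml_root: Optional[str] = None) -> List[str]:
    """Build the argument list for one run of the (amended) cssembed jar."""
    args = ["java", "-jar", "tools/cssembed-0.3.5.jar",
            "--root", build_dir,
            "-o", out_dir,
            "--maxsize", str(max_kb)]
    if mhtml_root is not None:
        # Second run: mhtml output for older IE, with its own file extension.
        args += ["--mhtml", "--mhtmlroot", mhtml_root, "--outextension", "mht"]
    args.append(manifest)
    return args


build = "/deploy/myproject-build"  # hypothetical build directory
manifest = "tools/css-manifest.txt"

datauri_run = cssembed_args(build, f"{build}/assets/css/datauri", manifest)
mhtml_run = cssembed_args(build, f"{build}/assets/css/mhtml", manifest,
                          mhtml_root="http://myproject.build/assets/css/mhtml/")

print(" ".join(datauri_run))
print(" ".join(mhtml_run))
```

Keeping the two runs behind one function makes it obvious that they share the manifest and the size limit, and differ only in output directory and the mhtml-specific flags.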

A brief word on mhtml

The reason we output the mhtml files with a different file extension and put them in their own directory is to work around some presentation bugs in IE7, specifically IE7 on Windows Vista. IE7 on Vista doesn’t seem to like parsing PNG files as mhtml data unless the data is hosted in a separate file and served up as Content-Type: text/plain. Putting the mhtml files in a separate directory with their own file extension means our client can use Apache directives either to serve files with that extension as text/plain, or to serve any files from a particular directory as text/plain. It gives them the option and adds no performance overhead at all.
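If you haven't seen mhtml used this way before, the following Python sketch shows the general shape of the technique: a multipart document of base64 parts, plus CSS rules that point at those parts by name using an absolute mhtml: URL. This is my understanding of the general MHTML-in-CSS approach, not a dump of what CSSEmbed emits, so the exact layout it produces may differ. The boundary string, function names and part names are all made up for illustration.

```python
import base64
from typing import Dict

BOUNDARY = "_SEPARATOR"  # assumed: any string that does not occur in the data


def mhtml_header(images: Dict[str, bytes]) -> str:
    """Build a CSS comment holding a multipart document of base64 parts."""
    lines = ["/*", f'Content-Type: multipart/related; boundary="{BOUNDARY}"', ""]
    for name, data in images.items():
        lines += [f"--{BOUNDARY}",
                  f"Content-Location:{name}",
                  "Content-Transfer-Encoding:base64",
                  "",
                  base64.b64encode(data).decode("ascii")]
    lines += [f"--{BOUNDARY}--", "*/"]  # closing boundary, then end of comment
    return "\n".join(lines)


def mhtml_url(stylesheet_url: str, name: str) -> str:
    """Reference one embedded part; the stylesheet URL must be absolute."""
    return f"url(mhtml:{stylesheet_url}!{name})"


print(mhtml_header({"dot": b"GIF89a"}))
print(mhtml_url("http://example.build/assets/css/mhtml/styles.mht", "dot"))
```

The need for an absolute URL in every reference is why our ant target passes the site root in via --mhtmlroot.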

Serving it all up

Finally, with some clever use of conditional comments, we can produce valid markup that will let each browser download just those files it needs to. Here’s how to do it:

* Stylesheets for all browsers go here
<link rel="stylesheet" href="/path/to/styles.css" />

<!--[if gt IE 7]><!-->
 * The datauri stylesheets go here.
 * Modern browsers and IE > 7 will see them.
 *
 * Note that the opening tag of the
 * conditional comment above is fully
 * enclosed in an HTML comment.
 *
 * All IE browsers read the conditional
 * comment and act on it, so of the IE
 * versions only IE 8 sees these links

<link rel="stylesheet" href="/datauri/styles.css" />

 * Note again that the closing tag of the
 * conditional comment below is fully
 * enclosed in an HTML comment

<!--<![endif]-->

<!--[if lt IE 8]>
 * The mhtml stylesheets go here.
 *
 * Only IE < 8 will see them.
 *
 * This whole section is inside an HTML
 * comment, so only IE browsers read the
 * conditional comment and act on it.
 *
 * Only IE < 8 sees these links

<link rel="stylesheet" href="/mhtml/styles.mht" />

 * Note the closing HTML comment tag below.

<![endif]-->

<!--[if lt IE 7]>
 * Here's an example of another conditional
 * comment, for supplying IE 6 stylesheets
 *
 * This is nothing to do with cssembed, it's
 * just here for illustration

<link rel="stylesheet" href="/ie6styles.css" />

<![endif]-->

Now whenever our build script runs, our CSS files are checked for image URLs and new CSS files are created with image data embedded. At runtime each browser downloads the appropriate stylesheets using almost 20% fewer requests than before. This means the page is ready faster than it was before we used data URIs, and we've freed up bandwidth for other activity to take place.
