Wednesday, February 20, 2013

A Comprehensive Guide to WebP

WebP (pronounced "weppy") is an ambitious and promising new image format that was created by a team at Google back in September of 2010, but browser adoption has been slow.  At the time of this writing the only major browsers that support WebP are Chrome and Opera, even though support in native applications is strong (albeit with plugins in many cases).  In this article I want to present an unbiased, holistic view of the WebP landscape as it stands today.  My end goal is to further the conversation about alternate image formats and image compression on the web, because I believe it is one of the biggest opportunities for making web browsing faster across all devices and networks.

Setting the Stage

The description that Google provides for WebP is concise and explicit:
WebP is a new image format that provides lossless and lossy compression for images on the web. WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25-34% smaller in size compared to JPEG images at equivalent SSIM index. WebP supports lossless transparency (also known as alpha channel) with just 22% additional bytes. Transparency is also supported with lossy compression and typically provides 3x smaller file sizes compared to PNG when lossy compression is acceptable for the red/green/blue color channels.
The link above also has more information about how WebP achieves those smaller image sizes, as well as the container and lossless bitstream specifications.  There have been a number of posts about real world situations where WebP has succeeded, a few of which are linked below.
WebP is a very different beast today than it was when it launched.  In May of 2011, WebP's compression algorithm got a tune-up, an upsampler was added to the decoder, and support was added to some key applications.  In October of 2011, Google announced WebP support for animation, ICC profiles, XMP metadata and tiling.  A month later lossless compression modes and an alpha channel (transparency) were added.  In August of 2012, version 0.2 of the WebP library was released, with improved support for the lossless and transparency compression modes.  WebP also has an active mailing list.
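
To make the container specification mentioned above a little more concrete, here is a minimal Python sketch that recognizes a WebP file from its first 16 bytes. WebP uses a RIFF container: the file starts with "RIFF", a 4-byte size, then "WEBP", followed by the first chunk's tag ("VP8 " for lossy, "VP8L" for lossless, "VP8X" for the extended format that carries features such as alpha, ICC profiles and animation). The function name `webp_kind` and the labels it returns are my own, chosen for illustration; this is not a replacement for a real decoder.

```python
from typing import Optional


def webp_kind(header: bytes) -> Optional[str]:
    """Classify a file from its first 16 bytes using the RIFF/WebP layout.

    Returns None when the bytes do not look like a WebP file at all.
    """
    # Bytes 0-3 are "RIFF", bytes 4-7 are a little-endian size (ignored here),
    # bytes 8-11 are "WEBP", and bytes 12-15 are the tag of the first chunk.
    if len(header) < 16 or header[0:4] != b"RIFF" or header[8:12] != b"WEBP":
        return None
    first_chunk = header[12:16]
    labels = {
        b"VP8 ": "lossy",      # simple lossy bitstream (note the trailing space)
        b"VP8L": "lossless",   # simple lossless bitstream
        b"VP8X": "extended",   # extended format: alpha, ICC, metadata, animation
    }
    return labels.get(first_chunk, "unknown")
```

In practice you would read the header with something like `open(path, "rb").read(16)` and pass the result in.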

There is no doubt that Google is committed to seeing WebP succeed, and development is ongoing.  Now let's look at the main benefits that WebP provides.

Benefits of WebP

The main reasons for supporting the use of WebP are as follows:
  • Smaller images - this one is obvious, and was explained in detail above.
  • It can replace both PNGs and JPEGs, so we could use one image format for most web content.
  • It is royalty-free, open source, and anyone can contribute to or implement the spec.
  • The project is backed by Google, and thus has a lot of resources behind it.
That all being said, WebP comes with some downsides, as we will see next.

Drawbacks of WebP

Jeff Muizelaar, a Mozilla employee, wrote a post back in April of 2011 outlining some complaints about WebP.  His list is below, with rebuttals where appropriate.
  • Google's original study used JPEGs as the source images, and looked at peak signal to noise ratio (PSNR) as a comparison metric, which is "accepted as a poor measure of visual quality".  Google did a second study using Structural Similarity (SSIM) vs. bits per pixel (bpp) that showed results similar to the first.  The second study makes this concern moot.
  • WebP lacks support for key image features, specifically EXIF data, ICC color profiles, alpha channel, and 4:4:4:4 YCrCb color representation.  As detailed above, ICC color profiles and alpha transparency were added in a later version of WebP.  I can't speak to EXIF data or 4:4:4:4 YCrCb color, but if you know more about this please let me know and I'll update this point. UPDATE: Vikas Arora from the WebP team let me know that WebP now supports EXIF and XMP metadata as well.
  • There is a cost associated with new image formats - developers have to decide which one to use and clients need to support it forever.   There are third party vendors (like Torbit) and open source modules (mod_pagespeed/ngx_pagespeed) that will automate this for you, so developers no longer have to do the work themselves.
  • The progressive decoding of WebP requires using a separate library, instead of reusing the existing WebM decoder.
There are some additional drawbacks not mentioned in Jeff's article:
  • During the transition period when support is partial, you have to keep duplicate images on your server, storing both a JPEG and a WebP version of each one.  This eats up storage space, and can make image retrieval slower (for example, if your working set used to fit in memory, and after doubling it you need to go to disk for some images).
  • CDNs and intermediate proxies have to cache both versions of the image, reducing cache hit ratio.
  • Currently the lack of context in Accept headers means that all responses that may contain WebP are marked as Cache-Control: private so the origin can do user agent detection (more on this below).
  • Rolling out new image formats is hard.
I want to go into those last two points a little more.  Ilya Grigorik wrote an excellent post about deploying new image formats on the web, and the challenges that it creates.  He brings up the Accept header point in the list above - the fact that Accept headers are not providing an enumerated list of the image formats that the client supports causes major issues for adoption.  This makes rolling out WebP more difficult, but these are problems that we can fix, as Ilya says in his article.  I don't want to rehash all of that here, so I recommend reading his post before continuing.
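
To illustrate why the Accept header matters, here is a rough Python sketch of what server-side negotiation could look like if clients did advertise "image/webp" in their Accept header. This is an assumption about a future client, not a description of how browsers behave today. The function name `choose_image_variant` and the cache lifetime are hypothetical, and the parsing is deliberately naive (it ignores q-values, so "image/webp;q=0" would wrongly count as supported).

```python
def choose_image_variant(accept_header: str, has_webp_variant: bool) -> dict:
    """Pick an image format from the Accept header and return response headers.

    Assumes the origin stores a JPEG for every image and a WebP for some.
    """
    # Split "image/webp,*/*;q=0.8" into bare media types, dropping parameters.
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    client_wants_webp = "image/webp" in media_types

    if client_wants_webp and has_webp_variant:
        content_type = "image/webp"
    else:
        content_type = "image/jpeg"

    return {
        "Content-Type": content_type,
        # Vary tells shared caches to key on the Accept header, so the
        # response can stay publicly cacheable instead of being private.
        "Vary": "Accept",
        "Cache-Control": "public, max-age=86400",  # illustrative lifetime
    }
```

The point of the sketch is the "Vary: Accept" line: once the format decision depends only on a request header, intermediate caches can store both variants safely, and the origin no longer needs user agent detection.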

I also corresponded via email with Jason Weber, the Performance Lead on the Internet Explorer team, and he brought up another challenge with WebP - the time it takes for the downloaded image to actually appear on the screen.  His description of the issue is both thorough and easy to understand, so I am going to simply quote him here:
There are many dimensions to consider when thinking about the performance implications of image formats. Most people gravitate toward quality (subjective) and compression sizes (performance). Both are interesting, but the elapsed time required to download the image is only part of the problem. After the image arrives on the local system there are several steps before it’s seen on the screen. The exact steps depend on the browser, operating system, and hardware. The basic pipeline looks something like this:
  1. System memory is allocated and the encoded image is stored in this memory. 
  2. Additional system memory is allocated, and the image is decoded at native resolution into this intermediary.
  3. The native image may need to be converted into another intermediary format.
  4. The appropriate intermediary is copied to the GPU (on hardware accelerated scenarios).
  5. Finally, the image is drawn to the screen.
On mobile caliber devices it’s not uncommon to see these operations take several 100ms. Some browsers attempt to do this work in parallel, so it may not directly block execution on the primary thread, but the elapsed time of this process impacts what are often referred to as TTG (time to glass) metrics.  
To improve TTG on the client you need to streamline this process. GPU’s are massively parallel and highly specialized silicon which can perform graphics operations more efficiently than the general purpose CPU silicon. Some operations can be performed 1000x more quickly through the GPU. There’s broad hardware support for traditional image formats like jpg in hardware today which speedup this process, and these opportunities don’t exist for WebP. 
For WebP to make the web faster the elapsed time downloading the images (from fewer bytes) needs to be greater than the additional cost of getting the image to the screen (TTG). That’s not the case in the real world today. And over time, other image formats including jpg have more headroom to improve. Modern hardware is amazing!
This is a point that few people talk about, and it's great that Jason was willing to explain it.  If you spend more incremental time getting an image actually displayed to the screen than you save by downloading fewer bytes, that image has actually made the web slower.

I corresponded with Ilya about this issue, and he said that while WebP decoding is about 1.4x slower than JPEG, this difference is largely overwhelmed by the speed improvement that comes from sending fewer bytes across the wire.  Some additional comments from him are below.
First off, to give the IE team credit, I do think they are ahead of most other browser vendors when it comes to their GPU + mobile browser story -- in fact, probably significantly so. I think Chrome and FF will close this gap in the not so distant future, but credit where credit's due. Now, having said that, I don't know the specific details of Jason's test, but my guess is, he's testing image intensive apps, with good network connectivity (3.9/4G), on latest Nokia hardware, and with minimum/no concern for bandwidth caps... That's fair, but that's not representative. 
Most users are stuck on non HSPA+ connections, with middle of the road hardware, and are using an older (year or more) WebKit build. This profile, compared to what he's [likely] analyzing, is night and day. In fact, in markets where mobile browsing is already dominant (India), bandwidth is the #1 concern - bits are expensive, literally. Hence, I take any argument about GPU performance with a grain of salt. Yes, it is definitely an important metric, and one that all of us will likely pay much closer attention to in the future, but it is not the limiting factor today for the vast majority of the population.

This is where WebP helps and why we're investing all the effort into it... Not to say that WebP is the answer to all things. It just happens to be the best option at this very moment. I hope we have more image formats, with even better compression and better "time to glass" in the future! These requirements are not entirely exclusive.
There is an important distinction to mention here: when we are talking about hardware accelerated browsers where image decoding happens on the GPU, lack of hardware support for WebP can cause a significant increase in TTG.  If we look at browsers that do image decoding on the CPU, the time difference is not as large (~1.4x as Ilya mentioned).  In the former case, it is likely that in the real world the increase in TTG overwhelms the decrease in download time from WebP in at least some, and possibly most cases.  In the latter case, the decrease in download time likely dominates the difference in decoding time in at least some, and possibly most cases.  We clearly need more hard data to be published in this area to further inform this debate, and I will update this post as studies are published.
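
The tradeoff above can be written down as a back-of-the-envelope model: time saved on the wire minus the extra time spent decoding. The Python sketch below is only that, a model. The 1.4x decode slowdown comes from Ilya's comment; every other number in the example (image size, savings, bandwidth, decode times, and the 20x slowdown used for the hardware-accelerated case) is a made-up input for illustration, not a measurement.

```python
def net_time_saved_ms(
    jpeg_bytes: float,
    webp_savings: float,
    bandwidth_bps: float,
    jpeg_decode_ms: float,
    decode_slowdown: float = 1.4,
) -> float:
    """Return download time saved minus extra decode time, in milliseconds.

    A positive result means WebP reaches the screen sooner in this model;
    a negative result means the extra decode cost outweighs the byte savings.
    """
    bytes_saved = jpeg_bytes * webp_savings
    download_saved_ms = bytes_saved * 8 / bandwidth_bps * 1000
    extra_decode_ms = jpeg_decode_ms * (decode_slowdown - 1)
    return download_saved_ms - extra_decode_ms


# Hypothetical CPU-decoding case: 100 KB JPEG, 30% smaller as WebP,
# 1 Mbps link, 20 ms JPEG decode, 1.4x slower WebP decode.
slow_network = net_time_saved_ms(100_000, 0.30, 1_000_000, 20)  # about +232 ms

# Hypothetical GPU case: fast 20 Mbps link, 5 ms hardware JPEG decode,
# and an assumed 20x penalty for decoding WebP without hardware support.
fast_network = net_time_saved_ms(100_000, 0.30, 20_000_000, 5, 20)  # about -83 ms
```

The model ignores latency, parallel decoding and memory copies, so treat the sign of the result as the interesting part rather than the exact number.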

I'll also reiterate that if you take speed out of the equation and just talk about the cost of the bytes on the wire (due to bandwidth caps), WebP provides significant and unambiguous benefits.  

Additional Discussion About WebP

There has been a lot of talk about support for WebP in different browsers, and this bug that was filed to add it to Firefox has almost 200 comments at the time of this writing.  In comment 27 a Mozilla employee indicates that support will not be coming to Firefox.  In comment 146 he elaborates, stating that the main thing that would change his mind is support in IE and/or Safari.  A code patch that adds WebP support to Firefox has already been attached to this bug, so at this point the only thing Mozilla needs to do is hit accept.

There is also a JavaScript shim that adds WebP support to all WebM compatible browsers (which includes Firefox), but this kind of shim is a temporary, nasty thing, and you incur an extra blocking JavaScript request to implement it.  For WebP to truly succeed we need native support in all major browsers.  

Conclusion

According to the HTTP Archive, over 60% of the bytes on the web come from images.  Image transfer size also has a high correlation with load time.  This means that images are a great place to look for improvements in performance.  In addition, total transfer size of web pages has been steadily increasing over the last few years, with no sign of stopping.  At the same time, average bandwidth and latency for web users have been getting worse, as more and more people use mobile devices (to be clear, bandwidth in the home has been improving, but the switch to mobile networks offsets this improvement). We need to find some way to reverse this trend, or the web will continue to get slower.  WebP is a potential answer to this problem, or at least one of the potential answers.

That being said, there are significant challenges that WebP still has to overcome, the largest of which is finding an easy way for companies with a large, dynamic working set of images to deploy WebP.  Fixing Accept headers is the first step here, and the Chrome team is currently working on this.  As discussed above, a number of automated optimization services will handle the work of converting images to WebP and serving them to the right people, but we still need to fix Accept headers so we can make these products (and intermediate caches) more effective.

If we are able to overcome these challenges, here are some rough numbers that gauge the size of the opportunity.  When Google Instant launched, they claimed that if everyone in the world used it we would save more than 3.5 billion seconds a day, or 11 hours saved every second.  This works out to about 450 human work years (assuming 40 hour work weeks) of time per day.  If you think about the time savings we could get by reducing the cost of delivering and rendering images on the web, you can very quickly see that we would save many times that.  This means that billions of dollars worth of human time could be saved each year by finding a solution to this problem that works universally.
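
For anyone who wants to check the arithmetic behind those figures, here is a small Python sketch. The 3.5 billion seconds per day is the number quoted above; the 52 working weeks per year is my assumption, and with it the result lands a little above the "about 450" figure (closer to 467), which is the same ballpark.

```python
SECONDS_SAVED_PER_DAY = 3.5e9          # figure quoted for Google Instant
SECONDS_PER_DAY = 24 * 60 * 60         # 86,400
HOURS_PER_WORK_WEEK = 40               # stated in the text
WORK_WEEKS_PER_YEAR = 52               # assumption: no vacation or holidays

# Seconds saved for every second of wall-clock time, expressed in hours.
hours_saved_per_second = SECONDS_SAVED_PER_DAY / SECONDS_PER_DAY / 3600

# Total hours saved per day, divided by the hours in one work year.
hours_saved_per_day = SECONDS_SAVED_PER_DAY / 3600
work_year_hours = HOURS_PER_WORK_WEEK * WORK_WEEKS_PER_YEAR
work_years_saved_per_day = hours_saved_per_day / work_year_hours

print(f"{hours_saved_per_second:.2f} hours saved every second")
print(f"{work_years_saved_per_day:.0f} work years saved per day")
```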

WebP also provides a tremendous amount of value on mobile devices with bandwidth caps.  As I said above, over 60% of the bytes served on the Web come from images, and for many mobile sites/apps that number is going to be much higher.  Take any mobile shopping app for example: once you have the app on your phone, you only have to download a few bytes of JSON data and images for all of the products you are looking at.  WebP is a huge win here.  Even if the increased decoding time offsets some of the improvement in download time, people are still saving money on bandwidth usage.

In addition to talking to Jason and Ilya, I've reached out to Mozilla to get more opinions about why adoption has been slow, and I will update this post as I get more feedback.  If you have more examples of WebP studies, additional benefits/drawbacks, or information about the political and technical reasons that are blocking adoption, please post in the comments.  If you prefer to contact me anonymously, you can email me directly at jonathan.n.klein@gmail.com.

Improving the web is difficult, but in the words of Ilya Grigorik, "some uphill battles are worth fighting - I think this a good one."