Aug 28, 2008

Codecs Overview: How codecs work

We know that encoded files are much smaller than raw media files; the question is how encoders achieve this file size reduction, and why the quality suffers.

At the heart of all encoding software lies the codec. Codec is a contraction of coder-decoder (or compressor-decompressor), and is the software algorithm that determines how to shrink a file to a usable size. You're probably already familiar with a number of codecs, though you may not be aware of it. For example, most digital cameras take pictures that are compressed with the JPEG codec. If you've ever used a photo-editing program to reduce the size or quality of your photos before you put them online, you've been adjusting the parameters of the JPEG codec. StuffIt and WinZip use codecs to compress files before they're sent across the Internet or put on installation CDs.

There's a key difference, however, between the JPEG codec used to compress photos and the codecs used to compress documents. Codecs used to compress documents must be lossless. If someone sends you a spreadsheet that has been compressed, the data must be exactly the same after it is decompressed as it was before the compression. Codecs such as JPEG, however, are known as lossy codecs, because some of the original information is lost during the compression. The original cannot be recreated from the compressed version of the file. Lossy codecs operate under the assumption that the lost quality either goes unnoticed by the end user or is an acceptable compromise for the situation.
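To make the lossless guarantee concrete, here is a minimal sketch using Python's standard zlib module, which implements the same general family of compression used by ZIP tools. The spreadsheet-style data is made up for illustration; the point is only that the round trip returns exactly the bytes that went in.

```python
import zlib

# Hypothetical spreadsheet-like data, invented for this example.
# Repetitive text like this compresses well.
original = ("region,units,price\n" + "north,100,9.99\n" * 500).encode("utf-8")

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # exact reconstruction

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

A lossy codec such as JPEG offers no equivalent of that final comparison: the decoded image only approximates the original.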

Web sites are a perfect example. Having lots of imagery on a Web site is great, but if the images were all 5 MB originals, each page would take forever to load. Because browsing the Internet should be a rapid, seamless experience, and because we sit so close to our monitors, the amount of detail required in a Web site image is much less than what is required for a printed page, so the image can be compressed heavily using the JPEG codec, and our experience isn't overly compromised.

The same holds true for podcasts. While it might be nice to have 256 kbps near-CD-quality podcasts, the reality is that 128 kbps offers more than enough quality, and in fact 64 kbps might be plenty, particularly if you're not using the MP3 codec. As you reduce the bit rate of your podcast, the quality is also reduced, because the codec must discard more of the original information.
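The trade-off is easy to put in numbers, because bit rate translates directly into file size. The sketch below assumes a constant bit rate, a hypothetical 30-minute episode, and 1 MB = 1,000,000 bytes; the function name is made up for this example.

```python
def audio_size_mb(bit_rate_kbps, minutes):
    """Approximate file size in MB for a constant bit rate audio stream."""
    bits = bit_rate_kbps * 1000 * minutes * 60  # kilobits/sec -> total bits
    return bits / 8 / 1_000_000                 # bits -> bytes -> megabytes

# Compare the three bit rates mentioned above for a 30-minute episode.
for rate in (256, 128, 64):
    print(f"{rate} kbps, 30 min: {audio_size_mb(rate, 30):.1f} MB")
```

Halving the bit rate halves the download, which is why it pays to find the lowest rate that still sounds acceptable.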

Codecs try to maintain as much fidelity as possible during the encoding process, but at low bit rates something has to give. There simply isn't enough data to reproduce the original high fidelity. Given the complexity of the task, they actually do an amazing job. They're able to do as well as they do because they make use of perceptual models that help them determine what we actually perceive, as opposed to everything that physically reaches our ears. The difference is subtle, but key to modern codec efficiency. Before we talk about perceptual encoding techniques, let's talk a bit about basic codec technologies.

How codecs work
Codecs reduce file sizes by taking advantage of the repeated information in digital files. Lots of information is repeated. For example, a video that has been letterboxed (black stripes on the top and bottom) has lots and lots of black pixels. This results in lots and lots of zeros, all in a row. Instead of storing a thousand zeros, you could store "1000 × 0," which is only six characters. That's a significant savings. Better still, you can reconstruct an exact copy of the original from the information you have stored. This technique is known as run-length encoding.
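The "1000 × 0" idea can be sketched in a few lines of Python. This is an illustration only, not how any particular video codec stores its data; the function names and the sample row of pixels are made up for the example.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical byte values into a (count, value) pair."""
    return [(len(list(group)), value) for value, group in groupby(data)]

def rle_decode(runs):
    """Expand (count, value) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for count, value in runs)

# Hypothetical row of pixels: black bars (zeros) around a few bright pixels.
row = bytes(1000) + bytes([200, 180, 200]) + bytes(1000)

runs = rle_encode(row)
print(runs)                     # five pairs instead of 2,003 byte values
print(rle_decode(runs) == row)  # the original is recovered exactly
```

Notice that the three bright pixels in the middle each become their own pair, so data without long runs gains nothing from this scheme and can even grow.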

Another way of encoding is to substitute a short code for commonly occurring combinations of characters. For example, you could make this book smaller by replacing every instance of the word "podcasting" with "p." (provided the decoder can tell that "p" apart from an ordinary letter). This wouldn't save that much space, though, and that's the problem with lossless encoding. You can achieve some file size reduction, but typically not enough for our needs. To go further, you need perceptual encoding.
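Here is a rough sketch of that substitution idea. The sample sentence and the choice of token are assumptions made for the example; the one real requirement is that the token never appears in the original text, otherwise decoding would be ambiguous.

```python
# Hypothetical sample text, invented for this example.
text = ("Most podcasting tools are free, and podcasting gear is cheap, "
        "so podcasting is open to almost anyone.")

WORD = "podcasting"
TOKEN = "\x01"  # a control character, assumed never to occur in normal prose
assert TOKEN not in text

encoded = text.replace(WORD, TOKEN)     # substitute the short code
decoded = encoded.replace(TOKEN, WORD)  # reverse the substitution

saved = len(text) - len(encoded)
print(f"{len(text)} -> {len(encoded)} characters ({saved} saved)")
print("lossless:", decoded == text)
```

Each occurrence saves only nine characters, so even a word repeated throughout a whole book shrinks the file by a small fraction.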
