Dec 21, 2008

Video Encoding | Learn Podcasting

Video encoding is more involved than audio encoding. Raw video files are much larger than raw audio files, so a much greater degree of compression is required. Modern video codecs have also benefited from fierce competition among the leading streaming media platforms and the MPEG organization. Most encoding software packages include a number of presets that produce acceptable video quality. If you want to improve your quality by tweaking the encoding parameters, this section explains the basic options available to you and how they affect video quality.

Screen resolution
The most important decision you're going to make about your video podcast is what resolution (or screen size) you're targeting. This is largely determined by the bit rates you're targeting, which in turn are determined by your audience. The higher the bit rate, the larger your resolution can be.

Most encoding software programs let you specify any screen resolution you want. You can specify that you want the full 640×480 frame encoded at 100 kbps, and the encoder does the best job it can. What you end up with is a large screen full of blurry blocks moving around, because 100 kbps simply isn't enough to encode a resolution that large.
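One rough way to see why this fails is to work out how many bits the encoder has to spend on each pixel. The following is only a back-of-the-envelope sketch: the function name `bits_per_pixel` is made up for illustration, and it assumes 30 frames per second and treats 1 kbps as 1,000 bits per second.

```python
def bits_per_pixel(bit_rate_kbps: float, width: int, height: int,
                   fps: float = 30.0) -> float:
    """Average bits available per pixel per frame.

    Assumes 1 kbps = 1,000 bits per second and a constant frame rate.
    """
    pixels_per_second = width * height * fps
    return bit_rate_kbps * 1000 / pixels_per_second


# Same 100 kbps budget, two different screen sizes.
full_frame = bits_per_pixel(100, 640, 480)
quarter_frame = bits_per_pixel(100, 320, 240)

print(f"640x480 at 100 kbps: {full_frame:.4f} bits per pixel")
print(f"320x240 at 100 kbps: {quarter_frame:.4f} bits per pixel")
```

Halving both dimensions quarters the pixel count, so the same bit rate gives each pixel four times as many bits. The figure is only an average and says nothing about how a particular codec actually distributes its bits.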

Table 1 lists some common video bit rates and suggested screen resolutions. Note that the resolution is largely dependent on the content of your video. If you have lots of motion in your podcast, you have to use either a higher bit rate or a smaller resolution to achieve acceptable video quality. If your podcast is relatively static and you filmed using a tripod, you may be able to try slightly larger screen sizes.

Frame rate
Another parameter you can adjust is the frame rate. NTSC video is shot at 30 frames (actually 60 fields) per second. However, for low-action content, you may be able to get away with a lower frame rate. For example, interview footage often looks just fine at 15 frames per second. High-action content requires the full frame rate for smooth motion.

Adjusting the frame rate affects the overall clarity of the video, because no matter how many frames per second you're encoding, you always have a fixed bit rate at which to encode. If you're encoding at 300 kbps, you can spread those 300 kilobits over 30 frames or 15 frames. Obviously, if you're only encoding 15 frames instead of 30, you can dedicate more bits per frame, and the result is a higher-quality frame. However, this may be a false economy for high-action content.
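The arithmetic behind that trade-off can be sketched in a few lines. The function name `kilobits_per_frame` is hypothetical, and the result is only an average: real encoders spend far more bits on key frames than on difference frames.

```python
def kilobits_per_frame(bit_rate_kbps: float, fps: float) -> float:
    """Average kilobits available for each encoded frame."""
    return bit_rate_kbps / fps


# The same 300 kbps budget spread over 30 or 15 frames each second.
for fps in (30, 15):
    average = kilobits_per_frame(300, fps)
    print(f"{fps} fps: {average:.1f} kilobits per frame on average")
```

At 300 kbps, halving the frame rate from 30 to 15 doubles the average budget from 10 to 20 kilobits per frame.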

Remember how video encoding is done. First, a key frame is encoded, which is followed by a number of difference frames. High-action content has lots of motion and, therefore, lots of difference from frame to frame. If you drop the frame rate to try to economize, the encoder drops frames and doesn't encode them. There are certainly fewer frames to encode, but the differences between them are greater! This is illustrated in Figure 1.

Figure 1: When encoding at a reduced frame rate, the increased differences between frames may negate the gains of encoding fewer frames.

If your programming has very little motion in it, such as talking head content, you may see an improvement in quality by dropping the frame rate. However, if you have lots of action in your podcast, leave the frame rate as is. To get higher video quality, you'll have to either encode at a higher bit rate or reduce your screen resolution.

Note The frame rate of NTSC video is actually 29.97 frames per second, although 30 fps is often used as shorthand.

Bit rate
Along with the screen resolution, the other important choice you have to make is the bit rate of your podcast. The bit rate determines the quality and the file size, and it's the gating factor for the resolution. The bit rate you choose is determined to some extent by your audience and the length of your podcast. The idea is that you don't want your audience to have to wait forever to watch your podcast. If the podcast is being downloaded in the background by an aggregator, then this isn't an issue. But many video podcasts are watched on Web pages. The user clicks a link and expects to see something in a reasonable amount of time.

Because most podcasters host their podcasts on a Web server, most video podcasts are progressively downloaded. Progressively downloaded videos have to preload a bit before they start playing back. The amount of preload is determined by the embedded player. The player knows how big the video file is and calculates how long the video will take to download. The player also knows how long the video is and tries to figure out how soon it can start playing the video so that by the time the playback reaches the end, all the video will have downloaded.

Let's say you've encoded your video at 300 kbps, which is a pretty standard bit rate. Most broadband connections can sustain this bit rate consistently, so as the video starts downloading, the player realizes in a few seconds that the data is being downloaded fast enough to begin playback. The total wait time for your audience is minimal.

Now let's take the same podcast and encode it at 500 kbps. Let's say it's a 1-minute podcast. The total file size is going to be approximately 30,000 kilobits (we'll stay in the world of bits because the math is easier). If the user's connection can sustain 300 kbps, it takes 100 seconds to download the entire file. If the file is 60 seconds long, that means the user has to wait 40 seconds before playback can begin. This is probably a little excessive, unless your audience is very dedicated. And this is if your podcast is only a minute long. For each additional minute, the audience is forced to wait an additional 40 seconds. This can get out of hand quickly.
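The wait-time arithmetic above can be written as a small function. This is a simplified model, not how any particular embedded player works: it assumes a constant connection speed and that playback may start as soon as the remaining download can finish before the video ends. The name `preload_wait_seconds` is made up for illustration.

```python
def preload_wait_seconds(encoded_kbps: float, duration_s: float,
                         connection_kbps: float) -> float:
    """Seconds the viewer waits before playback can run without stalling."""
    file_size_kilobits = encoded_kbps * duration_s
    download_s = file_size_kilobits / connection_kbps
    # If the file downloads faster than it plays, there is no wait at all.
    return max(0.0, download_s - duration_s)


# A 1-minute podcast at 500 kbps over a 300 kbps connection.
print(preload_wait_seconds(500, 60, 300))

# The same podcast at 300 kbps, and a 5-minute version at 500 kbps.
print(preload_wait_seconds(300, 60, 300))
print(preload_wait_seconds(500, 300, 300))
```

The first call reproduces the 40-second wait worked out above. The 5-minute version waits 200 seconds, which is the same 40 seconds for every minute of content.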

If you're doing longer-form content (longer than 5 minutes), you should choose a bit rate that your users can receive in more or less real time. We can take a hint from streaming media sites here, which commonly target 300 to 450 kbps. If you're worried about bandwidth costs, you can choose a lower bit rate. Your video quality might suffer a bit, but if you can't afford your bandwidth bill, you won't be able to continue podcasting!

Audio bit rate

Of course, the bit rate we have been talking about up to this point has been the total bit rate of your podcast. After you choose your target bit rate, you have to decide how much of that bit rate you're going to allot to the audio stream. There are audio codecs at bit rates from 5 kbps up to hundreds of kilobits per second. How much should you give your audio stream?

One thing to remember is that audio tells the story. Audio is what draws the audience in and keeps them attentive. Think about it: When you're watching television at night with friends and a commercial comes on, what happens? If you're like most people these days, someone punches the mute button on the remote, the room immediately comes to life with conversation, and people take the opportunity to grab something from the refrigerator. When the commercial break is over, the audio is un-muted, and everyone pays attention to the program again.

The same holds true for your podcast. It's worth making sure that your audio quality is good, because people will watch low-quality video if the audio sounds good. They won't watch good-quality video if the audio sounds bad.

A fairly safe rule is to use about 10 to 20 percent of your total bit rate for audio. At higher bit rates, you can stay toward the bottom of that scale; at lower bit rates, stay toward the top. Another suggestion is to avoid the lowest audio bit rate settings. The difference in audio quality between a 5 kbps audio feed and an 8 kbps audio feed is huge, whereas those extra 3 kbps won't add much to your video quality.
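As a quick illustration of the 10 to 20 percent rule, here is a minimal sketch. The function name `audio_bit_rate_range` and its parameters are hypothetical; the only thing taken from the text is the percentage range.

```python
def audio_bit_rate_range(total_kbps: float, low_pct: float = 10,
                         high_pct: float = 20) -> tuple[float, float]:
    """Suggested (minimum, maximum) audio bit rate in kbps."""
    return total_kbps * low_pct / 100, total_kbps * high_pct / 100


total = 300
audio_low, audio_high = audio_bit_rate_range(total)
print(f"Audio: {audio_low:.0f} to {audio_high:.0f} kbps")
print(f"Left for video: {total - audio_high:.0f} to {total - audio_low:.0f} kbps")
```

For a 300 kbps podcast this suggests 30 to 60 kbps for audio, which leaves 240 to 270 kbps for video. In practice you would round to the nearest bit rate your audio codec actually offers.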

Dec 11, 2008

Multi-format encoding

In the beginning, podcasts were audio only and always encoded using the MP3 codec. But as people have started to realize the potential for video podcasting and portable media player displays have improved, the possibilities for podcasting have multiplied. The problem is that most of these enhanced opportunities come at a price, and that price is compatibility. Enhanced podcasts designed for the iPod do not play on other portable media players. Podcasts encoded using the Windows Media format for compatibility with the "PlaysForSure" portable players do not play on the iPod. And if you want to offer a video image larger than 320×240, it may or may not play back on portable media players.

So what can a podcaster who wants to push the envelope do? The best approach is to offer a number of formats and let your audience choose which version they'd like to subscribe to. Of course, if you're offering multiple formats, you're no longer encoding a single version of your podcast; you may be encoding three or four. For example, if you really want to offer every possible choice, you might offer the following:

  • An MP3 version for older media players

  • An enhanced iPod version with images

  • A 320×240 video version encoded in Windows Media

  • A 320×240 video version encoded in QuickTime H.264

  • A 640×480 video version encoded in Flash format for Web viewing

Granted, this example may seem excessive, and the chances that someone would encode into so many different formats are pretty slim. However, it's not beyond the realm of possibility. Rocketboom, one of the most popular video podcasts, is encoded into four different formats. If you want the largest possible audience and want to stay at the cutting edge of podcasting technology, you're going to have to encode multiple versions. This is where a multi-format encoder comes in handy.

Tip If you're offering more than one format, offer separate RSS feeds for each so people can subscribe to their favorite format.

Multi-format encoders enable you to choose a single source file and output to multiple formats. These encoders usually allow you to set up encoding presets, so you don't have to re-enter the encoding settings every time you encode. Many multi-format encoders also allow you to preprocess your original master, so if you want to do color correction or resizing, it can be done at the encoding stage.

Some multi-format encoders offer automatic batch processing, where files placed into a specific directory are automatically processed and encoded. You can streamline your production chain if you're using a multi-format encoder with batch processing. This allows you to concentrate on your programming and let the batch processing take care of the rest.
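To make the watch-folder idea concrete, here is a minimal sketch of the planning step only. It does not reflect how any of the products in this section work, and it does not run an encoder: the preset names, the `.avi` source extension, and the output naming scheme are all assumptions made up for illustration.

```python
from pathlib import Path

# Hypothetical presets: preset name -> output file extension. A real encoder
# preset would also carry codec, bit rate, and resolution settings.
PRESETS = {
    "audio-mp3": ".mp3",
    "video-wmv-320x240": ".wmv",
    "video-h264-320x240": ".mov",
}


def plan_batch(watch_dir: Path, out_dir: Path, source_ext: str = ".avi"):
    """List (source, preset_name, output_path) jobs that still need encoding."""
    jobs = []
    for source in sorted(watch_dir.glob(f"*{source_ext}")):
        for preset_name, ext in PRESETS.items():
            target = out_dir / f"{source.stem}-{preset_name}{ext}"
            # Skip outputs that already exist so reruns only pick up new work.
            if not target.exists():
                jobs.append((source, preset_name, target))
    return jobs


if __name__ == "__main__":
    for source, preset_name, target in plan_batch(Path("incoming"), Path("encoded")):
        print(f"{source.name} -> {target.name} using preset {preset_name}")
```

Each job would then be handed to whatever encoder you use. Running the planner on a schedule gives roughly the same effect as a watch folder: drop a master file in, and every format comes out the other end.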

A number of multi-format encoding solutions are available, including these popular ones:

  • Sorenson Squeeze: The Sorenson Squeeze Compression Suite offers MP3, AAC, QuickTime, Windows Media, and Real formats (Mac users must have the Flip4Mac plug-in installed to get Windows Media capabilities). You can add Flash encoding with an additional plug-in (see Figure 1), or by purchasing the Squeeze Power Pack.

    Figure 1: Sorenson Squeeze

  • Canopus ProCoder: The Express version offers QuickTime, Windows Media, and Real support. The full 2.0 version also offers MP3 encoding, and Flash encoding if you have Flash MX installed.

  • Telestream FlipFactory: This offers MP3, QuickTime, Windows Media, Real, and Flash support. It also supports 3GPP (mobile phone format) with an additional plug-in.

  • Digital Rapids Stream: The basic version offers QuickTime, Windows Media, and Real support. The Pro version adds MP3 and Flash support. All Digital Rapids software requires Digital Rapids capture cards.