May 1, 2009

Using Peer-to-Peer Distribution

The last method of distribution you may want to consider is peer-to-peer. Peer-to-peer (P2P) distribution uses other people on the network to distribute files, instead of sending everything from a single centralized server. P2P distribution gained notoriety with the arrival of Napster, which was originally used to share music files across the Internet. Since then, it has gone mainstream, with many different types of data distributed in this fashion. Skype, the Internet telephony start-up, is actually a P2P application.

There are a number of different P2P approaches, but the basic idea is the same. If you request a file and the P2P network knows that someone near you already has a copy, you are directed to that person's computer to get it, instead of having another copy sent all the way across the network from the original server. Sometimes downloads are spread across multiple computers, so you're downloading parts of a file from many different participants on the P2P network.
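
To make the idea of downloading parts of a file from several participants concrete, here is a minimal sketch in Python. It is only an illustration: the piece size, the two peers, and the helper names (split_into_pieces, download_from_peers) are made up for this example, and the "network" is just a pair of dictionaries in memory. The SHA-1 check mirrors the way BitTorrent verifies each piece, but nothing here speaks a real P2P protocol.

```python
import hashlib

PIECE_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def split_into_pieces(data, piece_size=PIECE_SIZE):
    """Cut the file contents into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def download_from_peers(peers, hashes):
    """Fetch each piece from a peer that has it, checking it against its hash."""
    received = []
    for index, expected in enumerate(hashes):
        # Rotate the starting peer so requests are spread across participants.
        start = index % len(peers)
        candidates = peers[start:] + peers[:start]
        for peer in candidates:
            piece = peer.get(index)
            if piece is not None and hashlib.sha1(piece).hexdigest() == expected:
                received.append(piece)
                break
        else:
            raise LookupError(f"no peer could supply piece {index}")
    return b"".join(received)


original = b"podcast episode audio"
pieces = split_into_pieces(original)
hashes = [hashlib.sha1(p).hexdigest() for p in pieces]

# Two hypothetical peers, each holding only some of the pieces.
peer_a = {i: p for i, p in enumerate(pieces) if i % 2 == 0}
peer_b = {i: p for i, p in enumerate(pieces) if i % 2 == 1}

rebuilt = download_from_peers([peer_a, peer_b], hashes)
print(rebuilt == original)
```

Neither peer holds the whole file, yet the downloader can rebuild it because every piece is available somewhere and can be checked independently of the others.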

The advantage of P2P distribution is that it uses the audience's bandwidth, so you don't have to pay for the throughput. Instead, people are being directed to other people on the network, and they're using their bandwidth, not yours. However, P2P distribution really works only if your content is very popular.
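
A quick back-of-the-envelope calculation shows where the savings come from. The sketch below is a rough model, not a measurement: the function name origin_bandwidth_gb, the 20 MB episode, the 5,000 downloads, and the assumption that peers supply 75 percent of the bytes are all invented for illustration.

```python
def origin_bandwidth_gb(file_size_mb, downloads, peer_share):
    """Data your own server sends, in GB, if peers supply `peer_share` of all bytes."""
    if not 0.0 <= peer_share <= 1.0:
        raise ValueError("peer_share must be between 0 and 1")
    total_mb = file_size_mb * downloads
    return total_mb * (1.0 - peer_share) / 1000.0  # 1 GB treated as 1000 MB


# Hypothetical numbers: a 20 MB episode downloaded 5,000 times.
server_only = origin_bandwidth_gb(20, 5000, 0.0)   # every byte comes from you
with_peers = origin_bandwidth_gb(20, 5000, 0.75)   # peers supply 75 percent

print(server_only, with_peers)
```

How large peer_share really is depends on how many people are downloading at the same time, which is why the benefit is small unless your content is popular.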

How P2P works

To explore how peer-to-peer distribution works, we'll look at BitTorrent. BitTorrent is perhaps the best-known P2P system. There is a vast amount of BitTorrent traffic on the Internet, by some estimates as much as 35 percent of the traffic at any given time. It's anyone's guess what all this traffic is and whether or not it's legal. Regardless, it's a proven system that works well.

BitTorrent is a protocol that defines how files can be shared between two or more hosts. It's also the name of one of the programs that distributes files using the BitTorrent protocol. Essentially, BitTorrent works by breaking large files into many small pieces. BitTorrent downloads are not done sequentially, like regular FTP or HTTP downloads. Instead, BitTorrent clients download files in pieces, from as many different clients as possible. BitTorrent clients find out about the different locations they can download files from by checking in with a BitTorrent tracker, which keeps track of everyone who is participating in the distribution of a particular file. It may seem a bit confusing, but it's actually pretty simple. Here's an example of how it works:

  1. You create a "torrent" for the file you want to distribute. This is a small file that contains all the information people need to know about the file to download it. The torrent is created in your BitTorrent application.

  2. After the torrent is created, it is placed on a Web server and registered with what is known as a tracker. The tracker keeps track of everyone who is participating in the distribution of the file.

  3. Next, you have to seed the file. This means getting the initial copy of the file into distribution. Usually this is done from your own desktop computer. You click the link to the torrent on the Web site and indicate in your BitTorrent client that you're seeding this file.

  4. When the first audience member clicks the torrent link, the torrent file is opened by their BitTorrent client. The BitTorrent client finds out from the tracker who is participating in the distribution. Because no one else is, the BitTorrent client begins downloading the file from the original seed, which in this case is your computer.

  5. When the next person clicks the torrent link, their BitTorrent client checks in with the tracker and finds that there are now two machines participating in the distribution: the original seed and the first audience member. The client requests pieces of the file from both machines.

  6. As more clients join the torrent, the load is spread across more and more machines, allowing each client to download the file from many different clients. Files that are very popular have many people participating in the torrent, so the distribution process scales accordingly.

  7. BitTorrent "etiquette" dictates that it's nice to leave your BitTorrent client on for a while after you've downloaded the file, so that you can help distribute the file to other people.
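
The tracker's role in steps 3 through 5 can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Tracker class, its announce method, and the torrent and peer names are hypothetical and exist only for this example. A real tracker is a network service that clients contact, whereas this one is just an object in memory.

```python
class Tracker:
    """Toy tracker: remembers which peers are sharing which torrent."""

    def __init__(self):
        self.swarms = {}  # torrent name -> list of peer names

    def announce(self, torrent, peer):
        """Register `peer` for `torrent` and return the peers already there."""
        swarm = self.swarms.setdefault(torrent, [])
        others = [p for p in swarm if p != peer]
        if peer not in swarm:
            swarm.append(peer)
        return others


tracker = Tracker()
torrent = "episode-42.torrent"  # hypothetical torrent name

# Step 3: the publisher seeds the file, so the swarm starts with one peer.
seed_sources = tracker.announce(torrent, "publisher")

# Step 4: the first listener can download only from the original seed.
first_sources = tracker.announce(torrent, "listener-1")

# Step 5: the second listener can request pieces from both earlier peers.
second_sources = tracker.announce(torrent, "listener-2")

print(first_sources)
print(second_sources)
```

Each new participant gets a longer list of sources than the one before, which is the sense in which the distribution scales with the audience.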

This is a simplified picture of how P2P distribution occurs, but essentially it's correct. For P2P distribution to be efficient, it requires lots of clients participating in the distribution. So when you're first starting out, P2P distribution offers very little benefit, because your audience most likely will be downloading at different times and won't be able to take advantage of the distributed download. When your podcast audience is in the thousands, then you can make an argument for P2P distribution.

Is P2P for you?

One thing that we've conveniently ignored up to this point is that P2P software is not built into any podcatching software. Podcatchers can download MP3 files using HTTP, but they cannot participate in a P2P distribution scheme. So if you want to use P2P as a distribution scheme, your audience has to download and install P2P distribution software. Considering the antipathy some folks have toward installing software, this may not be the easiest sell.

P2P distribution is used widely by the gaming community to distribute new releases and software fixes. It's a proven distribution technology that can potentially save you lots of money in bandwidth costs. The problem, however, is that it isn't yet integrated into podcasting in any meaningful way. Although P2P may be an effective way to scale podcasting distribution in the near future, for the time being you're probably better off sticking to other methods of distribution.