Downloading evolved with Metalink
Anyone who has tried downloading popular software or a Linux distribution on or just after release day knows the pain of the phrase "connection timed out". Getting the software can be quite a struggle, despite all the mirrors and BitTorrent Samaritans. Anthony Bryan's Metalink is an open standard that makes downloading easier, faster, and more reliable by helping users extract the last drop of juice out of their connection.
I emailed a set of questions to Bryan to understand what separates Metalink from run-of-the-mill download accelerators, and listened in awe as he explained how Metalink combines traditional HTTP and FTP downloads with BitTorrent.
Mayank Sharma: Let's begin with the traditional "what is" question -- what is Metalink? Is it another way to download "stuff"?
Anthony Bryan: Yep, Metalink is a way to download "stuff." But it's not a new transfer protocol like FTP or a new P2P method. It attempts to make the traditional download process simpler by automating advanced features and hiding complexity where possible. It's something that download programs like Web browsers, FTP programs, download managers, and P2P clients support. Information about the files to be downloaded is stored in a simple XML file with a .metalink extension.
MS: So how exactly is using a metalink beneficial over other traditional methods of downloading?
AB: In general, it should be easier and error free. It can also be much faster. The information a download program needs is in a machine-processable format in the .metalink file: all the different ways to download a file (from multiple mirrors to P2P), plus other things like the priority and geographical location of the mirrors, checksums, publisher, license, OS, language, etc. The more information there is, the more that can be done!
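To make the idea of a machine-processable download description concrete, here is a minimal Python sketch that reads a simplified, Metalink-style XML document. The element and attribute names (`file`, `hash`, `url`, `preference`, `location`), the example.org URLs, and the hash value are assumptions for illustration and are not copied from the official schema.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical Metalink-style document (not the official schema).
SAMPLE = """<metalink>
  <files>
    <file name="example.iso">
      <verification>
        <hash type="sha1">0123456789abcdef0123456789abcdef01234567</hash>
      </verification>
      <resources>
        <url type="http" location="us" preference="90">http://mirror-a.example.org/example.iso</url>
        <url type="ftp" location="de" preference="50">ftp://mirror-b.example.org/example.iso</url>
        <url type="bittorrent" preference="100">http://tracker.example.org/example.iso.torrent</url>
      </resources>
    </file>
  </files>
</metalink>"""


def parse_metalink(text):
    """Return one dict per <file>, with its checksums and download sources."""
    root = ET.fromstring(text)
    files = []
    for file_el in root.iter("file"):
        # Map checksum type (e.g. "sha1") to its hex digest.
        hashes = {h.get("type"): h.text.strip() for h in file_el.iter("hash")}
        # Collect every listed source with its metadata.
        urls = [
            {
                "url": u.text.strip(),
                "type": u.get("type"),
                "location": u.get("location"),  # None when not given
                "preference": int(u.get("preference", "0")),
            }
            for u in file_el.iter("url")
        ]
        files.append({"name": file_el.get("name"), "hashes": hashes, "urls": urls})
    return files


if __name__ == "__main__":
    for entry in parse_metalink(SAMPLE):
        print(entry["name"], len(entry["urls"]), "sources")
```

A real client would read this information from the .metalink file it was handed and then decide which sources to try, rather than asking the user to pick a mirror.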
So, instead of a single link to a file, you have many links. This means higher availability. If many servers are down or very busy (like on a release day), your download program can automatically check the links and see which ones are good. You don't have to manually gather and check each link. This improves usability. Some programs can use the multiple links to download different parts or segments of a file from many places, which is called multi-source downloading. Usually, your download will be much faster, but that all depends on your connection. A single-source download usually can't match those speeds.
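The multi-source idea can be sketched as a planning step: split the file into byte ranges and hand each range to a different mirror. The Python below is only an illustration under stated assumptions. The function names `plan_segments` and `range_header`, the round-robin assignment, and the mirror names are hypothetical, and real clients may schedule segments quite differently (for example, by measured mirror speed).

```python
def plan_segments(file_size, mirrors, segment_size):
    """Assign consecutive byte ranges of a file to mirrors in round-robin order.

    Returns a list of (mirror, first_byte, last_byte) tuples with inclusive bounds.
    """
    if not mirrors:
        raise ValueError("at least one mirror is required")
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    plan = []
    for index, start in enumerate(range(0, file_size, segment_size)):
        end = min(start + segment_size, file_size) - 1  # last byte of this segment
        plan.append((mirrors[index % len(mirrors)], start, end))
    return plan


def range_header(start, end):
    """Build the value of an HTTP Range header for an inclusive byte range."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # A 10-byte "file" split into 4-byte segments across two hypothetical mirrors.
    for mirror, start, end in plan_segments(10, ["mirror-a", "mirror-b"], 4):
        print(mirror, range_header(start, end))
```

Each planned range could then be fetched from its mirror and written at the matching offset, with a failed segment reassigned to another mirror.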
Metalinks can also contain checksums or repair information to fix corrupted downloads, either files downloaded with other programs or repairing the download in progress. You can repair downloads with rsync and some P2P, but with Metalink you can do it over regular FTP/HTTP.
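One way to picture repair over plain FTP/HTTP is with per-piece checksums: if the publisher lists a hash for each fixed-size piece of the file, a client can find exactly which pieces are damaged and fetch only those again. The sketch below assumes SHA-1 piece hashes and uses hypothetical helper names (`piece_hashes`, `find_bad_pieces`). It illustrates the general technique, not how any particular Metalink client implements it.

```python
import hashlib


def piece_hashes(data, piece_size):
    """Return the SHA-1 hex digest of each consecutive piece of data."""
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def find_bad_pieces(data, expected, piece_size):
    """Return the indexes of pieces whose hash does not match the expected list.

    Pieces that are missing because the download was truncated also count as bad.
    """
    bad = []
    for index, expected_hash in enumerate(expected):
        piece = data[index * piece_size:(index + 1) * piece_size]
        if hashlib.sha1(piece).hexdigest() != expected_hash:
            bad.append(index)
    return bad


if __name__ == "__main__":
    original = b"AAAA" + b"BBBB" + b"CC"
    published = piece_hashes(original, 4)                # what the publisher would list
    corrupted = original[:5] + b"X" + original[6:]      # damage one byte in piece 1
    print(find_bad_pieces(corrupted, published, 4))
```

Only the byte ranges for the bad pieces would need to be requested again, instead of the whole multi-gigabyte file.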
These benefits are mainly seen in large downloads like ISO files (CD/DVD images), which can range from 500 MB to 4 GB or larger. Time is money, and sometimes you need things downloaded before you can get work done. If one of these files has an error, having to re-download it is painful. Metalink could also be very helpful for movie and music downloads, where (once this technology is included in browsers) you'll be able to download huge files and recover from errors, and download a whole album or multiple albums in one click, within a browser and not some special application. For instance, when you buy a bunch of albums on Bleep.com, you can download the files individually or in an archive. The archive takes up resources on the server and on your computer. With Metalink, you can list all the files and have them added to a download queue in one click. This is also useful for software like KDE that is available in multiple files.
MS: Whoa, those are some advantages. While it's good for the person downloading, what about the person offering the download? Any reasons why he should make a .metalink for his downloads?
AB: As we've already discussed, Metalink should offer your downloaders an improved experience - more reliable and without errors.
This translates to lower bandwidth bills and support costs. Not very exciting unfortunately, but something that needs to be done since it's simple and can be automated. You can give different mirrors or P2P sources a priority. If you list FTP/HTTP mirrors and P2P, the mirrors can be used in case the person can't access P2P or if the file is no longer on the P2P network. This can be handy in situations where a business or university blocks P2P because of copyright concerns, because they don't want someone continuously uploading and using bandwidth, or because of configuration issues with routers, etc. In the end, you want people to be able to download whatever you are distributing.
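The priority-and-fallback behavior described here can be sketched as a small ordering function. The field names (`type`, `preference`), the numeric preference scale where higher means "try first", the `p2p_allowed` flag, and the example.org URLs are all assumptions made for this illustration.

```python
# Source types treated as P2P in this sketch (an assumption for illustration).
P2P_TYPES = {"bittorrent"}


def order_sources(sources, p2p_allowed=True):
    """Return sources sorted from most to least preferred.

    When P2P is blocked on the network, P2P entries are dropped so the
    client falls back to the plain FTP/HTTP mirrors.
    """
    usable = [s for s in sources if p2p_allowed or s["type"] not in P2P_TYPES]
    # sorted() is stable, so entries with equal preference keep their listed order.
    return sorted(usable, key=lambda s: s["preference"], reverse=True)


if __name__ == "__main__":
    sources = [
        {"type": "ftp", "preference": 50, "url": "ftp://mirror-b.example.org/example.iso"},
        {"type": "bittorrent", "preference": 100, "url": "http://tracker.example.org/example.iso.torrent"},
        {"type": "http", "preference": 90, "url": "http://mirror-a.example.org/example.iso"},
    ]
    # On a network that blocks P2P, only the mirrors remain, best first.
    for source in order_sources(sources, p2p_allowed=False):
        print(source["preference"], source["url"])
```

The publisher sets the priorities once, and each client applies them to whatever sources it can actually reach.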
MS: Great! So how does one create a .metalink for their downloads?
AB: The easiest way to create a .metalink is with the Metalink Editor which has a graphical interface. If you're going to be making a bunch, you may want to use the automated tools which have a command line interface or server tools which are meant to be run on mirrors. This article on osresources details the non-graphical methods for .metalink creation.
MS: Right. So do I need a special application to use a metalink download? Won't it work with wget?
AB: Right now you do need a special application that can read the .metalink file and use that info. Luckily, most download managers support it already, but no Web browsers do so natively. Support in Opera and Firefox would make the benefits way more accessible to regular people. wget doesn't support it yet, but aria2 is a similar command-line program that does. The beauty of free and open source software is that if people are interested enough, then support can be added to a bunch of download programs quickly and easily.
MS: Do you track the number of projects using Metalink?
AB: Yes, mostly free and open source projects use it right now, but some proprietary companies and game makers use it too. OpenOffice.org, openSUSE, cURL, Arch Linux, DesktopBSD, blag linux, StartCom Linux, Berry Linux, PC-BSD, Linux Mint, Ubuntu Christian Edition, redWall Firewall, GoboLinux, TrueBSD, PuppyLinux, UniProt Consortium, Eiffel Software, and Ankama games have used Metalink.
MS: Any new feature Metalink users, both offering and downloading stuff, should keep an eye out for?
AB: I view Metalink as basic infrastructure and hopefully it will get to the point where people don't even know they are using it. Metalink as it is now is Wave One (or maybe .5) of coming improvements. I think we can build on it and offer some really cool things, which will again work transparently for people. Not all clients support all the features of Metalink, like using repair information - that's a really important feature.
Phex is the first program with Metalink support that is primarily P2P (other BitTorrent clients like aria2 and GetRight support it). You'll be able to export your library, or parts of it. If you want to share something with a friend, you can just email a .metalink or put it on a server for download, and they'll get the same exact files since they're identified with checksums. No weeding through search results searching for similar files will be needed.
MS: Thanks for taking time out for the interview. Please share a little information about yourself and your life outside of computers.
AB: I'm 29, just finished my Master's degree, and am looking for work related to Metalink with open source projects. :) I live in southeastern Florida, in Pompano Beach. I'm pretty laid back and I love the ocean. I'm really into music, mostly instrumental, from electronic to Indian, dub, jazz, and even normal stuff. I like old books and movies and I'm interested in technology and how it affects our lives.