MP3 vs. WMA: Compressing Files
June 8th, 2006 Jason Dunn
I covered bit rate previously, including why a true CD-quality file at 1410 kbps means each music file ends up in the 40 to 50MB range. Think back to the mid-’90s, when the popularity of the Web and email began to explode: most people were still using dial-up modems. Moral and legal ramifications aside, the roots of digital music lie in people trading and sharing music online. It’s hard to share a 50MB file on a 28.8 kbps dial-up connection. At a realistic download speed of 2KB per second, such a file would take roughly seven hours to download. If the file were instead 2MB in size, it would take only about 17 minutes. That’s quite a big difference! So file sizes needed to get smaller, but how? Psychoacoustic compression.
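For anyone who wants to check the download-time arithmetic above, here is a minimal Python sketch. It assumes 1MB = 1024KB and a steady 2KB per second; the function names are mine, made up for illustration, and the results are rounded to the nearest minute.

```python
def download_seconds(file_size_mb: float, speed_kb_per_s: float) -> float:
    """Return the download time in seconds, treating 1 MB as 1024 KB."""
    return file_size_mb * 1024 / speed_kb_per_s


def format_duration(seconds: float) -> str:
    """Format a duration as hours and minutes, rounded to the nearest minute."""
    minutes = round(seconds / 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m"


# Compare the uncompressed-size file with the compressed-size file.
for size_mb in (50, 2):
    duration = format_duration(download_seconds(size_mb, 2))
    print(f"{size_mb} MB at 2 KB/s: {duration}")
```

Counting a megabyte as 1000KB instead shifts the results only slightly, so the gap between hours and minutes holds either way.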
Psychoacoustics is a fancy word for the study of what human beings can hear. The human ear can only perceive certain frequencies of sound. Without getting too complicated (mostly because I’m not an acoustic scientist), the idea is that any given audio recording contains frequencies that we can’t hear at all. If we get rid of those frequencies, we have less data to store in the song, and that means a smaller digital file. The audio quality slider in Windows Media Player, which has “Smallest Size” on one end and “Highest Quality” on the other, is a nice visual for how this works. The more frequencies are dropped, the more compressed the file is and the smaller it gets. The quality of the audio drops too, because as the file gets smaller, parts of the song that you can hear get tossed out as well.
By the way, the same idea applies to JPEG pictures and MPEG movies: visual data we can’t perceive is removed first, and the more data is removed, the worse the result looks. If you want to dig into the gory details of human hearing, this Web site has a lot of detail.
So let’s loop this back into our previous discussion of bit rate: the lower the bit rate, the less data there is in a song. Bit rate is the way we describe the level of compression. A 64kbps song is four times more compressed than a 256kbps song: it has one quarter as many bits per second, which means one quarter of the audio data. Now here’s where human hearing factors in: there’s a point where so much data has been removed from the song that it just doesn’t sound right. This threshold is different for every person. Some people claim they can tell the difference between a CD and a 320kbps audio file. Some people can tell the difference between 64kbps and 128kbps, and others can’t. Just like eyesight, everyone hears differently. That’s why there’s no “right” answer when it comes to audio compression. The best you can do is select a quality level that’s right for your ears. No more theory. Next up, the rubber meets the road and we talk about selecting the right audio file format (WMA or MP3) and the right bit rate.
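To put rough numbers on the bit rate comparison, here is a small Python sketch that estimates file size from bit rate. The four-minute song length is my assumption, as is counting 1 kbps as 1000 bits per second and 1MB as 1,000,000 bytes; real files also carry a little extra data such as tags, which this ignores.

```python
def file_size_mb(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate audio size in MB (1 kbps = 1000 bits/s, 1 MB = 1,000,000 bytes)."""
    bits = bit_rate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


SONG_SECONDS = 4 * 60  # hypothetical four-minute song

# CD-quality bit rate first, then some common compressed bit rates.
for rate in (1410, 320, 256, 128, 64):
    print(f"{rate:>5} kbps -> {file_size_mb(rate, SONG_SECONDS):5.1f} MB")
```

Under those assumptions the CD-quality song comes out at about 42MB, and the 64kbps version is a quarter the size of the 256kbps one.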