Why (Almost) Everything You Thought You Knew About Bit Depth Is Probably Wrong

A lot of competent audio engineers working in the field today have some real misconceptions and gaps in their knowledge around digital audio.

Not a month goes by that I don’t encounter an otherwise capable music professional who makes simple errors about all sorts of basic digital audio principles – the very kinds of fundamental concepts that today’s 22-year-olds couldn’t graduate college without understanding.

There are a few good reasons for this, and two big ones come to mind immediately:

The first is that you don’t really need to know a lot about science in order to make great-sounding records. It just doesn’t hurt. A lot of people have made good careers in audio by focusing on the aesthetic and interpersonal aspects of studio work, which are arguably the most important.

(Similarly, a race car driver doesn’t need to know everything about how his engine works. But it can help.)

The second is that digital audio is a complex and relatively new field – its roots lie in a theorem set to paper by Harry Nyquist in 1928 and later proved by Claude Shannon in 1949 – and quite honestly, we’re still figuring out how to explain it to people properly.

In fact, I wouldn’t be surprised if a greater number of people had a decent understanding of Einstein’s theories of relativity, originally published in 1905 and 1916! You’d at least expect to encounter those in a high school science class.

If your education was anything like mine, you’ve probably taken college level courses, seminars, or done some comparable reading in which well-meaning professors or authors tried to describe digital audio with all manner of stair-step diagrams and jagged-looking line drawings.

It’s only recently that we’ve come to discover that such methods have led to almost as much confusion as understanding. In some respects, they are just plain wrong.

What You Probably Misunderstand About Bit Depth

I’ve tried to help correct some commonly mistaken notions about ultra-high sampling rates, decibels and loudness, the real fidelity of historical formats, and the sound quality of today’s compressed media files.

Meanwhile, Monty Montgomery of xiph.org does an even better job than I ever could of explaining how there are no stair-steps in digital audio, and why “inferior sound quality” is not actually among the problems facing the music industry today.

A bad way, and a better way to visualize digital audio. Images courtesy Monty Montgomery’s Digital Show and Tell video. (Xiph.org)

After these, some of the most common misconceptions I encounter center around “bit depth.”

Chances are that if you’re reading SonicScoop, you understand that the bit depth of an audio file is what determines its “dynamic range” – the distance between the quietest sound and the loudest sound we can reproduce.

But things start to go a little haywire when people start thinking about bit depth in terms of the “resolution” of an audio file. In the context of digital audio, that word is technically correct. It’s only what people think the word “resolution” means that’s the problem. For the purpose of talking about audio casually among peers, we might be even better off abandoning it completely.

When people imagine the “resolution” of an audio file, they tend to immediately think of the “resolution” of their computer screen. Turn down the resolution of your screen, and the image gets fuzzier. Things get blockier, hazier, and they start to lose their clarity and detail pretty quickly.

Perfect analogy, right? Well, unfortunately, it’s almost exactly wrong.

All other things being equal, when you turn down the bit depth of a file, all you’ll get is an increasing amount of low-level noise, kind of like tape hiss. (Except that with any reasonable digital audio file, that virtual “tape hiss” will be far lower than it ever was on tape.)

That’s it. The whole enchilada. Keep everything else the same but turn down the bit depth? You’ll get a slightly higher noise floor. Nothing more. And, in all but extreme cases, that noise floor is still going to be – objectively speaking – “better” than analog.
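To make that concrete, here’s a minimal Python sketch (my own illustration, not anything from the article’s sources) that quantizes a sine wave to different bit depths and measures how far the signal sits above the resulting quantization noise. The numbers it prints track the familiar ~6 dB-per-bit rule: lower the bit depth and the only thing that changes is how much “hiss” you get.

```python
import math

def quantize(samples, bits):
    """Round each sample (scaled to -1.0..1.0) to the nearest of 2**bits levels."""
    levels = 2 ** (bits - 1)  # signed audio: half the levels for each polarity
    return [round(s * levels) / levels for s in samples]

def snr_db(clean, degraded):
    """Ratio of signal power to error power, in decibels."""
    signal = sum(s * s for s in clean) / len(clean)
    noise = sum((c - d) ** 2 for c, d in zip(clean, degraded)) / len(clean)
    return 10 * math.log10(signal / noise)

# One second of a full-scale 997 Hz sine at a 48 kHz sampling rate
sine = [math.sin(2 * math.pi * 997 * n / 48000) for n in range(48000)]

for bits in (8, 16, 24):
    print(f"{bits}-bit: signal sits ~{snr_db(sine, quantize(sine, bits)):.0f} dB above the noise")
```

For a full-scale sine the theoretical figure is 6.02 × bits + 1.76 dB, so you should see roughly 50 dB at 8 bits and roughly 98 dB at 16 – noise floor moves, nothing else.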

On Bits, Bytes and Gameboys

This sounds counter-intuitive to some people. A common question at this point is: “But what about all that terrible low-resolution 8-bit sound on video games back in the day? That sounded like a lot more than just tape hiss.”

That’s a fair question to ask. Just like with troubleshooting a signal path, the key to untangling the answer is to isolate our variables.

Do you know what else was going on with 8-bit audio back in the day? Here’s a partial list: Lack of dither (or low-quality dither), aliasing, ultra-low sampling rates, harmonic distortion from poor analog circuits, low-quality D/A converters and filters, early digital synthesis, poor-quality computer speakers… We could go on like this. I’ll spare you.
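One culprit on that list – lack of dither – is easy to demonstrate. The Python sketch below (a hypothetical illustration, not taken from any real game hardware) quantizes a tone that’s quieter than a single 8-bit quantization step. Without dither, the tone simply vanishes into signal-dependent distortion; with basic TPDF dither added before quantizing, it survives, smeared into benign noise.

```python
import math
import random

random.seed(0)  # make the "noise" reproducible for this demo

STEP = 1 / 2 ** 7  # quantization step of a signed 8-bit sample

def quantize(s, dither=False):
    """Quantize one sample to the 8-bit grid, optionally with TPDF dither."""
    if dither:
        # Triangular (TPDF) dither: sum of two uniform randoms, +/- one step peak
        s = s + (random.random() - random.random()) * STEP
    return round(s / STEP) * STEP

# A tone whose amplitude is only a third of one quantization step
tone = [0.3 * STEP * math.sin(2 * math.pi * n / 100) for n in range(10000)]

plain = [quantize(s) for s in tone]
dithered = [quantize(s, dither=True) for s in tone]

print(max(abs(s) for s in plain))  # 0.0 -- undithered, the quiet tone is erased
print(sum(t * d for t, d in zip(tone, dithered)) > 0)  # True -- dithered, it persists in the noise
```

The positive correlation in the last line is the point: dither trades ugly, correlated quantization error for a steady, hiss-like noise floor – which is exactly why modern low-bit-depth audio sounds like tape hiss rather than an old sound chip.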

Nostalgia, being one of humanity’s most easily renewable resources, has made it so that plenty of folks around my age even remember some of these old formats fondly. Today there are electronic musicians who make whole remix albums with Nintendos and Gameboys, which offer only 4 bits of audio as well as a myriad of other, far more significant issues.

(If you like weird music and haven’t checked out 8-Bit Operators’ The Music of Kraftwerk, you owe it to yourself. They’ve also made tributes to Devo and The Beatles.)

But despite all that comes to mind when we think of the term “8 Bits,” the reality is that if you took all of today’s advances in digital technology and simply turned down the bit depth to 8, all you’d get is a waaaaaaay better version of the cassette tape.

There’d be no frequency problems, no extra distortion, none of the “wow” and “flutter” of tape, nor the aliasing and other weird artifacts of early digital. You’d just have a higher-than-ideal noise floor. But with at least 48 dB of dynamic range, even the noise floor of modern 8-bit audio would still be better than cassette. (And early 78 RPM records, too.)
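That 48 dB figure comes from the standard rule of thumb of roughly 6.02 dB – that is, 20·log₁₀(2) – of dynamic range per bit in an ideal converter. A quick sketch (the format labels are just my illustrative picks):

```python
import math

def dynamic_range_db(bits):
    """Approximate dynamic range of an ideal converter: ~6.02 dB per bit."""
    return 20 * math.log10(2) * bits

for bits, medium in ((8, "old game consoles"), (16, "CD"), (24, "modern sessions")):
    print(f"{bits}-bit ({medium}): ~{dynamic_range_db(bits):.0f} dB")
```

Compare those numbers to a good cassette deck, which manages something on the order of 50–60 dB even with noise reduction, and the “8-bit beats tape” claim stops sounding so strange.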


  • John Lardinois

It is – that’s why the sample rate is 44.1 and not 40. I was simply using numbers consistent with the 20kHz human hearing range, so as not to confuse those who are not familiar with how Nyquist filters and sample rates work.

I figured there are people reading this who have never heard of a Nyquist filter, but know that human hearing goes up to 20k (common knowledge).

    I meant it as a way to simplify the explanation for the readers who are new to digital audio theory. So, yes, you are correct. I merely meant it as a simpler, cleaner response for the new guys.

  • ahh, got ya.

  • Kon

    Thanks for sharing this knowledge… the Internet needs more guys like you, seriously.

  • Eric D. Ryser

Yep. I have tin ears, okay? and even I can hear very non-subtle quantization errors when I use my 16 bit recorder, especially when the sound is “Morendo”. Admittedly, this is probably due more to the quality of the digital signal converters than anything, but that doesn’t take away from the fact that, as you point out, given more resolution that conversion will have less error. This article is wrong in its basic premise, that bit depth is not “about” resolution…because it’s exactly about resolution.

  • preferred user

16/44.1’s Nyquist frequency is 22.05 kHz, i.e. ½ of 44.1 kHz

  • Bob Fridz

    Nice article but aren’t a lot of people failing to mention or realize that in a studio people often stack up multiple tracks to make an arrangement. This can be 100+ tracks nowadays in pop music, but it happens in all genres.

One might not be able to hear the difference between a 16-bit finished track and a 24-bit finished track, but I am 1000% sure everyone can hear the difference between a mix built up out of 100+ 16-bit tracks or 100+ 24-bit tracks. Same goes for 44.1 up to 192 kHz.
Record a drumkit with 15 mics (close, mid and room) and have a drummer play the same beat and fills. Record a section at 16-bit/44.1 kHz and record a similar section at 24-bit/192 kHz.
I will put money on this: everyone will hear the difference! And no, it’s not magic. 24/192 isn’t amazing, it’s just the most accurate way of capturing sound.
Downgrading a 24/192 mix to 16/44.1 afterward will still sound better than doing the whole process at 16/44.1 from the start.

  • JSchmoe

    “Anything quieter than the noise floor just can’t be heard”
    This is completely inaccurate. Think a little more.

  • Justin C.

    It is indeed my job to think about these things, Mr. Schmoe. If you’d like to make a reasoned counterargument, or provide some evidence to support your assertion, then great! If not, then this doesn’t add much to the conversation.

    Until then, yes, the original statement is correct: Any signal that is lower in level than the noise floor will, by definition, be masked by the noise floor and be indistinguishable from the random, full-spectrum noise.

    One caveat that may be worth noting is that the noise floor can potentially be at a slightly different level at different frequencies, but the essential concept, and the original statement, still holds, even with this caveat.

  • “Any signal that is lower in level than the noise floor will, by definition, be masked by the noise floor and be indistinguishable from the random, full-spectrum noise.”
    That’s utter nonsense. Humans can pick up patterns WAY below the noise floor. E.g:

  • Physicsonboard

This is partially true, but it is not what the studies suggest. The double-blind studies showed that a higher proportion of “super listeners” preferred the lower-resolution files; no one knows why. OTOH, the idea that one should rely on science when making an audio recording is as ridiculous as making a painting using “color by number.”

  • George Piazza

    I have one question for you John: Do you convert your files to 96 kHz before mixing them, or do you simply set your DAW to output a 96kHz file?

    If you do the latter, ask yourself – Where is the DAW converting the data into 96 kHz? Before the first mixing calculation? Really?

    Or, as is much more likely, the whole mix is happening at the original sample rate, then the DAW is up-converting the final output to 96 kHz? Therefore, none of the EQs & other processors are ever seeing 96 kHz (except for the ones that have internal up-sampling as part of their processing). All you are doing is up-converting a stereo file after the mixing.

    I have seen a number of people who say they do what you do, but if you check your CPU usage, does it double when you mix out at double the original file sample rates? Or at least go up substantially, taking into account the number of plugins that already up-sample – down-sample the data stream? And if the session (and converter) are still set to 48 kHz, is it reasonable to assume that you are getting 96 kHz processing throughout? Even if you set the DAW to 96 kHz playback, you are now playing back 48 kHz files at twice their speed! Without changing the actual processing sample rate of the calculations. (and the ‘mix’ file would probably sound rather odd.. you know, like the chipmunks on meth!).

    I have yet to see a DAW manufacturer that unequivocally specifies the actual location of the up-sampling when the output medium is set at a higher sample rate, but the logic dictates that 48 kHz files playing back at 48 kHz will be processed at 48 kHz, then up-sampled to 96 kHz upon output.

    Therefore, the only way to assure all processing takes place at 96 kHz is to batch up-sample every file, set the DAW at the new sampling rate (96 kHz) and then mix. Hope you consolidate all your files first!

  • John Morris

I have been trying to educate people on this for years. If you think this is bad, try explaining to amateur engineers that 0 dBFS (digital full scale) doesn’t equal 0 VU. 0 VU – or rather 0 dBu – for 24-bit is -18 dBFS RMS (average level, not peak). I tell them their converters will sound better if they run them at the level they were designed to operate at: average around -18 dBFS (between -22 and -12 dBFS) and don’t peak beyond -6 dBFS. But I get this: “The closer you get to 0 dBFS the more bits you use, so the better it sounds…” Amateur engineers recording too hot has become a big problem. Even some professionals with million-dollar studios, who should know better, are doing this “max it out to zero” foolishness.