What do bit depth and sample rate refer to?
Most modern audio interfaces let you specify the bit depth and sample rate for a particular project, but what difference do these values actually make to the quality of the audio? To help you understand exactly what these measures of digital audio refer to, we have created this helpful guide.
Bit Depth - 16 Bit vs 24 Bit
The easiest way to work out how a project recorded at 16 bit will sound different from a project recorded at 24 bit is to make a direct comparison with digital images. Audio files, or indeed image files, stored digitally on a computer are represented by a series of 1s and 0s. Each of these 1s and 0s is known as a bit, and it is the number of bits used for each sample (or pixel) that dictates how finely the audio (or image) is described... confused? Perhaps it's better if I give an example. The following images are the same image, but one has been saved at a higher bit depth.
16 bit image file | 4 bit image file |
As you can see, the image saved at a lower bit depth looks grainy and undefined. Now translate this into the world of digital audio. Audio recorded at a lower bit depth will sound grainier and less defined, so audio recorded at 24 bit will be of a higher quality than audio recorded at 16 bit. So why doesn't everyone record at 24 bit? Well, we have identified the benefits of recording at higher bit depths, but what about the costs? If something is recorded at 24 bits then it uses more bits per sample, so the files are going to be much bigger. They will take up more disk space and they also require more computing power to process. So how do you choose what bit depth to work in? The best option is to give yourself the choice: buy an audio interface capable of recording at higher bit depths, but only use the higher bit depths if you know your project will benefit from them. For example, it may be more efficient to stick to 16 bits if your project is aimed at FM transmission or internet streaming.
Sample Rates
The sample rate is slightly more complicated to explain. The sample rate is the number of "snapshots" of audio that are sampled every second. The continuous audio stream is digitally encoded in a similar way to a movie camera capturing motion by recording an image frame many times per second. The higher the sample rate (and bit depth), the more accurately the original sound can be represented. The following diagram helps to illustrate this point. The curve can be thought of as being the original sound whereas the columns can be thought of as digital data trying to represent the original sound.
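As a rough illustration of the "snapshots" idea, the sketch below samples a pure tone at two different rates and counts how many snapshots each rate produces. The function name `sample_sine` and the chosen tone and rates are assumptions made for this example only.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Return snapshots of a sine wave taken sample_rate_hz times per second."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# One millisecond of a 1 kHz tone (one full cycle) at two illustrative rates.
low = sample_sine(1000, 8000, 0.001)
high = sample_sine(1000, 48000, 0.001)

print(len(low), "snapshots at 8 kHz")
print(len(high), "snapshots at 48 kHz")
```

The higher rate describes the same cycle of the wave with six times as many points, which is the sense in which it follows the original curve more closely.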
Again this seems to suggest that you should always record at higher sample rates, but again there are costs and you need to decide the most appropriate rate for your project. 44.1k refers to 44.1 thousand samples per second, and you will find that 44.1k, 48k and 96k are the most common sample rates, although 192k is now becoming more popular. 44.1k is the standard for CDs, 48k is common in video, 96k is popular in professional studios because it offers more headroom for mixing purposes, and finally 192k is being used for very high quality DVD projects (normal DVD projects operate at 24 bit, 96k). Another point to consider is the human ear: this wonderful human tool is limited with regards to the frequencies it can actually detect. It is best not to record at sample rates below 44.1k because of the "Nyquist frequency", the rule that the audio bandwidth of a sampled signal is restricted to half of the sampling rate (it's getting a bit heavy now, isn't it?). So in order to cover the approximately 20 kHz range of human hearing, the equipment must sample at more than 40,000 (40k) samples per second. Put simply... reducing the sample rate will reduce the sound quality and the bandwidth, and should therefore only be done when absolutely necessary, such as for internet streaming of voice-only sources.
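The Nyquist rule described above is simple arithmetic, and a short sketch can make it concrete. The function names here are illustrative, not taken from any audio library.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

def min_sample_rate(highest_frequency_hz):
    """The sample rate must exceed twice the highest frequency to be captured."""
    return 2 * highest_frequency_hz

# The common sample rates mentioned in the text.
for rate in (44100, 48000, 96000, 192000):
    print(f"{rate} Hz sample rate -> {nyquist_frequency(rate) / 1000:.2f} kHz bandwidth")

# The roughly 20 kHz limit of human hearing.
print("Rate to exceed for 20 kHz audio:", min_sample_rate(20000), "samples per second")
```

This is why 44.1k comfortably covers the audible range: its bandwidth sits a little above 20 kHz.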
How do the two values relate?
Again this is probably best illustrated in a diagram. See the graph below which shows the relation between the two values.
We have already established that bit depth refers to the number of bits you have to capture audio. The easiest way to see how this would affect music is to view it as the number of levels that audio energy can be sliced into at any given moment in time. For example, for 16 bit audio there are 65,536 possible levels. With every extra bit of resolution, the number of levels doubles. If we were to record at 24 bit then we would have 16,777,216 levels for a slice of audio frozen in a single moment of time.
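The level counts above follow from raising 2 to the power of the bit depth. The sketch below computes them and, as a simplified illustration, rounds a sample value to the nearest available level. The `quantize` helper is an assumption for this example: it uses evenly spaced levels between -1.0 and 1.0, which is not exactly how real converters lay out their levels.

```python
def quantization_levels(bit_depth):
    """Number of distinct levels available at a given bit depth."""
    return 2 ** bit_depth

def quantize(value, bit_depth):
    """Round a value in [-1.0, 1.0] to the nearest evenly spaced level.

    Simplified model for illustration only.
    """
    levels = quantization_levels(bit_depth)
    step = 2.0 / (levels - 1)          # spacing between neighbouring levels
    index = round((value + 1.0) / step)
    return index * step - 1.0

for bits in (4, 16, 24):
    error = abs(quantize(0.3, bits) - 0.3)
    print(f"{bits} bit: {quantization_levels(bits)} levels, rounding error {error:.2e}")
```

The rounding error shrinks as the bit depth grows, which is the audio equivalent of the grainy picture becoming smoother.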
The biggest advantage of recording at higher bit depths is the extra headroom it provides during mixing with regards to the dynamic range and noise floor of the system. The extra values available give you a greater dynamic range and a lower noise floor. This is transparent to the user since levels are indicated in decibels, a logarithmic unit of sound intensity: 10 times the base-10 logarithm of the ratio of the sound intensity to some reference intensity. In other words, a decibel is a ratio rather than a defined value. Again this is best illustrated by a diagram.
24 bit recording | 16 bit recording |
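To put rough numbers on the dynamic range, the sketch below computes the theoretical figure for a given bit depth as the ratio between the largest value and the smallest step, expressed in decibels. Sample values are amplitudes rather than intensities, and intensity is proportional to amplitude squared, so the factor becomes 20 rather than 10. The function name is illustrative, and real converters fall short of these ideal figures.

```python
import math

def theoretical_dynamic_range_db(bit_depth):
    """Ideal dynamic range in dB: 20 * log10 of the number of levels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(f"{bits} bit: about {theoretical_dynamic_range_db(bits):.1f} dB")
```

Each extra bit adds roughly 6 dB, which is where the extra headroom of 24 bit recording comes from.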
If we were to then introduce time into the equation (after all, audio frozen in a single moment is not particularly useful) then this gives rise to the sample rate. As discussed above, the sample rate is the number of times your audio is measured (sampled) per second. Therefore 96 kHz refers to 96,000 slices of audio sampled each second.
How does this affect file sizes?
Bit Depth | Sample Rate (Hz) | Bit Rate | File Size for 1 minute stereo mix | File Size for 3 minute stereo mix |
16 | 44,100 | 1.35 Mbit/sec | 10.1 MB | 30.3 MB |
16 | 48,000 | 1.46 Mbit/sec | 11.0 MB | 33 MB |
24 | 96,000 | 4.39 Mbit/sec | 33.0 MB | 99 MB |
MP3 File | 128 kbit/s bit rate | 0.13 Mbit/sec | 0.94 MB | 2.82 MB |
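The uncompressed rows of the table can be reproduced with a little arithmetic: bit rate is bit depth times sample rate times the number of channels, and file size is bit rate times duration divided by 8. The sketch below assumes the table uses binary units (1 MB = 1,048,576 bytes), which is inferred from the figures rather than stated in the guide, and it ignores the small header a real audio file carries. The function names are illustrative.

```python
MEBI = 1024 * 1024  # assumed binary "mega", inferred from the table figures

def bit_rate(bit_depth, sample_rate_hz, channels=2):
    """Uncompressed bit rate in bits per second (stereo by default)."""
    return bit_depth * sample_rate_hz * channels

def file_size_bytes(bit_depth, sample_rate_hz, seconds, channels=2):
    """Uncompressed audio data size in bytes, ignoring any file header."""
    return bit_rate(bit_depth, sample_rate_hz, channels) * seconds // 8

for bits, rate in ((16, 44100), (16, 48000), (24, 96000)):
    mbit_per_sec = bit_rate(bits, rate) / MEBI
    one_minute = file_size_bytes(bits, rate, 60) / MEBI
    three_minutes = file_size_bytes(bits, rate, 180) / MEBI
    print(f"{bits} bit, {rate} Hz: {mbit_per_sec:.2f} Mbit/sec, "
          f"{one_minute:.1f} MB per minute, {three_minutes:.1f} MB for 3 minutes")
```

Moving from 16 bit, 44.1k to 24 bit, 96k more than triples the storage needed for the same length of audio.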