Most musicians who also hang around on the other side of the mixing desk will likely have encountered convolution in its most common application for producers – reverb. A convolution reverb (such as the well-known Waves IR-1 plugin) is a great way to recreate a physical space in a recording, such as the Royal Albert Hall, Abbey Road Studios or any other desirable recording location.
When the audio track is processed through the convolution reverb with the chosen room selected, the characteristics of that room are applied to the track. So how is it possible to accurately recreate the reverb characteristics of a room?
In order to do this we need to characterize the room first. If you dig into the reverb plugin, you may notice that the reverb options are actually .wav files, each one being no more than a few seconds long (roughly equivalent to the reverberation time of the room). This is an actual recording captured at the location to be recreated and it contains all the information necessary to do the processing. A typical example would sound something like a gunshot which you can hear in the clip below.
This is known as the Impulse Response of the room.
In signal processing, an impulse is a special type of signal which has a value of 1 for a single sample and a value of 0 everywhere else.
We may also write $\delta[n] = 1$ for $n = 0$ and $\delta[n] = 0$ for all other $n$.
The impulse response therefore is simply the response of a system (specifically an LTI or Linear Time-Invariant System*) to the input of an impulse. For practical purposes, an impulse is usually generated by some sort of explosion e.g. a balloon popping, or a starter pistol (hence the impulse response sounds just like a gunshot) to produce a sound of the shortest duration possible. So in the case of the room we want to reproduce, the impulse is the balloon pop, and the impulse response is the sound reverberating in the space, which is what we want to capture.
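To see what "the response of a system to an impulse" looks like in numbers, here is a minimal sketch. The `echo_room` function is a made-up stand-in for a real room (a single decaying echo, with `delay` and `gain` chosen arbitrarily), not a model of any actual space. Feeding it a unit impulse gives back its impulse response.

```python
def echo_room(x, delay=3, gain=0.5):
    """Toy 'room' (hypothetical): each output sample is the input plus a
    quieter copy of the output from `delay` samples earlier."""
    y = []
    for n, sample in enumerate(x):
        echo = gain * y[n - delay] if n >= delay else 0.0
        y.append(sample + echo)
    return y

# Unit impulse: 1 for a single sample, 0 everywhere else.
impulse = [1.0] + [0.0] * 9

impulse_response = echo_room(impulse)
print(impulse_response)
# [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.125]
```

The single click comes back as a series of echoes, each half as loud as the last. A real room does the same thing with thousands of overlapping reflections, which is what the recorded .wav file captures.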
So you might wonder how a recording of a balloon bursting can turn your dry vocal track into a beautifully reverberant recording. The answer lies in the nature of the impulse signal. Because an impulse is instantaneous, it is an aperiodic signal. Most audio is made up of periodic signals which, thanks to the Fourier Series, can be broken down into sine waves of specific frequencies. You can then plot those frequencies on a chart as in the example below, where you can see some frequencies are more represented than others.
However, if you plot the frequency spectrum of an impulse, rather than having no frequency components, you'll find that all frequencies are equally represented (see below).
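You can check the flat spectrum yourself. The sketch below uses a deliberately naive discrete Fourier transform written with Python's standard library (real tools would use an FFT); the helper name `dft_magnitudes` and the 8-sample signal length are just illustrative choices. It compares an impulse with a plain sine wave.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of each frequency bin of a naive DFT of x."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

# An impulse: 1 for a single sample, 0 everywhere else.
impulse = [1.0] + [0.0] * 7
impulse_spectrum = dft_magnitudes(impulse)

# For contrast, a sine wave that completes 2 cycles in 8 samples.
sine = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
sine_spectrum = dft_magnitudes(sine)

print([round(m, 6) for m in impulse_spectrum])  # every bin has the same magnitude
print([round(m, 6) for m in sine_spectrum])     # energy only in bin 2 and its mirror, bin 6
```

The sine wave lights up a single frequency (plus its mirror image, a quirk of the DFT), while the impulse has the same magnitude in every bin.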
Since the impulse is a signal which contains all frequencies, the impulse response tells us what happens to each of those frequencies when they go through the filter (in this case the room is the filter). By applying this information to our audio track we can therefore recreate the response of the room as if the track were recorded there. This process is known as convolution. In practice it is usually carried out in the frequency domain, where it reduces to a multiplication, and it will be explained in Part 2.
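To make the idea of applying an impulse response to a track a little more concrete, here is a minimal sketch written directly in the time domain, which is easier to read than the frequency-domain approach a plugin would typically use. The sample values for `dry` and `impulse_response` are invented for illustration: every dry sample simply launches its own scaled copy of the impulse response, and the copies are added together.

```python
def convolve(dry, impulse_response):
    """Direct convolution: each dry sample triggers a scaled copy of the
    impulse response, and all the overlapping copies are summed."""
    wet = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, sample in enumerate(dry):
        for j, h in enumerate(impulse_response):
            wet[i + j] += sample * h
    return wet

dry = [1.0, 0.5, 0.25]               # made-up "dry" recording
impulse_response = [1.0, 0.0, 0.5]   # made-up room: direct sound plus one echo

wet = convolve(dry, impulse_response)
print(wet)
# [1.0, 0.5, 0.75, 0.25, 0.125]

# Sanity check: a unit impulse in gives the impulse response back out.
print(convolve([1.0], impulse_response))
# [1.0, 0.0, 0.5]
```

The nested loop is fine for a handful of samples but far too slow for real audio, which is one reason the frequency-domain route matters.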
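*Linear means the response is predictably proportional to the input (double the input and the output doubles; add two inputs together and the output is the sum of their individual outputs), and time-invariant means that if you delay the input, the output will be delayed by the same amount but otherwise unchanged.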
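Both properties are easy to test numerically. The sketch below uses a hypothetical `toy_system` (the input plus one quieter, delayed copy) and arbitrary test signals `a` and `b`. It illustrates the two definitions and says nothing about how well any real room satisfies them.

```python
def toy_system(x, delay=2, gain=0.5):
    """Hypothetical LTI system: the input plus one quieter copy
    arriving `delay` samples later."""
    return [
        x[n] + (gain * x[n - delay] if n >= delay else 0.0)
        for n in range(len(x))
    ]

a = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
b = [0.0, 2.0, 1.0, 0.0, 0.0, 0.0]

# Linearity: scaling and mixing the inputs scales and mixes the outputs.
mixed_input = [3.0 * p + 2.0 * q for p, q in zip(a, b)]
mixed_output = [3.0 * p + 2.0 * q for p, q in zip(toy_system(a), toy_system(b))]
is_linear = toy_system(mixed_input) == mixed_output

# Time invariance: delaying the input by one sample delays the output
# by one sample and changes nothing else.
delayed_input = [0.0] + b[:-1]
delayed_output = [0.0] + toy_system(b)[:-1]
is_time_invariant = toy_system(delayed_input) == delayed_output

print(is_linear, is_time_invariant)
# True True
```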