The FFT is one of the backbones of signal processing, allowing us to quickly transform a signal from the time domain to the frequency domain (and back again via the inverse FFT). Once in the frequency domain we can analyze or manipulate the signal: filtering, noise reduction, pitch detection or correction, time-stretching, or even removing elements from the signal entirely. The FFT, or Fast Fourier Transform, is an efficient algorithm for calculating the Discrete Fourier Transform, or DFT. The DFT itself is one of the many forms of the Fourier series, named after the French mathematician Jean-Baptiste Joseph Fourier, who discovered in the 19th century that any periodic signal can be decomposed into a (possibly infinite) sum of simple sinusoids of different frequencies.

The classic example used to demonstrate this incredibly useful property is the square wave. Looking at a square wave signal it’s not obvious that adding together smooth signals like sinusoids can produce a signal with discontinuities. The illustrations below nicely demonstrate the point, however (the images are captured from the interactive tool created by Lucas Vieira, posted at the bottom of the betterexplained.com article on the Fourier Transform, which you can also play around with to recreate the images below).

By taking a signal with fundamental frequency f and adding the odd harmonics 3f, 5f and 7f, you can see the signal more and more closely resembles a square wave. If you were to continue adding an infinite number of odd harmonics you would get an exact square wave. This animation from Wikipedia also shows the process nicely.

In the example above, it’s clear that the amplitudes of the harmonics are not all equal. In fact, to create a square wave you divide the amplitude of each harmonic by its harmonic number: the harmonic at 3f has amplitude 1/3, the one at 5f has amplitude 1/5, and so on. What the Fourier transform does is allow us to work backward, starting with a signal such as a square wave, or a complex audio signal, and figure out how much energy each harmonic has. Lucas Vieira again has created an excellent animation (for Wikimedia Commons) to demonstrate this.
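That recipe – odd harmonics with amplitudes 1/n – can be sketched in a few lines of Python with NumPy (the function name and parameters here are mine, purely illustrative):

```python
import numpy as np

def square_wave_approx(t, f, n_harmonics):
    """Sum odd harmonics f, 3f, 5f, ... with amplitudes 1, 1/3, 1/5, ..."""
    signal = np.zeros_like(t)
    for k in range(n_harmonics):
        n = 2 * k + 1  # odd harmonic number: 1, 3, 5, ...
        signal += np.sin(2 * np.pi * n * f * t) / n
    return signal

t = np.linspace(0, 1, 1000, endpoint=False)  # one second of "time"
approx = square_wave_approx(t, f=3, n_harmonics=50)
# With many harmonics the sum approaches a square wave of amplitude pi/4
```

With 50 harmonics the waveform is already flat to within a few percent away from the jumps (the residual ripple near each discontinuity is the Gibbs phenomenon).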

The DFT is the discrete version of the Fourier Transform, meaning we are dealing with sampled signals. When we talk about the Fourier series of digital signals, we’re using the DFT. The DFT is defined as

X(k) = Σ x(n) · e^(−j·2πkn/N), with the sum running from n = 0 to N−1

which looks pretty intimidating. Fortunately Practical Cryptography has a great intuitive tutorial explaining exactly what’s going on. It boils down to a type of correlation: essentially, we are correlating the signal against cosines and sines of various frequencies (the harmonics of the fundamental) to determine how much energy is contained in each harmonic.
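That correlation view is easy to see in code. Below is a deliberately naive O(N²) DFT sketch in Python with NumPy (`naive_dft` is my name for it, not a library function), checked against `numpy.fft.fft`:

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) DFT: correlate x against cosines and sines of each bin k."""
    N = len(x)
    X = np.zeros(N, dtype=complex)
    for k in range(N):        # each output frequency bin
        for n in range(N):    # correlate against e^(-j 2 pi k n / N)
            angle = 2 * np.pi * k * n / N
            X[k] += x[n] * (np.cos(angle) - 1j * np.sin(angle))
    return X

x = np.sin(2 * np.pi * 4 * np.arange(32) / 32)  # 4 cycles in 32 samples
X = naive_dft(x)
# The energy lands in bin 4 (and its mirror image, bin 28)
```

The FFT computes exactly the same numbers, just in O(N log N) time instead of O(N²).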

Correlation is a widely used concept in signal processing; at its heart it is a measure of how similar two signals are. If the correlation is high, the signals are similar; if the correlation is near zero, they are not. In general, if you ever see an equation that looks like this:

Σ x(i) · y(i), summed over i

where x and y are signals of some sort, you can say ‘Aha! We are calculating the correlation (or similarity) between x and y!’ How does such a simple-looking equation calculate the similarity between two signals? If x is very similar to y, then when x(i) is positive, y(i) will also be positive, and when x(i) is negative, y(i) will usually also be negative (since the signals are similar). If we multiply two positive numbers (x(i) and y(i)), we get a positive number; if we multiply two negatives, we also get a positive number. When we sum up all these positive numbers we get a large positive number. What happens if the signals are not very similar? Then sometimes x(i) will be positive while y(i) is negative and vice versa, and other times both will be positive. When we calculate the final sum we add a mix of positive and negative numbers, which results in a number near zero. That is the idea behind it: very similar signals achieve a high correlation, very different signals achieve a correlation near zero, and signals that are a little bit similar get a score somewhere in between.
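A quick sketch of this in Python (NumPy assumed; `correlation` is just an illustrative helper name):

```python
import numpy as np

def correlation(x, y):
    """Sum of pointwise products -- large when x and y move together."""
    return np.sum(x * y)

t = np.arange(1000) / 1000
a = np.sin(2 * np.pi * 5 * t)    # 5 Hz sine
b = np.sin(2 * np.pi * 5 * t)    # identical signal -> large positive sum
c = np.sin(2 * np.pi * 12 * t)   # different frequency -> near zero

print(correlation(a, b))  # 500.0 (every product sin^2 is non-negative)
print(correlation(a, c))  # ~0 (positive and negative products cancel)
```

Sinusoids at different whole-number frequencies over a full period are exactly orthogonal, which is why the DFT can cleanly separate the harmonics.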

So the DFT is an equation which iterates a number of times (N) for a set of frequencies (k), using a type of cross-correlation against sines and cosines of each frequency. But which frequencies are we talking about? Here’s where we get to make some choices. We’re dealing with discrete, i.e. sampled, signals, so we know what the maximum representable frequency is – half the sampling frequency, the Nyquist frequency. So by choosing the number of points of the DFT, we decide how much frequency resolution our transform has. Dividing the sampling frequency by N gives us our fundamental frequency (the bin spacing), and the harmonics are simply multiples of it.

So a typical example might be a block of audio. Our DFT has 1024 points, so N = 1024. It’s CD-quality audio, so our sampling rate is 44,100 Hz. That gives us a fundamental frequency of 43.066 Hz, and the ‘bins’ of our transform are evenly spaced all the way up to 22,050 Hz. You may have noticed that with that spacing we only get 512 bins. Because of the way the DFT works, the bins above the Nyquist frequency – i.e. from 22,050 Hz to 44,100 Hz – are essentially a mirror image of the bins below it.
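The bin arithmetic above can be checked in a few lines (a NumPy sketch; variable names are mine):

```python
import numpy as np

N = 1024       # DFT size
fs = 44100.0   # sampling rate (CD quality)

bin_width = fs / N                     # fundamental frequency / bin spacing
freqs = np.arange(N // 2) * bin_width  # the 512 usable bins, from 0 Hz up toward fs/2

print(bin_width)   # ~43.066 Hz
print(freqs[-1])   # highest bin sits just below the 22,050 Hz Nyquist frequency
```

NumPy also provides `np.fft.rfftfreq(N, 1/fs)` to compute these bin centres directly.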

So there are two ways to increase the resolution of our transform – increase the number of points of the DFT, or (provided the maximum frequency in the signal allows it) decrease the sampling rate, i.e. increase the sampling interval. In the example below I show the spectrum of a generated wave containing a 30Hz and a 100Hz tone. The first plot uses a sampling rate of 44,100Hz and a 1,024 point FFT, which isn’t all that useful for measuring this signal. So I increased the number of points to 4,096 – much better. I then went back to 1,024 points but decreased the sampling rate to 11,025Hz (knowing that my tones are still well below the Nyquist cutoff) and got the same result – except the maximum frequency measured is only 5.5kHz instead of 22.05kHz.
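The three measurements can be reproduced with NumPy’s FFT. This is a sketch, not the code behind the plots: `spectrum_peaks` is an illustrative helper of mine that just reports the two strongest bins:

```python
import numpy as np

def spectrum_peaks(fs, n_fft):
    """Generate the 30 Hz + 100 Hz test tone and return the two strongest bin frequencies."""
    t = np.arange(n_fft) / fs
    x = np.sin(2 * np.pi * 30 * t) + np.sin(2 * np.pi * 100 * t)
    mag = np.abs(np.fft.rfft(x))
    # indices of the two largest-magnitude bins, converted back to Hz
    top = np.sort(np.argsort(mag)[-2:]) * fs / n_fft
    return top

print(spectrum_peaks(44100, 1024))   # coarse: bins are ~43 Hz apart
print(spectrum_peaks(44100, 4096))   # finer ~10.8 Hz bins land near 30 Hz and 100 Hz
print(spectrum_peaks(11025, 1024))   # same ~10.8 Hz bins at the lower sampling rate
```

Note that 44,100/4,096 and 11,025/1,024 give the same bin spacing, which is why the last two calls agree.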

But 30Hz doesn’t fall exactly on any of the bin frequencies we’re measuring. In fact you can see from the table below the exact values (in dB) in each of the bins of the plot above.

| Frequency (Hz) | Level (dB) |
| -------------- | ---------- |
| 10.766602 | -47.746212 |
| 21.533203 | -25.607122 |
| **32.299805** | **-22.253382** |
| 43.066406 | -31.223303 |
| 53.833008 | -54.750214 |
| 64.599609 | -62.747978 |
| 75.366211 | -55.633057 |
| 86.132813 | -34.577427 |
| **96.899414** | **-24.465746** |
| **107.666016** | **-26.931360** |
| 118.432617 | -46.350952 |
| 129.199219 | -60.688786 |

You can see the maximum levels are in the bins closest to 30Hz and 100Hz, but since we don’t have bins at those exact frequencies the energy leaks into the nearest bins, in proportion to how close each bin is. The plots above have done some interpolation to graph the values nicely. So while we might not always have fine enough resolution to measure the frequency content exactly, one of the great properties of the Fourier transform is that if we take these measured values and perform the inverse transform, we get back *exactly* the same time-domain signal we started with.
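That round trip is easy to verify with NumPy (a sketch using the same two-tone signal as above, not the article’s plotting code):

```python
import numpy as np

fs = 11025
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * 30 * t) + np.sin(2 * np.pi * 100 * t)

X = np.fft.fft(x)             # forward transform: the tones leak across several bins
x_back = np.fft.ifft(X).real  # inverse transform recovers the original samples

print(np.max(np.abs(x - x_back)))  # tiny: identical up to floating-point rounding
```

Leakage smears energy between bins, but no information is lost – the forward/inverse pair is exact.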

In part two we’ll dive into some of the mathematics of the DFT (including Euler’s Formula and Complex Numbers), discuss the equation and introduce the FFT algorithm. Part three will be a breakdown of some actual FFT code.