This HTML version of is provided for convenience, but it is not the best format for the book. In particular, some of the symbols are not rendered correctly.
You might prefer to read the PDF version.
Chapter 7 Discrete Fourier Transform
We’ve been using the Discrete Fourier Transform (DFT) since Chapter 1, but I haven’t explained how it works. Now is the time.
If you understand the Discrete Cosine Transform (DCT), you will understand the DFT. The only difference is that instead of using the cosine function, we’ll use the complex exponential function. I’ll start by explaining complex exponentials, then I’ll follow the same progression as in Chapter 6:
7.1 Complex exponentials
One of the more interesting moves in mathematics is the generalization of an operation from one type to another. For example, factorial is a function that operates on integers; the natural definition for factorial of n is the product of all integers from 1 to n.
If you are of a certain inclination, you might wonder how to compute the factorial of a non-integer like 3.5. Since the natural definition doesn’t apply, you might look for other ways to compute the factorial function, ways that would work with non-integers.
In 1730, Leonhard Euler found one, a generalization of the factorial function that we know as the gamma function (see http://en.wikipedia.org/wiki/Gamma_function).
The natural definition of exponentiation is repeated multiplication. For example, φ3 = φ · φ · φ. But this definition doesn’t apply to non-integer exponents.
However, exponentiation can also be expressed as a power series:
This definition works with real numbers, imaginary numbers and, by a simple extension, with complex numbers. Applying this definition to a pure imaginary number, iφ, we get
By rearranging terms, we can show that this is equivalent to:
You can see the derivation at http://en.wikipedia.org/wiki/Euler's_formula.
This formula implies that eiφ is a complex number with magnitude 1; if you think of it as a point in the complex plane, it is always on the unit circle. And if you think of it as a vector, the angle in radians between the vector and the positive x-axis is the argument, φ.
In the case where the exponent is a complex number, we have:
NumPy provides a version of exp that works with complex numbers:
>>> phi = 1.5 >>> z = np.exp(1j * phi) >>> z (0.0707+0.997j)
When the argument to np.exp is imaginary or complex, the result is a complex number; specifically, an np.complex128, which is represented by two 64-bit floating-point numbers. In this example, the result is 0.0707+0.997j.
Complex numbers have attributes real and imag:
>>> z.real 0.0707 >>> z.imag 0.997
>>> abs(z) 1.0 >>> np.absolute(z) 1.0
To get the angle, you can use np.angle:
>>> np.angle(z) 1.5
7.2 Complex signals
If φ(t) is a function of time, ei φ(t) is also a function of time. Specifically,
In the special case where the frequency of the signal is constant, φ(t) is 2 π f t and the result is a complex sinusoid:
Or more generally, the signal might start at a phase offset φ0, yielding
class ComplexSinusoid(Sinusoid): def evaluate(self, ts): phases = PI2 * self.freq * ts + self.offset ys = self.amp * np.exp(1j * phases) return ys
The result is a NumPy array of complex numbers:
>>> signal = thinkdsp.ComplexSinusoid(freq=1, amp=0.6, offset=1) >>> wave = signal.make_wave(duration=1, framerate=4) >>> wave.ys [ 0.324+0.505j -0.505+0.324j -0.324-0.505j 0.505-0.324j]
This example evaluates the signal at 4 places equally spaced between 0 and 1 second. The resulting samples are complex numbers.
7.3 The synthesis problem
Just as we did with real sinusoids, we we can create compound signals by adding up complex sinusoids with different frequencies. And that brings us to the complex version of the synthesis problem: given the frequency and amplitude of each complex component, how do we evaluate the signal?
The simplest solution is to create ComplexSinusoid objects and add them up.
def synthesize1(amps, fs, ts): components = [thinkdsp.ComplexSinusoid(freq, amp) for amp, freq in zip(amps, fs)] signal = thinkdsp.SumSignal(*components) ys = signal.evaluate(ts) return ys
This function is almost identical to synthesize1 in Section 6.1; the only difference is that I replaced CosSignal with ComplexSinusoid.
Here’s an example:
amps = np.array([0.6, 0.25, 0.1, 0.05]) fs = [100, 200, 300, 400] framerate = 11025 ts = np.linspace(0, 1, framerate) ys = synthesize1(amps, fs, ts)
The result is:
[ 1.000 +0.000e+00j 0.995 +9.093e-02j 0.979 +1.803e-01j ..., 0.979 -1.803e-01j 0.995 -9.093e-02j 1.000 -5.081e-15j]
At the lowest level, a complex signal is a sequence of complex numbers. But how should we interpret it? We have some intuition for real signals: they represent quantities that vary in time; for example, a sound signal represents changes in air pressure. But nothing we measure in the world yields complex numbers.
Taking the second point of view, we can split the previous signal into its real and imaginary parts:
n = 500 thinkplot.plot(ts[:n], ys[:n].real, label='real') thinkplot.plot(ts[:n], ys[:n].imag, label='imag')
Figure 7.1 shows a segment of the result. The real part is a sum of cosines; the imaginary part is a sum of sines. Although the waveforms look different, they contain the same frequency components in the same proportions. To our ears, they sound the same (in general, we don’t hear phase offsets).
7.4 Synthesis with matrices
As we saw in Section 6.2, we can also express the synthesis problem in terms of matrix multiplication:
PI2 = np.pi * 2 def synthesize2(amps, fs, ts): args = np.outer(ts, fs) M = np.exp(1j * PI2 * args) ys = np.dot(M, amps) return ys
Again, amps is a NumPy array that contains a sequence of amplitudes.
fs is a sequence containing the frequencies of the components. ts contains the times where we will evaluate the signal.
args contains the outer product of ts and fs, with the ts running down the rows and the fs running across the columns (you might want to refer back to Figure 6.1).
Each column of matrix M contains a complex sinusoid with a particular frequency, evaluated at a sequence of ts.
When we multiply M by the amplitudes, the result is a vector whose elements correspond to the ts; each element is the sum of several complex sinusoids, evaluated at a particular time.
Here’s the example from the previous section again:
>>> ys = synthesize2(amps, fs, ts) >>> ys [ 1.000 +0.000e+00j 0.995 +9.093e-02j 0.979 +1.803e-01j ..., 0.979 -1.803e-01j 0.995 -9.093e-02j 1.000 -5.081e-15j]
The result is the same.
In this example the amplitudes are real, but they could also be complex. What effect does a complex amplitude have on the result? Remember that we can think of a complex number in two ways: either the sum of a real and imaginary part, x + i y, or the product of a real amplitude and a complex exponential, A ei φ0. Using the second interpretation, we can see what happens when we multiply a complex amplitude by a complex sinusoid. For each frequency, f, we have:
We can test that claim by running the previous example with complex amplitudes:
phi = 1.5 amps2 = amps * np.exp(1j * phi) ys2 = synthesize2(amps2, fs, ts) thinkplot.plot(ts[:n], ys.real[:n]) thinkplot.plot(ts[:n], ys2.real[:n])
Since amps is an array of reals, multiplying by np.exp(1j * phi) yields an array of complex numbers with phase offset phi radians, and the same magnitudes as amps.
Figure 7.2 shows waveforms with different phase offsets. With φ0 = 1.5 each frequency component gets shifted by about a quarter of a cycle. But components with different frequencies have different cycles; as a result, each component is shifted by a different amount in time. When we add up the components, the resulting waveforms look different.
Now that we have the more general solution to the synthesis problem – one that handles complex amplitudes – we are ready for the analysis problem.
7.5 The analysis problem
The analysis problem is the inverse of the synthesis problem: given a sequence of samples, y, and knowing the frequencies that make up the signal, can we compute the complex amplitudes of the components, a?
As we saw in Section 6.3, we can solve this problem by forming the synthesis matrix, M, and solving the system of linear equations, M a = y for a.
def analyze1(ys, fs, ts): args = np.outer(ts, fs) M = np.exp(1j * PI2 * args) amps = np.linalg.solve(M, ys) return amps
analyze1 takes a (possibly complex) wave array, ys, a sequence of real frequencies, fs, and a sequence of real times, ts. It returns a sequence of complex amplitudes, amps.
Continuing the previous example, we can confirm that analyze1 recovers the amplitudes we started with. For the linear system solver to work, M has to be square, so we need ys, fs and ts to have the same length. I’ll ensure that by slicing ys and ts down to the length of fs:
>>> n = len(fs) >>> amps2 = analyze1(ys[:n], fs, ts[:n]) >>> amps2 [ 0.60+0.j 0.25-0.j 0.10+0.j 0.05-0.j]
These are approximately the amplitudes we started with, although each component has a small imaginary part due to floating-point errors.
7.6 Efficient analysis
Unfortunately, solving a linear system of equations is slow. For the DCT, we were able to speed things up by choosing fs and ts so that M is orthogonal. That way, the inverse of M is the transpose of M, and we can compute both DCT and inverse DCT by matrix multiplication.
We’ll do the same thing for the DFT, with one small change. Since M is complex, we need it to be unitary, rather than orthogonal, which means that the inverse of M is the conjugate transpose of M, which we can compute by transposing the matrix and negating the imaginary part of each element. See http://en.wikipedia.org/wiki/Unitary_matrix.
N = 4 ts = np.arange(N) / N fs = np.arange(N) args = np.outer(ts, fs) M = np.exp(1j * PI2 * args)
MstarM = M.conj().transpose().dot(M)
The result, within the tolerance of floating-point error, is 4 I, so M is unitary except for an extra factor of N, similar to the extra factor of 2 we found with the DCT.
We can use this result to write a faster version of analyze1:
def analyze2(ys, fs, ts): args = np.outer(ts, fs) M = np.exp(1j * PI2 * args) amps = M.conj().transpose().dot(ys) / N return amps
And test it with appropriate values of fs and ts:
N = 4 amps = np.array([0.6, 0.25, 0.1, 0.05]) fs = np.arange(N) ts = np.arange(N) / N ys = synthesize2(amps, fs, ts) amps3 = analyze2(ys, fs, ts)
Again, the result is correct within the tolerance of floating-point arithmetic.
[ 0.60+0.j 0.25+0.j 0.10-0.j 0.05-0.j]
First, I’ll make a function to compute the synthesis matrix, M:
def synthesis_matrix(N): ts = np.arange(N) / N fs = np.arange(N) args = np.outer(ts, fs) M = np.exp(1j * PI2 * args) return M
Then I’ll write the function that takes ys and returns amps:
def analyze3(ys): N = len(ys) M = synthesis_matrix(N) amps = M.conj().transpose().dot(ys) / N return amps
We are almost done; analyze3 computes something very close to the DFT, with one difference. The conventional definition of DFT does not divide by N:
def dft(ys): N = len(ys) M = synthesis_matrix(N) amps = M.conj().transpose().dot(ys) return amps
Now we can confirm that my version yields the same result as np.fft.fft:
>>> dft(ys) [ 2.4+0.j 1.0+0.j 0.4-0.j 0.2-0.j]
The result is close to amps * N. And here’s the version in np.fft:
>>> np.fft.fft(ys) [ 2.4+0.j 1.0+0.j 0.4-0.j 0.2-0.j]
They are the same, within floating point error.
def idft(ys): N = len(ys) M = synthesis_matrix(N) amps = M.dot(ys) / N return amps
Finally, we can confirm that dft(idft(amps)) yields amps.
>>> ys = idft(amps) >>> dft(ys) [ 0.60+0.j 0.25+0.j 0.10-0.j 0.05-0.j]
If I could go back in time, I might change the definition of DFT so it divides by N and the inverse DFT doesn’t. That would be more consistent with my presentation of the synthesis and analysis problems.
Or I might change the definition so that both operations divide through by √N. Then the DFT and inverse DFT would be more symmetric.
But I can’t go back in time (yet!), so we’re stuck with a slightly weird convention. For practical purposes it doesn’t really matter.
7.8 The DFT is periodic
In this chapter I presented the DFT in the form of matrix multiplication. We compute the synthesis matrix, M, and the analysis matrix, M*. When we multiply M* by the wave array, y, each element of the result is the product of a row from M* and y, which we can write in the form of a summation:
Normally we evaluate this summation for N values of k, running from 0 to N−1. We could evaluate it for other values of k, but there is no point, because they start to repeat. That is, the value at k is the same as the value at k+N or k+2N or k−N, etc.
We can see that mathematically by plugging k+N into the summation:
Since there is a sum in the exponent, we can break it into two parts:
In the second term, the exponent is always an integer multiple of 2 π, so the result is always 1, and we can drop it:
And we can see that this summation is equivalent to DFT(y)[k]. So the DFT is periodic, with period N. You will need this result for one of the exercises below, which asks you to implement the Fast Fourier Transform (FFT).
As an aside, writing the DFT in the form of a summation provides an insight into how it works. If you review the diagram in Section 6.2, you’ll see that each column of the synthesis matrix is a signal evaluated at a sequence of times. The analysis matrix is the (conjugate) transpose of the synthesis matrix, so each row is a signal evaluated at a sequence of times.
Therefore, each summation is the correlation of y with one of the signals in the array (see Section 5.5). That is, each element of the DFT is a correlation that quantifies the similarity of the wave array, y, and a complex exponential at a particular frequency.
7.9 DFT of real signals
The Spectrum class in thinkdsp is based on np.ftt.rfft, which computes the “real DFT”; that is, it works with real signals. But the DFT as presented in this chapter is more general than that; it works with complex signals.
signal = thinkdsp.SawtoothSignal(freq=500) wave = signal.make_wave(duration=0.1, framerate=10000) hs = dft(wave.ys) amps = np.absolute(hs)
This code makes a sawtooth wave with frequency 500 Hz, sampled at frame rate 10 kHz. hs contains the complex DFT of the wave; amps contains the amplitude at each frequency. But what frequency do these amplitudes correspond to? If we look at the body of dft, we see:
fs = np.arange(N)
It’s tempting to think that these values are the right frequencies. The problem is that dft doesn’t know the sampling rate. The DFT assumes that the duration of the wave is 1 time unit, so it thinks the sampling rate is N per time unit. In order to interpret the frequencies, we have to convert from these arbitrary time units back to seconds, like this:
fs = np.arange(N) * framerate / N
With this change, the range of frequencies is from 0 to the actual frame rate, 10 kHz. Now we can plot the spectrum:
thinkplot.plot(fs, amps) thinkplot.config(xlabel='frequency (Hz)', ylabel='amplitude')
Figure 7.3 shows the amplitude of the signal for each frequency component from 0 to 10 kHz. The left half of the figure is what we should expect: the dominant frequency is at 500 Hz, with harmonics dropping off like 1/f.
But the right half of the figure is a surprise. Past 5000 Hz, the amplitude of the harmonics starts growing again, peaking at 9500 Hz. What’s going on?
The answer: aliasing. Remember that with frame rate 10000 Hz, the folding frequency is 5000 Hz. As we saw in Section 2.3, a component at 5500 Hz is indistinguishable from a component at 4500 Hz. When we evaluate the DFT at 5500 Hz, we get the same value as at 4500 Hz. Similarly, the value at 6000 Hz is the same as the one at 4000 Hz, and so on.
The DFT of a real signal is symmetric around the folding frequency. Since there is no additional information past this point, we can save time by evaluating only the first half of the DFT, and that’s exactly what np.fft.rfft does.
Solutions to these exercises are in chap07soln.ipynb.
Exercise 1 The notebook for this chapter, chap07.ipynb, contains additional examples and explanations. Read through it and run the code.
Exercise 2 In this chapter, I showed how we can express the DFT and inverse DFT as matrix multiplications. These operations take time proportional to N2, where N is the length of the wave array. That is fast enough for many applications, but there is a faster algorithm, the Fast Fourier Transform (FFT), which takes time proportional to N logN.
The key to the FFT is the Danielson-Lanczos lemma:
Where DFT(y)[n] is the nth element of the DFT of y; e is a wave array containing the even elements of y, and o contains the odd elements of y.
This lemma suggests a recursive algorithm for the DFT:
For the base case of this recursion, you could wait until the length of y is 1. In that case, DFT(y) = y. Or if the length of y is sufficiently small, you could compute its DFT by matrix multiplication, possibly using a precomputed matrix.
Hint: I suggest you implement this algorithm incrementally by starting with a version that is not truly recursive. In Step 2, instead of making a recursive call, use dft, as defined in Section 7.7, or np.fft.fft. Get Step 3 working, and confirm that the results are consistent with the other implementations. Then add a base case and confirm that it works. Finally, replace Step 2 with recursive calls.
One more hint: Remember that the DFT is periodic; you might find np.tile useful.
You can read more about the FFT at https://en.wikipedia.org/wiki/Fast_Fourier_transform.
Are you using one of our books in a class?We'd like to know about it. Please consider filling out this short survey.