This HTML version of Think DSP is provided for convenience, but it is not the best format for the book. In particular, some of the math symbols are not rendered correctly.

You might prefer to read the PDF version, or you can buy a hard copy from Amazon.

Chapter 7 Discrete Fourier Transform

We’ve been using the discrete Fourier transform (DFT) since Chapter 1, but I haven’t explained how it works. Now is the time.

If you understand the discrete cosine transform (DCT), you will understand the DFT. The only difference is that instead of using the cosine function, we’ll use the complex exponential function. I’ll start by explaining complex exponentials, then I’ll follow the same progression as in Chapter 6:

We’ll start with the synthesis problem: given a set of frequency components and their amplitudes, how can we construct a signal? The synthesis problem is equivalent to the inverse DFT.
Then I’ll rewrite the synthesis problem in the form of matrix multiplication using NumPy arrays.
Next we’ll solve the analysis problem, which is equivalent to the DFT: given a signal, how to we find the amplitude and phase offset of its frequency components?
Finally, we’ll use linear algebra to find a more efficient way to compute the DFT.

The code for this chapter is in chap07.ipynb, which is in the repository for this book (see Section 0.2). You can also view it at http://tinyurl.com/thinkdsp07.

7.1 Complex exponentials

One of the more interesting moves in mathematics is the generalization of an operation from one type to another. For example, factorial is a function that operates on integers; the natural definition for factorial of n is the product of all integers from 1 to n.

If you are of a certain inclination, you might wonder how to compute the factorial of a non-integer like 3.5. Since the natural definition doesn’t apply, you might look for other ways to compute the factorial function, ways that would work with non-integers.

In 1730, Leonhard Euler found one, a generalization of the factorial function that we know as the gamma function (see http://en.wikipedia.org/wiki/Gamma_function).

Euler also found one of the most useful generalizations in applied mathematics, the complex exponential function.

The natural definition of exponentiation is repeated multiplication. For example, φ³ = φ · φ · φ. But this definition doesn’t apply to non-integer exponents.

However, exponentiation can also be expressed as a power series:

e^φ= 1 + φ + φ²/2! + φ³/3! + ...

This definition works with real numbers, imaginary numbers and, by a simple extension, with complex numbers. Applying this definition to a pure imaginary number, iφ, we get

e^iφ = 1 + iφ − φ²/2! − iφ³/3! + ...

By rearranging terms, we can show that this is equivalent to:

e^iφ = cosφ + i sinφ

You can see the derivation at http://en.wikipedia.org/wiki/Euler's_formula.

This formula implies that e^iφ is a complex number with magnitude 1; if you think of it as a point in the complex plane, it is always on the unit circle. And if you think of it as a vector, the angle in radians between the vector and the positive x-axis is the argument, φ.

In the case where the exponent is a complex number, we have:

e^a + iφ = e^a e^iφ = A e^iφ

where A is a real number that indicates amplitude and e^iφ is a unit complex number that indicates angle.

NumPy provides a version of exp that works with complex numbers:

>>> phi = 1.5
>>> z = np.exp(1j * phi)
>>> z
(0.0707+0.997j)

Python uses j to represent the imaginary unit, rather than i. A number ending in j is considered imaginary, so 1j is just i.

When the argument to np.exp is imaginary or complex, the result is a complex number; specifically, a np.complex128, which is represented by two 64-bit floating-point numbers. In this example, the result is 0.0707+0.997j.

Complex numbers have attributes real and imag:

>>> z.real
0.0707
>>> z.imag
0.997

To get the magnitude, you can use the built-in function abs or np.absolute:

>>> abs(z)
1.0
>>> np.absolute(z)
1.0

To get the angle, you can use np.angle:

>>> np.angle(z)
1.5

This example confirms that e^{i φ} is a complex number with magnitude 1 and angle φ radians.

7.2 Complex signals

If φ(t) is a function of time, e^{i φ(t)} is also a function of time. Specifically,

e^i φ(t) = cosφ(t) + i sinφ(t)

This function describes a quantity that varies in time, so it is a signal. Specifically, it is a complex exponential signal.

In the special case where the frequency of the signal is constant, φ(t) is 2 π f t and the result is a complex sinusoid:

e^{i 2 π f t} = cos2 π f t + i sin2 π f t

Or more generally, the signal might start at a phase offset φ₀, yielding

e^{i (2 π f t + φ₀)}

thinkdsp provides an implementation of this signal, ComplexSinusoid:

class ComplexSinusoid(Sinusoid):

   def evaluate(self, ts):
        phases = PI2 * self.freq * ts + self.offset
        ys = self.amp * np.exp(1j * phases)
        return ys

ComplexSinusoid inherits __init__ from Sinusoid. It provides a version of evaluate that is almost identical to Sinusoid.evaluate; the only difference is that it uses np.exp instead of np.sin.

The result is a NumPy array of complex numbers:

>>> signal = thinkdsp.ComplexSinusoid(freq=1, amp=0.6, offset=1)
>>> wave = signal.make_wave(duration=1, framerate=4)
>>> wave.ys
[ 0.324+0.505j -0.505+0.324j -0.324-0.505j  0.505-0.324j]

The frequency of this signal is 1 cycle per second, the amplitude is 0.6 (in unspecified units), and the phase offset is 1 radian.

This example evaluates the signal at 4 places equally spaced between 0 and 1 second. The resulting samples are complex numbers.

7.3 The synthesis problem

Just as we did with real sinusoids, we can create compound signals by adding up complex sinusoids with different frequencies. And that brings us to the complex version of the synthesis problem: given the frequency and amplitude of each complex component, how do we evaluate the signal?

The simplest solution is to create ComplexSinusoid objects and add them up.

def synthesize1(amps, fs, ts):
    components = [thinkdsp.ComplexSinusoid(freq, amp)
                  for amp, freq in zip(amps, fs)]
    signal = thinkdsp.SumSignal(*components)
    ys = signal.evaluate(ts)
    return ys

This function is almost identical to synthesize1 in Section 6.1; the only difference is that I replaced CosSignal with ComplexSinusoid.

Here’s an example:

    amps = np.array([0.6, 0.25, 0.1, 0.05])
    fs = [100, 200, 300, 400]
    framerate = 11025
    ts = np.linspace(0, 1, framerate)
    ys = synthesize1(amps, fs, ts)

The result is:

[ 1.000 +0.000e+00j  0.995 +9.093e-02j  0.979 +1.803e-01j ...,
  0.979 -1.803e-01j  0.995 -9.093e-02j  1.000 -5.081e-15j]

At the lowest level, a complex signal is a sequence of complex numbers. But how should we interpret it? We have some intuition for real signals: they represent quantities that vary in time; for example, a sound signal represents changes in air pressure. But nothing we measure in the world yields complex numbers.

Figure 7.1: Real and imaginary parts of a mixture of complex sinusoids.

So what is a complex signal? I don’t have a satisfying answer to this question. The best I can offer is two unsatisfying answers:

A complex signal is a mathematical abstraction that is useful for computation and analysis, but it does not correspond directly with anything in the real world.
If you like, you can think of a complex signal as a sequence of complex numbers that contains two signals as its real and imaginary parts.

Taking the second point of view, we can split the previous signal into its real and imaginary parts:

    n = 500
    plt.plot(ts[:n], ys[:n].real, label='real')
    plt.plot(ts[:n], ys[:n].imag, label='imag')

Figure 7.1 shows a segment of the result. The real part is a sum of cosines; the imaginary part is a sum of sines. Although the waveforms look different, they contain the same frequency components in the same proportions. To our ears, they sound the same (in general, we don’t hear phase offsets).

7.4 Synthesis with matrices

As we saw in Section 6.2, we can also express the synthesis problem in terms of matrix multiplication:

PI2 = np.pi * 2

def synthesize2(amps, fs, ts):
    args = np.outer(ts, fs)
    M = np.exp(1j * PI2 * args)
    ys = np.dot(M, amps)
    return ys

Again, amps is a NumPy array that contains a sequence of amplitudes.

fs is a sequence containing the frequencies of the components. ts contains the times where we will evaluate the signal.

args contains the outer product of ts and fs, with the ts running down the rows and the fs running across the columns (you might want to refer back to Figure 6.1).

Each column of matrix M contains a complex sinusoid with a particular frequency, evaluated at a sequence of ts.

When we multiply M by the amplitudes, the result is a vector whose elements correspond to the ts; each element is the sum of several complex sinusoids, evaluated at a particular time.

Here’s the example from the previous section again:

>>> ys = synthesize2(amps, fs, ts)
>>> ys
[ 1.000 +0.000e+00j  0.995 +9.093e-02j  0.979 +1.803e-01j ...,
  0.979 -1.803e-01j  0.995 -9.093e-02j  1.000 -5.081e-15j]

The result is the same.

In this example the amplitudes are real, but they could also be complex. What effect does a complex amplitude have on the result? Remember that we can think of a complex number in two ways: either the sum of a real and imaginary part, x + i y, or the product of a real amplitude and a complex exponential, A e^{i φ₀}. Using the second interpretation, we can see what happens when we multiply a complex amplitude by a complex sinusoid. For each frequency, f, we have:

A expi φ₀ · expi 2 π f t = A expi (2 π f t + φ₀)

Multiplying by A e^{i φ₀} multiplies the amplitude by A and adds the phase offset φ₀.

Figure 7.2: Real part of two complex signals that differ by a phase offset.

We can test that claim by running the previous example with φ₀ = 1.5 for all frequency components:

    phi = 1.5
    amps2 = amps * np.exp(1j * phi)
    ys2 = synthesize2(amps2, fs, ts)

    plt.plot(ts[:n], ys.real[:n])
    plt.plot(ts[:n], ys2.real[:n])

Since amps is an array of reals, multiplying by np.exp(1j * phi) yields an array of complex numbers with phase offset phi radians, and the same magnitudes as amps.

Figure 7.2 shows the result. The phase offset φ₀ = 1.5 shifts the wave to the left by about one quarter of a cycle; it also changes the waveform, because the same phase offset applied to different frequencies changes how the frequency components line up with each other.

Now that we have the more general solution to the synthesis problem – one that handles complex amplitudes – we are ready for the analysis problem.

7.5 The analysis problem

The analysis problem is the inverse of the synthesis problem: given a sequence of samples, y, and knowing the frequencies that make up the signal, can we compute the complex amplitudes of the components, a?

As we saw in Section 6.3, we can solve this problem by forming the synthesis matrix, M, and solving the system of linear equations, M a = y for a.

def analyze1(ys, fs, ts):
    args = np.outer(ts, fs)
    M = np.exp(1j * PI2 * args)
    amps = np.linalg.solve(M, ys)
    return amps

analyze1 takes a (possibly complex) wave array, ys, a sequence of real frequencies, fs, and a sequence of real times, ts. It returns a sequence of complex amplitudes, amps.

Continuing the previous example, we can confirm that analyze1 recovers the amplitudes we started with. For the linear system solver to work, M has to be square, so we need ys, fs and ts to have the same length. I’ll insure that by slicing ys and ts down to the length of fs:

>>> n = len(fs)
>>> amps2 = analyze1(ys[:n], fs, ts[:n])
>>> amps2
[ 0.60+0.j  0.25-0.j  0.10+0.j  0.05-0.j]

These are approximately the amplitudes we started with, although each component has a small imaginary part due to floating-point errors.

7.6 Efficient analysis

Unfortunately, solving a linear system of equations is slow. For the DCT, we were able to speed things up by choosing fs and ts so that M is orthogonal. That way, the inverse of M is the transpose of M, and we can compute both DCT and inverse DCT by matrix multiplication.

We’ll do the same thing for the DFT, with one small change. Since M is complex, we need it to be unitary, rather than orthogonal, which means that the inverse of M is the conjugate transpose of M, which we can compute by transposing the matrix and negating the imaginary part of each element. See http://en.wikipedia.org/wiki/Unitary_matrix.

The NumPy methods conj and transpose do what we want. Here’s the code that computes M for N=4 components:

    N = 4
    ts = np.arange(N) / N
    fs = np.arange(N)
    args = np.outer(ts, fs)
    M = np.exp(1j * PI2 * args)

If M is unitary, M^*M = I, where M^* is the conjugate transpose of M, and I is the identity matrix. We can test whether M is unitary like this:

    MstarM = M.conj().transpose().dot(M)

The result, within the tolerance of floating-point error, is 4 I, so M is unitary except for an extra factor of N, similar to the extra factor of 2 we found with the DCT.

We can use this result to write a faster version of analyze1:

def analyze2(ys, fs, ts):
    args = np.outer(ts, fs)
    M = np.exp(1j * PI2 * args)
    amps = M.conj().transpose().dot(ys) / N
    return amps

And test it with appropriate values of fs and ts:

    N = 4
    amps = np.array([0.6, 0.25, 0.1, 0.05])
    fs = np.arange(N)
    ts = np.arange(N) / N
    ys = synthesize2(amps, fs, ts)
    amps3 = analyze2(ys, fs, ts)

Again, the result is correct within the tolerance of floating-point arithmetic.

[ 0.60+0.j  0.25+0.j  0.10-0.j  0.05-0.j]

7.7 DFT

As a function, analyze2 would be hard to use because it only works if fs and ts are chosen correctly. Instead, I will rewrite it to take just ys and compute fs and ts itself.

First, I’ll make a function to compute the synthesis matrix, M:

def synthesis_matrix(N):
    ts = np.arange(N) / N
    fs = np.arange(N)
    args = np.outer(ts, fs)
    M = np.exp(1j * PI2 * args)
    return M

Then I’ll write the function that takes ys and returns amps:

def analyze3(ys):
    N = len(ys)
    M = synthesis_matrix(N)
    amps = M.conj().transpose().dot(ys) / N
    return amps

We are almost done; analyze3 computes something very close to the DFT, with one difference. The conventional definition of DFT does not divide by N:

def dft(ys):
    N = len(ys)
    M = synthesis_matrix(N)
    amps = M.conj().transpose().dot(ys)
    return amps

Now we can confirm that my version yields the same result as FFT:

>>> dft(ys)
[ 2.4+0.j  1.0+0.j  0.4-0.j  0.2-0.j]

The result is close to amps * N. And here’s the version in np.fft:

>>> np.fft.fft(ys)
[ 2.4+0.j  1.0+0.j  0.4-0.j  0.2-0.j]

They are the same, within floating point error.

The inverse DFT is almost the same, except we don’t have to transpose and conjugate M, and now we have to divide through by N:

def idft(amps):
    N = len(amps)
    M = synthesis_matrix(N)
    ys = M.dot(amps) / N
    return ys

Finally, we can confirm that dft(idft(amps)) yields amps.

>>> ys = idft(amps)
>>> dft(ys)
[ 0.60+0.j  0.25+0.j  0.10-0.j  0.05-0.j]

If I could go back in time, I might change the definition of DFT so it divides by N and the inverse DFT doesn’t. That would be more consistent with my presentation of the synthesis and analysis problems.

Or I might change the definition so that both operations divide through by √N. Then the DFT and inverse DFT would be more symmetric.

But I can’t go back in time (yet!), so we’re stuck with a slightly weird convention. For practical purposes it doesn’t really matter.

7.8 The DFT is periodic

In this chapter I presented the DFT in the form of matrix multiplication. We compute the synthesis matrix, M, and the analysis matrix, M^*. When we multiply M^* by the wave array, y, each element of the result is the product of a row from M^* and y, which we can write in the form of a summation:

DFT(y)[k] =

∑

y[n] exp(−2 π i n k / N)

where k is an index of frequency from 0 to N−1 and n is an index of time from 0 to N−1. So DFT(y)[k] is the kth element of the DFT of y.

Normally we evaluate this summation for N values of k, running from 0 to N−1. We could evaluate it for other values of k, but there is no point, because they start to repeat. That is, the value at k is the same as the value at k+N or k+2N or k−N, etc.

We can see that mathematically by plugging k+N into the summation:

DFT(y)[k+N] =

∑

y[n] exp(−2 π i n (k+N) / N)

Since there is a sum in the exponent, we can break it into two parts:

DFT(y)[k+N] =

∑

y[n] exp(−2 π i n k / N) exp(−2 π i n N / N)

In the second term, the exponent is always an integer multiple of 2 π, so the result is always 1, and we can drop it:

DFT(y)[k+N] =

∑

y[n] exp(−2 π i n k / N)

And we can see that this summation is equivalent to DFT(y)[k]. So the DFT is periodic, with period N. You will need this result for one of the exercises below, which asks you to implement the Fast Fourier Transform (FFT).

As an aside, writing the DFT in the form of a summation provides an insight into how it works. If you review the diagram in Section 6.2, you’ll see that each column of the synthesis matrix is a signal evaluated at a sequence of times. The analysis matrix is the (conjugate) transpose of the synthesis matrix, so each row is a signal evaluated at a sequence of times.

Therefore, each summation is the correlation of y with one of the signals in the array (see Section 5.5). That is, each element of the DFT is a correlation that quantifies the similarity of the wave array, y, and a complex exponential at a particular frequency.

7.9 DFT of real signals

Figure 7.3: DFT of a 500 Hz sawtooth signal sampled at 10 kHz.

The Spectrum class in thinkdsp is based on np.fft.rfft, which computes the “real DFT”; that is, it works with real signals. But the DFT as presented in this chapter is more general than that; it works with complex signals.

So what happens when we apply the “full DFT” to a real signal? Let’s look at an example:

    signal = thinkdsp.SawtoothSignal(freq=500)
    wave = signal.make_wave(duration=0.1, framerate=10000)
    hs = dft(wave.ys)
    amps = np.absolute(hs)

This code makes a sawtooth wave with frequency 500 Hz, sampled at framerate 10 kHz. hs contains the complex DFT of the wave; amps contains the amplitude at each frequency. But what frequency do these amplitudes correspond to? If we look at the body of dft, we see:

    fs = np.arange(N)

It’s tempting to think that these values are the right frequencies. The problem is that dft doesn’t know the sampling rate. The DFT assumes that the duration of the wave is 1 time unit, so it thinks the sampling rate is N per time unit. In order to interpret the frequencies, we have to convert from these arbitrary time units back to seconds, like this:

    fs = np.arange(N) * framerate / N

With this change, the range of frequencies is from 0 to the actual framerate, 10 kHz. Now we can plot the spectrum:

    plt.plot(fs, amps)

Figure 7.3 shows the amplitude of the signal for each frequency component from 0 to 10 kHz. The left half of the figure is what we should expect: the dominant frequency is at 500 Hz, with harmonics dropping off like 1/f.

But the right half of the figure is a surprise. Past 5000 Hz, the amplitude of the harmonics start growing again, peaking at 9500 Hz. What’s going on?

The answer: aliasing. Remember that with framerate 10000 Hz, the folding frequency is 5000 Hz. As we saw in Section 2.3, a component at 5500 Hz is indistinguishable from a component at 4500 Hz. When we evaluate the DFT at 5500 Hz, we get the same value as at 4500 Hz. Similarly, the value at 6000 Hz is the same as the one at 4000 Hz, and so on.

The DFT of a real signal is symmetric around the folding frequency. Since there is no additional information past this point, we can save time by evaluating only the first half of the DFT, and that’s exactly what np.fft.rfft does.

7.10 Exercises

Solutions to these exercises are in chap07soln.ipynb.

Exercise 1 The notebook for this chapter, chap07.ipynb, contains additional examples and explanations. Read through it and run the code.

Exercise 2 In this chapter, I showed how we can express the DFT and inverse DFT as matrix multiplications. These operations take time proportional to N², where N is the length of the wave array. That is fast enough for many applications, but there is a faster algorithm, the Fast Fourier Transform (FFT), which takes time proportional to N logN.

The key to the FFT is the Danielson-Lanczos lemma:

DFT(y)[n] = DFT(e)[n] + exp(−2 π i n / N) DFT(o)[n]

Where DFT(y)[n] is the nth element of the DFT of y; e is a wave array containing the even elements of y, and o contains the odd elements of y.

This lemma suggests a recursive algorithm for the DFT:

Given a wave array, y, split it into its even elements, e, and its odd elements, o.
Compute the DFT of e and o by making recursive calls.
Compute DFT(y) for each value of n using the Danielson-Lanczos lemma.

For the base case of this recursion, you could wait until the length of y is 1. In that case, DFT(y) = y. Or if the length of y is sufficiently small, you could compute its DFT by matrix multiplication, possibly using a precomputed matrix.

Hint: I suggest you implement this algorithm incrementally by starting with a version that is not truly recursive. In Step 2, instead of making a recursive call, use dft, as defined by Section 7.7, or np.fft.fft. Get Step 3 working, and confirm that the results are consistent with the other implementations. Then add a base case and confirm that it works. Finally, replace Step 2 with recursive calls.

One more hint: Remember that the DFT is periodic; you might find np.tile useful.

You can read more about the FFT at https://en.wikipedia.org/wiki/Fast_Fourier_transform.

Chapter 7 Discrete Fourier Transform

7.1 Complex exponentials

7.2 Complex signals

7.3 The synthesis problem

7.4 Synthesis with matrices

7.5 The analysis problem

7.6 Efficient analysis

7.7 DFT

7.8 The DFT is periodic

7.9 DFT of real signals

7.10 Exercises

Are you using one of our books in a class?