This HTML version of is provided for convenience, but it is not the best format for the book. In particular, some of the symbols are not rendered correctly.
You might prefer to read the PDF version.
Chapter 5 Autocorrelation
In the previous chapter I characterized white noise as “uncorrelated”, which means that each value is independent of the others, and Brownian noise as “correlated”, because each value depends on the preceding value. In this chapter I define these terms more precisely and present the autocorrelation function, which is a useful tool for signal analysis.
In general, correlation between variables means that if you know the value of one, you have some information about the other. There are several ways to quantify correlation, but the most common is the Pearson product-moment correlation coefficient, usually denoted ρ. For two variables, x and y, that each contain N values:
Pearson’s correlation is always between -1 and +1 (including both). If ρ is positive, we say that the correlation is positive, which means that when one variable is high, the other tends to be high. If ρ is negative, the correlation is negative, so when one variable is high, the other tends to be low.
The magnitude of ρ indicates the strength of the correlation. If ρ is 1 or -1, the variables are perfectly correlated, which means that if you know one, you can make a perfect prediction about the other. If ρ is near zero, the correlation is probably weak, so if you know one, it doesn’t tell you much about the others,
I say “probably weak” because it is also possible that there is a nonlinear relationship that is not captured by the coefficient of correlation. Nonlinear relationships are often important in statistics, but less often relevant for signal processing, so I won’t say more about them here.
I’ll present an example with only two variables. First, I define a function that constructs sine waves with different phase offsets:
def make_sine(offset): signal = thinkdsp.SinSignal(freq=440, offset=offset) wave = signal.make_wave(duration=0.5, framerate=10000) return wave
Next I instantiate two waves with different offsets:
wave1 = make_sine(offset=0) wave2 = make_sine(offset=1)
Figure 5.1 shows what the first few periods of these waves look like. When one wave is high, the other is usually high, so we expect them to be correlated.
>>> corr_matrix = np.corrcoef(wave1.ys, wave2.ys, ddof=0) [[ 1. 0.54] [ 0.54 1. ]]
The option ddof=0 indicates that corrcoef should divide by N, as in the equation above, rather than use the default, N−1.
The result is a correlation matrix: the first element is the correlation of wave1 with itself, which is always 1. Similarly, the last element is the correlation of wave2 with itself.
The off-diagonal elements contain the value we’re interested in, the correlation of wave1 and wave2. The value 0.54 indicates that the strength of the correlation is moderate.
As the phase offset increases, this correlation decreases until the waves are 180 degrees out of phase, which yields correlation -1. Then it increases until the offset differs by 360 degrees. At that point we have come full circle and the correlation is 1.
Figure 5.2 shows the relationship between correlation and phase offset for a sine wave. The shape of that curve should look familiar; it is a cosine.
thinkdsp provides a simple interface for computing the correlation between waves:
>>> wave1.corr(wave2) 0.54
5.2 Serial correlation
Signals often represent measurements of quantities that vary in time. For example, the sound signals we’ve worked with represent measurements of voltage (or current), which correspond to the changes in air pressure we perceive as sound.
Measurements like this almost always have serial correlation, which is the correlation between each element and the next (or the previous). To compute serial correlation, we can shift a signal and then compute the correlation of the shifted version with the original.
def serial_corr(wave, lag=1): n = len(wave) y1 = wave.ys[lag:] y2 = wave.ys[:n-lag] corr = np.corrcoef(y1, y2, ddof=0)[0, 1] return corr
signal = thinkdsp.UncorrelatedGaussianNoise() wave = signal.make_wave(duration=0.5, framerate=11025) serial_corr(wave)
When I ran this example, I got 0.006, which indicates a very small serial correlation. You might get a different value when you run it, but it should be comparably small.
signal = thinkdsp.BrownianNoise() wave = signal.make_wave(duration=0.5, framerate=11025) serial_corr(wave)
Sure enough, the result I got is greater than 0.999.
signal = thinkdsp.PinkNoise(beta=1) wave = signal.make_wave(duration=0.5, framerate=11025) serial_corr(wave)
With parameter β=1, I got a serial correlation of 0.851. As we vary the parameter from β=0, which is uncorrelated noise, to β=2, which is Brownian, serial correlation ranges from 0 to almost 1, as shown in Figure 5.3.
You can think of
def autocorr(wave): lags = range(len(wave.ys)//2) corrs = [serial_corr(wave, lag) for lag in lags] return lags, corrs
autocorr takes a Wave object and returns the autocorrelation function as a pair of sequences: lags is a sequence of integers from 0 to half the length of the wave; corrs is the sequence of serial correlations for each lag.
Figure 5.4 shows autocorrelation functions for pink noise with three values of β. For low values of β, the signal is less correlated, and the autocorrelation function drops off to zero quickly. For larger values, serial correlation is stronger and drops off more slowly. With β=1.7 serial correlation is strong even for long lags; this phenomenon is called long-range dependence, because it indicates that each value in the signal depends on many preceding values.
5.4 Autocorrelation of periodic signals
As an example, I downloaded from freesound.org a recording of someone singing a chirp; the repository for this book includes the file: 28042__bcjordan__voicedownbew.wav. You can use the Jupyter notebook for this chapter, chap05.ipynb, to play it.
Figure 5.5 shows the spectrogram of this wave. The fundamental frequency and some of the harmonics show up clearly. The chirp starts near 500 Hz and drops down to about 300 Hz, roughly from C5 to E4.
duration = 0.01 segment = wave.segment(start=0.2, duration=duration) spectrum = segment.make_spectrum() spectrum.plot(high=1000)
This segment starts at 0.2 seconds and lasts 0.01 seconds. Figure 5.6 shows its spectrum. There is a clear peak near 400 Hz, but it is hard to identify the pitch precisely. The length of the segment is 441 samples at a frame rate of 44100 Hz, so the frequency resolution is 100 Hz (see Section 3.5). That means the estimated pitch might be off by 50 Hz; in musical terms, the range from 350 Hz to 450 Hz is about 5 semitones, which is a big difference!
We could get better frequency resolution by taking a longer segment, but since the pitch is changing over time, we would also get “motion blur”; that is, the peak would spread between the start and end pitch of the segment, as we saw in Section 3.3.
We can estimate pitch more precisely using autocorrelation. If a signal is periodic, we expect the autocorrelation to spike when the lag equals the period.
To show why that works, I’ll plot two segments from the same recording.
def plot_shifted(wave, offset=0.001, start=0.2): thinkplot.preplot(2) segment1 = wave.segment(start=start, duration=0.01) segment1.plot(linewidth=2, alpha=0.8) segment2 = wave.segment(start=start-offset, duration=0.01) segment2.shift(offset) segment2.plot(linewidth=2, alpha=0.4) corr = segment1.corr(segment2) text = r'$\rho =$ %.2g' % corr thinkplot.text(segment1.start+0.0005, -0.8, text) thinkplot.config(xlabel='Time (s)')
One segment starts at 0.2 seconds; the other starts 0.0023 seconds later. Figure 5.7 shows the result. The segments are similar, and their correlation is 0.99. This result suggests that the period is near 0.0023 seconds, which corresponds to a frequency of 435 Hz.
lags, corrs = autocorr(segment) thinkplot.plot(lags, corrs)
Figure 5.8 shows the autocorrelation function for the segment starting at t=0.2 seconds. The first peak occurs at lag=101. We can compute the frequency that corresponds to that period like this:
period = lag / segment.framerate frequency = 1 / period
The estimated fundamental frequency is 437 Hz. To evaluate the precision of the estimate, we can run the same computation with lags 100 and 102, which correspond to frequencies 432 and 441 Hz. The frequency precision using autocorrelation is less than 10 Hz, compared with 100 Hz using the spectrum. In musical terms, the expected error is about 30 cents (a third of a semitone).
5.5 Correlation as dot product
I started the chapter with this definition of Pearson’s correlation coefficient:
Then I used ρ to define serial correlation and autocorrelation. That’s consistent with how these terms are used in statistics, but in the context of signal processing, the definitions are a little different.
In signal processing, we are often working with unbiased signals, where the mean is 0, and normalized signals, where the standard deviation is 1. In that case, the definition of ρ simplifies to:
And it is common to simplify even further:
This definition of correlation is not “standardized”, so it doesn’t generally fall between -1 and 1. But it has other useful properties.
If you think of x and y as vectors, you might recognize this formula as the dot product, x · y. See http://en.wikipedia.org/wiki/Dot_product.
The dot product indicates the degree to which the signals are similar. If they are normalized so their standard deviations are 1,
where θ is the angle between the vectors. And that explains why Figure 5.2 is a cosine curve.
5.6 Using NumPy
NumPy provides a function, correlate, that computes the correlation of two functions or the autocorrelation of one function. We can use it to compute the autocorrelation of the segment from the previous section:
corrs2 = np.correlate(segment.ys, segment.ys, mode='same')
The option mode tells correlate what range of lag to use. With the value ’same’, the range is from −N/2 to N/2, where N is the length of the wave array.
Figure 5.9 shows the result. It is symmetric because the two signals are identical, so a negative lag on one has the same effect as a positive lag on the other. To compare with the results from autocorr, we can select the second half:
N = len(corrs2) half = corrs2[N//2:]
If you compare Figure 5.9 to Figure 5.8, you’ll notice that the correlations computed by np.correlate get smaller as the lags increase. That’s because np.correlate uses the unstandardized definition of correlation; as the lag gets bigger, the number of points in the overlap between the two signals gets smaller, so the magnitude of the correlations decreases.
We can correct that by dividing through by the lengths:
lengths = range(N, N//2, -1) half /= lengths
Finally, we can normalize the results so the correlation with lag=0 is 1.
half /= half
With these adjustments, the results computed by autocorr and np.correlate are nearly the same. They still differ by 1-2%. The reason is not important, but if you are curious: autocorr standardizes the correlations independently for each lag; for np.correlate, we standardized them all at the end.
More importantly, now you know what autocorrelation is, how to use it to estimate the fundamental period of a signal, and two ways to compute it.
Solutions to these exercises are in chap05soln.ipynb.
Exercise 1 The Jupyter notebook for this chapter, chap05.ipynb, includes an interaction that lets you compute autocorrelations for different lags. Use this interaction to estimate the pitch of the vocal chirp for a few different start times.
Exercise 2 The example code in
Exercise 3 If you did the exercises in the previous chapter, you downloaded the historical price of BitCoins and estimated the power spectrum of the price changes. Using the same data, compute the autocorrelation of BitCoin prices. Does the autocorrelation function drop off quickly? Is there evidence of periodic behavior?
Exercise 4 In the repository for this book you will find a Jupyter notebook called
Vi Hart has an excellent video called “What is up with Noises? (The Science and Mathematics of Sound, Frequency, and Pitch)”; it demonstrates the missing fundamental phenomenon and explains how pitch perception works (at least, to the degree that we know). Watch it at https://www.youtube.com/watch?v=i_0DXxNeaQ0.
Are you using one of our books in a class?We'd like to know about it. Please consider filling out this short survey.