# Chapter 9  Self-organized criticality

## 9.1  Sand piles

In 1987 Bak, Tang and Wiesenfeld published a paper in Physical Review Letters, “Self-organized criticality: an explanation of 1/f noise.” You can download it from http://prl.aps.org/abstract/PRL/v59/i4/p381_1.

The title takes some explaining. A system is “critical” if it is in transition between two phases; for example, water at its freezing point is a critical system.

A variety of critical systems demonstrate common behaviors:

• Long-tailed distributions of some physical quantities: for example, in freezing water the distribution of crystal sizes is characterized by a power law.
• Fractal geometries: freezing water tends to form fractal patterns—the canonical example is a snowflake. Fractals are characterized by self-similarity; that is, parts of the pattern resemble scaled copies of the whole.
• Variations in time that exhibit pink noise: what we call “noise” is a time series with many frequency components. In “white” noise, all of the components have equal power. In “pink” noise, low-frequency components have more power than high-frequency components. Specifically, the power at frequency f is proportional to 1/f. Visible light with this power spectrum looks pink, hence the name.

Critical systems are usually unstable. For example, to keep water in a partially frozen state requires active control of the temperature. If the system is near the critical temperature, a small deviation tends to move the system into one phase or the other.

Many natural systems exhibit characteristic behaviors of criticality, but if critical points are unstable, they should not be common in nature. This is the puzzle Bak, Tang and Wiesenfeld address. Their solution is called self-organized criticality (SOC), where “self-organized” means that from any initial condition, the system tends to move toward a critical state, and stay there, without external control.

As an example, they propose a model of a sand pile. The model is not realistic, but it has become the standard example of self-organized criticality.

The model is a 2-D cellular automaton where the state of each cell, z(i,j), represents the slope of a part of a sand pile. During each time step, each cell is checked to see whether it exceeds some critical value, K. If so, an “avalanche” occurs that transfers sand to neighboring cells; specifically, z(i,j) is decreased by 4, and each of the 4 neighbors is increased by 1.

At the perimeter of the grid, all cells are kept at z=0, so the excess spills over the edge. To initialize the system, Bak et al. start with all z > K and evolve the system until it stabilizes. Then they observe the effect of small perturbations; they choose a cell at random, increment its value by 1, and evolve the system, again, until it stabilizes.

For each perturbation, they measure D, the total number of cells that are affected by the resulting avalanche. Most of the time, D is small, usually 1. But occasionally a large avalanche affects a substantial fraction of the grid. The distribution of D turns out to be long-tailed, which supports the claim that the system is in a critical state.

Exercise 1

Read the paper and write a program that implements their CA. You might want to start with a copy of thinkcomplex.com/Life.py.

See if you can reproduce their Figure 2(a), which shows the distribution of cluster sizes.

After the system has been running for a while, compute its fractal dimension.

## 9.2  Spectral density

To understand 1/f noise, we have to take a detour to understand spectral density. If h(t) is a signal that varies in time, it can be described by its power spectral density, P(f), which is a function that maps from a frequency, f, to the amount of power the signal contains at that frequency.

This analysis applies to any varying signal, but I use sound as an example. The note we call “middle A” corresponds to a frequency of 440 cycles per second, or Hertz (Hz). If you strike a middle A tuning fork, it produces a sound that is close to a pure sine wave at 440 Hz. But if you play the same note on a piano, what you hear is a complex sound that contains components at many different frequencies. The frequency with the most power is 440, which is why we perceive the sound as a middle A, but there are also components at 880, 1320 and many higher frequencies. These components are called “harmonics.”

What we identify as the pitch of a sound is usually the dominant frequency component. But if a sound contains many different components with roughly the same power, it has no particular pitch. To our ears, it sounds like noise.

Spectral analysis is the process of taking a signal and computing its spectral density1. The first step is to compute the Fourier transform of h(t):

H(ω) =
 ∞ −∞
h(tei ω t dt

where ω = 2 π f is the angular frequency in radians per second (rather than cycles per second). The advantage of working with angular frequency is that it reduces the number of times the term 2 π appears.

H(ω) is written with a capital letter because it is a complex number, which you can think of as a vector with a magnitude, |H(ω)|, and an angle. The power spectral density is related to the Fourier transform by the following relation:

 P(f) = |H(2 π f)|2

Depending on the application, we may not care about the difference between f and −f. In that case, we would use the one-sided power spectral density:

 P(f) = |H(2 π f)|2 + |H(−2 π f)|2

So far we have assumed that h(t) is a continuous function, but often it is a series of values at discrete times. In that case we can replace the continuous Fourier transform with the discrete Fourier transform (DFT). Suppose that we have N values hk with k in the range from 0 to N−1. The DFT is written Hn, where n is an index related to frequency:

Hn =
 N−1 ∑ k=0
hk e2 π i k n / N     (1)

Each element of this sequence corresponds to a particular frequency. If the elements of hk are equally spaced in time, with time step d, the frequency that corresponds to Hn is

fn =
 n N d

To get the one-sided power spectral density, you can compute Hn with n in the range −N/2 to N/2, and

 Pn = |Hn|2 + |H−n|2

To avoid negative indices, it is conventional to compute Hn with n in the range 0 to N−1 and use the relation Hn = HNn to convert.

Exercise 2

Write a function named dft that takes h, a sequence of N values, and returns the sequence Hn with n in the range 0 to N−1.

Python provides support for complex numbers as a built-in type. The function complex takes two arguments, a real part and an imaginary part, and returns a complex number:

```>>> complex(1, 1)
(1+1j)
```

The cmath module provides math functions that support complex numbers:

```>>> import cmath
>>> i = complex(0, 1)
>>> N = 128
>>> cmath.exp(2 * math.pi * i / N)
(0.99879545620517241+0.049067674327418015j)
>>> abs(complex(3, 4))
5.0
```

What is the order of growth run time of dft?

Hoisting is a way to speed up code by moving an expression that does not change out of a loop. See http://en.wikipedia.org/wiki/Loop-invariant_code_motion. You can make your code easier to read and more efficient by hoisting:

 W = e2 π i / N     (2)

What effect does hoisting have on the order of growth?

## 9.3  Fast Fourier Transform

The Fast Fourier Transform (FFT) is an efficient algorithm for computing the DFT. It is often attributed to Cooley and Tukey, but it was independently discovered several times earlier. See http://en.wikipedia.org/wiki/Fast_Fourier_transform.

The first step toward the FFT is to rewrite Equation 9.1 with the substitution W = e2 π i/N:

Hn =
 N−1 ∑ k=0
hk Wn k     (3)

The second step is the Danielson-Lanczos Lemma which states

 Hn = Hne + Wk Hno

where He is the DFT of the even-indexed elements of h, and Ho is the DFT of the odd-indexed elements. This lemma follows naturally from the definition of Hn; you can see a proof at http://mathworld.wolfram.com/Danielson-LanczosLemma.html.

This lemma suggests a recursive algorithm for evaluating the DFT of a sequence h. If h has only a single element, then H=h. Otherwise:

1. Split h into he and ho.
2. Compute He and Ho by making two recursive calls.
3. Use the lemma to combine He and Ho to form H.

If H has 2N elements, He and Ho have only N. In order to merge them, you have to wrap around, but you can do that because Hn+Ne = Hne.

This recursive algorithm is the Fast Fourier Transform.

Exercise 3

Write a function called fft that implements the Fast Fourier Transform. To check your function, you can compare it to the function fft provided by the module numpy.fft.

What is the order of growth run time of your implementation? What is the order of growth for the space required?

Most FFT implementations use a clever indexing scheme to avoid copying the sequence; instead, they transform the elements in place. You can read http://en.wikipedia.org/wiki/Butterfly_diagram to get the details.

Once your fft is working, write a function named psd that takes a sequence, h, and returns its one-sided power spectral density, P.

## 9.4  Pink noise

In a followup paper in 1988, Bak, Tang and Wiesenfeld looked at a time series F(t), which is the number of cells that exceed the threshold during each time step. If I understand their model, they seed avalanches by incrementing the state of a random cell at random intervals; for example, there might be a fixed probability during each time step that a cell is incremented. In this model (unlike the previous one) there may be more than one avalanche at a time.

A plot of F(t) shows that it is noisy, but not completely random, which is consistent with pink, or 1/f noise. As a stronger test, they plot the power spectral density of F on a log-log scale. If F is 1/f noise, then

Pn ∼ 1 / fn =
 N d n

Since the units of time in this model are arbitrary, we can choose d=1. Taking the log of both sides yields:

 logPn ∼ logN − logn

So on a log-log scale, the PSD of 1/f noise is a straight line with slope -1.

Exercise 4

Modify your implementation of the sand pile model to increment a random cell at random intervals and record the number of cells that exceed the threshold during each time step.

To estimate the average PSD, you can divide the time series into chunks of 128 to 256 values, compute the PSD of each chunk, and average together the PSDs. Plot the result on a log-log scale and estimate the slope.

Exercise 5

In a 1989 paper, “Self-organized criticality in the ’Game of Life’,” Bak, Chen and Creutz present evidence that the Game of Life is a self-organized critical system (http://www.nature.com/nature/journal/v342/n6251/abs/342780a0.html).

To replicate their tests, run the GoL CA until it stabilizes, then choose a random cell and toggle it. Run the CA until it stabilizes again, keeping track of t, the number of time steps it takes, and s, the number of cells affected. Repeat for a large number of trials and plot the distributions of t and s. Also, see if you can think of an effective experiment to test for 1/f noise.

Some later work has called the conclusions of this paper into question. You might want to read Blok, “Life without bounds,” at http://zoology.ubc.ca/~rikblok/lib/blok95b.html.

## 9.5  Reductionism and Holism

The original paper by Bak, Tang and Wiesenfeld is one of the most frequently-cited papers in the last few decades. Many new systems have been shown to be self-organized critical, and the sand-pile model, in particular, has been studied in detail.

As it turns out, the sand-pile model is not a very good model of a sand pile. Sand is dense and not very sticky, so momentum has a non-negligible effect on the behavior of avalanches. As a result, there are fewer very large and very small avalanches than the model predicts, and the distribution is not long tailed.

Bak has suggested that this observation misses the point. The sand pile model is not meant to be a realistic model of a sand pile; it is meant to be a simple example of a broad category of models.

To understand this point, it is useful to think about two kinds of models, reductionist and holistic. A reductionist model describes a system by describing its parts and their interactions. When a reductionist model is used as an explanation, it depends on an analogy between the components of the model and the components of the system.

For example, to explain why the ideal gas law holds, we can model the molecules that make up a gas with point masses, and model their interactions as elastic collisions. If you simulate or analyze this model you find that it obeys the ideal gas law. This model is satisfactory to the degree that molecules in a gas behave like molecules in the model. The analogy is between the parts of the system and the parts of the model.

Holistic models are more focused on similarities between systems and less interested in analogous parts. A holistic approach to modeling often consists of two steps, not necessarily in this order:

• Identify a kind of behavior that appears in a variety of systems.
• Find the simplest model that demonstrates that behavior.

For example, in The Selfish Gene, Richard Dawkins suggests that genetic evolution is just one example of an evolutionary system. He identifies the essential elements of the category—discrete replicators, variability and differential reproduction—and proposes that any system that has these elements displays similar behavior, including complexity without design.

As another example of an evolutionary system, he proposes memes, which are thoughts or behaviors that are “replicated” by transmission from person to person. As memes compete for the resource of human attention, they evolve in ways that are similar to genetic evolution.

Critics of memetics have pointed out that memes are a poor analogy for genes. Memes differ from genes in many obvious ways. But Dawkins has argued that these differences are beside the point because memes are not supposed to be analogous to genes. Rather, memetics and genetics are examples of the same category—evolutionary systems. The differences between them emphasize the real point, which is that evolution is a general model that applies to many seemingly disparate systems. The logical structure of this argument is shown in Figure 9.1.

Bak has made a similar argument that self-organized criticality is a general model for a broad category of systems. According to Wikipedia, “SOC is typically observed in slowly-driven non-equilibrium systems with extended degrees of freedom and a high level of nonlinearity.”

Many natural systems demonstrate behaviors characteristic of critical systems. Bak’s explanation for this prevalence is that these systems are examples of the broad category of self-organized criticality. There are two ways to support this argument. One is to build a realistic model of a particular system and show that the model exhibits SOC. The second is to show that SOC is a feature of many diverse models, and to identify the essential characteristics those models have in common.

The first approach, which I characterize as reductionist, can explain the behavior of a particular system. The second, holistic, approach, explains the prevalence of criticality in natural systems. They are different models with different purposes.

For reductionist models, realism is the primary virtue, and simplicity is secondary. For holistic models, it is the other way around.

Exercise 6

Exercise 7

In a 1996 paper in Nature, Frette et al report the results of experiments with rice piles (http://www.nature.com/nature/journal/v379/n6560/abs/379049a0.html). They find that some kinds of rice yield evidence of critical behavior, but others do not.

Similarly, Pruessner and Jensen studied large-scale versions of the forest fire model (using an algorithm similar to Newman and Ziff’s). In their 2004 paper, “Efficient algorithm for the forest fire model,” they present evidence that the system is not critical after all (http://pre.aps.org/abstract/PRE/v70/i6/e066707).

How do these results bear on Bak’s claim that SOC explains the prevalence of critical phenomena in nature?

Exercise 8

In The Fractal Geometry of Nature, Benoit Mandelbrot proposes what he calls a “heretical” explanation for the prevalence of long-tailed distributions in natural systems (page 344). It may not be, as Bak suggests, that many systems can generate this behavior in isolation. Instead there may be only a few, but there may be interactions between systems that cause the behavior to propagate.

To support this argument, Mandelbrot points out:

• The distribution of observed data is often “the joint effect of a fixed underlying ’true distribution’ and a highly variable ’filter.’"
• Long-tailed distributions are robust to filtering; that is, “a wide variety of filters leave their asymptotic behavior unchanged.”

What do you think of this argument? Would you characterize it as reductionist or holist?

## 9.6  SOC, causation and prediction

If a stock market index drops by a fraction of a percent in a day, there is no need for an explanation. But if it drops 10%, people want to know why. Pundits on television are willing to offer explanations, but the real answer may be that there is no explanation.

Day-to-day variability in the stock market shows evidence of criticality: the distribution of value changes is long-tailed and the time series exhibits 1/f noise. If the stock market is a self-organized critical system, we should expect occasional large changes as part of the ordinary behavior of the market.

The distribution of earthquake sizes is also long-tailed, and there are simple models of the dynamics of geological faults that might explain this behavior. If these models are right, they imply that large earthquakes are unexceptional; that is, they do not require explanation any more than small earthquakes do.

Similarly, Charles Perrow has suggested that failures in large engineered systems, like nuclear power plants, are like avalanches in the sand pile model. Most failures are small, isolated and harmless, but occasionally a coincidence of bad fortune yields a catastrophe. When big accidents occur, investigators go looking for the cause, but if Perrow’s “normal accident theory” is correct, there may be no special cause of large failures.

These conclusions are not comforting. Among other things, they imply that large earthquakes and some kinds of accidents are fundamentally unpredictable. It is impossible to look at the state of a critical system and say whether a large avalanche is “due.” If the system is in a critical state, then a large avalanche is always possible. It just depends on the next grain of sand.

In a sand-pile model, what is the cause of a large avalanche? Philosophers sometimes distinguish the proximate cause, which is most immediately responsible, from the ultimate cause, which is, for whatever reason, considered the true cause.

In the sand-pile model, the proximate cause of an avalanche is a grain of sand, but the grain that causes a large avalanche is identical to any other grain, so it offers no special explanation. The ultimate cause of a large avalanche is the structure and dynamics of the systems as a whole: large avalanches occur because they are a property of the system.

Many social phenomena, including wars, revolutions, epidemics, inventions and terrorist attacks, are characterized by long-tailed distributions. If the reason for these distributions is that social systems are critical, that suggests that major historical events may be fundamentally unpredictable and unexplainable.

Exercise 9

Read about the “Great Man” theory of history at http://en.wikipedia.org/wiki/Great_man_theory. What implication does self-organized criticality have for this theory?

1
The presentation here follows Press et al, Numerical Recipes in C.

#### Are you using one of our books in a class?

We'd like to know about it. Please consider filling out this short survey.