Previous Up Next

This HTML version of Think Complexity, 2nd Edition is provided for convenience, but it is not the best format of the book. In particular, some of the symbols are not rendered correctly.

You might prefer to read the PDF version.

Chapter 8  Self-organized criticality

In the previous chapter we saw an example of a system with a critical point and we explored one of the common properties of critical systems, fractal geometry.

In this chapter, we explore two other properties of critical systems: heavy-tailed distributions, which we saw in Chapter ?? and pink noise, which I’ll explain in this chapter.

These properties are interesting in part because they appear frequently in nature; that is, many natural systems produce fractal-like geometry, heavy-tailed distributions, and pink noise.

This observation raises a natural question: why do so many natural systems have properties of critical systems? A possible answer is self-organized criticality (SOC), which is the tendency of some systems to evolve toward and stay in a critical state.

In this chapter I’ll present a sand pile model, which was the first system shown to exhibit SOC.

The code for this chapter is in chap08.ipynb in the repository for this book. More information about working with the code is in Section ??.

8.1  Critical Systems

Many critical systems demonstrate common behaviors:

  • Fractal geometry: For example, freezing water tends to form fractal patterns, including snowflakes and other crystal structures. Fractals are characterized by self-similarity; that is, parts of the pattern are similar to scaled copies of the whole.

  • Heavy-tailed distributions of some physical quantities: For example, in freezing water the distribution of crystal sizes is characterized by a power law.

  • Variations in time that exhibit pink noise. Complex signals can be decomposed into their frequency components. In pink noise, low-frequency components have more power than high-frequency components. Specifically, the power at frequency f is proportional to 1/f.

Critical systems are usually unstable. For example, to keep water in a partially frozen state requires active control of the temperature. If the system is near the critical temperature, a small deviation tends to move the system into one phase or the other.

Many natural systems exhibit characteristic behaviors of criticality, but if critical points are unstable, they should not be common in nature. This is the puzzle Bak, Tang and Wiesenfeld address. Their solution is called self-organized criticality (SOC), where “self-organized” means that from any initial condition, the system moves toward a critical state, and stays there, without external control.

8.2  Sand Piles

The sand pile model was proposed by Bak, Tang and Wiesenfeld in 1987. It is not meant to be a realistic model of a sand pile, but rather an abstraction that models physical systems with (1) a large number of elements that (2) interact with their neighbors.

The sand pile model is a 2-D cellular automaton where the state of each cell represents the slope of a part of a sand pile. During each time step, each cell is checked to see whether it exceeds a critical value, K, which is usually 3. If so, it “topples” and transfers sand to four neighboring cells; that is, the slope of the cell is decreased by 4, and each of the neighbors is increased by 1. At the perimeter of the grid, all cells are kept at slope 0, so the excess spills over the edge.

Bak, Tang and Wiesenfeld initialize all cells a level level greater than K and run the model until it stabilizes. Then they observe the effect of small perturbations; they choose a cell at random, increment its value by 1, and run the model, again, until it stabilizes.

For each perturbation, they measure T, the number of time steps the pile takes to stabilize, and S, the total number of cells that topple1.

Most of the time, dropping an single grain causes no cells to topple, so T=1 and S=0. But occasionally a single grain can cause an avalanche that affects a substantial fraction of the grid. The distributions of T and S turn out to be heavy-tailed, which supports the claim that the system is in a critical state.

They conclude that the sand pile model exhibits “self-organized criticality”, which means that from the initial condition it evolves toward a critical state without the need for external control or what they call “fine tuning” of any parameters. And the model stays in a critical state as more grains are added.

In the next few sections I replicate their experiments and interpret the results.

8.3  Implementing the Sand Pile

To implement the sand pile model, I define a class called SandPile that inherits from Cell2D, which is defined in Cell2D.py.

class SandPile(Cell2D): def __init__(self, n, m, level=9): self.array = np.ones((n, m)) * level

All values in the array are initialized to level, which is generally greater than the toppling threshold, K.

Here’s the step method that finds all cells above K and topples them:

kernel = np.array([[0, 1, 0], [1,-4, 1], [0, 1, 0]], dtype=np.int32) def step(self, K=3): toppling = self.array > K c = correlate2d(toppling, self.kernel, mode='same') self.array += c return num_toppled

To explain how that works, I’ll start with a small pile with just two cells ready to topple:

pile = SandPile(n=3, m=5, level=0) pile.array[1, 1] = 4 pile.array[1, 3] = 4

Initially, pile.array looks like this:

[[0 0 0 0 0] [0 4 0 4 0] [0 0 0 0 0]]

Now we can select the cells that are above the toppling threshold:

toppling = pile.array > K

The result is an array of booleans, but we can use it as if it were an array of integers like this:

[[0 0 0 0 0] [0 1 0 1 0] [0 0 0 0 0]]

If we correlate this array with the kernel, it makes copies of the kernel at each location where toppling is 1.

c = correlate2d(toppling, kernel, mode='same')

And here’s the result:

[[ 0 1 0 1 0] [ 1 -4 2 -4 1] [ 0 1 0 1 0]]

Notice that where the copies of the kernel overlap, they add up.

This array contains the change for each cell, which we use to update the original array:

pile.array += c

And here’s the result.

[[0 1 0 1 0] [1 0 2 0 1] [0 1 0 1 0]]

So that’s how step works.

By default, correlate2d considers the boundary of the array to be fixed at zero, so any grains of sand that go over the edge disappear.

SandPile also provides run, which calls step until no more cells topple:

def run(self): total = 0 for i in itertools.count(1): num_toppled = self.step() total += num_toppled if num_toppled == 0: return i, total

The return value is a tuple that contains the number of time steps and the total number of cells that toppled.

If you are not familiar with itertools.count, it is an infinite generator that counts up from the given initial value, so the for loop runs until step returns 0.

Finally, the drop method chooses a random cell and adds a grain of sand:

def drop(self): a = self.array n, m = a.shape index = np.random.randint(n), np.random.randint(m) a[index] += 1

Let’s look at a bigger example, with n=20:

pile = SandPile(n=20, level=10) pile.run()

Figure 8.1:

With an initial level of 10, this sand pile takes 332 time steps to reach equilibrium, with a total of 53,336 topplings. Figure ?? (left) show the configuration, after this initial run. Notice that it has the repeating elements that are characteristic of fractals. We’ll come back to that soon.

Figure ?? (middle) shows the configuration of the sand pile after dropping 200 grains onto random cells, each time running until the pile reaches equilibrium. The symmetry of the initial configuration has been broken; the configuration looks random.

Finally Figure ?? (right) shows the configuration after 400 drops. It looks similar to the configuration after 200 drops. In fact, the pile is now in a steady state where its statistical properties don’t change over time. I’ll explain some of those statistical properties in the next section.

8.4  Heavy-tailed distributions

If the sand pile model is in a critical state, we expect to find heavy-tailed distributions for quantities like the duration and size of avalanches. So let’s take a look.

I’ll make a larger sand pile, with n=50, an initial level of 30, and run until equilibrium:

pile2 = SandPile(n=50, level=30) pile2.run()

Next, I’ll run 100,000 random drops

iters = 100000 res = [pile2.drop_and_run() for _ in range(iters)]

As the name suggests, drop_and_run calls drop and run and returns the duration of the avalanche and total number of cells that toppled.

So res is a list of (T, S) tuples, where T is duration, in time steps, and S is cells toppled. We can use np.transpose to unpack res into two NumPy arrays:

T, S = np.transpose(res)

A large majority of drops have duration 1 and no toppled cells, so we’ll filter those out.

T = T[T>1] S = S[S>0]

The distributions of T and S have many small values and a few very large ones. I’ll use the Hist class from thinkstats2 to make a histogram of the values; that is, a map from each value to the number of times it occurs.

from thinkstats2 import Hist histT = Hist(T) histS = Hist(S)

Figure 8.2:


Figure 8.3:

Figure ?? shows the results for values less than 50. But as we saw in Section ??, we can get a clearer picture of these distributions by plotting them on a log-log scale, as shown in Figure ??.

For values between 1 and 100, the distributions are nearly straight on a log-log scale, which is indicative of a heavy tail. The gray lines in the figure have slope -1, which suggests that these distributions follow a power law with parameter α=1.

For values greater than 100, the distributions fall away more quickly than the power law model, which means there are fewer very large values than the model predicts. One explanation is that this effect is due to the finite size of the sand pile, so we might expect larger piles to fit the power law better.

Another possibility, which you can explore in one of the exercises at the end of this chapter, is that these distributions do not strictly obey a power law. But even if they are not power-law distributions, they are still heavy-tailed as we expect for a system in a critical state.

8.5  Fractals

Another property of critical systems is fractal geometry. The initial configuration in Figure ?? (left) resembles a fractal, but you can’t always tell by looking. A more reliable way to identify a fractal is to estimate its fractal dimension, as we saw in Section ?? and Section ??.

I’ll start by making a bigger sand pile, with n=131 and initial level 22.

pile3 = SandPile(n=131, level=22) pile3.run()

By the way, it takes 28,379 steps for this pile to reach equilibrium, with more than 200 million cells toppled.

To see the resulting pattern more clearly, I select cells with levels 0, 1, 2, and 3, and plot them separately:

def draw_four(viewer, vals=range(4)): thinkplot.preplot(rows=2, cols=2) a = viewer.viewee.array for i, val in enumerate(vals): thinkplot.subplot(i+1) viewer.draw_array(a==vals[i], vmax=1)

draw_four takes a SandPileViewer object, which is defined in Sand.py. The parameter vals is the list of values we want to plot; the default values are 0, 1, 2, and 3.

Here’s how it’s used:

viewer3 = SandPileViewer(pile3) draw_four(viewer3)

Figure 8.4:

Figure ?? shows the results. Now for each of these patterns we can estimate the fractal dimension using a box-counting algorithm: we’ll count the number of cells in a small box at the center of the pile, and then see how the number of cell increases as the box gets bigger. Here’s my implementation:

def count_cells(a): n, m = a.shape end = min(n, m) res = [] for i in range(1, end, 2): top = (n-i) // 2 left = (m-i) // 2 box = a[top:top+i, left:left+i] total = np.sum(box) res.append((i, i**2, total)) return np.transpose(res)

The parameter, a, is a NumPy array of booleans or 0s and 1s. The size of the box is initially 1. Each time through the loop, it increases by 2 until it reaches end, which is the smaller of n and m.

Each time through the loop, box is a set of cells with width and height i, centered in the array. total is the number of “on” cells in the box.

The result is a list of tuples, where each tuple contains i and i**2, for purposes of comparison, and the number of cells in the box.

Finally, we use np.transpose to make a NumPy array with 3 rows containing i, i**2, and total.

To estimate the fractal dimension, we extract the rows:

steps, steps2, cells = res

Then we can plot the results:

thinkplot.plot(steps, steps2, linestyle='dashed') thinkplot.plot(steps, cells) thinkplot.plot(steps, steps, linestyle='dashed')

And use linregress to fit a line on a log-log scale.

from scipy.stats import linregress params = linregress(np.log(steps), np.log(cells)) slope = params[0]

Figure 8.5:

Figure ?? shows the results. Notice that only val=2 (lower left) starts from box size 1 because the center cell has value 2; the other lines start at the first box size that contains a nonzero cell count.

On a log-log scale, the cell counts form nearly straight lines, which indicates that we are measuring fractal dimension over a valid range of box sizes.

And the estimated fractal dimensions are:

0 1.871 1 3.502 2 1.781 3 2.084

The fractal dimension for values 0, 1, and 2 seems to be clearly non-integer, which indicates that the image is fractal.

Strictly, the fractal dimension for value 3 is indistinguishable from 2, but given the results for the other values, the apparent curvature of the line, and the appearance of the pattern, it seems likely that it is also fractal.

One of the exercises at the end of this chapter asks you to run this analysis again with different values of n and level to see if the estimated dimensions are consistent.

8.6  Spectral Density

The title of the original paper that presented the sand pile model is “Self-Organized Criticality: An Explanation of 1/f Noise”. As the subtitle suggests, Bak, Tang and Wiesenfeld were trying to explain why many natural and engineered systems exhibit 1/f noise, which is also known as “flicker noise” and “pink noise”.

To understand pink noise, we have to take a detour to understand signals, spectral analysis, and noise.

Signal:
A signal is any quantity that varies in time. One example is sound, which is variation in air density. In this section we’ll explore how avalanche durations and sizes vary over time steps.
Spectral analysis:
Any signal can be decomposed into a set of frequency components with different volume or power. For example, the sound of a violin playing the A above middle C contains a dominant component at frequency 440 Hz, but it also contains less powerful components at 880 Hz, 1320 Hz, and other integer multiples of the fundamental. Spectral analysis is the process of finding the components that make up a signal and their powers, which is called its spectrum.
Noise:
In common use, “noise” is usually an unwanted sound, but in the context of signal processing, it is a signal that contains many frequency components.

There are many kinds of noise. For example, “white noise” is a signal that has components with equal power over a wide range of frequencies.

Other kinds of noise have different relationships between frequency and power. In “red noise”, the power at frequency f is 1/f2, which we can write like this:

P(f) = 1/f2 

We can generalize that by replacing the exponent 2 with a parameter β:

P(f) = 1/fβ

When β=0, this equation describes white noise; when β=2 it describes red noise. When the parameter is near 1, we call the result 1/f noise. More generally, noise with any value between 0 and 2 is called “pink”, because it’s between white and red.

So how does this apply to the sand pile model? Suppose that every time a cell topples, it makes a sound. If we recorded a sand pile model while its running, what would it sound like?

As my implementation of SandPile runs, it records the number of cells that topple during each time step, recording the results in a list called toppled_seq. For example, after running the model in Section ??, we can extract the resulting signal:

signal = pile2.toppled_seq

To compute the spectrum of a signal (again, that’s the frequencies it contains and their powers) we can use the Fast Fourier Transform (FFT).

The only problem is that the spectrum of a noise signal tends to be noisy. However, we can smooth it out by breaking a long signal into segments, computing the FFT of each segment, and then computing the average power at each frequency.

One version of this algorithm is called “Welch’s method” and SciPy provides an implementation. We can use it like this:

from scipy.signal import welch nperseg = 2048 freqs, spectrum = welch(signal, nperseg=nperseg, fs=nperseg)

nperseg is the length of the segments the signal is broken into. With longer segments we get more frequency components, but since there are fewer segments to average, the results are noisier.

fs is the “sampling frequency”, which is the number of data points in the signal per unit of time. By setting fs=nperseg, we get a range of frequencies from 0 to nperseg/2, but the units of time in the model are arbitrary, so the frequencies don’t mean very much.

The return values, freqs and powers, are NumPy arrays containing the frequencies of the components and their corresponding powers.

If the signal is pink noise, we expect

P(f) = 1/fβ

Taking the log of both sides yields

logP(f) = −β logf 

So if we plot powers versus freqs on a log-log scale, we expect a straight line with slope −β.


Figure 8.6:

Figure ?? shows the result. For frequencies between 10 and 1000 (in arbitrary units), the spectrum falls on a straight line. The gray line has slope −1.58, which corresponds to the parameter, β=1.58 reported by Bak, Tang, and Wiesenfeld.

This result confirms that the sand pile model generates pink noise.

8.7  Reductionism and Holism

The original paper by Bak, Tang and Wiesenfeld is one of the most frequently-cited papers in the last few decades. Other systems have been shown to be self-organized critical, and the sand-pile model, in particular, has been studied in detail.

As it turns out, the sand pile model is not a very good model of a sand pile. Sand is dense and not very sticky, so momentum has a non-negligible effect on the behavior of avalanches. As a result, there are fewer very large and very small avalanches than the model predicts, and the distribution might not be heavy-tailed.

Bak has suggested that this observation misses the point. The sand pile model is not meant to be a realistic model of a sand pile; it is meant to be a simple model for a broad category of systems.

To understand this point, it is useful to think about two kinds of models, reductionist and holistic. A reductionist model describes a system by describing its parts and their interactions. When a reductionist model is used as an explanation, it depends on an analogy between the components of the model and the components of the system.

For example, to explain why the ideal gas law holds, we can model the molecules that make up a gas with point masses and model their interactions as elastic collisions. If you simulate or analyze this model, you find that it obeys the ideal gas law. This model is satisfactory to the degree that molecules in a gas behave like molecules in the model. The analogy is between the parts of the system and the parts of the model.


Figure 8.7: The logical structure of a holistic model.

Holistic models are more focused on similarities between systems and less interested in analogous parts. A holistic approach to modeling often consists of two steps, not necessarily in this order:

  • Identify a kind of behavior that appears in a variety of systems.
  • Find a simple model that demonstrates that behavior.

For example, in The Selfish Gene, Richard Dawkins suggests that genetic evolution is just one example of an evolutionary system. He identifies the essential elements of the category—discrete replicators, variability, and differential reproduction—and proposes that any system with these elements displays similar behavior, including complexity without design.

As another example of an evolutionary system, he proposes “memes”, which are thoughts or behaviors that are replicated by transmission from person to person2. As memes compete for the resource of human attention, they evolve in ways that are similar to genetic evolution.

Critics of the meme model have pointed out that memes are a poor analogy for genes. Memes differ from genes in many obvious ways. But Dawkins has argued that these differences are beside the point because memes are not supposed to be analogous to genes. Rather, memes and genes are examples of the same category: evolutionary systems. The differences between them emphasize the real point, which is that evolution is a general model that applies to many seemingly disparate systems. The logical structure of this argument is shown in Figure ??.

Bak has made a similar argument that self-organized criticality is a general model for a broad category of system:

Since these phenomena appear everywhere, they cannot depend on any specific detail whatsoever... If the physics of a large class of problems is the same, this gives [the theorist] the option of selecting the simplest possible [model] belonging to that class for detailed study.3

Many natural systems demonstrate behaviors characteristic of critical systems. Bak’s explanation for this prevalence is that these systems are examples of the broad category of self-organized criticality. There are two ways to support this argument. One is to build a realistic model of a particular system and show that the model exhibits SOC. The second is to show that SOC is a feature of many diverse models, and to identify the essential characteristics those models have in common.

The first approach, which I characterize as reductionist, can explain the behavior of a particular system. The second, holistic, approach, explains the prevalence of criticality in natural systems. They are different models with different purposes.

For reductionist models, realism is the primary virtue, and simplicity is secondary. For holistic models, it is the other way around.

8.8  SOC, causation and prediction

If a stock market index drops by a fraction of a percent in a day, there is no need for an explanation. But if it drops 10%, people want to know why. Pundits on television are willing to offer explanations, but the real answer may be that there is no explanation.

Day-to-day variability in the stock market shows evidence of criticality: the distribution of value changes is heavy-tailed and the time series exhibits pink noise. If the stock market is a critical system, we should expect occasional large changes as part of the ordinary behavior of the market.

The distribution of earthquake sizes is also heavy-tailed, and there are simple models of the dynamics of geological faults that might explain this behavior. If these models are right, they imply that large earthquakes are unexceptional; that is, they do not require explanation any more than small earthquakes do.

Similarly, Charles Perrow has suggested that failures in large engineered systems, like nuclear power plants, are like avalanches in the sand pile model. Most failures are small, isolated and harmless, but occasionally a coincidence of bad fortune yields a catastrophe. When big accidents occur, investigators go looking for the cause, but if Perrow’s “normal accident theory” is correct, there may be no special cause of large failures.

These conclusions are not comforting. Among other things, they imply that large earthquakes and some kinds of accidents are fundamentally unpredictable. It is impossible to look at the state of a critical system and say whether a large avalanche is “due”. If the system is in a critical state, then a large avalanche is always possible. It just depends on the next grain of sand.

In a sand pile model, what is the cause of a large avalanche? Philosophers sometimes distinguish the proximate cause, which is most immediately responsible, from the ultimate cause, which is, for whatever reason, considered the true cause.

In the sand pile model, the proximate cause of an avalanche is a grain of sand, but the grain that causes a large avalanche is identical to every other grain, so it offers no special explanation. The ultimate cause of a large avalanche is the structure and dynamics of the systems as a whole: large avalanches occur because they are a property of the system.

Many social phenomena, including wars, revolutions, epidemics, inventions and terrorist attacks, are characterized by heavy-tailed distributions. If the reason for these distributions is that social systems are critical, that suggests that major historical events may be fundamentally unpredictable and unexplainable.

8.9  Exercises

The code for this chapter is in chap08.ipynb, which is a Jupyter notebook in the repository for this book. For more information about working with this code, see Section ??.

Exercise 1  

Launch chap08.ipynb and run the code. There are a few short exercises embedded in the notebook that you might want to try.

Exercise 2  

To test whether the distributions of T and S are heavy-tailed, we plotted their histograms on a log-log scale, which is what Bak, Tang and Wiesenfeld show in their paper. But as we saw in Section ??, this visualization can obscure the shape of the distribution. Using the same data, make a plot that shows the cumulative distributions (CDFs) of S and T. What can you say about their shape? Do they follow a power law? Are they heavy tailed?

You might find it helpful to plot the CDFs on a log-x scale and on a log-log scale.

Exercise 3  

In Section ?? we showed that the initial configuration of the sand pile model produces fractal patterns. But after we drop a large number of random grains, the patterns look more random.

Starting with the example in Section ??, run the sand pile model for a while and then compute fractal dimensions for each of the 4 levels. Is the sand pile model fractal in steady state?

Exercise 4  

Another version of the sand pile model, called the “single source” model, starts from a different initial condition: instead of all cells at the same level, all cells are set to 0 except the center cell, which is set to a large value. Write a function that creates a SandPile object, sets up the single source initial condition, and runs until the pile reaches equilibrium. Does the result appear to be fractal?

You can read more about this version of the sand pile model at http://math.cmu.edu/~wes/sandgallery.html.

Exercise 5  

In a 1989 paper, Bak, Chen and Creutz suggest that the Game of Life is an SOC system.4

To replicate their tests, run the GoL CA until it stabilizes, then choose a random cell and flip it. Run the CA until it stabilizes again, keeping track of T, the number of time steps it takes, and S, the number of cells affected. Repeat for a large number of trials and plot the distributions of T and S. Also, record the number of cells that change state during each time step and see if the resulting time series resembles pink noise.

Exercise 6  

In The Fractal Geometry of Nature, Benoit Mandelbrot proposes what he calls a “heretical” explanation for the prevalence of heavy-tailed distributions in natural systems. It may not be, as Bak suggests, that many systems can generate this behavior in isolation. Instead there may be only a few, but there may be interactions between systems that cause the behavior to propagate.

To support this argument, Mandelbrot points out:

  • The distribution of observed data is often “the joint effect of a fixed underlying true distribution and a highly variable filter.”
  • Heavy-tailed distributions are robust to filtering; that is, “a wide variety of filters leave their asymptotic behavior unchanged.”

What do you think of this argument? Would you characterize it as reductionist or holist?

Exercise 7  

Read about the “Great Man” theory of history at http://en.wikipedia.org/wiki/Great_man_theory. What implication does self-organized criticality have for this theory?


1
The original paper uses a different definition of S, but most later work uses this definition.
2
This use of “meme” is original to Dawkins, and predates the distantly-related use of the word on the Internet by about 20 years.
3
Bak, How Nature Works, Springer-Verlag 1996, page 43.
4
“Self-organized criticality in the Game of Life”, available from http://www.nature.com/nature/journal/v342/n6251/abs/342780a0.html.

Are you using one of our books in a class?

We'd like to know about it. Please consider filling out this short survey.


Think DSP

Think Java

Think Bayes

Think Python 2e

Think Stats 2e

Think Complexity


Previous Up Next