Averages¶

I remember, quite vividly, a maths class in primary school where the teacher explained the 'three' types of average to the class: mean, median, and mode. It was, she said, vital that we remembered this whenever watching the news. You know, when a presenter says the average of something is $x$, we must always think to ourselves

Aha! But which average are you talking about? The mean, the median or the mode?

in a slightly arrogant tone. For much of my life I laughed at this: they always use the mean. They always add up the numbers and divide by how many there are. Simple. However, there is in fact a subtlety to this that became clear to me while taking a module on Ergodic Theory during my master's year. The aim of this post is to present the concept in a simple, applied way that points out a potential pitfall whenever an 'average' is talked about. The topic is universal enough that it raises many questions about how we evaluate decisions and outcomes. I hope to outline both that and its relevance to the current crisis in this short post. The prerequisites for what I am about to describe are almost nonexistent.

A country¶

Suppose that we start with a country that has a population of $10,000,000$. We will look at a very rudimentary model of wealth growth. Every time period (say a month), everyone either loses all of their wealth or triples their wealth, with probability $0.5$ each way. Put simply, every individual flips their own coin. Heads, they have to give away their entire wealth; tails, they get a cheque worth $200\%$ of their current wealth to add to their fortune. These scenarios correspond to multiplying the individual's total wealth by $0$ if heads, and by $3$ if tails. For the first month, every citizen starts with $£100$. Will this system lead to an increase of wealth for the country, or a decrease? Will the citizens get richer or poorer? This is what we will investigate.

So to start, it is useful to look at one individual for one time period, so we are sure to understand the process. Individual $1$ starts on $£100$. They flip their coin, and it lands on tails. Win. They receive a cheque for $2 \times 100 = 200$, meaning they now have $£300$. Great. If the coin had landed on heads, they would have had to give away $1 \times 100 = 100$, meaning they would now have $£0$. Remember, the probability of these two events is $0.5$. We are just flipping a coin. The 'average' that an individual gets out of each month is easy, then.

$$ (0.5 \times 0) + (0.5 \times 3) = 1.5 $$

All that we have done here is multiply the probability of heads by the outcome in that scenario, and add the probability of tails multiplied by the outcome in that scenario. This is the average multiplier, and it means $50\%$ growth per month. Okay, that's good, right? The expected gain from this game is positive. In game theory, this exact calculation is made to assess the value of the game to the individual. Let's push this a little further and simulate it. We start with our country of $10,000,000$ people, all of whom start with $£100$. Each month we go down the list of how much each individual has and multiply by $0$ or by $3$, each with probability $0.5$. The code below produces the graph seen. We have plotted the average wealth of the country at each month throughout a year; that is, we summed the list of wealths and divided by $10,000,000$ to get the average at each month. On the left we see the results of the simulation, and on the right is what we would see if the growth rate were exactly $1.5$.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

N = 10000000   # population
T = 12         # number of months
win = 3        # multiplier on tails
lose = 0       # multiplier on heads
x = 100*np.ones(N)   # everyone starts with £100
avg = [x.mean()]
for i in range(T):
    # every citizen flips their own fair coin
    x = x*np.random.choice([lose, win], size = N)
    avg.append(x.mean())
fig, ax = plt.subplots(1, 2, sharex = True, sharey = True, figsize= (15,6))
ax[0].plot(avg)                        # simulated average wealth
ax[1].plot(100*1.5**np.arange(T+1))    # growth at exactly 1.5 per month
ax[0].set_ylabel("Average wealth")
ax[0].set_xlabel("Month")


They match up very closely. So we do see growth in the average wealth of the country. In fact, it is emphatic. After a year the average wealth (GDP per capita) is over $£12,000$. Exponential growth of the average wealth is clear. This must be benefiting the country massively. Let's look at the trajectories of some individuals' wealth during this.

y = 100        # one individual's starting wealth
path = [y]     # their wealth at each month
for i in range(T):
    y = y*np.random.choice([lose, win])   # this individual's coin flip
    path.append(y)
plt.plot(path)
plt.xlabel("Month")
plt.ylabel("Wealth")


Oh no. This individual ended the year with nothing. Their coin landed on heads in the $2^{nd}$ month. We'll try another...

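z = 100        # a second individual's starting wealth
path = [z]     # their wealth at each month
for i in range(T):
    z = z*np.random.choice([lose, win])   # this individual's coin flip
    path.append(z)
plt.plot(path)
plt.xlabel("Month")
plt.ylabel("Wealth")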


This individual won the first $2$ months (going up to $100 \times 3 \times 3 = 900$), but lost in the $3^{rd}$, meaning they still ended the year with nothing. Right, maybe we were just unlucky. We can calculate the probability of an individual not ending the year with $0$. They would have to win every month. They must get tails in the first month, tails in the second, tails in the third, etc. Since it is just the flipping of a coin, independence means

$$ \mathbb{P}\left[\text{not ending with }0 \right] = 0.5 \times 0.5 \times \cdots \times 0.5 = 0.5^{12}. $$

That is a pretty low number. In fact it is approximately $0.000244$. If you were a citizen of this country, then that is the probability of you ending the year with anything other than $£0$. Given the $10,000,000$ population, about $2441$ people (that small number $\times$ the population) will end the year with more than $£0$. We can calculate this in our simulation:

print(np.count_nonzero(x > 0))   # citizens who still have any wealth

2421

which matches pretty well with the theory just calculated. Yet, as we saw, the GDP per capita, or 'average' wealth grew exponentially. So pretty much no one gains anything from this process, but the country gains a lot. So what is going on here? There are a couple of things to note.

Expectation¶

The first average we looked at was expectation. This involves looking at the possibilities over one time period. In our system there are just two. We essentially invoke a multiple universes way of looking at the game. Either we multiply by $0$, or we multiply by $3$. These two events have the same probability, so we weight them equally and take the average. This is a very common way of looking at things in game theory.
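To see how this expectation squares with almost everyone being wiped out, here is a minimal sketch of the arithmetic, reusing the numbers from the country example (the variable names are my own, chosen for illustration). The few survivors each hold $100 \times 3^{12}$, and weighting that by the tiny survival probability recovers the $100 \times 1.5^{12}$ that the average follows.

```python
# Sketch: split the expected wealth after a year into
# (probability of surviving) x (wealth if you survive).
N = 10_000_000      # population
T = 12              # months
start = 100         # starting wealth in pounds

p_survive = 0.5 ** T                 # twelve tails in a row
wealth_if_survive = start * 3 ** T   # tripled twelve times
expected_survivors = N * p_survive

# Everyone else holds 0, so the expected wealth per person is:
expected_wealth = p_survive * wealth_if_survive

print(f"P(survive)         = {p_survive:.6f}")
print(f"Expected survivors = {expected_survivors:.0f}")
print(f"Wealth if survive  = {wealth_if_survive:,}")
print(f"Expected wealth    = {expected_wealth:,.2f}")
print(f"100 * 1.5**12      = {start * 1.5 ** T:,.2f}")
```

The last two lines agree, which is the point: the whole of the 'average' is carried by a couple of thousand very rich people.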

Time average¶

Really, what an individual cares about in this case is whether they will win or lose over time. They exist in one universe (probably), so the step of looking across the possibilities actually goes against the interests of the individual. We saw that each individual has an exceedingly low probability of not ending the year with $£0$. This is what the individual would care about when making an assessment of whether the process is good or bad for them. In fact, in the limit of time $\rightarrow \infty$, every individual's time average of wealth will be $0$. To see this, notice that eventually everyone will get unlucky and go to $£0$. After this point, they cannot escape this absorbing state, and so we are taking the average of more and more $0$'s.
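A minimal sketch of that running time average for a single individual, assuming we fix the coin flips by hand rather than drawing them at random (the helper name `running_time_average` is hypothetical, and the horizon of 100 months is my own choice). The flips mirror the second individual above: two wins, then a loss.

```python
import numpy as np

def running_time_average(start, multipliers):
    """Return the wealth path and its running time average.

    `multipliers` is the sequence of monthly multipliers (0 or 3 here).
    """
    wealth = start * np.cumprod(np.concatenate(([1.0], multipliers)))
    months = np.arange(1, len(wealth) + 1)
    return wealth, np.cumsum(wealth) / months

# Hand-picked flips: two wins, a loss, and then it no longer matters
# what the coin does for the remaining 97 months.
flips = np.array([3, 3, 0] + [3] * 97)
wealth, time_avg = running_time_average(100, flips)

for month in (2, 12, 100):
    print(f"month {month:>3}: wealth = {wealth[month]:.0f}, "
          f"time average so far = {time_avg[month]:.2f}")
```

Once the $0$ appears, the sum of wealth stops growing while the number of months keeps increasing, so the time average decays like $1/t$.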

Ergodic theory¶

Here's where an area of reasonably pure maths comes in. A system or process or game is Ergodic if the time averages converge to the space averages (think of this as the first average we looked at). Consider simply flipping a coin, with a score of $1$ for heads and $0$ for tails. The expectation here for one flip is, of course, $0.5$. If you play for a very long time, you will get roughly $50\%$ heads and $50\%$ tails, so your time average will again be close to $0.5$. The longer you play, the closer you will get. The time average matches the expectation.
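The coin-flip example can be sketched in a few lines, assuming NumPy's `default_rng` generator with an arbitrarily chosen seed so that the run is repeatable. It tracks the running time average of the scores as the number of flips grows.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily, for repeatability

n_flips = 100_000
scores = rng.integers(0, 2, size=n_flips)   # 1 for heads, 0 for tails

# Running time average after each flip.
running_mean = np.cumsum(scores) / np.arange(1, n_flips + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} flips: time average = {running_mean[n - 1]:.4f}")
```

The early values wander, but the later ones settle near the expectation of $0.5$.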

The system for the wealth growth of a country outlined here is not Ergodic. This means we cannot tell anything about the time averages from the expectation. You could try the calculations seen here with a loss multiplier of $0.6$ (as opposed to $0$) and a win multiplier of $1.5$ (instead of $3$). The same behaviour that we saw here is present in that process, but it is slightly subtler.

This is not to say that the time average is always what we want. Non-ergodicity also means we cannot necessarily tell anything about the ensemble average from time averages. Think about a psychological or health study. Usually a select few individuals get tested, and they are tested over time. This does not necessarily tell us anything about the average across the population. In a non-Ergodic system, the experience throughout time of an individual tells you nothing of the experience of the collection. Similarly, the experience of the collection tells you nothing about the experience throughout time of an individual.
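For the $0.6$ and $1.5$ variant mentioned above, here is a rough sketch of the two averages. Using the geometric mean of the multipliers as the per-month growth an individual experiences over a long run is my own choice of summary rather than something computed in this post, and the seed and horizon are arbitrary.

```python
import numpy as np

lose, win = 0.6, 1.5
p = 0.5   # probability of each outcome

# Ensemble (expectation) average of the monthly multiplier.
expected_multiplier = p * lose + p * win

# Long-run per-month growth factor along one path: with roughly half
# wins and half losses, wealth scales like (lose * win) ** (t / 2).
time_avg_multiplier = lose ** p * win ** (1 - p)

print(f"expected multiplier     = {expected_multiplier:.4f}")
print(f"time-average multiplier = {time_avg_multiplier:.4f}")

# Quick check by simulating one long path (seed chosen arbitrarily).
rng = np.random.default_rng(1)
T = 10_000
path = rng.choice([lose, win], size=T)
simulated = float(np.exp(np.mean(np.log(path))))
print(f"simulated per-month factor over {T:,} months = {simulated:.4f}")
```

The expectation sits above $1$ while the per-path factor sits below it, so the country's average grows even though a typical individual's wealth shrinks.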

In particular, systems that exhibit any sort of catastrophe (e.g. losing all your wealth), multiplicative growth (also seen here) or irreversible events are generally not Ergodic. This idea of non-ergodicity is the reason for many things we might think of as quirks. Insurance companies make use of it: a policy can be in the interests of both the company (which averages over everyone buying it, in an expectation sense) and the individual (who cares about the time average relevant to them). The mismatch is the profit that the company can make. It is also why we should take a precautionary attitude to catastrophic risks that we are repeatedly exposed to, even when each exposure has only a small chance of going wrong. Think of wearing a seatbelt. Another example is any sort of loss aversion, a term used in Economics to describe the fact that humans tend to avoid risks of loss even when the expected value of the game to them is positive. This is because we are built to avoid systemic risks in the interests of survival. Irreversible damage is best avoided, even if the potential upsides outweigh it on average. Ergodicity provides an angle on this that stands in opposition to expected utility theory, and it can be read about in the references at the end.

My primary school teacher had a good point. We should be skeptical when someone invokes averages. Maybe not for the reason she outlined, though.

Final words¶

It must be said that I have done some sleight of hand here. The formal arguments of Ergodic theory are very technical. In its fully abstracted form it is a postgraduate maths topic, and ironing out the technicalities needs at least undergraduate statistics. However, I think it is possibly the most relevant concept from higher level maths that can be used to understand the mechanisms in our society. Hopefully this short introduction has made sense. I have added references for those who are intrepid.

A note on the current coronavirus pandemic and the relevance of this topic: it should have been taken far more seriously, far earlier than it was, because the risks are non-ergodic. We could not tell very much about the virus early on, and model predictions always carry error. When the risk is to the whole of society at once, things need to be thought about very differently. With every pandemic comes a risk of ruin or catastrophe, and so every one should be treated with extreme caution. The fact that, on average, the yearly deaths from pandemics do not exceed those from, say, car accidents means absolutely nothing, because a pandemic could cause the death of all $8$ billion or so of us. Car accidents can't.

References¶

https://www.nature.com/articles/s41567-019-0732-0.pdf

http://www.weizmann.ac.il/math/sarigo/sites/math.sarigo/files/uploads/ergodicnotes.pdf