# Benford’s Law and the Iranian Election

You may be asking yourself: what does mathematics have to say about the Iranian election?  Longtime readers know that voting is a subtle process with lots of interesting mathematics behind it.  See here and here.  Right now there is a great deal of controversy about whether the Iranian election was manipulated.  Given the very limited amount of data available about the voting, it’s difficult for outsiders to make a definitive assessment of the validity of the election.  However, lots of people are applying various statistical and mathematical tools to study what data is available:  for example, here, here, and here.

So far, the conclusion seems to be that although there are some peculiarities, there is no definitive evidence of fraud.  For example, Walter Mebane, a University of Michigan political science and statistics professor, has this to say after doing a statistical analysis of the available data:

While it is not possible given only the current data to say for sure whether this reflects natural complexity in the political processes or artificial manipulations, the numerous outliers comport more with the idea that there was widespread fraud than with the idea that all the departures from the model are benign. …. [There is] moderately strong support for a diagnosis that the 2009 election was afflicted by significant fraud.

–Walter Mebane

A rudimentary tool for checking whether data is likely to have been tampered with is Benford’s Law.  It says that in many collections of data which occur in nature, the first digits aren’t evenly spread out; they have the following very simple structure:  If you look at the first digit of each number, you might guess that 1, 2, …, 9 each occur about 11% of the time (a leading digit is never 0).  But that doesn’t happen.  Instead, 1 occurs about 30% of the time, 2 about 18% of the time, and so on down to 9, which occurs less than 5% of the time.  This is because the data isn’t uniformly distributed.  Benford’s Law applies to data which is distributed logarithmically, that is, spread evenly across several orders of magnitude on a logarithmic scale (which happens surprisingly often in the real world).  For example, it was recently shown that the first digits of the prime numbers follow a version of Benford’s Law.
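For the curious, Benford’s Law is usually stated as a formula: the chance that the first digit is d equals log10(1 + 1/d).  Here is a minimal Python sketch that prints those predicted percentages.  The function name `benford_probability` is just an illustrative choice, not something taken from any of the analyses mentioned in this post.

```python
import math


def benford_probability(d: int) -> float:
    """Predicted share of numbers whose first digit is d, for d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("a leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Print the predicted percentage for each possible leading digit.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

Running it gives 30.1% for the digit 1 and 17.6% for the digit 2, falling steadily to 4.6% for the digit 9, and the nine percentages add up to 100%.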

First digits as predicted by Benford's Law

Benford’s Law gives you an easy way to check if someone has fiddled with your data (assuming the data should follow the Law).  If humans are asked to make up “random” data, then they have a very strong sense that there should be no patterns (if you ask for a number between 1 and 9000, they are much more likely to pick 7041 than 3333, even though both are equally likely mathematically).  So if your data doesn’t fit Benford’s Law, then that’s a warning sign that there may have been human intervention.
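One standard way to put a number on “doesn’t fit” is a chi-square goodness-of-fit statistic: count how often each first digit appears, compare with the counts Benford’s Law predicts, and add up the squared gaps.  The Python sketch below does this.  It is a generic illustration, not the method used in either of the analyses discussed in this post, and the two data sets in it (powers of 2, and the numbers 100 through 999 standing in for “made up” data) are toy examples, not election data.

```python
import math
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading decimal digit of a positive integer."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    return int(str(n)[0])


def benford_chi_square(values) -> float:
    """Chi-square statistic comparing first-digit counts with Benford's Law."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts[d] - expected) ** 2 / expected
    return stat


# Standard 5% critical value for a chi-square with 8 degrees of freedom
# (nine digit categories minus one).
CRITICAL_5_PERCENT = 15.507

# Toy data: powers of 2 are known to follow Benford's Law closely.
powers_of_two = [2 ** k for k in range(1, 501)]
# Toy "made up" data: every leading digit appears exactly 100 times.
made_up = list(range(100, 1000))

for name, data in [("powers of 2", powers_of_two), ("made up", made_up)]:
    stat = benford_chi_square(data)
    verdict = "fits" if stat < CRITICAL_5_PERCENT else "does not fit"
    print(f"{name}: chi-square = {stat:.1f} ({verdict} at the 5% level)")
```

The powers of 2 land comfortably below the cutoff, while the evenly spread digits land far above it.  A large statistic only says that the data departs from Benford’s Law; it cannot by itself say why.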

Very recently Boudewijn F. Roukema studied the available data from the Iranian election using Benford’s Law.  His paper is here.  The conclusion?  Much the same as Dr. Mebane’s opinion:  no definitive signs of fraud, but there is a statistically significant gap between what the data says and what Benford’s Law predicts.

P.S.  Benford’s Law was first observed by Simon Newcomb in 1881, but was rediscovered by (and named after) Frank Benford in 1938.  The other Benford’s Law (which is also very relevant to this discussion!) was formulated by OU graduate Gregory Benford in 1980:

Passion is inversely proportional to the amount of real information available.

— Gregory Benford