The statistics don’t tell us anything we didn’t already know. The stealing of Iran’s June 12 election has been obvious from the start. But that’s the nature of statistics: their real value isn’t telling you what you don’t know, it’s eliminating false positives. Walter Mebane of the University of Michigan in Ann Arbor has done the work to show that this disgraceful event really is a fact. I saw his article (pdf) when it first came out in mid-June, but seeing it again in Science News prompted me to write about it. From SN:
“[Iranian election data] suggests that the actual outcome should have been pretty close,” says Mebane…. The official results showed Ahmadinejad getting almost twice as many votes as his closest rival.
Mebane cautions that the anomalous statistics could imaginably have an innocent explanation, that limited data is available, and that he is not himself an expert on Iranian politics. Nevertheless, he concludes that “because the evidence is so strikingly suspicious, the credibility of the election is in question until it can be demonstrated that there are benign explanations for these patterns.”
[A couple of paragraphs follow discussing the distribution of numbers in real data, known as Benford’s Law.]
When Mebane studied polling station-level data from Iran, he found that the numbers on the ballots for Ahmadinejad and two of the minor candidates didn’t conform to Benford’s Law well at all.
In any fair election, a certain percentage of votes are illegible or otherwise problematic and have to be discarded. When people commit fraud by adding extra votes, they often forget to add invalid ones. Suspiciously, Mebane found that in towns with few invalid votes, Ahmadinejad’s ballot numbers were further off from Benford’s Law — and furthermore, that Ahmadinejad got a greater percentage of the votes.
“The natural interpretation is that they had some ballot boxes and they added a whole bunch of votes for Ahmadinejad,” Mebane says.
Mebane also received data from the 2005 Iran election that aggregated the votes of entire towns…. If Ahmadinejad fared poorly in a particular town in 2005, you wouldn’t expect him to do especially well there in 2009 either. …
The best relationship the model found produced 81 outliers out of 320 towns in the analysis, a strikingly high percentage. Another 91 fit the model, but poorly. In the majority of these 172 towns, Ahmadinejad did better than the model would have predicted.
“This is not necessarily diagnostic of fraud,” Mebane says. “It could just be that the model is really terrible.” But since the first analysis gives evidence of fraud, the cities the model flags as problematic are the sensible ones to scrutinize.
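For anyone curious what a Benford-style check looks like in practice, here is a minimal sketch. It is not Mebane’s actual procedure, just the textbook leading-digit version of Benford’s Law, where a leading digit d is expected with frequency log10(1 + 1/d), compared against observed counts with a Pearson chi-square statistic. The function names and the vote counts are made up for illustration.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_chi_square(counts):
    """Pearson chi-square statistic: observed leading digits vs. Benford's Law."""
    values = [c for c in counts if c > 0]  # zero has no leading digit
    observed = Counter(leading_digit(c) for c in values)
    total = len(values)
    stat = 0.0
    for d in range(1, 10):
        expected = total * benford_expected(d)
        stat += (observed.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical polling-station vote counts, NOT real election data.
hypothetical_counts = [132, 187, 1045, 219, 96, 3410, 275, 118, 640, 1502]
print(round(benford_chi_square(hypothetical_counts), 2))
```

With nine digit categories the statistic has 8 degrees of freedom, so a value above roughly 15.5 would be unusual at the 5% level. A sample as tiny as the one above is far too small to conclude anything; a real analysis needs thousands of polling-station counts.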
For me, the new bit of data in all that is just how bad they were at faking it. That gives watchdog groups a big opportunity if they can somehow get at the raw data before it’s destroyed.
I only regret that we in the US, with our long string of elections-as-theater, don’t have the Iranian opposition’s fire, and that we do have much more polished cheaters.
Update, Jul 24, 2009. I see today that there was another excellent article on the BBC on this topic, providing yet more examples of voting anomalies.