Why a Credible VVPAT-based Audit of EVMs is Imperative
Voter Verified Paper Audit Trail is a critical tool for detecting counting errors, deterring fraud, and improving the security and reliability of electronic voting machines. But by itself, it can’t prevent EVM malfunction or manipulation. If it is to have any real security value, it is essential to tally the EVM count with the manual count of slips for a statistically significant sample size of EVMs drawn at random from a suitably defined ‘population’ of voting machines.
CHENNAI: Electronic Voting Machines (EVMs) are “black boxes” in which it is impossible for voters to verify whether their votes have been “recorded as cast” and “counted as recorded”. There is always some risk of loss of votes and wrong totalling due to equipment malfunction. With EVMs, counting mistakes and fraud are undetectable, and the losers have no means to challenge the results.
So, there should be an additional verifiable physical record of every vote cast in an EVM. In 2013, the Supreme Court mandated the deployment of “Voter Verified Paper Audit Trail” (VVPAT) units along with EVMs. The printed paper slips provide a back-up in case of loss of votes due to equipment failure, and allow for a partial or total recount independent of the EVM count.
VVPAT is a critical tool for detecting counting errors, deterring fraud, and improving the security and reliability of EVMs. But VVPAT, by itself, cannot prevent EVM malfunction or manipulation. If it is to have real security value, it is essential to tally the EVM count with the manual count of VVPAT slips for a statistically significant sample size of EVMs drawn at random from a suitably defined ‘population’ of EVMs.
The objection “Is such tallying really necessary when the voters have already verified their VVPAT slips at the time of polling?” is misplaced. Voter verification of VVPAT slips only ensures that the votes have been “recorded as cast”; it does not ensure that they have been “counted as recorded”, because the EVM may still malfunction or be manipulated.
EVM Audit Plan:
The VVPAT-based audit of EVMs is very similar to “lot acceptance sampling”, a statistical quality control technique widely used in industry and trade for assuring the quality of incoming and outgoing goods. An acceptance number of defectives (‘c’) is specified. If the number of defectives found in a randomly drawn, statistical sample is less than or equal to ‘c’, the lot (or ‘population’) is accepted; otherwise, the lot is rejected.
We define a ‘defective EVM’ as one where there is a mismatch between the EVM count and the VVPAT’s manual count of voter slips due to EVM malfunction or manipulation.
Unlike industry and trade where a few defectives in the sample may be tolerated, in the context of elections, ‘c’ will have to be zero, that is, no ‘defective EVM’ can be tolerated. In other words, even if there is a single instance of mismatch between the EVM count and VVPAT manual count in the randomly drawn sample, the ‘population’ of EVMs from which the sample was drawn should be ‘rejected’.
In this case, ‘rejection’ means non-acceptance of the EVM count for that ‘population’ and doing hand counting of VVPAT slips for all the remaining EVMs of that ‘population’. In such a scenario, the election result should be declared only on the basis of the VVPAT count.
Thus, VVPAT-based audit of EVMs involves four essential elements:
- A clear definition of the ‘population’ (polling stations or EVMs) from which the statistical sample would be drawn. It could be all the EVMs deployed in an Assembly Constituency, a Parliamentary Constituency, a State as a whole, India as a whole, a Region within a State (a district or an integral number of districts), or any other. The population size (‘N’) could vary widely depending on how we define the ‘population’.
- Determination of a statistically correct and administratively viable sample size (‘n’) of EVMs whose VVPAT slips will be hand counted.
- Application of the ‘decision rule’ in the event of a mismatch. If one or more ‘defective EVMs’ turn up in the chosen sample of ‘n’ EVMs, the ‘decision rule’ is to do hand counting of VVPAT slips for all the remaining (N-n) EVMs forming part of that ‘population’.
- Doing the EVM count-VVPAT count comparison for the chosen samples at the beginning of the counting day alongside the counting of postal ballots. Where there is a mismatch, the manual counting of VVPAT slips for all the remaining EVMs of the corresponding ‘population’ should begin right away and the election results declared only on the basis of the VVPAT count. Where there is a perfect match, the election results should be declared based on EVM count.
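The four elements above can be sketched in a few lines of Python. This is only an illustration of the zero-defect decision rule, not the ECI’s procedure: the function name `audit_population`, the `hand_count` callback and the dictionary layout are all hypothetical choices made for the example.

```python
import random

def audit_population(evm_counts, hand_count, n, rng=None):
    """Apply the zero-defect decision rule to one 'population' of EVMs.

    evm_counts: dict mapping EVM id -> {candidate: votes} as reported by the EVM
    hand_count: function taking an EVM id and returning the hand-counted VVPAT
                tally in the same {candidate: votes} form (hypothetical callback)
    n:          number of EVMs to sample (must not exceed the population size)
    """
    rng = rng or random.Random()
    ids = sorted(evm_counts)
    sample = rng.sample(ids, n)          # simple random sample without replacement
    sampled = set(sample)

    # A 'defective EVM' is one whose EVM count differs from its VVPAT hand count.
    mismatched = [e for e in sample if hand_count(e) != evm_counts[e]]

    if not mismatched:
        # Acceptance number c = 0 is met: declare the result on the EVM count.
        return {"decision": "accept", "sample": sample,
                "mismatched": [], "to_hand_count": []}

    # One mismatch is enough: hand count the remaining N - n EVMs and
    # declare the result on the VVPAT count.
    remaining = [e for e in ids if e not in sampled]
    return {"decision": "reject", "sample": sample,
            "mismatched": mismatched, "to_hand_count": remaining}
```

A Returning Officer’s software would call this once per ‘population’ at the start of the counting day. The `to_hand_count` list is the extra workload triggered by a mismatch.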
An Exercise in Tokenism:
However, the EVM Audit Plan put in place by the Election Commission of India (ECI) is found wanting on many counts:
- The ECI has prescribed a uniform sample size of “5 EVMs per Assembly Constituency” for all Assembly Constituencies across the country. As we shall demonstrate presently, this is a statistical howler with very high margins of error.
- It has not explained how it arrived at its sample size.
- It has not specified the ‘population’ to which its sample size relates.
- It is silent about the ‘next steps’ in the event of a mismatch between the EVM count and the VVPAT count in the chosen sample.
- It has glossed over the reported cases of mismatch in the past. For instance, at least eight reported cases of mismatch during the 2019 Lok Sabha elections were acknowledged by the ECI itself. Two reported cases of mismatch during the 2018 Telangana Assembly elections went to the High Court.
- It has ruled that in the event of a mismatch, the VVPAT figure may be adopted as the correct count for the particular EVM. It has also held that the “small discrepancy” between the EVM count and VVPAT count for a sample wouldn’t affect the final result. Both these rulings make a mockery of statistical quality control protocol. A mismatch in a sample, however small, is a sign of a deeper problem, namely, that the ‘population’ from which the sample was drawn is ‘defective’, and calls for the application of the ‘decision rule’ mentioned earlier.
- The ECI has scheduled the EVM count-VVPAT count tallying exercise at the fag end of the counting day. It usually takes place in the night, after the results based on EVM counts have already been released to the media, and when the winning candidate is breathing down the Returning Officer’s neck for an early formal declaration. The bone-tired election personnel and counting agents, most of whom do not fully understand the statistical significance of the exercise, view it as a chore to be rushed through and done with, much like a vote of thanks at the end of a long function. There is every psychological incentive for the election personnel to make the two counts match, more so when even the ECI mistakenly thinks that a “small discrepancy” in a sample would not affect the final result!
The distinct impression one gets is that the ECI is lackadaisical about VVPAT-based audit of EVMs. This defeats the very purpose of introducing VVPAT.
Why the ECI’s sample size is wrong:
Sample size depends upon (i) the choice of the probability distribution, (ii) the assumed percentage of ‘defective EVMs’ (‘P’) in the population, and (iii) the probability (confidence level) with which we want the sample to detect at least one ‘defective EVM’.
We apply the Hypergeometric Distribution model to VVPAT-based audit of EVMs because it is an exact fit: EVMs are drawn without replacement from a finite population. We assume the percentage of ‘defective EVMs’ (‘P’) in the population to be very low, say, 1%; for higher values of ‘P’, the sample size required is smaller. We aim at a 99% probability of the sample detecting at least one ‘defective EVM’.
As seen from Table 1, when the population size (N) of EVMs is small (100), the sample size (n) required is nearly as big as the population size (99). As N increases, n also increases but at a much slower rate. It is only 458 for a population size of one lakh, 459 for a population size of ten lakhs, and remains at 459 for a population size of one crore. That is, it ‘hits a plateau’ beyond some point.
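The sample sizes quoted above can be recomputed with a short script. What follows is a minimal sketch, not the author’s own worksheet. It assumes that the number of defective EVMs is N × P rounded up to a whole machine (at least one), and it uses exact fractions so that boundary cases such as N = 100 are not distorted by floating-point rounding. The function name `min_sample_size` is hypothetical.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(N, defect_rate=Fraction(1, 100), confidence=Fraction(99, 100)):
    """Smallest n such that a random sample of n EVMs, drawn without
    replacement from N, contains at least one defective EVM with
    probability >= confidence (hypergeometric model)."""
    D = max(1, ceil(N * defect_rate))    # assumed number of defective EVMs
    miss_limit = 1 - confidence          # tolerated chance of missing them all
    p_miss = Fraction(1)                 # P(no defective among the draws so far)
    for n in range(1, N + 1):
        # The n-th draw must also be a good EVM: (N - D - (n-1)) good ones
        # are left among the (N - (n-1)) EVMs not yet drawn.
        p_miss *= Fraction(N - D - (n - 1), N - (n - 1))
        if p_miss <= miss_limit:
            return n
    return N

for N in (100, 100_000, 1_000_000, 10_000_000):
    print(f"N = {N:>10,}  ->  n = {min_sample_size(N)}")
```

Under these assumptions the script reproduces the pattern described in the text: the required sample is almost the whole population when N is 100, and it flattens out in the mid-400s once N runs into lakhs.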
Table 1 also tells us how statistical sampling is superior to arbitrary, non-statistical sampling such as, say, a “10% sample”. With statistical sampling, the sample size required is 99 for a population size of one hundred, and just 459 for a population size of one crore. But with a “10% sample”, the sample size required is 10 for a population size of one hundred, and 10 lakhs for a population size of one crore. Thus, a “10% sample” is too small and statistically incorrect for small population sizes and too big and administratively unviable for very big population sizes.
The ECI’s critics are guilty of demanding arbitrary, non-statistical sample sizes like “25% samples” and “50% samples” for VVPAT-based audit of EVMs under the mistaken impression that a “bigger percentage” guarantees greater accuracy of results. Some are now demanding a 100% manual count of all VVPAT voter slips, which is overkill.
As seen from Table 2, if we define EVMs deployed in an Assembly Constituency as the ‘population’, then in view of the small population sizes, the sample sizes required are rather big and administratively unviable. Statistical sampling doesn’t allow a uniform sample size for non-uniform, small population sizes. So, the ECI-prescribed uniform sample size of “5 EVMs per Assembly Constituency” for all Assembly Constituencies fails to detect a ‘defective EVM’ about 95% of the time when ‘P’ is 1% and the Assembly Constituency is taken as the ‘population’.
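The failure rate of a fixed sample of 5 EVMs can be checked directly from the hypergeometric model. The sketch below is illustrative only: the constituency sizes of 200, 300 and 400 EVMs are assumed round numbers, not figures from Table 2, and `detection_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def detection_probability(N, n, defectives):
    """P(at least one defective EVM appears in a random sample of n EVMs
    drawn without replacement from a population of N)."""
    # comb(N - defectives, n) counts the samples made up of good EVMs only.
    p_miss = Fraction(comb(N - defectives, n), comb(N, n))
    return 1 - p_miss

# Assumed constituency sizes, with 1% of the EVMs taken to be defective.
for N in (200, 300, 400):
    p = detection_probability(N, n=5, defectives=N // 100)
    print(f"N = {N}: a sample of 5 detects a defective EVM "
          f"with probability {float(p):.1%}")
```

For these assumed sizes the detection probability comes out near 5%, so a defective EVM slips through roughly 95 times in 100. The same function shows why a “10% sample” is too small for a small population: with N = 100 and one defective EVM, a sample of 10 detects it only 10% of the time.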
As seen from Table 3, if we define EVMs deployed in a Parliamentary Constituency as the ‘population’, then in view of the small population sizes, the sample sizes required are again rather big and administratively unviable.
If we define the EVMs deployed in a State as a whole or India as a whole as the ‘population’, then in view of the bigger population sizes (N), the sample sizes (n) required are small and viable. But, in the event of a mismatch, the workload involved in hand counting the VVPAT slips for all the remaining (N-n) EVMs of the ‘population’ is very large and administratively unviable for India as a whole and for a State as a whole (except the very small States).
The ECI claims that the Indian Statistical Institute, Kolkata had recommended a sample size of 479 EVMs for India as a whole which, on average, works out to just 1 EVM per Assembly Constituency (after rounding off), and so its present sample size of “5 EVMs per Assembly Constituency” is more than adequate. But it glosses over the “next steps” in the event of a mismatch.
The Way Forward:
The ECI should define the ‘population’ in such a way that the sampling fraction (n/N) is small but N is not so big that, in the event of a mismatch, the workload involved in counting the VVPAT slips of all the remaining (N-n) EVMs of the ‘population’ is administratively unviable.
For sampling purposes, I suggest the division of the bigger States into ‘Regions’ with population sizes of approximately 5,000 EVMs each. A Region should comprise a district or an integral number of districts. If we treat “EVMs deployed in a Region” as the ‘population’, the sample size required is 438. On average, there would be about 20 Assembly Constituencies in a Region. So, the average number of EVMs per Assembly Constituency whose VVPAT slips are to be hand counted is about 22, which is manageable.
For example, Tamil Nadu with 68,321 EVMs can be divided into 13 Regions with roughly 5,000-odd EVMs each. In the event of a mismatch in a sample, the ECI will have to order the hand counting of VVPAT slips for all the remaining EVMs of the particular Region only, and not the EVMs of the entire State. This option is statistically robust and administratively viable.
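The arithmetic behind the Region proposal can be laid out as a small script. It is a sketch under the same assumptions as before (1% defective EVMs rounded up to a whole machine, 99% detection probability), with Regions taken to be of equal size purely for illustration; real Regions would follow district boundaries. The helper name `sample_size` is hypothetical.

```python
from math import comb

def sample_size(N, defect_pct=1, miss_pct=1):
    """Smallest n for which the chance of drawing no defective EVM
    is at most miss_pct percent (hypergeometric model, integer arithmetic)."""
    D = max(1, -(-N * defect_pct // 100))    # ceiling of N * defect_pct / 100
    return next(n for n in range(1, N + 1)
                if 100 * comb(N - D, n) <= miss_pct * comb(N, n))

STATE_EVMS = 68_321       # Tamil Nadu figure quoted in the text
REGIONS = 13              # number of Regions proposed in the text
ACS_PER_REGION = 20       # the text's average of Assembly Constituencies per Region

region_N = -(-STATE_EVMS // REGIONS)         # equal-sized Regions, rounded up
n = sample_size(region_N)

print(f"EVMs per Region (N):            {region_N}")
print(f"Sample per Region (n):          {n}")
print(f"Sample per Assembly Constituency: about {round(n / ACS_PER_REGION)}")
print(f"Hand count on a mismatch (N-n): {region_N - n} EVMs in that Region")
print(f"Hand count if State-wide:       {STATE_EVMS - sample_size(STATE_EVMS)} EVMs")
```

The last two lines show the trade-off: a mismatch in one Region triggers a hand count of a few thousand EVMs, whereas a State-wide ‘population’ would put more than sixty thousand on the table.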
More than a century ago, H.G. Wells wrote: “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” The EVM-VVPAT saga in Indian elections proves his point!
(The author is a former IAS officer and former Vice-Chancellor of the Indian Maritime University, Chennai)