[EM] IRV ballot pile count (proof of closed form)

Sat Feb 6 00:27:08 PST 2010

James Gilmour wrote:
> Abd ul-Rahman Lomax  > Sent: Friday, February 05, 2010 4:50 PM <CUT>
>> Practically speaking, I'd assume, the precincts would be provided 
>> with a spreadsheet showing the possible combinations, and they
>> would report the combinations using the spreadsheet, transmitting
>> it. So some cells would be blank or zero. With 5 candidates on the
>> ballot, the spreadsheet has gotten large, but it's still doable.
>> What happens if preferential voting encourages more candidates to
>> file, as it tends to do? 23 candidates in San Francisco? Even with
>> three-rank RCV, it gets hairy.
> 
> Respectfully, I would suggest this would NOT be a wise way to collect
> the data.  As I pointed out in my e-mail that correctly listed the
> maximum possible number of preference profiles for various numbers of
> candidates, the actual number of preference profiles in any election
> (or any one precinct) with a significant number of candidates, will
> be limited by the number of voters.  Further, because some (many)
> voters will choose the same profiles of preferences, the actual
> number of preference profiles will likely be even lower  -  as in the
> Dáil Éireann election I quoted.
> 
> Thus a spreadsheet containing all possible preference profiles would
> be unnecessarily large and the probability of making mistakes in data
> entry would likely be greater than if each precinct recorded only the
> numbers for each profile actually found in that precinct.

In short, I agree with you.

If I were to make a raw format for computerized transfer of ballot 
counts, it would be like this:

First bit: determines format type.

If the format type is 1, then
next 31 bits determine length of profile number (in bytes; call this x),
next 32 bits determine length of "how many voted" field (in bytes; call 
this y).

Then, for as many times as needed:

x bytes: profile number (according to a bijection between orderings and 
integers),
y bytes: how many voted this way.

If the profile numbers are in sorted order, adding a new ordering 
entails sorting the entire data, and so is slow. If the profile numbers 
are not, then adding a new ordering is quick, but determining which 
ordering to increment is slow (linear search).

If the format type is 0, then
next 31 bits determine length of "how many voted" field (call it y).

Then, for as many times as there are possible orderings,

y bytes: how many voted this way (i.e. first y bytes: how many voted 
according to the ordering represented by 0, second y bytes: how many 
voted according to the ordering represented by 1, etc).

You'd also need a list of the candidates, because ordering-integer 
bijections only report the ordering in terms of "first candidate > 
second candidate > third candidate".

For all practical purposes, except when there are only a few candidates, 
the first format (1) would be much more compact than the second - which 
is the point you're making. The data is probably quite compressible as well.

-

If the system is direct electronic, a unary count might make more sense. 
Have the entire thing either on a CD-R or on a PROM. In the PROM 
example, every time someone votes, burn a fuse for the ordering he voted 
and an antifuse for all the other orderings. For a CD-R, the concept is 
the same, but the execution might be more difficult, since you can't 
address it on a bit level.

But then I don't really like direct electronic machines, and I'd prefer 
them to be nothing more than expensive pencils - producing paper output 
and otherwise not being involved in the process.

>> There is a way to avoid such massive reporting, which is to report
>>  interactively, which is what is done in Australia. Only one set of
>>  totals is reported from a precinct at a time, the totals for the 
>> current round. (which can be just uncovered votes due to
>> eliminations that have been reported to the precinct from central
>> tabulation.)
>> 
>> However, the problem with this is that a single error in a precinct
>>  can require, then, all precincts to have to retabulate.
> 
> Yes, this "distributed counting" would work.  But there is an even
> simpler solution  -  take all the ballots to one counting centre and
> then sort and count only the ballots that are necessary to determine
> the winner (or winners in an STV-PR election).  That what has been
> done for public elections in Ireland and the UK for many decades and
> it works well without problems.  But I do appreciate that is far too
> simple and practical a solution and it suffers from NMH.

In societies not quite used to democracy, such a process would be a 
focus point for electoral fraud. The centralized counting means only one 
location has to be compromised. In such a case, I'd think the 
distributed/interactive count would be preferrable, or in the single 
winner case, using a summable method.