[Election-Methods] New improved fla for vote counts to be reported for auditing IRV elections

Sat Aug 2 10:44:53 PDT 2008

Kathy Dopp wrote:
>> Well, any election method can be "parallelized" (in quote marks) with a
>> superpolynomial amount of information when there are as many choices as
>> candidates.
> 
> I am not certain what you mean. Precisely, any Ranked Choice ballot
> has a number of possible permutations of all the candidates given by
> the fla above; or a number of unique candidate rankings given by the
> fla above.
> 
>> Since n! < n^n, it also reduces to a polynomial amount of
>> information for a fixed number of choices or rankings (n for one, n*(n-1) =
>> O(n^2) for two, and so on).
>>
>> Thus, if the activists claim that IRV escapes the problem if you do it in
>> that way,
> 
> Not sure what you mean by "that way".

My reference here was, perhaps a bit obliquely, to the summability 
criterion. The summability criterion says that if a method gives an 
internal count that represents data, and that this internal count can be 
aggregated with other such internal counts so that the result for the 
method run on the aggregated count is the same as if the method was run 
on all the ballots that made up the count, then the size of that 
internal count is a polynomial with respect to the number of candidates 
running.

In other words, if a method is summable (absent edge cases like the 
internal count being n^1000), then it can be "counted in districts" - 
like Plurality, Borda, Condorcet, and the others. IRV isn't summable, 
and that is a problem.

When you referred to activists, I assumed you meant IRV activists, and 
that they would use the equation you showed to say that San Francisco's 
IRV is summable. It technically is, because the count of various 
preferences is the internal count, and with only 2 (or 3.. any constant 
less than the number of candidates), the "internal count" (of how many 
had this preference and that preference) is polynomial with respect to 
the number of candidates.

>> you can say that that holds for absolutely every kind of ranked
>> ballot method that exists - at least for the neutral ones, and election
>> systems really should be neutral.
> 
> However, as you know, it is *not* true that the counting method is
> complex or non-additive for each precinct with *any* counting method
> for Ranked Choice ballots. The IRV method is, for instance, far more
> complex than the approval or Borda methods where it is easy to audit
> the accuracy of the machine counts *without* having to publicly report
> all the tallies for all permutations of candidate orderings in order
> to do valid partial post-election audits.

My point here is that if IRV proponents claim that IRV with truncated 
preferences is summable, then you can say that that's true of all ranked 
vote systems because they all take the same input (namely, ranked 
ballots). Therefore, the criterion isn't worth much and, as you point 
out, you should use other things to judge whether IRV is good or not.

>> Rated ballots with a granularity permitting k possible ratings for a single
>> candidate would have complexity k for a single choice, k^2 for two, ..., k^n
>> for a full preference ballot.
> 
> Yes. The ballots might be as complex but the counting method is not as
> complex and doing partial post-election auditing does *not* require
> keeping track of all the permutations of possible unique ballot
> choices that voters can make.

It's certainly possible to make non-summable methods for rated ballots 
that don't force you to truncate - just imagine something like a rated 
IRV, where instead of the first preferences, the highest ratings of each 
voter is counted towards each candidate and then the one with the lowest 
count is eliminated.

Still, I understand what you're saying: Range certainly isn't very 
complex, and Approval is even simpler ("count all the votes").

On an aside, if I were to pick a Range-like, I'd prefer a DSV version of 
Range to handle the compression incentive, but then it gets more complex 
(and perhaps even loses summability).

>> At some point, rated or ranked, it becomes easier to simply send every
>> voter's preference. An interesting consequence of that would be that it'd be
>> possible to "fingerprint" one's own vote to vote-buyers if there are a
> 
> Yes. That certainly might be possible because in most precincts, if
> there were enough candidates, the number of voters would be far less
> than the number of possible unique ballot ranking choices.
> 
> That is a disadvantage of any ranked choice voting ballot method - the
> fact that post-election auditing on the individual ballot level is
> probably not a good idea.  But auditing at the ballot level can be
> problematic anyway due to ballot privacy concerns, even with a single
> choice plurality ballot.

One possible solution would be to do this: Have a device make a rank 
tree. Each node in the tree contains a candidate name and a number 
specifying how many voted the preferences you can get by traversing from 
this level up to the root. For instance, if the ballots were:

1: A > B > C
100: A > C > B
10: B > A > C

Then the tree is
  A : 101
   A > B: 1
    A > B > C : 1
   A > C: 100
    A > C > B: 100
  B : 10
   B > A : 10
    B > A > C : 10

To "sanitize" the tree, cut off any nodes that have a value lower than 
some preset value k, as well as any complement nodes where it would 
otherwise be possible to determine what you sanitized. Then any 
vote-buyer will have to coordinate the purchase of at least k ballots in 
order for it to show up on the audit. The only technical problem is that 
some participant will have to do the sanitizing, and you have to trust 
that participant.

In the above example, for k = 5, the sanitized tree would be:

  A > ... : 101
   rest removed, A > B because it's 1 which is less than 5 and
   A > C because giving the information of 100 votes would let
   one deduce that there's one A > B > C vote left
  B: 10
   B > A : 10
    B > A > C : 10

>> sufficient number of choices, since one could use the extra information to
>> make one's own ballot unique, and thus detectable - at least if the list of
>> every voter's preference were to be made public afterwards.
>>
>> The activists may say that the ballot data can be transferred easily if one
>> uses computerized means: a practical system would pick the lesser number of
>> n! and (number of voters), so a strictly ranked Presidential election with
>> full turnout (300 million voters) and 10 candidates, would need "only"
>> ceil(lg(10!)) * 3e8 bits = 6.6e9 bits, which is slightly less than 787
>> (binary) MB, enough to fit on a DVD.
> 
> How many ranked choices are you allowing each voter to make?  3?

Strict rank over all the candidates. There are 10 candidates, so the 
number of possible strict preferences is 10!. In order to tell 
preferences apart, we can map them to numbers between 0 and 10!. To 
fully specify these numbers, we need lg(10!) bits, and because (short of 
arithmetic compression) there's no such thing as a fractional bit, we 
get ceil(lg(10!)). The 3e8 is just 300 million, because with the full 
turnout specified, we have to store 3e8 such "preference numbers".

Actually, that was a bad example, because 10! is less than 300 million - 
it's about 3.63 million. Therefore, you'd just have to store 3.63 
million numbers, the first saying "how many voted using preference #0", 
the next "how many voted using preference #1", and so on. In the worst 
case, all 300 million may have voted for one preference, so each of 
these numbers have to be ceil(lg(3e8)) = 29 bits in length. 29 * 10! is 
about 10^8, thus you'd get about 12 (binary) MB.

>> Still, that doesn't make the system any more summable, and like you say,
>> when the "competitors" only have to ship 10^2 values (actually 10^2 - 10 =
>> 90; Condorcet matrix with diagonal removed), 787 MB doesn't sound so
>> impressive after all.
> 
> Well, I am worried less about the amount of disk space, although that
> is very interesting that you can calculate that. I am more worried
> about the confusion to the public who would like to achieve the goal
> of publicly verifiable election outcome accuracy via post-election
> manual audits to check the machine counted results.
> 
> In other words, election outcomes should be publicly verifiably
> accurate in a way that is understandable/transparent to the public.
> However, if the IRV folks can get election officials to instead do
> publicly observable 100% manual counts of all IRV election contests,
> that would work too.
> 
> However, I fail to see the need to use such a complex counting method
> (IRV) that fails to even fully solve the spoiler problem it claims to
> solve and which gives undesirable election outcomes some times, when
> there are far simpler to count and audit election methods which seem
> to fully solve the spoiler and other problems and give more
> consistently desirable election outcomes.

I think it's possible to make a cryptographic "zero knowledge proof" 
system that can prove that nobody messed with the ballots without 
actually saying what the ballots are. But I agree with you: it won't be 
of any good if people don't understand how it works, and I'm pretty sure 
they wouldn't. For something like a public election, "trust us because 
we are programmers" wouldn't cut it, and for similar reasons I think 
electronic voting machines should also be little more than expensive pens.
Perhaps rival party counts would work, where various groups do separate 
counts and check each other. But as with all consensus mechanics, a 
rogue group could bring the others to a halt.

For summable election methods like Borda, Condorcet, and so on, do you 
think the public would be satisfied with independent recounts of the 
precinct totals / Condorcet matrices - or would one have to somehow 
solve the problem of ranked vote fingerprinting and then count/audit the 
ranked ballots themselves, no matter the ranked ballot election method?