[EM] Summing runoff/IRV: can this hold?

Sun Sep 22 06:09:36 PDT 2019

[Not subscribed, I think?  CC me on replies]

I think I can sum a countable total for up to five candidates, but it
breaks after that.

Consider these properties:

   1. There are N ballots.
   2. For each alternative {A}, there are A[first] ballots where A occurs
   first.
   3. For each alternative {A}, there are A[last] ballots where A occurs
   last.
   4. For each pair of alternatives {A,B}, there are A[B] ballots where A
   directly and immediately precedes B.
   5. No candidates are ranked equally on any ballot, except that all
   unranked candidates are equally worse than all ranked candidates and are
   considered to not appear on any ballot on which they are unranked.

In theory, it should be possible to generate two sets of ranked ballots
which share these properties.  If this cannot be done, then for C
candidates, we require (C*(C-1+2)+1) or (C^2+C+1) pieces of information
each equal or less in size than the information that there are N ballots.
In theory, we must store N! distinct pieces of information to store ranked
ballots.

Should it be impossible to generate two sets of ranked ballots sharing the
above properties, then the above properties can be summed.

Should it be possible to generate two sets of ranked ballots sharing the
above properties—it *must* be—then there is another question.

Is it possible to generate two sets of ranked ballots sharing the above
properties such that a runoff between the ballots does not compute exactly
the same values for the above properties at each round?

If no, then we may sum the ballots and compute e.g. Instant Runoff Voting,
but we may not know what the actual ballots are in this case.

If yes, then a third question:  is it possible to generate two sets of
ranked ballots sharing the above properties such that a runoff between the
ballots does not compute exactly the same values for the number of ballots
on which each alternative occurs *first* in each round?

If no, then we may still sum the ballots and compute the runoff without
knowing the ballots themselves.

If the above were possible, then it should be possible to eliminate each
candidate in turn and test the number of additional ballots on which each
other candidate appears first, and thus reason that there are N ballots
where A appears first, and by elimination you have M ballots each where B,
C, D, and so forth are ranked second directly after A, plus ballots where A
is the only candidate ranked, and so have counts for A>B A>C A>D and so
forth.  Repeat with each alternative and you will have the entirety of
first ranks, and then can proceed through every round of run-off singly,
and in this way discover all rankings in polynomial time O(n^2) for (n)
alternatives.

Consider the logic:

 - "A is first on 10 ballots" is not the proposition; rather, it is that
BEGIN precedes A on 10 ballots.
 - Likewise, "A is last on 10 ballots" is just "A precedes END"
 - END cannot precede any alternative on any ballot
 - No candidate can precede BEGIN on any ballot
 - Ballots cannot be added or removed
 - The candidates must appear on exactly the same number of ballots

The initial rules are essentially similar to saying "the [first|last] pair
of candidates is {A,B} on N ballots", with A or B being BEGIN or END.

When rearranging ballots, the same number of ballots must start and end
with the given candidates.  Consider any race with three candidates:

 - First and last candidates must stay in place
 - It is impossible to insert candidates between the two candidates ranked
on any ballot of two candidates, as you would change the number of ballots
or the immediate-precedence properties
 - It is impossible to rearrange second candidates, as this would change
the immediate-precedence properties

Thus in a race of three candidates, the above properties will always
describe all ballots.

In a race of four candidates, I cannot break these rules:

 - A>B>C
 - D>A>B>C

The unit of three can move around, but we cannot have a mobile candidate
without ranking in multiple positions on a single ballot.

In a race of five candidates, I can break these rules:

 - A>B>C>E
 - E>A>D>C

If we were to swap B and D, then all is well; however, because *only one
candidate* remains unranked, these ballots are mathematically-equivalent to
a completed ballot, and can be recorded as such:

 - A>B>C>E>D
 - E>A>D>C>B

Now swapping any candidates between the rankings is again impossible.  If
instead our original ballots are as follows:

 - A>B>C
 - E>A>D>C

Then we can produce the following:

 - A>D>C
 - E>A>B>C

Yet this continues to illustrate the problem that we can simply account as
such:

 - A>B>C
 - E>A>D>C>B

Now with five candidates it is impossible to generate a set of ballots with
mobile candidates which can swap positions.

To my senses, this creates an absurdity.

There must be 5! or 120 combinations, and so we need 120 pieces of
information—those being integers of log(N) bits for N ballots—to store the
count of any individual combination, each of which may be 100% of the
ballots or 0% of the ballots, or some number in between.

Yet to store this information uniquely-identifying a set of ranked ballots
in a race between five candidates, we only need:

 - 5 pieces for the number of each candidate ranked first.
 - 5 pieces for the number of each candidate ranked last.
 - 20 pieces, such that for each of the 5 candidates, 5 pieces to indicate
on how many ballots the candidate precedes each other candidate.
 - 1 piece, indicating the number of ballots.

That is 31 pieces of information or 26%.

It must be that this information can be stored in variable sizes of space
by indicating default values and truncation, and 30% is the absolute
average compression.

The properties fail to extend to six candidates:

- F>A>B>C
- E>A>D>C>B>F

If we truncate ballot two:

 - F>A>B>C
 - E>A>D>C

We can now swap:

 - F>A>D>C
 - E>A>B>C

Eliminate F. then A, and you have B first on one set, but D first on the
other.

*However*, there's one more problem:  *if you have a list of pairwise vote
margins, such swaps change those totals*.  Thus for six or more, you also
need C^2-C additional pieces of information.  For six:

 - 6 pieces for the number of each candidate ranked first.
 - 6 pieces for the number of each candidate ranked last.
 - 30 pieces, such that for each of the 5 candidates, 5 pieces to indicate
on how many ballots the candidate precedes each other candidate.
 - 30 pieces, being the pairwise vote margins
 - 1 piece, indicating the number of ballots.

Rather than 6! or 720, you need a mere 73 or 10% of the storage space.

I am uncertain of the pairwise vote margins conjecture and have been unable
to prove it mathematically within these bounds.  Pairwise margins and even
pairwise vote tallies is not sufficient to uniquely identify a set of
ranked ballots.

As well, it may very well be NP-Hard to compute ballots from this
information even if it is unique per ballot.

Still, it is summable and unique for up to five candidates, or so it seems;
and may be summable and unique for six with this extra information.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20190922/5e83ac85/attachment.html>