[EM] Simmons' brilliantly simple "data compression" idea for large multiwinner elections

Tue Dec 15 14:19:17 PST 2015

I like your idea about C*(C-1) factions determined by 1st and 2nd choices.

I'm sure you've already thought of the obvious way of dealing with
equal-top and equal-second ratings, but for the record, here it is:

Suppose the scores on a certain ballot B for the candidates
c1, c2, c3, ... are, respectively,

100%, 100%, 80%, 80%, 80%, 70%, 60%, ...

Each of the six bins in the set {bin(X, Y) | (X, Y) in the Cartesian product
of {c1, c2} and {c3, c4, c5}} should get a copy of B with weight 1/6.
That weight is then the weight B carries in the weighted average of the
ballots in each of those bins.
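
A minimal Python sketch of this splitting rule, for illustration only. The
function names bin_weights and compress are hypothetical, and leaving a ballot
with a single score level unbinned is an assumption, since that case is not
discussed above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def bin_weights(scores):
    """Split one ballot among bins (X, Y): X tied for top, Y tied for second.

    `scores` maps candidate name -> score. Returns {(X, Y): weight}.
    """
    levels = sorted(set(scores.values()), reverse=True)
    if len(levels) < 2:
        return {}  # assumption: no second level means the ballot is left unbinned
    top = [c for c, s in scores.items() if s == levels[0]]
    second = [c for c, s in scores.items() if s == levels[1]]
    w = Fraction(1, len(top) * len(second))
    return {(x, y): w for x, y in product(top, second)}

def compress(ballots):
    """Return {bin: (weighted-average ballot, total weight in that bin)}."""
    totals = defaultdict(lambda: defaultdict(Fraction))
    mass = defaultdict(Fraction)
    for scores in ballots:
        for key, w in bin_weights(scores).items():
            mass[key] += w
            for c, s in scores.items():
                totals[key][c] += w * s
    return {key: ({c: t / mass[key] for c, t in totals[key].items()}, mass[key])
            for key in totals}

# The ballot from the example above (scores in percent).
ballot = {"c1": 100, "c2": 100, "c3": 80, "c4": 80, "c5": 80, "c6": 70, "c7": 60}
weights = bin_weights(ballot)
for key in sorted(weights):
    print(key, weights[key])
```

Exact fractions are used so that the six weights sum to exactly one.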

Also for the record, here is a suggestion for how to turn ordinal
(i.e. ranked preference style) ballots into cardinal (i.e. range style)
ballots, so that Pereira's transform, and so forth, can be used in that
context:

(1) Find the respective random ballot probabilities p1, p2, ... for the
respective candidates c1, c2, ...

(2) For each candidate c_i ranked above bottom on ballot B, assign the
score r_i = one minus the sum of the p_k such that c_k is ranked strictly
above c_i on ballot B.
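
Steps (1) and (2) can be sketched in Python as follows. This is only an
illustration under stated assumptions: a ballot is a list of tiers of tied
candidates, best tier first; unlisted candidates are ranked bottom and score
zero; and the standard random ballot probability of a tied top rank is split
equally. The function names and the small 100-voter profile are hypothetical.

```python
from fractions import Fraction

def random_ballot_probs(ballots, candidates):
    """Step (1): chance that a randomly drawn ballot elects each candidate.

    `ballots` is a list of (count, tiers). Assumption: a tied top tier
    shares its ballot's probability equally.
    """
    total = sum(n for n, _ in ballots)
    p = {c: Fraction(0) for c in candidates}
    for n, tiers in ballots:
        top = tiers[0]
        for c in top:
            p[c] += Fraction(n, total * len(top))
    return p

def ordinal_to_scores(tiers, p, candidates):
    """Step (2): r_i = 1 - sum of p_k over candidates ranked strictly above c_i."""
    scores = {c: Fraction(0) for c in candidates}  # bottom-ranked candidates get 0
    above = Fraction(0)
    for tier in tiers:
        for c in tier:
            scores[c] = 1 - above
        above += sum(p[c] for c in tier)
    return scores

# Hypothetical profile: 60 voters rank A>B>C, 40 voters rank B>C (A at bottom).
candidates = ["A", "B", "C"]
ballots = [(60, [["A"], ["B"], ["C"]]), (40, [["B"], ["C"]])]
p = random_ballot_probs(ballots, candidates)
for n, tiers in ballots:
    print(n, ordinal_to_scores(tiers, p, candidates))
```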

Remarks:

Since there is no candidate ranked above a top ranked candidate, rule (2)
gives every top ranked candidate a score of one.

Unfortunately step (1) cannot be done at the precinct level, unless we are
willing to make do with the precinct estimates of the random ballot
probabilities.

"Random ballot probabilities" can be interpreted in various ways.  For
example we could use the "implicit approval" random ballot model for the
respective probabilities.  I like this version better because it would give
candidate C a higher rating in the following scenario:

45 A1=A2=A3>C
55 B1=B2=B3>C

The respective standard random ballot probabilities (.15, .15, .15, .55/3,
.55/3, .55/3, 0) would yield score vectors ...

45 (1, 1, 1, 0, 0, 0, .55)
55 (0, 0, 0, 1, 1, 1, .45)

Random implicit approval ballot would give C a probability of 1/4 if ties
were to be resolved by coin tosses, or p_C=100% if ties were to be resolved
by further ballot draws.
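
For the coin toss version, the probabilities can be computed with a short
Python sketch. It assumes that "implicit approval" means every candidate
ranked above bottom on the drawn ballot is approved, and that the coin toss
picks uniformly among them. The function name is hypothetical, and the
further-draws variant is not implemented here.

```python
from fractions import Fraction

def implicit_approval_probs(ballots, candidates):
    """Draw one ballot at random, then pick uniformly among its approved set.

    `ballots` is a list of (count, approved), where `approved` lists every
    candidate ranked above bottom on that ballot (the assumption named above).
    """
    total = sum(n for n, _ in ballots)
    p = {c: Fraction(0) for c in candidates}
    for n, approved in ballots:
        for c in approved:
            p[c] += Fraction(n, total * len(approved))
    return p

candidates = ["A1", "A2", "A3", "B1", "B2", "B3", "C"]
ballots = [
    (45, ["A1", "A2", "A3", "C"]),  # 45 A1=A2=A3>C
    (55, ["B1", "B2", "B3", "C"]),  # 55 B1=B2=B3>C
]
p = implicit_approval_probs(ballots, candidates)
for c in candidates:
    print(c, p[c], float(p[c]))
```

Under these assumptions C gets exactly 1/4, each A candidate gets
9/80 = 0.1125, and each B candidate gets 11/80 = 0.1375.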

In the latter case (which I prefer) the probability vector would be

(0, 0, 0, 0, 0, 0, 1),

and the resulting score vectors would be

45  (1, 1, 1, 0, 0, 0, 1)
55  (0, 0, 0, 1, 1, 1, 1)

In the former case (of coin toss tie breaking) the probability vector would
be

(.1125, .1125, .1125, .1375, .1375, .1375, .25),

and the score vectors would be

45 (1, 1, 1, 0, 0, 0, .6625)
55 (0, 0, 0, 1, 1, 1, .5875)

In either case, C is the range winner.
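
The range totals can be checked with a short Python sketch. The helper names
are hypothetical, and the coin toss probabilities are entered as the exact
values 9/80, 11/80 and 1/4. Under the standard random ballot probabilities C
totals 49.5 against 55 for each B candidate, while C totals 100 with further
ballot draws and 62.125 with coin tosses.

```python
from fractions import Fraction as F

CANDS = ["A1", "A2", "A3", "B1", "B2", "B3", "C"]
# (count, top tier); on every ballot C is ranked second and the rest are bottom.
BALLOTS = [(45, ["A1", "A2", "A3"]), (55, ["B1", "B2", "B3"])]

def range_totals(p):
    """Sum the scores: top tier scores 1, C scores 1 - (probability ranked above it)."""
    totals = dict.fromkeys(CANDS, F(0))
    for n, top in BALLOTS:
        above = sum(p[c] for c in top)
        for c in top:
            totals[c] += n
        totals["C"] += n * (1 - above)
    return totals

def winner(p):
    totals = range_totals(p)
    return max(totals, key=totals.get)  # first candidate listed wins a tie

def vec(*vals):
    return dict(zip(CANDS, vals))

standard = vec(F(15, 100), F(15, 100), F(15, 100),
               F(55, 300), F(55, 300), F(55, 300), F(0))
further_draws = vec(0, 0, 0, 0, 0, 0, 1)
coin_toss = vec(F(9, 80), F(9, 80), F(9, 80),
                F(11, 80), F(11, 80), F(11, 80), F(1, 4))

for name, p in [("standard", standard), ("further draws", further_draws),
                ("coin toss", coin_toss)]:
    print(name, winner(p), float(range_totals(p)["C"]))
```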

On Tue, Dec 15, 2015 at 9:50 AM, Warren D Smith <warren.wds at gmail.com>
wrote:

> I think this whole "data compression" idea by Forest Simmons for
> multiwinner elections is brilliantly simple.
>
> The obvious first worry about Forest Simmons' idea is that averaging
> too-large sets of ballots as step 1 destroys all hope for proportional
> representation voting.
>
> E.g. imagine we averaged ALL the ballots as step 1, then we'd only
> have one amalgamated ballot, and from it we clearly would be unable to
> elect a PR parliament.
>
> But Simmons is being smarter about it: he is reducing everything down
> to C amalgamated ballots, where C is the number of candidates -- not one.
> I.e. he amalgamates all the ballots that score X highest, for each
> candidate X.
>
> And that actually seems acceptable, in the sense that with 100% "racist"
> voters in a situation with "colored" voters and candidates, you would not
> lose any information in this way!  Thus, you could still do proportional
> representation in "racist" situations, which by some reckonings is all you
> need -- i.e. if the definition of PR is "yields color-proportionality in
> racist situations."
>
> However, there would be a problem if we sort of had "2-level racism."
> E.g. suppose every voter had both a "color" and a "secondary color."
> Ditto for candidates.  Voters give candidates score 7 if they agree on
> color, and bonus score 3 if they agree on secondary color.
>
> In that kind of situation, Simmons' "data compression" method would lose
> information and presumably lose the ability to deliver (the more clever
> sort of) PR that it ought to.
>
> Also, Toby Pereira and (at his urging) I too like to think about
> elections in which there are 2 kinds of candidates -- colored and
> uncolored -- and voters give same-color candidates score 10, other-color
> candidates score 0, and uncolored candidates get a score that depends only
> on the candidate, not on the voter.  But it seems to me Simmons' data
> compression technique already is lossless for these elections.
>
> And it seems to me we can devise other "data compression methods"
> which do not lose information in the Color+SecondaryColor scenarios.
> For example we could amalgamate all ballots which both score X highest,
> and score Y highest among candidates getting a lower score than X;
> put all those ballots into bin(X,Y), then amalgamate all ballots
> within any one bin.
> This results in at most (C-1)*C ballots after compression -- and at most
> (C-1)*C^2 approval-style ballots after both the compression
> and a Pereira transform -- in a C-candidate election.
>
> The other brilliant thing about Simmonsesque data compression is,
> this permits multiwinner PR elections to be "counted in precincts."
>
> --
> Warren D. Smith
> http://RangeVoting.org  <-- add your endorsement (by clicking
> "endorse" as 1st step)
>