[EM] STV version of VoteFair Representation ranking

Wed Feb 9 11:51:41 PST 2022

In the new "Hare clustering MAM-based PR, and a tie problem" thread,
Kristofer Munsterhjelm wrote:
 > .... It's based around the idea of making a
 > number of virtual constituencies (one for each seat), and
 > then trying to allocate voters to each so that the case for
 > the winner (according to the Ranked Pairs method run in
 > that constituency) is as strong as possible, i.e. the
 > choice of winner or winning order is as justified as
 > it can be.
 > ...
 > Then both Kemeny and RP, in the absence of ties, calls an election by
 > producing the social ordering that maximizes its respective score
 > function. That is, Kemeny chooses the one whose sum of consistent
 > pairwise victories is maximized, and RP the one whose largest pairwise
 > victory is maximized, and among these, the one whose next largest
 > pairwise victory is maximized, and so on.

Kristofer, what you're describing sounds similar to the STV version of 
VoteFair Representation ranking:

https://electowiki.org/wiki/VoteFair_representation_ranking#STV_version

I added this section to Electowiki recently because there is growing 
interest in using STV to elect city councils, yet STV is overly 
simplistic, and has flaws similar to IRV flaws (especially ignoring 
valid ranking data that's very relevant).

This STV version of VoteFair Representation ranking extends the two-seat 
version of it that I designed for partisan elections.  The result is an 
STV-like non-partisan method.

Of course it's based on the Kemeny method rather than Ranked Pairs. 
That's my bias for the reason you mention, namely that locking in a 
pairwise win can block additional info.  This is similar to the way IRV 
blocks lower-ranked preference info from being looked at.

Resolving ties is not a problem with this method.  When the Kemeny 
method declares a tie, it has already used all the available pairwise 
count numbers, so it's an actual tie, not a 
tip-of-the-iceberg-let's-look-deeper kind of tie.  In a governmental 
election just recounting the ballots usually resolves such ties.  Or the 
tie can be resolved by adding one ballot -- supplied by a judge or based 
on a random selection process -- that ranks all the candidates at 
different levels.

Whereas the simple version of STV reduces a ballot to zero influence 
when it helps elect one candidate, this "VoteFair" STV method reduces 
the influence of a ballot to a decimal amount after it helps one 
candidate win, and then reduces the ballot to zero influence after it 
fractionally helps a second candidate win.

Of course the Kemeny calculations are NP-hard, so it's more practical to 
use an estimation method (which I suggest to be a variation of the 
insertion-sort algorithm using the pairwise count matrix) to identify 
the top 12 candidates.  Then the full Kemeny calculations are done for 
those top 12.

So, Kristofer, I agree that you are on to something worth pursuing.

For the benefit of those wanting to see the forest for the trees:

Of course the Ranked Pairs method could be inserted as a drop-in 
replacement for the Kemeny method in my VoteFair STV method, but as 
Kristofer says, resolving ties becomes an issue.

For comparison with the simplistic version of STV that the FairVote 
organization promotes, that method's biggest flaw might be that it 
tosses out a ballot when counting reaches a point where two candidates 
are ranked at the same preference level.  That's crazy.  Even using 
their simplistic approach, those ballots can be set aside and checked 
again during each elimination round, and resume counting them when only 
one of the same-ranked candidates remains in the race.  Of course 
neither Kristofer's method nor the "VoteFair" STV method has this 
disadvantage.

Again, Kristofer, I appreciate that you take time to share your wisdom here.

Richard Fobes
The VoteFair guy

On 2/9/2022 7:57 AM, Kristofer Munsterhjelm wrote:
> I was going through some old posts of mine and found this idea for a
> MAM/Ranked Pairs-based PR method. It's based around the idea of making a
> number of virtual constituencies (one for each seat), and then trying to
> allocate voters to each so that the case for the winner (according to
> the Ranked Pairs method run in that constituency) is as strong as
> possible, i.e. the choice of winner or winning order is as justified as
> it can be.
>
> When I last had this thought, I stopped early because I couldn't make
> sure that it would indeed choose the optimum in each constituency; but
> recently, I realized that if it can't, then single-winner Ranked Pairs
> can't, either.
>
> But that doesn't make a lot of sense unless I describe just what
> single-winner RP's score/metric is, and how it maximizes it, and then
> how it (perhaps) doesn't. So I'll do that first, then get to the PR
> method afterwards.
>
>
> Suppose we have a social ordering, A>B>C. Then both Kemeny and RP assign
> a score to this ordering, that is a function of the pairwise victories
> consistent with this ordering (in this case A>B, B>C, and A>C). Kemeny's
> score is the sum of the strength of victories, while RP's is the value
> of the strongest victory, ties broken by the value of the next
> strongest, ties broken by the value of the third strongest, and so on.
>
> Then both Kemeny and RP, in the absence of ties, calls an election by
> producing the social ordering that maximizes its respective score
> function. That is, Kemeny chooses the one whose sum of consistent
> pairwise victories is maximized, and RP the one whose largest pairwise
> victory is maximized, and among these, the one whose next largest
> pairwise victory is maximized, and so on.
>
> But if there are ties, then we're in trouble. The Ranked Pairs algorithm
> works by starting at the strongest pairwise victory, locking that in,
> then proceeding to the next strongest. This is a greedy approach, and it
> maximizes the RP score when there are no ties. But if there's a tie,
> then it might be that earlier choices close off later choices in ways
> that we can't determine from the start.
>
> E.g. suppose the defeat strengths are as follows, in sorted order by
> strength:
>
> C>A: 100
> D>A: 99
> A>B: 80
> B>C: 80
> B>D: 75
> (something else): 60
> ...
>
> First we admit C>A and D>A. Then we have a choice of either admitting
> A>B or B>C; we can't admit both.
>
> So suppose we admit A>B. Then we have {C,D}>A>B; admitting B>C would
> produce a cycle, and so would B>D. So we skip, go to the something else,
> and the score vector is [100, 99, 80, 60, ...].
>
> On the other hand, suppose we admit B>C. Then we have D>A and B>C>A. Now
> we can admit B>D and get B>D>C>A with score vector [100, 99, 80, 75, 60,
> ...], which is better by the RP metric.
>
> But the problem here is that when we were considering how to break the
> strength-80 tie, we didn't know which would close off the ability to
> admit B>D. So ordinary RP only maximizes the score vector in the absence
> of ties.
>
>
> So now, the MAM/RP-based method. The idea is this: through linear
> programming we can find what's the greatest strength pairwise victory
> supported by a Hare quota of the voters without having to determine who
> the voters is.
>
> To maximize the RP score in a constituency (subject to the caveats
> above), we keep a list of pairwise victories that we've affirmed, just
> like in ordinary Ranked Pairs. Then for each unaffirmed pairwise
> victory, we use LP to determine which Hare quota consistent with what
> we've already locked will maximize the strength of that victory; and
> then we choose the one that doesn't produce a cycle, and that has
> maximum strength.
>
> For instance, suppose the Hare quota is 1000. When starting out, we find
> that it's possible to choose a Hare quota of voters so that everybody
> prefers A>B. So we lock A>B.
> Then we try every other pairwise preference, and in essence for X>Y ask:
> what's the largest support you can give X>Y by choosing a Hare quota of
> voters all of whom rank A>B?
> Suppose B>C has largest such support 850, while C>D has largest support
> 900. Okay, then we admit C>D.
> Then we ask for all other pairwise preferences X>Y: what's the largest
> support you can give X>Y by choosing a Hare quota of voters, all of whom
> rank A>B, and 900 of whom rank C>D?
>
> And so on. As we admit more and more pairwise preferences, the Hare
> quota's worth of voters becomes increasingly more distinct until the
> constituency is uniquely defined. At that point we assign the RP winner
> to this constituency, remove these voters (and the winner) from the
> election, and repeat with the next constituency.
>
>
> But the tie problem shows up here, too. What should we do if the number
> of seats is rather large so that (for the first step) it's easy to find
> a Hare quota all of whom support X>Y, no matter who X and Y are? We have
> to choose one of them, but we can't be sure we'll maximize the RP score
> because of the path dependence problem I mentioned. So that's where I
> got stuck.
>
> But then I realized that single winner RP has this problem too! So it
> shouldn't be a strike against the legitimacy of the RP procedure; we
> just need to break the tie somehow. Choosing the pairwise preference
> with most total support over the whole electorate is probably as good a
> way as any.
>
> So this multiwinner method should be implementable, although it would be
> very slow: for each step, we'd have to go through every pairwise
> preference to find the one with maximum attainable support; and worst
> case, we have to go through n^2 steps. That makes for O(n^4) calls to
> the LP solver, which, though it's polynomial in the number of voters,
> may have a very high polynomial degree indeed.
>
> Perhaps there is some dynamic programming trick that could be used
> instead of linear programming, but I don't see it at the moment.
>
> And perhaps there is a way to look ahead and actually maximize the RP
> score, but I doubt it is. I suspect it's related to that determining if
> candidate X can be made an RP winner by breaking ties in a particular
> way -- is NP-complete.
>
> -km
> ----
> Election-Methods mailing list - see https://electorama.com/em for list info
>