[EM] Even more complex Sainte-Lague: method description and why

Tue Aug 20 14:53:03 PDT 2013

So, discovering the compromising incentive in ordinary Sainte-Lague, I 
thought that if one's looking for real deluxe methods, ordinary CPO-SL 
might not be  enough. In the example above, CPO-SL wouldn't make a 
difference since every party gets a seat. Thus I looked  about in my 
mind, trying to think of a more thoroughly Condorcet generalization of 
Sainte-Lague, and I think  I have found an ordinary multiwinner 
generalization of Sainte-Lague. That is, the method deals with 
individual  candidates, not parties. One could in theory use it for 
party list, but doing that in a brute-force manner by  simply replacing 
party preferences by the party's slate would be unfeasible when the 
number of seats grow to  the kind of numbers you would see for a 
national count.

Thus I'd have to work further to find out how to make it feasible for 
party list. But since it might be of  interest as a candidate-based 
method, I'll describe it here, even if it still has a few problems.

The framework is simple enough. It's a set comparison method like 
CPO-STV and Schulze STV: you have a large  matrix where the different 
possible councils are "candidates", each contest between two possible 
councils cA  and cB is decided by some function c(cA, cB), and then you 
run an ordinary Condorcet method on the huge matrix and the winning 
council is the one you pick.

For this particular method (I could really use a name -- if anyone has a 
suggestion, feel free to say it), c(cA, cB) is the sum of another 
function across all ballots. For a ballot b in the set of all ballots B,
say c(cA, cB) = SUM 1..n rep(cA, cB, b). rep() is short for 
"representation" since that's what it's supposed  to measure.

Now, onto rep(). To calculate rep(cA, cB, b), we'll need some constants 
and derived variables. Say s is the  number of seats and w is the weight 
of ballot b.
Then the first step is to write down the rank numbers for each council 
with respect to the ballot. The rank  number for a candidate is 0 if 
that candidate is ranked first, 1 if he's ranked second, and so on. Ties 
share the highest rank (lowest rank number), but the next rank after the 
equal-ranks is incremented by the  number of equal-ranked candidates 
(e.g. A>B=C>D has rank numbers 0113).
So if cA is XYZ and cB is XYW, and the ballot is X>A>W>Y>Z, then we get:
	cA ranking numbers: 0 (X), 3 (Y), 4(Z) or 034.
	cB ranking numbers: 0 (X), 3 (Y), 2(W) or 032.

Once that's done, make a list of all distinct rank numbers. Here, they 
are 0234. Truncate this list at s (the  number of seats) entries. The 
truncated list here is 023 since there are three seats. Then count the 
number  of ranks in that list for each candidate council, and call these 
counts sA and sB. If a voter ranks q candidates and q < s, only those q 
candidates are listed.

Let's do that for the candidate council here. I'll replace the number 
with a hyphen for each candidate that's in the  list.
	cA: 034, i.e. --4, or 2 in the list, so sA = 2.
	cB: 032, i.e. ---, or 3 in the list, so sB = 3.

If the ballot instead had been X>Z, then only the listed candidates are 
being considered, and we'd have had
	cA: 04 and
	cB: 0
The only distinct rank here would be the 0 and 4, so
	cA: --
	cB: -
sA = 2 and sB = 1.

We're almost done. Define f(x) to be the divisor generation function for 
the divisor method of your choice.  For Sainte-Lague, f(x) = 2x+1. Then, 
rep(cA, cB, b) = w - w/f(sA), and rep(cB, cA, b) = w - w/f(sB).

-

As one may easily determine, in the one-seat case without ties, this 
reduces to ordinary Condorcet. There's  one seat, so rep(cA, cB, b) 
gives w - w / f(1) to the candidate that wins, and w - w / f(0) to the 
one that  loses. Since f(0) = 1, w - w / f(0) = 0, so in effect, c(A, B) 
is equal to the number of people who prefer A  to B, times 2/3 (which is 
a constant throughout and so cancels out).

There is a slight snag involving ties, though. Consider the one-seat 
case with rep(A, B, b), with ballot b  ranking A and B equal at top. The 
ranking number is 0 for both, so both rep(A, B, b) and rep(B, A, b) 
returns  w - w / f(1). In effect, the function behaves like A=B means 
"both A>B and B>A". Most people would consider  that to be a bug.
I haven't found a proper, "continuous" solution, so for the time being, 
return 0 whenever rep(cA, cB, b) = rep(cB, cA, b). Any suggestions to a 
continuous solution (one that generalizes "smoothly" instead of
abruptly) would be welcome.

(Perhaps a certain person who has now retired would have liked this 
method since it is Sainte-Lague based and also, by coincidence, behaves 
pretty Approval-ish. But I don't think it passes the FBC.)

-

The whole ranking number business may seem rather arbitrary, which it in 
a way is because I tried different approaches until I found one that 
seemed to work... but I also had some criteria in mind and checked the 
method against them.

The first is a subset independence criterion. This states that if you 
add a bunch of candidates to both councils, that should increase both sA 
and sB by the same amount. The ranking number function rep passes this 
criterion as long as none of the added candidates are truncated: if you 
add a bunch of candidates to both councils, s increases, so these 
candidates will either both be counted on the cA and cB side, or on 
neither. This criterion excludes the first function I tried: one where 
you sort both cA and cB's ranking numbers in the same order, then go 
from left to right, increasing sA when cA's number is less than cB's in 
the column in question, and cB when cB's number is less than cA's. With 
{cA: 012, cB: 123}, that idea fails this criterion.

The second is a one-candidate inferiority/monotonicity criterion. If the 
councils are identical except for a single candidate, and b considers 
the candidate in cB inferior, then it should always be the case that 
rep(cA, cB, b) > rep(cB, cA, b) when w > 0. This one mostly serves to 
exclude a minimax matching algorithm I tried before arriving at the 
current rep(). The minimax matching would line up cA's and CB's ranking 
numbers so that the number of times cA's column was greater than cB's 
would be as close to the number of times cB's column was greater than 
cA's as possible; and then for each column, cA would get a point if its 
number was lesser than the other (i.e. candidate closer to the top). But 
that violated this criterion and so I didn't use it.

The third is a reduction to Sainte-Lague party list when everybody 
plumps for their party's list. This post is already pretty long, but 
I'll sketch a proof. Say parties are A...N, the number of voters voting 
for these parties is w_A...w_N, and the number of seats the parties get 
in council x is s_N_x. Then CPO-SL indicates that the "nonrepresentation 
sum" [SUM (R in A...N) w_R/f(s_R_x)] is minimized when x is the council 
you would get by Sainte-Lague. Equivalently, [SUM (R in A...N) w_R - 
w_R/f(s_R_x)] is maximized.
So we just need to show that when the voters plump, if b_R is the ballot 
plumping for R and thus has weight w_R, rep(x, y, b_R) returns w_R - w_R 
/ f(s_R_x). This would then imply that the Sainte-Lague council is the 
CW and wins.
If each party has enough candidates to fill the whole council and there 
are no ties, then the proof is simple. Every non-R candidate will have 
ranking numbers greater than the number of seats. So those can never
contribute to sx (score in favor of council x). So sx is equal to the 
number of R-candidates in x, i.e. s_R_x, and this is put into w_R - w_R 
/ f(sx), giving w_R - w_R / f(s_R_X). QED.
If the voters only rank those they plump for, the proof is similarly 
simple. Since truncated candidates aren't considered, then by the same 
argument as above, only candidates in R can contribute to s_R_X. But if 
the ballot ranks a party and then another, the reduction may fail, even 
if the ballot puts all party members of the same party first.

-

So the method *almost* works. It has a few discontinuties when dealing 
with ties and truncation - particularly  the "wrong" reduction to 
single-seat Condorcet (giving both A>B and B>A a point when A=B) and the 
explicit  disregard of truncated candidates instead of this arising from 
the method itself. Those features both make this  method more 
Approvalish and hint that the reduction to Sainte-Lague with party list 
isn't particularly robust.
So I would prefer, say, that the method still reduces to Sainte-Lague 
with bloc plumping when the ballots are  symmetrically completed. But I 
can't see how to do that.

If you have suggestions as for how to make it be more robust, please let 
me know. The whole "representativity  per ballot" idea might have to be 
altered into something that uses the whole ballot set at once. I don't know.

It might also be possible to use more well-known pairwise optimality 
properties instead of w_x - w_x / f(sX). For instance, it could be 
possible to use the Webster minimization property Warren mentions in 
http://rangevoting.org/Apportion.html. One would then let the contest 
between A and B be w_x * (sX/w_x - s/V)^2, where sX is the score for X 
(the number of seats he "gets"), s is the number of seats in total, and 
V is the number of voters (sum of all weights). I don't know if it would 
generalize sufficiently, though; the best thing to do would be to check 
if it would pass the mentioned reduction and then compare performance on 
the proportionality/BR tradeoff curve.