[EM] Scoring (was Re: OpenSTV 2.1.0 released)

Tue Sep 18 08:03:04 PDT 2012

On 09/16/2012 02:35 PM, Juho Laatu wrote:
> On 16.9.2012, at 9.57, Kristofer Munsterhjelm wrote:
>

>>
>> (More precisely, the relative scores (number of plumpers required)
>> become terms of type score_x - score_(x+1), which, along with SUM
>> x=1..n score_x (just the number of voters), can be used to solve
>> for the unknowns score_1...score_n. These scores are then
>> normalized on 0..1.)
>>
>> It seems to work, but I'm not using it outside of the fitness
>> function because I have no assurance that, say, even for a monotone
>> method, raising A won't decrease A's score relative to the others.
>> It might be the case that A's score will decrease even if A's rank
>> doesn't change. Obviously, it won't work for methods that fail
>> mono-add-plump.
>
> What should "candidate's score" indicate in single-winner methods? In
> single-winner methods the ranking of other candidates than the winner
> is voluntary. You could in principle pick any measure that you want
> ("distance to victory" or "quality of the candidate" or something
> else). But of course most methods do provide also a ranking as a
> byproduct (in addition to naming the winner). That ranking tends to
> follow the same philosophy as the philosophy in selecting the winner.
> As already noted, the mono-add-plump philosophy is close to the
> minmax(margins) philosophy, also with respect to ranking the other
> candidates.

What should "candidate's score" indicate? Insofar as the method's winner
is the one the method considers best according to the input given, and
the social ordering is a list of alternatives in order of suitability
(according to the logic of the method), a score should be a finer
gradation of the social ordering. That is, the winner tells you what
candidate is the best choice, the social ordering tells you which
candidates are closer to being winners, and the rating or score tells
you by how much.

If the method aims to satisfy certain criteria while finding good 
winners, it should do so with respect to finding the winner, and also 
with respect to the ranking and the score. A method that is monotone 
should have scores that respond monotonically to the raising of 
candidates, too.
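
To make "scores that respond monotonically" concrete, here is a minimal
spot check, assuming strict complete rankings and using plain Borda
scores as a stand-in scoring rule. The helper names (borda_scores,
raise_one_step, score_monotone_for) are made up for illustration. It
only tries raising a candidate one step on one ballot at a time, so it
can find a counterexample but cannot prove monotonicity.

```python
def borda_scores(ballots, candidates):
    """Borda score: n-1 points for first place, down to 0 for last."""
    n = len(candidates)
    scores = {c: 0 for c in candidates}
    for ballot in ballots:
        for pos, c in enumerate(ballot):
            scores[c] += n - 1 - pos
    return scores


def raise_one_step(ballot, c):
    """Return a copy of the ballot with c moved up one position."""
    b = list(ballot)
    i = b.index(c)
    if i > 0:
        b[i - 1], b[i] = b[i], b[i - 1]
    return b


def score_monotone_for(ballots, candidates, c, score_fn):
    """Spot check: raising c on any single ballot must not shrink
    c's score margin over any other candidate."""
    before = score_fn(ballots, candidates)
    for k in range(len(ballots)):
        changed = [list(b) for b in ballots]
        changed[k] = raise_one_step(ballots[k], c)
        after = score_fn(changed, candidates)
        for o in candidates:
            if o != c and after[c] - after[o] < before[c] - before[o]:
                return False
    return True


candidates = ["A", "B", "C"]
ballots = ([["A", "B", "C"]] * 4
           + [["B", "C", "A"]] * 3
           + [["C", "A", "B"]] * 2)
results = {c: score_monotone_for(ballots, candidates, c, borda_scores)
           for c in candidates}
```

Any other scoring function with the same signature could be passed in
place of borda_scores, which is where a check like this would be more
interesting.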

> I note that some methods like Kemeny seem to produce the winner as a
> byproduct of finding the optimal ranking. Also expression "breaking a
> loop" refers to an interest to make the potentially cyclic society
> preferences linear by force. In principle that is of course
> unnecessary. The opinions are cyclic, and could be left as they are.
> That does not however rule out the option of giving the candidates
> scores that indicate some order of preference (that may not be the
> preference order of the society).

I think most methods can be made to produce a social ranking. Some
methods do this on their own, like Kemeny. For others, you just extend the
logic by which the method in question determines the winner. For 
instance, disregarding ties, in Schulze, the winner is the candidate 
whom nobody indirectly beats. The second place finisher would then be 
the candidate only indirectly beaten by the winner, and so on.
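
As a rough sketch of that extension for Schulze (assuming no ties, and
assuming path strength is measured by winning votes, which is only one
of the possible choices): compute the strongest paths, then order the
candidates by how many others indirectly beat them. The function names
and the three-candidate example are mine, for illustration only.

```python
def strongest_paths(d, candidates):
    """Widest-path (Floyd-Warshall style) strengths.

    d[a][b] is the number of voters ranking a above b.
    """
    p = {a: {b: 0 for b in candidates} for a in candidates}
    for a in candidates:
        for b in candidates:
            if a != b and d[a][b] > d[b][a]:
                p[a][b] = d[a][b]
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    return p


def schulze_ranking(d, candidates):
    """Order candidates by how many others indirectly beat them.

    Absent ties, the winner is beaten by nobody, the runner-up only
    by the winner, and so on.
    """
    p = strongest_paths(d, candidates)
    beaten_by = {
        c: sum(1 for o in candidates if o != c and p[o][c] > p[c][o])
        for c in candidates
    }
    return sorted(candidates, key=lambda c: beaten_by[c])


# Cycle from 4: A>B>C, 3: B>C>A, 2: C>A>B.
# A beats B 6-3, B beats C 7-2, C beats A 5-4.
d = {
    "A": {"A": 0, "B": 6, "C": 4},
    "B": {"A": 3, "B": 0, "C": 7},
    "C": {"A": 5, "B": 2, "C": 0},
}
ranking = schulze_ranking(d, ["C", "B", "A"])
```

Here the weakest defeat (C over A, 5-4) is the one that gets overridden,
so the ranking comes out as A, B, C.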

>>
>> Turning rankings into ratings the "proper" way highly depends on
>> the method in question, and can get very complex. Just look at this
>> variant of Schulze: http://arxiv.org/abs/0912.2190 .
>
> They seem to aim at respecting multiple criteria. Many such criteria
> could maybe be used as a basis for scoring the candidates. Already
> their first key criterion, the Condorcet-Smith principle is in
> conflict with the mono-add-plump score (there can be candidates with
> low mono-add-plump score outside the Smith set).
>
> My favourite approach to scoring and picking the winner is not to
> have a discrete set of criteria (that we try to respect, and violate
> some other criteria when doing so) but to pick one philosophy that
> determines who is the best winner, and also how good winners the
> other candidates would be. The chosen philosophy determines also the
> primary scoring approach, but does not exclude having also other
> scorings for other purposes (e.g. if the "ease of winning" differs
> from the "quality of the candidate").

If you do that, you get into a problem when comparing methods, however. 
Every method can be connected to an optimality measure that it 
optimizes. That measure might be simple or it might be very complex, but 
still, there's a relation between the method and something that it 
attempts to optimize. Discussing methods could then easily end up at
cross purposes where one person says: "but I think minmax is the obvious
natural thing to optimize", and another says "but I think mean score is 
the obvious natural thing to optimize", and nobody gets anywhere.

At least with criteria, we have some way of comparing methods. We can 
say that this method behaves weirdly in that if some people increase the 
ranking of a candidate, the candidate might lose, whereas that method 
does not; or that this method can deprive candidates of a win if 
completely alike candidates appear, whereas that method does not.

Or perhaps it's more appropriate to say that if we want to compare 
methods by some optimization function or philosophy, we should have some 
way of anchoring that in reality. One may say "I think Borda count is 
the obvious natural thing to optimize", but if we could somehow find out
how good the candidates elected by optimizing for Borda actually are,
that would let us compare the philosophies. Yet to do so, we'd either
have to have lots of
counterfactual elections or a very good idea of what kind of candidates 
exist and how they'd be ranked/rated so that it may be simulated, 
because we can't easily determine how good society would be "if" we 
picked candidate X instead of Y.

(Well, we might in very limited situations: for example, one could take
the pairwise matrix for a chess tournament once it's halfway through and 
use that to find the winner, then determine how often each election 
method gets it right. However, it's not obvious if "accuracy at 
predicting chess champions" is related to "being able to pick good 
presidents", say.)