[EM] ASCII maps

Fri Feb 25 08:14:02 PST 2011

Kevin Venzke wrote:
> Hi Kristofer,
> 
> --- En date de : Lun 21.2.11, Kristofer Munsterhjelm <km-elmet at broadpark.no> a écrit :
>> Then it tries to
>> come up with a nice map
>>> that minimizes inaccuracy.
>> You could try using synthetic coordinate algorithms for
>> mapping the distances to 2D. I did that for competing
>> entries in a programming game, once, using the centralized
>> Vivaldi algorithm as described in
>> pdos.csail.mit.edu/papers/vivaldi:sigcomm/paper.pdf.
>>
>> Another option would be to use principal components
>> analysis, but I know less about that.
> 
> Oh dear. That looks very complicated. I just used the number of scenarios
> where the methods agreed. So, the maximum distance between two methods is
> 27. When the methods are plotted I find the Euclidean distance, subtract
> from the desired distance, and square it. Add all that up and that's the
> measure of a poor fit.
> 
> To improve the fit it tries rerolling a few candidates, or tweaking their
> positions.

FWIW, I dug up that old code and made a map of my own using Kendall tau 
distance between the different methods' output orderings. The ballot 
generator was a combination of IIC (every ordering equally likely) and a 
spatial model - every other round used the one, every other the other.

The picture is here: http://munsterhjelm.no/km/elections/Methods.png

I don't see any obvious axial characteristics. Perhaps methods closer to 
the bottom are more indecisive, but it ranks Copeland in the "misc 
Condorcet methods" blob, ahead of VMedian-Ratings which does try to 
break ties quite a bit more. Or perhaps methods closer to the bottom 
make use of less information at once, which could explain why the 
eigenvector-type methods are near the top.

As for the abbreviated terminology I use, I'll explain some of the methods:

Eliminate[X] is a loser elimination method using X. Eliminate[Plurality] 
is IRV.

AVGEliminate[X] is a below-mean loser elimination method using X. 
AVGEliminate[Borda] is Nanson.

Keener is the Keener eigenvector method. Maximin is the pairwise (but 
not Condorcet) method where the candidate who beats whoever he beats by 
the smallest margin, by the greatest, wins. Minmin is the method where 
X's score is not the greatest win of some candidate Y against him, for 
an Y maximizing it, but the least win of some candidate Y against him, 
for an Y minimizing it, negated.

Ext-Minmax breaks Minmax ties by considering the next-to-greatest win 
and so on. Same thing with Ext-Minmin wrt ordinary Minmin.

HITS is https://secure.wikimedia.org/wikipedia/en/wiki/HITS_algorithm . 
NREM-Opt is Warren's optimal positional system (optimal in context of 
the every-ordering-equally-likely model, in that it produces the least 
Bayesian regret of all ranked methods under that model).

L-R are variations upon Least Reversal. LR-Defense is the simple least 
reversal method: least sum of opposing pairwise stats is better. 
LR-Offense is the symmetric version for offense: the candidate with the 
greatest sum of pairwise victory margins (or wv, etc) wins. LR-Both 
scores by the former subtracted from the latter.

Median Ratings are just that, either *norm*alized to 0-10 or with raw 
utilities (which is just the rank index for IIC). VMedian-Ratings break 
ties by considering ever expanding truncated means when the median fails 
to provide a result. Mode-Ratings is the silly method where the 
candidate with the greatest mode wins. It is not monotone, and I thought 
its strategy aspects and monotonicity weirdness would be similar to 
IRV's. Apparently not, unless the map's a bad fit.

Gradual-Set is a group of methods based on the Condorcet-Borda relation 
that Stephen Turner pointed out. It starts by calculating the Set 
(Smith, CDTT, whatever) using the pairwise matrix where A beats B by 
c(A,B), i.e. median values of what Stephen called s(A,B). This will 
produce a division; some are considered better than others, but since 
Set is a set, it won't be decisive, so the method expands the median to 
a truncated mean and breaks ties with the new ordering found by applying 
the set method to the new matrix. It continues alternating between 
expanding and refining the output until either all ties have been broken 
or there are no more data points to use.

Smith is the Smith set. Same for CDTT and CGTT, they are what they say. 
SDom ranks candidates strongly dominated by any other (as by the 
Independence of Strongly Dominated Alternatives criterion River passes) 
above those nondominated. MDD is the Majority Defeat Disqualification 
set: candidates who would be eliminated in MDDA are ranked below those 
who wouldn't be.