[EM] A method "DNA" generator, tester, and fixer

Wed May 19 10:06:31 PDT 2010

Kevin Venzke wrote:
> Hi Kristofer,
> 
> --- En date de : Mar 18.5.10, Kristofer Munsterhjelm <km-elmet at broadpark.no> a écrit :
>>> I've been working on a new method generator/tester/fixer. I did 
>>> this once before, and my approach is still the same, but now
>>> truncation is allowed (instead of strict ranking). The old
>>> simulation only defined methods on 8 scenarios, allowing 6561
>>> possible methods. The new simulation defines methods on 27
>>> scenarios, which means 7.6 trillion methods.
>>
>> Does this thing actually create new methods, as implied by
>> the term "generator"? If so, what approach do you use -
>> brute force, genetic programming, something different?
> 
> Yes, it makes new methods by
> 1. randomly generating the outcomes of the method
> 2. attempting to fix criterion problems that are not wanted. Though
> some criteria are harder to fix than others. If it can't fix it, the
> method is discarded and it starts from scratch.
> 
> #2 is pretty important because otherwise with 7.6 trillion possible
> methods you would spend forever looking for even slightly usable methods.
> For example, two-thirds of that count don't elect the first-preference
> winner when everybody bullet votes.
> 
> The program is unable to explain the method in English.

Perhaps there's a way to chaff out the wrong methods before having to 
evaluate them. One way of doing so would be to relate new methods to 
their common old methods - if the old tester's "DNA" is a subset of the 
new one, you might know that all methods that have the first three "old 
DNA characters" as A, B, and C respectively would fail Plurality.

More generally, this may lead the way to schemata where, say, "DNA" 
where the first character is A and the tenth is C must fail some 
criterion. Then, if you find that criterion invaluable (like the "bullet 
voting alone should make the Plurality winner win" criterion), you could 
simply  cut away all methods that do have A at first and C at tenth.

You could also try to find particularly "discriminating" tests for the 
methods and criteria by storing the examples that show a method fails a 
certain criterion. Say you have a priority queue of 1000 example ballot 
configurations that have proven prior methods to fail monotonicity. Go 
through all of them, increasing the count (priority) if any of them 
shows the method to fail monotonicity. If none do, generate random 
ballot sets, and if you find one that does show the method to fail 
monotonicity, replace the lowest priority example with this ballot set. 
If there are particularly discriminating examples (like a monotonicity 
failure that can be generalized to all weighted positional systems), 
those examples should rise to the top.

The whole point of that would be to make criterion checking much faster, 
though that speedup would only exist for the methods that do in fact 
fail the criterion in question. That is to be expected since one can't 
prove the absence of a failure by random sampling alone.

> Unfortunately the program doesn't have such high resolution that it
> can take arbitrary proportions. It only defines methods and tests
> criteria on 27 comprehensively interconnected scenarios.
> 
> I am not sure what the most natural way to expand the generator would
> be. If you add equal ranking at the top, you would support 125 scenarios
> instead of 27. You would pretty much need a program to help you find
> the definition of a known method. Furthermore I'm not sure how useful
> this would really be when you still can only have three factions.
> 
> Adding a fourth candidate and bloc would be much harder: 1296 if you
> require complete ranking. And that's if you can figure out how to have a
> fourth faction at all. I really don't know the easiest way to do that.
> But four factions with three candidates would probably be my choice.

I think a good way to extend it would be to have a hierarchy, so that 
each greater "DNA" contains the less detailed "DNA" in it. That way, you 
can exclude many methods at once, making it easier to search for needles 
in the haystack.

The case of three candidates appears to be a good place from which to 
observe methods' behavior. Generally speaking, I think few methods found 
at random would have failures that only appear with more than three 
compared to those that fail criteria at three. Thus, four factions with 
three candidates sounds sensible.

The other option would be to use very large spaces and some kind of 
black box search (like SA or genetic programming). These require some 
kind of fitness function, though, and they're not as simple to program 
as one might expect, particularly for relative comparisons where you may 
want "methods that return different results but satisfy criteria". 
Multiobjective optimization and (for GP) coevolution would be the 
subjects of interest here, and more generally, automatic programming 
(like http://www-ia.hiof.no/~rolando/ ).

Now that I think of it, there's another approach as well: use some 
prefix to collapse most situations to a small space, then define the 
method on that small space. Forest's "sprucing up" would be an example 
of this, as (given certain preconditions), it can reduce method design 
to specifying how it behaves in the case of a Condorcet cycle among 
three candidates. However, one pays for this by a reduction in variety - 
spruced-up methods are all Condorcet (elects uncovered candidates, 
actually), so that throws LNH* out the window, for instance.
One therefore arrives at the problem of determining which methods are 
different enough to be "interesting".

> Incidentally, speaking of monotonicity: I tried creating a weak form of
> LNHarm which says that adding a lower preference can't make *that*
> preference win. But I couldn't find any usable method which satisfied
> that (while failing LNHarm) and also my "raising to second place" version
> of monotonicity. That's too bad. I thought I might find something that
> would blow my mind.

Indeed, that's unfortunate.
That criterion also reminds me of Woodall's "weak IIA" criterion, which 
I've explained in an earlier post. Could your generator detect failures 
of that criterion? I haven't heard it mentioned elsewhere, so it would 
be interesting to know which methods (if any) pass it.