[EM] A method "DNA" generator, tester, and fixer

Kevin Venzke stepjak at yahoo.fr
Wed May 19 16:52:02 PDT 2010


Hi Kristofer,

--- On Wed 19.5.10, Kristofer Munsterhjelm <km-elmet at broadpark.no> wrote:
> > --- On Tue 18.5.10, Kristofer Munsterhjelm <km-elmet at broadpark.no> wrote:
> >>> I've been working on a new method generator/tester/fixer. I did
> >>> this once before, and my approach is still the same, but now
> >>> truncation is allowed (instead of strict ranking). The old
> >>> simulation only defined methods on 8 scenarios, allowing 6561
> >>> possible methods. The new simulation defines methods on 27
> >>> scenarios, which means 7.6 trillion methods.
> >> 
> >> Does this thing actually create new methods, as implied by the
> >> term "generator"? If so, what approach do you use - brute force,
> >> genetic programming, something different?
> > 
> > Yes, it makes new methods by
> > 1. randomly generating the outcomes of the method
> > 2. attempting to fix criterion problems that are not wanted. Though
> > some criteria are harder to fix than others. If it can't fix it, the
> > method is discarded and it starts from scratch.
> > 
> > #2 is pretty important because otherwise with 7.6 trillion possible
> > methods you would spend forever looking for even slightly usable
> > methods. For example, two-thirds of that count don't elect the
> > first-preference winner when everybody bullet votes.
> > 
> > The program is unable to explain the method in English.
> 
> Perhaps there's a way to chaff out the wrong methods before
> having to evaluate them. One way of doing so would be to
> relate new methods to their common old methods - if the old
> tester's "DNA" is a subset of the new one, you might know
> that all methods that have the first three "old DNA
> characters" as A, B, and C respectively would fail
> Plurality.

Yes, actually all the "static" criteria are defined this way internally.
Plurality, for example, is defined as "ADAADAEXEDDDDDDXXXEXEEXEEXE".
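
To make that concrete, here is a rough Python sketch of how a "static"
criterion check could work, reading a method's DNA as a 27-character
string over {A, B, C} (the winner in each scenario) and a criterion's
DNA as naming the set of winners allowed at each position. The
letter-to-set table and the function name are only guesses at the
encoding, not the program's actual alphabet:

    # Hypothetical decoding of a criterion character into the set of
    # winners it allows in that scenario ('X' = unconstrained). The
    # assignments for D, E, F are placeholders, not the real encoding.
    ALLOWED = {
        'A': {'A'}, 'B': {'B'}, 'C': {'C'},
        'D': {'A', 'B'}, 'E': {'A', 'C'}, 'F': {'B', 'C'},
        'X': {'A', 'B', 'C'},
    }

    def static_failures(method_dna, criterion_dna):
        """Return the scenario indexes where the method's winner is not
        among the winners the criterion permits in that scenario."""
        return [i for i, (m, c) in enumerate(zip(method_dna, criterion_dna))
                if m not in ALLOWED[c]]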

For "dynamic" criteria like LNH*, mono-add-top, or burial criteria,
the logic is hard-coded, comparing the results of connected scenarios.
But if you can find the error you can also fix it (at least until you
then try to fix something else).
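
A dynamic check of this kind might look roughly like the sketch below,
using mono-add-top as the example. The connected_pairs relation (which
"after" scenario is reached from which "before" scenario by adding
top-rankings for a candidate) is assumed to be built elsewhere; it is
not something spelled out here.

    def mono_add_top_failures(method_dna, connected_pairs):
        """For each (before, after, candidate) triple, where 'after' is
        the scenario reached from 'before' by adding ballots that rank
        the candidate first, the candidate must not go from winning in
        'before' to losing in 'after'."""
        return [(before, after, cand)
                for before, after, cand in connected_pairs
                if method_dna[before] == cand and method_dna[after] != cand]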

The huge problem is not really speed (at least at this level of
complexity) but the question of how to represent, as concisely as
possible, a "method" and a "scenario". Currently a method is a list of
all outcomes, which clearly won't work if a scenario is allowed to be
anything you can think of.

One thing you could do is try to define methods in terms of metrics
(abstracted from actual ballots) that *might* be of interest, such as
pairwise contests. But it would be hard to make that very comprehensive:
I don't know how you would represent IRV this way, for instance. It
could also be difficult to be sure that you're doing justice to the
criteria you want to check.
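
For what it's worth, a method that really is just a function of the
pairwise contests is easy to write down; minimax is one example. This
is only an illustration of the "metrics" idea, not anything the program
does:

    def minimax(pairwise):
        """Elect the candidate whose worst pairwise defeat (by margin)
        is smallest. pairwise[x][y] is the number of voters ranking x
        over y. IRV, as noted above, cannot be expressed from this
        matrix alone."""
        candidates = list(pairwise)
        def worst_defeat(x):
            return max(pairwise[y][x] - pairwise[x][y]
                       for y in candidates if y != x)
        return min(candidates, key=worst_defeat)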

> More generally, this may lead the way to schemata where,
> say, "DNA" where the first character is A and the tenth is C
> must fail some criterion. Then, if you find that criterion
> invaluable (like the "bullet voting alone should make the
> Plurality winner win" criterion), you could simply  cut
> away all methods that do have A at first and C at tenth.

That is basically what currently happens, except the method is immediately
fixed by changing that one outcome.
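
The overall generate-and-fix loop described above might be sketched
like this. The check_and_suggest_fixes function is a placeholder
standing in for all the criterion checks; the whole thing is a
reconstruction rather than the actual program:

    import random

    def generate_method(check_and_suggest_fixes, max_repairs=1000):
        """Draw a random 27-outcome method, repeatedly apply
        single-outcome fixes suggested by the criterion checks, and
        start over if the method still fails after max_repairs tries.
        check_and_suggest_fixes(dna) returns a list of
        (scenario_index, required_winner) repairs, empty on a pass."""
        while True:
            dna = [random.choice('ABC') for _ in range(27)]
            for _ in range(max_repairs):
                fixes = check_and_suggest_fixes(dna)
                if not fixes:
                    return ''.join(dna)    # every wanted criterion holds
                scenario, winner = fixes[0]
                dna[scenario] = winner     # fix the one offending outcome
            # couldn't repair this one: discard it and start from scratch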

> You could also try to find particularly "discriminating"
> tests for the methods and criteria by storing the examples
> that show a method fails a certain criterion. Say you have a
> priority queue of 1000 example ballot configurations that
> have proven prior methods to fail monotonicity. Go through
> all of them, increasing the count (priority) if any of them
> shows the method to fail monotonicity. If none do, generate
> random ballot sets, and if you find one that does show the
> method to fail monotonicity, replace the lowest priority
> example with this ballot set. If there are particularly
> discriminating examples (like a monotonicity failure that
> can be generalized to all weighted positional systems),
> those examples should rise to the top.

Yes, I could see that being useful if the number of possible scenarios
got that high.
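
If it did get that high, the cache you describe might be sketched
roughly as follows, with shows_failure and random_example as
placeholders for the actual criterion test and ballot-set sampler:

    def fails_criterion(method, cache, shows_failure, random_example,
                        cache_limit=1000, random_tries=10000):
        """Check a method against stored counterexamples first; fall
        back to random search and remember any new counterexample.
        cache is a list of [hit_count, example] pairs, highest counts
        first."""
        for entry in cache:
            if shows_failure(method, entry[1]):
                entry[0] += 1
                cache.sort(key=lambda e: -e[0])   # keep it ordered
                return True
        for _ in range(random_tries):
            example = random_example()
            if shows_failure(method, example):
                if len(cache) >= cache_limit:
                    cache.pop()       # replace the lowest-priority entry
                cache.append([1, example])
                return True
        return False    # no failure found, which is not a proof of passing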

> The whole point of that would be to make criterion checking
> much faster, though that speedup would only exist for the
> methods that do in fact fail the criterion in question. That
> is to be expected since one can't prove the absence of a
> failure by random sampling alone.
> 
> > Unfortunately the program doesn't have such high resolution that it
> > can take arbitrary proportions. It only defines methods and tests
> > criteria on 27 comprehensively interconnected scenarios.
> > 
> > I am not sure what the most natural way to expand the generator
> > would be. If you add equal ranking at the top, you would support
> > 125 scenarios instead of 27. You would pretty much need a program
> > to help you find the definition of a known method. Furthermore I'm
> > not sure how useful this would really be when you still can only
> > have three factions.
> > 
> > Adding a fourth candidate and bloc would be much harder: 1296 if
> > you require complete ranking. And that's if you can figure out how
> > to have a fourth faction at all. I really don't know the easiest
> > way to do that. But four factions with three candidates would
> > probably be my choice.
> 
> I think a good way to extend it would be to have a
> hierarchy, so that each greater "DNA" contains the less
> detailed "DNA" in it. That way, you can exclude many methods
> at once, making it easier to search for needles in the
> haystack.

Yes, that would be useful. The big question though is how to encode
additional scenarios into DNA. It has to be systematic for it to be
possible to write code to traverse "ballot space."

One thing I considered this morning is providing the three factions with
one to three "split vote" options, representing dividing the bloc in
half, each half with a different second preference or lack thereof. But
you have a "half of what?" problem. We only know the order of the bloc
sizes, not which combinations of half-blocs are larger than which others.
We could make assumptions, but the methods we'd end up with would
probably not translate very easily to the real world.
 
> The case of three candidates appears to be a good place
> from which to observe methods' behavior. Generally speaking,
> I think few methods found at random would have failures that
> only appear with more than three compared to those that fail
> criteria at three. Thus, four factions with three candidates
> sounds sensible.
> 
> The other option would be to use very large spaces and some
> kind of black box search (like SA or genetic programming).
> These require some kind of fitness function, though, and
> they're not as simple to program as one might expect,
> particularly for relative comparisons where you may want
> "methods that return different results but satisfy
> criteria". Multiobjective optimization and (for GP)
> coevolution would be the subjects of interest here, and more
> generally, automatic programming (like http://www-ia.hiof.no/~rolando/ ).
> 
> Now that I think of it, there's another approach as well:
> use some prefix to collapse most situations to a small
> space, then define the method on that small space. Forest's
> "sprucing up" would be an example of this, as (given certain
> preconditions), it can reduce method design to specifying
> how it behaves in the case of a Condorcet cycle among three
> candidates. However, one pays for this by a reduction in
> variety - spruced-up methods are all Condorcet (elects
> uncovered candidates, actually), so that throws LNH* out the
> window, for instance.
> One therefore arrives at the problem of determining which
> methods are different enough to be "interesting".

Actually, I think this is sort of what I'm doing. All 27 scenarios can
be described in terms of 3 candidates and 3 blocs, knowing the
first-preference order, and whether and where there are majority-strength
wins (which conveniently happen to imply majority or unanimous approval
as well).

For example, if A has majority wins over B and C, there is simply no
other way to construct this scenario except that the B and C blocs are
both giving their second preference to A.
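
One consistent reading is that the 27 scenarios are just the 3^3 ways
the three blocs can assign (or withhold) a second preference, and a
method is one of the 3^27 (about 7.6 trillion) ways of naming a winner
in each. A minimal sketch under that reading, ignoring bloc sizes and
therefore the majority-strength-win bookkeeping:

    from itertools import product

    CANDIDATES = 'ABC'

    def scenarios():
        """Each bloc (first preferences A, B, C) gives its second
        preference to one of the other two candidates or truncates
        (None). Bloc sizes, and thus majority-strength wins, are not
        modeled here."""
        options = [[c for c in CANDIDATES if c != first] + [None]
                   for first in CANDIDATES]
        for seconds in product(*options):
            yield dict(zip(CANDIDATES, seconds))   # first pref -> second pref

    assert len(list(scenarios())) == 27    # and 3**27 is about 7.6 trillion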

> > Incidentally, speaking of monotonicity: I tried creating a weak
> > form of LNHarm which says that adding a lower preference can't make
> > *that* preference win. But I couldn't find any usable method which
> > satisfied that (while failing LNHarm) and also my "raising to
> > second place" version of monotonicity. That's too bad. I thought I
> > might find something that would blow my mind.
> 
> Indeed, that's unfortunate.
> That criterion also reminds me of Woodall's "weak IIA"
> criterion, which I've explained in an earlier post. Could
> your generator detect failures of that criterion? I haven't
> heard it mentioned elsewhere, so it would be interesting to
> know which methods (if any) pass it.

I can't test it because it requires adding a new ballot type (and also a
new candidate). As for what would satisfy it, I try to ask what simple
measurement we could use to elect X that is preserved when he is dropped
to second place on some of his ballots. But I have to stop myself and
realize that he could actually be dropped lower than second, because we
could repeat the procedure again.

The only thing I can think of is the rank-ballot interpretation of
Approval: A candidate gets a full point for every explicit ranking. And
it must be possible to explicitly rank all the candidates, even last
place.
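
A minimal sketch of that interpretation (the ballot format here is just
an assumption for illustration):

    from collections import Counter

    def explicit_ranking_approval(ballots):
        """A candidate scores one point per ballot that explicitly
        ranks him anywhere, including last place; truncated (unranked)
        candidates score nothing. Each ballot is a
        (weight, [candidates in ranked order]) pair."""
        scores = Counter()
        for weight, ranking in ballots:
            for cand in ranking:
                scores[cand] += weight
        return scores

    # explicit_ranking_approval([(40, ['A', 'B']), (35, ['B']),
    #                            (25, ['C', 'B', 'A'])])
    # -> Counter({'B': 100, 'A': 65, 'C': 25})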

Kevin


      


