[EM] A method "DNA" generator, tester, and fixer

Mon May 24 16:24:25 PDT 2010

Hi Kristofer,

--- En date de : Lun 24.5.10, Kristofer Munsterhjelm <km-elmet at broadpark.no> a écrit :
> Kevin Venzke wrote:
> > Hi Kristofer,
> > 
> > --- En date de : Mer 19.5.10, Kristofer Munsterhjelm
> <km-elmet at broadpark.no>
> a écrit :
> 
> >> Perhaps there's a way to chaff out the wrong
> methods before
> >> having to evaluate them. One way of doing so would
> be to
> >> relate new methods to their common old methods -
> if the old
> >> tester's "DNA" is a subset of the new one, you
> might know
> >> that all methods that have the first three "old
> DNA
> >> characters" as A, B, and C respectively would
> fail
> >> Plurality.
> > 
> > Yes, actually all the "static" criteria are defined
> this way internally.
> > Plurality for example is defined as
> "ADAADAEXEDDDDDDXXXEXEEXEEXE".
> 
> Out of curiosity (and possible interest in reproducing
> this), what does  that mean? I imagine "A" means "must
> be A at this spot" and X is "doesn't matter what is here",
> but what's D and E?

You're right about A/X. I'm fairly certain I coded D E F to mean A|B,
A|C, B|C as possible winners for that scenario. Note the lack of "F"
above would be consistent with this because there is no way that A could
be prohibited under Plurality.

> > For "dynamic" criteria like LNH*, mono-add-top, or
> burial criteria,
> > the logic is hard-coded, comparing the results of
> connected scenarios.
> > But if you can find the error you can also fix it (at
> least until you
> > then try to fix something else).
> > 
> > The huge problem is not really speed (at least with
> this level of
> > complexity) but the question of how do you represent,
> as concisely as
> > possible, a "method" and a "scenario"? Currently a
> method is a list of
> > all outcomes, which clearly won't work if a scenario
> is allowed to be
> > anything you can think of.
> 
> If the problem is not speed, it might be of interest to try
> all 7.6 trillion methods. While 7.6 trillion sounds like a
> lot, it's just 43 bits, and 40 bits crypto keys are easily
> cracked. The idea would be that  just checking and
> discarding is quicker than checking, fixing, and checking
> again.

What happens is eventually (within 5000 trials I'd say!) I become
confident that I'm not going to find any more matches for whatever type
of method I was looking for. I don't have a real way to be sure that
I've finished. Checking and fixing is slower at checking *all* the
possible methods, but it's faster at getting the methods you were trying
to gather. You find almost everything usable quite quickly and then
there is endless silence, until you give up on finding any more.

> Also, if you find, say, Plurality invaluable, you'd now
> know you only need to check methods starting in A and with
> #3, #4, and #6 set to A as  well. That cuts your space
> down from 3^27 to 3^23 - 37 bits.

Well, the question is what are you looking for when you try all these
methods? Your program has to have something it can point out as 
interesting. And if you can identify it, you can fix things up to have
it.

For example suppose I want to collect every possible red book that has
exactly 100 pages. If you show me a blue book with 230 pages, I can paint
the cover red and tear out 130 pages, and now this is exactly the kind
of book I was looking for. Show me a red book with 80 pages: I can't
easily fix that, so throw it away. It takes time to tear the pages out,
but if I only care about finding all of a type of book and not cataloging
all the books, I should just tear out pages and paint covers.

> As for the second part, that is indeed a problem. As far as
> I can see,
> there are two solutions: either to use a small number of
> scenarios/possible methods and generalize from them, or to
> use a large space (perhaps the 6D space of all three
> candidate preferences) and some sort of heuristic or proof
> mechanism to find reasonable methods. They both have
> strengths and weaknesses.
> 
> > One thing you could do is try to define methods in
> terms of metrics
> > (abstracted from actual ballots) that *might* be of
> interest, such as
> > pairwise contests. But it would be hard to make that
> very comprehensive:
> > I don't know how you would represent IRV this way, for
> instance. It
> > could also be difficult to be sure that you're doing
> justice to the
> > criteria you want to check.
> 
> If you can have conditionals, then you can define IRV for
> three
> candidates. Say A > B > C in Plurality order, then
> 
> if A beats B pairwise
>  then A
> if B beats A
>  then B
> else tie
> 
> because C must lose (by the definition, C is eliminated),
> and the
> ballots remaining (A vs B, B vs A) are equal to the
> respective pairwise
> counts.

Yes, that would be great. We would want a comprehensive way to generate 
those conditionals though.

> It might be possible to define methods working on the
> tournament matrix
> as an array itself. The possible outcomes would be:
>  Condorcet winners:
>   A beats B, A beats C
>   B beats A, B beats C
>   C beats A, C beats B
>  Cycle:
>   A > B > C > A (same as B > C > A >
> B, C > A > B > C)
>   A > C > B > A (same as B > A > C >
> B, C > B > A > C)
> 
> thus providing a Condorcet/tournament matrix method
> definition of 5 ternary digits (3 for the CWs, 2 for the
> cycles), where the candidates are sorted in your Plurality
> order. If you want to introduce strength of preference into
> it, you could expand on the cycles so that the winner with
> the greatest margin (WV, etc) appears first in the cycle,
> e.g:

By interesting coincidence the 27 scenarios map perfectly to a pairwise
diagram where each contest is either a majority in one of two directions,
or no majority. (When no majority, the Condorcet methods see a win for
the one with more first preferences.)

You can quickly see that three contests each with three possibilities is
27.

>  A > B > C > A (A beats B more than B beats C)
>  A > C > B > A
>  B > A > C > B
>  B > C > A > B
>  C > A > B > C
>  C > B > A > C
> 
> giving 9 ternary digits in total (who the winner is
> supposed to be, base 4 if ties are possible).
> 
> A further expansion would do the same for Condorcet winners
> to make non-Condorcet methods, e.g:
>  A beats B, A beats C (A beats B more than A beats C)
>  A beats C, A beats B
>  B beats A, B beats C
>  B beats C, B beats A
>  C beats A, C beats B
>  C beats B, C beats A
> 
> for a total of 6+6 = 12 "trits" - 6 for the CW case and 6
> for a cycle. To handle IRV, you'd have to add "incomplete
> beat parameters" that would either decide a winner (if the
> DNA is set to non-tie for that position) or just continue to
> the next in squence if it's set to tie; that would expand
> the DNA to base-4.
> 
> The burial resistant (but nonmonotonic) "BPW" method (in
> 3-cand case,
> elect the one who beats the Plurality winner the most)
> would require a
> mathematical operator as well, I think.
> 
> I'm not sure how you'd determine say, the schemata for
> Plurality upon such a DNA, though, since the DNA defines how
> the method works rather than its results on certain
> scenarios. 3^9 is 15 bits so it shouldn't be needed - you
> could just brute-force your way through by generating ballot
> sets until either it shows a failure or the program assumes
> it likely that the method passes the criterion. (3^12 is 20
> bits, and 4^12, 24.)
> 
> >> You could also try to find particularly
> "discriminating"
> >> tests for the methods and criteria by storing the
> examples
> >> that show a method fails a certain criterion. Say
> you have a
> >> priority queue of 1000 example ballot
> configurations that
> >> have proven prior methods to fail monotonicity. Go
> through
> >> all of them, increasing the count (priority) if
> any of them
> >> shows the method to fail monotonicity. If none do,
> generate
> >> random ballot sets, and if you find one that does
> show the
> >> method to fail monotonicity, replace the lowest
> priority
> >> example with this ballot set. If there are
> particularly
> >> discriminating examples (like a monotonicity
> failure that
> >> can be generalized to all weighted positional
> systems),
> >> those examples should rise to the top.
> > 
> > Yes, I could see that being useful if the number of
> possible scenarios
> > got that high.
> 
> It could also be interesting as an end in itself. If some
> examples are harder to break than others, these could be
> used in testing new methods. Most likely, there will be some
> examples that cover certain classes of methods and others
> that cover others - there wouldn't be a "one true
> monotonicity failure example".
> 
> >> I think a good way to extend it would be to have
> a
> >> hierarchy, so that each greater "DNA" contains the
> less
> >> detailed "DNA" in it. That way, you can exclude
> many methods
> >> at once, making it easier to search for needles in
> the
> >> haystack.
> > 
> > Yes, that would be useful. The big question though is
> how to encode
> > additional scenarios into DNA. It has to be systematic
> for it to be
> > possible to write code to traverse "ballot space."
> > 
> > One thing I considered this morning is providing the
> three factions
> > one to three "split vote" options representing
> dividing the bloc in
> > half, each half with a different second preference or
> lack of. But you
> > have a "half of what?" problem. We only know the bloc
> size order, not
> > what combinations of half-blocs are larger than which.
> We could make assumptions but the methods we'd end up with
> would probably not translate
> > very easily to the real world.
> 
> How do you get around this problem for the blocs you're
> already testing? 

For most methods it is fully adequate to know the size ranking of the
three factions. Some methods can't be implemented, though, without
stating assumptions about the absolute sizes of the blocs.

> The DNA can presumably code for which bloc
> is larger than the others, but not the extent to which one
> bloc is larger than the others. To some extent, the problem
> is limited by that A is always the plurality winner, but it
> wouldn't remove the problem completely.
> 
> If other approaches fail, one approach might be to try to
> determine what scenarios are most common - based on what
> real world ballot data we do know. I think Tideman has a
> database of that, as does OpenSTV for multiwinner
> elections.

A difficulty there is that a new method could have terrible incentives
resulting in elections that are not known to be realistic.

For example when I was looking for methods which failed LNHarm but
satisfied the criterion that when X wins, adding a preference for Y
can't make *Y* win (but could make Z win), at first I was optimistic
that there were a lot of these methods. But when I noticed that they
all failed "if X wins then raising X to second should preserve the win"
I realized that these methods were probably just swapping the 
significance of ranking a candidate 2nd vs 3rd!

Sorry I don't have more comments on this. It's too unclear to me how to
accomplish much more.

Kevin