[EM] How close to IIAC?
fsimmons at pcc.edu
Fri Apr 23 13:51:16 PDT 2010
Suppose that voters fill out questionnaires of twenty or more yes/no questions. What is a good way to
calculate the “distance” between questionnaires? (Remember, this is the key to getting an
IIAC-compliant voting system.)
The big problem is that some of the questions are apt to be clones of each other. Suppose, for example,
that of the twenty questions on a questionnaire, the first fifteen were basically the same question in
disguise. Then almost all of the voters who answered yes on the first question would also answer yes on
the next fourteen. Those fourteen questions would not only be redundant, they would also distort the
distance between questionnaires under any of the standard metrics (Hamming, Euclidean, etc.) on
vectors of zeroes and ones.
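To make the distortion concrete, here is a small sketch in Python. The three questionnaires a, b and c are made up for illustration, and it assumes the first fifteen of twenty questions are perfect clones of one another.

```python
def hamming(q1, q2):
    """Number of questions on which two yes/no questionnaires differ."""
    return sum(x != y for x, y in zip(q1, q2))

# Hypothetical questionnaires: questions 1-15 are clones, 16-20 are independent.
a = [1] * 20              # yes on everything
b = [0] * 15 + [1] * 5    # differs from a on the one cloned issue only
c = [1] * 15 + [0] * 5    # differs from a on five independent issues

print(hamming(a, b))  # 15
print(hamming(a, c))  # 5
```

Voter b disagrees with a on what is really one issue, yet ends up three times as far from a as c, who disagrees on five separate issues.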
First suggestion:
Have each voter assign weights to the questions to reflect their relative importance to that voter, then
normalize each voter's weights so that they add to 100. Given two questionnaires q1 and q2, the
semi-metric rho(q1,q2) is the sum of q1’s weights on all of the questions on which q1 and q2 disagree.
This is a measure of how far the q1 voter thinks the q2 voter differs from her on important questions.
In this first suggestion the proposed metric is
d(q1,q2)=rho(q1,q2)+rho(q2,q1).
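A minimal sketch of this first suggestion in Python. The representation of a questionnaire as an (answers, weights) pair, the helper name normalize, and the example numbers are all assumptions made for illustration.

```python
def normalize(weights, total=100.0):
    """Rescale one voter's raw weights so they add up to `total`.
    Assumes at least one weight is positive."""
    s = sum(weights)
    return [total * w / s for w in weights]

def rho(answers1, weights1, answers2):
    """Sum of voter 1's weights over the questions where the two voters disagree."""
    return sum(w for a1, a2, w in zip(answers1, answers2, weights1) if a1 != a2)

def d(q1, q2):
    """Symmetrized distance d(q1,q2) = rho(q1,q2) + rho(q2,q1)."""
    (a1, w1), (a2, w2) = q1, q2
    return rho(a1, w1, a2) + rho(a2, w2, a1)

# Hypothetical questionnaires: (yes/no answers, normalized weights)
q1 = ([1, 0, 1, 1], normalize([5, 1, 1, 3]))   # weights 50, 10, 10, 30
q2 = ([0, 0, 1, 0], normalize([1, 1, 1, 1]))   # weights 25, 25, 25, 25

print(rho(q1[0], q1[1], q2[0]))  # 80.0  (q1 cares a lot about questions 1 and 4)
print(rho(q2[0], q2[1], q1[0]))  # 50.0
print(d(q1, q2))                 # 130.0
```

Note that rho by itself is not symmetric, which is why the two directions are added to get d.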
Second suggestion:
1. Create a binary tree with the questionnaires as the leaves and a subset of the questions as
nodes, as follows. The root node is the question on which the voters are most evenly divided (break
ties randomly). Each subsequent node X is the question on which the voters whose answers led them to
that node are most evenly divided (again breaking ties randomly).
2. Once all of the questionnaires have been classified as leaves of this binary tree, assign to
each question a weight equal to the probability that a random leaf has that question as an ancestor.
3. The distance between two questionnaires is the sum of the weights of the questions on which
they differ.
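The three steps above could be sketched as follows in Python. This is one reading of the proposal, not a definitive implementation: it assumes every submitted questionnaire counts as one leaf, that a branch stops when no remaining question splits the voters at that node, and that a question's weight is the fraction of questionnaires whose path from the root passes through a node labelled with that question. The function names and the fixed random seed are hypothetical.

```python
import random
from collections import Counter

def question_weights(ballots, rng=None):
    """Weight of each question = fraction of ballots having it as an ancestor."""
    rng = rng or random.Random(0)   # fixed seed only to make the sketch repeatable
    n_questions = len(ballots[0])
    hits = Counter()                # question -> number of ballots passing through it

    def split(group):
        best, candidates = None, []
        for q in range(n_questions):
            yes = sum(b[q] for b in group)
            no = len(group) - yes
            if yes == 0 or no == 0:
                continue            # unanimous here, so it cannot split this node
            imbalance = abs(yes - no)
            if best is None or imbalance < best:
                best, candidates = imbalance, [q]
            elif imbalance == best:
                candidates.append(q)
        if not candidates:
            return                  # all ballots at this node are identical
        q = rng.choice(candidates)  # break ties randomly
        hits[q] += len(group)
        split([b for b in group if b[q] == 1])
        split([b for b in group if b[q] == 0])

    split(list(ballots))
    return [hits[q] / len(ballots) for q in range(n_questions)]

def tree_distance(b1, b2, weights):
    """Sum of the weights of the questions on which two ballots differ."""
    return sum(w for x, y, w in zip(b1, b2, weights) if x != y)

# Hypothetical ballots: questions 0 and 1 are clones, question 2 is independent.
ballots = [(1, 1, 1), (1, 1, 0), (0, 0, 1), (0, 0, 0)]
weights = question_weights(ballots)
print(weights)
print(tree_distance((1, 1, 1), (0, 0, 1), weights))  # differ on the cloned issue
print(tree_distance((1, 1, 1), (1, 1, 0), weights))  # differ on the independent issue
```

In this toy example the two clones always share a combined weight of 1 however the ties fall, so disagreeing on the cloned issue costs the same as disagreeing on the independent one, whereas Hamming distance would count it double.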
Any other good ideas?