[EM] (no subject)

Rob Lanphier roblan at gmail.com
Wed May 26 23:38:14 PDT 2021

Hi folks,

There's an interesting discussion happening on reddit about ASCII
formats for aggregated ballot images.  I'll provide a deep link to my
comment here:


What the original reddit poster (/user/jman722) made me realize is
that it's possible to come up with a format that works for both range
ballots and ranked ballots.  The range ballots can be on a scale of
0-5, where 5 is "awesome", and 0 is "awful".  The ranked ballots can
be A>B>C.

I'm going to use the example that the original reddit poster made:

12: Allie/5, Billy/5, Candace/4, Dennis/3, Edith/3, Frank/2, Georgie/1, Harold/0
7: Allie/4, Billy/0, Candace/2, Dennis/3, Edith/1, Frank/0, Georgie/5, Harold/3
5: Allie/0, Billy/3, Candace/2, Dennis/3, Edith/4, Frank/5, Georgie/3, Harold/4

That format is good but not great.  It takes a careful eye to see that
Allie, Billy, Frank, and Georgie are the passionate favorites (each
earning a "5" score on at least one line), and another close look to
see that Allie, Billy, Frank, and Harold are listed as completely
unacceptable (each earning a "0" score on at least one line).
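
To take the "careful eye" out of that check, here is a minimal Python
sketch that parses the three lines above and lists which candidates
receive a given score on any line.  The function names
(parse_rated_line, candidates_with_score) are hypothetical, and the
parser assumes only the simple comma-separated "Name/score" layout
shown above.

```python
BALLOTS = """\
12: Allie/5, Billy/5, Candace/4, Dennis/3, Edith/3, Frank/2, Georgie/1, Harold/0
7: Allie/4, Billy/0, Candace/2, Dennis/3, Edith/1, Frank/0, Georgie/5, Harold/3
5: Allie/0, Billy/3, Candace/2, Dennis/3, Edith/4, Frank/5, Georgie/3, Harold/4
"""


def parse_rated_line(line):
    """Return (voter_count, {candidate: score}) for one 'N: Name/score, ...' line."""
    count, _, rest = line.partition(":")
    scores = {}
    for pair in rest.split(","):
        name, _, score = pair.strip().partition("/")
        scores[name] = int(score)
    return int(count), scores


def candidates_with_score(text, wanted):
    """Return the sorted names of candidates given `wanted` on at least one line."""
    found = set()
    for line in text.splitlines():
        if line.strip():
            _, scores = parse_rated_line(line)
            found.update(name for name, s in scores.items() if s == wanted)
    return sorted(found)


print(candidates_with_score(BALLOTS, 5))  # candidates rated 5 somewhere
print(candidates_with_score(BALLOTS, 0))  # candidates rated 0 somewhere
```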

The old format from my 1996 Perl script, which I published in The Perl
Journal, would express those ballots this way:

12: Allie=Billy>Candace>Dennis=Edith>Frank>Georgie>Harold
7: Georgie>Allie>Dennis=Harold>Candace>Edith>Billy=Frank
5: Frank>Edith=Harold>Billy=Dennis=Georgie>Candace>Allie

With this format, it becomes clear that 12 voters really like Allie
and Billy and really don't like Harold.  The next 7 voters really like
Georgie, and really don't like Billy and Frank.  The remaining 5
voters really like Frank, but really dislike Allie.  One has to add up
12+7+5 to realize there are 24 voters in this election.
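
The conversion from the rated lines to the ranked lines is mechanical,
so it can be sketched in a few lines of Python.  This is only an
illustration, not the 1996 Perl script: the names (BALLOTS, to_ranked)
are made up, and ties are written in the order the candidates appear
in the input, which is an assumption that happens to reproduce the
lines above.

```python
# The example ballots as (voter_count, {candidate: score}) pairs.
BALLOTS = [
    (12, {"Allie": 5, "Billy": 5, "Candace": 4, "Dennis": 3,
          "Edith": 3, "Frank": 2, "Georgie": 1, "Harold": 0}),
    (7, {"Allie": 4, "Billy": 0, "Candace": 2, "Dennis": 3,
         "Edith": 1, "Frank": 0, "Georgie": 5, "Harold": 3}),
    (5, {"Allie": 0, "Billy": 3, "Candace": 2, "Dennis": 3,
         "Edith": 4, "Frank": 5, "Georgie": 3, "Harold": 4}),
]


def to_ranked(scores):
    """Turn {candidate: score} into an 'A=B>C' string, highest score first."""
    # sorted() is stable, so tied candidates keep their input order.
    ordered = sorted(scores.items(), key=lambda item: -item[1])
    text = ordered[0][0]
    for (_, prev_score), (name, score) in zip(ordered, ordered[1:]):
        text += ("=" if score == prev_score else ">") + name
    return text


for count, scores in BALLOTS:
    print(f"{count}: {to_ranked(scores)}")

# Total number of voters across all lines.
print(sum(count for count, _ in BALLOTS))
```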

The ratings are stripped from my old 1996-ish format.  It only
provides the following parse tokens:

[quantity]: [cand5yay] [> or =] [cand4good] [> or =] ... [cand0boo]

It seems as though it would be possible to come up with a merged
format that would express the range ballots above like this:

12: Allie/5 =Billy/5 >Candace/4 >Dennis/3 =Edith/3 >Frank/2 >Georgie/1 >Harold/0
7: Georgie/5 >Allie/4 >Dennis/3 =Harold/3 >Candace/2 >Edith/1 >Billy/0 =Frank/0
5: Frank/5 >Edith/4 =Harold/4 >Billy/3 =Dennis/3 =Georgie/3 >Candace/2 >Allie/0

The ">", "=", and "," characters could all be optional delimiters
between the candidate/score tuples on each line (though at least one
of those three delimiters WOULD be required between tuples). If ">" or
"=" is used as a delimiter, then the candidates MUST be ordered by
score (highest score first). A candidate token is EITHER one or more
ASCII letters ([A-Z] or [a-z]), OR it MUST start with an opening
square bracket ([) and end with a closing square bracket (]), in which
case the intervening text can contain any Unicode characters (e.g.
[Doña García Márquez] or [Ximena Peña] or [楊安澤]). Whitespace can be
discarded, but SHOULD be included for legibility.
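
As a rough illustration of those rules (and not the reference parser
itself), here is a Python sketch of a single-line parser.  Everything
in it is an assumption on my part: the names TOKEN and
parse_abif_line, the choice to return (delimiter, name, score) tuples,
and the reading of "ordered by score" as "a score after '>' or '='
must not be higher than the one before it".

```python
import re

# One candidate entry: optional leading delimiter, a name, optional "/score".
TOKEN = re.compile(r"""
    \s*(?P<delim>[>=,])?\s*            # one of the three delimiters (none before entry 1)
    (?P<name>\[[^\]]+\]|[A-Za-z]+)     # bracketed Unicode text, or bare ASCII letters
    \s*(?:/\s*(?P<score>\d+))?         # optional slash followed by a score
    """, re.VERBOSE)


def parse_abif_line(line):
    """Return (voter_count, [(delimiter, name, score), ...]) for one ballot line."""
    count, colon, rest = line.partition(":")
    if not colon:
        raise ValueError("missing 'count:' prefix")
    rest = rest.strip()
    entries, pos, prev_score = [], 0, None
    while pos < len(rest):
        m = TOKEN.match(rest, pos)
        if m is None:
            raise ValueError(f"cannot parse at offset {pos}: {rest[pos:]!r}")
        delim, name, score = m.group("delim", "name", "score")
        # A delimiter must appear between entries, and only between entries.
        if (delim is None) != (not entries):
            raise ValueError(f"misplaced or missing delimiter near {name!r}")
        score = int(score) if score is not None else None
        # With ">" or "=", scores must run from highest to lowest.
        if (delim in (">", "=") and score is not None
                and prev_score is not None and score > prev_score):
            raise ValueError(f"{name} is out of order after {delim!r}")
        if name.startswith("["):
            name = name[1:-1]  # drop the surrounding brackets
        entries.append((delim, name, score))
        prev_score = score
        pos = m.end()
    return int(count), entries


print(parse_abif_line("12: Allie/5 =Billy/5 >Candace/4"))
print(parse_abif_line("3: [Ximena Peña]/3, [楊安澤]/5"))
```

Comma-delimited lines are not checked for ordering here, since the
ordering requirement above only applies to ">" and "=".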

Linters could be created to deduplicate ballot lines, sort the
candidates by score on each line, convert commas to ">" and "=" (for
ranked ballot equivalents), and add whitespace for readability. They
could optionally normalize the candidates to a range of ASCII letters
(e.g. changing "Allie" to "A", "Billy" to "B", etc).
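
A linter along those lines might look like the following Python
sketch.  It is only a guess at the behavior: the function name lint is
hypothetical, "deduplicate" is assumed to mean "merge identical
ballots and add their counts", tied candidates are put in alphabetical
order, and the merged lines are written largest count first.  The
optional renaming to single letters is left out.

```python
def lint(ballots):
    """Merge identical ballots and emit sorted, '>'/'='-delimited lines.

    `ballots` is a list of (voter_count, {candidate: score}) pairs.
    """
    merged = {}
    for count, scores in ballots:
        # Canonical form: highest score first, ties broken alphabetically.
        key = tuple(sorted(scores.items(), key=lambda item: (-item[1], item[0])))
        merged[key] = merged.get(key, 0) + count

    lines = []
    for key, count in sorted(merged.items(), key=lambda item: -item[1]):
        parts, prev_score = [], None
        for name, score in key:
            if prev_score is None:
                delim = ""
            else:
                delim = "=" if score == prev_score else ">"
            parts.append(f"{delim}{name}/{score}")
            prev_score = score
        lines.append(f"{count}: " + " ".join(parts))
    return lines


example = [
    (3, {"A": 5, "B": 0}),
    (2, {"B": 0, "A": 5}),   # same ballot as the first, listed in another order
    (4, {"B": 5, "A": 5}),
]
for line in lint(example):
    print(line)
```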

The goal would be to make it useful for two people debating whether
the Condorcet criterion or the Monotonicity criterion is more
important. They could both easily crank out a set of ballots that
could be fed into either a ranked-ballot counter or a rated-ballot
counter. Having the candidate tuples sorted in each line makes it
clearer what the preferences were of the set of voters represented by
the given line.
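
To show how one set of ballots could feed both kinds of counter, here
is a small Python sketch that computes score totals (the input a
rated-ballot counter needs) and pairwise preference counts (the input
a Condorcet-style counter needs) from the same data.  The function
names are hypothetical, and neither function picks a winner; they only
produce the tallies.

```python
from itertools import permutations

# The example ballots as (voter_count, {candidate: score}) pairs.
BALLOTS = [
    (12, {"Allie": 5, "Billy": 5, "Candace": 4, "Dennis": 3,
          "Edith": 3, "Frank": 2, "Georgie": 1, "Harold": 0}),
    (7, {"Allie": 4, "Billy": 0, "Candace": 2, "Dennis": 3,
         "Edith": 1, "Frank": 0, "Georgie": 5, "Harold": 3}),
    (5, {"Allie": 0, "Billy": 3, "Candace": 2, "Dennis": 3,
         "Edith": 4, "Frank": 5, "Georgie": 3, "Harold": 4}),
]


def score_totals(ballots):
    """Sum count * score for each candidate (rated-ballot view)."""
    totals = {}
    for count, scores in ballots:
        for name, score in scores.items():
            totals[name] = totals.get(name, 0) + count * score
    return totals


def pairwise_counts(ballots):
    """Count voters who score a strictly above b, for each ordered pair (a, b)."""
    wins = {}
    for count, scores in ballots:
        for a, b in permutations(scores, 2):
            if scores[a] > scores[b]:
                wins[(a, b)] = wins.get((a, b), 0) + count
    return wins


totals = score_totals(BALLOTS)
pairs = pairwise_counts(BALLOTS)
print(totals["Allie"], totals["Billy"])
print(pairs[("Allie", "Billy")], pairs[("Billy", "Allie")])
```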

I think that parsers could be written for this format such that they
follow Postel's Law (a.k.a. the "robustness principle"):

To quote it: "be conservative in what you do, be liberal in what
you accept from others"

People trying to express ranked ballots could drop the scores, and
ONLY include ">" and "=" as delimiters between candidates.  People
trying to express rated ballots could use commas (",") instead of ">"
and "=". Programmers trying to parse handcrafted scenarios could
figure out how to fill in the blanks.
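
One way a liberal parser could "fill in the blanks" is sketched below
in Python: it accepts a ranked-only line, a comma-delimited rated
line, or a merged line, and reduces each to the same list of
preference tiers.  This is a simplification under stated assumptions:
to_tiers is a made-up name, only bare (unbracketed) candidate names
are handled, and when every candidate has a score the scores are
trusted over the delimiters.

```python
import re
from itertools import groupby


def to_tiers(line):
    """Return (voter_count, tiers), where tiers lists tied names, best tier first."""
    count, _, rest = line.partition(":")
    # Splitting on a captured delimiter keeps the delimiters in the result:
    # names land at even indexes, delimiters at odd indexes.
    tokens = re.split(r"\s*([>=,])\s*", rest.strip())
    delims = tokens[1::2]
    entries = []
    for token in tokens[0::2]:
        name, _, score = token.partition("/")
        entries.append((name.strip(), int(score) if score.strip() else None))

    if all(score is not None for _, score in entries):
        # Rated or merged line: derive the tiers from the scores.
        entries.sort(key=lambda entry: -entry[1])
        tiers = [[name for name, _ in group]
                 for _, group in groupby(entries, key=lambda entry: entry[1])]
    elif "," not in delims:
        # Ranked-only line: derive the tiers from ">" and "=".
        tiers = [[entries[0][0]]]
        for delim, (name, _) in zip(delims, entries[1:]):
            if delim == "=":
                tiers[-1].append(name)
            else:
                tiers.append([name])
    else:
        raise ValueError("commas without scores carry no ordering information")
    return int(count), tiers


print(to_tiers("12: Allie=Billy>Candace"))
print(to_tiers("7: Allie/4, Billy/0, Georgie/5"))
print(to_tiers("5: Frank/5 >Edith/4 =Harold/4"))
```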

I'm tempted to write a reference parser for this, but first, what do
you all think?  Let the list know!  Let me know!  Let reddit know!


p.s.  I'm thinking of calling my version "ABIF", standing for
"Aggregated Ballot Image Format".  I may just document it here:
