[EM] Ballot Data Format

John Karr brainbuz at brainbuz.org
Thu May 27 13:33:48 PDT 2021


As the author of Vote::Count, I would find a standardized format for 
ballots a big plus. Whenever I've been able to collect sample data, the 
first thing I need to do is convert it to my format. Currently 
Vote::Count has two formats, a text one for ranked ballots and a 
JSON/YAML format for range ballots. The documentation on my formats is here: 
https://metacpan.org/pod/Vote::Count::ReadBallots

I'm not on Reddit, but I think creating a working group of interested 
people to propose a standard would be a great idea, and I'm 
interested in helping.

A standard format would allow creation of a library of data for which 
electowiki would seem to be a natural home.

On 5/27/21 4:02 PM, election-methods-request at lists.electorama.com wrote:

> Message: 1
> Date: Wed, 26 May 2021 23:38:14 -0700
> From: Rob Lanphier <roblan at gmail.com>
> To: election-methods at lists.electorama.com
> Subject: [EM] (no subject)
> Message-ID:
> 	<CAK9hOYn2T=ympC7gEd8wS_8S8yjzK==xsmEfNKWo99cBjaXDgA at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi folks,
>
> There's an interesting discussion happening on reddit about ASCII
> formats for aggregated ballot images.  I'll provide a deep link to my
> comment here:
>
> <https://www.reddit.com/r/EndFPTP/comments/nkm2cd/standardizing_cardinal_ballot_notation/gzls6pj/>
>
> What the original reddit poster (/user/jman722) made me realize is
> that it's possible to come up with a format that works for both range
> ballots and ranked ballots.  The range ballots can be on a scale of
> 0-5, where 5 is "awesome", and 0 is "awful".  The ranked ballots can
> be A>B>C.
>
> I'm going to use the example that the original reddit poster made:
>
> 12: Allie/5, Billy/5, Candace/4, Dennis/3, Edith/3, Frank/2, Georgie/1, Harold/0
> 7: Allie/4, Billy/0, Candace/2, Dennis/3, Edith/1, Frank/0, Georgie/5, Harold/3
> 5: Allie/0, Billy/3, Candace/2, Dennis/3, Edith/4, Frank/5, Georgie/3, Harold/4
>
> That format is good but not great.  It takes a careful eye to see that
> Allie, Billy, Frank, and Georgie are the passionate favorites (earning
> a "5" score), and another close look to see that Allie, Billy, Frank,
> and Harold are listed as completely unacceptable (earning a "0" score).
>
> The old format I used for the 1996 Perl script that I wrote and
> published in The Perl Journal would express those ballots this way:
>
> 12: Allie=Billy>Candace>Dennis=Edith>Frank>Georgie>Harold
> 7: Georgie>Allie>Dennis=Harold>Candace>Edith>Billy=Frank
> 5: Frank>Edith=Harold>Billy=Dennis=Georgie>Candace>Allie
>
> With this format, it becomes clear that 12 voters really like Allie
> and Billy and really don't like Harold.  The next 7 voters really like
> Georgie, and really don't like Billy and Frank.  The remaining 5
> voters really like Frank, but really dislike Allie.  One has to add up
> 12+7+5 to realize there are 24 voters in this election.
>
> The ratings are stripped from my old 1996-ish format.  It only
> provides the following parse tokens:
>
> [quantity]: [cand5yay] [> or =] [cand4good] [> or =] ... [cand0boo]
>
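The token grammar above is simple enough to parse with two splits. The following Python is only an illustrative sketch, not the 1996 Perl script; the function name `parse_ranked_line` is made up for this example, and it assumes plain candidate names that contain no ">", "=", or ":" characters.

```python
def parse_ranked_line(line):
    """Parse '12: A=B>C' into (12, [['A', 'B'], ['C']]).

    Hypothetical helper: assumes candidate names contain no
    '>', '=' or ':' characters.
    """
    count_text, _, ranking = line.partition(":")
    count = int(count_text.strip())
    # '>' separates tiers from most to least preferred;
    # '=' separates candidates tied within one tier.
    tiers = [
        [name.strip() for name in tier.split("=")]
        for tier in ranking.split(">")
    ]
    return count, tiers


ballots = [
    "12: Allie=Billy>Candace>Dennis=Edith>Frank>Georgie>Harold",
    "7: Georgie>Allie>Dennis=Harold>Candace>Edith>Billy=Frank",
    "5: Frank>Edith=Harold>Billy=Dennis=Georgie>Candace>Allie",
]
parsed = [parse_ranked_line(b) for b in ballots]
total_voters = sum(count for count, _ in parsed)
print(total_voters)  # 24, the 12+7+5 total mentioned above
```

Keeping each tier as its own list preserves the ties, which a ranked-ballot counter needs to know about.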
> It seems as though it would be possible to come up with a merged
> format that would express the range ballots above like this:
>
> 12: Allie/5 =Billy/5 >Candace/4 >Dennis/3 =Edith/3 >Frank/2 >Georgie/1 >Harold/0
> 7: Georgie/5 >Allie/4 >Dennis/3 =Harold/3 >Candace/2 >Edith/1 >Billy/0 =Frank/0
> 5: Frank/5 >Edith/4 =Harold/4 >Billy/3 =Dennis/3 =Georgie/3 >Candace/2 >Allie/0
>
> The ">", "=", and "," characters could all be optional delimiters
> between the candidate/score tuples on each line (though at least one
> of those three delimiters WOULD be required). If ">" or "=" is used as
> a delimiter, then the candidates MUST be ordered by score (highest
> score first). Candidate tokens can be one or more ASCII characters
> ([A-Z] or [a-z]) OR the candidate token MUST start with a square
> bracket ([) and end with the closing square bracket (]), and the
> intervening text can be any Unicode character (e.g. [Doña García
> Márquez] or [Ximena Peña] or [???]). Whitespace can be discarded, but
> SHOULD be included for legibility.
>
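One way the rules in that paragraph could be turned into a tolerant parser is sketched below in Python. This is an assumption-laden illustration, not a reference parser: the names `ENTRY` and `parse_merged_line` are invented here, scores are assumed to be non-negative integers, and the "/score" part is treated as optional so that ranked-only lines also parse.

```python
import re

# One entry: optional leading delimiter, then a candidate token
# (ASCII letters, or anything inside square brackets), then an
# optional "/score". All names here are hypothetical.
ENTRY = re.compile(
    r"\s*(?P<delim>[>=,])?\s*"
    r"(?P<name>\[[^\]]+\]|[A-Za-z]+)"
    r"\s*(?:/\s*(?P<score>\d+))?\s*"
)


def parse_merged_line(line):
    """Return (count, [(name, score_or_None, preceding_delimiter), ...])."""
    count_text, _, rest = line.partition(":")
    count = int(count_text)
    rest = rest.strip()
    entries = []
    pos = 0
    while pos < len(rest):
        m = ENTRY.match(rest, pos)
        if m is None:
            raise ValueError(f"cannot parse {rest[pos:]!r}")
        delim = m.group("delim")
        if entries and delim is None:
            raise ValueError(f"missing delimiter before {m.group('name')!r}")
        name = m.group("name")
        if name.startswith("["):
            name = name[1:-1]  # drop the surrounding brackets
        score = int(m.group("score")) if m.group("score") is not None else None
        prev_score = entries[-1][1] if entries else None
        # With '>' or '=' the scores must not increase left to right.
        if delim in (">", "=") and None not in (prev_score, score) and score > prev_score:
            raise ValueError(f"{name!r} is out of score order")
        entries.append((name, score, delim))
        pos = m.end()
    return count, entries


print(parse_merged_line("12: Allie/5 =Billy/5 >Candace/4"))
```

Scanning with a regular expression rather than splitting on the delimiters means a bracketed name may itself contain a comma, ">" or "=".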
> Linters could be created to deduplicate ballot lines, sort the
> candidate by score on each line, convert commas to ">" and "=" (for
> ranked ballot equivalents), and add whitespace for readability. They
> could optionally normalize the candidates to a range of ASCII letters
> (e.g. changing "Allie" to "A", "Billy" to "B", etc).
>
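A linter along those lines might look like the Python sketch below. It is only an illustration under stated assumptions: the function name `lint` is hypothetical, ballots are assumed to be already parsed into (count, {candidate: score}) pairs, and the choices to break score ties alphabetically and to list the largest groups first are arbitrary ones made for this example.

```python
from collections import Counter


def lint(ballots):
    """Merge duplicate ballots and emit score-sorted lines.

    ballots: iterable of (count, {candidate: score}) pairs.
    Hypothetical helper, not part of any existing tool.
    """
    merged = Counter()
    for count, scores in ballots:
        # Sort by descending score, then by name, so that identical
        # ballots give identical keys whatever order they were typed in.
        key = tuple(sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])))
        merged[key] += count

    lines = []
    for key, count in sorted(merged.items(), key=lambda kv: -kv[1]):
        parts = []
        prev_score = None
        for name, score in key:
            if prev_score is None:
                delim = ""          # first candidate on the line
            elif score == prev_score:
                delim = "="         # tied with the previous candidate
            else:
                delim = ">"         # strictly lower score
            parts.append(f"{delim}{name}/{score}")
            prev_score = score
        lines.append(f"{count}: " + " ".join(parts))
    return lines


for line in lint([
    (7, {"Billy": 0, "Allie": 4, "Georgie": 5}),
    (5, {"Georgie": 5, "Allie": 4, "Billy": 0}),
    (3, {"Allie": 5, "Billy": 5}),
]):
    print(line)
```

The first two input ballots differ only in the order the candidates were listed, so they collapse into a single 12-voter line.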
> The goal would be to make it useful for two people debating whether
> the Condorcet criterion or the Monotonicity criterion is more
> important. They could both easily crank out a set of ballots that
> could be fed into either a ranked-ballot counter or a rated-ballot
> counter. Having the candidate tuples sorted in each line makes it
> clearer what the preferences were of the set of voters represented by
> the given line.
>
> I think that parsers could be written for this format such that they
> follow Postel's Law (a.k.a. the "robustness principle"):
> https://en.wikipedia.org/wiki/Robustness_principle
>
> To quote that^: "be conservative in what you do, be liberal in what
> you accept from others"
>
> People trying to express ranked ballots could drop the scores, and
> ONLY include ">" and "=" as a delimiter between candidates.  People
> trying to express rated ballots could use commas (",") instead of ">"
> and "=". Programmers trying to parse handcrafted scenarios could
> figure out how to fill in the blanks.
>
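"Filling in the blanks" works cleanly in one direction only. Going from scores to a ranking is mechanical; going from a ranking back to scores needs a made-up rule. The Python sketch below shows both, with hypothetical function names. The rule in `scores_from_tiers` (top tier gets the maximum, each lower tier one point less, never below 0) is purely an assumption for illustration and is not something the proposal specifies.

```python
from itertools import groupby


def tiers_from_scores(scores):
    """Group candidates into ranked tiers, highest score first."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        [name for name, _ in group]
        for _, group in groupby(ordered, key=lambda kv: kv[1])
    ]


def scores_from_tiers(tiers, max_score=5):
    """ASSUMED rule: lose one point per tier, with a floor of 0."""
    return {
        name: max(max_score - position, 0)
        for position, tier in enumerate(tiers)
        for name in tier
    }


rated = {"Allie": 0, "Billy": 3, "Candace": 2, "Dennis": 3,
         "Edith": 4, "Frank": 5, "Georgie": 3, "Harold": 4}
tiers = tiers_from_scores(rated)
print(tiers)
print(scores_from_tiers(tiers))
```

For the 5-voter ballot used here the round trip is lossy: Allie comes back with a score of 1 instead of the original 0, because the ranking alone cannot record the gap between Candace/2 and Allie/0.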
> I'm tempted to write a reference parser for this, but first, what do
> you all think?  Let the list know!  Let me know!  Let reddit know!
> :-D
>
> Thanks
> Rob
>
> p.s.  I'm thinking of calling my version "ABIF", standing for
> "Aggregated Ballot Image Format".  I may just document it here:
> https://electowiki.org/wiki/User:RobLa/ABIF
>
>


