[EM] Guardian 100 Best Novels

Ross Hyman rossahyman at gmail.com
Tue May 19 13:03:15 PDT 2026


I thought the score would be of the form approval - weight*(rank sum)
as well, but this is incompatible with My Antonia (approvals=4, total
rank = 4+5+10+5 = 24) at #100 and Left Hand of Darkness (approvals=4,
total rank = 8+7+10+9 = 34) at #84: with equal approvals, the book
with the smaller rank sum should place higher for any positive weight.
Even rank-dependent weights can't help, because once the ranks are
sorted, each rank on Darkness is worse (numerically larger) than or
equal to the corresponding rank on Antonia.
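
A minimal Python sketch of that comparison, using only the ranks quoted
above. The helper name `dominates` is made up for illustration. The
argument also assumes that whatever per-rank penalty the Guardian used
never decreases as a rank gets worse; under that assumption, a True
result means Antonia's total penalty cannot exceed Darkness's.

```python
def dominates(better, worse):
    """Return True if, after sorting, each rank in `better` is <= its counterpart in `worse`."""
    if len(better) != len(worse):
        return False
    return all(b <= w for b, w in zip(sorted(better), sorted(worse)))


# Ranks as quoted above (no books removed from the ballots).
antonia = [4, 5, 10, 5]
darkness = [8, 7, 10, 9]

print(sum(antonia), sum(darkness))   # 24 34
print(dominates(antonia, darkness))  # True
```

Since both books have 4 approvals, a True result here rules out every
score of the form f(approvals) - (sum of nondecreasing per-rank
penalties) as an explanation for Darkness placing above Antonia.
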
If one removes books not in the top 100, the total rank for Antonia is
2+4+9+5 = 20, and for Darkness it is 7+5+5+7 = 24. Darkness is still
lower ranked. But now Antonia's 9 is worse than any rank on Darkness,
so they could have chosen weights on the individual ranks that make
Antonia lower, such as making a rank of 9 count as 14 (giving Antonia
2+4+14+5 = 25). But that would be weird.
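
As a quick check of the arithmetic, here is a small Python sketch using
the ranks after removing books outside the top 100. The `penalty`
helper and its `override` table are hypothetical, introduced only to
illustrate the "9 counts as 14" idea; nothing suggests the Guardian
actually did this.

```python
def penalty(ranks, override=None):
    """Sum the ranks, replacing any rank found in `override` by its substitute value."""
    override = override or {}
    return sum(override.get(r, r) for r in ranks)


# Ranks after dropping books that did not make the top 100.
antonia = [2, 4, 9, 5]
darkness = [7, 5, 5, 7]

# Plain rank sums: Antonia has the smaller penalty.
plain = (penalty(antonia), penalty(darkness))
print(plain)  # (20, 24)

# Sorted rank-by-rank comparison no longer favors Antonia everywhere (9 > 7).
still_dominates = all(a <= d for a, d in zip(sorted(antonia), sorted(darkness)))
print(still_dominates)  # False

# Hypothetical reweighting: a rank of 9 costs 14 instead of 9.
tweaked = (penalty(antonia, {9: 14}), penalty(darkness, {9: 14}))
print(tweaked)  # (25, 24)
```

With the override, Antonia's penalty (25) exceeds Darkness's (24), so
Antonia would fall below Darkness, but only because of the odd jump
between ranks 7 and 9.
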

On Tue, May 19, 2026 at 12:48 PM Kristofer Munsterhjelm
<km-elmet at munsterhjelm.no> wrote:
>
> On 2026-05-18 18:50, Ross Hyman via Election-Methods wrote:
> > The Guardian's 100 Best Novels
> > https://www.theguardian.com/books/ng-interactive/2026/may/12/the-100-best-novels-of-all-time
> > includes 172 ballots ranking voters' 10 favorite novels.  I think it
> > would be interesting to use different voting methods on this ballot
> > set.
> >
> > The method the Guardian used to determine their 100 novel list from
> > the ballots is very mysterious to me.
> >
> > They state "We scored the titles according to how often they were
> > voted for, and then added a weighting based on individual rankings to
> > produce the overall list of 100 greatest books."
> >
> > I played around and could not find anything that fit this vague
> > description and would produce the presented list.
> >
> > For example, My Antonia in 100th place is ranked on 4 ballots while
> > 99, The Go-Between is ranked on 3. So rankings are not just being used
> > to break approval ties.
> >
> > Then look at 89 The Left Hand of Darkness, also ranked on 4 ballots.
> > But Antonia at 100 is higher ranked on its four ballots than Darkness
> > is on its four.
> >
> > What is their formula?
>
> Interesting. The raw data is defined in this large minified JS script:
>
> https://interactive.guim.co.uk/atoms/2026/03/2026-best-100-books-testing/best-100-books/v/1778864974/app.js
>
> I've pulled the data into JSON files[1][2], which can be parsed with a
> simple Python script[3]. After cleaning up the names, Quadelect gives
> the following top 20 with Schulze:
>
>   1. Middlemarch
>   2. Beloved
>   3. Ulysses
>   4. To the Lighthouse
>   5. Anna Karenina
>   6. In Search of Lost Time
>   7. War and Peace
>   8. Jane Eyre
>   9. Bleak House
> 10. Pride and Prejudice
>      Madame Bovary
> 12. The Great Gatsby
> 13. Nineteen Eighty-Four
>      Moby Dick
>      Emma
> 16. One Hundred Years of Solitude
> 17. Mrs Dalloway
> 18. Persuasion
> 19. The Portrait of a Lady
> 20. Wuthering Heights
>
> and with Smith,Ext-Minmax (which is quite decisive, but not clone
> independent or ISDA):
>
>   1. Middlemarch
>   2. Beloved
>   3. Ulysses
>   4. To the Lighthouse
>   5. Anna Karenina
>   6. In Search of Lost Time
>   7. War and Peace
>   8. Madame Bovary
>   9. Emma
> 10. Pride and Prejudice
> 11. Mrs Dalloway
> 12. One Hundred Years of Solitude
> 13. Moby Dick
> 14. Bleak House
> 15. The Great Gatsby
> 16. Wuthering Heights
> 17. Nineteen Eighty-Four
> 18. Persuasion
> 19. The Portrait of a Lady
> 20. Jane Eyre
>
> The description does make it sound like they're using Approval and
> Borda, so I took a guess that their Borda doesn't eliminate the
> non-contenders first, hence that a candidate's Borda penalty (negative
> score) is just the sum of the ranks they're listed in.
>
> By linear programming, I found the following weighting that *almost* works:
>         score = 30 * approvals - borda penalty
>
> where the approvals and borda penalties are calculated like this:
>
>         from collections import defaultdict
>
>         # voting_info is assumed to hold the ballots loaded from the JSON files
>         approvals = defaultdict(int)        # number of ballots listing each book
>         borda_penalties = defaultdict(int)  # sum of the positions it was listed in
>
>         for voter in voting_info:
>                 for vote in voter["topTen"]:
>                         approvals[vote["name"]] += 1
>                         borda_penalties[vote["name"]] += vote["position"]
>
>         scores = []           # (approvals, -penalty, name) tuples
>         best_fit_scores = []  # (weighted score, name) tuples
>
>         for candidate in approvals:
>                 scores.append((approvals[candidate],
>                                -borda_penalties[candidate], candidate))
>                 best_fit_scores.append((30 * approvals[candidate]
>                                         - borda_penalties[candidate], candidate))
>
> with a tweak taking into account that "The Life and Opinions of Tristram
> Shandy" has two different names in the input data.
>
> This is accurate up to position 53. The first mismatch is "Orlando" and
> "The Transit of Venus". Exact ties (both Borda and Approval scores
> identical) seem to be broken in ascending alphabetical order.
>
> (It should be possible to determine the weights more accurately, but I
> couldn't be bothered to manually input 100 Approval/Borda score pairs
> into my solver.)
>
> So if I were forced to guess at the method, I would say it's a weighted
> Approval/Borda score. It is a very good fit, at least.
>
> I suppose that's a reasonable method in this case, since presumably the
> voters were honest men (to echo Borda) and there's little reason to
> suspect strategic nomination either.
>
> -km
>
> [1] https://munsterhjelm.no/km/elections/guardian_best_100_2026/pb.json
> [2] https://munsterhjelm.no/km/elections/guardian_best_100_2026/gb.json
> [3]
> https://munsterhjelm.no/km/elections/guardian_best_100_2026/investigate_votes.py
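
For reference, the weighted Approval/Borda score and the alphabetical
tie-break described in the quoted message can be sketched in Python as
below. The toy ballots, the book names and the function name
`rank_books` are all made up for illustration; only the "topTen",
"name" and "position" keys and the weight of 30 come from the quoted
snippet. One assumption to flag: ties are broken here whenever the
combined scores are equal, which is slightly broader than "both Borda
and Approval scores identical".

```python
from collections import defaultdict

# Hypothetical toy ballots in the same shape as the quoted snippet's input.
voting_info = [
    {"topTen": [{"name": "Book A", "position": 1},
                {"name": "Book B", "position": 2}]},
    {"topTen": [{"name": "Book B", "position": 1},
                {"name": "Book A", "position": 2}]},
    {"topTen": [{"name": "Book C", "position": 1}]},
]


def rank_books(ballots, weight=30):
    """Order books by weight * approvals - (sum of positions), ties alphabetical."""
    approvals = defaultdict(int)
    penalties = defaultdict(int)
    for voter in ballots:
        for vote in voter["topTen"]:
            approvals[vote["name"]] += 1
            penalties[vote["name"]] += vote["position"]

    def sort_key(name):
        score = weight * approvals[name] - penalties[name]
        # Negate the score so higher scores sort first; name breaks ties ascending.
        return (-score, name)

    return sorted(approvals, key=sort_key)


ranking = rank_books(voting_info)
print(ranking)  # ['Book A', 'Book B', 'Book C']
```

Book A and Book B both score 30*2 - 3 = 57 and are separated only by
name, while Book C scores 30*1 - 1 = 29.
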

