[EM] round robin tournaments

Thu Jun 23 15:56:57 PDT 2011

In sports round robin tournaments where burial strategy is meaningless, what is the best measure of 
defeat strength?  In other words, if you were going to use, say, beatpath to find a total order of the teams 
from best to worst after the conclusion of a tournament, would you use wv, margins, or something else 
to measure defeat strength?

Let's think about basketball for example.  Let W be the winning points and L be the losing points from a 
match between two teams.  The margins defeat strength would be W-L.  The wv defeat strength would 
be just W.  Another possibility would be the point ratio W/L or equivalently log(W) - log(L).  Between 
margins and the point ratio would be  sqrt(W) - sqrt(L), which seems better to me from an intuitive point 
of view.

What rational basis could we use to decide between these measures?

It seems like we want the defeat strength to reflect the likelihood that if the match were repeated, the 
winner would not change.

Suppose team A beats team B thirty-six to twenty-five, while team C beats team D forty-nine to thirty-six.

Which is the stronger defeat?  If both pairs of teams had rematches, which of the rematches would be more 
likely to turn out the same way?

Note that C's defeat of D has the larger margin (13 versus 11), but A's defeat of B has the larger ratio (1.44 versus about 1.36).

By the sqrt(W) - sqrt(L) measure, both defeat strengths come out the same.
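
For anyone who wants to verify the comparison, here is a minimal Python sketch that evaluates the four 
candidate measures on the two example games.  The function name and dictionary keys are just illustrative 
labels, not anything standard.

```python
import math

def measures(w, l):
    """Candidate defeat strengths for a game won w points to l points."""
    return {
        "margins": w - l,                            # W - L
        "wv": w,                                     # just W
        "ratio": w / l,                              # W / L
        "sqrt_diff": math.sqrt(w) - math.sqrt(l),    # sqrt(W) - sqrt(L)
    }

# The two example games from the text.
games = {"A over B": (36, 25), "C over D": (49, 36)}
results = {name: measures(w, l) for name, (w, l) in games.items()}

for name, m in results.items():
    print(name, m)
```

Margins favors C over D, the ratio favors A over B, and the square root difference is exactly 1 for both.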

Here's a way to resolve it.

Let N=W+L.  Let p=W/N.  Let q=L/N.  Let sigma = sqrt(N*p*q).

Measure defeat strength by  S=(p-q)*sigma.

Since p-q = (W-L)/N and sigma = sqrt(W*L/N), in terms of W and L we have

S=(W-L)*sqrt(W*L)/(W+L)^1.5

I don't have a calculator on me now, but somebody should check this out to see which of the two defeats 
is stronger by this measure.  I suspect that they are pretty close.
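
The check is easy to do in a few lines of Python.  This is only a sketch of the formula as written above; 
the function name defeat_strength is a made-up label.

```python
import math

def defeat_strength(w, l):
    """S = (W-L)*sqrt(W*L)/(W+L)^1.5, the closed form given in the text."""
    n = w + l
    return (w - l) * math.sqrt(w * l) / n ** 1.5

s_ab = defeat_strength(36, 25)   # A beats B 36 to 25
s_cd = defeat_strength(49, 36)   # C beats D 49 to 36

print(f"A over B: S = {s_ab:.4f}")
print(f"C over D: S = {s_cd:.4f}")
```

This gives S of about 0.693 for A over B and about 0.697 for C over D, so by this measure C over D is 
the stronger defeat, but only very slightly.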

What is the heuristic behind S=(p-q)*sigma?

The Binomial distribution has standard deviation sigma = sqrt(N*p*q).

From the point of view of the winning team the fraction p is the proportion of favorable outcomes, while 
the fraction q is the proportion of unfavorable outcomes.  So (p-q) is the difference in estimates of the 
underlying fractions of favorable and unfavorable outcomes.  We multiply this estimate by the standard 
deviation to take into account "sample size."
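
One way to probe the rematch criterion stated earlier is to take the binomial picture literally.  The 
sketch below ASSUMES (none of this is established above) that a rematch consists of the same N points, 
that each point goes independently to the original winner with probability p = W/N, and that "same 
outcome" means the original winner takes more than half of the points.  Under those assumptions the 
probability of the same winner can be computed exactly.  The function name is hypothetical.

```python
import math

def prob_same_winner(w, l):
    """Probability the original winner wins a rematch, under an ASSUMED model:
    N = w + l independent points, each won by the original winner with
    probability p = w / N (the observed fraction treated as the true one)."""
    n = w + l
    p = w / n
    need = n // 2 + 1   # smallest point count that is a strict majority
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(need, n + 1)
    )

p_ab = prob_same_winner(36, 25)
p_cd = prob_same_winner(49, 36)

print(f"A over B rematch: {p_ab:.4f}")
print(f"C over D rematch: {p_cd:.4f}")
```

This is a toy model, not a claim about real basketball scoring, but it gives a second number to set beside S 
when comparing the two defeats.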

If a real statistician like Jobst were reading this, he (or she) could refine this estimate, or at least give a 
better explanation.

Thoughts?

Forest