The Collated Ladder takes in two inputs:
- The projected number of wins for each club from the Stocky
- The projected number of wins for each club from Pythagorean expectation
Put simply, the Collated Ladder is an average of these two numbers, with a 2:1 weighting towards the output of the Stocky, rounded to the nearest whole number.
The Ladder is then based on sorting each team by its Collated number of wins, then by its Pythagoras projection, which is a loose analogue for for-and-against (the greater the number of wins projected, the better the team’s for-and-against will be).
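As a rough sketch of the arithmetic described above, here is one way the collation and sorting could look in Python. The function names, the team labels and the round-half-up rule are my assumptions for illustration; the only things taken from the description are the 2:1 weighting, the rounding to a whole number and the tie-break on the Pythagoras projection.

```python
def collated_wins(stocky_wins: float, pythag_wins: float) -> int:
    """Average the two projections, weighted 2:1 towards the Stocky."""
    weighted = (2 * stocky_wins + pythag_wins) / 3
    # Assumption: halves round up. Python's built-in round() rounds halves to even.
    return int(weighted + 0.5)


def collated_ladder(projections: dict) -> list:
    """Sort teams by collated wins, then by Pythagoras projection (both descending).

    `projections` maps a team name to (stocky_wins, pythag_wins).
    """
    rows = [
        (team, collated_wins(stocky, pythag), pythag)
        for team, (stocky, pythag) in projections.items()
    ]
    return sorted(rows, key=lambda row: (-row[1], -row[2]))


if __name__ == "__main__":
    # Hypothetical projections, not real output from either system.
    example = {"A": (15.2, 12.8), "B": (13.9, 14.3), "C": (10.0, 9.0)}
    for team, wins, pythag in collated_ladder(example):
        print(team, wins, pythag)
```

In this made-up example, A and B both collate to the same whole number of wins, so the Pythagoras projection decides which of them sits higher.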
Why bother with this if both systems have limitations and inaccuracies? Aren’t we just compounding that?
The Stocky, which is short for stochastic simulation, is a Monte Carlo simulation of the season using Elo modelling to work out what the outcome of that season might be.
The basic premise of a Monte Carlo simulation is that if you have a few pieces of the puzzle, an idea of how they relate and then throw enough random numbers at it, you’ll get a pretty good idea of what the puzzle picture is.
Let’s say you have a circle inside a square with sides the same length as the circle’s diameter. Then throw a bunch of sand onto the square/circle combination and count how many grains of sand end up in the circle. If you know the length of the square’s side and the proportion of sand that ends up in the circle, you can work out a value for π.
(You want more detail? Fine: the side of the square gives you the area of the square. Multiplying that by the proportion of sand inside the circle gives you an estimate of the circle’s area. Divide the circle’s area by the square of half the side length, which is the radius squared, and you get an estimate of π.)
The more grains of sand you throw at the square/circle, the closer the estimate will be to the actual answer.
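If you would rather see the sand-throwing in code, here is a minimal Python sketch. The function name, the default side length and the fixed seed are assumptions for illustration; the steps follow the description above.

```python
import random


def estimate_pi(grains: int, side: float = 2.0, seed: int = 0) -> float:
    """Estimate pi by scattering random 'grains' over a square with an inscribed circle."""
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    radius = side / 2
    inside = 0
    for _ in range(grains):
        # Drop a grain uniformly at random on the square, centred on the origin.
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            inside += 1
    square_area = side ** 2
    circle_area = square_area * inside / grains  # proportion of sand in the circle
    return circle_area / radius ** 2             # area = pi * r^2, so pi = area / r^2


if __name__ == "__main__":
    for grains in (100, 10_000, 1_000_000):
        print(grains, estimate_pi(grains))
```

Running it with more grains should generally pull the estimate closer to π, although any single run can wobble.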
In my previous primer on Elo ratings, I talked about different ways of calculating Elo ratings with a view to measuring form and/or class. This primer will look in a bit more depth at how I arrived at the specific numbers for the variables.
The main variables in an Elo model are:
- Starting ratings (discrete versus continuous)
- If continuous, then the reversion to mean discount of ratings
- Calculation method (margin vs result/WTA)
- K, weighting for each game
- h, homefield advantage
- p, margin factor
Some are derived from game data, others from optimisation. Let’s tackle them one by one.
Short answer: with a lot of time spent in Excel and Google Sheets.
Long answer: It depends on what you want to do.
I introduced the Elo rating system in a previous primer. Now it’s time to put it to work.
I think most sports fans would agree with the following definitions of form and class –
- Form – Short term performance, related to luck, match fitness, weather
- Class – Long term performance, related to the structural competence of the team in question
Something like “Form wins games, class wins premierships” seems appropriate.
Pythagorean expectation is the idea that you can calculate a team’s winning percentage based solely on its for and against. It originated with baseball nerds but, according to its Wikipedia article, has been adapted for other sports. It is also where the name of this site came from. Pythago is basically what Pythagoras would have been if he had been Australian.
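For a feel of how this works in practice, here is a small Python sketch of the classic baseball form, where points for and against are each raised to an exponent of 2. The function names are mine, and the exponent of 2 is the traditional baseball value used here as an assumption; the exponent that best suits rugby league is not given here.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from for and against alone."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def projected_wins(points_for: float, points_against: float, games: int,
                   exponent: float = 2.0) -> float:
    """Scale the expected winning percentage up to a season of `games` matches."""
    return games * pythagorean_win_pct(points_for, points_against, exponent)


if __name__ == "__main__":
    # Hypothetical season totals, not real data.
    print(pythagorean_win_pct(600, 400))
    print(projected_wins(600, 400, games=24))
```

A team that scores exactly as many points as it concedes comes out at 50%, which is a handy sanity check.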
To rip straight from Wikipedia –
“The basic formula is:
Elo ratings originated in chess as a way to rate different players. A player starts with a rating of 1500 by convention and then the rating goes up or down depending on whether the player wins or loses. The player’s rating will change proportionally to the rating of their opponent: if the player beats a very highly rated opponent, the player’s rating will go up by more than if they beat a minnow.
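To make that concrete, here is a minimal Python sketch of the standard chess-style Elo update. The function names and the K value of 32 are assumptions for illustration, and this is the textbook form rather than necessarily the variant used on this site.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Probability-like expectation of winning, on the usual 400-point logistic scale."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def update_rating(rating: float, opponent: float, score: float,
                  k: float = 32.0) -> float:
    """Return the new rating after one game.

    `score` is 1 for a win, 0.5 for a draw and 0 for a loss.
    `k` controls how far one result moves the rating (32 is a hypothetical choice).
    """
    return rating + k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    start = 1500
    print(update_rating(start, 1700, score=1))  # beating a highly rated opponent
    print(update_rating(start, 1300, score=1))  # beating a minnow
```

The first result should show a noticeably bigger jump than the second, because the win over the stronger opponent was less expected.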
This rating system has been adapted for soccer, the NFL and the AFL, and a few systems have been developed for rugby league teams (1, 2 & there was a third but it seems to have gone missing). What intrigued me was how The Arc and 538.com took Elo ratings and, instead of just ranking teams, used them to predict the outcome of matches in advance with a surprising degree of accuracy. I decided that I wanted to try the same thing for the NRL.
Here’s the soccer explanation for how it’s calculated: