After a dominating win over Great Britain, followed by a tough and exciting win over Australia, some extremely exuberant pundits decided that Tonga were the best nation in the world and worthy of Tier 1 status.
Putting aside the fact that a poor island nation of 100,000 is not capable of generating enough native talent to compete with Australia or England, and so is not at all suitable for Tier 1 status, I find it hard to believe that two wins are enough to reach the top of the rugby league pile. Then the IRL updated their rankings and put New Zealand at number one. It all got too much for me.
Fortunately, the draw for the 2021 World Cup restored some sanity, despite the slightly incongruous setting of Buckingham Palace and the newsworthy presence of the Duke of Sussex. With 641 days to kickoff, I got a bit excited and looked at next year’s tournament.
Over the last month, we’ve been looking at rating players using a metric called Production Per Game, or PPG. We’ve used it to find players at the higher end, justifying million dollar salaries, and at the lower end, identifying fringe first graders.
The tricky thing about rating players is determining what information from the past can be used to project the player’s performance into the future. I hope it’s obvious why this might be interesting.
Within a player’s career, there is a noticeable amount of variation from season to season. On average, players get two pips worse each season (one pip is 0.001 of a PPG rating), although the actual change lies somewhere between improving by 96 pips and losing 86, with a standard deviation of 24 pips.
If you’re the Eels, probably a lot more than zero. But I’m getting ahead of myself.
Around the start of the finals last year, The Arc posted the probabilities of each finalist winning the AFL grand final. Some guy on Twitter (let’s call him Bill because I don’t remember who it was and I’m not digging out a throwaway tweet from six months ago) asked if the probabilities had been calculated for all finals series throughout history so we could see how many teams were expected to win against reality. They hadn’t but more on that next week.
I thought, in the true embodiment of the philosophy of this site, “That’s a great idea. I’m gonna do that but for NRL.”
The Collated Ladder takes in two inputs:
- The projected number of wins for each club from the Stocky
- The projected number of wins for each club from the Pythagoras projection
Put simply, the Collated Ladder is an average of these two numbers, with a 2:1 weighting towards the output of the Stocky, rounded to the nearest whole number.
The Ladder is then based on sorting each team by its Collated number of wins, then by its Pythagoras projection, which is a loose analogue for for-and-against (the greater the number of wins projected, the better the team’s for-and-against will be).
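As a rough sketch of that recipe, here is how the collation and sorting might look in Python. The club names and win projections are made up for illustration, the names `collated_wins` and `build_ladder` are mine, and I'm assuming the second input is the Pythagoras win projection and that exact halves round up.

```python
import math


def collated_wins(stocky_wins, pythag_wins):
    """Weighted average, 2:1 towards the Stocky, rounded to a whole number."""
    weighted = (2 * stocky_wins + pythag_wins) / 3
    # Assumption: exact halves round up (Python's round() would round to even).
    return int(math.floor(weighted + 0.5))


def build_ladder(projections):
    """projections: list of (club, stocky_wins, pythag_wins) tuples.

    Returns (club, collated wins) pairs sorted by collated wins,
    then by the Pythagoras projection as the tie-breaker.
    """
    rows = [
        (club, collated_wins(stocky, pythag), pythag)
        for club, stocky, pythag in projections
    ]
    rows.sort(key=lambda row: (-row[1], -row[2]))
    return [(club, wins) for club, wins, _ in rows]


# Hypothetical projections, purely for illustration.
example = [
    ("Club A", 15.0, 12.3),
    ("Club B", 13.6, 15.1),
    ("Club C", 9.0, 10.0),
]
for club, wins in build_ladder(example):
    print(club, wins)
```

Club A and Club B both collate to 14 wins here, so the Pythagoras projection puts Club B on top.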
Why bother with this if both systems have limitations and inaccuracies? Aren’t we just compounding that?
The Stocky, which is short for stochastic simulation, is a Monte Carlo simulation of the season using Elo modelling to work out what the outcome of that season might be.
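To give a feel for it, here is a stripped-down sketch of a season simulation, not the Stocky itself. It assumes the standard Elo win-probability curve on a 400-point scale, keeps each club's rating fixed for the whole season, ignores draws, and uses invented clubs, ratings and fixtures. The names `win_probability` and `simulate_seasons` are mine.

```python
import random


def win_probability(home_rating, away_rating, h=0.0):
    """Standard Elo expected score for the home side, with h rating points
    of homefield advantage added to the home rating."""
    return 1.0 / (1.0 + 10 ** (-(home_rating + h - away_rating) / 400.0))


def simulate_seasons(ratings, fixtures, n_sims=10_000, h=0.0, seed=None):
    """Play out the fixture list n_sims times and return average wins per club.

    Simplification: ratings do not change during a simulated season.
    """
    rng = random.Random(seed)
    total_wins = {club: 0 for club in ratings}
    for _ in range(n_sims):
        for home, away in fixtures:
            p_home = win_probability(ratings[home], ratings[away], h)
            winner = home if rng.random() < p_home else away
            total_wins[winner] += 1
    return {club: wins / n_sims for club, wins in total_wins.items()}


# Hypothetical three-club competition, everyone plays home and away.
ratings = {"A": 1700, "B": 1500, "C": 1300}
fixtures = [(x, y) for x in ratings for y in ratings if x != y]
print(simulate_seasons(ratings, fixtures, n_sims=2000, seed=0))
```

The output is a projected number of wins per club, which is the kind of figure a ladder projection needs.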
The basic premise of a Monte Carlo simulation is that if you have a few pieces of the puzzle, an idea of how they relate and then throw enough random numbers at it, you’ll get a pretty good idea of what the puzzle picture is.
Let’s say you have a circle inside a square with sides the same length as the circle’s diameter. Then throw a bunch of sand onto the square/circle combination and count how many grains of sand end up in the circle. If you know the length of the square’s side and the proportion of sand that ends up in the circle, you can work out a value for π.
(You want more detail? Fine: the side of the square gives you the area of the square; multiplying that by the proportion of sand inside the circle gives an estimate of the circle’s area; dividing the circle’s area by the square of half the side length, which is the radius squared, gives an estimate of π.)
The more grains of sand you throw at the square/circle, the closer the estimate will be to the actual answer.
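If you'd rather throw virtual sand, the experiment fits in a few lines of Python. This is a minimal sketch: the function name `estimate_pi` and the seed argument are my own choices, and the grains are just uniformly random points on the square.

```python
import random


def estimate_pi(grains, side=2.0, seed=None):
    """Estimate pi by scattering random grains over a square of the given
    side length that contains a circle of the same diameter."""
    rng = random.Random(seed)
    radius = side / 2
    inside = 0
    for _ in range(grains):
        # Drop a grain somewhere on the square, centred on the origin.
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            inside += 1
    proportion = inside / grains
    circle_area = proportion * side ** 2   # share of the square's area
    return circle_area / radius ** 2       # area = pi * r^2, so pi = area / r^2


for grains in (100, 10_000, 1_000_000):
    print(grains, estimate_pi(grains, seed=1))
```

Run it with more grains and the estimates should generally settle closer to π, although any individual run can still wander a little.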
In my previous primer on Elo ratings, I talked about different ways of calculating Elo ratings with a view to measuring form and/or class. This primer will look in a bit more depth at how I arrived at the specific numbers for the variables.
The main variables in an Elo model are:
- Starting ratings (discrete versus continuous)
- If continuous, then the reversion to mean discount of ratings
- Calculation method (margin vs result/WTA)
- K, weighting for each game
- h, homefield advantage
- p, margin factor
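To show where some of these variables sit, here is a sketch of the textbook result-based (WTA) Elo update with K, h and a reversion to mean. The default values for K, h and the discount are placeholders rather than the numbers arrived at in this primer, the function names are mine, and the margin factor p is left out because how it enters depends on the margin method used.

```python
def expected_home_win(home_rating, away_rating, h=0.0):
    """Standard Elo expected score for the home side; h is homefield
    advantage expressed in rating points."""
    return 1.0 / (1.0 + 10 ** (-(home_rating + h - away_rating) / 400.0))


def update_ratings(home_rating, away_rating, home_won, k=20.0, h=0.0):
    """Result-based update: the winner takes K times the surprise from the loser."""
    expected = expected_home_win(home_rating, away_rating, h)
    actual = 1.0 if home_won else 0.0
    change = k * (actual - expected)
    return home_rating + change, away_rating - change


def revert_to_mean(rating, discount=0.25, mean=1500.0):
    """Between seasons of a continuous system, pull a rating part of the way
    back to the mean. One common form, assumed here for illustration."""
    return mean + (1.0 - discount) * (rating - mean)


# Two evenly rated sides, no homefield advantage, home side wins.
print(update_ratings(1500.0, 1500.0, home_won=True))
print(revert_to_mean(1600.0))
```

A larger K makes ratings react faster to each game, and a larger h makes a home win less surprising, so it moves the ratings less.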
Some are derived from game data, others from optimisation. Let’s tackle them one by one.