Turns out that StatScore didn’t pan out the way I had hoped. There were some conceptual errors but the biggest was that I wanted a measure of rate and a measure of volume and you can’t have one statistic that does both. It’s like having one number that meaningfully states that a boxer is both the best in the world pound-for-pound but also the best boxer in the world who can beat anyone irrespective of weight classes. The world doesn’t work like that. As a result, there was some okay, but not great, player analysis. Unfortunately, the creation of a new tool requires that you use it for a while on new scenarios in order to evaluate it’s usefulness. Sometimes it doesn’t pan out as well as you would have hoped.
Also, the name sucked.
So I went back to the drawing board. There were some salvageable concepts from StatScore that have been repackaged, with corrections made for some fundamental mistakes, and repurposed into new player rating systems: PPG and WARG.
Poseidon ratings are a new team rating system for both the NRL and the Queensland Cup.
For those who don’t have time to read 2000+ words, here’s the short version: the purpose of Poseidon ratings is to assess the offensive and defensive capabilities of rugby league teams in terms of the number of tries they score and concede against the league average. By using these ratings, we can estimate how many tries will be scored/conceded in specific match ups and then use that, with probability distributions, to calculate an expected score, margin and winning probabilities for the match-up.
The biggest off-season story in the NRL was the transfers of Cooper Cronk from Melbourne to Sydney and then Mitchell Pearce from Sydney to Newcastle. From the Roosters’ perspective, for two players likely on similar pay packets, how did the Roosters decide one was better than the other? Then I wondered if it were possible to work out a way of judging value for money in player trades. It’s big in baseball, so why not rugby league? This led me to develop StatScore and Win Shares as ways to numerically evaluate rugby league players.
If you’re wired for numbers, like I am, it can be hard to deal with people’s feelings and understanding why they think the things that they do. That’s why I’ve decided to quantify the feelings a team generates into five distinct indices: Power, Hope, Panic, Fortune and Disappointment.
Each index has two components. There’s a main mechanism for ranking the teams and some minor tie-breaking stats. The main mechanism typically uses Elo ratings to make an estimation of what we expect from a team, whether or not they are meeting that expectation and what that means for the season ahead. The tie-breakers are statistics used to award a few points here and there to help rank the teams should they have similar mechanism results.
Editor’s note: As much of last season’s material was influenced by The Arc, much of this season’s material owes a debt to SB Nation, including the idea of panic/hope indices.
The Greeks is the collective name given to a series of Elo rating models for tracking performance of rugby league teams and forecasting the outcomes of games. I usually refer to them as if the philosopher himself was making the prediction, even though the Greeks have mostly been dead for a couple thousand years and certainly would never have heard of rugby league or Arpad Elo.
The differences between each Greek are on the subtle side, with the intention of measuring different things. You may want to revisit primers from last year:
While the recent RLWC was on, I couldn’t help but notice that the RLIF had Scotland pegged as the world’s fourth best team. Scotland hadn’t won a game since 2014 and even that was against Ireland. Since then, they’d lost to Australia, England, Ireland, Wales and France. I also got frustrated because a fifteen second Google didn’t reveal how the rankings are actually calculated.
So I figured I could come up with a better system. I did and this is how the Pythago World Rankings (PWR) work.
The Collated Ladder takes in two inputs:
- The projected number of wins for each club from the Stocky
Put simply, the Collated Ladder is an average of these two numbers, with a 2:1 weighting towards the output of the Stocky, rounded to the nearest whole number.
The Ladder is then based on sorting each team by its Collated number of wins, then by its Pythagoras projection, which is a loose analogue for for-and-against (the greater the number of wins projected, the better the team’s for-and-against will be).
Why bother with this if both systems have limitations and inaccuracies? Aren’t we just compounding that?