Tag Archives: nrl

Primer – TPR

For the third season in a row, I’m changing the player rating system. We mourn the passing of Statscore (not really) and PPG (again, not really) as we slowly converge on a system that I can take for granted and don’t have to refine any further.

The core of the system hasn’t changed. The proposition is that there are important and unimportant statistics and that counting the important ones provides information about players and teams and can be predictive.

PPG was useful, and development and application through 2019 demonstrated that:

The last one should be taught in universities as a perfect example of ringing the bell at the top. Sheer narrative power subsequently forced Pearce back to mean and Brown onto the compost heap.

The mechanics of PPG have been preserved through TPR. My biggest issue is that when I wrote about production (that is, the accumulation of useful statistics), I didn’t have any units to work with. I originally didn’t think this would be a problem, but it would make some things clearer if I did have units. So I took a leaf from the sciences and landed on naming it after the man who could do it all, David “Coal Train” Taylor.


“PPG”, which was Production – and not Points – Per Game, doesn’t make much sense now, so that’s been punted and replaced with TPR, or Taylor Player Rating. There has been a substantial change between the way I calculated WARG in the primer at the start of 2019 and the way I calculated it in Rugby league’s replacement player at the end. The latter method is now canonical but the name is going to stick.

In brief, TPR and WARG are derived through the following six steps:

  1. Run linear regressions to confirm which statistics correlate with winning percentage. The stats get distributed into buckets and we review the success of teams achieving those statistics. One crucial change was to exclude from the regression any buckets with fewer than ten games in them. We end up with tries, running metres, kick return metres, post-contact metres, line breaks, line break assists, try assists, tackle busts, hit ups, dummy half run metres, missed tackles (negative), kick metres, forced drop outs, errors (negative) and, in Queensland only, penalties (negative) as having significant correlations out of the data provided by the NRL.
  2. Take the slope of the trendline calculated in the regression and weight it by its correlation (the higher the correlation, the higher the weighting). Through this weighting, we develop a series of equivalences between stats. The table below shows the quantity of each stat required to be equivalent to one try in 2020:
    [Table: stat equivalences to one try, 2020]
  3. Players who accumulate these statistics are said to be generating production, which is now measured in Taylors, and is the product of the weighting/slope multiplied by the quantity of stats accumulated multiplied by 1000. However, due to the limitations of the statistics, some positions on the field generate significantly more Taylors than others.
    [Chart: average Taylors per game by position]
  4. To combat this, the production generated each game is then compared to the average production generated at that position (averaging previous 5 seasons of data in NRL, 3 seasons for State Cup). We make the same adjustments for time on field as in PPG and then divide by 10 for aesthetic purposes. The resulting number is the Taylor Player Rating, or TPR.
  5. We derive a formula for estimating win probability based on production for each competition and then substitute in a winning percentage of .083 (or two wins in twenty-four games, per the previous definition of a replacement-level team) to estimate the amount of production created by a team of fringe players against the competition average. This gives us a TPR at which we can set replacement level. The Taylors a player creates over and above replacement level are added to the notional replacement-level team’s production, and the resulting increase in winning probability is attributed to that player as a Win Above Reserve Grade, or WARG (see the sketch after this list). Replacement level in TPR for the NRL is .057, Queensland is .072 and NSW is .070. The career WARG leaders are currently:
    [Table: career WARG leaders]
  6. Finally, we go back and check that it all makes sense by confirming that TPR has some predictive power (~61% successful tipping rate, head-to-head) and there’s a correlation with team performance (~0.60 r-squared for team season production against team winning percentage).
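For the mechanically minded, here is a minimal sketch of steps 3 to 5 in code. The stat weights, the time-on-field adjustment, the positional averages and the win probability function are illustrative placeholders, not the actual fitted values.

```python
# Minimal sketch of the Taylors -> TPR -> WARG pipeline described above.
# All numbers are placeholders, not the real fitted 2020 values.

STAT_WEIGHTS = {  # hypothetical correlation-weighted slopes (step 2)
    "tries": 0.0040,
    "run_metres": 0.00002,
    "line_breaks": 0.0015,
    "errors": -0.0010,
}

def taylors(stat_line: dict) -> float:
    """Step 3: production in Taylors = sum(weight * count) * 1000."""
    return 1000 * sum(STAT_WEIGHTS.get(stat, 0.0) * count
                      for stat, count in stat_line.items())

def tpr(stat_line: dict, position_avg_taylors: float, minutes: float) -> float:
    """Step 4: compare to the positional average, adjust for minutes, divide by 10.
    The per-80-minute scaling here is a stand-in for the actual PPG-style adjustment."""
    per_game = taylors(stat_line) * (80.0 / minutes)
    return (per_game / position_avg_taylors) / 10

def warg(player_tpr: float, games: int, win_prob, replacement_tpr: float = 0.057) -> float:
    """Step 5 (simplified): credit the player with the lift in win probability when
    their surplus over replacement level is added to a replacement-level team.
    win_prob is the competition-specific production-to-winning-percentage formula."""
    surplus = max(player_tpr - replacement_tpr, 0.0)
    return games * (win_prob(replacement_tpr + surplus) - win_prob(replacement_tpr))
```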

For a more in-depth explanation, you can refer back to the original PPG primer. The differences between last year’s system and this year’s are slight and, for most intents and purposes, PPG and TPR are equivalent. Some of the changes are small in impact but important.

The most obvious change is the addition of NSW Cup data to the Queensland Cup and NRL datasets. This was driven by my interest in assessing the farm systems of each NRL club and you can’t make a decent fist of that if you’re missing twelve feeder clubs from the picture. It will also allow me to better test talent identification in the lower levels if I have more talents to identify and to better set expectations of players as they move between competitions.

For the most recent seasons, TPR only uses past data to calculate its variables, whereas PPG used all of the data available and created a false sense of success. A system that uses 2018 data to create after-the-fact predictions for the 2018 season isn’t going to give you an accurate view of how it will perform in 2019.

Finally, projecting player performance into the future is a pretty powerful concept, even if the tools for doing so are limited. I went back and re-derived all of the reversion-to-mean formulas used in The Art of Projection. It turns out that the constants for the projection formula don’t change much between seasons, so this is fixed across the datasets for now. It also turns out that adjustments for age and experience are different and largely useless under the TPR system, such is the ephemeral nature of statistical analysis.
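The actual formulas aren’t reproduced here, but a minimal sketch of the usual regression-to-mean form (blend the observed rating with the league mean, weighted by sample size) looks something like this; the constant and the league-mean value are assumptions for illustration only.

```python
def project_tpr(observed_tpr: float, games: int,
                league_mean: float = 0.100, c: float = 15.0) -> float:
    """Regress an observed TPR towards the league mean, weighted by games played.
    c is a hypothetical constant; a TPR of 0.100 for an exactly average player
    follows from the definition above, but both values here are illustrative."""
    return (observed_tpr * games + league_mean * c) / (games + c)

# e.g. a 0.140 TPR over 8 games projects to roughly 0.114 under these assumptions
print(round(project_tpr(0.140, 8), 3))
```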

One application for projections is that I’ll be able to run season simulations using the winning probability formula and team production that will be able to measure the impact of including or excluding a player on the outcome of a team’s season. It may not be super-accurate (the projections have large average errors) but it will be interesting. I also like the idea of using out- or under-performance of projections as an assessment of coaching.

Finally, to reiterate things that I think are important caveats: TPR is a value-over-average rate statistic, while WARG is a volume statistic. No, statistics don’t tell the whole story and even these ones don’t measure effectiveness. Yes, any player rating system is going to have a certain level of arbitrariness to it because the system designer has to make decisions about what they consider important and unimportant. I’m fully aware of these things and wrote 1500 words accordingly at the end of the PPG primer.

A thing I’m trying to do this season is publish all of my rating systems on Google Sheets so anyone can have a look. You can see match-by-match ratings for NRL and the two State Cups if that’s your jam.

The coaches that fucked up your club

When a coach arrives at a major league club, fresh and excited to make his own mark in the history books, you’d have to think that, as a minimum threshold for success, he’d want to leave the place in better shape than when he arrived. Sometimes, the vagaries of reality make it difficult to assess a coach’s legacy but we can definitely ignore nuance and simplify things down to a nice looking line on a graph.

For this, we use Class Elo ratings. Over this kind of time frame, you can think of the rating as a glorified win-loss stock ticker. It goes up when the team wins and it goes down when the team loses. The rating goes up more for unexpected wins and goes down more for unexpected losses. Grand finals are weighted the heaviest, then finals and then regular season games. Challenge Cup results are included for Super League teams. You can see each team’s class Elo rating history for NRL and Super League.
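As a rough illustration of the ticker mechanics, here’s a minimal sketch of a single Elo update; the K factors and the finals weightings below are made-up numbers, not the ones actually used.

```python
# Hypothetical K factors: the post weights grand finals heaviest, then finals,
# then regular season games.
K = {"regular": 30, "finals": 45, "grand_final": 60}

def expected(rating_a: float, rating_b: float) -> float:
    """Standard Elo win expectancy for team A."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(rating_a: float, rating_b: float, a_won: bool, game_type: str = "regular"):
    """Move both ratings after a game; unexpected results move them further."""
    delta = K[game_type] * ((1.0 if a_won else 0.0) - expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```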

This post compares different coaches at each club to see how they improved the club’s rating from their first game. I’ve included most, but not all, of the coaches for each club over the last two decades. Caretakers have generally been excluded. I used rugbyleagueproject.org (DONATE TO THE PATREON) to determine the extents of careers but it may not be 100% complete for coaching details, and career lengths may be out by a few games. It is very hard to find out which round a coach was sacked from a club in 2003 if it’s not on RLP.


Read more

Ranking every rugby league team in the world

If you’re not interested in how the rankings work and just want to see the outputs, click here.

It began as a simple exercise to try and rate Super League players, much in the way that I rate NRL and Queensland Cup players. It turns out that the Super League website makes that an impossible task because it is a garbage fire for stats. Moving on from the wasted effort, I thought I might still do team ratings for the RFL system, mostly out of my increased interest following the Toronto Wolfpack’s promotion into Super League.

Then I thought about the Kaiviti Silktails of Fiji entering into the New South Wales system and wondered if I should take a look at the leagues there, despite my dubiousness about whether anyone in NSW cared about lower grade football when they could follow the Dragons, the Tigers or the Knights in so-called first grade.

From there I spiralled into a mishmash of US college football tradition, websites in Serbian and copying and pasting. When I came to, I had a neatly formatted spreadsheet covering a decade of world club rugby league.


Ranking the world

Invariably, creating any sort of evaluation system requires judgements by the evaluator about who to include or exclude and what the evaluation system considers to be “good”. I’ll explain my position and you can decide whether or not you like it.

Scoring the teams uses an average of four similar rating systems that look at performance over different time intervals.

We’ve long had form and class Elo ratings for the NRL and Queensland Cup. Form is about the short term performance of clubs, and can represent anywhere from four to eight weeks of results depending on the draw and league, while class is about long term performance, and can represent the average of years of performance. Form is a better predictor of match results, class is a better predictor of fan disappointment.

I created similar systems for another ten leagues in NSW, PNG, France (see also my Elite 1 season preview), the UK and the USA. They work along the same lines as the NRL and Queensland Cup editions. The average rating within an Elo system is approximately 1500 and the disparity in ratings can be used to estimate match outcome probabilities.

Both sets of Elo ratings are adjusted by a classification system I borrowed from baseball. To acknowledge the fact that a 1700 team in the BRL is not likely to be as good as a 1300 team in Super League, we adjust the team ratings so we can attempt to compare apples to apples –

  • Majors: NRL (ratings adjusted by +500) & Super League (+380)
  • Triple-A (AAA): QCup, NSW Cup and RFL Championship (all +85)
  • Double-A (AA): Ron Massey, RFL League 1, FFR Elite 1 (all -300)
  • High-A (A+): Brisbane RL, FFR Elite 2 (all -700)
  • Low-A (A-): USARL (-1000)

In Elo terms, a difference of 120 points between teams, like between an average NRL and an average Super League team, makes the NRL team 2:1 favourites. A 415 point gap gives the less favoured team an 8.4% chance of winning (equivalent to the replacement level), 800 points 1%, 1200 points 0.1% and 1600 points 0.01%. Consider the improbability of the Jacksonville Axemen beating the Melbourne Storm and you get an idea of where I’m coming from.
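Those figures fall out of the standard Elo expectancy curve, sketched below for the quoted gaps.

```python
def underdog_win_prob(rating_gap: float) -> float:
    """Standard Elo expectancy for the lower-rated side, given the rating gap."""
    return 1 / (1 + 10 ** (rating_gap / 400))

for gap in (120, 415, 800, 1200, 1600):
    print(gap, round(underdog_win_prob(gap), 4))
# 120 -> 0.3339 (the stronger side is roughly a 2:1 favourite)
# 415 -> 0.0840, 800 -> 0.0099, 1200 -> 0.0010, 1600 -> 0.0001
```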

Between short term form and long term class, we’re missing a medium term component that represents roughly a single year of performance. I was originally going to create Poseidon ratings for the leagues but took a simpler approach instead, using points scored per game and points conceded per game over a regular season in lieu.

I then made my simplification much more complicated by doing a linear regression of winning percentage across all leagues compared to points scored per game and a second regression against points conceded per game. This gives a formula that converts the components of for and against into winning percentage, which is in turn converted to an equivalent Elo rating, which is then adjusted per the above. It also allows me to compare points scored per game – as a measure of competitiveness or quality or both? – across different leagues.
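A minimal sketch of that conversion, assuming ordinary least squares for the two regressions, a simple average of the two estimates, and the Elo expectancy curve run in reverse to get an equivalent rating; none of the fitted coefficients from the actual system are reproduced here.

```python
import math
import numpy as np

def fit(x, y):
    """Simple linear regression: returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def win_pct_estimate(pf_pg, pa_pg, for_fit, against_fit):
    """Convert points for/against per game into a winning percentage estimate,
    here by averaging the two regression outputs (an assumption)."""
    est_for = for_fit[0] * pf_pg + for_fit[1]
    est_against = against_fit[0] * pa_pg + against_fit[1]
    return (est_for + est_against) / 2

def equivalent_elo(win_pct, league_avg=1500):
    """Invert the Elo expectancy curve: the rating that wins at win_pct
    against a league-average (1500) opponent."""
    p = min(max(win_pct, 0.001), 0.999)
    return league_avg - 400 * math.log10(1 / p - 1)
```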

[Chart: points scored per game by league – competitiveness or quality?]

This specifically is just trivia but, from an overall analytics perspective, the risk is that if only the top league is analysed, and analysts assume the same principles apply to all leagues, incorrect conclusions will be drawn about the sport.

The ranking is decided by which team has the highest average score across the four rating components, which are given equal weighting. I call it the Global Rugby League Football Club Rankings, or GRLFC for short.

While it’s possible for teams to game a single system, it would be nigh on impossible to game all components, so I feel relatively comfortable that the highest ranked team is the “best”.

That said, form ratings and the for-and-against components only work on regular season results. Class ratings are the only component that takes into account playoff (and Challenge Cup, where applicable) performance. You may think finals footy deserves more weighting but I would put it to you that “the grand final winner is always the best team” and “any rugby league team can win on their day” are two mutually exclusive thoughts and I prefer to believe the latter. If you want to further mull it over, consider that Newtown finished seventh on the ladder in the twelve team NSW Cup in 2019 and then went on to win the Cup and then the State Championship.

Each club (as represented by their combination of name, colours and logo) is only represented once in each year’s rankings, by the version of that club in the highest league. For example, Wentworthville have been in the NSW Cup and the lower tier Ron Massey Cup. To date, Wenty have been represented in the rankings by their state cup team. However, as the Magpies will be replaced in the NSW Cup by Parra reserve grade in 2020, and while this doesn’t change much in reality, they will be henceforth represented in the rankings by their Ron Massey team. This is mostly because it makes the rankings a little more interesting, not having been clogged up by a half dozen clones of the NSWRL clubs.

I would like to have included the Auckland Rugby League’s Fox Memorial comp as a double-A league but it seems to be impossible to find scores. I also would have liked to add more low-A comps, like those in Serbia or the Netherlands or maybe even Nigeria or Kenya, but scores for these comps are even more difficult to find, the results are incomplete, or the comps simply don’t play enough games. As a result, we may never know whether the Otahuhu Leopards are better than the Villeneuve Léopards.

I drove myself mad enough trying to get the results that I did. I don’t feel the need to delve further into district comps in Australia but, who knows, I may well change my mind on that. It would be nice to go further back on some comps, particularly in France and PNG, but we have what we have. A big thanks to rugbyleagueproject.org, leagueunlimited.com and treizemondial.fr for hosting what they do have, because we can’t possibly rely on federations to have curated their own records and history.

A full season of results is required for a club to be ranked. This is only a problem for French clubs, with both Elite 1 and 2 running through their winter while the rankings are nominally calculated as at December 31. A French club’s first part-season is given a provisional place in the rankings, converting to a full ranking the year after, based on the previous twelve months’ worth of results.

The rankings can be seen for 2009 through 2019 here. Your current top seeds in each competition are –

  • NRL (Major): Melbourne Storm (1)
  • Super League (Major): St Helens (5)
  • Championship (AAA): Toronto Wolfpack (29)
  • Queensland Cup (AAA): Sunshine Coast Falcons (30)
  • NSW Cup (AAA): Newtown Jets (40)
  • Ron Massey (AA): St Marys (63)
  • League 1 (AA): Oldham Roughyeds (64)
  • PNG NRLC (AA): Lae Tigers (66)
  • Elite 1 (AA): Albi Tigers (69)
  • Elite 2 (A+): Villegailhenc-Aragon (101)
  • BRL (A+): West Brisbane Panthers (105)
  • USARL (A-): Jacksonville Axemen (109)

Women’s Rankings

In an ideal world, we’d have a women’s ranking to complement the men’s. But the NRLW has only completed 14 games, which is not a sufficient sample although we may see that double in 2020. The QRLW will only commence this year and it remains to be seen what the NSWRL is going to do with their women’s premiership, whether this becomes the equivalent of a Ron Massey Cup to a new NSWRLW/women’s NSW Cup or if, as is usually the case, the Sydney comp will be promoted to be the state comp.

In the more enlightened Europe, the women’s Super League has completed its first season, comprising 14 rounds, and the Elite Feminine has just commenced its second season, the previous being 12 rounds. The bones are there for a women’s club ranking, but it will take time for Australia to catch up a little and make the rankings more balanced. With any luck, I should be able to deliver the first rankings at the end of this year.

The World Club Challenge

International club football is a rare thing, indeed. The ridiculously lopsided 1997 World Club Challenge (Australian clubs scored 2506 points to the Europeans’ 957) largely put paid to the idea that there could be a competition on an equal footing between the two major leagues of football. Other than a short lived World Club Series, which was overly reliant on the charity of big Australian clubs, all that remains of the concept is the World Club Challenge match-up between the winners of the Super League and the NRL.

First held irregularly since 1976 and annually since 2000, the match suffers from the disparity in the quality of the leagues – obviously driven by money – and a lack of interest – largely driven by a lack of promotion and lack of commitment from most Australian clubs. The advantage has ebbed and flowed, generally in favour of the Australian sides but in the late 2000s, the English fought back before being pummelled back into submission more recently.

[Chart: World Club Challenge for and against]

Incidentally, I arrived at a 120 point discount between the NRL and Super League based on Super League clubs’ for and against in the WCC over the last twenty years, applying Pythagorean expectation and then converting the result (approx. 33% win percentage for SL) into Elo rating points.
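As a rough illustration of that calculation (the exact points aggregate and Pythagorean exponent used aren’t reproduced here; the exponent of 2 below is just the classic default):

```python
import math

def pythagorean(points_for: float, points_against: float, exponent: float = 2.0) -> float:
    """Pythagorean expectation: estimated winning percentage from points for and against."""
    return points_for ** exponent / (points_for ** exponent + points_against ** exponent)

def elo_gap(win_pct: float) -> float:
    """The rating gap at which the weaker side wins at win_pct."""
    return -400 * math.log10(1 / win_pct - 1)

print(round(elo_gap(0.33)))  # about -123, i.e. roughly the 120 point discount above
```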

Still, I believe that the WCC should be one of the centrepieces of the season, not unlike an abbreviated World Series or Super Bowl. A match day programme could be filled out by play-offs from the champions of the men’s, women’s and secondary men’s comps – perhaps with the winners of the NRL State Championship and the winner of a play-off of the premiers of the RFL Championship and Elite 1 – in the Pacific and the Atlantic. Such an event could be saleable to broadcasters, sponsors and hosts.

Of course, if successful, the WCC would then undermine the respective competitions’ grand final days, so there’s an obvious conflict of interest. The conflict is difficult to resolve when the stakeholders are more interested in maintaining their own position than making money or securing a commercial future. While cash may be a corrupting influence, the game will not survive as a professional sport without it.

Given the absence of international club fixtures, you could fairly wonder what the applications of this ranking system might be, other than to have a rough guess at whether the Gold Coast Titans are better or worse than the Sunshine Coast Falcons (the answer is: slightly better). My feel is that the final score is a rough proxy for a singular globalised Elo rating system. It may not be a very good one, so I looked back at the last ten WCCs to check.

[Chart: GRLFC rankings vs WCC results]

It successfully predicted the winner in eight of the ten matches (the higher ranked team won) but was not particularly predictive of the gap between the teams (the trendline above shows basically zero correlation) or of the scale of favouritism (favourites won 80% of the time compared to a 65.9% predicted probability). Still, it’s only a sample size of ten games in which the Super League sides have been beaten pretty comprehensively.

In the meantime, this gives the English something to work towards.

Who will win the 2019 NRL Premiership?

At this time of year, is there anything you want to know more than the answer to that question?

For our crystal ball, we turn to Monte Carlo simulations. These simulations work on the principle that if we know the inputs to a complex system and how they relate to each other, then we can test the outcomes of that system using random numbers to simulate different situations.

At its most basic, just imagine if you simulated the outcome of football matches by rolling dice. Numbers one and two might represent a win for the Gold Coast and numbers three through six might be a win for Wests. If you repeat that a couple of thousand times, not only will you be extremely bored but the Gold Coast will “win” about 33% of the time and Wests about 67%.

Now take the same approach for the nine finals games, with the winner advancing per the NRL’s system, but instead of using dice, you generate a random number between zero and one and calculate the win probability using Archimedes (form) Elo ratings. Then repeat it 5,000 times over. The number of times that the Storm or Roosters or Broncos or Eels “win” the premiership across your simulations should give you some insight into the probability of that happening in real life. I call this the Finals Stocky and I present its findings.
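A stripped-down sketch of the idea, for a single knockout match-up rather than the full nine-game finals bracket; the ratings below are placeholders, not actual Archimedes values.

```python
import random

def win_prob(rating_a: float, rating_b: float) -> float:
    """Elo win probability for team A."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def simulate_matchup(rating_a: float, rating_b: float, runs: int = 5000) -> float:
    """Share of simulations in which team A wins the knockout game."""
    wins_a = sum(random.random() < win_prob(rating_a, rating_b) for _ in range(runs))
    return wins_a / runs

# With a 100 point ratings edge, team A should "win" roughly 64% of simulations.
print(simulate_matchup(1600, 1500))
```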


Read more

The Art of Projection

Over the last month, we’ve been looking at rating players using a metric called Production Per Game, or PPG. We’ve used it to find players at the higher end, justifying million dollar salaries, and at the lower end, identifying fringe first graders.

The tricky thing about rating players is determining what information from the past can be used to project the player’s performance into the future. I hope it’s obvious why this might be interesting.

Within a player’s career, there is a noticeable amount of variation from season to season. On average, players get two pips (one pip is .001 of a PPG rating) worse, although the actual range lies somewhere between improving by 96 pips and losing 86 (standard deviation of 24 pips) from season to season.


Read more

NRL Tips – Round 25, 2019

I wrote about Sydney as an obstacle to expansion yesterday.

It started as an intro to this post but ended up being over 1000 words and I thought it should stand alone. It’s the result of thoughts that have been bubbling since I started paying proper attention to rugby league when I started doing this in 2017, clarified somewhat this year by League Digest (you should go listen) and crystallised over the last few weeks by Heartland by Joe Gorman (you should go buy and read it). I’m far from the only one who thinks this way: Nick Campton from the Daily Tele and NRL Boom Rookies touched on very similar themes this very week.

It’s unlikely that any of this is having an impact in the real world, but at least we can all furiously agree with each other.

Here’s some other stuff:

Read more

On Expansion and its relationship with Sydney

Another week, another expansion bone has been tossed to the ravenous dogs that are NRL nerds on social media to endlessly chew over. I say that like I wasn’t in there first and not still gnawing on it. I just can’t help myself.

This week, the target was the south-east Queensland expansion team dropped into the competition in 2007 that hasn’t turned out to be another Broncos, denying the broadcasters an opportunity to have multiple games with one million viewers each week.

In true Australian fashion, instead of considering the historical accidents that have led to this point (i.e. basing the footprint of a supposedly national competition on the demographics of Sydney circa 1908, whose growth has then been fuelled by pokie dollars, or previous south-east Queensland franchises that have failed, undercut by a hostile media and inept management) and attempting to rectify them or improve the presentation of the product so that it might appeal beyond Nine’s core audience of decrepit boomers, an executive contacted a buddy in the extremely accommodating media to have a good old fashioned whinge, Gerry Harvey-style. The consequence was the publication of several of the same think pieces we’ve seen before about why Sydney clubs must be protected at all costs.


Read more
