Exploring data analysis in rugby union

Neil Watson, Statistical Sciences lecturer at the University of Cape Town, provides a series of articles for OptaPro exploring data analysis in rugby union. Within these articles, Neil considers the topic from an academic perspective combined with discussion on real-world application in the professional game.

In this first post, Neil introduces the existing analytical work in this space and discusses how this can influence team strategy.

To date, the majority of (public) analytical work in rugby union has concentrated on simple metrics assessing either individual or team performance. Whilst these are interesting and can provide valuable information, there is little confirmation of their relevance to the professional game and how these methods can be effectively applied.

Many of the key performance indicators (KPIs) identified as useful differentiators between successful and less successful teams in these studies are often referred to as ‘common-sense’ findings in that they do not provide teams with actionable information they can use to improve performance.

An extensive review of existing analytical work identified 66 unique team KPIs that have been reported to differentiate between winning and losing teams. This article asks two questions:

– Which of the 66 team KPIs are still valid differentiators between winning and losing teams?

– Of these KPIs, which are valid differentiators across different competitions?

In other words, are there some KPIs that consistently discriminate between winning and losing teams, irrespective of the competition or the passing of time? Which KPIs should teams closely consider when implementing their strategies? Summarised results for a subset (20 of the 66) of the tested KPIs are shown in the two tables that follow.

Table 1: Distribution of statistically significant KPIs

Table 2: Winning and losing team averages and mean differences

i. 56 of the 66 unique KPIs show statistically significant[i] differences in at least one competition.

ii. Only 6 KPIs were significant across all five competitions. A further 11 were significant across four of the five competitions, 6 across three competitions, 14 across two competitions and 19 in only one competition. When all the competitions were grouped together into one dataset (‘All Data’ column in Table 1), 55 were significant.

The above two points serve to demonstrate that while most of the KPIs considered in the literature do represent areas of play that differentiate between winning and losing teams, the majority of these KPIs are not always reliable. This is an important insight and is something that performance analysts and coaches should bear in mind when presented with analyses of performance data.
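To make the idea of counting "competitions in which a KPI is significant" concrete, here is a minimal sketch. The article does not state which statistical test was used, so the paired sign-flip permutation test below is an assumption, as are the function names and the made-up per-match values.

```python
import random

def paired_permutation_p_value(winner_values, loser_values, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on per-match (winner - loser) differences."""
    diffs = [w - l for w, l in zip(winner_values, loser_values)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the winner/loser labels are exchangeable,
        # so each match difference is equally likely to carry either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_resamples + 1)

def count_significant_competitions(kpi_by_competition, alpha=0.05):
    """Count the competitions in which the KPI difference has p < alpha."""
    return sum(
        paired_permutation_p_value(winners, losers) < alpha
        for winners, losers in kpi_by_competition.values()
    )

# Hypothetical per-match KPI values (winning team, losing team); not real data.
example = {
    "Competition A": ([5, 4, 6, 3, 5, 4, 6, 5, 4, 5], [2, 1, 3, 2, 1, 2, 3, 1, 2, 2]),
    "Competition B": ([3, 2, 4, 3, 2, 3], [3, 3, 2, 4, 2, 3]),
}
print(count_significant_competitions(example))
```

With these invented numbers only the first competition is flagged, which mirrors the situation described above: a KPI can separate winners from losers in one competition and fail to do so in another.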

Points of interest

The vast majority (85%) of KPIs found to be significant represent differences that are either negligible or small. This would appear to support the notion that often there is little that separates winning and losing teams across the different facets of play.
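The article does not say how the size of a difference was measured. One common choice is Cohen's d with its conventional cut-offs, and the sketch below assumes that choice purely for illustration; the function names, thresholds and sample values are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(winner_values, loser_values):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_w, n_l = len(winner_values), len(loser_values)
    pooled_var = (
        (n_w - 1) * variance(winner_values) + (n_l - 1) * variance(loser_values)
    ) / (n_w + n_l - 2)
    return (mean(winner_values) - mean(loser_values)) / sqrt(pooled_var)

def effect_label(d):
    """Label an effect using Cohen's conventional cut-offs (assumed, not from the article)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# Hypothetical metres-per-carry values for winning and losing teams; not real data.
winners = [3.1, 3.4, 2.9, 3.6, 3.2]
losers = [3.0, 3.3, 2.8, 3.5, 3.1]
print(effect_label(cohens_d(winners, losers)))
```

A statistically significant KPI can still land in the "negligible" or "small" band, which is why significance and effect size are worth reporting side by side.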

Those KPIs that display medium-to-large differences between winning and losing teams revolve around scoring tries, holding possession, territorial dominance, and creating and finishing scoring opportunities when entering the opposition 22m.

– Most of these KPIs would fall into the category of ‘common sense’ knowledge. It is widely accepted that every game strategy involves a team placing itself in a good territorial position to create scoring opportunities. These KPIs confirm this hypothesis – winning teams enter and gain possession in their opponent’s 22m area more often, and this, combined with a greater ability to convert scoring opportunities, results in them accumulating more points.

– It is also important to note that many of these KPIs are correlated with one another (e.g. all the points scored KPIs are related to tries scored). This should be taken into account when interpreting these results as a whole.
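The point about correlated KPIs can be checked directly. The sketch below computes a Pearson correlation between two KPIs across matches; the helper name and the per-match numbers are hypothetical and only illustrate the calculation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-match values for two KPIs; not real data.
tries_scored = [1, 2, 3, 4, 5]
points_scored = [10, 17, 24, 27, 38]
print(round(pearson_r(tries_scored, points_scored), 2))
```

When two KPIs are this strongly related, finding both of them significant is closer to one piece of evidence than two.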

I think that these results serve to strengthen the ‘common-sense’ hypothesis put forward at the beginning of the blog. Examining the 20 KPIs here, there isn’t anything unusual. For example, one would generally expect winning teams to concede fewer turnovers and make more metres per carry on average.

Two KPIs worth discussing are % possession and Kicks out of hand. Whilst there is a great deal of debate surrounding the most effective game plan, the evidence here does support the theory that winning teams kick more and enjoy greater possession than losing teams. If we interpret the combination of these two KPIs along with the rest that have medium-to-large differences, they point to winning teams having a more effective kicking game in general. Winning teams execute kicks that either gain territory or are contestable, enabling them to regain possession more regularly (the Kicks where possession was regained KPI is not shown above but was significant in both the Six Nations and Super Rugby).

Another important area of team performance is the ability to convert scoring opportunities into points. Teams with a high conversion rate are often described as being ‘clinical in their finishing’, and coaches frequently refer to this ability in their post-match comments. If we consider the two KPIs Points scored when possession starts inside opposition’s 22m area and Points scored when possession started outside opposition’s 22m area, we see that winning teams averaged twice as many points as losing teams in each case. This is consistent with winning teams spending more time in the opposition’s 22m area (see Times opposition 22m area was entered and Times possession began inside opposition’s 22m area) and suggests that being clinical in converting scoring opportunities when they arise is a key differentiator between winning and losing teams.
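One way to separate "getting into the 22m more often" from "being clinical once there" is to divide points by entries. This ratio is not a KPI reported in the article; the function name and the figures below are hypothetical and serve only as a sketch.

```python
def points_per_entry(points_from_22m_possessions, entries_into_22m):
    """Average points scored per entry into the opposition 22m (0.0 if no entries)."""
    if entries_into_22m == 0:
        return 0.0
    return points_from_22m_possessions / entries_into_22m

# Hypothetical match totals; not real data.
winner_rate = points_per_entry(21, 10)
loser_rate = points_per_entry(10, 8)
print(winner_rate, loser_rate)
```

A team can score more points simply by entering the 22m more often; a higher points-per-entry figure is what would indicate more clinical finishing.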

What next?

Neil’s next blog will focus on comparing the teams who finished at the top/won these competitions with those who finished at the bottom in order to gain further tactical insight into those areas of play that are important in determining successful performance.

[i] Here statistically significant means that a difference at least as large as the one observed would be expected less than 5% of the time if there were truly no difference between winning and losing teams.

[ii] Due to the small number of games in the Rugby Championship, fewer KPIs showed statistically significant differences.
