Using Network Science to Quantify the Identifiability of Football Teams

Javier Buldú and David Garrido presented a poster at the 2020 OptaPro Forum introducing a Network Science approach to quantifying how well a team maintains its playing style over the course of a season, based on the persistence of its passing patterns in Opta data.

In this guest blog they outline the methodology behind their presentation, together with a summary of the key findings.


Introducing Football Teams as Complex Systems

“A complex system is a system composed of many components which may interact with each other. Complex systems are systems whose behaviour is intrinsically difficult to model due to the dependencies, competitions, relationships, or other types of interactions between their parts or between a given system and its environment”

The brain, the Earth’s climate and ecosystems are obvious examples of complex systems. We would argue that football has a strong case for being categorised in this way as well.

Why?

Well, during a football match twenty-two components, called players, interact with each other in a complex way, creating dependencies, competing and, more importantly, generating emergent properties such as “playing patterns”. For these reasons, complexity sciences are a viable alternative for analysing football datasets, introducing new perspectives on the analysis of the beautiful game.

This follows from the complex nature of football, which, paraphrasing the foundational paradigm of complexity sciences, “cannot be analysed by looking at its components individually (i.e., players) but, on the contrary, considering the system as a whole”. Even the most successful player in a match recognises that “it’s not just me, it’s the team.”

Translating team activity into a complex network is one of the many approaches based on complexity sciences. The organisation of a team can be analysed considering the interaction between its players through passes. We can construct passing networks, which contain information about how the ball has been moved, from player to player, during the whole match.

Passing networks are “complex networks” for two main reasons:

1. They are composed of nodes (players) and links (passes) between them;

2. The interplay between nodes follows certain “complex” rules.

Furthermore, these networks are not easy to analyse, given that they are directed (i.e., links between players have a certain direction), weighted (the weight of each link is the number of passes between the two players), spatially embedded (i.e., the Euclidean position of the ball and players is highly relevant) and time evolving (i.e., the network continuously changes its structure).
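As an illustration of the directed and weighted properties described above, here is a minimal Python sketch that builds a player passing network from a list of passes. The pass list, the function names and the use of node strength (passes made plus passes received) as a proxy for a player's importance are all assumptions made for this example; they are not taken from the Opta data or from the exact measure behind Figure 1.

```python
from collections import Counter

def build_passing_network(passes):
    """Return directed, weighted links: {(passer, receiver): number_of_passes}."""
    return Counter((passer, receiver) for passer, receiver in passes)

def node_strength(links):
    """Passes made plus passes received per player (an assumed importance proxy)."""
    strength = Counter()
    for (passer, receiver), weight in links.items():
        strength[passer] += weight
        strength[receiver] += weight
    return strength

# Hypothetical pass list, for illustration only (not real match data).
passes = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")]
links = build_passing_network(passes)
strength = node_strength(links)
```

Because the links are keyed by (passer, receiver), the link from A to B and the link from B to A are stored separately, which is what makes the network directed.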

Figure 1 below shows an example of the Real Madrid passing network during a match against Barcelona from the 2017/18 season. In the plot, player sizes are proportional to their importance in the passing network. Players are placed in the average position from where their passes were made. The widths of the links are proportionate to the number of passes between two players. Finally, substitutions are highlighted in green.

From this graphic, we can rapidly get an idea of how Real Madrid played, how they occupied the field, how their players interacted with each other, and how the organisation changed following substitutions.

Figure 1: A schematic illustration of a Passing Network, Real Madrid vs Barcelona, 2017/2018 season.

This is just a snapshot of how translating team activity into a network can help us to understand team organisation. A diversity of metrics can be extracted from the network structure at different spatial and temporal scales, leading to a better comprehension of how a team is organised and how players contribute to the team’s performance (Buldú et al., 2018).

This is a task for Network Science, the branch of complexity sciences that analyses network structures and dynamics. Network Science is yet to be fully adapted for data analysis in football, but has the potential to provide new perspectives on performance in years to come.

Pitch Passing Networks

There are other ways of creating passing networks. If we are more concerned with the spatial organisation of a team than with the role of the players, we can construct and analyse pitch passing networks. In this instance, the nodes of the network are not players but regions of the field, which are connected through passes made by the players occupying them.

Figure 2 shows examples of Barcelona’s pitch passing networks against Real Madrid.

Figure 2: Plots, from L to R, are the 3×3, 5×6 and 10×10 passing networks for Barcelona, where nodes are regions of the pitch and links account for the number of passes between them.

Why do we plot three networks instead of one? The reason is that the pitch can be divided into areas of different sizes, leading to pitch networks of different scales. In this way, the three networks of Figure 2 correspond to the same team during the same match and the only difference is the size of the partitions. However, note that the structure of the network is different depending on the number of divisions, indicating that analysis of the network properties at different scales is required.
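The construction of pitch passing networks at several scales can be sketched as follows. This is a minimal example under stated assumptions: pass coordinates are taken to lie on a 0 to 100 scale in both directions, n is taken as the number of divisions along the length of the pitch and m along its width, and the passes themselves are made up. The function names are hypothetical.

```python
import numpy as np

def region_index(x, y, n, m, length=100.0, width=100.0):
    """Map a pitch position to a region index in an n x m grid (row-major)."""
    col = min(int(x / length * n), n - 1)  # n divisions along the pitch length
    row = min(int(y / width * m), m - 1)   # m divisions along the pitch width
    return row * n + col

def pitch_passing_network(passes, n, m):
    """Connectivity matrix A where A[i, j] counts passes from region i to region j."""
    A = np.zeros((n * m, n * m), dtype=int)
    for x0, y0, x1, y1 in passes:
        A[region_index(x0, y0, n, m), region_index(x1, y1, n, m)] += 1
    return A

# Hypothetical passes as (start_x, start_y, end_x, end_y), for illustration only.
passes = [(10, 10, 40, 50), (40, 50, 90, 90), (10, 10, 45, 55), (100, 100, 0, 0)]

# The same passes lead to networks of different sizes at different scales.
networks = {(n, m): pitch_passing_network(passes, n, m)
            for n, m in [(3, 3), (5, 6), (10, 10)]}
```

Every matrix contains the same total number of passes; only the way they are distributed between nodes changes with the partition, which is why each scale has to be analysed separately.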

The Identifiability of Football Teams

Now that we have defined the framework, it is time for the questions:

Is it possible to quantify to what extent a team has a defined playing style?
Which teams adapt to their opposition and which teams remain loyal to their style?
Can we quantify which teams impose their style on an opponent during a match?
Which teams behave differently when playing away?

To answer all these questions, we applied Network Science to analyse the organisation of pitch passing networks.

Using the event data from each match, we constructed the multi-scale passing networks associated with each team and analysed their structures using different Network Science methodologies.

In this way, we were able to identify:

Which teams imposed their playing style over their opponents;
What to expect from their opponents before a match;
How to evaluate whether a team played in line with what was expected.

Our only source of information was the way each team passed the ball, disregarding the number of shots, goals, tackles, dribbles or any other action. However, as we will see, passing patterns are still able to capture the essence of a team’s organisation.

We divided the pitch into n × m regions (with n = 1, 2, 3, …, 10 and m = 1, 2, 3, …, 10) and constructed the pitch passing networks, where nodes corresponded to the N = n × m regions of the pitch and a_ij accounted for the number of passes from region i to region j.

We analysed, across the whole season, the properties of the resulting connectivity matrices A = {a_ij} at different spatial scales. The elements of the connectivity matrices are the number of passes between the regions of the pitch, i.e., the mathematical abstraction of the passing patterns of each team. We calculated the consistency parameter (C) of each team by quantifying how similar the connectivity matrices of a given team were to each other during the season.

In short, teams with a high consistency maintained the structure of their passing networks throughout the season, while teams with a low consistency changed their organisation from match to match.

Next, we quantified how unique the pitch passing networks of each team were. This can be done by comparing the structure of the passing networks of a given team with those of the rest of the teams in the competition. We call this parameter the rival similarity (R). Finally, we defined the identifiability parameter (I) of a team as the consistency parameter C minus the rival similarity R, i.e., I = C - R.

Teams with a high identifiability parameter are those who are consistent and, at the same time, different from the rest.
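The relationship between C, R and I can be illustrated with a short Python sketch. The way two connectivity matrices are compared is not specified in this post, so the example assumes one simple choice, the Pearson correlation between matrix entries, and averages it over pairs of matches. The team names, the tiny 2-region matrices and the function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def similarity(A, B):
    """Pearson correlation between matrix entries (an assumed similarity measure)."""
    return float(np.corrcoef(A.ravel(), B.ravel())[0, 1])

def consistency(team, season):
    """C: mean similarity between every pair of the team's own matches."""
    pairs = list(combinations(season[team], 2))
    return sum(similarity(A, B) for A, B in pairs) / len(pairs)

def rival_similarity(team, season):
    """R: mean similarity between the team's matches and all other teams' matches."""
    scores = [similarity(A, B)
              for rival, matrices in season.items() if rival != team
              for A in season[team] for B in matrices]
    return sum(scores) / len(scores)

def identifiability(team, season):
    """I = C - R."""
    return consistency(team, season) - rival_similarity(team, season)

# Hypothetical connectivity matrices: two made-up teams, two matches each.
season = {
    "Team X": [np.array([[0, 9], [1, 0]]), np.array([[0, 8], [2, 0]])],
    "Team Y": [np.array([[5, 0], [0, 5]]), np.array([[0, 5], [5, 0]])],
}
scores = {team: identifiability(team, season) for team in season}
```

In this toy season, Team X repeats almost the same pattern in both matches and so scores higher than Team Y, whose two matches are organised in opposite ways. In practice the calculation would be repeated for each pitch partition, since identifiability depends on the scale.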

Our methodology has both descriptive and prospective applications. On the descriptive side, we were able to identify which teams maintained their playing style (“high identifiability”) throughout the season and those that, on the contrary, did not have a consistent style (“low identifiability”).

In collaboration with LaLiga, we computed the identifiability parameter of the 2017/18 Spanish top-flight teams.

Figure 3 shows the values of the identifiability of Barcelona and Málaga, the teams who finished top and bottom of the table respectively. On the horizontal axis, we have plotted the number of nodes into which the pitch is divided since, as we explained, all scales must be analysed. Interestingly, we observed that pitch divisions of around 50 areas (nodes) were the ones that best identified the playing style of Barcelona. For Málaga, identifiability was rather low at all scales.

Figure 3: The graphic on the left plots the identifiability parameter of Barcelona, based on the number of divisions (nodes) of the pitch. On the right, the same analysis is displayed for Málaga.

Application

Crucially, this information can help coaching teams prepare for a match through identifying the expected approach of their opponents.

For example, a team can evaluate the identifiability of its next opponent and decide whether to adapt its own approach to the opposition’s playing style (when the opposing team has a high identifiability) or to try to impose its own style (when facing a team with a low identifiability).

We can also use identifiability to quantify, for every single match, which team played most similarly to their own style.

Table 1 shows the match-by-match identifiability difference between home and away teams during the 2017/2018 LaLiga season.

The matches where the home team, listed on the vertical axis, imposed its own playing style (i.e., had a higher identifiability) are highlighted in yellow, while the green cells correspond to away teams imposing their styles. The teams have been ordered based on the final league standings, with the aim of showing the connection between identifiability and the performance of a team. The yellow cells mainly appear above the diagonal of the matrix, indicating that when two teams play, the one ranked in a higher position has a higher probability of imposing its own playing style.

If we highlight individual teams, we can see that Barcelona won the “identifiability contest” in the most matches, both at home and away, followed closely by Real Madrid. However, it is worth stressing that a difference in identifiability is not always an indicator of the match result, since there are some matches where identifiability was higher for the team that lost. This is the nature of football, where playing your way does not always guarantee success.

Table 1: Home teams are listed on the vertical axis, arranged by their final league position. Away teams are on the horizontal axis, arranged in the same way. The match result is displayed inside each cell. In yellow, the home team imposed its style; in green, the visiting team. Cells in blue correspond to matches where there was no clear difference between the identifiability of the two teams.
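The colour coding of Table 1 can be read as a simple classification of the identifiability difference between the home and away team. The sketch below illustrates that reading; the tolerance used to decide that there is “no clear difference”, the identifiability values and the team names are hypothetical and not taken from the LaLiga analysis.

```python
def label_match(home_identifiability, away_identifiability, tolerance=0.05):
    """Classify one fixture by the identifiability difference (home minus away)."""
    difference = home_identifiability - away_identifiability
    if difference > tolerance:
        return "home"   # yellow cell: the home team imposed its style
    if difference < -tolerance:
        return "away"   # green cell: the visiting team imposed its style
    return "none"       # blue cell: no clear difference

# Hypothetical per-match identifiability values as (home, away).
fixtures = {
    ("Team X", "Team Y"): (0.40, 0.10),
    ("Team Y", "Team X"): (0.12, 0.35),
    ("Team X", "Team Z"): (0.20, 0.18),
}
labels = {fixture: label_match(home, away)
          for fixture, (home, away) in fixtures.items()}
```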

To conclude, it is also worth highlighting that it is possible to obtain a real-time estimation of the identifiability parameter as a game is taking place, highlighting when a team, or their opponent, is playing as expected. This is valuable information, which could potentially inform key in-game decision making from the bench.

Further applications of this methodology would also allow analysts to evaluate which teams behave differently when playing at home or away, or to identify those regions of the pitch where deviations from a team’s expected passing patterns occur during a game.
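One way to picture the real-time estimation mentioned above is to accumulate a team's passes as the match progresses and compare the growing network with a reference network for that team. The sketch below is only an illustration under stated assumptions: the reference matrix stands in for the team's expected pattern (for example a season average), the comparison again uses the Pearson correlation between matrix entries, and the passes, checkpoints and function name are hypothetical.

```python
import numpy as np

def running_similarity(timed_passes, reference, checkpoints):
    """Similarity between the cumulative in-match network and a reference network.

    timed_passes: iterable of (minute, from_region, to_region), sorted by minute.
    reference: the team's expected connectivity matrix (assumed non-constant).
    """
    current = np.zeros_like(reference, dtype=float)
    results, passes = {}, iter(timed_passes)
    pending = next(passes, None)
    for minute in sorted(checkpoints):
        # Add every pass played up to and including this checkpoint.
        while pending is not None and pending[0] <= minute:
            current[pending[1], pending[2]] += 1
            pending = next(passes, None)
        if current.std() > 0:  # correlation is undefined for a constant matrix
            results[minute] = float(
                np.corrcoef(current.ravel(), reference.ravel())[0, 1])
    return results

# Hypothetical 2-region reference network and in-match passes.
reference = np.array([[0, 6], [2, 0]])
timed_passes = [(3, 0, 1), (10, 0, 1), (20, 1, 0), (50, 0, 1)]
results = running_similarity(timed_passes, reference, checkpoints=[15, 45, 90])
```

A value that rises towards 1 over the checkpoints suggests the team is settling into its expected pattern, while a falling or persistently low value suggests it is being pulled away from it.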

References

Buldú, J. M., Busquets, J., Martínez, J. H., Herrera-Diestra, J. L., Echegoyen, I., Galeano, J., & Luque, J. (2018). Using network science to analyse football passing networks: dynamics, space, time and the multilayer nature of the game. Frontiers in Psychology, 9, 1900.

Buldú, J. M., Busquets, J., Echegoyen, I., & Seirul.lo, F. (2019). Defining a historic football team: Using Network Science to analyze Guardiola’s FC Barcelona. Scientific Reports, 9(1), 1-14.

Possessing a PhD in Applied Physics, Javier Buldú is the coordinator of the Complex Systems Group at the King Juan Carlos University in Madrid, as well as being the Principal Investigator of the Laboratory of Biological Networks at the Center for Biomedical Technology.

He can be contacted via email at: javier.buldu@urjc.es

David Garrido is a PhD student, studying at the Center for Biomedical Technology & King Juan Carlos University, Madrid, Spain.
