Telling Better Stories through Data

By: Andy Cooper

As the saying goes, “Numbers don’t lie,” right? It turns out that phrase isn’t wholly true, especially when those numbers are telling a story concerning sports.

Take the 2014 FIFA World Cup match between Germany and Brazil. While Germany took home the win, Brazil far outstripped them in total shots, shots on target, saves, dangerous attacks and deliveries into the penalty area – all typical indicators of a high win probability. Brazil’s likelihood of winning appeared greater because those numbers are missing one key piece: context.

The crucial part of this match’s story is told in the context of those numbers. STATS has crafted the Expected Goal Value (EGV) to look at the bigger picture of what happened on the field and why this match ultimately turned out the way it did. By reviewing nearly 10,000 shots from across recent soccer seasons, and analyzing the 10 seconds before each shot, STATS created an algorithm that can break down the large chunks of data – like total shots, saves and passes – into context-specific classifiers.

These classifiers allow for a detailed understanding of what happened. Brazil took more total shots than Germany, but how close were Germany’s defenders to Brazil’s shooter? Where on the field was the shooter? Did the shot come from open play or a counter-attack? All of these factors need to be taken into consideration to give an accurate picture of the match.
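To make the idea concrete, here is a minimal sketch of how weighting each shot by its context can reverse the picture painted by raw shot counts. It is not STATS’ algorithm: the `Shot` fields, the logistic form and every weight below are hypothetical, chosen only for illustration, and the two teams’ shots are made-up examples rather than data from any real match.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    distance_m: float      # distance from the shooter to the goal, in metres
    defender_gap_m: float  # distance to the nearest defender, in metres
    counter_attack: bool   # True if the shot came from a counter-attack


# Hypothetical weights for illustration only; a real model would be
# fitted to thousands of historical shots.
INTERCEPT = -1.0
W_DISTANCE = -0.12   # farther from goal -> lower chance of scoring
W_GAP = 0.25         # more space from defenders -> higher chance
W_COUNTER = 0.6      # counter-attacks assumed to be more dangerous


def shot_value(shot: Shot) -> float:
    """Return an illustrative scoring probability for one shot."""
    z = (INTERCEPT
         + W_DISTANCE * shot.distance_m
         + W_GAP * shot.defender_gap_m
         + W_COUNTER * shot.counter_attack)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def expected_goals(shots: list[Shot]) -> float:
    """Sum the per-shot values to get a team's context-weighted total."""
    return sum(shot_value(s) for s in shots)


# Made-up example: Team A shoots often, but from distance and under pressure.
team_a = [Shot(distance_m=28, defender_gap_m=1.0, counter_attack=False)] * 6
# Team B shoots half as often, but from close range with space on the break.
team_b = [Shot(distance_m=9, defender_gap_m=4.0, counter_attack=True)] * 3

print("Team A shots:", len(team_a), "weighted value:", expected_goals(team_a))
print("Team B shots:", len(team_b), "weighted value:", expected_goals(team_b))
```

With these assumed weights, Team B’s three shots add up to a higher total than Team A’s six, which is the kind of reversal that raw totals alone cannot show.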

The EGV isn’t only valuable in understanding what has happened; it can also be used to determine more realistic probabilities for future matches and to gauge the effectiveness of teams. With these contextualized data points mapped into clusters, STATS can drill down and create formations that model team behavior far more accurately. The goal is to find the formative data, which yields insights such as how teams have interacted over time, which formations a team uses most in home or away games, and whether “home-field advantage” is actually real (it is!).

With context in place and an algorithm that addresses data points on multiple levels, STATS is proud to have developed a more accurate predictor of the success of teams. You can read more about the EGV in our paper HERE, or watch our webinar HERE.