
From the Pro Forum Judging Panel: Five Potential Abstract Ideas


With just over a month to go until the deadline for Pro Forum submissions, competition judge Devin Pleuler shares his thoughts on potential areas of research which could be considered by data scientists and analysts in the open submission categories.


By: Stats Perform

If you are reading this, then it is very likely you are interested in submitting a research proposal for the 2022 Pro Forum.

As you consider the research area you want to focus on, I thought it may be useful to share a few thoughts on what may catch the eye of the judges when it comes to evaluating abstracts.

Here are five half-baked ideas that have been knocking around in my head for a while, which I would be excited to see someone chew on a bit more.

1. Synthetic Event Data

Using event data for training, is it possible to build a realistic game simulator that produces synthetic event data that is indistinguishable from the real thing?

If so, this opens the gateway for longitudinal simulations. For example, you could tweak starting conditions and measure the potential impact across multiple seasons.
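As a point of reference for how far a realistic simulator has to go, here is a deliberately naive baseline: a first-order Markov chain over event types, fitted on a handful of made-up possessions and then sampled. The event names, the toy sequences and the function names are all assumptions for illustration. Real event data also carries locations, timestamps, players and outcomes, none of which this sketch models.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy possessions (sequences of event types), invented for
# illustration only. They are not taken from any real data feed.
possessions = [
    ["pass", "pass", "carry", "shot"],
    ["pass", "carry", "pass", "turnover"],
    ["pass", "pass", "pass", "turnover"],
    ["carry", "pass", "shot"],
]

START, END = "<start>", "<end>"


def fit_transitions(sequences):
    """Count first-order transitions between consecutive event types."""
    counts = defaultdict(Counter)
    for seq in sequences:
        path = [START, *seq, END]
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return counts


def sample_possession(counts, rng, max_events=50):
    """Sample one synthetic possession by walking the transition counts."""
    events, state = [], START
    while len(events) < max_events:
        options = counts[state]
        # Draw the next event in proportion to how often it followed `state`.
        state = rng.choices(list(options), weights=list(options.values()))[0]
        if state == END:
            break
        events.append(state)
    return events


counts = fit_transitions(possessions)
rng = random.Random(0)  # fixed seed so runs are repeatable
synthetic = [sample_possession(counts, rng) for _ in range(5)]
for seq in synthetic:
    print(" -> ".join(seq))
```

A baseline like this is trivially distinguishable from real matches, which is the point: any proposed simulator needs a validation step showing it does meaningfully better.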

2. Player Parameterisation of PCF

Pitch control surfaces have become one of the foundational methods for analysing full tracking data, as they elegantly capture important tactical context in a visually interpretable fashion.

However, most pitch control function (PCF) implementations assume that all players are equal, with the same top speed and ability to accelerate. Is there a way to build and validate a parameterised pitch control function that takes into account individual physical characteristics?
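To make the idea concrete, here is a minimal sketch of one way per-player parameters could enter a pitch control calculation. It is a simplification under stated assumptions, not any published model: each player starts from rest, accelerates at a constant rate up to a personal top speed, and control of a location is a logistic function of the difference in arrival times. The `Player` class, the `sharpness` parameter and the example values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Player:
    x: float          # position in metres
    y: float
    max_speed: float  # m/s, individual top speed (assumed known)
    accel: float      # m/s^2, individual acceleration (assumed constant)


def time_to_reach(p, tx, ty):
    """Arrival time at (tx, ty), starting from rest (a simplifying assumption)."""
    d = math.hypot(tx - p.x, ty - p.y)
    d_acc = p.max_speed ** 2 / (2 * p.accel)  # distance covered while accelerating
    if d <= d_acc:
        return math.sqrt(2 * d / p.accel)     # never reaches top speed
    return p.max_speed / p.accel + (d - d_acc) / p.max_speed


def control_probability(home, away, tx, ty, sharpness=1.0):
    """Probability the home team controls (tx, ty), from the fastest arrivals."""
    t_home = min(time_to_reach(p, tx, ty) for p in home)
    t_away = min(time_to_reach(p, tx, ty) for p in away)
    return 1.0 / (1.0 + math.exp(-sharpness * (t_away - t_home)))


# Two players equidistant from the target; the home player is quicker.
home = [Player(0.0, 0.0, max_speed=9.0, accel=5.0)]
away = [Player(40.0, 0.0, max_speed=7.5, accel=4.0)]
print(round(control_probability(home, away, 20.0, 0.0), 3))
```

With identical physical parameters the equidistant point is a coin flip. Once the parameters differ, the surface shifts, and validating that shift against what actually happens on the pitch is the hard part.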

3. Body Position Inferencing

Not knowing the orientation of on-ball players is a critical gap in performance evaluation. Players who find their teammates while facing forward are typically more effective than those who only pass the ball to their teammates with their back to goal.

Can a model be built and validated that estimates which direction a player is facing when they receive the ball? Is it possible to build remotely reliable heuristics for this based on event data?
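As an illustration of what an event-data heuristic might look like, the sketch below guesses a receiver's facing direction by blending two bearings: towards where the pass came from, and towards where the receiver's next action ends. The quicker the next action follows, the more weight it gets, on the assumption that the player had no time to turn. This is an unvalidated, hypothetical heuristic. The function names, the `turn_time` parameter and the convention that the team attacks towards increasing x are all assumptions.

```python
import math


def bearing(from_xy, to_xy):
    """Angle in radians of the vector from one point to another."""
    return math.atan2(to_xy[1] - from_xy[1], to_xy[0] - from_xy[0])


def estimate_facing(pass_origin, reception, next_action_end=None,
                    seconds_to_next_action=None, turn_time=1.0):
    """Hypothetical estimate of facing direction at the moment of reception."""
    toward_ball = bearing(reception, pass_origin)
    if next_action_end is None or seconds_to_next_action is None:
        return toward_ball  # fall back to "facing the incoming ball"
    toward_next = bearing(reception, next_action_end)
    # Weight on the next action: 1 if it is immediate, 0 after `turn_time` seconds.
    w = max(0.0, 1.0 - seconds_to_next_action / turn_time)
    # Weighted circular mean of the two bearings.
    x = (1 - w) * math.cos(toward_ball) + w * math.cos(toward_next)
    y = (1 - w) * math.sin(toward_ball) + w * math.sin(toward_next)
    return math.atan2(y, x)


def facing_forward(angle):
    """True if the angle points towards increasing x (assumed attacking direction)."""
    return math.cos(angle) > 0


# Ball arrives from behind the receiver, who plays a first-time pass upfield.
angle = estimate_facing((30.0, 34.0), (50.0, 34.0),
                        next_action_end=(65.0, 40.0), seconds_to_next_action=0.2)
print(round(math.degrees(angle), 1), facing_forward(angle))
```

Any heuristic of this kind would need checking against ground truth, for example orientation labels derived from video or tracking data, before its output could be trusted.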

4. Efficiently Storing Value Surfaces

This one sits more at the data engineering end of the soccer analytics spectrum. A 25 hertz sampling rate across 90 minutes for 22 players gives you roughly three million player coordinates per game. This is already pretty difficult to store in its raw form.

If you wanted to store a value surface, like pitch control evaluated in square-meter bins, this value explodes to around 1.3 billion data points per game. That’s an increase of more than 400x. Are there efficient ways of storing this data? Perhaps using an auto-encoder to find a representation vector at a lower dimensionality?
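The back-of-the-envelope arithmetic behind these figures can be reproduced in a few lines. The pitch dimensions are assumptions, since none are stated above: a 105 x 68 grid of one-metre bins gives a little under one billion values per game, while a 120 x 80 grid (for instance, one-yard bins on a large pitch) lands close to the 1.3 billion quoted.

```python
SAMPLE_RATE_HZ = 25
MATCH_SECONDS = 90 * 60
PLAYERS = 22

frames = SAMPLE_RATE_HZ * MATCH_SECONDS   # 135,000 frames per game
player_coordinates = frames * PLAYERS     # 2,970,000 player positions


def surface_values(cells_long, cells_wide):
    """Values needed to store one surface value per grid cell per frame."""
    return frames * cells_long * cells_wide


metre_grid = surface_values(105, 68)   # assumed 105 x 68 grid: 963,900,000
larger_grid = surface_values(120, 80)  # assumed 120 x 80 grid: 1,296,000,000

print(f"player coordinates: {player_coordinates:,}")
print(f"105 x 68 surface:   {metre_grid:,}")
print(f"120 x 80 surface:   {larger_grid:,}")
print(f"ratio (120 x 80):   {larger_grid / player_coordinates:.0f}x")
```

Under either grid assumption the surface is hundreds of times larger than the raw player coordinates, which is what motivates looking for a compressed representation.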

5. Player Health Benefits of Possession

In general, the greater the physical load put on a player during a match, the more likely they are to miss time through squad rotation or injury. Additionally, it seems that players are required to cover more distance when they are trying to regain possession.

Are these assumptions true? And is it possible to quantify the overall player health benefits of certain possession-positive styles of play?

It is worth noting that these ideas lack any real technical detail, so copying and pasting them into a proposal isn’t going to get you very far.

Having a robust experimental design and methodology is really what sets good Forum abstracts apart from the rest.

Devin Pleuler is Director of Analytics at Toronto FC. He is one of five judges who will be reviewing each Pro Forum proposal submitted ahead of the 29th November deadline.

Click here for full submission information, including details of each proposal category, the judging criteria and data samples available.