For part one please see our story here.
What can the Language Models and Transformers behind ChatGPT do in sport?
Beyond chatbots, is there a place for language models and Generative Artificial Intelligence in sport?
The answer is a resounding yes. In our opinion, it is the next revolution in sports analytics, one that will give teams and fans a deeper understanding of the game than ever before, especially in soccer.
To start, we have to specify the language of sport. For text, which is the required input for ChatGPT, the language already exists. As we’ve seen, the ChatGPT language model utilizes raw text data, which has its own inherent structure (think letters, words, sentences, paragraphs). The transformer architecture learns the statistics and correlations of that text by predicting missing words from their context within the sentence, the paragraph and the overall story narrative.
Sports data has a different inherent language structure. In soccer, for example, it is 11 vs 11 (or, most of the time, 10 outfield players vs 10 outfield players). Each of these players has a role, which can evolve or change during a game.
How each player’s information is represented within the team during a game is extremely important. We could say that each player is a letter: instead of using a name or jersey number, we utilize their in-match location and statistics, as well as their recent and long-term playing statistics. Each on-field event could then be described as a word, and the letters have to be correctly ordered for us to understand that word.
Each play could be a sentence, and each possession could be seen as a paragraph. The chapter could be the match, and the season could be seen as a book. Beyond creating the right structure (or grammar), there is the input itself to consider: sports data is both spatiotemporal (i.e., the x’s and o’s of players’ positions) and event-based (i.e., type of event and outcome of possessions), which requires significant pre-processing. So we might think of sports data as a movie rather than a book, because the input is multimodal rather than a single input source.
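The letter-to-chapter hierarchy described above can be sketched as nested data structures. This is only an illustration, not Stats Perform’s actual schema: every class name and field below (`PlayerState`, `Event`, `Play`, `Possession`, `Match` and their attributes) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PlayerState:
    """A 'letter': one player's state at one moment (fields are illustrative)."""
    team: str
    xy: Tuple[float, float]        # pitch location in metres
    velocity: Tuple[float, float]  # metres per second


@dataclass
class Event:
    """A 'word': one on-field event plus the ordered players around it."""
    kind: str     # e.g. "pass", "shot"
    outcome: str  # e.g. "complete", "goal"
    players: List[PlayerState] = field(default_factory=list)


@dataclass
class Play:
    """A 'sentence': a short run of consecutive events."""
    events: List[Event] = field(default_factory=list)


@dataclass
class Possession:
    """A 'paragraph': all plays in one spell of possession."""
    team: str
    plays: List[Play] = field(default_factory=list)


@dataclass
class Match:
    """A 'chapter': every possession in one game."""
    possessions: List[Possession] = field(default_factory=list)

    def event_kinds(self) -> List[str]:
        """Flatten the hierarchy into an ordered list of event types."""
        return [event.kind
                for possession in self.possessions
                for play in possession.plays
                for event in play.events]
```

A season (the “book”) would then simply be a list of `Match` objects. Flattening with `event_kinds()` shows how the nested structure can still be read as one ordered sequence, which is the form a sequence model consumes.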
Now that the input language of soccer has been defined, we can start learning a “soccer language model” that can generate outputs we couldn’t produce before. Given the inherent issues that AI chatbots have with “hallucinating” facts, instead of aiming to generate an output for any question asked, we can use ‘smart prompt engineering’ to answer questions that couldn’t be answered previously. For more on smart prompt engineering, please refer to the first article in this series.
We give a host of practical examples below.
Opta Vision is an AI-enriched data feed that utilizes both computer vision data and our human event data. This is then processed through our graphical neural network to provide predictions on every event, describing both team and individual player decision-making and execution ability.
For example, with our underlying formation representation (or language), we can detect which formation a team is playing (e.g., 4-4-2, 3-4-3, 4-3-3 etc.) and how it changes with and without possession.
We are also able to assign the role of a player in every frame, which enables us to see when players overlap, or if there is a tactical change during the match.
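To make the idea of reading a formation from player positions concrete, here is a deliberately simple heuristic. It is not the Opta Vision method, which the article describes as a learned representation. The sketch just sorts the ten outfield players by their distance from their own goal line and splits them into lines at the widest gaps. The function name `formation_from_depths` and its parameters are hypothetical.

```python
from typing import List


def formation_from_depths(depths: List[float], n_lines: int = 3) -> str:
    """Group outfield players into lines by splitting at the largest gaps.

    depths: distance of each outfield player from their own goal line (metres).
    n_lines: how many lines to report (3 gives shapes such as "4-4-2").
    """
    if not 2 <= n_lines <= len(depths):
        raise ValueError("n_lines must be between 2 and the number of players")
    ordered = sorted(depths)
    # Gap between each neighbouring pair of players, paired with its index.
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    # The (n_lines - 1) widest gaps mark the boundaries between lines.
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[:n_lines - 1])
    counts, start = [], 0
    for cut in cuts:
        counts.append(cut + 1 - start)
        start = cut + 1
    counts.append(len(ordered) - start)
    return "-".join(str(c) for c in counts)
```

Running the function separately on frames with and without possession would give one shape for each phase, which is the kind of comparison described above. A real system would need to handle noisy positions and shapes that do not fall into clean lines.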
Using the same underlying representation, we can also predict the likelihood of a player making a pass, the probability of the pass creating a scoring chance in the next 10 seconds, or assess if it was the right option.
Using our soccer language model, we predict all the possible options at the same time; in other words, predict the sentence, instead of predicting each letter separately. We do this by using the locations, dynamics and events of all the players as the input sequence, and then mapping that to the output sequence which corresponds to the pass difficulty, availability etc.
Previously, we analyzed each player option independently, but using our large soccer language modelling approach, we can now analyze all players at once. This is obviously valuable in understanding the decision-making ability of each player for each event, and we can use a similar approach to predict the statistics of each player at the end of the game.
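The difference between scoring each option independently and scoring all players at once can be illustrated with a bare-bones self-attention step, the core operation of a transformer. This is a toy sketch under stated assumptions, not the production model. It uses a single attention head with identity projections instead of learned weights, and the `readout` vector and the per-player feature vectors are hypothetical stand-ins for trained parameters and real tracking features.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(players: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention, identity projections.

    Each output row is a weighted mix of every player's features, so all
    players are processed jointly rather than one at a time.
    """
    scale = math.sqrt(len(players[0]))
    mixed = []
    for query in players:
        weights = softmax([dot(query, key) / scale for key in players])
        mixed.append([sum(w * p[d] for w, p in zip(weights, players))
                      for d in range(len(query))])
    return mixed


def pass_option_probabilities(players: List[Vector], readout: Vector) -> Vector:
    """Score every player as a pass target in one step (softmax over players)."""
    contextual = self_attention(players)
    return softmax([dot(row, readout) for row in contextual])
```

Because each player’s output depends on every other player’s features, moving one player changes the scores of all the others. An independent per-player model cannot capture that.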
Transformer-Based Player Performance Prediction
Imagine you are a soccer coach, and you want to know which player will have the biggest impact on changing the result during the game. Previous predictions of a player’s performance were made independently of other players and opponents. Using our transformer approach with our enormous database of soccer data, we have created a model that can predict all the player and team outputs at the same time.
Again using the machine translation analogy, the idea is to take player and team information at the start of the game (or during the game) as the input sequence. We then use our transformer network to map this to the most likely output sequence, which in this case is the final match statistics. The power of the transformer network is that it can rapidly generalize to unseen situations. In soccer these are commonplace: teams in a regular season play home and away, lineups (and managers) are often different, and recent form varies.
This is a new development, and in 2023 we will be showcasing the power of this technology. To train this model, we have utilized our uniquely deep Opta database of event data alone, which is over 1.5TB in size (not including our sizeable tracking data archive).
We can also utilize similar techniques to enhance our ability to do ghosting (i.e., simulating where players should have been in a play), building on work we previously co-authored in soccer and basketball.
Previous approaches learnt a supervised policy network to predict the behaviors of teams in a deterministic fashion, but with the advancements in language modelling, more creative outputs (some of which may not have been seen before) could be generated. It is worth noting, however, that coaches and analysts will sometimes prefer the deterministic prediction (i.e., what did happen) over the different ways the play could have been run.
Another benefit of using language modeling is that it can be used as an assistive aid in our sports data collection process (similar to the coding assistant mentioned previously), where it is used within our computer vision-based player- and ball-tracking systems, or to highlight a potentially erroneous data point for our human operations or integrity teams to assess.
ChatGPT is a fantastically ambitious and incredibly well-executed tool. Whilst it may not have as many direct applications in sports or news reporting as in other fields, the underlying Generative AI approaches are already in use at Stats Perform, using our own proprietary language of sport as the input. They are already powering many applications in the team performance arena and are sure to augment many aspects of sports content and analysis in the future.
Dr. Patrick Lucey is the Chief Scientist at sports data giants Stats Perform, leading the AI team with the goal of maximizing the value of the company’s deep treasure troves of sports data. Patrick has studied and worked in the AI field for the past 20 years, holding research positions at Disney Research and the Robotics Institute at Carnegie Mellon University as well as spending time at IBM’s T.J. Watson Research Center while pursuing his Ph.D. Patrick hails from Australia where he received his BEng(EE) from the University of Southern Queensland and his doctorate from Queensland University of Technology. He has authored more than 100 peer reviewed papers and has been a co-author on papers in the MIT Sloan Best Research Paper Track, winning best paper in 2016 and runner-up in 2017 and 2018.