
How Sport is Leading the Next Wave of Game-Changing GenAI Advancements

By: Patrick Lucey

Chief Scientist Patrick Lucey is back for the final Latest Trends in AI in Sport instalment of 2024. In this update, Dr. Lucey discusses Specialized Enterprise GenAI, and how this latest game-changing innovation applies to the world of sport.


The timing is appropriate for Stats Perform to launch our 2025 Sports Fan Engagement, Monetisation and AI Trends survey, since it marks the two-year anniversary of the release of ChatGPT – the release that changed everything. AI went from being a niche tool, only used for specific tasks, to a general-purpose utility used by hundreds of millions of people every week (OpenAI recently announced that ChatGPT has 300 million weekly users worldwide).

Although there are still issues around hallucinations, for most knowledge workers worldwide – who use it for tasks like general question answering, proofreading, translation, brainstorming and coding – it is the ultimate assistive tool, as it enables workers to do much more than before. Indeed, one of the many findings in our survey reveals that an increasing number of sports media execs across broadcast, teams, leagues, federations, sponsors and sportsbooks are adopting AI in various ways to help grow their audiences and commercialize their content, and are finding it easier to do so than those who are lagging behind.

Of course, the AI innovations didn’t stop with the initial ChatGPT release; new ones seem to arrive weekly. Over the last couple of months alone, in addition to Nobel Prizes going to AI pioneers Geoffrey Hinton and Demis Hassabis for physics and chemistry respectively, we have seen the product launch of Apple Intelligence, improvements to Meta’s Ray-Ban smart glasses, OpenAI’s astonishing o1 reasoning model for complex tasks, and most recently Google’s Gemini 2.0 release.

However, even as we anticipate the latest model release of GPT-5 (or Orion) from OpenAI, there are growing rumors that innovation is drying up and we are hitting a wall – that the initial rapid improvements from using more data and bigger models are seemingly reaching a limit. Google’s CEO echoed this sentiment, stating that the “hill is steeper” for AI advancements with current LLMs.

Whilst there is some substance to this, contrary to what you may initially think, it does not signify the end of innovation in the GenAI space. Far from it!

Instead, we believe it heralds a new phase of GenAI innovation, one which centers on enterprise use cases – what we call Enterprise GenAI. In this article, we highlight what this means, and how it applies to us in the world of sport.

Are Current LLMs Hitting a Wall? Why?  

To a degree, we are reaching a threshold of sorts for the current text-based LLM use cases (e.g., ChatGPT). And the reason is simple: a lack of new data for the models to learn from.

The large language models (LLMs) used today in popular GenAI applications are trained on massive volumes of data – mostly text, augmented by audio, images and video – taken largely from the internet. But these models are close to learning as much as they can from this data, and no meaningful new sources of public data exist at scale.

Essentially, these models have maximized what they can get from these public data sources.  

However, there is so much more information outside of the text and image data we find on the internet.   

Extending Model Applications and Performance 

Instead of training larger models, companies are now looking both to make current models more efficient and faster (see Meta’s Llama 3.3 release) and to extend the types of tasks these models can do using new, supplementary, domain-specific data sources. With these new data sources, new tasks and solutions can be created.

This means that LLMs can, for example, now venture into the more complex domains of math/geometry and physics, as OpenAI did recently with their “o1” model. The o1 model exceeds PhD-level accuracy on a benchmark of physics, biology, and chemistry problems, as well as placing among the top 500 students in the US in a qualifier for the USA Math Olympiad. Google’s new Gemini 2.0 model also enables AI assistants to go off and accomplish tasks such as searching the web and writing detailed reports via its “Deep Research” tool.

New tasks like these are far more complex than most, and hence current approaches need to evolve to enable the model to solve them. For solving maths/geometry/physics problems or researching complex topics, the model needs to map out a series of steps (something called “chain-of-thought”) before providing an answer.
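To make the idea concrete, below is a minimal, illustrative sketch of chain-of-thought prompting on a classic reasoning puzzle. The prompts and the worked steps are our own example, not taken from any particular model or vendor.

```python
# A minimal sketch of chain-of-thought prompting (illustrative only).
# Asked directly, models often blurt out the intuitive but wrong answer ($0.10).
direct_prompt = (
    "A bat and a ball cost $1.10 in total. "
    "The bat costs $1.00 more than the ball. How much does the ball cost?"
)

# A chain-of-thought prompt (or a reasoning model such as o1, which generates
# these steps itself) maps out the intermediate reasoning before answering:
# 1) Let the ball cost x, so the bat costs x + 1.00.
# 2) Then x + (x + 1.00) = 1.10, so 2x = 0.10.
# 3) Therefore the ball costs $0.05.
cot_prompt = direct_prompt + " Think through the problem step by step before answering."
print(cot_prompt)
```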

These types of models are called “reasoning” models as they appear to mimic how humans “think” before answering (although, as highlighted by Yann LeCun in a recent lecture at Columbia University in New York, such models still lack the ability to plan effectively and are more of an approximation).

But as above, rather than becoming more intelligent (i.e., learning new tasks from the same data), reasoning models are simply being expanded to new tasks by first including a new set of data which is specific to those tasks (e.g., math/physics/chemistry).

Then, they are optimized to reach top performance on a set of benchmark tests.

So, essentially, to improve the perceived performance of a model or the types of tasks it can do, the key is to train the existing models on new datasets and then optimize them for those new tasks.

We are seeing this in the computer vision realm with the various segmentation models, which require detailed segmentation maps (i.e., each pixel in the training set labelled with the object/segment it is assigned to); with video game agents that can suggest what to do next; and with the start of embodied computing, where models add modes that capture clicks and typing.

And the only real way to improve or extend the capabilities of today’s large language models is to use differentiated data.

But where do these new differentiated datasets exist?

One area is “Sovereign AI”, where countries have access to their own unique data (think of healthcare, transport and defense) and can use that data as fuel to build models which can address country-specific questions. Another area is in the world of business, where businesses or “Enterprises” have their own unique data, and can address their own Enterprise-specific questions – hence the name “Enterprise GenAI”.

Enterprise GenAI 

According to IBM, less than 1% of Enterprise data (i.e., the data that companies collect as part of their day-to-day operations) is available on the internet.

The remaining 99% of Enterprise data of course represents a vast information pool, containing rich patterns and insights, that could be used to perform new, specific tasks and fuel human innovation more efficiently and effectively.

Enterprise data therefore represents fertile ground, and using Enterprise data for Generative AI seems like the most likely route to continued growth in the field.

In terms of Enterprise GenAI applications, there are two key use cases, depending on the type of data:

  1. Generic Enterprise Data: This refers to generic types of text, audio, and image/video data that are private to a business. For text data, this would include internal communications, customer interactions, operational documents, sales and marketing materials, product and technical documentation, legal and financial records, HR data, and external communications. Current text-based LLMs, enhanced with Retrieval-Augmented Generation (RAG) techniques, provide an excellent starting point for interrogating, accessing, searching, and translating these documents (a minimal RAG sketch follows this list). These capabilities can be further improved through model fine-tuning. Similarly, current LLMs can be used for audio transcription and summarization, while Visual Language Models (VLMs) can handle tasks such as generic object detection within this data category.
  2. Specialized Enterprise Data: This encompasses data types unique to the business or its operations, such as sensor-generated data, spatiotemporal data (e.g., GPS coordinates or event tracking), and data from machinery, engines, or other equipment. We explore some of these in more detail below. These datasets often require specialized processing and analysis techniques. Unlike generic data, specialized data is highly domain-specific, tailored to the company’s operational or industrial context, and often represents the company’s most valuable IP. The steps to utilizing this data consist of: i) collecting the data, ii) transforming it into a language, and iii) utilizing that language.
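To illustrate the RAG technique mentioned in the first use case, here is a minimal, hedged sketch. It assumes the open-source sentence-transformers library and the small all-MiniLM-L6-v2 embedding model – our choices for illustration, not tools mentioned in this article – and a placeholder prompt rather than a real LLM call.

```python
# A minimal Retrieval-Augmented Generation (RAG) sketch: embed private documents,
# retrieve the most relevant one for a question, and place it in an LLM prompt.
# The documents and question are invented examples; the embedding model is an
# assumption, not the only choice.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Refund requests must be processed within 14 days.",
    "The Q3 sales report is due on October 5th.",
    "Remote employees may claim home-office equipment up to 500 EUR.",
]
question = "How long do we have to process a refund?"

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(documents, normalize_embeddings=True)   # (3, dim)
q_vec = embedder.encode([question], normalize_embeddings=True)[0]  # (dim,)

# Cosine similarity on unit vectors is just a dot product; take the best match.
best_doc = documents[int(np.argmax(doc_vecs @ q_vec))]

# The retrieved text is then fed, as context, to any instruction-tuned LLM.
prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"
print(prompt)
```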

In the remainder of this article, we will focus on the GenAI applications that utilize Specialized Enterprise Data.

How is Sport Related to Specialized Enterprise Data? 

Although generic data exists in sports, the sports data that captures the live, dynamic performance of the players on the field/court of battle is one of the most interesting, unique and “specialized” datasets that exist in the world. It is dynamic and has massive value if it can be accessed live, but it also needs to be broad, deep, uniform and consistent to be used efficiently and effectively in modelling – and of course, it needs to be accurate. The value behind this data is that it objectively reconstructs the story of performance – the more granular the data, the better the reconstruction. Through another lens, this data can be seen as the universal language of sports – and at Stats Perform, we have created this language.

Like most sophisticated languages, the language of sports is multi-modal. The prime modes being both “event data” and “ball/player tracking motion” data (see Part I and Part II of our previous Latest Trends in AI in Sport for a description).  

Although sport is responsible for over 10% of internet searches every day, the data that currently exists on the web in a form that could be utilized to train a model is out-of-date, fragmented and often shallow and narrow in nature.  

A proprietary “enterprise” dataset like Stats Perform’s massive Opta database on the other hand, is up-to-date, complete, comprehensive in terms of breadth and depth, consistently collected, and contains information not available anywhere else at scale, such as highly detailed event data and off-the-ball positional and movement data.  

Due to these characteristics, our “specialized” sports dataset is similar to the data collected in the logistics, manufacturing, transport, autonomous vehicles, weather and biology domains, and represents the fuel for the next wave of future AI applications.  

What Are Some Examples of Specialized Enterprise GenAI Outside of Sport?  

In the first wave of Generative AI (e.g., ChatGPT), the fuel driving this wave was large amounts of generic text data. Text data is a great place to start, as there exists an enormous amount of it publicly and the data is sequential in nature – two key attributes for LLMs to thrive. Outside of sport, there are many other domains with enormous amounts of sequential data in which LLMs can also thrive – and they are potentially going to change the world (or already are changing it). Below we have highlighted four.

The first example is autonomous vehicles, which are now used as driverless taxis in some US cities – most recently launched in Los Angeles last month, with Miami to follow soon. The sixth-generation Waymo Driver includes 13 cameras, four lidars, six radar units and an array of external audio receivers, as well as high-resolution maps to monitor the environment and safely navigate autonomously. From these rich sources of input data, these robotaxis use an autonomous-vehicle-specific foundation model which maps all these information sources into one model to measure and predict behaviors specific to the world of autonomous vehicles.

The second example is weather prediction. Accurate prediction of weather is vital for all industries, whether it is transport, agriculture, public safety or just day-to-day life. Current methods of weather prediction require supercomputers to solve complex physics equations, which takes time and compute. Additionally, to have the most accurate predictions, you need high-resolution imagery, which is hard to obtain at scale. However, recent work has shown that accurate predictions can be made using a foundation model that requires less computation and can utilize low-resolution inputs while achieving the same accuracy. This week, Google DeepMind released a model called GenCast which can predict weather more accurately than the best system currently in use – and does so in minutes, compared to the hours current models take to generate their forecasts.

This connects nicely to the third example, Robotics. Whether it is a single robotic arm identifying, sorting and handling your packages, or a robot monitoring a farm and picking your fruits or vegetables at optimal yield, key breakthroughs are occurring due to the sensors’ ability to measure domain-specific attributes, in addition to other inputs such as accurate weather prediction. The impact of this work is that packages can be delivered accurately, cheaply and in a timely fashion, and food can not only be picked at the optimal time, but more of it can be produced without waste.

The fourth example is chemistry and biology. As mentioned at the start of the article, the lead scientist from the DeepMind team won a Nobel Prize in chemistry for their work on AlphaFold, which accurately predicts the 3D structures of proteins in hours rather than years. This is important because the method can be used for drug development for diseases, as well as targeted drug therapy that takes into account the various contextual factors of a person – neither of which is possible with current methods, a major drawback. These methods can also be applied to creating new biofuels to address energy shortages in a clean, renewable way, or to breaking down waste products such as plastics, which are currently an issue for the planet.

What these four examples have in common is that they rely on enormous amounts of sequential data. For autonomous vehicles, the inputs are not textual words but point clouds from LIDAR, images from the RGB cameras, fine-grained maps, and information from within the car. For weather, the input is the readings from sensors at various locations; for Robotics, it is depth sensors, robotic sensors and a dictionary of possible products; and in the biology example, instead of words it is protein structures, DNA and/or RNA. Each domain has its own language – and once that language has been established, you can do language modelling (preferably large language modelling, i.e., LLMs). These models can then accurately represent, describe and predict what occurs in these specific “Specialized Enterprise” worlds.

Transformers – The Universal Learner: “Just Add Sequential Data” 

Once you have a large amount of sequential data, you need to use the right machinery to learn from it. The key piece of machinery is the “transformer neural network,” which can contextualize information much better than previous machine learning methods. ChatGPT and other LLMs have shown that transformers are great learners of generic sequential data (e.g., text, images/video, audio). But what is often missed is that these models can work on other forms of sequential data, such as sports data, as we will show later.

However, to get an intuition of how these transformers work, let’s use two example sentences of text data (this example has been adapted from the blog post that accompanied the original “Attention Is All You Need” paper):

  1. “The man deposited money at the bank” 
  2. “The man sat on the bank of the river” 

To get a computer to understand a sentence, we first tokenize it, which simply means converting words (or sub-words) into numbers. Prior to transformers, we would represent these words independently, which would mean the computer represents the word “bank” with the same numbers in both sentences.

But if you look at the words around the word “bank” in each sentence, we as humans understand that it has a different meaning. Using a transformer model, we can effectively learn from the words which surround the word of interest. When this happens, the model learns that the two occurrences have different meanings, so the numbers representing them will be different (see below).
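Below is a minimal sketch of this effect. It assumes the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint (our choice for illustration, not a model mentioned in this article): the same word “bank” receives different vectors in the two sentences.

```python
# A minimal sketch: a transformer gives the word "bank" a different vector
# depending on the words around it (contextual embeddings).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = [
    "The man deposited money at the bank",
    "The man sat on the bank of the river",
]

def bank_vector(sentence: str) -> torch.Tensor:
    """Return the contextual embedding of the token 'bank' in the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]          # (seq_len, dim)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]

v_money, v_river = (bank_vector(s) for s in sentences)
sim = torch.nn.functional.cosine_similarity(v_money, v_river, dim=0)
print(f"Similarity between the two 'bank' vectors: {sim.item():.2f}")
# A static (pre-transformer) word embedding would return the identical vector for both.
```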

Specialized Enterprise GenAI in Sport: Making Use of the Sequential Nature of Sports Data 

Now you are probably asking, how is the above example relevant to sport? Well, first of all, our specialized sports dataset is sequential. If we look at the starting line-up of a team, such as Manchester City, the team is essentially a sentence. Each player is a word, and we can order those words from goalkeeper up to striker. Some players (i.e., words) have a stronger impact than others, such as Erling Haaland. When he is playing, he is going to impact what the other players are doing (i.e., the players will try to set him up for goal-scoring chances), and he will also impact what the opponents are doing. But if Haaland is rested or injured, and Jack Grealish comes on (see below) – he will impact how the players are playing (i.e., changing that one “word” has a massive impact on the meaning of the sentence – or how that team will be playing). As in the weather example highlighted previously, by using a transformer with a sequential representation of player performance, we can achieve much better predictions of future player performance than current approaches, which predict each player independently of the others.
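Below is a minimal, illustrative sketch of this idea (not Stats Perform’s actual model): a line-up is encoded as a sequence of player tokens, and a transformer encoder makes each player’s representation depend on everyone else in the sequence, so swapping one “word” shifts them all. The player IDs and model sizes are invented for illustration.

```python
# A minimal sketch: treat a starting line-up as a "sentence" of player tokens and
# encode it with a transformer so every player's vector depends on the whole line-up.
import torch
import torch.nn as nn

NUM_PLAYER_TOKENS = 10_000   # hypothetical size of the player "vocabulary"
EMBED_DIM = 64

player_embedding = nn.Embedding(NUM_PLAYER_TOKENS, EMBED_DIM)
encoder_layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# Hypothetical player IDs, ordered goalkeeper -> striker (11 "words" in the sentence).
lineup_with_haaland = torch.tensor([[31, 2, 3, 5, 14, 16, 47, 8, 10, 20, 9]])
lineup_with_grealish = lineup_with_haaland.clone()
lineup_with_grealish[0, -1] = 11   # swap one "word": the striker changes

rep_haaland = encoder(player_embedding(lineup_with_haaland)).mean(dim=1)
rep_grealish = encoder(player_embedding(lineup_with_grealish)).mean(dim=1)

# Changing one player shifts the representation of the whole "sentence" – exactly
# what independent per-player models cannot capture. (With an untrained model the
# number below is arbitrary, but the mechanism is the same after training.)
print(torch.dist(rep_haaland, rep_grealish))
```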

Additionally, the event data which captures what happens on the ball is like a sentence, but instead of words we have the action taken by a player (e.g., pass by Player A, at location X,Y at time T), and we have a sequence of these events until the half or match is over. Tracking data, which captures the position and motion of the players and the ball at every frame, is also sequential in both space and time. Using transformers not only helps us model the sequential nature of the data much more effectively, but it also lets us put both information streams in the same frame of reference, which enables things such as our trajectory generation, which we previously highlighted in Part II.
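As a minimal, hypothetical sketch of what “event data as a sentence” means (the schema below is invented for illustration and is not the Opta feed format), each on-ball event can be converted into discrete tokens so the whole match becomes a token sequence suitable for a sequence model:

```python
# A minimal sketch: turn on-ball events into a token "sentence" (hypothetical schema).
from dataclasses import dataclass

@dataclass
class Event:
    action: str      # e.g. "pass", "shot", "tackle"
    player_id: int
    x: float         # pitch coordinates, 0-100
    y: float
    t: float         # seconds since kick-off

ZONE_SIZE = 10  # discretize the pitch into a 10x10 grid of zones

def event_to_tokens(e: Event) -> list[str]:
    """Convert one event into discrete tokens: action, player and pitch zone."""
    zone = int(e.x // ZONE_SIZE) * 10 + int(e.y // ZONE_SIZE)
    return [f"ACT_{e.action}", f"PLAYER_{e.player_id}", f"ZONE_{zone}"]

match_events = [
    Event("pass", 10, 35.0, 50.0, 12.4),
    Event("pass", 9, 60.0, 55.0, 14.1),
    Event("shot", 9, 88.0, 48.0, 16.0),
]
sequence = [tok for e in match_events for tok in event_to_tokens(e)]
print(sequence)   # this token sequence can now feed a transformer, just like text
```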

Once we have these “foundation” models set, we can append other information sources or modes to these models. What we are doing here in sport is a great example of how to utilize Specialized Enterprise Data – which leads to Specialized Enterprise Language Models – which in our case delivers better predictions, simulations and measurements of performance – which ultimately benefits sports fans.

It has been a thrilling 2024, and 2025 has even more exciting advances in store. Thanks for reading, and if it’s your first time, check out Part I and Part II of our earlier updates on AI in Sport and request access to our 2025 Sports Fan Engagement, Monetisation and AI Trends survey here.