Text-Mining Formula 1

Formula 1 is often described as the pinnacle of motorsport, and only the best drivers get to compete in it. The entire grid consists of 20 drivers divided over 10 teams. To get the best performance from the single-seater cars, each car is fitted with a large number of sensors and communication equipment. The data these sensors generate ranges from measurements inside the engine to the temperature of the tires. This data is communicated between the driver and his race engineer during the race to get the best performance from the car.

Mexico F1

The communication between a driver and his engineer is called team radio. Besides sensor data, team radio is used to pass on mechanical and race-strategy information, to relay complaints about other drivers, and to warn about conditions on the track itself. It also carries warnings, complaints and penalties from and to the stewards. The research into this topic was mostly exploratory: it was meant to show what can be done with the data and what results that yields. Hence many different methods were applied and tested.

Formula 1 Communication.png

So what are they talking about, and can we compare the results of text-mining to the outcome of the race (or, even better, predict it)? Is the overall sentiment of the race reflected in the text-mining results? How do different drivers behave, and what topics do they communicate about? This is exactly what the students H. Bongers and L. Linders asked themselves when they selected a topic for the practical assignment of the course Text-Mining at the Department of Data Science and AI of Maastricht University.

They downloaded all team radio communication of the 2017 Formula 1 races and started their text-mining research. In their results they analyzed the overall content of the radio communication in particular races. Here are some interesting results.

Race Results and Driver Behavior

As in 2018, Lewis Hamilton secured his World Championship in Mexico in 2017, and as in 2018, Max Verstappen won the race. Interestingly, a topic cluster visualization shows Lewis and Race as the dominant topics, with Max a good third.

Mexico F1.png

Looking at the drivers themselves, Lewis Hamilton (#44) is one of the more calculating drivers on the current grid, often asking his engineers for updates on his tires, strategy and race pace. This is also reflected in the topic modeling, where these topics show up prominently:

Hamilton F1.png
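
To give a feel for how prominent topics per driver can surface from radio text, here is a minimal sketch. It is not the students' method: it swaps topic modeling for a plain term-frequency count, and the stopword list, the function name top_terms and the example messages are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "is", "are", "my", "on", "to", "and", "what",
             "how", "i", "we", "you", "it", "of", "in", "for", "do", "this"}


def top_terms(messages, n=3):
    """Return the n most frequent non-stopword terms in a list of messages."""
    counts = Counter()
    for message in messages:
        # Lowercase and keep only alphabetic tokens (apostrophes allowed).
        for token in re.findall(r"[a-z']+", message.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return [term for term, _ in counts.most_common(n)]


# Invented example messages, not taken from the 2017 transcripts.
messages = [
    "How are my tires looking?",
    "What is the strategy, are we on plan A?",
    "Tires are going off, what is the pace of the car behind?",
    "Pace is good, strategy is unchanged, look after the tires.",
]

print(top_terms(messages))
```

A real topic model would group co-occurring words into topics rather than count single terms, but even this simple count already puts tires, strategy and pace on top for the invented messages.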


Sentiment Analysis and Event Detection

Sentiment analysis is the practice of labeling sentences by how positive or negative they are, based on the vocabulary used. Event detection focuses on identifying important events by looking for large variations in sentiment or emotion. Using this technique, the 2017 Brazil race, where no major incidents occurred, looks as follows:

Brasil F1.png

The 2017 Singapore race, on the other hand, had three major incidents, which occurred exactly at the three local minima in the graph:

Singapore F1.png
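
The idea of detecting events as local minima in the sentiment curve can be sketched in a few lines. This is only an illustration under stated assumptions: the lexicon, the per-lap averaging, the threshold and the radio messages are all hypothetical, and the sentiment model the students actually used is not described here.

```python
import re

# Tiny hypothetical sentiment lexicon (an assumption for this sketch).
LEXICON = {
    "good": 1, "great": 1, "perfect": 1, "nice": 1,
    "damage": -1, "crash": -1, "problem": -1, "puncture": -1, "terrible": -1,
}


def message_score(text):
    """Sum the lexicon values of all words in one radio message."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


def lap_sentiment(radio, n_laps):
    """Average message score per lap; laps without messages get 0.0."""
    totals = [0.0] * n_laps
    counts = [0] * n_laps
    for lap, text in radio:
        totals[lap - 1] += message_score(text)
        counts[lap - 1] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def local_minima(series, threshold=-0.5):
    """Return 1-based laps that are strict local minima below the threshold.

    The first and last lap are never reported, since they have only one neighbor.
    """
    events = []
    for i in range(1, len(series) - 1):
        is_dip = series[i] < series[i - 1] and series[i] < series[i + 1]
        if is_dip and series[i] < threshold:
            events.append(i + 1)
    return events


# Invented (lap, message) pairs, not taken from the 2017 transcripts.
radio = [
    (1, "Good start, great job."),
    (2, "Pace is good."),
    (3, "I have damage, big problem!"),
    (4, "Okay, car feels good again."),
    (5, "Nice lap."),
]

series = lap_sentiment(radio, n_laps=5)
print(series)                # [2.0, 1.0, -2.0, 1.0, 1.0]
print(local_minima(series))  # [3]
```

The threshold keeps small dips in an otherwise calm race, such as the Brazil example, from being flagged as incidents; where to set it is a judgment call that depends on the data.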

So we can definitely observe a relation between what actually happened in the races and the results of the text-mining process. A very good project and a great stepping stone to future research, especially with all the great results of Max Verstappen!
