Text-Mining Donald Trump

For obvious reasons, text-mining Donald Trump’s communications was by far the most popular topic that students chose for the practical assignment of the Information Retrieval and Text Mining course at Maastricht University’s Department of Data Science and AI in 2018. Here we show some of the results.

Maartje Stokvis analyzed 33518 tweets sent by Donald Trump for positive and negative sentiments. Unsurprisingly, the negative sentiments were mostly about his opponent, Hillary Clinton. Because he signs all his tweets with Donald Trump, his own name also shows up in the following word cloud. Some additional pre-processing could have prevented this.
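The pre-processing mentioned above could look roughly like the following sketch, which drops the signature tokens before counting the word frequencies that feed a word cloud. The token list, the function name and the sample tweets are assumptions made for illustration; they are not taken from the student's project.

```python
import re
from collections import Counter

# Hypothetical signature tokens to drop; the real list depends on the data.
SIGNATURE_TOKENS = {"donald", "trump"}

def word_frequencies(tweets, drop=SIGNATURE_TOKENS):
    """Count lowercase word tokens across tweets, skipping tokens in `drop`."""
    counts = Counter()
    for text in tweets:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in drop)
    return counts

# Invented sample tweets, for illustration only.
sample = [
    "Great rally tonight! Donald Trump",
    "Another great crowd. Donald Trump",
]
freqs = word_frequencies(sample)
```

With the signature removed, the counts reflect only the message content, so the name no longer dominates the word cloud.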

Looking at the positive tweets, one can clearly observe what one would expect, including the “Make America Great Again” hashtag. Hillary Clinton probably shows up here too because he sets his own positives against hers.

Julian Gorfer and Xi Chen looked at hundreds of thousands of tweets about Donald Trump, analyzing hashtag frequencies and identifying the six main topics in Trump’s communication through topic modeling with Non-Negative Matrix Factorization (NMF).
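For readers unfamiliar with NMF: it approximates a non-negative document-term matrix V as the product of two smaller non-negative matrices, W (documents by topics) and H (topics by terms). The sketch below uses the classic multiplicative update rules in pure Python on an invented toy matrix. It is not the students' implementation, which is not described here; in practice one would more likely rely on a library such as scikit-learn.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (docs x terms) into W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h

# Invented toy counts: rows are documents, columns are terms.
terms = ["wall", "border", "jobs", "economy"]
v = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
w, h = nmf(v, k=2)
# For each topic (row of H), the two terms with the largest weights.
top_terms = [sorted(zip(row, terms), reverse=True)[:2] for row in h]
```

Each row of H can be read as a topic, described by its highest-weighted terms, and each row of W tells how strongly a document expresses each topic.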

As expected, most hashtags refer to MAGA (Make America Great Again), but impeachment comes second, together with Stormy Daniels, as can be seen below.
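A hashtag frequency ranking of this kind can be produced with very little code. The following is a minimal sketch using a regular expression and a counter; the sample tweets are invented, and treating hashtags case-insensitively is an assumption rather than something the students are known to have done.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def hashtag_counts(tweets):
    """Count hashtags case-insensitively across an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts

# Invented sample tweets, for illustration only.
sample = [
    "Rally tonight #MAGA",
    "#maga #impeachment",
    "News roundup #Impeachment #MAGA",
]
counts = hashtag_counts(sample)
ranking = counts.most_common()  # hashtags sorted from most to least frequent
```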

A closer look at the six main topics is given below as a word cloud of the dominant words in each topic. Here too, we can clearly identify the main topics of the 2016 elections.

More interesting is the analysis of the cynical tweets, identified with a special classifier that was trained to recognize cynical language from a large data set of examples. As one can observe, the word choice is not that careful!

The following graph shows both the sentiments and the levels of cynicism over time during the 2016 elections. Although the visualization is a bit complex and hard to understand at first sight, one can observe that as the elections progress, positive sentiments are replaced by more negative ones and the level of cynical language increases as well, all of which is as one would expect.

S. Drenckberg and J. Schmidt looked at all the presidential speeches, extracted topics and measured Trump’s sentiment on these topics.

The sentiment is very negative overall: it turns positive only after he was elected, with a maximum of positive messages in the election speech in January 2017.

Tobias Stumm and Alexander Lukas combined all of the above by looking at both the transcripts of speeches and the reactions to them: news articles from Reuters and thousands of tweets. Their approach was to measure the topics of Trump’s speeches and then relate these to the sentiments expressed in the Reuters articles and public tweets on these specific topics. An overview of the topics over time (aka topic rivers) can be found below.

After extracting topics per tweet, they created a so-called Topic Co-Occurrence diagram, showing how often certain topics were used in the same communication. The thicker the line, the stronger the co-occurrence.
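The counting behind such a diagram can be sketched as follows: for every message, each unordered pair of distinct topics adds one to the weight of the edge between them, and that weight would then determine the line thickness. The topic labels and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

def topic_cooccurrence(topics_per_message):
    """Count how often each unordered pair of topics appears in the same message."""
    pairs = Counter()
    for topics in topics_per_message:
        # sorted(set(...)) removes duplicates and gives each pair a canonical order
        for a, b in combinations(sorted(set(topics)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical topic labels per message, for illustration only.
messages = [
    ["immigration", "security"],
    ["economy", "jobs"],
    ["security", "immigration", "economy"],
]
edges = topic_cooccurrence(messages)
```

In this toy input, immigration and security share two messages, so their edge would be drawn thicker than the others.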

Per topic one can also determine the overall sentiment: positive, neutral or negative. Combining these insights gives the visualization below, a clear overview of the sentiments per topic. Here too, everything confirms what we would expect, so text-mining really works on these data sets!
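One simple way to arrive at an overall sentiment per topic is to average a numeric sentiment score over all messages assigned to that topic and then map the mean to a label. The sketch below assumes scores between -1 and 1 and a hypothetical neutrality threshold of 0.1; the method the students actually used may well differ.

```python
from collections import defaultdict

def overall_sentiment(scored, threshold=0.1):
    """Map each topic to 'positive', 'neutral' or 'negative' from its mean score."""
    by_topic = defaultdict(list)
    for topic, score in scored:
        by_topic[topic].append(score)
    labels = {}
    for topic, scores in by_topic.items():
        mean = sum(scores) / len(scores)
        if mean > threshold:
            labels[topic] = "positive"
        elif mean < -threshold:
            labels[topic] = "negative"
        else:
            labels[topic] = "neutral"
    return labels

# Invented (topic, score) pairs, for illustration only.
scored = [
    ("jobs", 0.6), ("jobs", 0.2),
    ("media", -0.7), ("media", -0.3),
    ("trade", 0.05), ("trade", -0.05),
]
labels = overall_sentiment(scored)
```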

Next, a temporal visualization of the overall sentiments during the elections was derived, which can be found below. Not surprisingly, the number of positive tweets dropped significantly from 2016 to 2018 and negative tweets started to dominate. This also includes Donald Trump’s tweets, which clearly become more negative in nature as the campaign progresses.
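The aggregation underlying such a temporal view can be sketched by grouping sentiment labels per year and computing the share of each label. The dates and labels below are invented, and grouping by calendar year is an assumption; a finer granularity such as months would work the same way.

```python
from collections import Counter, defaultdict
from datetime import date

def sentiment_share_by_year(labelled):
    """Return {year: {label: fraction of that year's tweets}}."""
    counts = defaultdict(Counter)
    for day, label in labelled:
        counts[day.year][label] += 1
    return {
        year: {label: n / sum(c.values()) for label, n in c.items()}
        for year, c in sorted(counts.items())
    }

# Invented (date, label) pairs, for illustration only.
labelled = [
    (date(2016, 3, 1), "positive"), (date(2016, 5, 9), "positive"),
    (date(2016, 8, 2), "positive"), (date(2016, 11, 7), "negative"),
    (date(2018, 1, 15), "negative"), (date(2018, 4, 3), "negative"),
    (date(2018, 6, 21), "negative"), (date(2018, 9, 30), "positive"),
]
shares = sentiment_share_by_year(labelled)
```

Plotting these shares per period gives the kind of trend line described above.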