Social Media Sentiment Analysis of Brexit
The Cyxtera research team was searching for a use case to test the continuous multimodal learning (CMML) insights featured in Brainspace, our investigative analytics platform. CMML is a predictive modeling capability that speeds discovery of insights.
We opted to dig into social media sentiment around Brexit, a topic of interest worldwide and one that appears frequently in online conversations. We focused our research on followers of Theresa May’s Twitter account. At the time of writing, Theresa May was the Prime Minister of the U.K. and a central figure in Brexit. As a control subject, we selected U.S. President Donald Trump’s Twitter account because of the volume of traffic it generates. We sought to compare and contrast highly ideological political viewpoints involving both Brexit and Donald Trump, looking for bot versus human activity.
We ingested Twitter data into Brainspace and utilized the Communication Analysis tool and the Brainspace API. Additionally, we performed manual external OSINT research to obtain more metrics about automated account activity. After training the predictive models, we were able to analyze thousands of Twitter accounts in seconds. We now have portable models that can be imported into other politically-oriented Twitter datasets for similar research.
Within a dataset of Theresa May’s Twitter followers, we discovered bots that amplify Pro-Brexit themes and bots that amplify Pro-Trump themes. On average, bots make up about half of highly political discussion (Pro-Trump, Pro-Brexit, or both) on Twitter. We expected to find a much higher bot percentage among accounts discussing both ideologies but did not. This may be evidence that those who manage political bots are evolving their tactics and procedures to avoid detection. It also suggests a potential classification approach, since humans tend to hold multiple ideologies rather than a singular focus.
In our research dataset, we determined that bots accounted for 50% of the most Pro-Trump accounts and 67% of the most Pro-Brexit accounts; among accounts that ranked highly on both, 38% were bots.
As expected, many of the previously useful automated heuristics for bot detection have started to become less relevant as bot creators have updated their operational security (OPSEC) and tactics, techniques, and procedures (TTP) to account for common detection mechanisms. Bots that display obvious methods, like posting at odd hours, creating extremely high amounts of interaction, retweeting without ever posting original content, etc., are routinely suspended by Twitter’s internal team. This has produced a Darwinian effect on inauthentic accounts making them harder to discern using automated statistics.
What remains, especially for researchers without access to internal Twitter telemetry (such as log-in IP addresses or associated metadata such as email addresses or phone numbers), is analysis and classification of accounts based purely on behavior and content. Aside from building some historical record, bots are useless to their creators unless they influence a conversation. As a result, visually exposing accounts that communicate in a largely broadcast manner has traditionally been the most useful classification in Brainspace and other data analytics tools (the “star-pattern” analysis, see Figure 1).
However, content-based sentiment analysis can also prove useful. Many sentiments are unusual for humans to hold in conjunction with one another, simply because humans have a limited set of interests that they Tweet about. An early example of this was discussed by Cyxtera analysts in research on seemingly Pro-Trump bots posting heavily about the U.S. leaving the Syrian war.
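As a rough illustration of the star-pattern idea, the sketch below flags accounts that send to many distinct recipients while receiving almost nothing back. It is not Brainspace's implementation: the `(sender, recipient)` edge format, the function name, and both thresholds are assumptions chosen for the example.

```python
from collections import defaultdict


def star_pattern_candidates(edges, min_targets=3, max_inbound_ratio=0.1):
    """Return accounts whose traffic looks like a one-way broadcast.

    edges: list of (sender, recipient) pairs (hypothetical input format).
    min_targets and max_inbound_ratio are illustrative thresholds only.
    """
    targets = defaultdict(set)   # account -> distinct accounts it tweets at
    inbound = defaultdict(set)   # account -> distinct accounts tweeting at it
    for sender, recipient in edges:
        targets[sender].add(recipient)
        inbound[recipient].add(sender)

    candidates = []
    for account, out in targets.items():
        n_in = len(inbound.get(account, ()))
        # Many spokes going out, (almost) none coming back: a "star".
        if len(out) >= min_targets and n_in <= max_inbound_ratio * len(out):
            candidates.append(account)
    return sorted(candidates)


# Toy data: "hub" broadcasts to four accounts; alice and bob converse.
sample_edges = [
    ("hub", "t1"), ("hub", "t2"), ("hub", "t3"), ("hub", "t4"),
    ("alice", "bob"), ("bob", "alice"),
]
print(star_pattern_candidates(sample_edges))
```

A flagged account is only a candidate for manual review; news outlets and other legitimate broadcasters would match the same pattern.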
We ingested 4.8 million tweets posted by followers of Theresa May’s Twitter account and isolated her most recent 100,000 followers. Of those, 77,000 accounts had content and 33,000 accounts were dormant. “Peripheral” accounts were also included in the data. These are accounts which are referenced by another account, but don’t contain any Tweets. For example:
- @AccountA is tagged in a tweet by @AccountB
- @AccountB is an account included in the 100,000 followers scraped
- @AccountA would be included as a peripheral account because there aren't any @AccountA tweets in the dataset
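The peripheral-account rule above can be sketched in a few lines. The `(sender, body)` record shape and the function name are assumptions for illustration; the actual ingestion format is not described in this write-up.

```python
import re

# Twitter handles: "@" followed by letters, digits, or underscores.
MENTION_RE = re.compile(r"@(\w+)")


def find_peripheral_accounts(tweets):
    """Return handles that are mentioned in the data but authored no tweets in it.

    tweets: list of (sender, body) pairs (hypothetical record shape).
    Handles are lowercased because Twitter handles are case-insensitive.
    """
    senders = {sender.lower() for sender, _ in tweets}
    mentioned = set()
    for _, body in tweets:
        mentioned.update(handle.lower() for handle in MENTION_RE.findall(body))
    return mentioned - senders


sample = [
    ("AccountB", "Great point by @AccountA on the backstop"),
    ("AccountB", "Replying to @AccountC"),
    ("AccountC", "Thanks!"),
]
print(sorted(find_peripheral_accounts(sample)))
```

Here @AccountA is peripheral because it is only ever tagged, while @AccountC is not because one of its own tweets is in the sample.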
The data we analyzed consisted of the Tweet bodies, the sender, recipient, the date and time, possible threading information, and in some cases the unshortened URLs in the Tweet.
Judging Botometer Results
As part of this work, we gathered scores from the Botometer service (formerly BotOrNot) and judged them for accuracy. Botometer assigns each account a score, called the complete automation probability (CAP), which expresses its confidence that the account is a bot (a CAP above 50% is treated as a bot). Botometer agreed with our manual analysis 78% of the time. When it didn’t, it invariably classified an account as human when it was in fact a bot. We conclude that while services like Botometer help a layperson be more cognizant of social media verification, they do not keep up with the evolving tactics, techniques, and procedures (TTPs) of bot makers, who are under constant pressure to outsmart these services. Subject matter experts and social media/OSINT analysts still need to manually inspect samples of accounts to determine whether each is controlled by a bot or a human.
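The comparison described above amounts to thresholding each CAP score and tallying agreement against the manual labels. The sketch below assumes CAP scores are already available as numbers between 0 and 1; it does not call the Botometer API, and the function name and sample values are invented for illustration.

```python
def compare_with_manual(cap_scores, manual_is_bot, threshold=0.5):
    """Compare thresholded CAP scores against manual bot/human labels.

    cap_scores: dict of account -> CAP in [0, 1] (assumed already fetched).
    manual_is_bot: dict of account -> True if analysts judged it a bot.
    Returns (agreement_rate, missed_bots, false_alarms).
    """
    agree = missed_bots = false_alarms = 0
    for account, cap in cap_scores.items():
        predicted_bot = cap > threshold
        actual_bot = manual_is_bot[account]
        if predicted_bot == actual_bot:
            agree += 1
        elif actual_bot:
            missed_bots += 1    # scored as human, judged a bot
        else:
            false_alarms += 1   # scored as bot, judged human
    return agree / len(cap_scores), missed_bots, false_alarms


# Made-up scores and labels, not values from the study.
caps = {"a": 0.91, "b": 0.12, "c": 0.34, "d": 0.08}
labels = {"a": True, "b": False, "c": True, "d": False}
rate, missed, alarms = compare_with_manual(caps, labels)
print(rate, missed, alarms)
```

Separating missed bots from false alarms matters here because the disagreements reported above ran in one direction only: bots scored as human.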
Top Pro-Brexit and Pro-Trump Accounts
We found 21,000 accounts in our dataset that discussed Brexit a significant number of times (at least 5 Tweets). We then looked at the top 30 most Pro-Brexit accounts in the set, as classified by Brainspace’s machine learning tool and a small Python script. Of those, 67% were bots. (Figure 2)
We found 10,000 accounts in our dataset that talked about Trump a significant number of times (at least 5 Tweets). We then looked at the top 30 most Pro-Trump accounts and found that 50% of them were bots. (Figure 3)
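The selection step in this section (keep accounts with at least 5 topical Tweets, rank by classifier score, take the top 30, then measure the share judged to be bots) might look like the following. This is not the script used in the research; the input dictionaries, function names, and toy values are assumptions.

```python
def top_accounts(topic_tweet_counts, scores, min_tweets=5, top_n=30):
    """Rank eligible accounts by classifier score, highest first.

    topic_tweet_counts: dict of account -> number of Tweets on the topic.
    scores: dict of account -> classifier score (higher = more Pro-topic).
    """
    eligible = [
        account
        for account, count in topic_tweet_counts.items()
        if count >= min_tweets and account in scores
    ]
    return sorted(eligible, key=lambda a: scores[a], reverse=True)[:top_n]


def bot_share(accounts, manual_is_bot):
    """Fraction of the given accounts that analysts labeled as bots."""
    if not accounts:
        return 0.0
    return sum(manual_is_bot[a] for a in accounts) / len(accounts)


# Toy data: "b" has the highest score but too few Tweets to qualify.
counts = {"a": 12, "b": 3, "c": 7, "d": 9}
scores = {"a": 0.97, "b": 0.99, "c": 0.55, "d": 0.81}
labels = {"a": True, "b": True, "c": False, "d": False}

top = top_accounts(counts, scores, top_n=2)
print(top, bot_share(top, labels))
```

The minimum-Tweet filter runs before ranking so that an account with one or two strongly worded Tweets cannot crowd out accounts with a sustained pattern.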
Overlap Between Pro-Brexit and Pro-Trump Accounts
It has been well researched that Russian Twitter accounts amplified both the Trump campaign and Brexit. But what are the implications when you find accounts promoting both at the same time?
We found an overlap between accounts that post Pro-Brexit Tweets and accounts that post Pro-Trump Tweets; that is, some accounts promoted both. Comparing the top 500 Pro-Brexit accounts with the top 500 Pro-Trump accounts, we found 29 accounts in both lists. Of those, 38% were suspicious and exhibited bot-like or automated activity.
We also note that most accounts (62%) in this overlap weren’t bots and appeared to reflect normal human activity.
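The overlap measurement is a set intersection of two ranked lists followed by a bot-share calculation. A minimal sketch, with hypothetical list and label inputs:

```python
def overlap_bot_share(ranked_a, ranked_b, manual_is_bot, top_n):
    """Intersect the top_n entries of two ranked lists and measure the bot share.

    ranked_a, ranked_b: account handles, most ideological first.
    manual_is_bot: dict of account -> True if analysts judged it a bot.
    Returns (shared_accounts, bot_fraction).
    """
    shared = set(ranked_a[:top_n]) & set(ranked_b[:top_n])
    if not shared:
        return shared, 0.0
    bots = sum(manual_is_bot[account] for account in shared)
    return shared, bots / len(shared)


# Toy rankings, not data from the study.
pro_brexit = ["u1", "u2", "u3", "u4"]
pro_trump = ["u3", "u5", "u1", "u6"]
judged = {"u1": True, "u3": False}

shared, share = overlap_bot_share(pro_brexit, pro_trump, judged, top_n=3)
print(sorted(shared), share)
```

As a sanity check on the reported figure, 11 bots out of 29 shared accounts is 37.9%, which rounds to 38%.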
Top Anti-Trump Accounts and Top Anti-Brexit Accounts
We found 2,014 accounts that posted Anti-Trump content a significant number of times (at least 5 Tweets). We then reviewed the top 30 most Anti-Trump accounts in the dataset (as classified by Brainspace’s built-in machine learning tool). Of these, 17% were bots. (Figure 4)
We found 4,130 accounts that discussed Brexit from an opposing stance. Of those, we determined that 17% of the top 30 most Anti-Brexit accounts were bots. (Figure 5)
Note, we think it’s coincidental that both showed 17% bot-related activity.
Overlap Between Anti-Trump and Anti-Brexit Accounts
Looking at the top 1,035 Anti-Brexit and Anti-Trump accounts respectively, we found 30 accounts in both lists. Our manual analysis determined that 23% of those were bots. Compared to the Pro-Trump and Pro-Brexit data, accounts in this subset seem to be more authentic, although there is a slight increase in bot load for accounts that combine the ideologies.
Overlap Between Pro-Brexit and Anti-Trump Accounts
Looking at the top 1,250 Pro-Brexit and Anti-Trump accounts respectively, we found 30 accounts in both lists. At the intersection of these two groups, we found 20% of accounts with automated behaviors. This is lower than the Anti-Trump + Anti-Brexit automation (23%) and Pro-Trump + Pro-Brexit accounts (38%).
Overlap Between Anti-Brexit and Pro-Trump Accounts
Looking at the top 1,430 Anti-Brexit and Pro-Trump accounts respectively, we found 30 accounts in both lists. For accounts that are both Pro-Trump and Anti-Brexit, 10% display automated behaviors.
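The cutoffs differ between comparisons (500, 1,035, 1,250, and 1,430) while the overlap stays at roughly 30 accounts. The write-up does not say how the cutoffs were chosen. One plausible reading, offered here purely as an assumption, is that each list was extended until a target number of shared accounts appeared. A sketch of that procedure:

```python
def smallest_cutoff_for_overlap(ranked_a, ranked_b, target):
    """Return the smallest N whose top-N lists share at least `target` accounts.

    ranked_a, ranked_b: account handles, most ideological first.
    Returns None if the target is never reached. This is a hypothetical
    procedure, not one confirmed by the research described above.
    """
    limit = min(len(ranked_a), len(ranked_b))
    for n in range(1, limit + 1):
        # Recomputing the intersection each step is quadratic but easy to verify.
        if len(set(ranked_a[:n]) & set(ranked_b[:n])) >= target:
            return n
    return None


# Toy rankings, not data from the study.
anti_brexit = ["u1", "u2", "u3", "u4", "u5"]
pro_trump = ["u9", "u3", "u1", "u8", "u2"]
print(smallest_cutoff_for_overlap(anti_brexit, pro_trump, target=2))
```

Under this reading, a larger cutoff means the two ideologies co-occur less often among the most committed accounts, which is itself a useful signal when comparing the four combinations.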
Our research has shown that often a large percentage of highly ideological Pro-Brexit + Pro-Trump Tweets were bot-related. This result wasn’t a surprise as prior research has concluded the same. On the other hand, bots are pushing out unexpected combinations of ideologies (e.g. Pro-Brexit + Anti-Trump). We are currently unsure of why a bot network would push competing ideologies and it’s worth additional research.
Appendix I: Identifying Bots with Our Classifier using the Brainspace GUI
How the Pro-Brexit Classifier works:
Finding Bots with Star-Pattern Analysis
Visualizing the graph, we can find potential bots by looking for star patterns. @Oluwastevens is an account our research team picked at random.
The account appears to belong to a Nigerian user who claims to live in America. It likes and retweets Trump. It also advertises a social media platform that pays users to post inauthentic activity on other social media sites, a clear violation of the terms of service of Facebook and Twitter. So, what about Brexit? The account often tweets directly to the Telegraph and to Nigel Farage (see Figure 11). By the time our analysts reviewed the account manually, these tweets had been deleted, which suggests the account covers its tracks by removing tweets after they have been posted for a set amount of time.
Mapping the Top Brexit Bots
Back on the dashboard, we can see the top accounts exhibiting bot-like behavior and the accounts they tweet at:
We picked an account at random from the dashboard list: @president_the
These accounts mostly retweet U.S. government verified accounts such as USAID, Department of State, The White House, FBI, and various U.S. embassies and missions in Africa. Strangely they also retweet the Ministry of Foreign Affairs Russia (Figure 16), which indicates they are not official accounts of the U.S. government. In Brainspace we can select the top bot accounts and visually see relationships. We selected 11 accounts from the dashboard list. We can see how bots are connected based on Tweet targets. (Figure 17)
These bots frequently tweet/retweet to Donald Trump and Jacob Rees-Mogg. Previous research has shown that people in the Brexit movement, like Jacob Rees-Mogg, have been amplified by bots. Statistics below:
These bots tweet/retweet Donald Trump:
The Top “Remain” Bots
On the opposite side of the spectrum, “remain” or anti-Brexit sentiment exhibits the following features/characteristics, as shown by Brainspace. These are the bottom 1% to 30% of the least Pro-Brexit accounts in the dataset. Of note, at 23,000 tweets we have about one quarter of the quantity of “remain” tweets versus Pro-Brexit tweets.
In 2019, the terms “brexit” and tweets to @jeremycorbyn (the head of the UK Labour party) are among the most amplified and anomalous terms.
Represented here are the tweets and top terms broadcasted toward Jeremy Corbyn (Figure 23). Most of these accounts oppose Brexit.
These are the tweets and top terms broadcasted to Theresa May. (Figures 24 & 25)