Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections

The past few years have been marked by an increase in online offensive strategies used by state and non-state actors to promote their political agendas, sow discord, and question the legitimacy of democratic institutions in the US and Western Europe. In 2016 the US Congress identified a list of Russian state-sponsored Twitter accounts that were used to try to divide voters on a wide range of issues. Previous research used Latent Dirichlet Allocation (LDA) to estimate latent topics in data extracted from these accounts. However, LDA has characteristics that may limit its usefulness for social media data: the number of latent topics must be specified by the user, interpretability can be difficult to achieve, and it does not model short-term temporal dynamics. In the current paper, we propose a new method for estimating latent topics in social media texts, termed Dynamic Exploratory Graph Analysis (DynEGA). We compare DynEGA and LDA in a Monte Carlo simulation in terms of their capacity to estimate the number of simulated latent topics. Finally, we apply DynEGA to a large dataset of Twitter posts from state-sponsored right- and left-wing trolls during the 2016 US presidential election. The results show that DynEGA is substantially more accurate than several different LDA algorithms at estimating the number of simulated topics. Our empirical example shows that DynEGA revealed topics pertinent to several consequential events in the election cycle, demonstrating the coordinated effort of trolls to capitalize on current events in the US. These findings illustrate the potential power of our approach for revealing temporally relevant information from qualitative text data.

Last updated on 11/26/2020