Invasion@Ukraine: Providing and Describing a Twitter Streaming Dataset that Captures the Outbreak of War Between Russia and Ukraine in 2022


Social media can be a mirror of human interaction, society, and historic disruptions. Their reach enables the global dissemination of information in the shortest possible time and, thus, the individual participation of people worldwide in global events in almost real-time. However, these platforms can be used just as efficiently in information warfare to manipulate human perception and opinion formation. In this paper, we describe a dataset of raw tweets collected via the Twitter Streaming API in the context of the onset of the war that Russia started in Ukraine on February 24, 2022. A distinctive feature of the dataset is that it covers the period from one week before to one week after Russia's invasion of Ukraine. This paper details the acquisition process and provides first insights into the content of the data stream. In addition, the data has been annotated with availability tags resulting from rehydration attempts at two points in time: directly after data acquisition and shortly before manuscript submission. These tags may provide information on Twitter's moderation policies. Further, we provide a detailed list of other published datasets covering the same topic. On the content level, we can show that our dataset comprises several distinct topics related to the conflict and conspiracy narratives – topics that deserve more profound investigation. Therefore, the presented dataset is also made available to the community in an extended version with pseudonymized tweet content upon request.

Proceedings of the 17th International Conference on Web and Social Media. Association for the Advancement of Artificial Intelligence (AAAI)
Dennis Assenmacher
Computational Social Scientist doing research on harmful online communication