TwitterSecurity dataset

Dataset Information

We collect tweet samples using the Twitter Streaming API for four months (May 12–Aug 1, 2014). We retain the tweets that contain Department of Homeland Security keywords related to terrorism or domestic security. After named entity extraction and resolution (including URLs, hashtags, and @ mentions), we build entity-entity co-mention temporal graphs on a daily basis (80 time ticks). We compile the ground truth to include major world news of 2014, such as the Turkey mine accident, the Boko Haram kidnapping of schoolgirls, and the killings during the Yemen raids.
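
The daily co-mention graph construction described above can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the input shape (a date plus a list of already-resolved entity strings per tweet), the function name `daily_comention_graphs`, and the choice to weight an edge by the number of co-mentioning tweets are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def daily_comention_graphs(tweets):
    """Map each date to a Counter of undirected entity pairs.

    `tweets` is an iterable of (date, entities) pairs. This input shape is
    an assumption made for illustration.
    """
    graphs = defaultdict(Counter)
    for date, entities in tweets:
        # Deduplicate and sort so (a, b) and (b, a) collapse into one edge.
        for a, b in combinations(sorted(set(entities)), 2):
            graphs[date][(a, b)] += 1
    return dict(graphs)


# Toy input with made-up entities (not taken from the dataset).
tweets = [
    ("2014-05-13", ["@user_a", "#topic", "place_x"]),
    ("2014-05-13", ["#topic", "@user_a"]),
    ("2014-05-14", ["place_x", "#topic"]),
]
graphs = daily_comention_graphs(tweets)
```

Each value in `graphs` is one time tick: a weighted edge list for that day, keyed by the sorted entity pair.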

Source (citation)

Less is More: Building Selective Anomaly Ensemble with Application to Event Detection in Temporal Graphs. Shebuti Rayana, Leman Akoglu, SIAM SDM, Vancouver, BC, Canada, April 2015

Less is More: Building Selective Anomaly Ensemble. Shebuti Rayana, Leman Akoglu, Transactions on Knowledge Discovery from Data (TKDD), May 2016


Files: TwitterSecurity, GroundTruth

Description: The dataset contains three columns: the first column is the timestamp (date), and the other two columns contain the names of two entities co-mentioned at that timestamp. GroundTruth contains the dates of major world incidents, each with a description.
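
A minimal sketch of reading the three-column layout described above into per-timestamp edge lists. The delimiter is not stated here, so the whitespace default is an assumption, as are the function name `load_edges` and the sample rows (which are made up).

```python
from collections import defaultdict


def load_edges(lines, sep=None):
    """Parse 'timestamp entity1 entity2' rows into {timestamp: [(e1, e2), ...]}.

    sep=None splits on whitespace. The real delimiter is an assumption and
    may need to be changed (for example to a comma or a tab).
    """
    by_day = defaultdict(list)
    for line in lines:
        parts = line.strip().split(sep)
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        timestamp, e1, e2 = parts
        by_day[timestamp].append((e1, e2))
    return dict(by_day)


# Made-up rows in the assumed layout (not real dataset content).
sample = [
    "1 entity_a entity_b",
    "1 entity_b entity_c",
    "",
    "2 entity_a entity_c",
]
edges = load_edges(sample)
```

Because the function accepts any iterable of lines, an open file handle for the TwitterSecurity file can be passed in directly once the actual delimiter is confirmed.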