EnronInc dataset

Dataset Information

This dataset comprises four years (1999–2002) of Enron email communication data, used for event detection at Enron Inc. In the temporal graphs, nodes represent email addresses and directed edges represent sent/received relations. The Enron email network contains a total of 80,884 nodes. We analyze the data at a daily sampling rate, skipping weekends (700 time points). The ground truth captures the major events in the company's history, such as CEO changes, revenue losses, and restatements of earnings. Access the full EnronInc dataset from here.

Source (citation)

Less is More: Building Selective Anomaly Ensemble with Application to Event Detection in Temporal Graphs. Shebuti Rayana, Leman Akoglu, SIAM SDM, Vancouver, BC, Canada, April 2015

Less is More: Building Selective Anomaly Ensemble. Shebuti Rayana, Leman Akoglu, Transactions on Knowledge Discovery from Data (TKDD), May 2016

Download

Files: Enron

Description: Three files, all with weekends cropped:

Date_weekend_cropped.mat contains the timestamp of each email (in Date).

Source_weekend_cropped.mat contains the source (sender) of each email.

Destination_weekend_cropped.mat contains the destination (recipient) of each email.

In total, the data cover 899 unique dates, but the final results consider only the last 700 unique dates.
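
The selection of the last 700 dates and the grouping of emails into daily directed graphs could be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the three files hold aligned vectors of equal length, that each file stores a single numeric variable, and that date values sort chronologically. The helper names `load_mat_vector` and `daily_edges` are hypothetical, and loading relies on SciPy's `scipy.io.loadmat`, which does not read MATLAB v7.3 files.

```python
from collections import defaultdict


def load_mat_vector(path):
    """Load the single data variable of a .mat file as a flat list.

    Assumption: the file holds exactly one numeric variable. Its name is
    not relied upon, so it is discovered instead of hard-coded.
    """
    from scipy.io import loadmat  # lazy import; needs SciPy installed

    contents = loadmat(path)
    # loadmat adds metadata keys such as "__header__"; skip them.
    names = [key for key in contents if not key.startswith("__")]
    if len(names) != 1:
        raise ValueError(f"expected one variable in {path}, found {names}")
    return contents[names[0]].ravel().tolist()


def daily_edges(dates, sources, destinations, keep_last=700):
    """Group directed (source, destination) edges by date.

    Only the last `keep_last` unique dates are kept. Assumption: date
    values are comparable and a larger value means a later day.
    """
    if not (len(dates) == len(sources) == len(destinations)):
        raise ValueError("the three vectors must have equal length")

    unique_dates = sorted(set(dates))
    first_kept = max(0, len(unique_dates) - keep_last)
    kept = set(unique_dates[first_kept:])

    snapshots = defaultdict(list)
    for date, src, dst in zip(dates, sources, destinations):
        if date in kept:
            snapshots[date].append((src, dst))
    # Return the snapshots in chronological order.
    return dict(sorted(snapshots.items()))


# Hypothetical usage, assuming the three files are in the working directory:
# dates = load_mat_vector("Date_weekend_cropped.mat")
# sources = load_mat_vector("Source_weekend_cropped.mat")
# destinations = load_mat_vector("Destination_weekend_cropped.mat")
# snapshots = daily_edges(dates, sources, destinations, keep_last=700)
```

Each entry of the returned dictionary maps one date to the list of directed edges observed on that day, which is one possible starting point for building the per-day graphs described above.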