Shades of gray: A closer look at emails in the gray area

Isacenkova, Jelena; Balzarotti, Davide
ASIACCS 2014, 9th ACM Symposium on Information, Computer and Communications Security, June 4-6, 2014, Kyoto, Japan

Every day, millions of users spend a considerable amount of time browsing through the messages in their spam folders. With newsletters and automated notifications responsible for 42% of the messages in users' inboxes, some important emails are inevitably misclassified as spam. Unfortunately, users are often unable to make security-related decisions, and tools provide no assistance in distinguishing harmless commercial messages from those that are almost certainly malicious. Most previous studies focused on the detection of spam. In this paper we instead look into the often overlooked area of gray emails, i.e., those messages that automated spam filters cannot clearly categorize one way or the other. In particular, we analyze real-world emails by grouping them into clusters of bulk email campaigns. Our approach automatically classifies these emails and reduces the gray email area by half, with only 0.2% false positives. Moreover, we identify a number of campaign features that can be used to predict the campaign category, and we discuss their effectiveness and their limitations. Our experiments show that a large fraction of the emails in the gray area are legitimate bulk emails: newsletters, notifications, and marketing offers. The latter appear to come from a large e-marketing industry that has grown into a complex infrastructure for sending legitimate bulk emails. To the best of our knowledge, this is the first real-world empirical study of such emails.


DOI: 10.1145/2590296.2590344
Type: Conference
City: Kyoto
Date: 2014-06-04
Department: Digital Security
Eurecom Ref: 4251
Copyright: © ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ASIACCS 2014, 9th ACM Symposium on Information, Computer and Communications Security, June 4-6, 2014, Kyoto, Japan, http://dx.doi.org/10.1145/2590296.2590344

Permalink: https://www.eurecom.fr/publication/4251