This paper proposes a method, supported by empirical evidence, to investigate the common claim that proxy services used by web scraping bots have millions of residential IPs at their disposal. Using a real-world setup, we had access to the logs of close to 20 heavily targeted websites and carried out an experiment over a two-month period. Based on the evidence gathered, we propose mathematical models indicating that the number of IPs is likely 2 to 3 orders of magnitude smaller than claimed. This finding suggests that an IP reputation-based blocking strategy could be effective, contrary to what operators of these websites think today.
Botnet sizes: when maths meet myths
CFTIC 2020, 1st International Workshop on Cyber Forensics and Threat INvestigations Challenges in Emerging Infrastructures, held in conjunction with the 18th International Conference on Service Oriented Computing (ICSOC 2020), 14-17 December 2020, Dubai, UAE (Virtual Conference) / Also published in Lecture Notes in Computer Science (LNCS 12632, 2021)
Type:
Conference
City:
Dubai
Date:
2020-12-14
Department:
Digital Security
Eurecom Ref:
6411
Copyright:
© Springer. Personal use of this material is permitted. The definitive version of this paper was published in CFTIC 2020, 1st International Workshop on Cyber Forensics and Threat INvestigations Challenges in Emerging Infrastructures, held in conjunction with the 18th International Conference on Service Oriented Computing (ICSOC 2020), 14-17 December 2020, Dubai, UAE (Virtual Conference) / Also published in Lecture Notes in Computer Science (LNCS 12632, 2021) and is available at: https://doi.org/10.1007/978-3-030-76352-7_52
PERMALINK : https://www.eurecom.fr/publication/6411