We crawled the peers in the region 0x5b of the KAD peer-to-peer network every 5 minutes for almost half a year. In addition, we crawled the full KAD network once a day for more than one year.
The full datasets can be downloaded from this website.
Dataset of the Zonecrawl

The crawl:
started on | ended on | crawl duration | crawl frequency | KAD IDs seen | IP addresses seen
2006-09-23 | 2007-03-20 | 179 days | 5 minutes (288 snapshots a day) | 400,375 | 3,228,890
download zonecrawl.txt.bz2 33 MB
(uncompressed 39 GB !!!)
The file contains a compressed matrix. Under Linux it can be uncompressed with the command: bunzip2 zonecrawl.txt.bz2
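Because the uncompressed file is about 39 GB, it may be more practical to read it directly from the compressed archive than to unpack it to disk. The following is a minimal sketch, assuming the file is line-oriented text; the function names `iter_lines` and `count_lines` are hypothetical and not part of the dataset.

```python
import bz2
from typing import Iterator


def iter_lines(path: str) -> Iterator[str]:
    """Yield decoded lines from a .bz2 file without unpacking it to disk."""
    # bz2.open in text mode decompresses incrementally as lines are read.
    # Assumption: the content is ASCII text with one record per line.
    with bz2.open(path, mode="rt", encoding="ascii", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def count_lines(path: str) -> int:
    """Count the lines of a compressed file without holding them in memory."""
    return sum(1 for _ in iter_lines(path))
```

For example, `count_lines("zonecrawl.txt.bz2")` would walk the whole archive once while keeping only one line in memory at a time.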
The matrix
contains
NEW: this dataset is now also available in the .avt format. For more information about the .avt format, please see http://www.cs.illinois.edu/~pbg/availability/.
download zonecrawl.avt.gz 54 MB (147 MB uncompressed, far smaller than the 39 GB of the text version). To uncompress, please use the command: gzip -d zonecrawl.avt.gz
Thanks to Lluís Pàmies i Juárez for providing the data in this format.
Dataset of the Fullcrawl

The crawl:
started on | ended on | crawl duration | crawl frequency | KAD IDs seen | IP addresses seen
2007-03-20 | 2008-05-25 | 433 days | Once per day | 64,146,397 | 97,380,532
Since the full dataset is too big (604,439,668 lines) to be downloaded as a single file, we split it into 6 files of approximately 750 MB (4.8 GB uncompressed) each:
download
Each file contains a compressed list. Under Linux it can be uncompressed with the command: bunzip2 fullcrawlX.txt.bz2
Every line contains a quadruple of <anonymized IP> <port #> <KAD ID, 64 bit> <date>.
The entries are ordered by KAD ID and date.
The entry 234567891234 3456 ABCDEF12345678 12-03-2008 says that on March 12th, 2008, the peer with the KAD ID ABCDEF12345678 (the trace contains only the first 64 bits of the 128-bit KAD IDs) was online using the anonymized IP address 234567891234 on port 3456. The anonymization scheme we used loses all prefix information of the IP address.
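The line format described above can be parsed with a few lines of Python. This is a minimal sketch, assuming the four fields are separated by whitespace as in the example entry and that dates are written day-month-year; the names `Sighting`, `parse_line`, and `days_seen_per_peer` are hypothetical. The second function relies on the stated ordering by KAD ID, so all lines of one peer are adjacent and can be grouped in a single pass.

```python
from dataclasses import dataclass
from datetime import date, datetime
from itertools import groupby
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Sighting:
    anon_ip: str  # anonymized IP address, kept as an opaque string
    port: int
    kad_id: str   # hex digits of the truncated KAD ID
    day: date


def parse_line(line: str) -> Sighting:
    """Parse one '<anonymized IP> <port> <KAD ID> <date>' line."""
    # Assumption: fields are whitespace-separated, date is DD-MM-YYYY.
    anon_ip, port, kad_id, day = line.split()
    return Sighting(anon_ip, int(port), kad_id,
                    datetime.strptime(day, "%d-%m-%Y").date())


def days_seen_per_peer(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (kad_id, number of distinct days on which the peer was seen)."""
    sightings = (parse_line(l) for l in lines if l.strip())
    # groupby only merges adjacent items, which is sufficient here
    # because the entries are ordered by KAD ID.
    for kad_id, group in groupby(sightings, key=lambda s: s.kad_id):
        yield kad_id, len({s.day for s in group})
```

Because both functions work on any iterable of lines, they can be fed directly from a decompressing file reader, so none of the 4.8 GB files has to be loaded into memory at once.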
Please cite the data set as follows:
Moritz Steiner, Taoufik En-Najjary, and Ernst W. Biersack
The most detailed analysis of the trace can be found in the following Technical Report:
Moritz Steiner, Taoufik En-Najjary, and Ernst W. Biersack
Previous publications on the crawl methodology and analysis of the results obtained:
Damiano Carra, and Ernst W. Biersack
Moritz Steiner, Taoufik En-Najjary, and Ernst W. Biersack
Moritz Steiner, Taoufik En-Najjary, and Ernst W. Biersack
Moritz Steiner, Ernst W. Biersack, and Taoufik En-Najjary
This site was last modified on 2010-04-17.