Distributed hash tables (DHTs) have been actively studied in the literature, and many different proposals have been made on how to organize peers in a DHT. However, very few DHTs have been implemented in real systems and deployed on a large scale. One exception is KAD, a DHT based on Kademlia, which is part of eDonkey, a peer-to-peer file sharing system with several million simultaneous users. We crawled a representative subset of KAD every five minutes for six months and obtained information about the geographical distribution of peers, session times, daily usage, and peer lifetime. We find that session times follow a Weibull distribution, and we show how this information can be exploited to make the publishing mechanism much more efficient.
Peers are identified by the so-called KAD ID, which until now was assumed to be persistent. However, we observed that a fraction of peers change their KAD ID as frequently as once per session. These changes of KAD ID make it difficult to characterize end-user behavior. For this reason, we have been crawling the entire KAD network once a day for more than a year to track end-users with static IP addresses, which allows us to estimate end-user lifetime and the fraction of end-users that change their KAD ID.
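
To illustrate why a Weibull model of session times can be useful, the following minimal sketch computes the probability that a peer stays online for a further interval, given how long it has already been online. The shape and scale values are hypothetical placeholders, not the parameters fitted from the crawl, and the sketch assumes a shape parameter below 1, in which case the Weibull hazard rate decreases and peers that have been online longer are more likely to remain online. The function names are illustrative only, and this is not necessarily how the publishing mechanism uses the distribution.

```python
import math


def weibull_survival(t, shape, scale):
    """P(session length > t) for a Weibull(shape, scale) distribution."""
    return math.exp(-((t / scale) ** shape))


def prob_stays_online(elapsed, extra, shape, scale):
    """P(session lasts at least `extra` longer | it has already lasted `elapsed`)."""
    return (weibull_survival(elapsed + extra, shape, scale)
            / weibull_survival(elapsed, shape, scale))


# Hypothetical parameters in minutes (assumption: shape < 1);
# these are NOT the values measured for KAD.
SHAPE, SCALE = 0.5, 100.0

for elapsed in (0, 60, 600):
    p = prob_stays_online(elapsed, 60, SHAPE, SCALE)
    print(f"online for {elapsed:4d} min -> P(still online 60 min later) = {p:.3f}")
```

Under these assumed parameters the printed probability grows with the elapsed session time. With a shape parameter of exactly 1 the distribution reduces to the exponential case, and the elapsed time no longer carries any information.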