Data-Driven Protocol Design: What it Is and What Are the Benefits
Yiannis Psaras
About this talk
Yiannis Psaras introduces ProbeLab's data-driven approach to measuring and improving the IPFS and libp2p protocols, and outlines the team's workflow: hypothesis-driven experiments that draw on crawlers, controlled probes, and node logs. He shares early findings from running the Nebula crawler since 2021, including roughly 200,000 peers seen, a median peer churn of about one hour, and the surprising result that only around 3% of IPFS nodes run on centralized cloud infrastructure. The talk walks through the IPFS DHT provide and lookup process and highlights two performance bottlenecks identified through measurement: a slow provide operation that takes tens of seconds even though peers are found in under half a second, and a content lookup pipeline in which Bitswap adds roughly one second of overhead before the DHT walk begins. Psaras closes with pointers to open studies on routing table health and provider record liveness, and to the upcoming hole-punching measurement campaign for libp2p.