— On stage

Talks

Conference talks and presentations from the ProbeLab team.

34 talks

Featured

libp2p-day Berlin Jul 2025

A Toolbox for libp2p Network Monitoring

Dennis Trautwein · Mikel Cortes

Dennis Trautwein and Mikel Cortes introduce ProbeLab's mission of rigorous P2P protocol measurement and walk through the full suite of libp2p monitoring tools they have built, including Nebula and Ants for network topology, Bemo for uptime and IPFS website monitoring, Hermes for protocol-level tracing, Ukla for bandwidth measurements, Parsec for DHT lookup monitoring, and newer tools like Akai for data availability. Mikel then dives into Hermes, a lightweight GossipSub tracer, and presents findings on the GossipSub 1.2 IDONTWANT control message: while it does reduce duplicates, around 60% of IWANT requests arrive within 10 milliseconds of the actual message and roughly half of duplicates still arrive within 500ms of an IDONTWANT, partly because GossipSub uses a single stream per peer that cannot cancel an in-flight transfer. The talk closes with plans for Hermes++, a lighter version that can scale horizontally without eclipsing peers, and the goal of turning probelab.io into the reference point for P2P metrics across multiple networks.

Protocol Berg v2 Jun 2025

The Eternal Research of Broadcasting Messages - Limits of GossipSub

Mikel Cortes

Just when we thought the biggest bottleneck in P2P networking was reaching complex consensus mechanisms or generating efficient proofs, we realized that the fundamental networking protocol stack itself is one of them. Join this session to explore the latest open proposals for optimizing GossipSub in libp2p.

2026

MoneroKon 2026 Jun 2026

Mapping Spy Node Dominance in the Monero P2P Network

Yiannis Psaras

Monero's cryptographic privacy is well established, but its P2P layer is a different story. Using Nebula, ProbeLab's open-source network crawler, we conducted the first large-scale topological crawl of the Monero network, discovering over 29,000 nodes and successfully handshaking with more than 16,000. The findings are stark: over 81% of reachable nodes exhibit the peer ID mismatch pattern the Monero Research Lab associates with surveillance infrastructure, with every flagged node tracing back to a single provider: Spruce Creek Networks LLC. Force-directed graph analysis reveals a bifurcated overlay: a dense spy node core surrounding unprotected peers, and a self-segregated cluster of ban-list-enforcing nodes. We present the methodology, the topology, and ban list adoption rates, and outline next steps: measuring how this surveillance density impacts Dandelion++ propagation and transaction origin anonymity in practice.

2025

Protocol Berg Jun 2025

The Surprising Challenges of Counting Nodes

Dennis Trautwein

Dennis argues that counting nodes in a P2P network is far less straightforward than it appears and uses Ethereum as the primary case study. He breaks down the question along several axes: what is being counted (execution vs. consensus clients, validators, IPs, peer IDs, ENRs, IP+port tuples), how it is counted (passive traffic tracing, structured or unstructured DHT crawls, on-chain activity), and over what time window and fork. He then shows how published Ethereum node counts from Etherscan, ethernodes.org, Migalabs, Nodewatch, and ProbeLab range from roughly 8,000 to 14,000 depending on methodology. He walks through ProbeLab's own structured DiscV5 crawl pipeline, which finds about 200,000 ENRs that filter down to around 8,100 reachable mainnet nodes under their liveness criteria. Comparing structured crawls to traffic tracing on Celestia via the Ants deployment reveals roughly 20% additional nodes (including light Lumina nodes) that crawls miss entirely, suggesting Ethereum likely has hidden NATed nodes too. He concludes that the goal is not one definitive number but understanding the limits of each approach.
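
To make the "what is being counted" axis concrete, here is a minimal sketch that counts the same set of observations by peer ID, by IP, and by IP+port tuple. The `Observation` records and the `count_nodes` helper are invented for illustration and are not taken from the talk or from any ProbeLab tool.

```python
from collections import namedtuple

# Hypothetical crawl observations; every value here is invented for illustration.
Observation = namedtuple("Observation", ["peer_id", "ip", "port"])

observations = [
    Observation("peerA", "203.0.113.1", 9000),
    Observation("peerB", "203.0.113.1", 9001),  # second node behind the same IP
    Observation("peerC", "203.0.113.2", 9000),
    Observation("peerD", "203.0.113.2", 9000),  # peer ID rotated on the same endpoint
    Observation("peerE", "203.0.113.2", 9000),  # rotated again
]


def count_nodes(obs):
    """Return the node count under three different notions of 'node'."""
    return {
        "peer_ids": len({o.peer_id for o in obs}),
        "ips": len({o.ip for o in obs}),
        "ip_port_tuples": len({(o.ip, o.port) for o in obs}),
    }


counts = count_nodes(observations)
print(counts)
```

On this toy data the three definitions give three different totals, which is the kind of methodological spread the talk describes for published node counts.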

2024

Devcon SEA Nov 2024

A Toolbox for Monitoring the Health of the Ethereum P2P Network

Yiannis Psaras · Dennis Trautwein

Yiannis presents ProbeLab's toolbox for monitoring the health of the Ethereum P2P network, with Dennis joining for a live demo. The talk covers Nebula for crawling and liveness monitoring across DiscV4, DiscV5, and libp2p networks, surfacing findings such as Nimbus being the consensus client least dependent on cloud infrastructure and Unichain being the most popular chain on Optimism. It then walks through Ants, a honeypot-style DHT measurement system that places nodes roughly every 20 keyspace positions to capture client requests, followed by Hermes, a lightweight gossipsub listener that traces grafts, prunes, IHAVE/IWANT messages, and peer scoring, revealing that around 55% of blob messages are delivered four times or fewer and that IHAVE messages account for about 33% of total bandwidth. Yiannis also introduces Ukla, a bandwidth measurement tool that connects to live peers via Hermes and downloads carefully sized data to probe available bandwidth across the slot, showing dips in cloud-node bandwidth that are relevant for PeerDAS and larger blob counts. Dennis closes with a live demo against Ethereum mainnet from the venue, walking through the resulting per-visit records of compressed and uncompressed bytes, retries, and throughput, and outlines plans for a public data API and an updated NAT traversal study.

Devcon SEA Nov 2024

Insights from Block Propagation in the Ethereum P2P Network

Mikel Cortes

Mikel Cortes presents ProbeLab's analysis of block propagation in Ethereum's consensus layer, which relies on the GossipSub libp2p protocol to disseminate validator duties across the network. He explains how external factors such as MEV-driven proposer timing games, block size, and network topology shape propagation latency, and shows measurement results indicating that 55% of blocks arrive with at least three duplicates and that some edge cases see the same message up to eight times. The talk introduces Hermes, an open-source libp2p-based event tracer that connects to Ethereum nodes to stream internal GossipSub events, and a public dashboard built in collaboration with the Ethereum Foundation and EthPandaOps that visualizes block arrival times at sentry nodes. Cortes argues that continuous monitoring is essential for anticipating regressions across client releases and for guiding scaling work toward PeerDAS and FullDAS.

2023

libp2p-day Istanbul Nov 2023

pcp: A Fully Decentralized Peer-to-Peer File Transfer Tool

Dennis Trautwein

Dennis introduces PCP (peer copy), a decentralized command-line file transfer tool he built as an entry project into libp2p, positioning it as a decentralized alternative to centralized tools like croc and Magic Wormhole. The talk walks through how PCP encodes a short channel identifier as four BIP-39 words combined with a truncated timestamp, hashes that into a CID, and writes a provider record into the Amino DHT so the receiver can rediscover the sender's multiaddrs and establish a direct connection. He covers the supporting mechanics: AutoNAT v2 for address discovery, mDNS for local-network peers, Circuit Relay v2 reservations and DCUtR-based hole punching for NAT traversal, and a PAKE-style exchange over the remaining words to derive a session key and authenticate the transfer. Dennis closes by noting that the current 0.4 release is outdated, that an in-progress 0.5 rewrite to adopt newer libp2p features has stalled due to time constraints, and he calls on the community to help drive features like CID-based fetches, transfer resumption, and a JS or browser port.

libp2p-day Istanbul Nov 2023

Nebula: A Network Agnostic DHT Crawler

Dennis Trautwein

Dennis presents Nebula, ProbeLab's network-agnostic DHT crawler, and demonstrates it live against several supported networks including IPFS/Amino, the Ethereum consensus and execution layers, Filecoin, Polkadot/Kusama/Rococo/Westend, and the newly added Celestia mainnet. He explains the crawl algorithm — connecting to bootstrap peers, generating keys that fall into each Kademlia bucket, and recursively asking peers for ever-closer nodes until the reachable set is exhausted — and walks through the CLI flags, dry-run mode, and the four newline-delimited JSON output files that capture crawl metadata, per-peer visits with protocols/agent versions/errors, and optional neighbor data for full topology reconstruction. He shows how the monitor subcommand reconnects to discovered peers with exponential backoff to track uptime and churn, feeding studies on routing-table health, replication factor, and the IPFS DHT incident earlier in the year. The talk also describes how Nebula was recently refactored to abstract over different DHT implementations so it can map disc-v5 ENR node IDs to libp2p peer IDs and extract Ethereum-specific fields like fork digest into a free-form properties map, and ends with a roadmap pitch covering IPNS, GossipSub, and bandwidth measurements.
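
The recursive part of the crawl described above can be sketched as a breadth-first traversal. This is a simplified illustration, not Nebula's implementation: the toy network is an in-memory dictionary, `ask_for_neighbors` is a hypothetical stand-in for a network request, and the per-bucket key generation is replaced by a single "give me your neighbors" call.

```python
from collections import deque

# Toy network: each peer maps to the peers it returns when asked for neighbors.
# The topology and peer names are invented for illustration.
TOY_NETWORK = {
    "bootstrap": ["p1", "p2"],
    "p1": ["p2", "p3"],
    "p2": ["p1", "p4"],
    "p3": ["p4"],
    "p4": ["unreachable"],
    # "unreachable" is advertised by p4 but never answers, so it has no entry.
}


def ask_for_neighbors(peer):
    """Hypothetical stand-in for a network request; None means the dial failed."""
    return TOY_NETWORK.get(peer)


def crawl(bootstrap_peers):
    """Visit every discovered peer once and record who it knows about."""
    queue = deque(bootstrap_peers)
    seen = set(bootstrap_peers)
    reachable, unreachable, topology = set(), set(), {}
    while queue:
        peer = queue.popleft()
        neighbors = ask_for_neighbors(peer)
        if neighbors is None:
            unreachable.add(peer)
            continue
        reachable.add(peer)
        topology[peer] = list(neighbors)  # neighbor data for topology reconstruction
        for neighbor in neighbors:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return reachable, unreachable, topology


reachable, unreachable, topology = crawl(["bootstrap"])
print(sorted(reachable), sorted(unreachable))
```

The loop ends when no unvisited peers remain, which corresponds to the reachable set being exhausted.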

IETF 118 Nov 2023

Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web

Dennis Trautwein

Dennis presents the SIGCOMM 2022 paper "Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web," giving an overview of IPFS as a transport-agnostic, content-addressed system built around CIDs and walking through the anatomy of a CID (multibase, version, multicodec, multihash). He explains the content lifecycle in which a publisher writes a provider record mapping a CID to a peer ID into the Kademlia DHT while the data itself stays on the origin node, and a retriever first asks connected peers opportunistically before performing a DHT walk to fetch the provider and peer records. He then covers the measurement methodology — DHT crawls every 30 minutes, controlled probe nodes deployed across seven AWS regions, and public gateway logs — reporting roughly 464,000 unique IPs across 150+ countries and 2,700 ASes, heavy AS-level centralization with the top five hosting over 50% of IPs, churn distributions by country, and lookup latency where 80% of EU retrievals resolved under 500 ms. He contrasts 2021 publication latency of up to two minutes against current medians around six to seven seconds, attributes the improvement to a much larger stable-peer baseline (85-90% today vs. 55-60% in 2021), and closes by pointing to ProbeLab's ongoing weekly measurements at stats.ipfs.network and open future work on content availability, adverse network conditions, and routing latency.
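
The CID anatomy mentioned above (multibase, version, multicodec, multihash) can be illustrated with a small round trip. This is a sketch under stated assumptions: it handles only CIDv1 with the base32 multibase prefix `b`, the raw codec (0x55), and SHA2-256 (0x12), and it treats every field as a single-byte varint, which holds for these particular values only. The function names are made up for this example.

```python
import base64
import hashlib

RAW_CODEC = 0x55  # multicodec code for raw binary data
SHA2_256 = 0x12   # multihash code for SHA2-256


def build_cid_v1(data: bytes) -> str:
    """Build a base32 CIDv1 (raw codec, SHA2-256) for the given bytes."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([SHA2_256, len(digest)]) + digest  # <hash code><length><digest>
    cid_bytes = bytes([0x01, RAW_CODEC]) + multihash     # <version><codec><multihash>
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded                                 # 'b' = base32 multibase prefix


def parse_cid_v1(cid: str) -> dict:
    """Split a CID produced by build_cid_v1 back into its parts."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles the base32 multibase prefix 'b'")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that was stripped
    raw = base64.b32decode(body)
    digest_len = raw[3]
    return {
        "multibase": "base32",
        "version": raw[0],
        "multicodec": raw[1],
        "hash_code": raw[2],
        "digest": raw[4:4 + digest_len],
    }


cid = build_cid_v1(b"hello")
parts = parse_cid_v1(cid)
print(cid, parts["version"], hex(parts["multicodec"]), hex(parts["hash_code"]))
```

A full parser would decode multi-byte varints and other multibase encodings; those are left out here to keep the layout visible.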

Protocol Berg v1 Sep 2023

The Best of Both Worlds: Exploring the Role of Centralization in IPFS

Dennis Trautwein

This Protocol Berg 2023 talk examines how IPFS uses semi-centralized "hybrid" components to address three structural challenges of a fully decentralized network: massive content publication, content retrieval performance, and adoption on resource-constrained mobile clients. Dennis walks through the design of Interplanetary Network Indexers (IPNI), Hydra Boosters, and HTTP gateways, then presents measurement results from ProbeLab showing IPNI indexes roughly two orders of magnitude more provider records than the DHT and reduces lookups from seconds to hundreds of milliseconds (single-digit milliseconds when CDN-cached). The talk also covers the trade-offs each component introduces — single-points-of-failure for indexers, route poisoning and eclipse-attack surface for Hydras, and loss of end-to-end CID verification plus operator-side privacy asymmetries for gateways. It closes with the decision to unplug the Hydra Booster shared database as the underlying DHT matured, framing this as a successful pattern of using centralized scaffolding to bootstrap a decentralized system that can later stand on its own.

IPFS þing Apr 2023

DHT Double Hashing Updates & Migration Plan

Yiannis Psaras · Guillaume Michel

Yiannis Psaras and Guillaume Michel present the migration plan for the IPFS DHT reader privacy upgrade, formerly known as double hashing and specified in IPIP-373. The scheme combines a CID-agnostic DHT lookup using a salted second hash, prefix-based requests that return multiple provider records to give k-anonymity, and provider record encryption keyed on the CID itself, so intermediate DHT servers can no longer link requesters to the content they fetch. Because this is a breaking protocol change, the talk focuses on coordinating a synchronized switch using a hard-coded IPNS key in a Kubo release that nodes poll periodically, with bootstrappers, content providers, DHT clients, and DHT servers each following defined behaviors during a transition period in which both old and new DHTs run side by side. The presenters discuss timeline targets through Q2 and Q3 of 2023, the role of IPNI and cid.contact as a fallback that may simplify the dual-stack period, and threat model limitations: the upgrade defeats passive traffic sniffing and bulk surveillance but does not hide lookups when the CID is already publicly known.
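
The lookup side of the scheme can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the `SALT` value, the `ToyServer` class, and the byte-granular prefix are hypothetical, and provider record encryption is represented only by opaque byte strings. The sketch shows the idea that the server sees a prefix of a salted second hash and returns every matching record, while the client filters locally.

```python
import hashlib

SALT = b"example-salt"  # hypothetical value; the real constant is defined by the specification


def second_hash(multihash: bytes) -> bytes:
    """Salted second hash of a content multihash, used as the DHT key."""
    return hashlib.sha256(SALT + multihash).digest()


class ToyServer:
    """In-memory stand-in for a DHT server storing opaque provider records."""

    def __init__(self):
        self.records = {}  # second hash -> list of opaque (encrypted) records

    def put(self, key: bytes, record: bytes) -> None:
        self.records.setdefault(key, []).append(record)

    def get_by_prefix(self, prefix: bytes) -> dict:
        # The server learns only the prefix, not which full key the client wants.
        return {k: v for k, v in self.records.items() if k.startswith(prefix)}


def lookup(server: ToyServer, multihash: bytes, prefix_len: int = 1) -> list:
    """Request by prefix, then pick out the wanted key on the client side."""
    key = second_hash(multihash)
    candidates = server.get_by_prefix(key[:prefix_len])
    return candidates.get(key, [])


server = ToyServer()
for i in range(50):
    server.put(second_hash(b"content-%d" % i), b"record-%d" % i)

print(lookup(server, b"content-7"))
```

A shorter prefix makes more stored keys match each request, which is where the k-anonymity property in the talk comes from; the trade-off is a larger response.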

IPFS þing Apr 2023

Data Driven Protocol Design and Optimisation

Yiannis Psaras

Yiannis Psaras opens the IPFS þing 2023 measurement track with an overview of ProbeLab's data-driven protocol design work across the IPFS network, walking through their core methodologies of DHT crawling, controlled probe fleets, and infrastructure log analysis. He recaps recent measurement campaigns including the Hydra dial-down experiment, the global hole-punching study with over six million attempts, the January 2023 incident in which more than 50% of IPFS DHT nodes became unresponsive, and the provider record liveness study that justified halving reprovide traffic in production from December 2022. The talk previews probelab.io as a public home for these results alongside stats.ipfs.network, and surveys ongoing optimization work including the double-hash DHT for reader privacy (IPIP-373), optimistic provide, and the IPFS magic numbers effort. Psaras closes by flagging GossipSub measurement, used heavily in Filecoin and Ethereum 2.0, as the team's next major focus area.

IPFS þing Apr 2023

State of Content Routing through the DHT: Latest Developments & Measurement Results

Yiannis Psaras

Yiannis Psaras presents measurement results from two major IPFS DHT events: the Hydra dial-down and a January 2023 incident where roughly 60% of network nodes became unresponsive due to misconfigured Resource Manager defaults. He describes ProbeLab's controlled experimental setup of geographically distributed probes that publish and retrieve random CIDs, and reports that disabling the 2,000 Hydra heads produced a roughly 10-13% latency increase, broadly matching expectations. The Resource Manager incident caused an additional 9-15% slowdown on put operations and 7-20% on get operations, which was largely recovered after Kubo v0.18.1 shipped better defaults and around 8,000 nodes upgraded within ten days. Psaras also shares current DHT lookup performance numbers showing significant regional disparities between EU/US East/US West clients and other locations, along with observations on peer ID rotation churn and the shift toward a more stable population of long-lived online nodes.

IPFS þing Apr 2023

Measuring IPFS

Yiannis Psaras

In this short lightning talk at IPFS þing 2023, Yiannis Psaras introduces ProbeLab's data-driven approach to protocol design and optimization for IPFS. He presents probelab.io as a contextualized collection of measurement plots — including DHT lookup latency across multiple geographic regions — published alongside experiment descriptions and links to the GitHub repositories of the underlying tooling. He also describes weekly network health snapshots that compare time-to-first-byte for a curated set of websites loaded via Kubo versus HTTP, noting that IPFS frequently outperforms HTTP thanks to content addressing and universal caching. Current measurement focus is on the public IPFS DHT, Bitswap, and Hydras, with plans to consolidate everything under stats.ipfs.network and to extend coverage to additional content-routing subsystems in collaboration with other protocol teams.

2022

PL Demo Day Dec 2022

Unplugging the Hydra-Booster DB

Dennis Trautwein

A PL Demo Day 2022 update on the decision to "unplug" the shared database behind Protocol Labs' Hydra Booster fleet, which at the time consisted of around 135 Hydras with roughly 2,000 heads making up about 12% of the IPFS DHT. Dennis explains that the heads were retained for their network-bridging role but stripped of provider-record storage and serving to cut operating costs, with a preceding A/B experiment running two six-node Kubo fleets across regions — one ignoring Hydra responses, one not — to estimate the latency impact at the 50th, 90th, and 95th percentiles. After the cutover on December 1st, observed retrieval latency degraded slightly more than predicted, partly because skipping non-responsive Hydras still costs an extra DHT hop. The change yielded roughly 36% cost savings, with AWS egress (largely TCP handshakes, RTTs, and multistream-select negotiation) identified as the next-largest remaining cost driver and a candidate for further optimization.

PL Demo Day Dec 2022

NAT Hole Punching Campaign Update

Dennis Trautwein

A mid-campaign update from PL Demo Day 2022 on ProbeLab's distributed NAT hole punching measurement, in which volunteers run a "punchr" client that attempts libp2p hole punches against random peers across the public IPFS network. By mid-December the campaign had around 80–90 active clients per day across all continents except Africa, contributing roughly 150,000 hole-punch results daily and over three million in total. Dennis shows that the remote peers being punched span a geographically diverse set of countries, which compensates for client-side location skew, and reports a preliminary success rate fluctuating between 60–70% — slightly below earlier controlled-network measurements, as expected for a heterogeneous deployment. The talk closes with a call for additional participants and points to both the menu-bar macOS client and command-line tooling for joining the campaign.

IPFS Camp Oct 2022

Data-Driven Protocol Design: What it Is and What Are the Benefits

Yiannis Psaras

Yiannis Psaras introduces ProbeLab's data-driven approach to measuring and improving IPFS and libp2p protocols, framing the team's workflow of hypothesis-driven experiments using crawlers, controlled probes, and node logs. He shares early findings from running the Nebula crawler since 2021, including roughly 200,000 peers seen, a median peer churn of about one hour, and the surprising result that only around 3% of IPFS nodes run on centralized cloud infrastructure. The talk walks through the IPFS DHT provide and lookup process and highlights two performance bottlenecks identified through measurement: a slow provide operation taking tens of seconds despite peers being found in under half a second, and a content lookup pipeline where Bitswap adds a roughly one-second overhead before the DHT walk begins. Psaras closes with pointers to open studies on routing table health, provider record liveness, and the upcoming hole-punching measurement campaign for libp2p.

IPFS Camp Oct 2022

Decentralized NAT Hole-Punching

Dennis Trautwein

This talk presents a measurement study of decentralized NAT hole punching in libp2p, focusing on the DCUtR (Direct Connection Upgrade through Relay) protocol that enables peers behind NATs and firewalls to establish direct connections without centralized infrastructure. Dennis explains how DCUtR uses a relay to coordinate synchronized connection attempts via Connect and Sync messages, then describes a measurement setup combining a honeypot DHT server, a coordination server, and a fleet of clients to observe hole punch outcomes in the wild. The results show roughly an 80% success rate for Go clients, with most successful punches completing on the first attempt, while clients running over VPNs and the Rust IPFS implementation exhibit notably lower success rates, particularly over TCP. The talk also examines the no-stream error pattern in rust-ipfs nodes and discusses transport differences between TCP and QUIC, closing with a call for community participation in a hole-punch measurement campaign.

IPFS Camp Oct 2022

DHT Lookup Latency Performance

Dennis Trautwein

This talk investigates the role of Hydra boosters in IPFS content routing using the DynamoDB backing store of Protocol Labs' production Hydra deployment, which spans roughly 2,000 heads across 135 instances and handles tens of thousands of RPCs per second. Dennis verifies that Hydras achieve about 97% coverage of the 20-closest-peer neighborhoods across the hash space, then joins provider and peer records to map CIDs to geographic locations, finding that around 55% of unique CIDs are provided from the United States, followed by the Netherlands, France, and Germany. The analysis surfaces a high CID churn rate of roughly 50% per day, that 85% of CIDs have only a single provider, and that the top ten providers account for over half of all CIDs in the network, with one peer alone providing 13%. Latency experiments comparing lookups with and without Hydras show only a marginal speedup and in some regions no improvement at all, suggesting the Hydras' contribution to DHT lookup performance is smaller than intended. The talk also reports on prefetching effectiveness and the dominance of the negative cache among prefetch outcomes.

IPFS Camp Oct 2022

ProbeLab 2023 Roadmap

Yiannis Psaras

Yiannis Psaras walks through ProbeLab's planned 2023 measurement and protocol-improvement milestones for IPFS and libp2p. The roadmap covers quantifying whether Hydra boosters still add value, consolidating tools like the Nebula crawler and CID Hoarder into a continuous monitoring backend with a unified data lake and dashboards, revisiting the one-second Bitswap-before-DHT delay, finalizing new provider-record expiry and republish intervals, and integrating Thunderdome for automated regression testing of Kubo releases. Further items include shipping Optimistic Provide, replacing magic-number DHT timeouts with adaptive parameters, measuring hole-punching success rates via an opt-in user experiment, and rolling out double-hashing for DHT privacy with attention to the non-backward-compatible transition path. The session closes with GossipSub work focused on instrumenting Filecoin block propagation latency and verifying that the score-function-based mitigation strategies actually exclude misbehaving peers in production.

IPFS Camp Oct 2022

A Deep Dive into Provider Records

Mikel Cortes

In this IPFS Camp 2022 talk, Mikel Cortes presents a study on provider record liveness in the IPFS DHT, conducted in collaboration with ProbeLab. He explains how content publication actually distributes provider records (links between CIDs and provider PeerIDs) rather than the content itself, and walks through a methodology that publishes 10,000 CIDs without republishing, then probes the K=20 holders every 30 minutes for 36 hours. The results show that liveness is healthy, with a median of 18 successful initial connections and roughly 13 holders still serving records before the 24-hour expiry, while Hydra nodes prune more slowly due to their large databases. He also compares K values from 15 to 40, finding that publication latency scales roughly linearly with K while reachability stays similar, and demonstrates a Hydra-blacklisting experiment showing only modest cost in lookup hops and publish time. The takeaways argue for potentially lowering K from 20 to 15 and extending the 12-hour republish interval to reduce network overhead without harming retrievability.

PL Demo Day Oct 2022

The IPFS Network from the Hydras' Point of View

Dennis Trautwein

This short demo presents measurement results from analyzing the Hydra boosters' view of the IPFS network using their DynamoDB store of provider and peer records. Dennis verifies that Hydra heads are uniformly distributed across the hash space and appear within the 20 closest peers for over 97% of nodes found by the Nebula crawler, then uses the joined provider and peer records to map CIDs to geographic locations, showing that more than 50% of CIDs originate from the United States, followed by the Netherlands and France. The data reveals roughly 1 billion unique CIDs per day with about 50% daily churn, equivalent to around 120 terabytes of content rotating in and out of the network. He also highlights provider concentration, with a single peer providing 13% of all CIDs, and introduces Antares, a tool that publishes random content and queries it through gateways and pinning services to identify which peer IDs correspond to which operators. The talk closes by previewing follow-up experiments that exclude Hydras from content retrieval and publication to quantify their actual performance impact.

code.talks Sep 2022

Decentralized Storage with IPFS: How Does It Work Under the Hood?

Dennis Trautwein

A code.talks 2022 introduction to IPFS that walks through how the protocol actually operates beneath the user-facing commands. Trautwein covers content addressing and CIDs, how files are chunked and assembled into Merkle DAGs via IPLD, and how the Kademlia DHT distributes provider records across the network using XOR distance and bucketed routing tables. He explains the full content publishing and retrieval flow, including bootstrapping, routing-table construction, and why provider records are replicated across the 20 closest peers. The talk closes with results from his Nebula-based measurement campaign, including network crawls every 30 minutes, peer churn analysis showing ~50% of nodes leave within an hour, and findings that roughly 97% of IPFS nodes are not hosted on centralized cloud providers.
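
The XOR distance metric and the "20 closest peers" rule can be illustrated with a short sketch. It assumes, for simplicity, that both peers and content keys are mapped into a 256-bit keyspace with SHA-256; the peer names and helper functions are invented for the example.

```python
import hashlib

KEY_BITS = 256


def kad_id(name: bytes) -> int:
    """Map an identifier into the 256-bit keyspace (SHA-256 assumed here)."""
    return int.from_bytes(hashlib.sha256(name).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the bitwise XOR of two keys, read as an integer."""
    return a ^ b


def closest_peers(key: bytes, peers: list, k: int = 20) -> list:
    """Return the k peers whose IDs are closest to the key by XOR distance."""
    target = kad_id(key)
    return sorted(peers, key=lambda p: xor_distance(kad_id(p), target))[:k]


def common_prefix_length(a: int, b: int) -> int:
    """Number of leading bits two keys share; routing-table buckets group peers by this."""
    return KEY_BITS - (a ^ b).bit_length()


peers = [b"peer-%d" % i for i in range(100)]
holders = closest_peers(b"some-content", peers)
print(len(holders), common_prefix_length(kad_id(b"some-content"), kad_id(holders[0])))
```

Here `holders` plays the role of the peers that would store the provider record for `some-content`.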

PL EngRes Demo Day Sep 2022

IPFS Provider Record Liveness: Final Results

Yiannis Psaras

Yiannis Psaras presents the final results of a joint ProbeLab and Barcelona Supercomputing Center study on IPFS provider record liveness, motivated by churn rates as high as 70% within two hours that could leave content unreachable before the 12-hour reprovide cycle. Using the CIDHoarder tool, the team published provider records and tracked how many of the 20 replica holders remained online and continued serving the records over time. Results show that roughly 15 of the 20 nodes keep records reachable for more than 35 hours (around 12 when Hydra boosters are excluded), and 15 of the originally selected peers remain among the K-closest to the CID over the same window. Based on these findings, the talk recommends reducing the replication factor K from 20 to 15 and at least doubling the reprovide interval from 12 to 24 hours, which together would cut provider-record-related overhead on content publishers by a substantial margin.

SIGCOMM '22 Aug 2022

Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web

Yiannis Psaras

In this SIGCOMM 2022 technical session, the authors present the design and evaluation of IPFS (the InterPlanetary File System), a community-driven storage layer powering a growing slice of the decentralized web. The talk walks through how IPFS combines content-addressable storage, a Kademlia-based distributed hash table for peer and content discovery, and the Bitswap block-exchange protocol to let any participant publish, locate, and retrieve data without relying on centralized servers. Drawing on large-scale measurements of the live network, the speakers examine real-world performance — including content-routing latency, retrieval times, and the geographic distribution of peers — and discuss the trade-offs between decentralization, scalability, and user-perceived speed. They close by reflecting on lessons learned from operating a production decentralized storage system and outline open challenges for making the decentralized web competitive with traditional content delivery infrastructure.

IPFS þing Jul 2022

libp2p NAT Hole Punching Success Rate

Dennis Trautwein

A talk from IPFS þing 2022 on measuring the success rate of libp2p's hole punching protocol, DCUtR (Direct Connection Upgrade through Relay), which became broadly viable after Kubo 0.11 turned every node into a limited relay. Trautwein explains how NATs and firewalls block direct peer connections and walks through the DCUtR handshake, where peers exchange Connect and Sync messages over a relay and use round-trip-time measurements to coordinate simultaneous dial-outs. He then describes the punchr measurement setup: a honeypot that walks the DHT to attract NATed peers, a server exposing a gRPC API, and Go and Rust clients that perform hole punches and report outcomes. Initial results from a single home-network client show a ~72% success rate across ~13,300 attempts to ~2,500 unique peers, with 97% of successful punches landing on the first attempt and a median duration of around 0.9 seconds. He closes with next steps around comparing TCP vs. QUIC behavior, deploying more vantage points, and a call for participants to run clients from their own networks.
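
The role of the round-trip-time measurement can be sketched with a small timing model. It rests on an assumption about the handshake that the summary does not spell out: the peer that sends Sync waits half the measured RTT before dialing, and the other peer dials as soon as Sync arrives. The function name and latency values are hypothetical.

```python
def dial_times(sync_sent_at: float, latency_to_peer: float, latency_back: float):
    """Return (sender_dial_time, receiver_dial_time, skew) in seconds.

    latency_to_peer: one-way relay latency from the Sync sender to the receiver.
    latency_back: one-way relay latency in the opposite direction.
    """
    measured_rtt = latency_to_peer + latency_back       # what the Sync sender observes
    sender_dial = sync_sent_at + measured_rtt / 2       # assumed rule: wait half the RTT
    receiver_dial = sync_sent_at + latency_to_peer      # assumed rule: dial when Sync arrives
    return sender_dial, receiver_dial, abs(sender_dial - receiver_dial)


# Symmetric path: both sides dial at the same instant.
symmetric = dial_times(0.0, 0.05, 0.05)
# Asymmetric path: the half-RTT estimate is off and the dials are skewed.
asymmetric = dial_times(0.0, 0.08, 0.02)
print(symmetric, asymmetric)
```

Under this model the dials line up exactly only when the relay path is symmetric; any asymmetry shows up directly as skew between the two dial attempts.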

IPFS þing Jul 2022

Provider Records: Do They Stay Alive Long Enough?

Mikel Cortes

At IPFS þing 2022 in Reykjavík, Mikel Cortes presents an early version of ProbeLab's measurement study on whether IPFS provider records actually stay alive for their intended 24-hour lifetime. He details the publication mechanics (K=20 closest peers via XOR distance, 12-hour republish interval) and his measurement tool, which publishes random CIDs, pings the original holders every 30 minutes, and re-runs DHT lookups to track the in-degree of those holders over time. The data shows publication is slow (median around 12 seconds per CID, 95th percentile up to 44 seconds), roughly two of the 20 chosen peers are unreachable at publish time, and the original holders remain stable and close to the CID well beyond 24 hours, with Hydra nodes notably continuing to serve records past the expiry due to slower garbage collection. Comparing K=15, 20, and 25, he finds little practical difference in retrievability and only marginal latency gains, leading him to argue that extending the republish interval beyond 12 hours is a more promising optimization than lowering K. The talk closes with an open discussion on dynamic per-node K values driven by content popularity and observed reachability.
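The "K closest peers via XOR distance" selection can be sketched in a few lines. This is an illustrative example only, not the measurement tool from the talk: it assumes both CIDs and peer IDs are mapped into a shared keyspace with SHA-256, and the helper names `key_of` and `closest_peers` are hypothetical.

```python
import hashlib


def key_of(data: bytes) -> int:
    """Map an identifier into the shared keyspace (assumption: SHA-256)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two keys: their bitwise XOR as an integer."""
    return a ^ b


def closest_peers(cid: bytes, peer_ids, k=20):
    """Return the k peers whose keys are closest to the CID's key."""
    target = key_of(cid)
    return sorted(peer_ids, key=lambda p: xor_distance(key_of(p), target))[:k]


# Hypothetical peer IDs for illustration.
peers = [f"peer-{i}".encode() for i in range(100)]
holders = closest_peers(b"example-cid", peers, k=20)
```

Varying `k` between 15, 20, and 25 in a sketch like this changes only how many holders receive the record, which is the parameter the talk compares.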

Paris P2P #1 Apr 2022

IPFS Network Measurements and Improvement Opportunities

Yiannis Psaras

Yiannis Psaras introduces the IPFS Observatory, a measurement effort aimed at identifying bottlenecks and quantifying improvements in libp2p and IPFS protocols. He walks through two concrete findings: the content provide process takes tens to over 100 seconds but the relevant peers are typically discovered within the first second (motivating the Optimistic Provide work), and the Bitswap broadcast step adds a roughly one-second delay before the DHT lookup begins despite a low discovery rate. The talk also covers results from the Nebula crawler, including a roughly two-hour median session length, agent version uptake comparisons showing Kubo 0.9 as more stable than newer releases, churn behaviour of Hydra boosters, and the surprising finding that under 2% of IPFS nodes run on major cloud providers while the top 10 ASes hold around 65% of observed addresses. Psaras outlines further studies on hole-punching success rates, GossipSub performance, and DHT routing table health, and points to the dgm.xyz grants platform and an upcoming academic workshop for community involvement.

Paris P2P #1 Apr 2022

Optimistic Provide: Optimize the IPFS DHT

Dennis Trautwein

Presented at Paris P2P in April 2022, this talk introduces "Optimistic Provide," a proposal to make the IPFS DHT provide operation roughly an order of magnitude faster. Trautwein first reviews how Kademlia, routing tables, and the lookup algorithm drive content advertisement, then shows measurement data where the median provide takes around 30 seconds and the 95th percentile exceeds two minutes, even though the closest peers are typically discovered in under one second. He proposes two approaches: running multiple parallel lookups from different seed peers and terminating once their query queues intersect, or estimating peer proximity using a local network-size estimator (based on a published technique that avoids gossip) to decide when a discovered peer is "close enough" to receive the provider record. Preliminary results show the multi-query approach intersects too early and yields peers that are too far away in XOR distance, while the proximity-estimation approach produces closeness within the same order of magnitude as the status quo and a network-size estimate that tracks crawler ground truth. Next steps include sweeping parameters, evaluating the network-size estimator across a distributed fleet, and measuring the actual end-to-end publication-time improvement.

Paris P2P #1 Apr 2022

Gossipsub: A Gossip-Based Pubsub Protocol

Yiannis Psaras

Yiannis Psaras explains GossipSub, the libp2p pubsub protocol used by Filecoin and other blockchain networks, and the security extensions developed to harden it against Sybil, eclipse, censorship, cold-boot, and covert flash attacks. He details the protocol mechanics including the global mesh, local mesh degree D with D_low and D_high thresholds, heartbeat-driven graft and prune operations, and the eager-push plus lazy-pull gossip exchange where only message IDs are forwarded outside the mesh. The core hardening is a per-peer score function combining time-in-mesh, first-message deliveries, mesh failure penalties, invalid messages, an application-specific term, and IP colocation, complemented by mitigation strategies such as controlled mesh maintenance, flood publishing of first-seen messages, and prune backoff. Using Testground simulations with 5,000 peers and a 20:1 attacker-to-honest ratio, Psaras shows that GossipSub keeps propagation under the six-second Filecoin deadline across all attack scenarios while the Bitcoin and Ethereum 1 pubsub equivalents degrade significantly, and demonstrates how honest peers progressively prune attacker connections to reconstitute an honest-only mesh.
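The heartbeat-driven graft and prune logic around D, D_low, and D_high can be sketched as follows. This is a simplified illustration rather than the libp2p implementation: peers are chosen at random here, whereas the hardened protocol described in the talk also takes peer scores into account, and the default values for `d`, `d_low`, and `d_high` are placeholders chosen for the example.

```python
import random


def heartbeat(mesh, candidates, d=6, d_low=4, d_high=12, rng=None):
    """One simplified mesh-maintenance step for a single topic.

    Returns (grafted, pruned): peers to add to and remove from the mesh.
    Assumption: selection is uniformly random, ignoring peer scores.
    """
    rng = rng or random.Random()
    grafted, pruned = set(), set()
    if len(mesh) < d_low:
        # Too few mesh peers: graft known non-mesh peers until we reach d.
        pool = sorted(candidates - mesh)
        need = min(d - len(mesh), len(pool))
        grafted = set(rng.sample(pool, need))
    elif len(mesh) > d_high:
        # Too many mesh peers: prune back down to d.
        pruned = set(rng.sample(sorted(mesh), len(mesh) - d))
    return grafted, pruned


# Hypothetical peer names for illustration.
known = {f"peer-{i}" for i in range(20)}
small_mesh = {"peer-0", "peer-1"}
grafted, pruned = heartbeat(small_mesh, known, rng=random.Random(1))
```

A mesh whose size already sits between `d_low` and `d_high` is left untouched, so the thresholds act as hysteresis around the target degree.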

2021

Web3 Summit Oct 2021

Measuring the Web3.0 Stack

Yiannis Psaras

Yiannis Psaras introduces ProbeLab's systematic measurement program for the Web3 stack, built around the Nebula crawler that snapshots the IPFS DHT roughly every 30 minutes across a probes/storage/processing architecture. The talk reports IPFS DHT churn comparable to early BitTorrent, with around 60% of server peers staying online for 1.5 hours or less and 80% for under three hours, and classifies the network into roughly 14% always-on, 85% dangling, and 0.6% always-off nodes. Infrastructure attribution shows only about 13% of nodes run on known cloud providers (DigitalOcean, AWS, Azure) while the remainder run on home machines or unidentified hosts, and a Hong Kong-based correlation analysis finds no significant day/night uptime pattern. Psaras then breaks down end-to-end content lifecycle latency into DHT walk and put phases for publishing and retrieval, and shows agent-version breakdowns of failed provider record puts (around 36% on go-ipfs, 31% on Hydra boosters) as a starting point for targeted debugging.

IFIP DI2F Jun 2021

Introducing Peer Copy - A fully Decentralized File Transfer Tool

Dennis Trautwein

In this short conference presentation from IFIP Networking 2021, Dennis Trautwein introduces Peer Copy (pcp), a fully decentralized peer-to-peer file transfer tool built on libp2p and developed with co-authors Moritz Schubotz and Bela Gipp. The talk explains how pcp lets any two parties exchange a file using only a short, memorable sequence of words drawn from the BIP39 wordlist, with no accounts, no relay servers, and no central rendezvous service. Trautwein walks through the two-stage protocol. First, peers discover each other via multicast DNS when both endpoints sit on the same local network, and via provider records in the InterPlanetary File System's distributed hash table when they are separated by the public internet. Second, the shared word sequence is fed into a password-authenticated key exchange (PAKE) to derive a strong session key that authenticates the peers and encrypts the transfer. He contrasts this design with established tools like croc and magic-wormhole, which still depend on operator-run rendezvous infrastructure, and presents measurements showing that the fully decentralized approach can match the performance of these centralized alternatives while removing single points of failure and trust.

2020

Filecoin Conversations Oct 2020

Crafting the Filecoin Spec

Yiannis Psaras · Hugo Dias

Yiannis Psaras and Hugo Dias describe the multi-month effort to rebuild the Filecoin specification at spec.filecoin.io as a single source of truth spanning theory, implementation, and testing across an open-source project with many external contributors. Yiannis covers the content side, including health-monitoring dashboards that track section-level status (stable, reliable, work-in-progress), links to internal and external audit reports, an implementation-agnostic dashboard pulling CI status and test coverage from Lotus, Venus, Forest, and other clients, and a stabilization progress bar showing over 80% of sections at stable/reliable. Hugo then walks through the tooling built around three goals: easy (plain Markdown authoring with `npm start` live preview or direct GitHub editing), consistent (validation, link checking, and formatting enforced via CI), and enabling (advanced features like GitHub symbol-lookup embeds with comment extraction that keep spec text in sync with implementation source, Mermaid and GraphViz diagram pipelines, and KaTeX math rendering). They close with planned work on FIP-driven versioning and surfacing Oni team test results directly inside the spec website.

Filecoin Conversations Oct 2020

GossipSub v1.1

Yiannis Psaras · Dimitris Vyzovitis

Yiannis Psaras and Dimitris Vyzovitis walk through the design of GossipSub, the secure pub/sub message propagation protocol developed jointly by the libp2p team and ResNetLab for Filecoin and other permissionless blockchains. They explain the protocol's four core components: mesh construction with a tunable degree D governing how many peers each node connects to, gossip dissemination via IHAVE/IWANT messages over heartbeat intervals, a locally maintained peer scoring function weighing first-message deliveries, delivery rate, message validity, uptime, and IP colocation, and a set of mitigation strategies including controlled mesh maintenance, flood publishing, adaptive gossip dissemination, and backoff-on-prune. The discussion covers the trade-off between eager-push and lazy-pull regimes given Filecoin's six-second propagation deadline within its 30-second epoch, and reviews Testground evaluation results comparing GossipSub against Bitcoin's flooding and Ethereum's square-root propagation under eclipse, Sybil, and cold-boot attacks, where GossipSub stays well under the deadline and recovers honest-node mesh dominance within roughly 1.5 minutes even when Sybils outnumber honest nodes 20-to-1.