Performance Audit of libp2p’s AutoNAT
Introduction
Peer-to-peer networks built on libp2p require nodes to determine whether their addresses are reachable from the internet. Most residential and mobile devices sit behind Network Address Translation (NAT), a router technique that maps private IP addresses to a shared public IP. While NAT allows outbound connections, it impedes direct inbound traffic. A node that doesn't know it's behind a NAT may advertise unreachable addresses, participate as a Distributed Hash Table (DHT) server when it can't serve queries, or fail to reserve the relay connections it needs.
libp2p's AutoNAT protocol solves this by having peers test whether a node's addresses are actually dialable from outside. AutoNAT v1 uses a simple majority vote; AutoNAT v2 (specified 2023, deployed 2024) improves on this with per-address testing and nonce-based verification.
This blog post summarizes ProbeLab’s main findings from a performance audit of libp2p’s AutoNAT and points to the scripts developed to carry out the study.
What is this all about?
Despite the widespread use of AutoNAT across libp2p implementations and libp2p-based networks, there has been no targeted, in-depth analysis of its performance and effectiveness. This project fills that gap: we investigated AutoNAT v2 across go-libp2p, rust-libp2p, and js-libp2p to evaluate whether it solves the reachability detection problem efficiently.
A companion Nebula crawl analysis of the IPFS Amino DHT confirms that DHT-mode flipping exists in production (2-12% per Kubo version), but most of it correlates with disconnections and restarts. Only ~0.39% of stably-reachable peers show flipping that cannot be explained by network events.
Who is this for?
The final audit report is for libp2p developers and node operators. It is a technical document accompanied by several scripts to reproduce the experiments we ran. The study is expected to help node operators configure their nodes correctly, in order to achieve the best connectivity performance as far as AutoNAT is concerned.
How to use?
The autonat-perf-audit repository includes all the code, scripts and software used for the experiments carried out during this study. It also includes: i) the project’s final audit report, a more detailed version of this blog post with further technical details and references, and, perhaps more importantly, ii) a "best practices" report, which serves as a per-implementation runbook for go-, rust- and js-libp2p for all things AutoNAT.
Findings at a glance
In controlled testbed conditions, AutoNAT v2 correctly determines reachability for all standard NAT types (0% false negatives/positives, ~6s convergence). However, we identified five findings where the broader reachability stack breaks. These relate to:
- how implementations wire the protocol results into downstream decisions,
- protocol-level limitations, and
- cross-implementation inconsistencies.
See the cross-implementation comparison for how each finding manifests per implementation.
The most impactful finding is that global (v1) and per-address (v2) reachability can disagree, and there is no canonical way to reconcile them.
Why this matters
In go-libp2p, the DHT and AutoRelay components subscribe to v1's global flag and do not consume v2's per-address signal. As a result, the unstable global flag can override v2's stable per-address result and drive DHT and relay decisions.
Downstream impact
This affects Kubo, which relies on go-libp2p's AutoNAT signal to decide DHT server participation (see Nebula crawl analysis).
Root cause
Both v1 and v2 select servers from the same peer pool. v1 treats every non-success result as evidence against reachability, whereas v2 discards such results entirely. As a result, v1 can oscillate due to server unreliability alone.
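The difference in how failed probes are counted can be illustrated with a toy model. The sketch below is not the go-libp2p logic: the function names, the sliding-window vote and the sample probe sequence are all assumptions chosen for illustration.

```python
from typing import List, Optional

# Probe outcome: True = dial-back succeeded, False = dial-back failed,
# None = the server errored, refused or timed out (an inconclusive result).
Probe = Optional[bool]

def v1_style_flag(probes: List[Probe], window: int = 3) -> List[str]:
    """Toy v1-like vote: every non-success (False or None) counts against."""
    flags = []
    for i in range(len(probes)):
        recent = probes[max(0, i - window + 1): i + 1]
        votes_for = sum(1 for p in recent if p is True)
        votes_against = len(recent) - votes_for
        flags.append("Public" if votes_for > votes_against else "Private")
    return flags

def v2_style_flag(probes: List[Probe], window: int = 3) -> List[str]:
    """Toy v2-like vote: inconclusive results (None) are discarded."""
    flags = []
    conclusive: List[bool] = []
    for p in probes:
        if p is not None:
            conclusive.append(p)
        recent = conclusive[-window:]
        if not recent:
            flags.append("Unknown")
            continue
        votes_for = sum(1 for r in recent if r)
        votes_against = len(recent) - votes_for
        flags.append("Reachable" if votes_for > votes_against else "Unreachable")
    return flags

# A genuinely reachable node that hits two unreliable servers in a row.
probes = [True, True, None, None, True, True]
print(v1_style_flag(probes))
print(v2_style_flag(probes))
```

In this model the v1-like flag dips to Private after the two inconclusive probes and only recovers later, while the v2-like result stays Reachable throughout, because the same server failures are never counted as votes.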
Spec and implementation gap
But what happens with other libp2p implementations? rust-libp2p and js-libp2p are not affected today because their DHT consumes v2-level signals only, but this is an implementation choice, not a spec requirement. Neither the v1 nor the v2 documentation clarifies what should happen if both protocols run, and the spec does not define it either.
Cross-implementation analysis
Only go-libp2p has v2 consumed by a production project (Kubo). rust-libp2p's v2 implementation works correctly when properly configured, but lacks a safety net when TCP port reuse fails. js-libp2p emits no reachability events from v2. No rust or js project deploys v2 in production.
Main findings
This is a summary of each of our findings from this study. Please refer to the full report for details.
Finding 1: Inconsistent Global vs Per-Address Reachability (v1 vs v2)
Problem: AutoNAT v1 produces a global reachability flag (one of {Public, Unknown, Private} for the whole node) while AutoNAT v2 produces per-address reachability (one of {Reachable, Unreachable, Unknown} for each multiaddr). When both protocols run on the same node, the two signals can disagree, and there is no canonical reduction defined by the spec or by any implementation.
Impact: Every go-libp2p deployment that relies on DHT participation or relay decisions may experience intermittent routing degradation. DHT queries fail when v1 flips the node to client mode; direct connections are replaced by higher-latency relay paths; the cycle repeats as v1 flips back. Validator networks see higher-latency relay paths during startup; IPFS nodes delay DHT participation and waste relay reservations. This is the production phenomenon that Obol observed (see Obol/Charon).
Solution: When v2 is enabled, v2's per-address reachability should be the source of truth for global reachability.
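One way to derive a global flag from v2's per-address results is sketched below. Since neither the spec nor any implementation defines a canonical reduction, the rule used here (Public if any address is Reachable, Private only if every address is Unreachable, Unknown otherwise) is an assumption for illustration, as are all type and function names.

```python
from enum import Enum
from typing import Dict

class AddrStatus(Enum):
    REACHABLE = "Reachable"
    UNREACHABLE = "Unreachable"
    UNKNOWN = "Unknown"

class GlobalStatus(Enum):
    PUBLIC = "Public"
    PRIVATE = "Private"
    UNKNOWN = "Unknown"

def reduce_to_global(per_addr: Dict[str, AddrStatus]) -> GlobalStatus:
    """Hypothetical reduction of per-address results to one global flag."""
    statuses = list(per_addr.values())
    if not statuses:
        # Nothing has been tested yet, so nothing can be claimed.
        return GlobalStatus.UNKNOWN
    if any(s is AddrStatus.REACHABLE for s in statuses):
        # One confirmed address is enough for peers to dial the node.
        return GlobalStatus.PUBLIC
    if all(s is AddrStatus.UNREACHABLE for s in statuses):
        return GlobalStatus.PRIVATE
    # A mix of Unreachable and Unknown: wait for more evidence.
    return GlobalStatus.UNKNOWN

example = {
    "/ip4/203.0.113.7/tcp/4001": AddrStatus.REACHABLE,
    "/ip4/203.0.113.7/udp/4001/quic-v1": AddrStatus.UNKNOWN,
}
print(reduce_to_global(example).value)
```

Treating a mixed Unreachable/Unknown set as Unknown is the conservative choice in this sketch; an implementation could equally decide to report Private sooner.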
Finding 2: UDP Black Hole Detector Blocks QUIC Dial-Back
Problem: go-libp2p's UDP black hole detector is a performance optimization that filters outbound UDP/QUIC dials when the node's recent UDP success rate is too low. The AutoNAT v2 dialerHost (the internal host that performs dial-backs) shares the main host's counter in read-only mode. Read-only mode redefines the filter semantics: any state other than Allowed is treated as Blocked. The net effect is that the v2 server cannot prove a client's QUIC address reachable.
Impact: The failure surfaces on freshly deployed infrastructure, low-traffic nodes, and isolated testbeds, where the main host does not generate enough successful outbound UDP traffic to leave the Probing state. As a result, when every reachable v2 server shares the same condition (the typical testbed case), the client never confirms its QUIC addresses. These addresses stay Unknown indefinitely, even though the node is genuinely reachable.
Solution: Remove the black hole detector from the v2 dialerHost entirely.
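The change in filter semantics can be captured in a few lines. This is a simplified model rather than go-libp2p code: the state names follow the text above, while the function name and the assumption that normal mode lets dials through while Probing are illustrative.

```python
from enum import Enum

class State(Enum):
    PROBING = "Probing"   # not enough dial outcomes collected yet
    ALLOWED = "Allowed"   # recent UDP success rate is high enough
    BLOCKED = "Blocked"   # recent UDP success rate is too low

def dial_permitted(state: State, read_only: bool) -> bool:
    """Hypothetical filter decision for an outbound UDP/QUIC dial."""
    if read_only:
        # Read-only mode: any state other than Allowed is treated as Blocked.
        return state is State.ALLOWED
    # Assumed normal mode: dials go through unless the detector is Blocked,
    # so a Probing detector can still gather evidence.
    return state is not State.BLOCKED

for state in State:
    print(state.value, dial_permitted(state, False), dial_permitted(state, True))
```

The only row where the two modes differ is Probing, which is exactly the state a fresh or low-traffic server is in when it is asked to perform a QUIC dial-back.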
Finding 3: Address-Restricted NAT False Positive
Problem: AutoNAT v2 produces a 100% false positive rate for nodes behind address-restricted NAT (Endpoint-Independent Mapping, EIM, combined with Address-Dependent Filtering, ADF). The dial-back comes from the same IP the client already contacted, so the NAT allows it through. The node is technically reachable from the testing server's IP, but AutoNAT declares it globally reachable, which is incorrect. Peers connecting from other IPs will be blocked by the NAT.
Impact: Nodes behind ADF NAT would advertise addresses as globally reachable when they are only reachable from previously contacted IPs. In practice, however, ADF is not known to be used by any modern consumer router, so the impact should be minimal.
Solution: Require dial-back from a different IP than the one the client contacted, when the server is multihomed (has multiple public IPs available). When multihomed servers are not available, the limitation should be documented in the spec so that implementations can flag the result as "reachable from contacted IPs only" rather than "globally reachable."
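A minimal model of address-dependent filtering shows why the dial-back source IP matters. The class name and the documentation-range IP addresses below are hypothetical, and the model ignores ports and mapping timeouts.

```python
class AddressRestrictedNAT:
    """Toy EIM + ADF NAT: inbound packets pass only from already-contacted IPs."""

    def __init__(self) -> None:
        self.contacted_ips = set()

    def outbound(self, remote_ip: str) -> None:
        # An outbound connection opens the filter for that remote IP only.
        self.contacted_ips.add(remote_ip)

    def allows_inbound(self, remote_ip: str) -> bool:
        return remote_ip in self.contacted_ips

nat = AddressRestrictedNAT()
nat.outbound("198.51.100.10")  # client sends its request to the AutoNAT server

same_ip_dial_back = nat.allows_inbound("198.51.100.10")   # server's primary IP
other_ip_dial_back = nat.allows_inbound("198.51.100.11")  # server's second IP
third_party_dial = nat.allows_inbound("192.0.2.55")       # unrelated peer

print(same_ip_dial_back, other_ip_dial_back, third_party_dial)
```

A dial-back from the contacted IP succeeds even though unrelated peers are blocked, which is the false positive. A dial-back from the server's second IP is filtered just like a third-party dial, so it yields the verdict that actually applies to the rest of the network.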
Finding 4: Symmetric NAT Missing Signal
Problem: Under symmetric NAT (Address- and Port-Dependent Mapping, ADPM), each outbound connection uses a different external port. The expected signal is UNREACHABLE. Instead, go-libp2p and js-libp2p produce no signal at all and the status remains Unknown indefinitely. All three implementations fail to produce a timely reachability signal, but for different reasons (see full report for details).
Impact: The status stays Unknown instead of transitioning to Private.
Solution: For go-libp2p: AmbientAutoNAT should subscribe to EvtNATDeviceTypeChanged and transition to Private when symmetric NAT (EndpointDependent) is detected. For js-libp2p: emit reachability events so QUIC dial-back failures surface as UNREACHABLE rather than silent removal.
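The mapping behaviour of a symmetric NAT described in this finding can be modelled as follows. This is a toy sketch: the class name, the sequential port allocation and the addresses are assumptions, and real NATs allocate ports in implementation-specific ways.

```python
import itertools
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (ip, port)

class SymmetricNAT:
    """Toy ADPM NAT: a separate external port per remote endpoint."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._mappings: Dict[Endpoint, int] = {}

    def external_addr(self, dest: Endpoint) -> Endpoint:
        # A new mapping is created for every destination not seen before.
        if dest not in self._mappings:
            self._mappings[dest] = next(self._next_port)
        return (self.public_ip, self._mappings[dest])

nat = SymmetricNAT("203.0.113.7")
peers = [("198.51.100.10", 4001), ("198.51.100.20", 4001), ("192.0.2.55", 4001)]
observed = [nat.external_addr(p) for p in peers]
print(observed)
```

Each remote peer observes the node at a different external port, and a mapping created for one peer is not the one another peer would reach, which is why the expected outcome for such a node is UNREACHABLE.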
Finding 5: rust-libp2p TCP Port Reuse Incorrect Metadata
Problem: rust-libp2p's TCP transport produces incorrect PortUse metadata. When an outbound dial requests PortUse::Reuse but bind() falls back to an ephemeral port (because the listen port isn't available yet), the connection metadata still says PortUse::Reuse. This is wrong because the connection used an ephemeral port, so the metadata should say PortUse::New.
Impact: The incorrect metadata causes an ephemeral-port address to enter the address manager when it shouldn't.
Solution: Fix PortUse metadata in libp2p-tcp: when the outbound dial falls back to an ephemeral port, construct the connection with PortUse::New instead of PortUse::Reuse.
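The proposed rule is that the metadata should describe the port that was actually used, not the one that was requested. The sketch below models this in Python rather than Rust and is not rust-libp2p code: apart from the PortUse variants named above, every name is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class PortUse(Enum):
    REUSE = "Reuse"  # dial from the listen port
    NEW = "New"      # dial from an ephemeral port

@dataclass
class DialResult:
    local_port: int
    port_use: PortUse

def dial(requested: PortUse, listen_port: Optional[int], ephemeral_port: int) -> DialResult:
    """Hypothetical outbound dial; listen_port is None if no listener is bound yet."""
    if requested is PortUse.REUSE and listen_port is not None:
        return DialResult(local_port=listen_port, port_use=PortUse.REUSE)
    # Either New was requested or reuse was impossible: the connection uses
    # an ephemeral port, so the metadata must say New.
    return DialResult(local_port=ephemeral_port, port_use=PortUse.NEW)

print(dial(PortUse.REUSE, listen_port=4001, ephemeral_port=51234))
print(dial(PortUse.REUSE, listen_port=None, ephemeral_port=51234))
```

With the metadata corrected at the source, downstream consumers can tell a fallback connection apart from a true port-reuse connection without any extra checks.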
Conclusion
All five findings were filed as upstream tracking issues in the relevant libp2p repositories. Each issue describes the problem and proposes a specific fix. See the Upstream Issue column in Findings at a Glance in our GitHub repository for links and details.
For application developers who need to integrate the libp2p NAT stack today (Identify + AutoNAT + UPnP + AutoRelay + DCUtR) without waiting for the upstream fixes, see NAT stack best practices, which includes per-implementation runbooks for go-libp2p, rust-libp2p, and js-libp2p.