Dealbot Methodology
This document is the source of truth for how dealbot's Data Storage check works.
Source code links throughout this document point to the current implementation.
For event and metric definitions used by the dashboard, see Dealbot Events & Metrics.
A "data storage check" is dealbot's end-to-end test of uploading a piece to a storage provider (SP) and confirming the uploaded data is publicly discoverable and retrievable. ("Deal" is a synonym for "data storage check".)
Every data storage check, dealbot:
A successful operation requires all assertions in the table below to pass.
A check fails if any step fails or the deal exceeds its maximum allowed time. There are no timing-based quality assertions; operational timeouts exist only to prevent jobs from running indefinitely.
Each deal asserts the following for every SP:
| # | Assertion | How It's Checked | Sub Status Affected | Retries | Relevant Metric for Setting a Max Duration | Implemented? |
|---|---|---|---|---|---|---|
| 1 | SP accepts piece upload | Upload completes without error (HTTP 200); piece CID is returned | Upload | 1 | ingestMs | Yes |
| 2 | Piece submission recorded on-chain | Synapse piecesAdded progress event fires with a transaction hash | Onchain | n/a | pieceAddedOnChainMs | Yes |
| 3 | Piece is confirmed on-chain | Synapse piecesConfirmed progress event fires | Onchain | n/a | pieceConfirmedOnChainMs | Yes |
| 4 | SP indexes piece locally | PDP server reports indexed: true | Discoverability | n/a | spIndexLocallyMs | Yes |
| 5 | Content is discoverable on filecoinpin.contact | IPNI index returns a | Discoverability (indexer=filecoinpin.contact) | Polling with delay until timeout | ipniVerifyMs | Yes |
| 5b | Content is discoverable on cid.contact (observational cross-check) | IPNI index returns a | cid.contact Verification | Polling with delay until timeout | ipniVerifyMs | Yes |
| 6 | Content is retrievable | See Retrieval Check for specific assertions | Retrieval | 0 | ipfsRetrievalLastByteMs | Yes |
| 7 | All checks pass | Deal is not marked successful until all assertions pass within window | All four | n/a | dataStorageCheckMs | Yes |
The dealbot scheduler triggers data storage check jobs at a configurable rate.
flowchart TD
    CreateCar --> SelectDataSet["Select a dataset for data storage check"]
    SelectDataSet --> Upload["Upload CAR as piece to SP"]
    Upload --> Chain["Wait for on-chain piece creation confirmation"]
    Upload --> LocalIndex["Wait for SP local indexing"]
    LocalIndex --> IpniAnnouncement["Wait for SP to announce local index to IPNI"]
    IpniAnnouncement --> IpniVerification["IPNI verification"]
    LocalIndex --> IpfsRetrieval["SP /ipfs Retrieval Check"]
    Chain --> CheckResults["Mark data storage check successful if all steps pass"]
    IpniVerification --> CheckResults
    IpfsRetrieval --> CheckResults
Dealbot generates a random binary file with a unique name and embedded markers (prefix/suffix with timestamp and unique ID).
- File name: random-{timestamp}-{uniqueId}.bin
- Size: taken from RANDOM_PIECE_SIZES (default: 10 MiB)
Source: dataSource.service.ts
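
The naming pattern and size selection described above can be sketched as follows. This is a minimal illustration and not the code in dataSource.service.ts: the function names, the injectable random source, and the single-entry default list are assumptions made for the example. Only the file-name pattern and the 10 MiB default come from the text.

```typescript
// Hypothetical helpers for illustration; not dealbot's actual API.
const DEFAULT_PIECE_SIZES: number[] = [10 * 1024 * 1024]; // 10 MiB

// Builds a name of the form random-{timestamp}-{uniqueId}.bin.
function randomFileName(timestampMs: number, uniqueId: string): string {
  return `random-${timestampMs}-${uniqueId}.bin`;
}

// Picks one size from the configured list. `rand` must return a value in [0, 1).
function pickPieceSize(
  sizes: number[] = DEFAULT_PIECE_SIZES,
  rand: () => number = Math.random,
): number {
  if (sizes.length === 0) {
    throw new Error("no piece sizes configured");
  }
  // Clamp the index so an out-of-contract rand() value of exactly 1 cannot overflow.
  const index = Math.min(sizes.length - 1, Math.floor(rand() * sizes.length));
  const size = sizes[index];
  if (size === undefined) {
    throw new Error("piece size index out of range");
  }
  return size;
}
```

Injecting `rand` keeps the selection deterministic in tests while defaulting to `Math.random` in normal use.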
The raw data is converted to a CAR (Content Addressable Archive) file (via filecoin-pin integration). See https://github.com/filecoin-project/filecoin-pin/blob/master/documentation/behind-the-scenes-of-adding-a-file.md#create-car for more info.
Source: ipni.strategy.ts (convertToCar)
stored: SP confirms receipt (HTTP 2xx). Records the piece CID.
Source: deal.service.ts (createDeal)
After upload completes, dealbot waits for the piece to be confirmed onchain via Synapse executeUpload(...).onProgress events:
- piecesAdded: the SP reports that the piece submission has been recorded on-chain (transaction hash available).
- piecesConfirmed: the piece is confirmed on-chain by querying the chain RPC endpoint. filecoin-pin and synapse-sdk do this work under the hood.
After upload completes, dealbot polls the SP's PDP server to track the piece through its indexing lifecycle:
- sp_indexed: SP has indexed the piece locally. Any CID in the CAR is now retrievable with /ipfs/$CID retrieval, but it may not be discoverable by the rest of the network. Direct SP retrieval checking can commence.
- sp_advertised: SP has announced the piece index to IPNI. In IPNI terminology this is an "advertisement announcement" (see docs). IPNI indexing verification can commence.
- Poll interval: 2.5 seconds (hardcoded POLLING_INTERVAL_MS in ipni.strategy.ts)
When the SP returns indexedAt or advertisedAt, dealbot uses those provider-side timestamps for spIndexLocallyMs, spAnnounceAdvertisementMs, and data-storage ipniVerifyMs. If those fields are absent or unusable, dealbot falls back to the time it observed the status while polling.
Source: ipni.strategy.ts (monitorPieceStatus)
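
The timestamp fallback described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the logic in monitorPieceStatus: the type and function names are hypothetical, the provider timestamp is assumed to be an ISO 8601 string, "unusable" is assumed to mean unparseable or outside the window between upload completion and dealbot's own observation, and the duration is assumed to be measured from upload completion.

```typescript
// Hypothetical shape for one polled status transition (e.g. indexed or advertised).
interface StatusObservation {
  providerTimestamp?: string; // indexedAt or advertisedAt as reported by the SP (assumed ISO 8601)
  observedAtMs: number;       // epoch ms at which dealbot saw the status while polling
}

// Prefers the provider-side timestamp; falls back to dealbot's observation time
// when the field is absent or fails the assumed usability checks.
function resolveEventTimeMs(obs: StatusObservation, uploadDoneMs: number): number {
  if (obs.providerTimestamp !== undefined) {
    const parsed = Date.parse(obs.providerTimestamp);
    const usable = !isNaN(parsed) && parsed >= uploadDoneMs && parsed <= obs.observedAtMs;
    if (usable) {
      return parsed;
    }
  }
  return obs.observedAtMs;
}

// Duration from upload completion to the resolved event time (assumed baseline).
function durationSinceUploadMs(obs: StatusObservation, uploadDoneMs: number): number {
  return resolveEventTimeMs(obs, uploadDoneMs) - uploadDoneMs;
}
```

With a 2.5 second poll interval, the observation time can lag the real event by up to one interval, which is why the provider-side timestamp is preferred when it is usable.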
After the SP announces the piece index to IPNI, dealbot ensures the uploaded piece can be discovered by others with standard IPFS tooling. It does this in two sequential stages using the waitForIpniProviderResults function from the filecoin-pin library, passing an explicit ipniIndexerUrl for each call:
cidContactVerification but does not affect the Discoverability sub-status.
Additional notes:
- Polling interval: 2 seconds (configurable via IPNI_VERIFICATION_POLLING_MS)
- ipniVerifyMs indexer=filecoinpin.contact observation is measured from the SP's advertisedAt timestamp to the end of filecoinpin.contact verification when the SP provides a sane timestamp. This attributes the full "announced to visible on filecoinpin.contact" window instead of only dealbot's local polling window.
- ipniVerifyMs indexer=cid.contact observation is measured from filecoinpin.contact verification completion until verification completion on cid.contact. This captures the incremental propagation gap between the two indexers.
Source: ipni.strategy.ts (monitorAndVerifyIPNI)
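
The two ipniVerifyMs observations can be sketched as follows. This is a hedged illustration rather than the code in monitorAndVerifyIPNI: the interface and function names are hypothetical, and `announcedAtMs` is assumed to already hold either the SP's advertisedAt or dealbot's own observation time.

```typescript
// Hypothetical input: epoch-millisecond marks for one data storage check.
interface IpniTimings {
  announcedAtMs: number;           // SP advertisedAt, or dealbot's observation time as fallback
  filecoinpinVerifiedAtMs: number; // end of filecoinpin.contact verification
  cidContactVerifiedAtMs?: number; // end of cid.contact verification, if it completed
}

interface IpniVerifyObservation {
  indexer: "filecoinpin.contact" | "cid.contact";
  ipniVerifyMs: number;
}

function ipniVerifyObservations(t: IpniTimings): IpniVerifyObservation[] {
  const observations: IpniVerifyObservation[] = [
    // Full "announced to visible on filecoinpin.contact" window.
    { indexer: "filecoinpin.contact", ipniVerifyMs: t.filecoinpinVerifiedAtMs - t.announcedAtMs },
  ];
  if (t.cidContactVerifiedAtMs !== undefined) {
    // Incremental propagation gap between the two indexers.
    observations.push({
      indexer: "cid.contact",
      ipniVerifyMs: t.cidContactVerifiedAtMs - t.filecoinpinVerifiedAtMs,
    });
  }
  return observations;
}
```

Because the cid.contact observation starts where the filecoinpin.contact one ends, the two values add up to the total time from announcement to visibility on cid.contact.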
See Retrieval Check for the specifics of retrieving and verifying the returned bytes match the CID.
A deal's overall status is a function of four sub-statuses: Upload, Onchain, Discoverability, and Retrieval. The deal succeeds only if all four report success; if any one fails, the overall deal is a failure. The flow is sequential at the start, then branches:
sp_indexed.
flowchart TD
    U["Upload Status"]
    O["Onchain Status"]
    D["Discoverability Status"]
    CV["cid.contact Verification"]
    R["Retrieval Status"]
    OK["Data Storage Check success"]
    FAIL["Data Storage Check failure"]
    U -->|failure| FAIL
    U -->|success| O
    U -->|success| D
    D -->|sp_indexed| R
    O -->|failure| FAIL
    D -->|failure| FAIL
    R -->|failure| FAIL
    O -->|success| OK
    D -->|success| OK
    D -->|success| CV
    R -->|success| OK
It's expected that a Data Storage check will still store an overall status for easy querying:
| Overall Status | Meaning |
|---|---|
| pending | Upload Status = pending (i.e., piece upload to the SP hasn't started). |
| inProgress | Data Storage check is running. |
| success | All sub-statuses are success. |
| failure.timedout | Any sub-status is failure.timedout. |
| failure.other | Any sub-status is failure.other. |
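
The overall-status rules can be sketched as a function of the four sub-statuses. This is a minimal sketch, not the DealStatus logic in types.ts: the type and function names are hypothetical, a timeout is assumed to take precedence when both failure kinds are present, and the Discoverability value skipped is left out because its effect on the overall status is not specified here.

```typescript
// Status names follow the tables in this section; the types themselves are illustrative.
type BasicStatus = "pending" | "success" | "failure.timedout" | "failure.other";
type DiscoverabilityStatus = BasicStatus | "sp_indexed" | "sp_announced_advertisement";
type OverallStatus = BasicStatus | "inProgress";

interface SubStatuses {
  upload: BasicStatus;
  onchain: BasicStatus;
  discoverability: DiscoverabilityStatus;
  retrieval: BasicStatus;
}

function deriveOverallStatus(s: SubStatuses): OverallStatus {
  const all: string[] = [s.upload, s.onchain, s.discoverability, s.retrieval];
  // Assumption: a timeout wins if both failure kinds are present.
  if (all.indexOf("failure.timedout") !== -1) return "failure.timedout";
  if (all.indexOf("failure.other") !== -1) return "failure.other";
  // Success requires every sub-status to be success.
  if (all.every((status) => status === "success")) return "success";
  // Nothing has started until the upload leaves pending.
  if (s.upload === "pending") return "pending";
  // Upload is done and at least one sub-status is still pending or intermediate.
  return "inProgress";
}
```

For example, a check whose upload and on-chain confirmation succeeded but whose Discoverability is still at sp_indexed reports inProgress.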
| Upload Status | Meaning |
|---|---|
| pending | Piece upload to the SP hasn't started. |
| success | SP confirmed receipt of the piece. |
| failure.timedout | Failed to upload within the allotted time. |
| failure.other | Failed to upload for other reasons. |
| Onchain Status | Meaning |
|---|---|
| pending | Onchain verification hasn't started yet because it is waiting for a successful upload. |
| success | Piece confirmed on-chain (transaction hash recorded). |
| failure.timedout | Failed to confirm piece onchain within the allotted time. |
| failure.other | Failed to confirm piece onchain for other reasons. |
| Discoverability Status | Meaning |
|---|---|
| pending | Discoverability verification hasn't started yet because it is waiting for a successful upload. |
| sp_indexed | SP indexed the piece locally. |
| sp_announced_advertisement | SP announced the local index to IPNI so IPNI can pull it from the SP. |
| success | Root CID is discoverable via IPNI and the SP is listed as a provider in the IPNI response. |
| skipped | IPNI verification was not attempted because rootCID/blockCIDs are absent from deal metadata or rootCID cannot be parsed as a valid CID. |
| failure.timedout | Dealbot failed to confirm |
| failure.other | Dealbot failed to confirm |
| Retrieval Status | Meaning |
|---|---|
| pending | Retrieval checking hasn't started yet because Discoverability verification hasn't progressed past sp_indexed. |
| success | Piece was retrieved and verified with standard IPFS tooling. |
| failure.timedout | Piece wasn't retrieved and verified within the allotted time. |
| failure.other | Piece wasn't retrieved and verified for other reasons. |
Sources:
- types.ts (DealStatus)
- types.ts (IpniStatus)
Metric definitions live in Dealbot Events & Metrics.
Key environment variables that control deal creation behavior:
| Variable | Description |
|---|---|
| RANDOM_PIECE_SIZES | Possible random file sizes in bytes for data-storage checks. See docs/environment-variables.md#random_piece_sizes for defaults and examples. |
Source: apps/backend/src/config/app.config.ts
See also: docs/environment-variables.md for the source-of-truth configuration reference.
See https://github.com/filecoin-project/filecoin-pin/blob/master/documentation/content-routing-faq.md#why-is-there-filecoinpincontact-and-cidcontact
The items below were previously TBD and are now implemented. Tracking issue: https://github.com/FilOzone/dealbot/issues/280.
| Item | Status |
|---|---|
| Inline retrieval verification | Done: deal.service.ts runs testAllRetrievalMethods inline; deal throws on failure. |
| CID-based content verification | Done: ipfs-block.strategy.ts traverses the DAG and uses createBlock({ bytes, cid, hasher: sha256 }) which throws on hash mismatch (per-block CID integrity). |
| Per-deal max time limit | Done: DEAL_JOB_TIMEOUT_SECONDS triggers an AbortController in jobs.service.ts; on abort the deal is set to DealStatus.FAILED with failure-status metrics emitted. |
| Deal gated on all checks | Done: deal only reaches DealStatus.DEAL_CREATED after upload, onchain, IPNI, and retrieval all succeed. |
| Status model update | Done: DealStatus includes PIECE_CONFIRMED, DEAL_CREATED, FAILED; IpniStatus includes SP_INDEXED, SP_ADVERTISED, VERIFIED, FAILED; RetrievalStatus enum exists. |
| piecesConfirmed progress event tracking | Done: piecesConfirmedTime recorded, pieceConfirmedOnChainMs histogram emitted, DealStatus.PIECE_CONFIRMED state exists. |
| IPFS gateway retrieval verification | Done: inline retrieval runs after sp_indexed. |
| filecoin-pin CAR conversion | Done: car-utils.ts uses createCarFromPath from filecoin-pin/core/unixfs; deal.service.ts imports executeUpload from filecoin-pin. |