Some applications need more than data: they need proof of where it came from, for example “eligible data carries a proof it came from service X.” The protocol provides a standardized place to express that: a provenance commitment for a data point. The design is deliberately minimal so that it stays flexible. The protocol anchors a fingerprint (a hash) on-chain; the actual proof and metadata live off-chain. A buyer fetches the proof and checks it against the on-chain hash. This keeps the protocol cheap and private, and leaves it unopinionated about what the proof is. In some cases the proof is a ZK proof of a specific attribute of the data, for example that a given file contains a minimum number of conversations, or that a given video meets a quality requirement.
Coming soon. This is being added to the data registry so the protocol has a standardized place to express verifiability. It anchors provenance; the off-chain attestation is what establishes origin.

How it works

The protocol stores only hashes — no data, no metadata, and no file location on-chain. Each data type defines its own metadata schema off-chain; the protocol never parses formats.

What the protocol anchors

A few fields on the (scope-aware) data registry:
| Field | What it commits to |
| --- | --- |
| dataCommitment | A commitment to the exact data the buyer receives, recomputable by the buyer over the bytes they get |
| metadataHash | Hash of the off-chain metadata, including the provenance attestation |
| metadataURI (optional) | Where to fetch that metadata |
Metadata can live in a third-party service, on a personal server, or in any other off-chain store. Only authorized buyers can fetch it, and they verify it against the on-chain hash.
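The fetch-and-verify step can be sketched as follows. This is a minimal illustration rather than the protocol's implementation: the `RegistryEntry` type, the `verify_entry` function, and the use of SHA-256 over raw bytes are all assumptions, since the text does not specify the hash function or how the commitment is computed.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical view of the anchored fields (hex-encoded digests)."""
    data_commitment: str
    metadata_hash: str
    metadata_uri: Optional[str] = None


def sha256_hex(payload: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever hash the registry uses.
    return hashlib.sha256(payload).hexdigest()


def verify_entry(entry: RegistryEntry, data: bytes, metadata: bytes) -> bool:
    """Recompute both digests locally and compare them to the anchors."""
    return (
        sha256_hex(data) == entry.data_commitment
        and sha256_hex(metadata) == entry.metadata_hash
    )


# Usage: the buyer already holds the data bytes and the fetched metadata.
data = b"example payload"
metadata = b'{"attestation": "placeholder"}'
entry = RegistryEntry(sha256_hex(data), sha256_hex(metadata))
print(verify_entry(entry, data, metadata))          # True
print(verify_entry(entry, data + b"x", metadata))   # False
```

Nothing in this check needs to parse the metadata, which matches the idea that the anchor only compares hashes and leaves formats to each data type.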

Pluggable proofs

The protocol is not opinionated about the proof. The on-chain anchor is the same regardless of how provenance is actually established, so the proof system can evolve with no contract change:
  • Today: an attestation signed by a Vana verification service — “this data came from service X.”
  • Later: a decentralized attester network, or zkTLS-style proofs, or any other scheme a data type wants.
Attesters are deliberately not fixed. Depending on its requirements, a data type can rely on a centralized provider, a decentralized attester network, or a trusted execution environment (TEE) writing the attestation, and different data types can make different choices at the same time. This is also the answer to attester trust: a single signed attestation means trusting that signer, so where that assumption is too strong, a data type can choose a decentralized or TEE-based attester instead. Because the protocol stores only a hash of whatever proof exists, these implementations coexist behind the same fields and can evolve with no contract change.
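One way to picture this pluggability on the buyer side is a small verifier registry keyed by proof type. Everything here is hypothetical: the `proof` and `type` metadata keys, the `register` and `check_metadata` helpers, and the SHA-256 anchor check are assumptions made for illustration, and the example verifier is a placeholder that performs no real cryptographic verification.

```python
import hashlib
import json
from typing import Callable, Dict

# A verifier takes the parsed proof object and returns True if it accepts it.
Verifier = Callable[[dict], bool]

VERIFIERS: Dict[str, Verifier] = {}


def register(proof_type: str, verifier: Verifier) -> None:
    """Add or replace the verifier used for one proof type."""
    VERIFIERS[proof_type] = verifier


def check_metadata(metadata_bytes: bytes, anchored_hash: str) -> bool:
    # Step 1: the anchor check is identical for every proof scheme.
    if hashlib.sha256(metadata_bytes).hexdigest() != anchored_hash:
        return False
    # Step 2: dispatch to whichever verifier matches the proof type.
    metadata = json.loads(metadata_bytes)
    proof = metadata.get("proof", {})
    verifier = VERIFIERS.get(proof.get("type"))
    return verifier is not None and verifier(proof)


# Placeholder verifier: a real one would check a signature or a ZK proof.
register("signed-attestation", lambda proof: proof.get("signer") == "service-x")

blob = json.dumps(
    {"proof": {"type": "signed-attestation", "signer": "service-x"}}
).encode()
print(check_metadata(blob, hashlib.sha256(blob).hexdigest()))  # True
```

Swapping in a stronger scheme means registering another verifier; the anchor check in step 1 stays the same, which mirrors the "no contract change" property described above.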

Binding to the data

A provenance proof is only meaningful if it is about the exact data the buyer receives. Otherwise a buyer could verify a real, valid attestation that is not actually about the bytes they were handed (proof substitution). That is what dataCommitment is for, and why it is distinct from metadataHash:
  • dataCommitment binds the data; metadataHash binds the proof/metadata.
  • A future attestation must sign over or reference dataCommitment, so “service X produced this” is provably about this committed data, not a free-floating claim.
  • The binding handle has to exist in the fields from day one — otherwise the “harden to zkTLS with no contract change” property is lost.
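The binding requirement can be illustrated with a short sketch. It assumes the commitment is a plain SHA-256 digest and uses an HMAC as a stand-in for a real signature scheme (an HMAC is symmetric, so it is not suitable as an actual attestation signature). The function names and the attestation layout are hypothetical.

```python
import hashlib
import hmac
import json


def commit(data: bytes) -> str:
    # Assumption: the commitment is a plain SHA-256 digest of the bytes.
    return hashlib.sha256(data).hexdigest()


def _payload(source: str, data_commitment: str) -> bytes:
    # Canonical bytes that the attester signs; dataCommitment is included.
    return json.dumps(
        {"source": source, "dataCommitment": data_commitment}, sort_keys=True
    ).encode()


def sign_attestation(key: bytes, source: str, data_commitment: str) -> dict:
    # HMAC stands in for a signature over the payload.
    tag = hmac.new(key, _payload(source, data_commitment), hashlib.sha256).hexdigest()
    return {"source": source, "dataCommitment": data_commitment, "tag": tag}


def verify_bound(key: bytes, attestation: dict, received: bytes) -> bool:
    expected = hmac.new(
        key,
        _payload(attestation["source"], attestation["dataCommitment"]),
        hashlib.sha256,
    ).hexdigest()
    tag_ok = hmac.compare_digest(expected, attestation["tag"])
    # The attestation must be about the bytes actually received.
    bound = attestation["dataCommitment"] == commit(received)
    return tag_ok and bound


key = b"demo-key"
att = sign_attestation(key, "service-x", commit(b"real data"))
print(verify_bound(key, att, b"real data"))    # True
print(verify_bound(key, att, b"other data"))   # False: valid proof, wrong bytes
```

The second call is the proof-substitution case: the attestation itself is valid, but it does not cover the bytes the buyer was handed, so the check fails.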

What this gives you

A standardized, cheap, private slot to commit to a data point’s provenance, and an upgrade path from a signed attestation to stronger proofs (zkTLS, decentralized attesters) without touching the contract. The on-chain anchor commits to the data; the strength of the origin guarantee comes from the off-chain attestation, which buyers fetch and verify against the anchor. Freshness and revocation can be layered on as a data type requires.
Status. The fields are being added to the new data registry so verifiability has a standardized place to live; the attestation layer (and zkTLS hardening) follows.