Proof of contribution

Proof of Contribution (PoC) is how Data Liquidity Pools (DLPs) and DataDAOs verify that submitted data is real, unique, and attributable — and how they assign value (e.g. token rewards) to each contribution. It keeps pools Sybil-resistant and ties token issuance (e.g. of a VRC-20 token, when the pool uses that standard) to validated contributions.

What PoC does

  • Validation — Confirms data is authentic, from the claimed source, and meets the pool’s quality bar.
  • Scoring — Assigns a value or score to each contribution so rewards and governance weight reflect impact.
  • Attribution — Links contributions to wallet addresses so tokens and rewards go to the right contributors.
PoC evaluates four key elements:
  • Uniqueness — no redundant or copied data.
  • Quality — relevance and utility.
  • Ownership — the data belongs to the contributor.
  • Authenticity — the data is real and unaltered.
Contributors submit encrypted data, and validation runs in a Trusted Execution Environment (TEE) so raw data is never exposed. Once validated, an attestation (proof) is recorded onchain and the contributor can be rewarded. Each pool can define its own PoC logic (scripts, rules, or model-influence functions) depending on the type of data it handles.
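As a rough illustration, here is a minimal sketch of how a pool's PoC logic might combine these four elements into a single score. The interface, weights, and thresholds are assumptions for the sketch, not part of any Vana SDK:

```typescript
// Hypothetical PoC evaluation: each dimension is a 0–1 subscore produced by
// the pool's own checks; the pool combines them into one contribution score.
interface PocScores {
  authenticity: number; // data is real and unaltered
  ownership: number;    // data belongs to the contributor
  quality: number;      // relevance and utility to the pool
  uniqueness: number;   // not redundant with existing contributions
}

interface PocResult {
  valid: boolean;
  score: number; // 0–1, later used to size token rewards
  scores: PocScores;
}

function evaluateContribution(scores: PocScores): PocResult {
  // Equal weights are an arbitrary choice; each pool tunes its own.
  const score =
    0.25 * scores.authenticity +
    0.25 * scores.ownership +
    0.25 * scores.quality +
    0.25 * scores.uniqueness;
  // A pool might also gate validity on minimum per-dimension thresholds.
  const valid = scores.authenticity >= 0.5 && scores.ownership >= 0.5;
  return { valid, score, scores };
}
```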

Data refinement

Refinement ensures ingested data meets verifiable quality and security standards before it is stored and made available for permissioned query access. It is often integrated into a DLP’s PoC flow. Typical steps:
  1. Normalization — Structure the data to match the onchain schema (e.g. in DataRefinerRegistry).
  2. Masking — Optionally suppress fields the pool owner does not want to expose.
  3. Encryption — Encrypt the result and enforce access control.
The refined output is structured, optionally masked, and encrypted. The schema definition is uploaded to IPFS and its CID recorded onchain; a refiner image runs as part of PoC to perform the refinement. The refined data is then stored in decentralized storage (e.g. IPFS), its CID is written onchain, and the Query Engine can index it in a TEE for permissioned queries.
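A compact sketch of that pipeline, using Node's built-in crypto module. The helper names (normalizeToSchema, maskFields, encryptPayload) and the schema mapping are illustrative assumptions, not a published Vana API:

```typescript
// Illustrative refiner pipeline: normalize -> mask -> encrypt.
import { createCipheriv, randomBytes, scryptSync } from "crypto";

type RawRecord = Record<string, unknown>;

// 1. Normalization: map raw input onto the schema registered onchain.
// The field mapping here is invented for the sketch.
function normalizeToSchema(raw: RawRecord): RawRecord {
  return { user_id: raw["id"], activity: raw["events"] };
}

// 2. Masking: drop fields the pool owner does not want to expose.
function maskFields(record: RawRecord, masked: string[]): RawRecord {
  const out = { ...record };
  for (const field of masked) delete out[field];
  return out;
}

// 3. Encryption: encrypt the refined result before storage (e.g. on IPFS).
function encryptPayload(record: RawRecord, secret: string): Buffer {
  const key = scryptSync(secret, "refiner-salt", 32);
  const iv = randomBytes(16);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(record), "utf8"),
    cipher.final(),
  ]);
  // Prepend IV and auth tag so the payload can be decrypted later.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}
```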

Data validation (Satya network)

The recommended way to validate data securely on Vana is the Satya Network (Data Validators): confidential compute nodes that run Proof of Contribution inside hardware-based TEEs (e.g. Intel TDX). At a high level:
  1. The contributor uploads encrypted data and metadata (e.g. to the Data Registry).
  2. They request a proof-of-contribution job from the TEE Pool, paying a small fee.
  3. A Satya node is assigned; the contributor sends the encryption key and the PoC container image URL to that node (e.g. via POST /RunProof).
  4. The node decrypts the data in a protected environment, runs the PoC container, and produces an attestation.
  5. The attestation is uploaded (e.g. to IPFS) and the proof is written onchain; the node claims the fee.
The TEE Pool contract assigns nodes and tracks jobs; the DataDAO UI submits jobs (e.g. requestContributionProof(file_id, { value: job_fee })), listens for JobSubmitted, then calls the node’s RunProof API with file_id, job_id, encryption_key (or encrypted_encryption_key), encryption_seed, proof_url (Docker image), and optional env_vars / secrets. The PoC container receives system variables (e.g. FILE_ID, FILE_URL, JOB_ID, OWNER_ADDRESS) and custom env vars; it validates the data and outputs the attestation. For full request/response format and environment variables, see the official Satya and TEE Pool documentation.
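A sketch of the client side of this flow, assuming ethers v6, a placeholder contract address and node URL, and an ABI fragment modeled on the calls named above (the real TEE Pool interface may differ; check the official documentation):

```typescript
import { ethers } from "ethers";

// Placeholder ABI fragment modeled on the calls described above.
const teePoolAbi = [
  "function requestContributionProof(uint256 fileId) payable",
  "event JobSubmitted(uint256 indexed jobId, uint256 indexed fileId, address teeAddress)",
];

async function runProofFlow(signer: ethers.Signer, fileId: bigint, jobFee: bigint) {
  const teePool = new ethers.Contract("0xTEE_POOL_ADDRESS", teePoolAbi, signer);

  // 1. Request a PoC job, paying the node fee.
  const tx = await teePool.requestContributionProof(fileId, { value: jobFee });
  await tx.wait();

  // 2. Wait for the assigned node to be announced.
  teePool.once("JobSubmitted", async (jobId, _fileId, teeAddress) => {
    // 3. Call the assigned Satya node's RunProof API. The URL is a
    // placeholder (in practice you resolve it from teeAddress); the body
    // fields follow the description above.
    await fetch("https://satya-node.example/RunProof", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        file_id: Number(fileId),
        job_id: Number(jobId),
        encryption_key: "<symmetric key for the uploaded file>",
        encryption_seed: "<seed used when deriving the key>",
        proof_url: "https://registry.example/my-dlp-poc:latest", // PoC Docker image
        env_vars: {}, // custom variables passed to the PoC container
      }),
    });
  });
}
```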

Attestation schema

Attestations prove that data was evaluated by a trusted party (e.g. a Satya node). They live mostly offchain; a URL or reference is written onchain with the data. Typical structure:
  • subject — file_id, url, owner_address, decrypted_file_checksum, encrypted_file_checksum, encryption_seed.
  • prover — type (e.g. satya), address, url.
  • proof — image_url, created_at, duration, dlp_id, valid, score (0–1), and optional scores for authenticity, ownership, quality, uniqueness; attributes and metadata (e.g. for onchain reward calculation).
  • signature — Prover signs a stringified representation of the signed fields; verification recovers the signer address.
The score and metadata fields are often written onchain so the DataDAO can compute how many tokens to issue to the contributor.
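For illustration, here is an attestation-shaped object and a signature check. All values are invented, and the exact set of signed fields and signing scheme are assumptions, with ethers' verifyMessage standing in for whatever the prover actually uses:

```typescript
import { ethers } from "ethers";

// Illustrative attestation following the structure above; values are made up.
const attestation = {
  subject: {
    file_id: 1234,
    url: "ipfs://<encrypted-file-cid>",
    owner_address: "0xContributorAddress",
    decrypted_file_checksum: "sha256:<hex digest>",
    encrypted_file_checksum: "sha256:<hex digest>",
    encryption_seed: "<seed>",
  },
  prover: {
    type: "satya",
    address: "0xProverAddress",
    url: "https://satya-node.example",
  },
  proof: {
    image_url: "https://registry.example/my-dlp-poc:latest",
    created_at: 1718000000,
    duration: 4.2,
    dlp_id: 42,
    valid: true,
    score: 0.87,
    authenticity: 0.95,
    ownership: 1.0,
    quality: 0.8,
    uniqueness: 0.75,
  },
  signature: "0x<prover signature>",
};

// Recover the signer from a stringified representation of the signed fields.
// Assumption: the prover signed JSON.stringify of subject + proof.
function recoverProver(att: typeof attestation): string {
  const signed = JSON.stringify({ subject: att.subject, proof: att.proof });
  return ethers.verifyMessage(signed, att.signature);
}
// A verifier compares recoverProver(att) against att.prover.address.
```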

Customization and benefits

Each DLP can implement its own PoC function. For example, a social-media DataDAO might score engagement or activity, while a financial DataDAO might prioritize timeliness and accuracy. One recommended approach is a model influence function, which measures how much new information a data point adds to a model (a toy sketch follows below). Benefits:
  • Fair governance — contributors earn rights based on validated contributions.
  • Incentivization — rewards in the pool’s currency or VANA.
  • Privacy — validation without exposing raw data.
  • Data quality — only validated, useful data enters the pool.
When building a DataDAO, you configure verification and refinement logic as part of your DLP; see the DLP quickstart for where this fits in the deployment flow.
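One way to picture a model influence function, as a toy sketch: score a candidate point by how much it reduces loss on a holdout set. The mean-predictor "model" and squared-error loss are stand-ins chosen only to keep the example self-contained:

```typescript
// Toy model-influence score: how much does adding one data point reduce
// the model's loss on a holdout set?
type Point = { x: number; y: number };

// Trivial "model": predicts the mean of the training labels.
function trainMeanModel(data: Point[]): (x: number) => number {
  const mean = data.reduce((s, p) => s + p.y, 0) / data.length;
  return () => mean;
}

// Mean squared error on the holdout set.
function holdoutLoss(model: (x: number) => number, holdout: Point[]): number {
  return holdout.reduce((s, p) => s + (model(p.x) - p.y) ** 2, 0) / holdout.length;
}

function influenceScore(existing: Point[], candidate: Point, holdout: Point[]): number {
  const base = holdoutLoss(trainMeanModel(existing), holdout);
  const withNew = holdoutLoss(trainMeanModel([...existing, candidate]), holdout);
  // Positive when the contribution improves the model; clamped to [0, 1].
  return Math.min(1, Math.max(0, (base - withNew) / base));
}
```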