Quickstart Guide

Who Is This For?

This guide walks you through the fastest way to start using the Vana Data Access Layer, whether you are:

  • A DataDAO → Refining and publishing structured datasets.
  • A Developer or Researcher → Querying decentralized datasets securely.

📘 Note

This system is currently in closed alpha. Some steps may evolve. Contact the team on Discord for early access.


For DataDAOs

Step 1: Define Your Dataset Schema

  • Create a schema definition to structure your dataset.
  • Upload the schema definition to IPFS and note the schema's content identifier (CID); you will need it for the refiner registration in Step 2.
  • Example schema format:
{
  "name": "social_media_posts",
  "version": "0.0.1",
  "description": "Refined social media dataset",
  "dialect": "sqlite",
  "schema": "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT, timestamp DATETIME);"
}
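
One way to publish the schema and capture its CID, sketched below in Python against a local Kubo (IPFS) node's HTTP RPC API. The node address and /api/v0/add endpoint are Kubo defaults, not Vana-specific; pinning services expose similar upload APIs:

import json
import requests

# The example schema definition from above.
schema = {
    "name": "social_media_posts",
    "version": "0.0.1",
    "description": "Refined social media dataset",
    "dialect": "sqlite",
    "schema": "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT, timestamp DATETIME);",
}

# Kubo's /api/v0/add returns the content identifier as "Hash".
resp = requests.post(
    "http://127.0.0.1:5001/api/v0/add",
    files={"file": ("schema.json", json.dumps(schema))},
)
resp.raise_for_status()
schema_cid = resp.json()["Hash"]
print("Schema CID:", schema_cid)  # reference this when registering your refiner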

Step 2: Refine & Encrypt Your Data

  • Extend your Proof-of-Contribution (PoC) mechanism with a Dockerized data refinement step:
    • Normalize - Convert raw data into a structured SQLite format.
    • Mask (optional) - Suppress any data that should not be accessed through standard queries.
    • Encrypt - Encrypt the dataset to protect against unauthorized access.
  • Add the data refiner to the Data Refiner Registry contract, referencing both the schema CID and the (Docker) refinement instruction URLs.
  • Run the data refinement step on uploaded file contents as part of your DataDAO's PoC flow (a refinement sketch follows).
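
A minimal sketch of the three refinement operations, assuming raw posts arrive as Python dicts; Fernet symmetric encryption stands in for whatever scheme your refinement template actually specifies:

import sqlite3
from cryptography.fernet import Fernet  # pip install cryptography

def refine(raw_posts: list[dict], key: bytes) -> bytes:
    # Normalize: load raw records into the SQLite schema registered in Step 1.
    conn = sqlite3.connect("refined.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, "
        "user_id TEXT, content TEXT, timestamp DATETIME)"
    )
    for post in raw_posts:
        # Mask (optional): suppress values that must not be queryable.
        content = "[REDACTED]" if post.get("sensitive") else post["content"]
        conn.execute(
            "INSERT INTO posts (user_id, content, timestamp) VALUES (?, ?, ?)",
            (post["user_id"], content, post["timestamp"]),
        )
    conn.commit()
    conn.close()

    # Encrypt: protect the refined dataset against unauthorized access.
    with open("refined.db", "rb") as f:
        return Fernet(key).encrypt(f.read())

encrypted = refine(
    [{"user_id": "u1", "content": "hello", "timestamp": "2024-01-01T00:00:00"}],
    Fernet.generate_key(),
)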

Step 3: Store & Publish the Data

  • Upload the encrypted refined data point to IPFS.
  • Link the data point to the original data by persisting the refined data's content identifier (CID) to the corresponding file's refinements in the Data Registry.
  • Define access policies and pricing for all potential data consumers by creating and approving general permission requests in the Query Engine contract. A sketch of this publishing flow follows.
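
A publishing sketch with web3.py. The RPC URL is an assumption, and the contract address, ABI fragment, and addRefinement name are illustrative placeholders rather than the real Data Registry interface; consult the contract documentation for the actual calls:

import requests
from eth_account import Account
from web3 import Web3

# Illustrative placeholders; substitute the real Data Registry address and ABI.
DATA_REGISTRY = "0x0000000000000000000000000000000000000000"
REGISTRY_ABI = [{
    "name": "addRefinement", "type": "function", "stateMutability": "nonpayable",
    "inputs": [{"name": "fileId", "type": "uint256"},
               {"name": "refinerId", "type": "uint256"},
               {"name": "url", "type": "string"}],
    "outputs": [],
}]

# 1. Upload the encrypted refined data point to IPFS (same Kubo endpoint as Step 1).
with open("refined.db.enc", "rb") as f:
    cid = requests.post(
        "http://127.0.0.1:5001/api/v0/add", files={"file": f}
    ).json()["Hash"]

# 2. Persist the refinement CID against the original Data Registry file.
w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
account = Account.from_key("0x...")  # your DataDAO's wallet key
registry = w3.eth.contract(address=DATA_REGISTRY, abi=REGISTRY_ABI)
file_id, refiner_id = 1, 1  # from the file upload and refiner registration
tx = registry.functions.addRefinement(
    file_id, refiner_id, f"ipfs://{cid}"
).build_transaction({
    "from": account.address,
    "nonce": w3.eth.get_transaction_count(account.address),
})
signed = account.sign_transaction(tx)
w3.eth.send_raw_transaction(signed.raw_transaction)  # .rawTransaction on web3.py v6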

For Data Consumers

Step 1: Discover Available Datasets

  • Use the Data Refiner Registry contract to find structured datasets and their schemas.
  • Alternatively, explore which DLPs provide suitable general data access permissions through the Query Engine contract.

📘 Note

Can't find what you need? Start a conversation with us through our Discord, or contact DLP builders directly through their public channels.
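
A discovery sketch with web3.py: read one refiner entry and fetch its schema from a public IPFS gateway. The refiners(uint256) getter and its return shape are hypothetical; check the deployed Data Refiner Registry ABI for the real read methods:

import requests
from web3 import Web3

REFINER_REGISTRY = "0x0000000000000000000000000000000000000000"  # placeholder
REGISTRY_ABI = [{  # hypothetical read method
    "name": "refiners", "type": "function", "stateMutability": "view",
    "inputs": [{"name": "refinerId", "type": "uint256"}],
    "outputs": [{"name": "dlpId", "type": "uint256"},
                {"name": "name", "type": "string"},
                {"name": "schemaDefinitionUrl", "type": "string"}],
}]

w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
registry = w3.eth.contract(address=REFINER_REGISTRY, abi=REGISTRY_ABI)
dlp_id, name, schema_url = registry.functions.refiners(1).call()

# Resolve the schema from a public gateway to inspect tables and columns.
cid = schema_url.removeprefix("ipfs://")
schema = requests.get(f"https://ipfs.io/ipfs/{cid}").json()
print(name, "tables:", schema["schema"])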

Step 2: Request Access & Pay for Queries

  • Contact the DataDAO off-chain to propose a permission request with query pricing for the dataset, tables, or columns you need.
  • Once the request is approved by the DataDAO through the Query Engine contract, you can execute queries targeting the approved data.
  • Pre-pay the required query fee in $VANA to the Compute Engine through the Job Registry contract, as sketched below.
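
A hedged sketch of the pre-payment, sending $VANA (the chain's native token) along with a job registration call; the submitJob name and its parameters are hypothetical stand-ins for the real Job Registry interface:

from eth_account import Account
from web3 import Web3

JOB_REGISTRY = "0x0000000000000000000000000000000000000000"  # placeholder
JOB_REGISTRY_ABI = [{  # hypothetical payable entry point
    "name": "submitJob", "type": "function", "stateMutability": "payable",
    "inputs": [{"name": "maxTimeout", "type": "uint256"}],
    "outputs": [],
}]

w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
account = Account.from_key("0x...")  # your consumer wallet key
jobs = w3.eth.contract(address=JOB_REGISTRY, abi=JOB_REGISTRY_ABI)

# The fee is the query price agreed in the approved permission request.
tx = jobs.functions.submitJob(300).build_transaction({
    "from": account.address,
    "value": w3.to_wei(1, "ether"),  # pre-paid query fee in $VANA
    "nonce": w3.eth.get_transaction_count(account.address),
})
signed = account.sign_transaction(tx)
w3.eth.send_raw_transaction(signed.raw_transaction)  # .rawTransaction on web3.py v6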

Step 3: Run Queries on the Data

  • Write your query to run against the target permission schema.
  • Submit a new query job to the Compute Engine with an HTTP POST request to the API's jobs endpoint.
  • The most recently approved permissions that support your data query are applied automatically.
  • Provide an optional webhook URL for a callback with result artifacts when the query has completed (see the sketch below).
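
A submission sketch; the base URL, endpoint path, and payload field names are illustrative placeholders for the Compute Engine API, and authentication is omitted:

import requests

COMPUTE_ENGINE = "https://compute-engine.example.vana.org"  # placeholder base URL

job = {
    # Field names are illustrative; see the Compute Engine API reference.
    "query": "SELECT user_id, COUNT(*) AS posts FROM posts GROUP BY user_id",
    "refiner_id": 1,  # the schema your approved permission targets
    "webhook_url": "https://example.com/hooks/query-done",  # optional callback
}

resp = requests.post(f"{COMPUTE_ENGINE}/jobs", json=job, timeout=30)
resp.raise_for_status()
job_id = resp.json()["id"]  # keep this to fetch results in Step 4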

Step 4: Retrieve & Process Results

  • Once pre-payment is confirmed, the query is executed.
  • If a webhook URL was provided, it will be called with the resulting artifacts after the query has successfully completed.
  • Otherwise, query the job results directly from the Compute Engine, as sketched below.
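
A polling sketch for the no-webhook case; endpoint paths and response fields are again illustrative placeholders:

import time
import requests

COMPUTE_ENGINE = "https://compute-engine.example.vana.org"  # placeholder base URL
job_id = "..."  # returned when the job was submitted in Step 3

# Poll until the job finishes, then download its result artifacts.
while True:
    status = requests.get(f"{COMPUTE_ENGINE}/jobs/{job_id}", timeout=30).json()
    if status["status"] in ("success", "failed"):
        break
    time.sleep(5)

if status["status"] == "success":
    for artifact in status["artifacts"]:
        data = requests.get(artifact["url"], timeout=30).content
        with open(artifact["name"], "wb") as f:
            f.write(data)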