Quickstart Guide

Who Is This For?

This guide walks you through the fastest way to start using the Vana Data Access Layer, whether you are:

  • A DataDAO → Refining and publishing structured datasets.
  • A Developer or Researcher → Querying decentralized datasets securely.

📘 Note

This system is currently in closed alpha. Some steps may evolve. Contact the team on Discord for early access.


For DataDAOs

Step 1: Define Your Dataset Schema

  • Create a schema definition to structure your dataset.
  • Upload the schema definition to IPFS and note the schema's content identifier (CID); you will need it for the refiner registration in Step 2.
  • Example schema format:
{
  "name": "social_media_posts",
  "version": "0.0.1",
  "description": "Refined social media dataset",
  "dialect": "sqlite",
  "schema": "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT, timestamp DATETIME);"
}
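
One way to publish the schema and capture its CID, sketched below in Python against a local Kubo (IPFS) node's HTTP RPC API. The node address and /api/v0/add endpoint are Kubo defaults, not Vana-specific; pinning services expose similar upload APIs:

import json
import requests

# The example schema definition from above.
schema = {
    "name": "social_media_posts",
    "version": "0.0.1",
    "description": "Refined social media dataset",
    "dialect": "sqlite",
    "schema": "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT, timestamp DATETIME);",
}

# Kubo's /api/v0/add returns the content identifier as "Hash".
resp = requests.post(
    "http://127.0.0.1:5001/api/v0/add",
    files={"file": ("schema.json", json.dumps(schema))},
)
resp.raise_for_status()
schema_cid = resp.json()["Hash"]
print("Schema CID:", schema_cid)  # reference this when registering your refiner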

Step 2: Refine & Encrypt Your Data

  • Extend your Proof-of-Contribution (PoC) mechanism with a Dockerized data refinement step:
    • Normalize - Convert raw data into a structured SQLite format.
    • Mask (optional) - Suppress any data that should not be accessed through standard queries.
    • Encrypt - Encrypt the dataset to protect against unauthorized access.
  • Add the data refiner to the Data Refiner Registry contract, referencing both the schema CID and the (Docker) refinement instruction URLs.
  • Run the data refinement step on uploaded file contents as part of your DataDAO's PoC flow (a refinement sketch follows).
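
A minimal sketch of the three refinement operations, assuming raw posts arrive as Python dicts; Fernet symmetric encryption stands in for whatever scheme your refinement template actually specifies:

import sqlite3
from cryptography.fernet import Fernet  # pip install cryptography

def refine(raw_posts: list[dict], key: bytes) -> bytes:
    # Normalize: load raw records into the SQLite schema registered in Step 1.
    conn = sqlite3.connect("refined.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, "
        "user_id TEXT, content TEXT, timestamp DATETIME)"
    )
    for post in raw_posts:
        # Mask (optional): suppress values that must not be queryable.
        content = "[REDACTED]" if post.get("sensitive") else post["content"]
        conn.execute(
            "INSERT INTO posts (user_id, content, timestamp) VALUES (?, ?, ?)",
            (post["user_id"], content, post["timestamp"]),
        )
    conn.commit()
    conn.close()

    # Encrypt: protect the refined dataset against unauthorized access.
    with open("refined.db", "rb") as f:
        return Fernet(key).encrypt(f.read())

encrypted = refine(
    [{"user_id": "u1", "content": "hello", "timestamp": "2024-01-01T00:00:00"}],
    Fernet.generate_key(),
)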

Step 3: Store & Publish the Data

  • Upload the encrypted refined data point to IPFS.
  • Link the data point to the original data by persisting the refined data's content identifier (CID) to the corresponding file's refinements in the Data Registry.
  • Define access policies and pricing for all potential data consumers by creating and approving general permission requests in the Query Engine contract. A sketch of this publishing flow follows.
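
A publishing sketch with web3.py. The RPC URL is an assumption, and the contract address, ABI fragment, and addRefinement name are illustrative placeholders rather than the real Data Registry interface; consult the contract documentation for the actual calls:

import requests
from eth_account import Account
from web3 import Web3

# Illustrative placeholders; substitute the real Data Registry address and ABI.
DATA_REGISTRY = "0x0000000000000000000000000000000000000000"
REGISTRY_ABI = [{
    "name": "addRefinement", "type": "function", "stateMutability": "nonpayable",
    "inputs": [{"name": "fileId", "type": "uint256"},
               {"name": "refinerId", "type": "uint256"},
               {"name": "url", "type": "string"}],
    "outputs": [],
}]

# 1. Upload the encrypted refined data point to IPFS (same Kubo endpoint as Step 1).
with open("refined.db.enc", "rb") as f:
    cid = requests.post(
        "http://127.0.0.1:5001/api/v0/add", files={"file": f}
    ).json()["Hash"]

# 2. Persist the refinement CID against the original Data Registry file.
w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
account = Account.from_key("0x...")  # your DataDAO's wallet key
registry = w3.eth.contract(address=DATA_REGISTRY, abi=REGISTRY_ABI)
file_id, refiner_id = 1, 1  # from the file upload and refiner registration
tx = registry.functions.addRefinement(
    file_id, refiner_id, f"ipfs://{cid}"
).build_transaction({
    "from": account.address,
    "nonce": w3.eth.get_transaction_count(account.address),
})
signed = account.sign_transaction(tx)
w3.eth.send_raw_transaction(signed.raw_transaction)  # .rawTransaction on web3.py v6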

For Data Consumers

Step 1: Discover Available Datasets

  • Use the Data Refiner Registry contract to find structured datasets and their schemas.
  • Alternatively, explore which DLPs provide suitable general data access permissions through the Query Engine contract.

📘 Note

Can't find what you need? Start a conversation with us through our Discord, or contact DLP builders directly through their public channels.
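
A discovery sketch with web3.py: read one refiner entry and fetch its schema from a public IPFS gateway. The refiners(uint256) getter and its return shape are hypothetical; check the deployed Data Refiner Registry ABI for the real read methods:

import requests
from web3 import Web3

REFINER_REGISTRY = "0x0000000000000000000000000000000000000000"  # placeholder
REGISTRY_ABI = [{  # hypothetical read method
    "name": "refiners", "type": "function", "stateMutability": "view",
    "inputs": [{"name": "refinerId", "type": "uint256"}],
    "outputs": [{"name": "dlpId", "type": "uint256"},
                {"name": "name", "type": "string"},
                {"name": "schemaDefinitionUrl", "type": "string"}],
}]

w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
registry = w3.eth.contract(address=REFINER_REGISTRY, abi=REGISTRY_ABI)
dlp_id, name, schema_url = registry.functions.refiners(1).call()

# Resolve the schema from a public gateway to inspect tables and columns.
cid = schema_url.removeprefix("ipfs://")
schema = requests.get(f"https://ipfs.io/ipfs/{cid}").json()
print(name, "tables:", schema["schema"])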

Step 2: Request Access & Pay for Queries

  • Contact the DataDAO off-chain to propose a permission request with query pricing for the dataset, tables, or columns you need.
  • Once the request is approved by the DataDAO through the Query Engine contract, you can execute queries targeting the approved data.
  • Pre-pay the required query fee in $VANA to the Compute Engine through the Job Registry contract, as sketched below.
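
A hedged sketch of the pre-payment, sending $VANA (the chain's native token) along with a job registration call; the submitJob name and its parameters are hypothetical stand-ins for the real Job Registry interface:

from eth_account import Account
from web3 import Web3

JOB_REGISTRY = "0x0000000000000000000000000000000000000000"  # placeholder
JOB_REGISTRY_ABI = [{  # hypothetical payable entry point
    "name": "submitJob", "type": "function", "stateMutability": "payable",
    "inputs": [{"name": "maxTimeout", "type": "uint256"}],
    "outputs": [],
}]

w3 = Web3(Web3.HTTPProvider("https://rpc.moksha.vana.org"))  # assumed testnet RPC
account = Account.from_key("0x...")  # your consumer wallet key
jobs = w3.eth.contract(address=JOB_REGISTRY, abi=JOB_REGISTRY_ABI)

# The fee is the query price agreed in the approved permission request.
tx = jobs.functions.submitJob(300).build_transaction({
    "from": account.address,
    "value": w3.to_wei(1, "ether"),  # pre-paid query fee in $VANA
    "nonce": w3.eth.get_transaction_count(account.address),
})
signed = account.sign_transaction(tx)
w3.eth.send_raw_transaction(signed.raw_transaction)  # .rawTransaction on web3.py v6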

Step 3: Run Queries on the Data

  • Write your query to run against the target permission schema.
  • Submit a new query job to the Compute Engine with an HTTP POST request to the API's jobs endpoint.
  • The most recently approved permissions that support your data query are applied automatically.
  • Provide an optional webhook URL for a callback with result artifacts when the query has completed (see the sketch below).
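
A submission sketch; the base URL, endpoint path, and payload field names are illustrative placeholders for the Compute Engine API, and authentication is omitted:

import requests

COMPUTE_ENGINE = "https://compute-engine.example.vana.org"  # placeholder base URL

job = {
    # Field names are illustrative; see the Compute Engine API reference.
    "query": "SELECT user_id, COUNT(*) AS posts FROM posts GROUP BY user_id",
    "refiner_id": 1,  # the schema your approved permission targets
    "webhook_url": "https://example.com/hooks/query-done",  # optional callback
}

resp = requests.post(f"{COMPUTE_ENGINE}/jobs", json=job, timeout=30)
resp.raise_for_status()
job_id = resp.json()["id"]  # keep this to fetch results in Step 4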

Step 4: Retrieve & Process Results

  • Once pre-payment is confirmed, the query is executed.
  • If a webhook URL was provided, it will be called with the resulting artifacts after the query has successfully completed.
  • Otherwise, query the job results directly from the Compute Engine, as sketched below.
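
A polling sketch for the no-webhook case; endpoint paths and response fields are again illustrative placeholders:

import time
import requests

COMPUTE_ENGINE = "https://compute-engine.example.vana.org"  # placeholder base URL
job_id = "..."  # returned when the job was submitted in Step 3

# Poll until the job finishes, then download its result artifacts.
while True:
    status = requests.get(f"{COMPUTE_ENGINE}/jobs/{job_id}", timeout=30).json()
    if status["status"] in ("success", "failed"):
        break
    time.sleep(5)

if status["status"] == "success":
    for artifact in status["artifacts"]:
        data = requests.get(artifact["url"], timeout=30).content
        with open(artifact["name"], "wb") as f:
            f.write(data)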