Quickstart Guide
Who Is This For?
This guide walks you through the fastest way to start using the Vana Data Access Layer, whether you are:
- A DataDAO → Refining and publishing structured datasets.
- A Developer or Researcher → Querying decentralized datasets securely.
Note
This system is currently in closed alpha. Some steps may evolve. Contact the team on Discord for early access.
For DataDAOs
Step 1: Define Your Dataset Schema
- Create a schema definition to structure your dataset.
- Upload the schema definition to IPFS and note the schema's content identifier (CID) for the refinement type registration in Step 2.
- Example schema format:
{
  "name": "social_media_posts",
  "version": "0.0.1",
  "description": "Refined social media dataset",
  "dialect": "sqlite",
  "schema": "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, content TEXT, timestamp DATETIME);"
}
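Before uploading, the schema definition can be checked locally against the shape shown above. A minimal sketch (the required field set and the validator function are illustrative, not part of the Vana tooling):

```python
import json

# Fields shown in the example schema definition above.
REQUIRED_FIELDS = {"name", "version", "description", "dialect", "schema"}

def validate_schema_definition(raw: str) -> dict:
    """Parse a schema definition and verify the example's fields exist."""
    definition = json.loads(raw)
    missing = REQUIRED_FIELDS - definition.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if definition["dialect"] != "sqlite":
        raise ValueError("this sketch only handles the sqlite dialect")
    return definition

definition = validate_schema_definition(json.dumps({
    "name": "social_media_posts",
    "version": "0.0.1",
    "description": "Refined social media dataset",
    "dialect": "sqlite",
    "schema": ("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, "
               "content TEXT, timestamp DATETIME);"),
}))
```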
Step 2: Refine & Encrypt Your Data
- Extend your proof-of-contribution mechanism with a Dockerized data refinement step:
- Normalize - Convert raw data into a structured SQLite format.
- Mask (optional) - Suppress any data that should not be accessed through standard queries.
- Encrypt - Encrypt the dataset to protect against unauthorized access.
- Add the data refiner to the Data Refiner Registry contract, referencing both the schema and the (Docker) refinement instruction URLs.
- Run the data refinement step on uploaded file contents as part of your DataDAO's Proof-of-Contribution (PoC) step.
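The normalize step above can be sketched with the example schema: raw records are loaded into the SQLite layout the schema declares. The raw input shape is an assumption for illustration, and the mask/encrypt steps are only noted in comments:

```python
import sqlite3

# Hypothetical raw records as they might arrive from a contributor upload.
raw_posts = [
    {"user_id": "u1", "content": "hello", "timestamp": "2024-01-01T00:00:00"},
    {"user_id": "u2", "content": "world", "timestamp": "2024-01-02T09:30:00"},
]

# Normalize: write the records into the table from the schema definition.
# A real refiner would write a .db file to upload, not an in-memory DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id TEXT, "
    "content TEXT, timestamp DATETIME);"
)
conn.executemany(
    "INSERT INTO posts (user_id, content, timestamp) VALUES (?, ?, ?)",
    [(p["user_id"], p["content"], p["timestamp"]) for p in raw_posts],
)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
# Mask (optional) and Encrypt would follow here before uploading to IPFS.
```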
Step 3: Store & Publish the Data
- Upload the encrypted refined data point to IPFS.
- Link the data point to the original data by persisting the refined data's content identifier (CID) to the corresponding Data Registry file's refinements.
- Define access policies & pricing for all potential data consumers by creating and approving general permission requests in the Query Engine contract.
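As a rough sketch of the access-policy step, a DataDAO might assemble parameters like the following before creating and approving a permission in the Query Engine contract. Every field name here is illustrative, not the contract's actual interface:

```python
# Hypothetical permission parameters; consult the Query Engine contract
# for the real fields and types.
permission_request = {
    "refiner_id": 12,                   # from the Data Refiner Registry
    "tables": ["posts"],                # tables consumers may query
    "columns": ["user_id", "content"],  # masked columns stay excluded
    "price_per_query_vana": "0.5",      # query pricing in $VANA
}
```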
For Data Consumers
Step 1: Discover Available Datasets
- Use the Data Refiner Registry contract to find structured datasets and their schemas.
- Alternatively, explore which DLPs provide suitable general data access permissions through the Query Engine contract.
Note
Can't find what you need? Start a conversation with us through our Discord or contact DLP builders directly through their public channels.
Step 2: Request Access & Pay for Queries
- Contact the DataDAO off-chain to propose a permission request with query pricing for the dataset, tables, or columns you need.
- Once the request is approved by the DataDAO through the Query Engine contract, you can execute queries targeting the approved data.
- Pre-pay the required query fee in $VANA to the Compute Engine through the job registry contract.
Step 3: Run Queries on the Data
- Write your query to run against the target permission schema.
- Submit a new query job to the Compute Engine through an HTTP POST request to the jobs API endpoint.
- The most recently approved permissions that support your data query are applied automatically.
- Provide an optional webhook URL to receive a callback with result artifacts when the query has completed.
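The submission steps above can be sketched as building a job payload and POSTing it. The payload field names and the endpoint path are assumptions for illustration; check the Compute Engine API reference for the real contract:

```python
import json
from typing import Optional

def build_job_payload(query: str, webhook_url: Optional[str] = None) -> dict:
    """Assemble a query job body; field names here are hypothetical."""
    payload = {"query": query}
    if webhook_url:
        payload["webhook_url"] = webhook_url  # optional completion callback
    return payload

payload = build_job_payload(
    "SELECT user_id, COUNT(*) AS n FROM posts GROUP BY user_id",
    webhook_url="https://example.com/hooks/query-done",
)
body = json.dumps(payload)
# An HTTP client would then POST `body` to the jobs endpoint, e.g.:
#   requests.post(f"{COMPUTE_ENGINE_URL}/jobs", data=body,
#                 headers={"Content-Type": "application/json"})
```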
Step 4: Retrieve & Process Results
- Once pre-payment is confirmed, the query is executed.
- If a webhook URL was provided, it will be called with the resulting artifacts after the query has successfully completed.
- Otherwise, query the job results directly from the Compute Engine.
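When no webhook is used, querying the job results directly amounts to polling until the job finishes. A minimal sketch: the transport is injected so it can run against a stub, and the status values and response shape are assumptions, not the Compute Engine's documented API:

```python
import time
from typing import Callable

def wait_for_results(fetch_status: Callable[[], dict],
                     poll_seconds: float = 0.0,
                     max_attempts: int = 10) -> list:
    """Poll a job-status callable until it reports completion."""
    for _ in range(max_attempts):
        job = fetch_status()
        if job["status"] == "completed":
            return job["artifacts"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "query job failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("job did not complete in time")

# Stub transport standing in for a GET on the job's status endpoint.
responses = iter([
    {"status": "running"},
    {"status": "completed", "artifacts": [{"name": "results.csv"}]},
])
artifacts = wait_for_results(lambda: next(responses))
```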