Querying Data on Vana
A step-by-step tutorial on discovering datasets, requesting access, and executing a query on the Vana network.
This tutorial will walk you through the end-to-end process of submitting a job to query data from a DataDAO. We will use the default compute instruction, which simply returns the query results as a database file.
Step 1: Discover a Dataset
The `DataRefinerRegistry` contract holds the list of all available data refiner types. Refiners store references to the off-chain schema definitions and the Docker image used to process the data. Call the `refiners` function to find the `refinerId` of the dataset you want to query.

- Contract: `DataRefinerRegistry` (0x93c...)
- Function: `refiners(uint256 refinerId)`
An example schema definition returned by this function might look like this:
```json
{
  "name": "spotify",
  "version": "0.0.1",
  "description": "Schema for storing music-related data",
  "dialect": "sqlite",
  "schema": "CREATE TABLE IF NOT EXISTS \"albums\"(\n [AlbumId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\n [Title] NVARCHAR(160) NOT NULL,\n [ArtistId] INTEGER NOT NULL,\n FOREIGN KEY ([ArtistId]) REFERENCES \"artists\" ([ArtistId]) \n\t\tON DELETE NO ACTION ON UPDATE NO ACTION\n);\nCREATE TABLE IF NOT EXISTS \"artists\"(\n [ArtistId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,\n [Name] NVARCHAR(120)\n);\n"
}
```
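Since the dialect is SQLite, the `schema` field is plain DDL. A quick way to see which tables a refiner exposes is to pull the table names out of that string — a minimal sketch (the regex assumes the `CREATE TABLE IF NOT EXISTS "name"` form shown above):

```javascript
// Extract the table names from a refiner's SQLite schema string.
// Assumes the `CREATE TABLE IF NOT EXISTS "name"` form shown above.
function tableNames(schemaSql) {
  const re = /CREATE TABLE IF NOT EXISTS\s+"([^"]+)"/g;
  return [...schemaSql.matchAll(re)].map((m) => m[1]);
}

const schema =
  'CREATE TABLE IF NOT EXISTS "albums"(\n [AlbumId] INTEGER\n);\n' +
  'CREATE TABLE IF NOT EXISTS "artists"(\n [ArtistId] INTEGER\n);\n';
console.log(tableNames(schema)); // → [ 'albums', 'artists' ]
```

The table names tell you what you can reference in the SQL query you submit in Step 3.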
Step 2: Request & Verify Access
To query a dataset, you need permission from the DataDAO that owns it. This process typically involves an off-chain agreement on terms (e.g., in Discord), followed by the DataDAO granting you on-chain permissions.
You need approval for two things:
- Data Access: permission to query the specific dataset (`refinerId`).
- Compute Access: permission to use a specific `computeInstructionId` with that DataDAO's data. For this tutorial, we will use the default instruction:
  - Mainnet `computeInstructionId`: 3
  - Moksha Testnet `computeInstructionId`: 40
How to Verify Your Permissions
Before submitting a job, you can check on-chain to see if your wallet already has the necessary permissions.
- Check Data Access Permission: The `QueryEngine` contract is the gateway for permissioning data access. DataDAOs use it to approve or revoke access requests for specific datasets.
  - Contract: `QueryEngine` (0xd25...)
  - Function: `getPermissions(uint256 refinerId, address grantee)`
- Check Compute Access Permission: The `ComputeInstructionRegistry` holds the list of approved compute instructions that can be run on a DataDAO's data.
  - Contract: `ComputeInstructionRegistry` (0x578...)
  - Function: `isApproved(uint256 instructionId, uint256 dlpId)`

If both checks pass, you are ready to proceed. If not, contact the DataDAO owner (e.g., via Discord) to request access to both the dataset and the default compute instruction ID.
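Both checks are read-only calls, so they are easy to script. A sketch assuming ethers v6 is installed; the truncated addresses are placeholders for the full contract addresses, and the boolean return shapes are assumptions — verify them against the published ABIs:

```javascript
// Human-readable ABI fragments for the two read-only checks.
// Return types are assumptions -- verify against the published ABIs.
const QUERY_ENGINE_ABI = [
  "function getPermissions(uint256 refinerId, address grantee) view returns (bool)",
];
const INSTRUCTION_REGISTRY_ABI = [
  "function isApproved(uint256 instructionId, uint256 dlpId) view returns (bool)",
];

// Sketch: run both permission checks against a JSON-RPC provider.
async function checkAccess({ rpcUrl, refinerId, instructionId, dlpId, wallet }) {
  const { ethers } = await import("ethers"); // assumes ethers v6 is installed
  const provider = new ethers.JsonRpcProvider(rpcUrl);
  const queryEngine = new ethers.Contract("0xd25...", QUERY_ENGINE_ABI, provider); // placeholder address
  const registry = new ethers.Contract("0x578...", INSTRUCTION_REGISTRY_ABI, provider); // placeholder address
  const dataOk = await queryEngine.getPermissions(refinerId, wallet);
  const computeOk = await registry.isApproved(instructionId, dlpId);
  return { dataOk, computeOk };
}
```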
Step 3: Submit and Execute Your Job
This is a multi-part process involving both on-chain transactions and off-chain API calls, all managed by the `ComputeEngine`. The `ComputeEngine` contract handles job submissions, pre-deposited payments, and tracking the status of your query.
3a. Pre-pay for the Job
Fund your account on the `ComputeEngine` contract to cover compute costs.

- Contract: `ComputeEngine` (0xb2B...)
- Function: `deposit(address token, uint256 amount)` (use `0x0` for the token address to deposit $VANA).
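For example, depositing one $VANA with ethers v6 might look like this sketch. The `0xb2B...` address is the truncated placeholder from above, amounts are in wei (18 decimals), and whether the native amount must also be sent as the transaction `value` is an assumption — check the contract ABI:

```javascript
// The zero address tells deposit() to treat the amount as native $VANA.
const ZERO_ADDRESS = "0x" + "0".repeat(40);

// Convert a whole-number $VANA amount to wei (18 decimals).
function vanaToWei(whole) {
  return BigInt(whole) * 10n ** 18n;
}

// Sketch of the deposit call (assumes ethers v6 and a funded signer).
async function depositVana(signer, wholeVana) {
  const { ethers } = await import("ethers");
  const abi = ["function deposit(address token, uint256 amount) payable"];
  const engine = new ethers.Contract("0xb2B...", abi, signer); // placeholder address
  const amount = vanaToWei(wholeVana);
  // Sending the native amount via `value` is an assumption; check the ABI.
  return engine.deposit(ZERO_ADDRESS, amount, { value: amount });
}
```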
3b. Register the Job On-Chain
Call `submitJob` on the `ComputeEngine` contract. This returns a `jobId` that you will use in the next steps.

- Function: `submitJob(uint80 maxTimeout, bool gpuRequired, uint256 computeInstructionId)`
  - `maxTimeout`: use `300` (seconds).
  - `gpuRequired`: set to `false` for the default job.
  - `computeInstructionId`: use the default ID for your target network (e.g., `3` for Mainnet).
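With the defaults above, the submission can be sketched as follows (ethers v6; how the `jobId` is surfaced — return value vs. emitted event — is an assumption, so inspect the receipt logs against the ABI):

```javascript
// Default parameters for this tutorial's job (Mainnet).
const DEFAULT_JOB = {
  maxTimeout: 300,          // seconds
  gpuRequired: false,       // no GPU needed for the default instruction
  computeInstructionId: 3,  // use 40 on Moksha Testnet
};

// Sketch of the on-chain submission (assumes ethers v6).
async function submitDefaultJob(signer) {
  const { ethers } = await import("ethers");
  const abi = [
    "function submitJob(uint80 maxTimeout, bool gpuRequired, uint256 computeInstructionId)",
  ];
  const engine = new ethers.Contract("0xb2B...", abi, signer); // placeholder address
  const tx = await engine.submitJob(
    DEFAULT_JOB.maxTimeout,
    DEFAULT_JOB.gpuRequired,
    DEFAULT_JOB.computeInstructionId
  );
  const receipt = await tx.wait();
  // The jobId comes from the transaction result; the exact event/field
  // is an assumption -- inspect receipt.logs against the published ABI.
  return receipt;
}
```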
3c. Prepare Signatures
You must generate two signatures with the wallet that submitted the job:
- Job ID Signature: sign the `jobId` (as a 32-byte hex string). This is used in the `x-job-id-signature` header.
- Query Signature: sign the raw string content of your SQL query (e.g., `"SELECT * FROM users LIMIT 10"`). This is used in the request body.
Example using Ethers.js:
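A minimal sketch with ethers v6 — whether the TEE expects EIP-191 personal-sign (which `signMessage` produces) or a raw signature is an assumption, so confirm the encoding with the TEE API:

```javascript
// Left-pad a numeric jobId into a 32-byte hex string.
function jobIdToBytes32(jobId) {
  return "0x" + BigInt(jobId).toString(16).padStart(64, "0");
}

// Sketch: produce both signatures with the wallet that submitted the job
// (assumes ethers v6; signMessage applies EIP-191 prefixing).
async function signJobInputs(wallet, jobId, query) {
  const { ethers } = await import("ethers");
  const jobIdSignature = await wallet.signMessage(
    ethers.getBytes(jobIdToBytes32(jobId)) // sign the 32-byte value, not its hex text
  );
  const querySignature = await wallet.signMessage(query); // raw SQL string
  return { jobIdSignature, querySignature };
}
```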
3d. Trigger Execution via API
The response from your `submitJob` transaction will include the `tee-url`. Make a `POST` request to this URL to start the job.

- Endpoint: `POST https://{tee-url}/job/{job-id}/`
- Header: `x-job-id-signature: "0x..."`
- Body:

```json
{
  "input": {
    "query": "SELECT id, locale FROM users LIMIT ?",
    "query_signature": "0x...",
    "refinerId": 12,
    "params": [10]
  }
}
```
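Putting the pieces together, the trigger call can be sketched with `fetch` (Node 18+; `teeUrl`, `jobId`, and the signature values come from the earlier steps):

```javascript
// Build the job endpoint from the tee-url and jobId.
function jobUrl(teeUrl, jobId) {
  return `https://${teeUrl}/job/${jobId}/`;
}

// Sketch: trigger job execution (assumes Node 18+ global fetch).
async function triggerJob({ teeUrl, jobId, jobIdSignature, querySignature, refinerId }) {
  const res = await fetch(jobUrl(teeUrl, jobId), {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-job-id-signature": jobIdSignature,
    },
    body: JSON.stringify({
      input: {
        query: "SELECT id, locale FROM users LIMIT ?",
        query_signature: querySignature,
        refinerId,
        params: [10],
      },
    }),
  });
  if (!res.ok) throw new Error(`trigger failed: ${res.status}`);
  return res.json();
}
```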
Step 4: Monitor and Retrieve Results
You can poll the job status using its ID.
- Endpoint: `GET https://{tee-url}/job/{job-id}/`
- Header: `x-job-id-signature: "0x..."`

Once the job `status` is `success`, the response will contain an `artifacts` array:
```json
{
  "job_id": "123",
  "run_id": "123-1b6dc6acbeb84f5ea50f79e7b081e9e6",
  "status": "success",
  "artifacts": [
    {
      "id": "art-9643cb38bea94261b5d2d2bba701bd2b",
      "url": "https://{tee-url}/job/100/artifacts/art-9643cb38bea94261b5d2d2bba701bd2b",
      "size": 72,
      "mimetype": "application/json",
      "expires_at": "2025-05-07T08:46:55.629878",
      "status": "available",
      "file_name": "stats.json",
      "file_extension": ".json"
    }
  ],
  "usage": {
    "cpu_time_ms": 1234891,
    "memory_mb": 12.3,
    "duration_ms": 2458312
  }
}
```
To download your results, make a final `GET` request:

- Endpoint: `GET https://{tee-url}/job/{job-id}/artifacts/{artifact-id}`
- Header: `x-job-id-signature: "0x..."`
The artifact (e.g., `query_results.db`) can now be used in your application.
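The polling and download steps above can be sketched as one loop (Node 18+ `fetch`; the 5-second interval and the `"failed"` terminal status are assumptions):

```javascript
// Build the artifact download URL from fields in the status response.
function artifactUrl(teeUrl, jobId, artifactId) {
  return `https://${teeUrl}/job/${jobId}/artifacts/${artifactId}`;
}

// Sketch: poll until the job succeeds, then fetch the first artifact's
// bytes (assumes Node 18+ global fetch).
async function waitAndDownload({ teeUrl, jobId, jobIdSignature }) {
  const headers = { "x-job-id-signature": jobIdSignature };
  for (;;) {
    const res = await fetch(`https://${teeUrl}/job/${jobId}/`, { headers });
    const job = await res.json();
    if (job.status === "success") {
      const { id } = job.artifacts[0];
      const file = await fetch(artifactUrl(teeUrl, jobId, id), { headers });
      return Buffer.from(await file.arrayBuffer());
    }
    // "failed" as a terminal status is an assumption; check the API docs.
    if (job.status === "failed") throw new Error("job failed");
    await new Promise((r) => setTimeout(r, 5000)); // poll every 5 seconds
  }
}
```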
Next Steps & Support
You have now successfully queried a decentralized dataset in a privacy-preserving way. You can plug the results from your downloaded artifact into an application, data pipeline, or AI agent framework.
Need help or want to find DataDAOs to work with?
- Join the conversation in the Vana Builder Discord.
- Explore available DataDAOs directly on Datahub.