Scopes define what kind of data a user has and what a builder can request. Schemas define what that data looks like. Together they form the type system of the Data Portability Protocol.

Scope taxonomy

A scope is a hierarchical identifier for a category of personal data:
{source}.{category}[.{subcategory}]
The first segment is always the source (the platform the data comes from). The second segment is the category. An optional third segment provides further specificity.

Examples

Scope                          Source      What it contains
instagram.profile              Instagram   Username, bio, follower counts
instagram.posts                Instagram   Posts and captions
instagram.likes                Instagram   Liked content
instagram.followers            Instagram   Follower list
chatgpt.conversations          ChatGPT     Conversation history
chatgpt.conversations.shared   ChatGPT     Shared conversations
youtube.watch_history          YouTube     Watch history
youtube.subscriptions          YouTube     Channel subscriptions
gmail.messages                 Gmail       Email messages
gmail.labels                   Gmail       Email labels

Naming rules

  • Use lowercase, dot-separated segments
  • Source segment must match the platform name
  • Category should describe the data type, not the API endpoint
  • Keep scope names stable — changing a scope name changes the encryption key used for that data
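
A quick pattern check can enforce these rules. The following is a minimal sketch, not part of the protocol; the helper name and the exact allowed character set are assumptions:

// Hypothetical scope-name validator; the character set is an assumption.
// Matches {source}.{category} with an optional .{subcategory} segment.
const SCOPE_PATTERN = /^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*)?$/;

function isValidScope(scope: string): boolean {
  return SCOPE_PATTERN.test(scope);
}

isValidScope("chatgpt.conversations.shared"); // true
isValidScope("Instagram.Posts");              // false: segments must be lowercase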

Schema registry

Every scope has a corresponding schema that defines the structure of its data. Schemas are registered onchain in the DataRefinerRegistry contract, which maps a schemaId to a schema definition (typically an IPFS CID pointing to a JSON Schema document).

How schemas are used

  1. When a Personal Server writes a data file, it looks up the schemaId for the scope via the Gateway (GET /v1/schemas?scope={scope})
  2. The data file is validated against the schema before storage
  3. The schemaId is included when registering the file in the DataRegistry
  4. Other Personal Server instances use the schemaId to resolve the canonical scope when syncing
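
A minimal sketch of this write path, assuming a hypothetical Gateway base URL and response shape; the two stubs stand in for schema validation and DataRegistry registration, which are covered elsewhere:

// Illustrative write-path sketch; names and response shape are assumptions.
const GATEWAY = "https://gateway.example"; // hypothetical base URL

async function lookupSchemaId(scope: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/v1/schemas?scope=${encodeURIComponent(scope)}`);
  const body = await res.json();
  return body.schemaId; // assumed response field
}

async function storeDataFile(scope: string, envelope: object): Promise<void> {
  const schemaId = await lookupSchemaId(scope);     // step 1: resolve schemaId
  await validateAgainstSchema(schemaId, envelope);  // step 2: validate before storage
  await registerInDataRegistry(schemaId, envelope); // step 3: register with schemaId
}

// Stubs standing in for schema validation and DataRegistry registration.
async function validateAgainstSchema(schemaId: string, envelope: object): Promise<void> {}
async function registerInDataRegistry(schemaId: string, envelope: object): Promise<void> {}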

Schema lookup

GET /v1/schemas?scope={scope}     # Look up schemaId by scope
GET /v1/schemas/{schemaId}        # Get schema metadata and definition URL
Schema definitions encode the canonical scope for the dataset. This is required so that a Personal Server can derive the correct scope key for decryption before it has decrypted the file contents.
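
For example, a syncing Personal Server that knows only a file's schemaId might resolve the canonical scope like this (a sketch; the base URL and response field names are assumptions):

// Illustrative sketch: resolve the canonical scope for a schemaId so the
// correct scope key can be derived before decryption. Field names are assumed.
const GATEWAY = "https://gateway.example"; // hypothetical base URL

async function resolveCanonicalScope(schemaId: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/v1/schemas/${schemaId}`);
  const meta = await res.json();
  return meta.scope; // canonical scope encoded in the schema definition
}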

Data file format

In v1, all data files are JSON. Each file follows a standard envelope format:
{
  "$schema": "https://ipfs.io/<cid_for_schema_id>",
  "version": "1.0",
  "scope": "instagram.profile",
  "collectedAt": "2026-01-21T10:00:00Z",
  "data": {
    "username": "alice",
    "displayName": "Alice Smith",
    "bio": "...",
    "followers": 1234,
    "following": 567
  }
}
Field        Description
$schema      URL pointing to the IPFS CID for the registered schemaId
version      Envelope format version ("1.0" in v1)
scope        Canonical scope identifier
collectedAt  UTC timestamp of when the data was collected
data         Source-specific payload; structure defined by the schema
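
In TypeScript terms, the envelope might be modeled and validated as follows; the interface name and the use of the Ajv library are illustrative choices, not protocol requirements:

import Ajv from "ajv";

// Illustrative model of the v1 envelope format.
interface DataEnvelope {
  $schema: string;     // URL to the registered schema's IPFS CID
  version: "1.0";      // envelope format version
  scope: string;       // canonical scope identifier
  collectedAt: string; // UTC ISO 8601 collection timestamp
  data: unknown;       // source-specific payload, defined by the schema
}

// Fetch the JSON Schema from $schema and validate the payload against it.
async function validateEnvelope(envelope: DataEnvelope): Promise<boolean> {
  const schema = await (await fetch(envelope.$schema)).json();
  const validate = new Ajv().compile(schema);
  return validate(envelope.data);
}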

Encryption

Before upload to a storage backend, the entire plaintext JSON file is encrypted as a single OpenPGP blob. No plaintext metadata is stored alongside the ciphertext. The fileId linkage is tracked in the Personal Server’s local index. See Storage & Encryption for the full encryption model.
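
A minimal sketch of that step, assuming the openpgp npm package and a symmetric scope key (how the key is derived is covered in Storage & Encryption; the names here are illustrative):

import * as openpgp from "openpgp";

// Encrypt the plaintext JSON file as a single binary OpenPGP blob.
// `scopeKey` is assumed to be a symmetric key derived for the file's scope.
async function encryptDataFile(plaintextJson: string, scopeKey: string): Promise<Uint8Array> {
  const message = await openpgp.createMessage({ text: plaintextJson });
  const encrypted = await openpgp.encrypt({
    message,
    passwords: [scopeKey],
    format: "binary", // raw OpenPGP blob, no plaintext metadata alongside
  });
  return encrypted as Uint8Array;
}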

File naming

Data files are stored locally at:
~/.vana/data/{scope}/{YYYY-MM-DDTHH-mm-ssZ}.json
The filename uses the collectedAt timestamp with colons replaced by hyphens for filesystem compatibility.
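
A small sketch of the naming convention (hypothetical helper):

// Build the local storage path for a data file; colons in the
// collectedAt timestamp are replaced with hyphens.
function dataFilePath(scope: string, collectedAt: string): string {
  const stamp = collectedAt.replace(/:/g, "-");
  return `~/.vana/data/${scope}/${stamp}.json`;
}

dataFilePath("instagram.profile", "2026-01-21T10:00:00Z");
// => "~/.vana/data/instagram.profile/2026-01-21T10-00-00Z.json"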

Data connectors

Data Connectors are modules that extract data from specific platforms. They are not part of the protocol — they are implementation details of specific clients (e.g. Data Connect desktop app). The protocol defines the data format; connectors produce data in that format. Each connector publishes metadata that maps scopes to human-readable labels:
{
  "connectorId": "instagram",
  "displayName": "Instagram",
  "scopes": [
    {
      "scope": "instagram.profile",
      "label": "Your Instagram profile",
      "description": "Basic profile info, bio, and counts"
    },
    {
      "scope": "instagram.posts",
      "label": "Your Instagram posts",
      "description": "Your posts and captions"
    }
  ],
  "version": "1.0"
}
The Desktop App uses this metadata to render consent UI labels when a builder requests access to specific scopes.
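
Modeled in TypeScript, the metadata and a consent-label lookup might look like this (illustrative types, not a protocol-defined interface):

// Illustrative types for connector metadata.
interface ScopeDescriptor {
  scope: string;
  label: string;
  description: string;
}

interface ConnectorMetadata {
  connectorId: string;
  displayName: string;
  scopes: ScopeDescriptor[];
  version: string;
}

// Resolve the human-readable label for a requested scope, as a consent UI might.
function labelForScope(meta: ConnectorMetadata, scope: string): string | undefined {
  return meta.scopes.find((s) => s.scope === scope)?.label;
}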

Adding new scopes

To add a new data source to the protocol:
  1. Define the scope taxonomy — Choose scope names following the {source}.{category} pattern
  2. Create a JSON Schema — Define the structure of the data field for each scope
  3. Register the schema — Upload the schema to IPFS and register it in the DataRefinerRegistry with the canonical scope
  4. Build a Data Connector (optional) — Implement a connector module for the new platform in the Desktop App
Data Connectors are not required at the protocol level. Any client that can produce data in the correct file format and write it to a Personal Server via POST /v1/data/{scope} is compatible.
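
For instance, a minimal compatible client might write a file like this (a sketch; the Personal Server base URL is an assumption, and authentication is omitted):

// Sketch of a protocol-compatible write to a Personal Server.
const PERSONAL_SERVER = "http://localhost:8080"; // assumed base URL

async function postDataFile(scope: string, envelope: object): Promise<void> {
  const res = await fetch(`${PERSONAL_SERVER}/v1/data/${scope}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(envelope),
  });
  if (!res.ok) throw new Error(`Write failed: ${res.status}`);
}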