Anyone can submit data to the Vana network. However, for a DLP to consider data valid, a trusted party must attest to it. These trusted parties issue an attestation about the data to prove that it is, in fact, authentic, high-quality, unique, and has whatever other properties the DLP values in its data contributions.
Data attestations live mostly offchain, and a URL pointing to the attestation is written onchain alongside the data itself.
The attestation of a data point must follow a spec. An attestation records how the data was evaluated and includes proof-of-contribution scores, integrity checksums, and custom metadata relevant to a specific DLP.
As an example of when this is useful, consider a ChatGPT DLP that accepts GDPR exports from chatgpt.com. Say the DLP considers an export to be high quality when the number of conversations in it exceeds 10. This DLP can insert numberOfConversations: xxx in the attestation when Proof of Contribution is run, so that anyone can see how valuable that encrypted data point is.
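The ChatGPT example above can be sketched in a few lines of Python. This is only an illustration: the function name build_attributes, the isHighQuality key, and the assumption that the export has already been parsed into a list of conversations are all hypothetical and not part of the spec.

```python
# Threshold from the example above: more than 10 conversations is high quality.
MIN_CONVERSATIONS = 10


def build_attributes(conversations: list) -> dict:
    """Build public attributes for a parsed export (hypothetical helper)."""
    count = len(conversations)
    return {
        "numberOfConversations": count,
        # Illustrative extra attribute, not defined by the attestation spec.
        "isHighQuality": count > MIN_CONVERSATIONS,
    }


# Usage: a stand-in export with 12 conversations.
attributes = build_attributes([{"id": i} for i in range(12)])
print(attributes)
```

Because attributes are public, a reader can judge the data point from numberOfConversations alone, without decrypting the file.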
signed_fields
Contains the main data fields that are signed by the prover.
subject
Information about the data point being attested.
url
URL where the encrypted file lives.
file_id
The ID of the file, assigned by the Data Registry.
owner_address
Wallet address of the file owner.
decrypted_file_checksum
Checksum of the decrypted file for integrity verification.
encrypted_file_checksum
Checksum of the encrypted file for integrity verification.
encryption_seed
The message that was signed by the owner to retrieve the encryption key.
prover
Information about the prover.
type
Type of the prover. satya means the prover is one of the confidential TEE nodes in the Satya network. Proofs can also be self-signed, in which case the data owner generates the proof.
address
Wallet address of the prover.
url
URL or address where the prover service is hosted.
proof
Details about the generated proof.
image_url
URL of the Docker image from which the instructions to generate the proof are downloaded.
created_at
Timestamp of when the proof was created.
duration
Duration of the proof generation process, in seconds.
dlp_id
DLP ID from the Root Network Contract. This is used to tie the proof to a DLP.
valid
Boolean indicating if the subject is valid.
score*
Overall score of the subject, from 0-1.
authenticity
Authenticity score of the subject, from 0-1.
ownership
Ownership score of the subject, from 0-1.
quality
Quality score of the subject, from 0-1.
uniqueness
Uniqueness score of the subject, from 0-1.
attributes
Additional key/value pairs that will be available on the public proof. These can be used to quickly view properties about the encrypted subject.
metadata*
Key/value metadata about the proof that is written onchain.
signature
Generated by the prover signing a stringified representation of signed_fields, sorted by key name. To verify it, take the signature and the stringified representation, recover the address that signed it, and check that it matches prover.address.
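The signature check described above could look roughly like the following Python sketch. Several details are assumptions, not statements of the spec: the stringified form is taken to be compact JSON with keys sorted at every level, prover is assumed to be nested under signed_fields, and the address recovery step is passed in as a caller-supplied function (recover_address is a hypothetical name) because the signing scheme is not specified here.

```python
import json
from typing import Callable


def canonical_message(signed_fields: dict) -> str:
    """Stringify signed_fields with keys sorted by name.

    Assumption: compact JSON, sorted recursively. The real serialization
    used by provers may differ in whitespace or nesting rules.
    """
    return json.dumps(signed_fields, sort_keys=True, separators=(",", ":"))


def signature_matches_prover(
    attestation: dict,
    recover_address: Callable[[str, str], str],
) -> bool:
    """Check that the recovered signer equals prover.address.

    recover_address(message, signature) is supplied by the caller and
    must return the address that produced the signature.
    """
    signed_fields = attestation["signed_fields"]
    message = canonical_message(signed_fields)
    recovered = recover_address(message, attestation["signature"])
    # Compare case-insensitively, since hex addresses may be checksummed.
    return recovered.lower() == signed_fields["prover"]["address"].lower()
```

Keeping the recovery step pluggable means the same check works with whichever wallet library the verifier already uses.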
* The score and metadata fields are written onchain, and a DLP can use these fields to calculate how many DLP-specific tokens should be issued as a reward to the data contributor for their contribution.
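As a rough illustration of how a DLP might turn the onchain score into a reward, the Python sketch below scales a maximum reward linearly by the score. The linear rule, the reward_tokens name, and the max_reward parameter are all hypothetical; each DLP defines its own reward logic.

```python
def reward_tokens(score: float, max_reward: float) -> float:
    """Scale a DLP-defined maximum reward by the attestation score.

    Assumption: a simple linear rule. Real DLPs may weight the score
    differently or also take the metadata field into account.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return score * max_reward


# Usage: a contribution scored 0.5 against a maximum of 100 tokens.
print(reward_tokens(score=0.5, max_reward=100.0))
```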