# Data Attribution Pipeline

The **Proof of Attribution** mechanism in OpenLedger ensures that each data source is cryptographically linked to model outputs, providing an immutable and decentralized record of contributions.

The attribution pipeline follows these steps:

#### **Step 1: Data Contribution**

* Data contributors submit structured, domain-specific datasets for AI model training.
* Each dataset is attributed on-chain, ensuring transparency and verifiability.
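One way to picture the on-chain link between a dataset and its contributor is a content fingerprint. The sketch below is a minimal illustration, not OpenLedger's actual record format: the `ContributionRecord` fields, the use of SHA-256, and the canonical JSON encoding are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContributionRecord:
    # All field names are hypothetical; the real on-chain schema may differ.
    contributor_id: str  # e.g. a wallet address
    datanet: str         # name of the target Datanet
    content_hash: str    # SHA-256 digest of the dataset bytes


def fingerprint_dataset(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw dataset bytes."""
    return hashlib.sha256(data).hexdigest()


def make_record(contributor_id: str, datanet: str, data: bytes) -> ContributionRecord:
    """Bind a contributor and a Datanet name to the dataset's fingerprint."""
    return ContributionRecord(contributor_id, datanet, fingerprint_dataset(data))


def record_digest(record: ContributionRecord) -> str:
    """Hash the canonical JSON form of the record.

    ASSUMPTION: a digest like this is what would be anchored on-chain.
    """
    canonical = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the digest depends on every byte of the dataset, anyone holding the data can recompute it and compare it with the stored value; changing a single byte yields a different digest.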

#### **Step 2: Datanets and Influence Attribution**

* Contributors attach metadata to their training data that defines its intended use.
* The impact of each data contribution is measured based on:
  * **Feature-level Influence:** Assessing the data’s effect on model training.
  * **Contributor Reputation:** Evaluating the credibility and past contributions of data providers.
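This page does not specify how the two signals above are combined. As a rough sketch, one could scale a feature-level influence score by a reputation factor. The function name, the blend formula, and the default weight below are hypothetical choices for illustration only.

```python
def combined_score(influence: float, reputation: float,
                   reputation_weight: float = 0.2) -> float:
    """Blend an influence score with a reputation score in [0, 1].

    ASSUMPTION: the blend formula and the 0.2 default are illustrative,
    not taken from OpenLedger.
    """
    if not 0.0 <= reputation <= 1.0:
        raise ValueError("reputation must be in [0, 1]")
    if not 0.0 <= reputation_weight <= 1.0:
        raise ValueError("reputation_weight must be in [0, 1]")
    # Negative influence is clamped to zero in this sketch.
    non_negative_influence = max(influence, 0.0)
    # Reputation 1.0 keeps the full influence; reputation 0.0 keeps
    # (1 - reputation_weight) of it.
    scale = (1.0 - reputation_weight) + reputation_weight * reputation
    return non_negative_influence * scale
```

With this form, reputation can only discount a contribution, never raise it above its measured influence, so a high reputation cannot compensate for data that has no effect on training.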

#### **Step 3: Training and Verification**

* Influence scores are calculated to determine the quality and relevance of each contribution.
* Training logs ensure all data contributions are recorded and validated.
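The page does not state how influence scores are computed. A standard, if expensive, way to measure a contribution's effect is leave-one-out retraining: train with and without the contribution and compare the validation loss. The sketch below applies that idea to a toy one-parameter linear model; the model, the function names, and the data layout are assumptions chosen to keep the example small.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) training or validation example


def fit_slope(points: List[Point]) -> float:
    """Least-squares slope for the model y = slope * x (no intercept)."""
    sum_xx = sum(x * x for x, _ in points)
    if sum_xx == 0:
        return 0.0
    return sum(x * y for x, y in points) / sum_xx


def mse(slope: float, points: List[Point]) -> float:
    """Mean squared error of y = slope * x on the given points."""
    return sum((y - slope * x) ** 2 for x, y in points) / len(points)


def leave_one_out_influence(contributions: Dict[str, List[Point]],
                            validation: List[Point]) -> Dict[str, float]:
    """Score each contribution by how much validation loss rises without it.

    Positive score: removing the data hurts the model, so the data helped.
    Negative score: removing the data improves the model.
    """
    all_points = [p for pts in contributions.values() for p in pts]
    base_loss = mse(fit_slope(all_points), validation)
    scores = {}
    for name in contributions:
        rest = [p for other, pts in contributions.items()
                if other != name for p in pts]
        scores[name] = mse(fit_slope(rest), validation) - base_loss
    return scores
```

Leave-one-out retraining needs one extra training run per contribution, so a production system would likely rely on an approximation. The toy version is only meant to make the meaning of an influence score concrete.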

#### **Step 4: Reward Distribution Based on Attribution**

* Data contributors receive token-based rewards proportional to their data’s impact on model outputs.
* Because rewards scale with measured impact, high-value contributions earn the largest share.
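Proportional distribution can be sketched as a simple pro-rata split of a reward pool. The function below is an illustration under stated assumptions: the pool is a whole number of token units, negative scores earn nothing, and any rounding remainder is left undistributed. None of these details come from OpenLedger's specification.

```python
from typing import Dict


def distribute_rewards(scores: Dict[str, float], pool: int) -> Dict[str, int]:
    """Split `pool` token units in proportion to non-negative scores.

    ASSUMPTIONS: integer token units, negative scores clamped to zero,
    shares rounded down, remainder left in the pool.
    """
    clamped = {name: max(score, 0.0) for name, score in scores.items()}
    total = sum(clamped.values())
    if total == 0:
        # Nobody had a positive impact, so nothing is paid out.
        return {name: 0 for name in scores}
    return {name: int(pool * score / total) for name, score in clamped.items()}


rewards = distribute_rewards({"alice": 3.0, "bob": 1.0, "carol": -2.0}, pool=1000)
print(rewards)  # {'alice': 750, 'bob': 250, 'carol': 0}
```

Rounding down guarantees the payout never exceeds the pool; a real system would also need a rule for the leftover units.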

#### **Step 5: Penalizing Malicious or Low-Quality Contributions**

* Contributions flagged as biased, redundant, or adversarial are penalized through stake slashing.
* If a contributor’s penalty score exceeds a threshold, their future rewards are reduced, so that only high-quality data is retained in model training.
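The two penalties described above, stake slashing and a reward reduction past a threshold, can be sketched as follows. The constants, the `Contributor` fields, and the fixed-fraction slashing rule are hypothetical values picked for the example; the page does not give actual parameters.

```python
from dataclasses import dataclass

# ASSUMED parameters, for illustration only.
SLASH_FRACTION = 0.10             # share of stake removed per flagged contribution
PENALTY_THRESHOLD = 3.0           # penalty score above which rewards are reduced
REDUCED_REWARD_MULTIPLIER = 0.5   # multiplier applied once over the threshold


@dataclass
class Contributor:
    stake: float
    penalty_score: float = 0.0


def penalize(contributor: Contributor, severity: float = 1.0) -> float:
    """Slash a fixed fraction of stake and raise the penalty score.

    Returns the amount of stake that was slashed.
    """
    slashed = contributor.stake * SLASH_FRACTION
    contributor.stake -= slashed
    contributor.penalty_score += severity
    return slashed


def reward_multiplier(contributor: Contributor) -> float:
    """Full rewards up to the threshold, reduced rewards beyond it."""
    if contributor.penalty_score > PENALTY_THRESHOLD:
        return REDUCED_REWARD_MULTIPLIER
    return 1.0
```

In this sketch a contributor with a stake of 100 loses 10 on the first flagged contribution and keeps full rewards until the penalty score passes 3.0, after which every future reward is halved.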

This structured pipeline ensures a provable and trustless attribution system that rewards valuable contributions while maintaining model integrity.