# System Architecture

#### Core Components

The Open LoRA system is built on a modular architecture consisting of:

**LoRA Adapters Storage**

* Stores fine-tuned LoRA adapters in OpenLedger.
* Loads adapters dynamically when they are needed instead of preloading all of them into memory.
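
On-demand loading can be illustrated with a small least-recently-used cache. This is a minimal sketch, not the Open LoRA implementation: `AdapterCache`, `fake_fetch`, the `capacity` parameter, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Callable, List


class AdapterCache:
    """Keeps at most `capacity` adapters in memory, evicting the least recently used."""

    def __init__(self, fetch: Callable[[str], dict], capacity: int = 2) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch  # hypothetical loader that reads from adapter storage
        self._capacity = capacity
        self._cache: "OrderedDict[str, dict]" = OrderedDict()

    def get(self, adapter_id: str) -> dict:
        if adapter_id in self._cache:
            self._cache.move_to_end(adapter_id)  # mark as most recently used
            return self._cache[adapter_id]
        adapter = self._fetch(adapter_id)  # load only on first use
        self._cache[adapter_id] = adapter
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used adapter
        return adapter

    def resident(self) -> List[str]:
        """Adapter ids currently held in memory, oldest first."""
        return list(self._cache)


fetch_log: List[str] = []


def fake_fetch(adapter_id: str) -> dict:
    # Stand-in for a real storage read; records every load for inspection.
    fetch_log.append(adapter_id)
    return {"id": adapter_id, "weights": [0.0]}


cache = AdapterCache(fake_fetch, capacity=2)
for name in ["legal", "medical", "legal", "finance"]:
    cache.get(name)
```

In this run the second request for `legal` is served from memory, and loading `finance` evicts `medical`, which had gone unused the longest.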

**Model Hosting & Adapter Merging Layer**

* Uses a shared base model and merges LoRA adapters on the fly during inference.
* Supports ensemble merging of multiple adapters to improve inference performance.
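
As a rough illustration of merging, the standard LoRA update adds a scaled low-rank product to a base weight matrix, `W' = W + (alpha / r) * B @ A`. The sketch below applies that update for several adapters at once. The weighted-sum combination and the names `lora_delta` and `merge_adapters` are assumptions; the page does not specify how Open LoRA combines adapters in an ensemble.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
# (B, A, alpha, rank): B is d_out x r, A is r x d_in
Adapter = Tuple[Matrix, Matrix, float, int]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(B: Matrix, A: Matrix, alpha: float, rank: int) -> Matrix:
    """Low-rank weight update (alpha / rank) * B @ A."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]


def merge_adapters(base: Matrix, adapters: Sequence[Adapter],
                   weights: Sequence[float]) -> Matrix:
    """Return base + sum_i weights[i] * delta_i without modifying `base`."""
    if len(adapters) != len(weights):
        raise ValueError("one mixing weight is required per adapter")
    merged = [row[:] for row in base]  # copy so the shared base stays untouched
    for (B, A, alpha, rank), w in zip(adapters, weights):
        delta = lora_delta(B, A, alpha, rank)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += w * value
    return merged


base_weights: Matrix = [[1.0, 0.0], [0.0, 1.0]]
adapter_a: Adapter = ([[1.0], [0.0]], [[0.0, 2.0]], 1.0, 1)
adapter_b: Adapter = ([[0.0], [1.0]], [[4.0, 0.0]], 1.0, 1)
merged = merge_adapters(base_weights, [adapter_a, adapter_b], [0.5, 0.5])
```

Because the base matrix is copied rather than modified, the same base weights can serve requests that need different adapter combinations.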

**Inference Engine**

* Implements efficient CUDA optimizations, including:
  * FlashAttention for reducing memory overhead.
  * PagedAttention for efficient handling of long sequences.
  * SGMV (Segmented Gather Matrix-Vector multiplication) to accelerate batched inference across different adapters.
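
The idea behind SGMV can be shown with a pure-Python reference: rows of a batch are grouped into contiguous segments, and each segment is multiplied by the weights of its own adapter. This is only a sketch of the computation, not a CUDA kernel and not Open LoRA's code. The function name `sgmv_reference`, the single weight matrix per adapter, and the data layout are assumptions; a real kernel would typically apply the two low-rank LoRA factors separately and run on the GPU.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def vecmat(x: Vector, w: Matrix) -> Vector:
    """Row vector times matrix: x (d_in) @ w (d_in x d_out)."""
    return [sum(x[k] * w[k][j] for k in range(len(x))) for j in range(len(w[0]))]


def sgmv_reference(x: Sequence[Vector], weights: Dict[str, Matrix],
                   segment_starts: Sequence[int],
                   segment_adapters: Sequence[str]) -> List[Vector]:
    """For segment s, compute x[start_s:start_{s+1}] @ weights[adapter_s].

    Segments are assumed to be contiguous and to cover the batch in order.
    """
    if len(segment_starts) != len(segment_adapters) + 1:
        raise ValueError("segment_starts needs one more entry than segment_adapters")
    out: List[Vector] = []
    for seg, adapter_id in enumerate(segment_adapters):
        w = weights[adapter_id]  # gather the weights for this segment's adapter
        for row in x[segment_starts[seg]:segment_starts[seg + 1]]:
            out.append(vecmat(row, w))
    return out


batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adapter_weights = {
    "a": [[1.0, 2.0], [3.0, 4.0]],
    "b": [[10.0, 0.0], [0.0, 10.0]],
}
# Rows 0-1 use adapter "a", row 2 uses adapter "b".
outputs = sgmv_reference(batch, adapter_weights, [0, 2, 3], ["a", "b"])
```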

**Request Router & Token Streaming**

* Routes API requests dynamically based on required adapters.
* Streams generated tokens efficiently using optimized kernel implementations.
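
A simplified picture of routing and streaming: requests are grouped by the adapter they need, and tokens are yielded one at a time as they are produced. The `Request` type, `route_by_adapter`, and the echo-style `stream_tokens` generator below are hypothetical stand-ins; the real router and decode loop are not described on this page.

```python
from typing import Dict, Iterator, List, NamedTuple


class Request(NamedTuple):
    request_id: str
    adapter_id: str
    prompt: str


def route_by_adapter(requests: List[Request]) -> Dict[str, List[Request]]:
    """Group requests by required adapter, preserving arrival order."""
    batches: Dict[str, List[Request]] = {}
    for req in requests:
        batches.setdefault(req.adapter_id, []).append(req)
    return batches


def stream_tokens(prompt: str) -> Iterator[str]:
    """Stand-in for a decode loop: yields one token at a time.

    Here the "model" simply echoes the prompt words; a real engine would
    yield each token as soon as it is generated.
    """
    for word in prompt.split():
        yield word


incoming = [
    Request("r1", "legal", "summarize this contract"),
    Request("r2", "medical", "explain this result"),
    Request("r3", "legal", "draft a clause"),
]
batches = route_by_adapter(incoming)
streamed = list(stream_tokens(incoming[0].prompt))
```

Grouping by adapter means requests that share an adapter can be served together, which is the batching pattern the SGMV-style kernels are designed for.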

**Attribution Engine**

* Automatically records which models, adapters, and data were used for each inference.
* Ensures fair and verifiable attribution to all contributors (developers, data providers, compute nodes).
* Enables reward distribution based on real-time usage tracking.
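
To make usage-based rewards concrete, the sketch below logs one record per inference and splits a reward pool in proportion to how often each contributor appears. The record fields, the `distribute_rewards` function, and the equal-credit-per-use rule are illustrative assumptions, not the actual attribution scheme.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class InferenceRecord:
    request_id: str
    base_model: str
    adapter_id: str
    contributors: Tuple[str, ...]  # e.g. adapter developer, data provider, compute node


def distribute_rewards(records: List[InferenceRecord], pool: float) -> Dict[str, float]:
    """Split `pool` proportionally to the number of inferences each contributor took part in."""
    usage: Counter = Counter()
    for record in records:
        for contributor in record.contributors:
            usage[contributor] += 1  # assumption: one unit of credit per use
    total = sum(usage.values())
    if total == 0:
        return {}
    return {name: pool * count / total for name, count in usage.items()}


log = [
    InferenceRecord("r1", "base-model", "legal", ("alice", "node-1")),
    InferenceRecord("r2", "base-model", "legal", ("alice", "node-2")),
]
rewards = distribute_rewards(log, pool=100.0)
```

Here `alice` contributed to both inferences and receives half of the pool, while each compute node receives a quarter.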

**OpenLedger Network**

* Decentralized infrastructure that connects storage, inference, and attribution components.
* Uses smart contracts for access control, attribution logging, and token-based rewards.
* Ensures secure, scalable, and trustless coordination across the AI pipeline.

