# P2P Inference

#### Network Topology

OpenShard uses a hub-and-spoke topology at the discovery layer, where buyers query the registry as a central coordination point, and a direct point-to-point topology at the inference layer. Once a buyer resolves a seller from the registry, all inference traffic flows directly from buyer to seller with no intermediary in the path, and the registry plays no further role. As a result, inference content is never routed through a central service, and registry operators cannot observe, modify, or selectively block inference traffic.
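
The split between the two layers can be illustrated with a minimal sketch. Everything named here (`Registry`, `SellerRecord`, `DirectCall`, `runInference`) is hypothetical and not taken from the OpenShard codebase; the direct call is modelled as a plain function so the example runs without a network. The point is only that the registry is consulted once for discovery and is absent from the inference path.

```typescript
// Hypothetical shapes for illustration; none of these names come from OpenShard.
interface SellerRecord {
  id: string;
  serviceId: string;
  endpoint: string;
}

class Registry {
  lookups = 0; // counts discovery queries so the example can show when the hub is used
  private records: SellerRecord[];

  constructor(records: SellerRecord[]) {
    this.records = records;
  }

  // Discovery layer: the only call a buyer makes to the hub.
  resolve(serviceId: string): SellerRecord[] {
    this.lookups += 1;
    return this.records.filter((r) => r.serviceId === serviceId);
  }
}

// Inference layer: a direct buyer-to-seller call. The registry is not involved,
// so prompts never pass through it.
type DirectCall = (endpoint: string, prompt: string) => string;

function runInference(
  registry: Registry,
  call: DirectCall,
  serviceId: string,
  prompts: string[],
): string[] {
  const seller = registry.resolve(serviceId)[0]; // one discovery round trip
  if (seller === undefined) {
    throw new Error(`no seller offers ${serviceId}`);
  }
  // Every prompt goes straight to the resolved seller endpoint.
  return prompts.map((prompt) => call(seller.endpoint, prompt));
}
```

In this model, sending any number of prompts increments `lookups` only once, which mirrors the claim that the registry drops out after routing is resolved.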

#### Buyer Routing Logic

The buyer's router module evaluates registered sellers against the requested service ID, current health metrics, and pricing constraints. When multiple sellers match a query, the router produces a ranked candidate list and selects the highest-ranked available peer. Routing is not a one-shot decision: the payment retry flow is tightly integrated with it. If a selected seller returns a 402 Payment Required response, the buyer does not surface this error to the calling application. Instead, it handles channel creation or channel state updates internally and retries the request with a valid payment authorization. From the perspective of the calling application, the request completes normally; the payment negotiation is entirely invisible.
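
A minimal sketch of this flow follows. All names (`Candidate`, `rankCandidates`, `requestWithPayment`, `Send`, `Authorize`) are hypothetical, and the ranking rule (cheapest first, then lowest latency) is an assumption, since the actual scoring is not specified here. The transport is a synchronous function so the sketch runs without a network; a real buyer would await an HTTP call.

```typescript
// Hypothetical types; the fields are assumptions, not the OpenShard schema.
interface Candidate {
  id: string;
  serviceId: string;
  healthy: boolean;
  latencyMs: number; // stand-in for "current health metrics"
  pricePerRequest: number;
}

// Filter by service, health, and price ceiling, then rank.
// Assumed ordering: cheapest first, ties broken by lower latency.
function rankCandidates(sellers: Candidate[], serviceId: string, maxPrice: number): Candidate[] {
  return sellers
    .filter((s) => s.serviceId === serviceId && s.healthy && s.pricePerRequest <= maxPrice)
    .sort((a, b) => a.pricePerRequest - b.pricePerRequest || a.latencyMs - b.latencyMs);
}

interface SellerResponse {
  status: number;
  body: string;
}

// Sends one request, optionally carrying a payment authorization.
type Send = (seller: Candidate, prompt: string, paymentAuth?: string) => SellerResponse;
// Opens or updates a payment channel and returns an authorization for it.
type Authorize = (seller: Candidate) => string;

// On 402, settle payment internally and retry; the caller only sees the final response.
function requestWithPayment(
  seller: Candidate,
  prompt: string,
  send: Send,
  authorize: Authorize,
  maxRetries = 1,
): SellerResponse {
  let response = send(seller, prompt);
  for (let attempt = 0; attempt < maxRetries && response.status === 402; attempt++) {
    const auth = authorize(seller);
    response = send(seller, prompt, auth);
  }
  return response;
}
```

If the seller still answers 402 after `maxRetries` attempts, this sketch returns that response as-is; how the real buyer handles a persistent payment failure is not covered here.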

#### Seller Autonomy and Independence

Each seller runs as a fully independent process with its own configuration, offering definitions, payment manager, metrics registry, and nullifier cache for replay defense. Sellers share no runtime state with one another and are not coordinated by any central process during operation. A seller's presence in the registry index is a liveness signal only; it carries no authority over how the seller processes individual requests. This design allows the network to accommodate heterogeneous compute providers with different capability profiles, pricing structures, upstream model configurations, and privacy enforcement policies. A seller can require ZK proof verification for every request or offer a public endpoint for free inference; the protocol supports both configurations without changes to the buyer or registry.
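
To make per-seller policy and replay defense concrete, here is a minimal sketch. The names (`NullifierCache`, `SellerPolicy`, `IncomingRequest`, `admit`) are hypothetical, `proofVerified` stands in for real ZK proof verification, and the cache never evicts entries, which a production cache would need to do. It is an illustration of the idea under those assumptions, not the seller's actual request pipeline.

```typescript
// Hypothetical names throughout; proofVerified stands in for real ZK verification.
class NullifierCache {
  private seen = new Set<string>();

  // Returns true and records the nullifier if it is new; returns false on a replay.
  checkAndRecord(nullifier: string): boolean {
    if (this.seen.has(nullifier)) {
      return false;
    }
    this.seen.add(nullifier);
    return true;
  }
}

interface SellerPolicy {
  requireProof: boolean; // false models a public, free endpoint
}

interface IncomingRequest {
  proofVerified: boolean;
  nullifier?: string;
}

type Decision = "accept" | "reject: proof required" | "reject: replayed nullifier";

// Each seller applies its own policy against its own cache; nothing is shared.
function admit(policy: SellerPolicy, cache: NullifierCache, request: IncomingRequest): Decision {
  if (!policy.requireProof) {
    return "accept";
  }
  if (!request.proofVerified || request.nullifier === undefined) {
    return "reject: proof required";
  }
  return cache.checkAndRecord(request.nullifier) ? "accept" : "reject: replayed nullifier";
}
```

Because each seller owns its cache instance, a nullifier recorded by one seller in this model is unknown to every other seller, which matches the statement that sellers share no runtime state.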

#### SDK: Programmable Network Orchestration

The `openshard-ai` SDK package exposes `Network`, `SellerNode`, and `Offering` as a developer-facing API for building and orchestrating network topologies in code. It is particularly useful for three operational patterns. Single-seller quick boot lets an operator instantiate and start one `SellerNode` with defined offerings in a few lines of code. Multi-node simulation supports integration testing and synthetic load generation: a `Network` instance manages a set of `SellerNode` processes with dynamically assigned ports and shared configuration defaults. Test harness embedding serves CI/CD pipelines in which ephemeral seller nodes must spin up and tear down cleanly around individual test suites.

The `Network` orchestrator manages port assignment automatically: `addSeller()` rejects explicit port values to enforce sequential auto-assignment and prevent conflicts. `SellerNode.start()` forks the seller process, serializes its offering configuration into the child process environment, and optionally opens a localtunnel for external reachability. Lifecycle events, including `started`, `error`, and `exit`, propagate from the node level to the network level, enabling centralized event handling across a multi-seller topology.
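
The behaviour described above can be modelled with a small self-contained toy. This is not the `openshard-ai` API: `ToyNetwork` and `ToySellerNode` are hypothetical stand-ins, the option shapes are assumptions, and `start()` only emits an event instead of forking a process. The sketch shows sequential port assignment, rejection of explicit ports, and node events bubbling up to the network.

```typescript
// Toy model only; the real SDK's signatures and options may differ.
type LifecycleEvent = "started" | "error" | "exit";
type Listener = (event: LifecycleEvent, nodeName: string) => void;

class ToySellerNode {
  readonly name: string;
  readonly port: number;
  private listeners: Listener[] = [];

  constructor(name: string, port: number) {
    this.name = name;
    this.port = port;
  }

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  private emit(event: LifecycleEvent): void {
    for (const listener of this.listeners) {
      listener(event, this.name);
    }
  }

  // The real SellerNode.start() forks a process; the toy just signals the event.
  start(): void {
    this.emit("started");
  }

  stop(): void {
    this.emit("exit");
  }
}

class ToyNetwork {
  readonly nodes: ToySellerNode[] = [];
  private nextPort: number;
  private listeners: Listener[] = [];

  constructor(basePort: number) {
    this.nextPort = basePort;
  }

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  addSeller(options: { name: string; port?: number }): ToySellerNode {
    // Explicit ports are refused so that assignment stays sequential and conflict-free.
    if (options.port !== undefined) {
      throw new Error("explicit ports are rejected; ports are auto-assigned");
    }
    const node = new ToySellerNode(options.name, this.nextPort++);
    // Re-emit every node-level event at the network level.
    node.on((event, name) => {
      for (const listener of this.listeners) {
        listener(event, name);
      }
    });
    this.nodes.push(node);
    return node;
  }
}
```

A single listener registered on the network then receives events from every seller, which is the property that makes centralized handling possible in a multi-seller test topology.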

The CLI package serves a complementary but distinct purpose from the SDK. The CLI is operator-oriented and command-driven, suited for manual operations, interactive inspection, and demonstrations. The SDK is developer-oriented and embedding-friendly, suited for automation, test harnesses, and programmatic topology generation. The choice depends on whether the network is being run interactively or tooling is being built around it.
