Incentive Mechanism Design for Decentralized AI Training

Date: 2026-03-12 · Scope: Mechanism design, validation without trust, Bitcoin/Lightning conditional payments, bounty design


1. Mechanism Design for Distributed Computation

1.1 The Core Problem

You want rational, self-interested actors to contribute honest computation (gradient computation for model training) to a shared objective (reducing model loss). The participants are:

  • Workers (miners): Compute gradients on assigned data shards
  • Validators: Evaluate gradient quality and authorize payments
  • Coordinator: Aggregates validated gradients into global model updates

The mechanism must ensure that honest participation is a dominant strategy — no participant should be able to increase their payoff by deviating from the protocol.

1.2 Game-Theoretic Frameworks

Shapley Values

The Shapley value from cooperative game theory gives each player their marginal contribution averaged over all possible orderings of players. For distributed training:

  • Player i’s Shapley value = average improvement in model loss when player i’s gradient is added, computed over all possible subsets of other players’ gradients
  • Properties: Efficiency (total value is distributed), symmetry (equal contributors get equal pay), null player (zero-value contributions get zero pay), additivity

Application to FL: Multiple papers (MDPI Axioms 2023, Wireless Communications 2022) design federated learning incentive mechanisms using Shapley values. The Shapley value is computed as:

phi_i = (1/|N|!) * sum over all orderings pi of [v(P_i(pi) ∪ {i}) - v(P_i(pi))]

where P_i(pi) is the set of players preceding i in ordering pi, and v(S) is the model performance using only the gradients from players in set S.

Practical problem: Computing exact Shapley values requires evaluating 2^N subsets — exponential in the number of participants. Approximation methods include:

  • Monte Carlo Shapley: Random sampling of permutations (~100–1000 samples)
  • Truncated Shapley: Only evaluate first k players in each permutation
  • Fuzzy Shapley: Extend to uncertain participation attitudes (Axioms 2024)
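The Monte Carlo approximation above fits in a few lines. Here `value_fn` is a hypothetical stand-in for evaluating model performance on a coalition's gradients, which in practice is the expensive part:

```python
import random

def monte_carlo_shapley(players, value_fn, num_samples=200, seed=0):
    """Approximate Shapley values by sampling random orderings.

    players:  list of player ids
    value_fn: maps a frozenset of players to a scalar coalition value
              (here: model performance using only those players' gradients)
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = players[:]
        rng.shuffle(order)
        coalition = frozenset()
        v_prev = value_fn(coalition)
        for p in order:
            coalition = coalition | {p}
            v_new = value_fn(coalition)
            phi[p] += v_new - v_prev   # marginal contribution of p in this ordering
            v_prev = v_new
    return {p: s / num_samples for p, s in phi.items()}

# Toy additive game: each player's gradient reduces loss by a fixed amount,
# so the exact Shapley value equals that amount.
contributions = {"a": 3.0, "b": 1.0, "c": 0.0}
v = lambda S: sum(contributions[p] for p in S)
print(monte_carlo_shapley(list(contributions), v))
```

For an additive game the estimate is exact regardless of sample count; for real gradient coalitions, the Monte Carlo error shrinks as O(1/sqrt(num_samples)).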

Key insight from recent work (arXiv 2504.05563, “Do Data Valuations Make Good Data Prices?”): Popular valuation methods like Leave-One-Out and Data Shapley make poor payment rules. They fail to ensure truthful reporting of costs, leading to inefficient market outcomes. The authors recommend adapting VCG and Myerson payments from mechanism design literature.

VCG (Vickrey-Clarke-Groves) Mechanism

VCG achieves truthful revelation as a dominant strategy. Applied to federated learning (arXiv 2008.06680, FVCG mechanism):

  • Each worker reports their cost of participation
  • The mechanism selects participants to maximize social surplus (total value minus total cost)
  • Worker i’s payment = value they provide to others (the marginal externality)
  • Properties: Dominant-strategy incentive compatible (DSIC), individually rational (IR), Pareto efficient, weakly budget-balanced

FVCG payment formula:

Payment_i = V(S*) - V(S*\{i}) + c_i
           = (social welfare with i) - (social welfare without i) + i's reported cost

This makes each worker’s utility equal to their marginal social contribution, aligning individual and collective incentives.

Practical limitation: VCG requires the coordinator to know or estimate the value function V(S), which means evaluating model performance with and without each participant — similar overhead to Shapley computation.
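A brute-force sketch of the selection-and-payment rule above, exponential in the number of workers, which mirrors the limitation just noted. `value_fn` is again a hypothetical coalition-value oracle:

```python
from itertools import combinations

def vcg_select_and_pay(costs, value_fn):
    """FVCG-style selection and payments (sketch).

    costs:    dict worker -> reported cost of participating
    value_fn: maps a frozenset of workers to the training value V(S)

    Selects S* maximizing social surplus V(S) - sum of costs, then pays each
    selected worker per the formula above:
        payment_i = V(S*) - V(S* \ {i}) + c_i
    so utility_i = payment_i - c_i = worker i's marginal contribution.
    """
    workers = list(costs)
    best_S, best_surplus = frozenset(), value_fn(frozenset())
    for r in range(1, len(workers) + 1):
        for combo in combinations(workers, r):
            S = frozenset(combo)
            surplus = value_fn(S) - sum(costs[w] for w in S)
            if surplus > best_surplus:
                best_S, best_surplus = S, surplus
    payments = {w: value_fn(best_S) - value_fn(best_S - {w}) + costs[w]
                for w in best_S}
    return best_S, payments

# Worker "b" is excluded: its value (2) is below its reported cost (3).
values = {"a": 5.0, "b": 2.0}
costs = {"a": 1.0, "b": 3.0}
selected, payments = vcg_select_and_pay(costs, lambda S: sum(values[w] for w in S))
print(selected, payments)
```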

Scoring Rules (Proper Scoring Rules)

A strictly proper scoring rule incentivizes an agent to truthfully report their probability distribution (or in this case, their gradient estimate). Examples:

  • Logarithmic scoring rule: Pays log(p) where p is the probability assigned to the realized outcome
  • Brier score: Pays 2p - sum(p_i^2), where p is again the probability assigned to the realized outcome — bounded and simple to compute
  • Peer prediction: Score worker i’s gradient against worker j’s gradient on the same data — doesn’t require ground truth

For gradient validation, a natural scoring rule: pay proportional to the correlation between the submitted gradient and the true gradient (estimated by the validator through sampling).
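A minimal version of that scoring rule, assuming the validator has already estimated a reference gradient by sampling:

```python
import math

def gradient_score(submitted, reference):
    """Payment proportional to the cosine similarity between the submitted
    gradient and a validator-estimated reference gradient; negative
    similarity is clipped to zero so harmful gradients earn nothing."""
    dot = sum(a * b for a, b in zip(submitted, reference))
    n_sub = math.sqrt(sum(a * a for a in submitted))
    n_ref = math.sqrt(sum(b * b for b in reference))
    if n_sub == 0.0 or n_ref == 0.0:
        return 0.0                      # zero/empty gradients earn nothing
    return max(0.0, dot / (n_sub * n_ref))

honest = gradient_score([1.0, 2.0, -0.5], [1.1, 1.9, -0.4])   # close to 1
garbage = gradient_score([0.0, 0.0, 0.0], [1.1, 1.9, -0.4])   # 0.0
```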

Stackelberg Games

Model the coordinator as a leader and workers as followers. The coordinator announces a pricing/reward scheme, workers decide whether and how much to participate. Recent papers (IEEE TII 2023) use Stackelberg equilibrium for industrial IoT federated learning:

  • Leader: sets reward per unit of quality and minimum quality threshold
  • Followers: choose effort level to maximize (reward - cost)
  • Equilibrium: coordinator finds the reward schedule that maximizes their objective subject to workers’ participation constraints

Auction Mechanisms

Workers bid their cost of participation, coordinator runs an auction:

  • Reverse auction: Workers bid to provide computation, lowest bidders win
  • Combinatorial auctions: Account for complementarities between workers (e.g., workers with diverse data are jointly more valuable)

1.3 Pricing Heterogeneous Contributions

Different GPU types produce gradients of different quality and at different speeds. Pricing must account for:

Hardware heterogeneity:

  • H100 vs A100 vs consumer GPUs: FLOPS/$ varies by 1.2–3x depending on task
  • Memory bandwidth matters more for some operations: workstation GPUs (A40, A6000, L40) offer 1.2x higher memory bandwidth and 1.8x greater memory capacity per unit price than datacenter GPUs (arXiv 2502.00722)
  • Communication bandwidth determines synchronization frequency

Gradient quality heterogeneity:

  • Workers with larger local batch sizes produce lower-variance gradients
  • Workers training on higher-quality data produce more informative gradients
  • Slower workers may produce stale gradients if synchronization is asynchronous

Practical pricing approaches:

  1. Output-based pricing (Templar/Covenant approach): Price purely by measured loss reduction. Workers who produce better gradients — regardless of hardware — earn more. Simple and incentive-compatible but ignores cost differences.
  2. Cost-adjusted pricing: Pay = quality_score / reported_cost. Workers with cheaper hardware earn more per dollar if they produce equal-quality gradients.
  3. Benchmark-normalized pricing (BOINC approach): Normalize contributions by a hardware benchmark (e.g., Whetstone FLOPS). One BOINC “cobblestone” = 1/200 of a day’s work on a 1 GFLOPS machine. This measures effort rather than outcome.
  4. Market-based pricing: Let workers set their own prices (reverse auction). The coordinator selects the cheapest workers that meet quality thresholds. Markets naturally discover the right price for heterogeneous hardware.

1.4 Preventing Free-Riding

Free-riders submit no work (or minimal work) but claim rewards. Attack variants:

  • Zero-gradient attack: Submit zero or random gradients
  • Replay attack: Resubmit another worker’s gradient
  • Partial work: Train for fewer steps than required
  • Data-skipping: Process only easy examples

Defenses:

  1. Loss-reduction scoring (Templar): Directly measures whether a gradient improves the model. Zero or random gradients score poorly. Replay attacks are caught by checking whether the gradient helps more on the worker’s assigned data than on random data.
  2. STD-DAGMM detection (FRAD, IEEE 2023): Detects free-riders by analyzing the statistical properties of submitted gradients — variance, norm, direction relative to other submissions.
  3. FRIDA (arXiv 2410.05020): Uses membership and property inference attacks to detect whether a worker actually trained on data — if their gradient doesn’t encode properties of the assigned data, they’re flagged.
  4. Weight evolution frequency (ScienceDirect 2024): Track how each worker’s model weights evolve over rounds. Free-riders show anomalous evolution patterns.
  5. Contribution measurement: Estimate each worker’s marginal contribution via:
    • Gradient direction similarity to aggregated gradient
    • Loss reduction on held-out validation set
    • Correlation with independently computed reference gradients
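A toy illustration of defenses 2 and 5 combined — flag gradients with anomalous norms or directions. Real detectors (STD-DAGMM, FRIDA) are far more sophisticated; the thresholds here are illustrative only:

```python
import math

def detect_free_riders(gradients, norm_tol=3.0):
    """Flag suspicious submissions with simple statistics: gradients whose
    norm deviates wildly from the median norm, or that point away from the
    mean direction of all submissions."""
    norms = sorted(math.sqrt(sum(x * x for x in g)) for g in gradients.values())
    median = norms[len(norms) // 2]
    dim = len(next(iter(gradients.values())))
    mean = [sum(g[i] for g in gradients.values()) / len(gradients)
            for i in range(dim)]
    flagged = set()
    for worker, g in gradients.items():
        n = math.sqrt(sum(x * x for x in g))
        if n == 0 or n > norm_tol * median or n < median / norm_tol:
            flagged.add(worker)        # zero work or anomalous magnitude
            continue
        dot = sum(a * b for a, b in zip(g, mean))
        if dot <= 0:                   # points away from the aggregate direction
            flagged.add(worker)
    return flagged

grads = {
    "honest1": [1.0, 1.0, 0.9],
    "honest2": [0.9, 1.1, 1.0],
    "zero":    [0.0, 0.0, 0.0],        # zero-gradient free-rider
    "huge":    [50.0, -40.0, 30.0],    # outsized, destabilizing submission
}
print(detect_free_riders(grads))
```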

1.5 Gradient Poisoning Defense

Malicious workers submit gradients designed to degrade the model (backdoor insertion, convergence disruption). This is a Byzantine fault tolerance problem.

Robust aggregation methods:

Method | Mechanism | Tolerance
Krum | Select the gradient closest (in Euclidean distance) to most others | f < n/2 - 1
Trimmed Mean | Remove top and bottom k% of values per coordinate, average the rest | k < 25%
Geometric Median | Minimize sum of distances to all gradients | f < n/2
SignGuard | Use element-wise sign of gradients for anomaly detection | Collaborative filtering
FLRAM | Isolation forests + DBSCAN on gradient magnitudes/signs | Adaptive
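As one concrete example from the table, coordinate-wise trimmed mean is only a few lines:

```python
def trimmed_mean(gradients, trim_frac=0.2):
    """Coordinate-wise trimmed mean: drop the lowest and highest trim_frac
    fraction of values in each coordinate, then average the rest."""
    n = len(gradients)
    k = int(n * trim_frac)
    dim = len(gradients[0])
    out = []
    for i in range(dim):
        vals = sorted(g[i] for g in gradients)
        kept = vals[k:n - k] if n - 2 * k > 0 else vals
        out.append(sum(kept) / len(kept))
    return out

# One Byzantine worker submits a huge gradient; trimming removes its influence.
grads = [[1.0], [1.1], [0.9], [1.0], [1000.0]]
print(trimmed_mean(grads, trim_frac=0.2))
```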

State of the art (2025):

  • Dynamic gradient filtering (ScienceDirect 2024): Adapts filtering threshold based on observed gradient distributions
  • Trajectory anomaly detection (Nature Scientific Reports 2024): Uses singular values of gradient matrices as features, processed by improved Isolation Forest
  • Adaptive adversaries survey (ePrint 2025/510): Shows that adaptive adversaries who observe the defense mechanism can circumvent most static defenses — defense must be randomized or adaptive

Key insight: No robust aggregation method is free. They all reduce the effective contribution of honest workers (you’re throwing away some good data). The overhead is typically 10–30% slower convergence versus naive averaging with no adversaries.

1.6 Sybil Attack Prevention

A Sybil attacker creates many fake identities to gain disproportionate influence (e.g., outvote honest validators, dilute rewards).

Defense layers:

  1. Stake-based (proof of stake): Each identity must lock up capital. Creating 100 fake identities requires 100x the stake. Bittensor requires staking TAO to participate.
  2. Hardware commitment: Require each identity to demonstrate unique hardware (proof of GPU). NVIDIA’s confidential computing features could attest to specific GPU serial numbers.
  3. Proof of personhood: Verify that each participant is a unique human (World/Worldcoin approach). Not practical for compute-heavy tasks where one person may legitimately run multiple machines.
  4. Reputation systems: New identities start with low reputation and earn it over time. Combined with stake, this makes Sybil attacks expensive over the long term.
  5. Performance-based filtering: If each identity must independently produce useful work, the cost of a Sybil attack scales linearly with the number of identities — there’s no advantage to splitting one worker into 10 fake workers unless they can share work.

Best practice for decentralized training: Combine stake (economic barrier) with performance scoring (each identity must independently demonstrate useful computation). This makes Sybil attacks purely wasteful — 10 fake identities do 10x the work of 1 real identity, for no additional reward.


2. Validation Without Trust

2.1 Covenant’s Gauntlet Validator (Detailed)

The Gauntlet is the validation system used in Covenant/Templar (Bittensor Subnet 3). Based on the Covenant-72B analysis:

Architecture:

The Gauntlet runs as a set of validator nodes on the Bittensor blockchain. Validators must stake TAO to participate. Their scoring of miners determines TAO emission distribution.

Scoring Pipeline:

  1. Loss Score (primary signal):
    • Validator takes miner’s compressed pseudo-gradient
    • Computes model loss on held-out batch BEFORE applying gradient: L_before
    • Applies the gradient to the model
    • Computes loss AFTER: L_after
    • Score = L_before - L_after (positive = gradient helped)
  2. Assigned vs. unassigned data check:
    • Each miner is assigned specific data shards
    • Validator checks: does the gradient help MORE on the miner’s assigned data than on random data?
    • If not, the miner likely copied someone else’s gradient or used unauthorized data
  3. Norm calibration:
    • Pseudo-gradients are scaled relative to the median norm across all submissions
    • Prevents miners from submitting outsized updates (could destabilize training) or undersized updates (minimal contribution)
  4. OpenSkill ranking:
    • Scores are accumulated over time using OpenSkill (a Bayesian rating system similar to TrueSkill/Elo)
    • Uses the Plackett-Luce model to rank miners within each evaluation window
    • Reputation is hard to game with a single round — you need sustained quality
  5. Liveness and sync checks:
    • Verify miners are synchronized with the current model state
    • Stale gradients (from old checkpoints) are rejected
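The loss-score step (step 1) reduces to two loss evaluations around a gradient application. A toy sketch, with a quadratic loss standing in for the real model:

```python
def loss(weights, batch):
    """Toy quadratic loss: mean squared distance of the weights from
    per-example targets (a stand-in for the real model loss)."""
    return sum(sum((w - t) ** 2 for w, t in zip(weights, x))
               for x in batch) / len(batch)

def score_gradient(weights, pseudo_grad, held_out_batch, lr=0.1):
    """Gauntlet-style loss score: L_before - L_after on a held-out batch.
    Positive means the submitted pseudo-gradient helped."""
    l_before = loss(weights, held_out_batch)
    updated = [w - lr * g for w, g in zip(weights, pseudo_grad)]
    l_after = loss(updated, held_out_batch)
    return l_before - l_after

model = [0.0, 0.0]
batch = [[1.0, 1.0], [1.0, 1.0]]           # targets the model should move toward
good = [-2.0, -2.0]                        # descent direction (toward targets)
bad = [2.0, 2.0]                           # ascent direction
print(score_gradient(model, good, batch))  # positive score
print(score_gradient(model, bad, batch))   # negative score
```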

Validation overhead:

  • Not every miner is evaluated every round — random subsets
  • Validator computes two forward passes per miner per evaluation (before/after)
  • For a 72B model, each forward pass on a validation batch takes ~30–60 seconds on validator hardware
  • Total validation compute is ~5–15% of total training compute (estimated)

Key design insight: The loss-reduction scoring creates a directly measurable, objective metric that doesn’t require re-running the full training. You’re checking the gradient’s effect on a small held-out batch, not reproducing the entire training run.

2.2 Proof of Learning (PoL)

Paper: Jia et al., IEEE S&P 2021 (arXiv 2103.05633)

Protocol:

  • During training, the prover logs a training transcript: intermediate model checkpoints (weights at intervals), training data ordering, hyperparameters, random seeds
  • The proof P(T, f_W) = (W, I, H, A) where:
    • W = model weights at checkpoints
    • I = data point ordering information
    • H = signatures of training data points
    • A = auxiliary info (hyperparameters, architecture)
  • A verifier replays segments of the training transcript (random subset of checkpoint intervals) and checks that the weight changes are consistent with gradient descent on the claimed data

Verification overhead:

  • Complexity: O(E × Q × k × C|W|), where E = number of training epochs, Q = fraction of intervals verified, k = steps per interval, C|W| = cost of one update step
  • At Q = 10% (verify 10% of intervals), overhead is ~10% of training compute
  • The key finding: an adversary seeking to manufacture a fake PoL must perform at least as much work as genuine training
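The verification loop can be sketched as follows; `replay_fn` is a hypothetical stand-in for re-running the prover's training procedure over one checkpoint interval:

```python
import random

def verify_pol_segment(checkpoints, replay_fn, q=0.1, tol=1e-6, seed=0):
    """Proof-of-Learning spot check (sketch): re-execute a random fraction q
    of checkpoint intervals and compare the replayed weights to the logged ones.

    checkpoints: list of (weights, segment_inputs) logged by the prover
    replay_fn:   re-runs training for one interval: (weights, inputs) -> weights
    """
    rng = random.Random(seed)
    n = len(checkpoints) - 1
    sample = rng.sample(range(n), max(1, int(q * n)))
    for i in sample:
        w_start, inputs = checkpoints[i]
        w_claimed, _ = checkpoints[i + 1]
        w_replayed = replay_fn(w_start, inputs)
        if any(abs(a - b) > tol for a, b in zip(w_replayed, w_claimed)):
            return False   # weight transition inconsistent with claimed training
    return True

# Toy deterministic "training": one gradient step per interval.
step = lambda w, xs: [wi - 0.1 * x for wi, x in zip(w, xs)]
honest = [([1.0], [0.5]), ([0.95], [0.5]), ([0.9], [0.5])]
forged = [([1.0], [0.5]), ([0.42], [0.5]), ([0.9], [0.5])]
print(verify_pol_segment(honest, step, q=1.0))  # True
print(verify_pol_segment(forged, step, q=1.0))  # False
```

Note the `tol` parameter: as the spoofing results below show, the tolerance needed to absorb stochastic gradient noise is exactly the window adversaries exploit.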

Spoofing attacks and defenses:

  • Directed retraining: Adversary knows final weights W_T, tries to reconstruct a plausible training path. Defense: verification checks statistical properties of the trajectory, not just endpoints.
  • Inverse gradient attack: Given W_t, solve for W_{t-1} that would lead to it. Defense: this is computationally hard and introduces detectable artifacts.
  • Limitations (arXiv 2208.03567, “Proof-of-Learning is Currently More Broken Than You Think”): Shows that PoL is vulnerable to spoofing attacks that manipulate tolerance parameters. The verification’s reliance on approximate matching (gradients are stochastic) creates a window for adversaries.

Enhancement: Watermarking-enhanced PoL requires attackers to reproduce both authentic training logs AND watermark-consistent ownership signals, increasing attack cost by >10x.

2.3 Zero-Knowledge Machine Learning (ZKML)

Concept: Use zero-knowledge proofs to verify that a claimed computation (training step, inference) was performed correctly, without revealing the model weights or data.

Survey: arXiv 2502.18535 (Feb 2025) — comprehensive survey of ZK-based verifiable ML.

Three categories:

  1. Verifiable training: Prove that model was trained correctly on claimed data
  2. Verifiable inference: Prove that output came from a specific model on specific input
  3. Verifiable testing: Prove model achieves claimed accuracy on a benchmark

Overhead numbers (from the survey):

System | Task | Proof Generation Time | Verification Time | Memory
zkCNN | VGG16 inference | 88.3 seconds | ~seconds | Moderate
zkDT | Decision tree (23 levels) | 250 seconds | ~seconds | Moderate
zkDL | 10M-parameter NN training | ~10 s (parallelized) | <1 s | High
MobileNet v2 | Inference verification | N/A | 10.27 seconds | Moderate
Transformer | General | N/A | N/A | 148 GB (!)

Constraint-to-parameter ratio for transformers: 58–85x — the ZK circuit is 58–85 times larger than the model computation itself.

Optimization strategies:

  • Quantization (ZEN): 5.4–22x constraint reduction through neural network quantization
  • Commitment optimization (Artemis): 7.3x improvement in prover time
  • Lookup tables: Precomputed values reduce division overhead

Bottom line: ZKML is currently impractical for large models. Verifying a single forward pass through a 72B model would require astronomical compute and memory. Useful today only for small models (<1M parameters) or specific inference verification. May become practical in 3–5 years with hardware acceleration and algorithmic improvements.

2.4 Other Validation Approaches

Proof of Training (Springer 2025): Blockchain network trains models and proves the training was performed correctly. Workers are rewarded with cryptocurrency proportional to computational contributions. Different from PoL in that the blockchain coordinates training rather than just verifying it.

Redundant computation (BOINC approach): Multiple workers compute the same task. Results must agree within tolerance. Quorum of agreement triggers credit. Overhead: at least 2–3x computation, but very simple and robust.

Statistical verification: For gradient computation specifically, you can verify a gradient by:

  1. Sampling a small random subset of the training batch
  2. Computing the gradient on that subset independently
  3. Checking that the submitted gradient is statistically consistent (cosine similarity, norm ratio)

This gives partial verification at 1–5% overhead rather than full recomputation.
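A sketch of those three steps; `grad_fn` is a hypothetical oracle computing the gradient on a subset, and the thresholds are illustrative:

```python
import math, random

def statistical_check(submitted, batch, grad_fn, sample_frac=0.05,
                      cos_min=0.8, norm_ratio_max=3.0, seed=0):
    """Partial gradient verification: recompute the gradient on a small
    random subset of the batch, then test cosine similarity and norm ratio
    against the submission."""
    rng = random.Random(seed)
    k = max(1, int(sample_frac * len(batch)))
    subset = rng.sample(batch, k)          # step 1: sample ~5% of the batch
    reference = grad_fn(subset)            # step 2: independent gradient
    dot = sum(a * b for a, b in zip(submitted, reference))
    n_sub = math.sqrt(sum(a * a for a in submitted))
    n_ref = math.sqrt(sum(b * b for b in reference))
    if n_sub == 0.0 or n_ref == 0.0:
        return False
    cosine = dot / (n_sub * n_ref)         # step 3: consistency checks
    ratio = max(n_sub, n_ref) / min(n_sub, n_ref)
    return cosine >= cos_min and ratio <= norm_ratio_max

# Toy gradient: the mean of the examples in the (sub)batch.
mean_grad = lambda xs: [sum(x[i] for x in xs) / len(xs)
                        for i in range(len(xs[0]))]
batch = [[1.0, 0.5]] * 40
honest = mean_grad(batch)                  # honestly computed submission
garbage = [-1.0, 2.0]                      # orthogonal to the true gradient
print(statistical_check(honest, batch, mean_grad))    # True
print(statistical_check(garbage, batch, mean_grad))   # False
```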

Commitment schemes: Worker commits to their gradient (hash or Merkle root) before seeing others’ gradients. After all commitments, gradients are revealed. Prevents copying attacks. Cheap (<1% overhead) but doesn’t prevent garbage gradients.
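A minimal commit-reveal sketch using SHA-256; the nonce prevents recovering a low-entropy gradient by guessing against its hash:

```python
import hashlib, json

def commit(gradient, nonce):
    """Hash commitment to a gradient; published before any reveals."""
    payload = json.dumps({"grad": gradient, "nonce": nonce}).encode()
    return hashlib.sha256(payload).hexdigest()

def verify_reveal(commitment, gradient, nonce):
    """Check a revealed gradient against its prior commitment."""
    return commit(gradient, nonce) == commitment

# Round 1: every worker publishes only a commitment.
c = commit([0.1, -0.2], "r4nd0m")
# Round 2: gradients are revealed and checked against the commitments.
assert verify_reveal(c, [0.1, -0.2], "r4nd0m")       # honest reveal accepted
assert not verify_reveal(c, [0.5, -0.2], "r4nd0m")   # altered gradient rejected
```

Copying is prevented because a copier would have to commit to someone else's gradient before that gradient is revealed.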

2.5 Verification Overhead Summary

Method | Overhead (% of training compute) | What it verifies | Adversary model
Loss-reduction (Gauntlet) | 5–15% | Gradient quality | Lazy/free-rider
Proof of Learning | ~10% (at 10% sampling) | Training integrity | Spoofing
ZKML | 58–85x (!) | Computational correctness | Any
Redundant computation | 100–200% | Exact correctness | Byzantine
Statistical sampling | 1–5% | Approximate correctness | Lazy/noisy
Commitment scheme | <1% | Non-copying | Plagiarism

Practical recommendation: Layer multiple cheap methods rather than using one expensive method. Commitment scheme (prevent copying) + statistical sampling (catch garbage) + loss-reduction scoring (measure quality) gives strong guarantees at ~10–20% total overhead.


3. Bitcoin Script / Lightning for Conditional Payments

3.1 What Bitcoin Script Can Express

Bitcoin Script is deliberately limited — it’s a stack-based, non-Turing-complete language. It can express:

  • Hash locks: “Payment unlockable by revealing preimage of hash H”
  • Time locks: “Payment unlockable only after block height N / time T”
  • Signature checks: “Payment requires valid signature from key K”
  • Multi-signature: “Payment requires M-of-N signatures”
  • Conditional branches: OP_IF / OP_ELSE / OP_ENDIF

It cannot express:

  • Arbitrary computation (no loops, no state)
  • Floating-point arithmetic
  • Complex data structures
  • Direct verification of ML computations

3.2 HTLCs (Hash Time-Locked Contracts)

The building block of Lightning payments:

OP_IF
    OP_HASH160 <hash_of_preimage> OP_EQUALVERIFY
    <recipient_pubkey> OP_CHECKSIG
OP_ELSE
    <timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP
    <sender_pubkey> OP_CHECKSIG
OP_ENDIF

Semantics: Recipient can claim by revealing the preimage of the hash. If they don’t claim before the timeout, sender gets their money back.

For gradient payments: The preimage could encode gradient metadata, but HTLCs alone can’t verify gradient quality. You need an external oracle to determine whether the gradient was good, then release the preimage.

3.3 PTLCs (Point Time-Locked Contracts)

PTLCs replace hash locks with adaptor signatures on elliptic curve points:

  • Instead of sharing hash H and requiring preimage r, use point P = r*G on secp256k1
  • Each hop in a multi-hop payment uses a different point (better privacy than HTLCs which share the same hash across all hops)
  • Requires Schnorr signatures (available since Bitcoin’s Taproot upgrade, Nov 2021)

Advantages over HTLCs:

  • Privacy: different adaptor per hop, no wormhole attacks
  • Efficiency: smaller on-chain footprint with Taproot
  • Composability: can combine with other signature conditions

For conditional computation payments: PTLCs enable privately-conditional payments — the condition (adaptor point) is hidden from the blockchain. An oracle could produce a signature adaptor that corresponds to a specific computation result.

3.4 Hold Invoices (Lightning Escrow)

A hold invoice (hodl invoice) in Lightning allows the receiver to delay settlement:

  1. Sender pays the invoice, locking funds in an HTLC
  2. Receiver sees the payment but doesn’t immediately settle (doesn’t reveal the preimage)
  3. An external condition is checked (oracle attestation, computation verification)
  4. If condition met: receiver settles (reveals preimage, claims funds)
  5. If condition not met: payment times out, sender gets refund

This is the key primitive for “pay only if gradient improves loss”:

1. Worker generates hold invoice for gradient payment
2. Coordinator pays the hold invoice (funds locked)
3. Worker submits gradient
4. Coordinator/validator evaluates gradient quality
5. If quality >= threshold:
     Coordinator releases preimage to worker (or oracle does)
     Worker settles the invoice, receives sats
6. If quality < threshold:
     Invoice times out (CLTV expiry)
     Coordinator's funds return automatically
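The flow above can be modeled as a small state machine. This is a simulation only — a real deployment would drive a Lightning node's hold-invoice support (e.g., in LND) rather than this toy class, and the quality threshold is illustrative:

```python
import hashlib, secrets

class HoldInvoice:
    """Toy state machine for the hold-invoice flow above."""
    def __init__(self, amount_sats, payment_hash, cltv_expiry):
        self.amount_sats = amount_sats
        self.payment_hash = payment_hash
        self.cltv_expiry = cltv_expiry
        self.state = "open"            # open -> accepted -> settled | cancelled

    def pay(self):
        """Sender locks funds in the HTLC; receiver has not yet settled."""
        if self.state == "open":
            self.state = "accepted"

    def settle(self, preimage):
        """Settling requires the preimage of the payment hash."""
        if (self.state == "accepted"
                and hashlib.sha256(preimage).hexdigest() == self.payment_hash):
            self.state = "settled"
        return self.state == "settled"

    def timeout(self):
        """CLTV expiry passed without settlement: funds return to sender."""
        if self.state != "settled":
            self.state = "cancelled"

# The preimage holder reveals it only if quality clears the threshold.
preimage = secrets.token_bytes(32)
inv = HoldInvoice(10_000, hashlib.sha256(preimage).hexdigest(), cltv_expiry=144)
inv.pay()                              # step 2: coordinator's funds locked
quality, threshold = 0.12, 0.05        # step 4: validator measures loss reduction
if quality >= threshold:
    inv.settle(preimage)               # step 5: worker receives sats
else:
    inv.timeout()                      # step 6: refund to coordinator
print(inv.state)
```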

Current implementations:

  • LND supports hold invoices natively
  • CLN (Core Lightning) supports them via plugins
  • Supertestnet’s hodlcontracts — oracle + escrow system for Lightning with three contract templates (trading, lending, betting)

Limitations:

  • Hold invoices lock liquidity in the payment channel for the entire hold period
  • Long hold times (hours) can strain channel capacity
  • The HTLC timeout must be set conservatively (computation time + validation time + buffer)
  • Maximum HTLC count per channel is limited (483 in the spec) — can’t have thousands of outstanding hold invoices

3.5 Can You Do “Pay Only If Gradient Improves Loss” on Lightning?

Yes, with an oracle pattern. Here’s the specific mechanism:

Design: Oracle-Attested Gradient Payment

Actors:
  - Worker: computes gradient
  - Coordinator: aggregates gradients, updates model
  - Validator Oracle: evaluates gradient quality, attests result

Protocol:
  1. Coordinator creates a hold invoice: "Pay W sats to Worker, locked by preimage P"
  2. Coordinator locks funds via the hold invoice HTLC
  3. Worker computes gradient G on assigned data shard
  4. Worker submits G to Coordinator
  5. Coordinator sends G to Validator Oracle
  6. Validator runs:
       L_before = loss(model, validation_batch)
       model' = apply(model, G)
       L_after = loss(model', validation_batch)
       quality = L_before - L_after
  7. If quality > threshold:
       Oracle reveals preimage P (or signs an adaptor)
       Worker claims payment
  8. If quality <= threshold:
       Oracle withholds preimage
       HTLC times out, funds return to Coordinator

Trust assumptions: The validator oracle must be honest. Mitigation:

  • Use multiple independent validators (majority agreement)
  • Rotate validators randomly per round
  • Validators stake collateral (slashable if caught cheating)
  • Computation is deterministic — anyone can verify the oracle’s claim by replaying the loss evaluation

3.6 DLCs (Discreet Log Contracts) for Training Outcome Bets

DLCs enable oracle-dependent conditional payments that are private and efficient:

How DLCs work:

  1. Alice and Bob deposit funds into a 2-of-2 multisig (the “funding transaction”)
  2. They pre-sign a set of Contract Execution Transactions (CETs), one for each possible outcome
  3. Each CET distributes the locked funds according to the outcome it represents
  4. The oracle commits to a public nonce R before the event
  5. When the event occurs, the oracle attests the outcome by publishing signature s = k - hash(R, outcome) × x
  6. The winning party uses the oracle’s signature to complete their CET’s signature and broadcast it

For training outcome bets:

Scenario: Alice bets that a decentralized training run will achieve MMLU > 70 within 30 days. Bob bets it won’t.

Funding: Alice deposits 0.5 BTC, Bob deposits 0.5 BTC into 2-of-2 multisig

CETs:
  - Outcome "MMLU > 70": Alice gets 0.9 BTC, Bob gets 0.1 BTC
  - Outcome "MMLU <= 70": Alice gets 0.1 BTC, Bob gets 0.9 BTC

Oracle: Evaluates model on MMLU benchmark at deadline
  - Publishes signature attesting to the actual MMLU score
  - Winning party uses oracle signature to complete and broadcast winning CET

Numeric outcome DLCs: For continuous outcomes (exact MMLU score, loss value), DLCs can encode ranges using binary decomposition. The oracle attests to each digit of the outcome independently, and the CET distribution is a function of the numeric value.
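The digit bookkeeping for numeric-outcome DLCs can be illustrated without any cryptography; `prefix_covers` shows how a single pre-signed CET over a digit prefix can cover a whole range of outcomes:

```python
def digit_decompose(value, base=2, num_digits=10):
    """Decompose a non-negative integer outcome into digits, most
    significant first; the oracle attests each digit separately."""
    digits = []
    for _ in range(num_digits):
        digits.append(value % base)
        value //= base
    return digits[::-1]

def prefix_covers(prefix, value, base=2, num_digits=10):
    """True if `value` falls in the outcome range represented by a digit
    prefix — i.e., one CET per prefix covers the whole range."""
    return digit_decompose(value, base, num_digits)[:len(prefix)] == prefix

# With 10 binary digits, the prefix [0, 0, 0, 1] covers outcomes 64..127,
# so one CET handles every MMLU-style score in that band.
print(prefix_covers([0, 0, 0, 1], 73))    # True: 73 is in 64..127
print(prefix_covers([0, 0, 0, 1], 200))   # False: 200 is outside the band
```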

DLCs on Lightning:

  • DLC channels can be routed through Lightning payment channels
  • CETs function as off-chain commitments within the channel
  • Settlement is instant (no on-chain transaction needed unless disputed)
  • Papers: “Discreet Log Contract Channels and Integration in the Lightning Network” (Kuwahara)
  • Implementation: Suredbits

Practical DLC applications for decentralized training:

  1. Quality bounties: “I’ll pay X sats if the next training round reduces loss by > Y”
  2. Milestone contracts: “Pay on each of: 50% training complete, 75% complete, benchmark target hit”
  3. Performance insurance: “If the model regresses (loss increases), coordinator pays workers a penalty”
  4. Compute futures: “Lock in a price now for GPU time delivered over the next week”

3.7 What Lightning Can and Cannot Express

CAN express (with oracles):

Payment Type | Mechanism | Practical?
Pay per validated gradient | Hold invoice + oracle attestation | Yes
Pay proportional to quality | Multiple hold invoices of varying amounts; oracle selects which to settle | Clunky but possible
Escrow with refund | Hold invoice with CLTV timeout | Yes, native
Milestone payments | DLC with multiple outcome ranges | Yes
Training outcome bets | DLC with numeric oracle | Yes
Atomic multi-party payments | Multi-hop HTLCs | Yes

CANNOT express (fundamental limitations):

Payment Type | Why Not | Workaround
On-chain gradient verification | Script can’t do ML math | Oracle attestation
Continuous payment streams | Lightning payments are discrete | Frequent micropayments
Proportional payment (exact ratio) | Script can’t compute ratios | Pre-define a set of amounts
Multi-round commitments | HTLCs are single-use | New invoice per round
Slashing (take FROM a participant) | Lightning is push-only | Pre-locked collateral in a DLC

Key constraint: Lightning cannot verify computation. All gradient quality assessment must happen off-chain, with the result attested by an oracle. The trust model shifts from “trust the computation” to “trust the oracle” — but the oracle’s job (loss evaluation) is deterministic and publicly verifiable by anyone who has the model and validation data.


4. Existing Literature & Projects

4.1 Bittensor Incentive Mechanism — Formal Analysis

Yuma Consensus is Bittensor’s on-chain mechanism for computing validator and miner emissions:

  • Validators submit a weight matrix (their scores of each miner)
  • Yuma Consensus applies stake-weighted median clipping to resist outlier validators
  • Exponentially smoothed bonds reward validators for consensus alignment
  • Emissions are split: 41% miners, 41% validators, 18% subnet creator
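A simplified sketch of the median-clipping idea (the actual Yuma Consensus implementation differs in detail; `kappa` here is an illustrative clipping multiplier):

```python
def stake_weighted_median(scores, stakes):
    """Stake-weighted median of validator scores for one miner: the smallest
    score at which the cumulative stake of validators scoring at or below it
    reaches half the total stake."""
    total = sum(stakes[v] for v in scores)
    running = 0.0
    for score, validator in sorted((s, v) for v, s in scores.items()):
        running += stakes[validator]
        if running >= total / 2:
            return score

def clip_weights(scores, stakes, kappa=1.0):
    """Yuma-style clipping (sketch): each validator's score for a miner is
    clipped to at most kappa times the stake-weighted median, so an outlier
    validator cannot unilaterally inflate a miner's emission."""
    consensus = stake_weighted_median(scores, stakes)
    return {v: min(s, kappa * consensus) for v, s in scores.items()}

stakes = {"v1": 100.0, "v2": 100.0, "v3": 10.0}
scores = {"v1": 0.5, "v2": 0.6, "v3": 10.0}   # v3 tries to pump a miner
print(clip_weights(scores, stakes))           # v3's outlier score is clipped
```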

Critical empirical analysis (arXiv 2507.02951, peer-reviewed):

Stake concentration:

  • Top 1% of wallets control a median of 89.8% of stake across 64 subnets
  • Gini coefficient: 0.9825 (extreme inequality)
  • Over half of subnets: fewer than 1% of wallets needed for 51% attack

Performance-reward correlation:

  • Validator stake→reward: r = 0.80–0.95 (dominant)
  • Validator performance→reward: r = 0.50 (moderate)
  • Miner stake→reward: r = 0.50–0.80
  • Miner performance→reward: r = 0.10–0.30 (very weak!)

Translation: “Economic power translates directly into earnings regardless of actual contribution quality.” The system pays the wealthy, not the productive.

Proposed fixes:

  1. Performance-weighted emission split: +0.032 performance→reward, only -0.018 stake→reward
  2. Composite scoring: +0.36 performance→reward but catastrophic -0.91 stake→reward
  3. Performance bonus multiplier: conservative +0.009 improvement, minimal disruption
  4. 88th percentile stake cap: 20x improvement in coalition size needed for 51% attack

Takeaway for mechanism design: Bittensor demonstrates that stake-weighted consensus is fundamentally at odds with quality-weighted rewards. Any system using staking for security will tend toward plutocracy unless explicitly corrected by performance metrics. The Gauntlet (loss-reduction scoring) is Covenant’s attempt to solve this, but it operates within Yuma Consensus which still overweights stake.

4.2 Hivemind / Learning@home

Project: github.com/learning-at-home/hivemind (NeurIPS 2020)

Approach: Decentralized deep learning in PyTorch, designed for training on thousands of volunteers with unreliable connections. Uses Decentralized Mixture-of-Experts (DMoE) — different peers specialize in different parts of the model.

Coordination: Kademlia-based Distributed Hash Table (DHT) for peer discovery. Scales to tens of thousands of peers with logarithmic search complexity.

Incentive design: Hivemind notably did NOT implement monetary incentives. It relied on:

  • Volunteer altruism (like BOINC/Folding@Home)
  • Academic credit and community recognition
  • Shared access to the resulting model

Result: The project demonstrated the technical feasibility of decentralized training but failed to attract large-scale sustained participation without monetary incentives. This is the core lesson — volunteer computing works for science (protein folding has intrinsic appeal) but struggles for general ML training where the output model is the only incentive.

Technical legacy: Hivemind’s DHT and fault-tolerant aggregation code is used by subsequent projects including INTELLECT-1 and (indirectly) Bittensor subnets.

4.3 BOINC — Lessons from Volunteer Computing

BOINC credit system (boinc.berkeley.edu):

Credit design:

  • 1 cobblestone = 1/200 of a day’s work on a 1 GFLOPS machine
  • Credit has no monetary value — it’s a reputation/competition metric
  • Used for: individual progress tracking, inter-volunteer competition, per-project throughput metrics

Validation:

  • Redundant computing: Each work unit is sent to 2+ volunteers. Results must agree within tolerance.
  • Quorum: Minimum number of agreeing results before credit is granted
  • Canonical result: If all results agree, the most common result is canonical
  • Credit granted only on validated work: No validation = no credit

Cheating prevention:

  • Homogeneous redundancy: send identical work to similar platforms to enable comparison
  • Result validation via quorum agreement
  • Project-specific validators that check result plausibility
  • Volunteer reputation (running average of validation success rate)

Key lessons for decentralized training:

  1. Non-monetary incentives have a ceiling. BOINC attracted millions of volunteers, but peak participation was driven by SETI@home’s unique appeal. Most BOINC projects struggle for volunteers.
  2. Redundant computation is expensive but robust. 2–3x overhead is the price of trustless validation. For ML training, this is too expensive — you’d rather do 3x more training than validate 3x.
  3. Credit gaming is real. BOINC had persistent problems with volunteers gaming the credit system (overclocking, reporting inflated FLOPS, running on faster hardware than reported).
  4. Competition works. Teams and leaderboards drove significant participation. Folding@home’s points system + team competition sustained engagement for decades.
  5. Validation must be cheap relative to computation. BOINC’s approach (re-compute and compare) only works when the work units are small. For large neural network training, alternative validation methods are needed.

4.4 Folding@Home

Points system:

  • Points awarded based on work unit difficulty and completion time
  • Bonus points for completing work units quickly (before deadline)
  • Individual and team leaderboards
  • No monetary value — purely competitive

Scale: At peak (COVID-19, 2020), Folding@Home exceeded 2.4 exaFLOPS — more than the top 500 supercomputers combined. Driven by viral social media and the concrete goal of COVID drug discovery.

Lesson: A compelling narrative (cure diseases!) can substitute for monetary incentives, but only temporarily. Participation declined 90%+ after COVID interest waned.

4.5 Gridcoin — Bridging BOINC and Cryptocurrency

Gridcoin adds cryptocurrency rewards to BOINC contributions:

  • Miners earn GRC tokens by contributing to whitelisted BOINC projects
  • Token reward proportional to BOINC credit earned (Proof of BOINC)
  • Whitelist prevents gaming with self-created BOINC projects

Lesson: Gridcoin proved that cryptocurrency can incentivize volunteer computing, but the token’s low market value ($0.01–$0.05/GRC) meant the economics rarely covered electricity costs. The incentive only works when the token has sufficient market value.

4.6 Academic Literature Summary

Key papers on incentive-compatible distributed learning:

  • Jia et al., “Proof of Learning” (2021): training transcript verification, O(10%) overhead
  • Cong et al., “FVCG” (2020): VCG mechanism for FL, truthful cost reporting
  • “Gradient-Driven Rewards” (NeurIPS 2021): guarantees fairness via measured gradient contribution
  • “Federated Learning Incentive via Shapley” (MDPI 2023): Pareto-optimal payoff allocation
  • “Incentive-Based FL” (arXiv 2510.14208, 2025): survey of architectural elements and future directions
  • ICLR 2025 conference paper: fine-grained influence propagation in decentralized networks
  • “Coin.AI” (MDPI Entropy, 2019): proof-of-useful-work for blockchain-based distributed deep learning
  • DLchain (SERVICES 2020): blockchain with deep learning as PoUW
  • “Proof of Training” (Springer, 2025): verifiable model training via blockchain delegation

Survey taxonomy (arXiv 2510.14208): FL incentive mechanisms fall into four technical approaches:

  1. Shapley values: Fair but computationally expensive
  2. Stackelberg games: Leader-follower optimal pricing
  3. Auctions: Market-based resource allocation
  4. Contracts: Principal-agent with screening/signaling

5. Autoresearch Bounty Design

5.1 Structuring a Bounty: “Improve This Metric by X%”

The autoresearch pattern (mutate file, evaluate, keep improvements, discard regressions) naturally suggests a bounty structure. Here’s how to design one for decentralized workers:

Basic bounty structure:

Bounty: Improve email classification accuracy from 87% to 92%+
  Mutable: system_prompt.txt
  Eval: python eval.py --corpus labeled_emails.jsonl
  Metric: accuracy (higher is better)
  Payment: 50,000 sats per percentage point improvement
  Verification: coordinator runs eval.py independently
  Duration: 72 hours
  Holdback: 20% of payment released after 7-day holdout eval

Key design parameters:

  1. Fixed vs. proportional payment:
    • Fixed bounty: “Pay 100,000 sats for reaching 92% accuracy.” Simple, clear. Risk: pays the same for 92.01% and 99%.
    • Proportional to improvement: “Pay 10,000 sats per 0.1% above baseline.” Better incentive alignment. Risk: small improvements earn small amounts (may not motivate).
    • Recommended hybrid: Fixed base payment for reaching threshold + proportional bonus above it. Example: 50,000 sats for reaching 92%, plus 5,000 sats per 0.1% above 92%.
  2. First-past-the-post vs. tournament:
    • First-past-the-post: First worker to submit an improvement above threshold wins. Simple but discourages incremental improvement and rewards speed over quality.
    • Tournament: All submissions within deadline are evaluated, best wins. Better for quality but workers may withhold improvements until deadline (strategic delay).
    • Rolling tournament: Evaluate each submission as it arrives. If it improves on the current best, lock payment via hold invoice. If a better submission arrives before settlement, cancel previous hold and lock for new best. Workers have incentive to submit early (time value of money).
  3. Multiple concurrent workers:
    • Allow parallel exploration with different strategies
    • Only the best result gets paid (tournament)
    • Optionally: pay top-N (e.g., top 3 get 50%, 30%, 20% of pool)
    • Shapley-value based: pay each worker proportional to their marginal contribution to the final result
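The recommended hybrid from point 1 is easy to pin down in code. A sketch using the example numbers from the text (the function name and signature are mine):

```python
def bounty_payout(accuracy: float,
                  threshold: float = 0.92,
                  base_sats: int = 50_000,
                  bonus_per_tenth_pt: int = 5_000) -> int:
    """Hybrid scheme: fixed base for reaching the threshold, plus a
    proportional bonus for every 0.1 percentage point above it."""
    if accuracy < threshold:
        return 0
    tenths_above = round((accuracy - threshold) * 1000)  # 0.001 = 0.1 pp
    return base_sats + tenths_above * bonus_per_tenth_pt
```

For example, a 93.5% submission earns the 50,000 sat base plus 15 bonus increments, 125,000 sats in total.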

5.2 Verification: Confirming Genuine Improvement

The core challenge: How does the coordinator know the improvement is real and generalizes, not just overfitting to the eval set?

Multi-layer verification:

  1. Reproduction check: Coordinator runs the exact same eval.py on the exact same corpus. Deterministic evaluation (temperature=0, fixed seeds) should produce identical scores.
  2. Holdout evaluation: Run the mutated file against a held-out test set that the worker never sees. This catches overfitting to the eval corpus.
    Payment structure:
      80% released on eval set improvement
      20% released after holdout set evaluation (24-48 hours later)
  3. Temporal stability: Run the eval at multiple time points (same prompt, different API calls if LLM-based). Average over 3–5 runs. This catches non-deterministic gaming.
  4. Human review (spot check): For prompt optimization bounties, a human reviews the top-scoring submission to verify it’s not gaming the metric. This is the Goodhart defense of last resort.
  5. Canary detection: Include a few “canary” examples in the eval set that are deliberately near-impossible. A submission that scores perfectly on the canaries almost certainly has access to the answers and is gaming the metric.

Lightning implementation:

1. Worker submits improved prompt
2. Coordinator locks 80% payment via hold invoice (HTLC, 24h timeout)
3. Coordinator runs eval.py → if score > threshold, settles 80% immediately
4. Coordinator runs holdout eval (next day) → if holdout score > threshold,
   pays remaining 20% via new invoice
5. If holdout eval fails → 20% is not paid (worker keeps 80%)
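The decision logic of steps 2-5 can be isolated from any particular Lightning node. In this sketch, `settle` and `cancel` are hypothetical callbacks wrapping your node's hold-invoice settle/cancel calls (not a real LND API), and the holdout score is passed in synchronously for simplicity, whereas the flow above runs it a day later:

```python
def settle_bounty(eval_score: float, holdout_score: float,
                  threshold: float, total_sats: int,
                  settle, cancel) -> int:
    """Returns the sats actually paid out under the 80/20 split."""
    stage1 = int(total_sats * 0.80)   # locked up front via hold invoice
    stage2 = total_sats - stage1      # paid via a second invoice later
    if eval_score <= threshold:
        cancel(stage1)                # hold invoice is cancelled / times out
        return 0
    settle(stage1)                    # 80% released on eval-set improvement
    if holdout_score > threshold:
        settle(stage2)                # remaining 20% after the holdout check
        return stage1 + stage2
    return stage1                     # holdout failed: worker keeps only 80%
```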

5.3 Payment Sizing

Cost-based pricing:

  • Estimate the compute cost for a reasonable number of experiments
  • Set bounty to cover costs + margin (1.5–3x compute cost)
  • Example: email classification bounty requires ~50 experiments, each costing ~$0.30 in API calls = $15 compute. Bounty should be $25–50 to be attractive.

Value-based pricing:

  • Estimate the value of the improvement to the bounty poster
  • Set bounty as fraction of that value (10–50%)
  • Example: better email classification saves 10 minutes/day of manual triage = ~$50/month. A 5% improvement saves $2.50/month. Over 2 years = $60 value. Bounty: $15–30.

Market-based pricing:

  • Post the bounty and let workers decide if it’s worth their time
  • If no takers at current price, increase it
  • If many takers, decrease it (or add more bounties)

5.4 Preventing Metric Gaming (Goodhart’s Law)

The fundamental tension: Any single metric, when optimized aggressively enough, will be gamed. “When a measure becomes a target, it ceases to be a good measure.”

Historical examples:

  • Autoresearch agents changing random seeds on the first experiment
  • AI leaderboards (Arena) gamed by selectively showcasing strongest model variants
  • BLEU scores in machine translation over-optimized at the expense of readability
  • Colonial Delhi’s cobra bounty led residents to breed cobras for the reward (the canonical Goodhart example)

Defenses specific to autoresearch bounties:

  1. Multi-metric scoring: Don’t optimize a single number. Use a weighted composite:
    score = 0.6 * accuracy + 0.2 * (1 - false_positive_rate) + 0.1 * brevity + 0.1 * holdout_accuracy
    Gaming requires simultaneously improving all components, which is much harder.
  2. Held-out evaluation set: Worker never sees the holdout set. Payment partially contingent on holdout performance. If eval score is 95% but holdout is 82%, the submission is rejected or penalized.
  3. Anti-memorization: Hash the eval examples and check that they don’t appear verbatim in the mutated file (prompt). This prevents the obvious attack of embedding the answers.
  4. Semantic review: For prompt optimization, require that the prompt is human-readable and doesn’t contain encoded information. Maximum prompt length constraint.
  5. Diverse eval sets: Rotate the eval set between rounds. Workers can’t overfit to a single fixed set if the set changes.
  6. Red-team evaluation: Include adversarial examples designed to exploit common gaming strategies. Score these separately.
  7. Budget cap on Goodharting: Accept that some metric gaming is inevitable. Set a ceiling: “maximum payment is 3x the baseline value” — this limits the reward for extreme gaming while still rewarding genuine improvement.
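Defenses 1 and 3 are mechanical enough to sketch directly. The weights below match the composite formula in defense 1; `leaks_eval_examples` is an illustrative name for the verbatim-embedding check, not an existing library function:

```python
import hashlib

WEIGHTS = {"accuracy": 0.6, "inv_fpr": 0.2,
           "brevity": 0.1, "holdout_accuracy": 0.1}

def composite_score(metrics: dict) -> float:
    """Defense 1: weighted composite; gaming must move every component."""
    return sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)

def leaks_eval_examples(prompt: str, eval_examples: list) -> bool:
    """Defense 3: reject prompts that embed eval examples verbatim.
    The digest check covers checkers that only hold hashes, not examples."""
    if any(example in prompt for example in eval_examples):
        return True
    digests = {hashlib.sha256(e.encode()).hexdigest() for e in eval_examples}
    return any(hashlib.sha256(line.strip().encode()).hexdigest() in digests
               for line in prompt.splitlines())
```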

The key insight: “The human’s job is program.md.” The quality of the bounty specification — the eval set, the metric, the constraints — determines the quality of the resulting optimization. A poorly specified bounty will be gamed. A well-specified bounty channels gaming into genuine improvement.


6. Synthesis: A Practical Lightning-Native Design

6.1 Architecture: Lightning Payment Channel for Gradient Exchange

Combining the research above into a concrete, implementable system:

+--------------------------------------------------------------+
|                      COORDINATOR NODE                        |
|                                                              |
|  - Maintains current model state                             |
|  - Assigns data shards to workers                            |
|  - Aggregates validated gradients                            |
|  - LND node with payment channels to workers/validators      |
|                                                              |
|  Channels:                                                   |
|    <-> Worker 1    (capacity: 500K sats)                     |
|    <-> Worker 2    (capacity: 500K sats)                     |
|    <-> Validator 1 (capacity: 100K sats)                     |
|    <-> Validator 2 (capacity: 100K sats)                     |
+--------------------------------------------------------------+

Payment flow per training round:
  1. Coordinator creates hold invoices for each active worker
  2. Workers compute gradients on assigned data
  3. Workers submit gradients + commitment hashes
  4. Validators randomly selected to evaluate subset of gradients
  5. Validators compute loss-reduction scores
  6. Coordinator pays validators a flat fee (settled immediately)
  7. Workers with quality > threshold: hold invoices settled (payment released)
  8. Workers with quality ≤ threshold: hold invoices time out (no payment)
  9. Coordinator aggregates validated gradients into model update

6.2 Payment Structure

Per-round worker payment:

base_payment = 1000 sats (covers electricity for one round of computation)
quality_bonus = max(0, quality_score - threshold) * 500 sats per unit
total_payment = base_payment + quality_bonus

Where quality_score = L_before - L_after on the validator’s held-out batch.

Validator payment:

validator_fee = 200 sats per gradient evaluated (flat fee)
accuracy_bonus = 100 sats if validator's score agrees with majority of other validators
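Both payout rules above fit in a few lines. The constants are the per-round figures from this section, and a below-threshold worker earns nothing at all because their hold invoice simply times out (function names are mine):

```python
def worker_payment(quality_score: float, threshold: float,
                   base_sats: int = 1_000, bonus_rate: int = 500) -> int:
    """quality_score = L_before - L_after on the validator's held-out batch."""
    if quality_score <= threshold:
        return 0                      # hold invoice times out: no payment
    return base_sats + round((quality_score - threshold) * bonus_rate)

def validator_payment(n_gradients: int, agreed_with_majority: bool,
                      fee_sats: int = 200, accuracy_bonus: int = 100) -> int:
    """Flat per-gradient fee plus a bonus for agreeing with the majority."""
    return n_gradients * fee_sats + (accuracy_bonus if agreed_with_majority else 0)
```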

Anti-Sybil:

  • Workers must open a payment channel with minimum capacity (100K sats = ~$50)
  • This serves as implicit stake — creating many fake identities requires proportional capital
  • Channels can be reused across rounds (amortize opening cost)

6.3 DLC Layer for Milestone Contracts

On top of the per-round Lightning payments, use DLCs for longer-term commitments:

DLC: Training Milestone Contract
  Parties: Sponsor (wants model trained) + Worker Pool
  Oracle: Independent evaluator who runs benchmark at milestones

  Funding: Sponsor deposits 0.1 BTC

  CETs:
    - "Model reaches 60 MMLU by week 2": Pool gets 0.03 BTC
    - "Model reaches 65 MMLU by week 4": Pool gets 0.03 BTC
    - "Model reaches 70 MMLU by week 8": Pool gets 0.04 BTC
    - "No milestone reached": Sponsor gets 0.1 BTC back

  Oracle attestation:
    - Oracle runs lm-eval at each deadline
    - Signs the MMLU score (numeric outcome)
    - Winning CET is constructed from oracle's signature
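In a real DLC each outcome is pre-signed as a CET, but the settlement arithmetic is easy to check on its own. This sketch assumes the milestones pay independently as listed, with anything unearned refunded to the sponsor:

```python
# Milestone schedule from the contract above: (week, MMLU threshold, payout in BTC)
MILESTONES = [(2, 60, 0.03), (4, 65, 0.03), (8, 70, 0.04)]
FUNDING_BTC = 0.10

def settle_milestones(attested_scores: dict) -> tuple:
    """attested_scores: oracle-attested MMLU score keyed by milestone week.
    Returns (pool_payout_btc, sponsor_refund_btc)."""
    pool = sum(payout for week, threshold, payout in MILESTONES
               if attested_scores.get(week, 0.0) >= threshold)
    return pool, round(FUNDING_BTC - pool, 8)
```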

6.4 What This Design Achieves

Incentive compatibility:

  • Workers are paid for quality (loss reduction), not just participation
  • Free-riders earn nothing (zero quality → threshold not met → timeout)
  • Gradient poisoners lose their opportunity cost (computed bad gradients, got no payment)
  • Validators are paid for honest evaluation (agreement with majority)

Trustlessness:

  • Payments are conditional on measurable, reproducible metrics
  • Hold invoices provide automatic refund if conditions aren’t met
  • DLCs provide private, oracle-attested milestone payments
  • No party can unilaterally seize funds

Limitations:

  • Requires an honest majority among validators (same as any BFT system)
  • Payment channels require upfront capital
  • HTLC timeout constrains maximum round duration
  • Scalability limited by Lightning channel capacity and HTLC count

6.5 Open Questions for Implementation

  1. Oracle trust model: Who are the validators and why should they be trusted? Options: staked validators (Bittensor-style), rotating committee, anyone who can reproduce the loss evaluation.
  2. Round timing: How long should each round last? Too short → communication overhead dominates. Too long → hold invoices lock liquidity too long.
  3. Worker discovery: How do workers find the coordinator and open channels? Options: Nostr relay announcements, DHT (Hivemind-style), centralized coordinator.
  4. Gradient privacy: Workers may want to keep their gradients private until paid. Commitment schemes help but add protocol complexity.
  5. Heterogeneous hardware pricing: Should workers with expensive GPUs get paid more? Pure output-based pricing (Templar approach) is simpler but may exclude workers with weaker hardware even if their contributions are valuable.
  6. Regulatory considerations: Paying for computation with Bitcoin may have tax and regulatory implications depending on jurisdiction.

References

Mechanism Design & Federated Learning

Bittensor & Decentralized Training

Validation & Verification

Byzantine Fault Tolerance

Free-Rider Detection

Bitcoin / Lightning

Volunteer Computing

Goodhart’s Law & Metric Gaming

GPU Heterogeneity

Sybil Resistance