Claim-to-Evidence Table

Claims should have a shape.

A public claim is useful only when it points to evidence, states what it does not prove, names the condition that would weaken it, and gives critics a bounded route to attack it.

Evidence shapes for claims, artifacts, receipts, and challenge routes.

Public Table v0

Bounded claims, public evidence, and downgrade triggers.

Each entry lists the claim, its evidence status, what it does not prove, its downgrade trigger, and its challenge route.

P02-C1
Claim: WisdomBench measures longitudinal learning from failure rather than single-shot task capability.
Evidence status: Public supporting evidence (GitHub, Hugging Face dataset, Zenodo record).
What it does not prove: Human-like wisdom, general deployment reliability, or that all agents learn from failure.
Downgrade trigger: Task leakage, scoring bugs, reproduction failure, or stronger baselines removing the longitudinal effect.
Challenge route: WisdomBench issue template

PCA-C1
Claim: High-risk AI action should not earn action credit until warrant and receipt closure exist.
Evidence status: Public protocol and interface demo.
What it does not prove: Live trading profit, private product performance, or universal safety.
Downgrade trigger: The public gate allows unsafe action, gives credit without receipts, or cannot reproduce its no-go boundary.
Challenge route: Proof-carrying action issue template

CREDIT-C1
Claim: Repair intent, pretty reports, bootstrap probes, and semantic summaries must not become metric, reward, denominator, or clean-learning credit until closed evidence exists.
Evidence status: Public boundary plus counterexample packet.
What it does not prove: That private repair queues are public, or that every private trace can be disclosed.
Downgrade trigger: A public artifact lets repair intent influence reward, denominator, clean-learning labels, or gate authority without closure.
Challenge route: Credit leak packet

AUTH-C1
Claim: Research-only, shadow, suggestion, no-go, or public demo outputs must not imply permission to act.
Evidence status: Public boundary plus no-go demo and review-status route.
What it does not prove: Live deployment safety, private product readiness, or permission to act in any external system.
Downgrade trigger: A UI label, API field, README, or public page turns a research artifact into action authority.
Challenge route: Authority leak packet

P24-C1
Claim: Adaptive systems need relational observability: relations, constraints, control debt, and evidence half-life.
Evidence status: Public protocol stage.
What it does not prove: A theorem covering all adaptive systems or a finished private product.
Downgrade trigger: Relation variables, control debt, or evidence half-life do not change decisions beyond scalar baselines.
Challenge route: Public counterexample route

P20-C1
Claim: Physical AI should route degraded evidence to recovery or abstention rather than direct action.
Evidence status: Public bounded support; rebuild needed before stronger deployment claims.
What it does not prove: Detector SOTA, offensive autonomy, or real-world robot deployment performance.
Downgrade trigger: Stronger conformal, shield, or fusion baselines handle the same degraded evidence without this boundary.
Challenge route: Public counterexample route

F1-C1
Claim: Trading is used as a high-risk testbed for proof-carrying action discipline, not as a public claim of live profitability.
Evidence status: Public boundary and technical boundary route.
What it does not prove: Live trading edge, customer readiness, private execution quality, or alpha dominance.
Downgrade trigger: Public language implies live profitability, private execution readiness, or authority beyond no-go evidence.
Challenge route: Boundary issue template
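
The shape of a registry entry can be sketched as a small record type whose fields are all required. The class name `ClaimRow`, its field names, and the `X-C1` example are hypothetical and chosen for illustration only; they are not a published schema. The `P02-C1` values are copied from the table.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ClaimRow:
    """One registry entry. Field names are illustrative, not a published schema."""
    claim_id: str
    claim: str
    evidence_status: str
    does_not_prove: str
    downgrade_trigger: str
    challenge_route: str

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_well_formed(self) -> bool:
        """An entry has the required shape only when every field is filled in."""
        return not self.missing_fields()


# Values taken from the P02-C1 entry.
p02 = ClaimRow(
    claim_id="P02-C1",
    claim="WisdomBench measures longitudinal learning from failure "
          "rather than single-shot task capability.",
    evidence_status="Public supporting evidence: GitHub, Hugging Face dataset, Zenodo record.",
    does_not_prove="Human-like wisdom, general deployment reliability, "
                   "or that all agents learn from failure.",
    downgrade_trigger="Task leakage, scoring bugs, reproduction failure, "
                      "or stronger baselines removing the longitudinal effect.",
    challenge_route="WisdomBench issue template",
)

# Hypothetical entry that states a claim but leaves its bounds blank.
unbounded = ClaimRow("X-C1", "An unbounded claim.", "Public demo.", "", "", "")

print(p02.is_well_formed())
print(unbounded.missing_fields())
```

Under these assumptions, an entry that omits what it does not prove, its downgrade trigger, or its challenge route fails the shape check before it can be listed.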

Rule

No evidence row is allowed to silently become a larger claim.

The public layer is deliberately narrow: it exposes protocols, manifests, bounded evidence, negative results, public demos, and repair routes. It does not expose private financial execution details, customer data, protected orchestration, commercial schedulers, credentials, or non-public venue materials.
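
One way to read this boundary is as a default-deny allowlist: only the listed public kinds are exposed, and anything else is withheld. The sketch below assumes artifacts carry a `kind` label; the label strings and the `may_publish` helper are hypothetical names, not part of any published interface.

```python
# Kinds the public layer exposes, per the paragraph above (labels are illustrative).
PUBLIC_KINDS = frozenset({
    "protocol",
    "manifest",
    "bounded_evidence",
    "negative_result",
    "public_demo",
    "repair_route",
})


def may_publish(kind: str) -> bool:
    """Default-deny check: a kind is public only if it is explicitly allowlisted."""
    return kind in PUBLIC_KINDS


print(may_publish("manifest"))
print(may_publish("credentials"))
```

An allowlist is used here rather than a blocklist so that a new, unclassified artifact kind stays private until someone deliberately adds it.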

If a critic can show a stronger baseline, a leakage path, a failed reproduction command, or an over-broad boundary, the claim should be narrowed or repaired. That is the point of the registry.
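
The challenge step can be sketched as a small state change: once any of the four challenge types is demonstrated, the claim is flagged for narrowing or repair and the challenge is logged. The type names `Challenge`, `Status`, and `ClaimState` are hypothetical, and the sketch does not decide between narrowing and repair, since the text leaves that choice open.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Challenge(Enum):
    """The four challenge types named in the paragraph above."""
    STRONGER_BASELINE = "stronger baseline"
    LEAKAGE_PATH = "leakage path"
    FAILED_REPRODUCTION = "failed reproduction command"
    OVER_BROAD_BOUNDARY = "over-broad boundary"


class Status(Enum):
    STANDING = "standing"
    NARROW_OR_REPAIR = "narrow or repair"


@dataclass
class ClaimState:
    """Tracks a claim's status and the challenges demonstrated against it."""
    claim_id: str
    status: Status = Status.STANDING
    demonstrated: List[Challenge] = field(default_factory=list)

    def record(self, challenge: Challenge) -> None:
        """Log a demonstrated challenge and flag the claim for narrowing or repair."""
        self.demonstrated.append(challenge)
        self.status = Status.NARROW_OR_REPAIR


state = ClaimState("P20-C1")
state.record(Challenge.STRONGER_BASELINE)
print(state.status.value)
```

Keeping the list of demonstrated challenges alongside the status means a later reader can see why a claim was downgraded, not just that it was.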