Spec-18-Conformance-and-Failure-Modes
What this spec defines
Acceptance scenarios that exercise Stigmem's safety behavior under federation partitions, malicious peer input, partial replication failure, and replay attempts.
Extraction status
This file contains the ADR-010 prose extraction for failure-mode acceptance scenarios. It defines scenario intent and expected outcomes. Concrete test harness layout and fixture implementation remain implementation details.
Legacy version labels from archived source material are normalized to the current v0.9.0a1 protocol line here. Historical wording remains available in spec/archive/evolution/ and spec/EVOLUTION.md.
Conformance gate
A conforming federation-capable node MUST demonstrate the scenarios in this spec or equivalent tests.
Equivalent tests may use different fixture names, ports, or process orchestration, but MUST preserve the setup, fault, and expected safety outcomes. Failure-mode tests SHOULD run against the same public HTTP and federation surfaces that clients use. White-box shortcuts MAY be used only to simulate network partitions, process crashes, clock state, or replay caches.
Split-brain
Setup. Two nodes, A and B, are federated with scope=public. Both begin with the same public facts.
Scenario.
- Cut network connectivity between A and B.
- Write fact F_a to node A for a shared entity/relation/scope.
- Write conflicting fact F_b to node B for the same entity/relation/scope.
- Maintain the partition long enough for both nodes to continue serving local reads and writes.
- Restore connectivity.
- Allow replication to complete.
Expected outcomes.
- Both facts retained: both nodes retain both facts.
- Contradiction detected: at least the node that ingests the second conflicting fact detects it.
- Conflict queryable: conflict query APIs expose the unresolved conflict.
- Both returned: fact queries with contradicted facts included return both facts with contradiction metadata.
- No silent discard: no fact is silently discarded.
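The split-brain outcomes can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Fact`, `Node`, `replicate`, the field names, and the example entity `svc:db` are hypothetical and not part of Stigmem's API, and a conforming test would drive the real HTTP and federation surfaces instead.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    # Hypothetical fact shape; field names are assumptions for illustration.
    fact_id: str
    entity: str
    relation: str
    scope: str
    value: str


@dataclass
class Node:
    name: str
    facts: dict = field(default_factory=dict)    # fact_id -> Fact
    conflicts: set = field(default_factory=set)  # frozenset of two fact_ids

    def ingest(self, fact: Fact) -> None:
        """Store the fact and record contradictions; never drop either side."""
        if fact.fact_id in self.facts:
            return  # already known, nothing to do
        key = (fact.entity, fact.relation, fact.scope)
        for other in self.facts.values():
            same_key = (other.entity, other.relation, other.scope) == key
            if same_key and other.value != fact.value:
                self.conflicts.add(frozenset((other.fact_id, fact.fact_id)))
        self.facts[fact.fact_id] = fact

    def query(self, entity: str, relation: str, scope: str) -> list:
        """Return matching facts, each paired with the ids that contradict it."""
        out = []
        for f in self.facts.values():
            if (f.entity, f.relation, f.scope) == (entity, relation, scope):
                contradicted_by = sorted(
                    other
                    for pair in self.conflicts if f.fact_id in pair
                    for other in pair if other != f.fact_id
                )
                out.append((f, contradicted_by))
        return out


def replicate(src: Node, dst: Node) -> None:
    """One-directional replication of everything src holds."""
    for fact in list(src.facts.values()):
        dst.ingest(fact)


# Setup: both nodes start with the same public fact.
a, b = Node("A"), Node("B")
base = Fact("F_0", "svc:db", "region", "public", "eu")
a.ingest(base)
b.ingest(base)

# Partition: each node accepts a conflicting local write.
a.ingest(Fact("F_a", "svc:db", "owner", "public", "team-1"))
b.ingest(Fact("F_b", "svc:db", "owner", "public", "team-2"))

# Connectivity restored: replicate in both directions.
replicate(a, b)
replicate(b, a)
```

After the final replication both nodes hold `F_0`, `F_a`, and `F_b`, and both record the same unresolved conflict between `F_a` and `F_b`. The model keeps contradiction state separate from the fact store so that detecting a conflict never requires deleting a fact.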
Malicious peer
Setup. Two nodes, A and B, are federated. A malicious process obtains or forges input for the B-to-A direction.
Scenario.
- The malicious process attempts to push a fact whose scope exceeds B's peer declaration.
- The malicious process attempts to push a fact whose source is outside B's declared namespace or authority.
- The malicious process replays a previously observed token within the active replay window.
Expected outcomes.
- Over-scope rejected.
- Source-forgery rejected.
- In-window replay rejected.
- Audit captures reason: rejections produce audit events with enough detail to diagnose the rejection.
- Store uncorrupted: the receiving fact store is not corrupted by rejected input.
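A receiver-side check for the three malicious inputs might look like the sketch below. It is illustrative only: `PeerDeclaration`, `Receiver`, the reason strings, the dictionary fact shape, and the `b.example/` namespace prefix are assumptions made for the example, and the replay window is simplified to an unbounded set of seen nonces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PeerDeclaration:
    # Hypothetical summary of what peer B declared to node A.
    peer_id: str
    scopes: frozenset
    source_prefix: str


@dataclass
class Receiver:
    declaration: PeerDeclaration
    store: list = field(default_factory=list)       # accepted facts only
    audit: list = field(default_factory=list)       # one event per rejection
    seen_nonces: set = field(default_factory=set)   # simplified replay cache

    def push(self, fact: dict, nonce: str) -> bool:
        """Accept or reject a pushed fact; rejected input never reaches the store."""
        reason = self._rejection_reason(fact, nonce)
        if reason is not None:
            self.audit.append({
                "peer": self.declaration.peer_id,
                "fact_id": fact.get("id"),
                "reason": reason,
            })
            return False
        self.seen_nonces.add(nonce)
        self.store.append(fact)
        return True

    def _rejection_reason(self, fact: dict, nonce: str):
        if nonce in self.seen_nonces:
            return "replay"
        if fact["scope"] not in self.declaration.scopes:
            return "over_scope"
        if not fact["source"].startswith(self.declaration.source_prefix):
            return "source_forgery"
        return None


node_a = Receiver(PeerDeclaration("B", frozenset({"public"}), "b.example/"))

results = [
    node_a.push({"id": "1", "scope": "public", "source": "b.example/x"}, "n1"),   # legitimate
    node_a.push({"id": "2", "scope": "private", "source": "b.example/x"}, "n2"),  # over-scope
    node_a.push({"id": "3", "scope": "public", "source": "c.example/x"}, "n3"),   # forged source
    node_a.push({"id": "1", "scope": "public", "source": "b.example/x"}, "n1"),   # replay
]
reasons = [event["reason"] for event in node_a.audit]
```

Validation runs before any write, so the store is only touched on the accept path, and every rejection leaves an audit event naming the peer, the fact, and the reason.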
Partial replication failure
Setup. Node A pulls from node B. Node B has a larger public fact set than A has already replicated. A has persisted a cursor for the last fully accepted page.
Scenario.
- B fails after returning a later page but before A persists the cursor for that page.
- A attempts another pull while B is unreachable.
- A continues serving local reads and writes.
- B restarts.
- A's next pull cycle resumes.
Expected outcomes.
- No crash on unavailability: A does not crash while B is unavailable.
- Local reads/writes available.
- Resume from persisted cursor: not from the beginning and not from an uncommitted future cursor.
- No duplicates on re-ingest.
- Convergence on resume: final convergence includes all eligible facts.
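The cursor behavior can be sketched with an in-memory stand-in for both nodes. The names `Source`, `Puller`, and `crash_before_commit`, along with the integer cursor and fixed page size, are assumptions for illustration; real cursors and page formats are implementation details this spec does not fix.

```python
class Source:
    """Stand-in for node B: serves fixed-size pages addressed by an integer cursor."""

    def __init__(self, facts, page_size=2):
        self.facts = list(facts)
        self.page_size = page_size
        self.up = True

    def page(self, cursor):
        if not self.up:
            raise ConnectionError("B unreachable")
        items = self.facts[cursor:cursor + self.page_size]
        return items, cursor + len(items)


class Puller:
    """Stand-in for node A: ingests pages and persists a cursor per accepted page."""

    def __init__(self):
        self.store = {}   # fact_id -> value; keyed writes make re-ingest idempotent
        self.cursor = 0   # persisted cursor for the last fully accepted page

    def pull_once(self, source, crash_before_commit=False):
        """Pull one page. Returns True only if new progress was committed."""
        try:
            items, next_cursor = source.page(self.cursor)
        except ConnectionError:
            return False  # peer unavailable: keep serving locally, retry later
        for fact_id, value in items:
            self.store[fact_id] = value
        if crash_before_commit:
            return False  # simulated fault: page ingested, cursor not persisted
        self.cursor = next_cursor
        return bool(items)


b = Source([("f1", "a"), ("f2", "b"), ("f3", "c"), ("f4", "d"), ("f5", "e")])
a = Puller()

a.pull_once(b)                             # page 1 accepted, cursor persisted at 2
a.pull_once(b, crash_before_commit=True)   # page 2 returned, cursor NOT persisted
b.up = False
unreachable_result = a.pull_once(b)        # B is down; A must not crash
a.store["local-1"] = "written during outage"
b.up = True
while a.pull_once(b):                      # resumes from cursor 2, re-ingests page 2
    pass
```

Because ingest is keyed by fact id, replaying the page that was received but never committed does not create duplicates, and the loop ends with the cursor at the end of B's fact set.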
Replay attack
Setup. Two nodes, A and B, are federated. A valid peer token is observed by an attacker.
Scenario.
- The token is used legitimately once.
- The same token is replayed within the active nonce window.
- A new token is generated with the same nonce.
- A token is submitted after expiry.
Expected outcomes.
- First use succeeds.
- Immediate replay fails: with a nonce-replay error.
- Same-nonce reuse fails: a different token carrying the same nonce also fails.
- Expired token fails: with an expiry error.
- Failures audited: replay and expiry failures are audited.
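The four replay steps map onto a nonce cache keyed by nonce rather than by token. The sketch below assumes hypothetical names (`Token`, `ReplayGuard`, `NonceReplayError`, `TokenExpiredError`) and takes the current time as an explicit argument so a test can control clock state; it does not model token signatures.

```python
from dataclasses import dataclass


class NonceReplayError(Exception):
    """Raised when a nonce has already been used inside the active window."""


class TokenExpiredError(Exception):
    """Raised when a token is presented at or after its expiry time."""


@dataclass(frozen=True)
class Token:
    token_id: str
    nonce: str
    expires_at: float


class ReplayGuard:
    def __init__(self):
        self.seen = {}    # nonce -> expiry of the token that first used it
        self.audit = []   # (error name, token_id) for every failure

    def accept(self, token: Token, now: float) -> None:
        try:
            if now >= token.expires_at:
                raise TokenExpiredError(token.token_id)
            # Forget nonces whose original tokens have expired.
            self.seen = {n: exp for n, exp in self.seen.items() if exp > now}
            if token.nonce in self.seen:
                raise NonceReplayError(token.nonce)
        except (TokenExpiredError, NonceReplayError) as exc:
            self.audit.append((type(exc).__name__, token.token_id))
            raise
        self.seen[token.nonce] = token.expires_at


def outcome(guard: ReplayGuard, token: Token, now: float) -> str:
    """Collapse accept() into a short label for scenario assertions."""
    try:
        guard.accept(token, now)
        return "ok"
    except NonceReplayError:
        return "nonce-replay"
    except TokenExpiredError:
        return "expired"


guard = ReplayGuard()
t1 = Token("t1", "n1", expires_at=100.0)
t2 = Token("t2", "n1", expires_at=100.0)   # new token, same nonce
t3 = Token("t3", "n3", expires_at=100.0)

outcomes = [
    outcome(guard, t1, now=10.0),    # first legitimate use
    outcome(guard, t1, now=11.0),    # immediate replay
    outcome(guard, t2, now=12.0),    # same-nonce reuse
    outcome(guard, t3, now=200.0),   # submitted after expiry
]
```

Keying the cache on the nonce is what makes the third step fail: a check keyed on the whole token would treat `t2` as unseen and accept it.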
Out of scope
This spec does not define: