Version: v0.9.0a2

R-PEER-COMPROMISE

3 min read · On-call operator · Runbook · critical

When to use

A federation peer appears compromised, malicious, or misconfigured in a way that could affect your node.

Trigger alerts:

peer_capability_violation

peer_replay_burst

Repeated federation_handshake_failed

Suspicious manifest_rotation_observed

Unexpected high-volume writes or quarantine admissions from one peer

Identify

Capture the current evidence before changing state:

curl -s "https://your-node.example.com/v1/federation/audit?limit=200" \
-H "Authorization: Bearer $STIGMEM_ADMIN_KEY" | jq .

curl -s "https://your-node.example.com/v1/federation/peers" \
-H "Authorization: Bearer $STIGMEM_ADMIN_KEY" | jq .

Record the peer entity URI, peer URL, pinned key, recent pull status, and the first timestamp where behavior changed.
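Because containment changes state, it helps to keep the raw responses rather than only the `jq`-formatted view. The following is a minimal sketch, not part of the product: `save_evidence`, the file naming, and the `SHA256SUMS` manifest are all hypothetical conventions. It assumes you saved each response body to disk first (for example with curl's `-o` flag).

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def save_evidence(name: str, body: bytes, out_dir: Path) -> Path:
    """Write one raw response body to a timestamped file and log its SHA-256.

    Hypothetical helper: the naming scheme and manifest format are
    illustrative choices, not something the node requires.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"{name}-{stamp}.json"
    path.write_bytes(body)

    # Append the digest to a manifest so later edits to a snapshot are detectable.
    digest = hashlib.sha256(body).hexdigest()
    with (out_dir / "SHA256SUMS").open("a", encoding="utf-8") as manifest:
        manifest.write(f"{digest}  {path.name}\n")
    return path
```

For example, `save_evidence("audit", Path("audit.json").read_bytes(), Path("evidence"))` stores the audit response and records its hash alongside it.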

Contain

Stop new data from the peer first. Do not delete audit events; they are the evidence trail.

  1. Disable or remove the peer registration.
  2. Revoke capability tokens issued to that peer.
  3. If your deployment supports source-trust rules, lower the peer's trust score so future facts are quarantined.
  4. Pause any automated promotion from quarantine.

Investigate

Review what the peer wrote and what escaped quarantine:

curl -s "https://your-node.example.com/v1/facts?source=<peer-entity-uri>&limit=200" \
-H "Authorization: Bearer $STIGMEM_ADMIN_KEY" | jq .

Check for:

Facts in sensitive scopes.

Facts with interpret_as=instruction or agent-control relations.

Facts promoted from quarantine.

Replays or capability violations close to the first suspicious write.
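The first three checks can be applied mechanically to a saved facts response. The sketch below is illustrative only: the field names (`id`, `scope`, `interpret_as`, `relation`, `promoted_from_quarantine`), the sensitive-scope prefixes, and the agent-control relation names are all assumptions, since the response schema is not shown here. Adjust them to what your node actually returns.

```python
from typing import Any

# Every field name and value below is a hypothetical placeholder;
# replace them with the schema and scopes used by your deployment.
SENSITIVE_SCOPE_PREFIXES = ("secrets/", "credentials/")
AGENT_CONTROL_RELATIONS = {"instructs", "overrides"}


def triage_fact(fact: dict[str, Any]) -> list[str]:
    """Return the reasons a fact needs manual review (empty if none apply)."""
    reasons = []
    if str(fact.get("scope", "")).startswith(SENSITIVE_SCOPE_PREFIXES):
        reasons.append("sensitive scope")
    if fact.get("interpret_as") == "instruction":
        reasons.append("interpret_as=instruction")
    if str(fact.get("relation", "")) in AGENT_CONTROL_RELATIONS:
        reasons.append("agent-control relation")
    if fact.get("promoted_from_quarantine"):
        reasons.append("promoted from quarantine")
    return reasons


def triage(facts: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map fact ID to review reasons, skipping facts with nothing flagged."""
    flagged = {}
    for fact in facts:
        reasons = triage_fact(fact)
        if reasons:
            flagged[str(fact.get("id"))] = reasons
    return flagged
```

The fourth check, replays or capability violations near the first suspicious write, still needs a manual comparison against the audit timestamps you recorded earlier.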

Recover

  1. Retract facts that are false, unsafe, or outside the agreed federation contract.
  2. Keep benign facts if you can justify them from audit evidence.
  3. Ask the peer operator to rotate compromised node or issuer keys.
  4. Re-register the peer only after you verify its new manifest/key material out of band.
  5. Run a small test pull and watch quarantine/audit events before restoring full trust.

Communicate

Notify the peer operator with:

Peer entity URI and URL

Alert names and timestamps

Example fact IDs or audit event IDs

What you disabled locally

Evidence needed before re-enabling

If compromised data may have reached downstream peers, notify those operators too.
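To avoid omitting an item under pressure, the notification checklist above could be captured as a small structure that renders a plain-text message. This is only a sketch: the `PeerIncidentNotice` class and its field names are hypothetical and simply mirror the checklist.

```python
from dataclasses import dataclass


@dataclass
class PeerIncidentNotice:
    """Hypothetical container mirroring the notification checklist."""

    peer_entity_uri: str
    peer_url: str
    alerts: list[tuple[str, str]]  # (alert name, timestamp) pairs
    example_ids: list[str]         # fact IDs or audit event IDs
    disabled_locally: list[str]
    evidence_needed: list[str]

    def render(self) -> str:
        """Return the notice as plain text, one section per checklist item."""
        lines = [
            f"Peer entity URI: {self.peer_entity_uri}",
            f"Peer URL: {self.peer_url}",
            "Alerts observed:",
        ]
        lines += [f"  {name} at {ts}" for name, ts in self.alerts]
        lines.append("Example fact or audit event IDs:")
        lines += [f"  {item}" for item in self.example_ids]
        lines.append("Disabled locally:")
        lines += [f"  {item}" for item in self.disabled_locally]
        lines.append("Evidence needed before re-enabling:")
        lines += [f"  {item}" for item in self.evidence_needed]
        return "\n".join(lines)
```

The same rendered text can be reused when notifying downstream operators, with the local actions section adjusted as needed.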