Version: v0.9.0a2

R-REKOR-UNAVAILABLE

3 min read · On-call operator · Runbook

When to use

Use this runbook when fact writes continue locally but chain checkpoints cannot be submitted to the configured transparency log.

Trigger signals:

Pending checkpoint

fact_chain_checkpoints.status = 'pending' for longer than your alert window.

Rekor in last_error

fact_chain_checkpoints.last_error mentions Rekor, transparency-log, network, or signing-package availability.

Stale chain proofs

Full recall proofs show a pending checkpoint while recent writes continue.

Identify

Check the latest checkpoint state:

-- Ten most recent checkpoints by covered chain sequence, across all tenants.
SELECT tenant_id,
       covered_chain_seq,
       status,
       attempt_count,
       created_at,
       submitted_at,
       next_retry_at,
       last_error
FROM fact_chain_checkpoints
ORDER BY covered_chain_seq DESC
LIMIT 10;

Confirm the node's transparency-log configuration:

echo "$STIGMEM_TL_BACKEND"
echo "$STIGMEM_TL_REKOR_URL"
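An empty line from `echo` is easy to miss during an incident. The sketch below turns the same two variables into an explicit pass or fail. `check_tl_config` is a hypothetical helper name, and it assumes a Rekor backend is in use, so a URL is expected. It only checks that the values are present and URL-shaped, not that they are correct.

```shell
# Hypothetical helper: report whether the two transparency-log variables
# are set and whether the Rekor URL looks like an HTTP(S) URL.
check_tl_config() {
  rc=0
  if [ -z "${STIGMEM_TL_BACKEND:-}" ]; then
    echo "STIGMEM_TL_BACKEND is unset or empty"
    rc=1
  fi
  case "${STIGMEM_TL_REKOR_URL:-}" in
    http://*|https://*) ;;
    "") echo "STIGMEM_TL_REKOR_URL is unset or empty"; rc=1 ;;
    *)  echo "STIGMEM_TL_REKOR_URL does not start with http:// or https://"; rc=1 ;;
  esac
  if [ "$rc" -eq 0 ]; then
    echo "transparency-log configuration looks plausible"
  fi
  return "$rc"
}
```

Run it in the same environment as the node process (same service user, same env file), since an interactive shell may not see the values the node sees.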

Contain

  1. Keep fact writes enabled unless your deployment policy requires fail-closed external witnessing. Facts remain locally chained while checkpoints retry.
  2. Preserve the pending checkpoint rows and application logs.
  3. Avoid rebuilding or truncating fact_chain while checkpoint submission is pending.
  4. If the outage is caused by a bad Rekor URL or missing Sigstore dependency, correct configuration before restarting the node.

Investigate

Local config failure

Verify backend, Rekor URL, package extras, and outbound egress policy.
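One way to test outbound egress from the node is a minimal reachability probe. `probe_rekor` is a hypothetical helper name. The sketch assumes `curl` is installed and that the log exposes the Rekor v1 REST API, whose `/api/v1/log` endpoint returns log state. A private or newer log may use a different path.

```shell
# Hypothetical helper: probe the Rekor v1 log-info endpoint from this node.
# Assumes curl is installed and the log serves the Rekor v1 REST API.
probe_rekor() {
  url="${1:-${STIGMEM_TL_REKOR_URL:-}}"
  if [ -z "$url" ]; then
    echo "no Rekor URL given and STIGMEM_TL_REKOR_URL is empty"
    return 2
  fi
  # -f: treat HTTP errors as failure; --max-time: bound the wait when
  # egress is silently dropped. ${url%/} strips one trailing slash.
  if curl -fsS --max-time 10 -o /dev/null "${url%/}/api/v1/log"; then
    echo "reachable: $url"
  else
    echo "unreachable: $url (check DNS, egress policy, and the log's status)"
    return 1
  fi
}
```

A successful probe shows only that the endpoint answers from this host. It does not prove that checkpoint submission or signing will succeed.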

Public Rekor outage

Check the Sigstore status page and retry later.

Private Rekor

Contact the log operator and preserve the last successful tl_log_index.

Recover

After the transparency log is reachable again, allow the node to retry pending checkpoints.

A healthy checkpoint has all of:

status = 'submitted', non-null submitted_at, non-null tl_log_id, non-null tl_leaf_hash, non-null tl_log_index.
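The criteria above can be checked mechanically over exported rows. The sketch below assumes your SQL client can emit the five columns as pipe-separated text with NULL rendered as an empty field. That export format and the `unhealthy_checkpoints` name are assumptions, not part of the product.

```shell
# Hypothetical helper: reads rows of
#   status|submitted_at|tl_log_id|tl_leaf_hash|tl_log_index
# on stdin and prints every row that misses a healthy-checkpoint criterion.
# Exits non-zero if at least one such row was found.
unhealthy_checkpoints() {
  awk -F'|' '
    $1 != "submitted" || $2 == "" || $3 == "" || $4 == "" || $5 == "" {
      printf "row %d not healthy: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  '
}
```

Empty output with a zero exit status means every exported row met all five conditions. A `tl_log_index` of `0` counts as non-null here because fields are compared as strings.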

Run a full recall verification request and confirm the returned chain_proof includes the latest submitted checkpoint metadata.

Communicate

Tell peer operators and auditors:

Affected tenant

Highest locally covered chain sequence

First pending checkpoint timestamp

Root cause class

Local configuration, network reachability, or Rekor service availability.
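To keep notices consistent across incidents, the four items above can be filled into a fixed template. `checkpoint_notice` is a hypothetical helper. The class labels `local-config`, `network`, and `rekor-service` are illustrative names for the three root cause classes, not values defined by the product.

```shell
# Hypothetical helper: print a notice with the four fields listed above.
# Arguments: tenant, highest covered chain sequence, first pending
# checkpoint timestamp, root cause class (illustrative labels below).
checkpoint_notice() {
  tenant="$1"; seq="$2"; first_pending="$3"; cause="$4"
  case "$cause" in
    local-config|network|rekor-service) ;;
    *) echo "unknown root cause class: $cause" >&2; return 1 ;;
  esac
  printf 'Affected tenant: %s\n' "$tenant"
  printf 'Highest locally covered chain sequence: %s\n' "$seq"
  printf 'First pending checkpoint timestamp: %s\n' "$first_pending"
  printf 'Root cause class: %s\n' "$cause"
}
```

The first three values can be taken from the checkpoint query in the Identify section: `tenant_id`, the highest `covered_chain_seq`, and the earliest `created_at` among pending rows.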