Architecture
What this page covers
Engineer-facing architecture reference: the fact model, provenance, decay, scope hierarchy, contradiction semantics, HLC, federation protocol, auth model, repo map, and the experimental graph/recall pipeline.
Audience: engineers contributing to the node, implementing an adapter, or reading the spec alongside the code.
System overview
A Stigmem deployment is one or more nodes (self-contained FastAPI+SQLite processes) connected by the federation protocol. Clients assert and query facts via HTTP/JSON. Nodes peer with each other via a signed PeerDeclaration handshake and replicate facts across scope boundaries.
The fact model
Every piece of knowledge is an atomic, immutable fact (Spec-01-Fact-Model).
Facts are append-only: there is no PUT or DELETE. Updating a
value means asserting a new fact; the latest fact for a given
(entity, relation, scope) triple wins by precedence rules.
(entity, relation, value, source, timestamp, hlc, confidence, scope)
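As an illustration only, the tuple above can be modelled as a frozen dataclass with an append-only log. The `Fact` class, `fact_log`, `assert_fact`, and the `example:` URIs are hypothetical names chosen for this sketch, not the reference node's types.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Fact:
    entity: str
    relation: str
    value: Any
    source: str
    timestamp: str   # ISO 8601
    hlc: str         # "{wall_ms_utc}.{counter}"
    confidence: float
    scope: str


# Hypothetical in-memory store: facts are only ever appended.
fact_log: list[Fact] = []


def assert_fact(fact: Fact) -> Fact:
    """Append a fact; there is deliberately no update or delete helper."""
    fact_log.append(fact)
    return fact


first = assert_fact(Fact(
    entity="example:alice",
    relation="example:works_at",
    value={"type": "string", "v": "Acme"},
    source="example:agent-1",
    timestamp="2026-05-01T00:00:00Z",
    hlc="1746230400000.003",
    confidence=1.0,
    scope="team",
))

# "Updating" means asserting a new fact; the old one stays in the log.
second = assert_fact(replace(
    first,
    value={"type": "string", "v": "Globex"},
    timestamp="2026-05-02T00:00:00Z",
    hlc="1746316800000.000",
))
```

Both facts remain in `fact_log`; which one a query returns is decided by the precedence rules, not by overwriting.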
- entity
- relation (see Spec-16-Namespace-Registry)
- value: string, text, number, boolean, datetime, ref, or null
- source
- timestamp
- hlc
- confidence: 1.0 = certain, 0.0 = retracted
- scope
Provenance and decay
Provenance
Every fact carries source and timestamp, stored without modification. Queries return the original source, never an intermediate relay. Federated facts additionally carry a stigmem:received_from meta-fact:
{
  "entity": "stigmem:fact:<uuid>",
  "relation": "stigmem:received_from",
  "value": { "type": "ref", "v": "<originating-node-id>" },
  "source": "system:stigmem"
}
This meta-fact is stored locally and never re-replicated.
Decay
Facts have an optional expiry: valid_until: ISO 8601 | null.
- Hidden by default: expired facts are hidden from normal queries (as if they don't exist).
- Retained in store: queryable with include_expired=true.
- TTL via meta-fact: (entity=<fact-id>, relation="stigmem:ttl", value=<datetime>).
valid_until and confidence are orthogonal.
A historical certain fact has confidence=1.0 and valid_until set
to when it was superseded by a newer value.
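The visibility rule can be sketched as a small predicate. This is a minimal illustration, not the node's query code: `is_visible` is a hypothetical helper, and treating a fact as expired at the exact `valid_until` instant is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def is_visible(valid_until: Optional[str], now: datetime,
               include_expired: bool = False) -> bool:
    """Return True if a fact should appear in query results.

    Confidence is intentionally not consulted: expiry and confidence
    are orthogonal.
    """
    if include_expired or valid_until is None:
        return True
    # Accept a trailing "Z" as UTC, which older fromisoformat() rejects.
    expiry = datetime.fromisoformat(valid_until.replace("Z", "+00:00"))
    return now < expiry  # assumption: expired at the boundary instant


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
hidden = is_visible("2026-05-01T00:00:00Z", now)
shown = is_visible("2026-05-01T00:00:00Z", now, include_expired=True)
```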
Scope hierarchy and enforcement (Spec-02-Scopes-and-ACL)
local:   node-only, never leaves this instance
team:    logical team boundary, node-operator-defined, never federated
company: owning company node; federated only when PeerDeclaration explicitly allows
public:  federatable to any registered peer by default
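One way to read the hierarchy above as code is shown below. Both helpers are hypothetical, and two details are assumptions rather than spec text: that a key's allowed_scopes is an exact allow-list, and that public facts replicate to a registered peer regardless of the declaration's scope list.

```python
NEVER_FEDERATED = {"local", "team"}


def may_write(key_allowed_scopes: list[str], fact_scope: str) -> bool:
    """Write-time check: the key must list the fact's scope explicitly."""
    return fact_scope in key_allowed_scopes


def may_replicate(fact_scope: str, declared_scopes: list[str]) -> bool:
    """Federation check against the scopes named in a PeerDeclaration."""
    if fact_scope in NEVER_FEDERATED:
        return False
    if fact_scope == "company":
        # Federated only when the declaration explicitly allows it.
        return "company" in declared_scopes
    return fact_scope == "public"


local_key_writes_public = may_write(["local"], "public")
company_over_public_peer = may_replicate("company", ["public"])
```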
Scopes are enforced on write: a key with allowed_scopes: ["local"] cannot write a public-scoped fact.
Contradiction semantics (Spec-15-Fact-Semantics)
A contradiction exists when two facts share (entity, relation, scope) but have different values and both have confidence > 0.0.
Both facts are retained.
The node auto-generates a stigmem:conflict:<uuid> reification with
stigmem:conflict:between and a stigmem:conflict:status = "unresolved"
companion fact.
Resolution order at query time:
- Higher confidence wins.
- Equal confidence: higher hlc wins (causal ordering).
- Tie: both returned with contradicted: true; caller decides.
Resolution is explicit and traceable: POST /v1/conflicts/:id/resolve writes a new fact with the resolved value, updating conflict status to "resolved" with full provenance.
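The query-time resolution order can be sketched as a comparison on (confidence, hlc). `Candidate`, `hlc_key`, and `resolve` are hypothetical names, and the response shape is illustrative only. The HLC is parsed into integers before comparing, because comparing the raw strings would mis-order counters of different widths.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    value: str
    confidence: float
    hlc: str  # "{wall_ms_utc}.{counter}"


def hlc_key(hlc: str) -> tuple[int, int]:
    """Parse an HLC string into a numerically comparable pair."""
    wall, counter = hlc.split(".")
    return int(wall), int(counter)


def resolve(a: Candidate, b: Candidate) -> dict:
    """Apply the order: higher confidence, then higher HLC, else a tie."""
    key_a = (a.confidence, hlc_key(a.hlc))
    key_b = (b.confidence, hlc_key(b.hlc))
    if key_a == key_b:
        # Full tie: return both and let the caller decide.
        return {"contradicted": True, "facts": [a, b]}
    return {"contradicted": False, "facts": [a if key_a > key_b else b]}


by_confidence = resolve(Candidate("red", 0.9, "1746230400000.001"),
                        Candidate("blue", 0.8, "1746230400000.009"))
by_hlc = resolve(Candidate("red", 0.9, "1746230400000.010"),
                 Candidate("blue", 0.9, "1746230400000.9"))
tie = resolve(Candidate("red", 0.9, "1746230400000.001"),
              Candidate("blue", 0.9, "1746230400000.001"))
```

Retracted facts (confidence 0.0) are assumed to be filtered out before this step, since they do not form a contradiction.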
Hybrid Logical Clock (Spec-12-HLC-Bounded-Skew)
Every node maintains a single HLC: HLC = "{wall_ms_utc}.{counter}" (e.g. "1746230400000.003").
Advance rules:
- On local write: hlc = max(now_ms, last_hlc_ms) as wall; increment counter if wall unchanged.
- On receiving a federated fact: hlc = max(now_ms, received_hlc_ms); increment counter.
The HLC prevents the split-brain scenario where two nodes partition, accept divergent writes, and then disagree about ordering on reunion.
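A minimal sketch of a lock-protected clock following these rules is shown below. The `HLC` class and its method names are hypothetical. On receive, the sketch also folds in the local wall value and merges counters the way the textbook HLC algorithm does; that detail is an assumption that goes beyond the two rules stated above.

```python
import threading
import time


class HLC:
    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock            # injectable for testing
        self._lock = threading.Lock()  # shared by request and pull paths
        self._wall = 0
        self._counter = 0

    def _format(self) -> str:
        return f"{self._wall}.{self._counter:03d}"

    def tick(self) -> str:
        """Advance for a local write."""
        with self._lock:
            wall = max(self._clock(), self._wall)
            # Counter increments only while the wall component is unchanged.
            self._counter = self._counter + 1 if wall == self._wall else 0
            self._wall = wall
            return self._format()

    def receive(self, remote_hlc: str) -> str:
        """Advance after ingesting a federated fact stamped remote_hlc."""
        remote_wall_s, remote_counter_s = remote_hlc.split(".")
        remote_wall, remote_counter = int(remote_wall_s), int(remote_counter_s)
        with self._lock:
            wall = max(self._clock(), self._wall, remote_wall)
            if wall == self._wall and wall == remote_wall:
                counter = max(self._counter, remote_counter) + 1
            elif wall == self._wall:
                counter = self._counter + 1
            elif wall == remote_wall:
                counter = remote_counter + 1
            else:
                counter = 0
            self._wall, self._counter = wall, counter
            return self._format()


node_clock = HLC(clock=lambda: 1000)  # frozen fake clock for the example
t1 = node_clock.tick()
t2 = node_clock.tick()
t3 = node_clock.receive("2000.005")
t4 = node_clock.tick()
```

With the wall clock frozen, every stamp is still strictly greater than the one before it, and the stamp issued after `receive` sorts after the remote stamp.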
Federation Protocol (Spec-05-Federation-Trust)
Handshake
A PeerDeclaration is a signed JSON document declaring federation intent:
{
  "declaring_node_id": "stigmem://node.alice.example",
  "target_node_id": "stigmem://node.bob.example",
  "allowed_scopes": ["public"],
  "direction": "bidirectional",
  "signed_at": "2026-05-01T00:00:00Z",
  "declaration_sig": "<Ed25519 signature of canonical JSON of the above fields>"
}
Registration is mutual: both nodes must POST /v1/federation/peers with the declaration to activate the peering.
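The signing preimage for such a declaration might be built as follows. This sketch covers only the canonicalisation step, not the Ed25519 signing itself. `signing_preimage` is a hypothetical helper, and the exact canonical form (sorted keys, compact separators, UTF-8) is an assumption; the spec is the authority on the canonical JSON rules.

```python
import json


def signing_preimage(declaration: dict) -> bytes:
    """Serialise every field except declaration_sig in a stable byte form."""
    body = {k: v for k, v in declaration.items() if k != "declaration_sig"}
    return json.dumps(
        body,
        sort_keys=True,          # key order must not affect the bytes
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,
    ).encode("utf-8")


unsigned = {
    "declaring_node_id": "stigmem://node.alice.example",
    "target_node_id": "stigmem://node.bob.example",
    "allowed_scopes": ["public"],
    "direction": "bidirectional",
    "signed_at": "2026-05-01T00:00:00Z",
}
signed = {"declaration_sig": "placeholder", **unsigned}

preimage_unsigned = signing_preimage(unsigned)
preimage_signed = signing_preimage(signed)
```

Because the signature field is dropped before serialising, a verifier can recompute the same bytes from the signed document it received.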
Replication
Pull-based:
GET /v1/federation/facts?since_hlc=<last-seen>&scope=public&limit=500
Authorization: Bearer <peer-token>
- Short-lived peer tokens: Ed25519-signed JWTs, max 1-hour expiry, nonce replay-protected.
- Idempotent ingest: re-asserting a fact that already exists is a no-op.
- HLC cursor resume: replication resumes exactly where it left off across restarts.
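The pull loop, idempotent ingest, and cursor resume can be sketched against an in-memory stand-in for the remote peer. Everything here is illustrative: `pull`, `fake_fetch`, the `id` field, and the dict-based store are assumptions, and no HTTP or token handling is shown.

```python
def hlc_key(hlc: str) -> tuple[int, int]:
    wall, counter = hlc.split(".")
    return int(wall), int(counter)


def pull(fetch_page, store: dict, cursor: str, limit: int = 500) -> str:
    """Fetch pages after `cursor` until the peer has nothing newer.

    Returns the new cursor, which a caller would persist for restarts.
    """
    while True:
        page = fetch_page(cursor, limit)
        for fact in page:
            store.setdefault(fact["id"], fact)  # second arrival is a no-op
            cursor = fact["hlc"]
        if len(page) < limit:  # short page: caught up
            return cursor


# Stand-in for GET /v1/federation/facts?since_hlc=...&limit=...
REMOTE = [{"id": f"f{i}", "hlc": f"1746230400000.{i:03d}"} for i in range(1, 6)]


def fake_fetch(since_hlc: str, limit: int) -> list[dict]:
    newer = [f for f in REMOTE if hlc_key(f["hlc"]) > hlc_key(since_hlc)]
    return sorted(newer, key=lambda f: hlc_key(f["hlc"]))[:limit]


store: dict = {}
cursor = pull(fake_fetch, store, "0.000", limit=2)
cursor_again = pull(fake_fetch, store, cursor, limit=2)  # simulated restart
```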
Failure modes
All four failure scenarios are automated in node/tests/observability/test_failure_modes.py.
Auth model
- API key: Authorization: Bearer <raw-key>. The node stores only the SHA-256 hex digest. The key maps to an entity_uri, permissions, and allowed_scopes.
- Peer token: … /.well-known/stigmem. A nonce cache prevents replay.
Auth mode is advertised at /.well-known/stigmem as "auth": "none" | "required". Single-operator deployments MAY set STIGMEM_AUTH_REQUIRED=false.
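Digest-only key storage can be sketched with the standard library. `key_table`, `issue_key`, and `lookup_identity` are hypothetical names for this example and are not the node's `resolve_identity()` dependency; the permission string is likewise made up.

```python
import hashlib
import secrets
from typing import Optional

# Hypothetical key table: SHA-256 hex digest -> identity record.
key_table: dict[str, dict] = {}


def digest(raw_key: str) -> str:
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def issue_key(entity_uri: str, permissions: list[str],
              allowed_scopes: list[str]) -> str:
    """Create a key, store only its digest, and return the raw key once."""
    raw_key = secrets.token_urlsafe(32)
    key_table[digest(raw_key)] = {
        "entity_uri": entity_uri,
        "permissions": permissions,
        "allowed_scopes": allowed_scopes,
    }
    return raw_key


def lookup_identity(authorization_header: str) -> Optional[dict]:
    """Map an 'Authorization: Bearer <raw-key>' header value to an identity."""
    scheme, _, raw_key = authorization_header.partition(" ")
    if scheme != "Bearer" or not raw_key:
        return None
    return key_table.get(digest(raw_key))


raw = issue_key("example:agent-1", ["facts:write"], ["local"])
identity = lookup_identity(f"Bearer {raw}")
rejected = lookup_identity("Bearer not-a-real-key")
```

A leaked copy of `key_table` does not reveal usable keys, because the raw key is never stored.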
Repo map
stigmem/
├── spec/                          ← canonical security evidence and generated spec mirrors
│   ├── security/                  ← threat model and evidence registry
│   └── specs/                     ← generated protocol spec snapshots
│
├── node/                          ← reference node (FastAPI + SQLite)
│   ├── src/stigmem_node/
│   │   ├── main.py                ← FastAPI app factory, lifespan, router registration
│   │   ├── auth.py                ← resolve_identity() dependency; API key validation
│   │   ├── db.py                  ← SQLite connection, schema migrations
│   │   ├── hlc.py                 ← node_hlc.tick(), global HLC (threading.Lock protected)
│   │   ├── federation_pull.py     ← background pull loop (HLC cursor, idempotent ingest)
│   │   ├── federation_ingest.py   ← idempotent fact ingestion; federation audit log
│   │   ├── peer_auth.py           ← PeerDeclaration verification, peer token generation
│   │   ├── peer_token.py          ← Ed25519 JWT sign/verify, nonce cache
│   │   └── routes/
│   │       ├── facts.py           ← POST/GET /v1/facts
│   │       ├── federation.py      ← /v1/federation/* facade
│   │       ├── conflicts.py       ← /v1/conflicts
│   │       └── wellknown.py       ← GET /.well-known/stigmem
│   ├── migrations/                ← schema migrations
│   └── tests/                     ← reference-node unit, integration, conformance, security tests
│
├── adapters/                      ← platform adapters
├── packages/                      ← published Python packages and extracted plugins
├── sdks/                          ← SDKs and generated clients
├── experimental/                  ← deferred or gated features per ADR-002 / ADR-008
│
└── docs/                          ← Docusaurus 3 documentation site
Key implementation notes
- SQLite as the v0.9.0a1 reference: the schema is migration-friendly by design. A PostgreSQL backend is feasible after the alpha line but not required.
- HLC requires a threading lock: the in-process HLC state is shared between the HTTP request path and the background federation pull. Without threading.Lock, concurrent writes race.
- Idempotency + conflict edge case: if fact F arrives twice via replication, the second ingestion is a no-op; it must not create a second conflict record.
- declaration_sig excluded from preimage: the Ed25519 signature covers all PeerDeclaration fields except declaration_sig itself.
Sequence diagrams for the federation handshake, conflict detection flow, and HLC tick protocol are planned. Contributions are welcome; see CONTRIBUTING.md at the repo root.
Graph index and recall pipeline (Spec-X11-Recall-Graph)
This section describes Spec-X11-Recall-Graph, which remains experimental. Security review of subscription auth and cross-garden recall scoping is in progress.
The pre-reset graph & recall design adds three interconnected subsystems: a graph adjacency index, a vector embedding store, and a hybrid recall pipeline.
Graph adjacency index (entity_edges)
The facts table is flat. The pre-reset design adds a side index that makes entity-to-entity traversal O(edges) rather than O(facts):
CREATE TABLE entity_edges (
id TEXT PRIMARY KEY, -- = source fact id
subject TEXT NOT NULL, -- "from" entity URI
relation TEXT NOT NULL, -- edge label (predicate)
object TEXT NOT NULL, -- "to" entity URI
scope TEXT NOT NULL,
confidence REAL NOT NULL,
source_trust REAL, -- cached per Spec-05-Federation-Trust
created_at INTEGER NOT NULL
);
CREATE INDEX idx_edges_subject ON entity_edges (subject, scope, confidence);
CREATE INDEX idx_edges_object ON entity_edges (object, scope, confidence);
GET /v1/graph/neighbors exposes bounded traversal over this index (depth ≤ 2 by default).
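A bounded breadth-first traversal over entity_edges might look like the sketch below, using an in-memory SQLite database. The `neighbors` function, the sample rows, and the choice to follow only outgoing edges are assumptions for illustration; the endpoint's actual parameters and response shape are not shown here.

```python
import sqlite3


def neighbors(conn, start: str, scope: str, max_depth: int = 2):
    """Return (subject, relation, object, depth) edges within max_depth hops."""
    seen, frontier, edges = {start}, [start], []
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for subject in frontier:
            # Served by idx_edges_subject (subject, scope, confidence).
            rows = conn.execute(
                "SELECT relation, object FROM entity_edges "
                "WHERE subject = ? AND scope = ? ORDER BY id",
                (subject, scope),
            ).fetchall()
            for relation, obj in rows:
                edges.append((subject, relation, obj, depth))
                if obj not in seen:  # avoid revisiting entities on cycles
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return edges


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_edges (id TEXT PRIMARY KEY, subject TEXT NOT NULL, "
    "relation TEXT NOT NULL, object TEXT NOT NULL, scope TEXT NOT NULL, "
    "confidence REAL NOT NULL, source_trust REAL, created_at INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO entity_edges VALUES (?, ?, ?, ?, ?, ?, NULL, 0)",
    [
        ("e1", "ex:a", "ex:knows", "ex:b", "team", 1.0),
        ("e2", "ex:b", "ex:knows", "ex:c", "team", 1.0),
        ("e3", "ex:c", "ex:knows", "ex:d", "team", 1.0),   # three hops away
        ("e4", "ex:a", "ex:knows", "ex:z", "local", 1.0),  # different scope
    ],
)
found = neighbors(conn, "ex:a", "team")
```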
Vector embedding store (vec_facts)
Each live fact (confidence > 0.1 by default) is embedded as the composed string "{entity_display} {relation} {value_text}" and stored in a vec_facts virtual table.
Default model: nomic-embed-text-v1.5 (768 dimensions, Apache-2.0), which runs offline via ollama pull nomic-embed-text. Its Matryoshka architecture lets operators truncate to 256 dimensions via STIGMEM_EMBED_DIMENSIONS.
Three-stage hybrid recall pipeline
GET/POST /v1/recall runs three retrieval stages (lexical, vector, and graph) independently, then fuses their candidate sets.
Stage weights default to {lexical: 0.30, vector: 0.50, graph: 0.20} and are caller-configurable.
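To make the weights concrete, here is one plausible fusion: a weighted sum of per-stage scores. The page does not state the fusion formula, so the weighted sum, the `fuse` helper, and the assumption that each stage emits scores already normalised to [0, 1] are all illustrative.

```python
DEFAULT_WEIGHTS = {"lexical": 0.30, "vector": 0.50, "graph": 0.20}


def fuse(stage_scores: dict[str, dict[str, float]],
         weights: dict[str, float] = DEFAULT_WEIGHTS) -> list[tuple[str, float]]:
    """Combine per-stage {fact_id: score} maps into one ranked list."""
    fused: dict[str, float] = {}
    for stage, scores in stage_scores.items():
        weight = weights.get(stage, 0.0)
        for fact_id, score in scores.items():
            fused[fact_id] = fused.get(fact_id, 0.0) + weight * score
    # Highest fused score first; fact id breaks ties deterministically.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


ranked = fuse({
    "lexical": {"f1": 1.0},
    "vector": {"f1": 0.5, "f2": 1.0},
    "graph": {"f3": 1.0},
})
```

A fact found by several stages accumulates score from each, which is why `f1` can outrank a fact that only the highest-weighted stage returned.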
Salience signals:
- Recency: exp(-0.01 × age_days)
- Confidence: fact.confidence
- Trust: 0.5 + 0.5 × t; 1.0 when trust off.
Security-critical: ANN scope filter (R1).
vec_facts holds embeddings for all scopes and gardens with no
scope column. Stage 2 ANN results MUST be joined back against the
facts table and filtered by scope = :scope and the caller's
garden ACL before being passed to fusion or used as graph expansion
seeds. Without this filter, garden-B fact IDs surface in a garden-A
caller's response, leaking fact existence and content.
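The join-back filter required by R1 can be sketched as follows. The `facts` table layout with a `garden_id` column and the `filter_ann_hits` helper are assumptions for this example; the point is only that ANN hits are re-checked against scope and garden ACL before anything downstream sees them.

```python
import sqlite3


def filter_ann_hits(conn, ann_ids: list[str], scope: str,
                    allowed_gardens: set[str]) -> list[str]:
    """Keep only ANN hit ids the caller may see, preserving ANN rank order."""
    if not ann_ids or not allowed_gardens:
        return []
    gardens = sorted(allowed_gardens)
    id_marks = ",".join("?" * len(ann_ids))
    garden_marks = ",".join("?" * len(gardens))
    rows = conn.execute(
        f"SELECT id FROM facts WHERE id IN ({id_marks}) "
        f"AND scope = ? AND garden_id IN ({garden_marks})",
        [*ann_ids, scope, *gardens],
    ).fetchall()
    visible = {row[0] for row in rows}
    return [fact_id for fact_id in ann_ids if fact_id in visible]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id TEXT PRIMARY KEY, scope TEXT, garden_id TEXT)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("f1", "team", "garden-a"),
    ("f2", "team", "garden-b"),   # other garden: must not leak
    ("f3", "local", "garden-a"),  # wrong scope
])
safe_hits = filter_ann_hits(conn, ["f2", "f1", "f3"], "team", {"garden-a"})
```

Only the filtered list should be passed to fusion or used as graph expansion seeds.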
Memory cards
A memory card is a per-entity synthesized summary stored as a stigmem:memory:card fact. It is the primary result for entity-centric recall queries.
- Surface contradictions: cards never silently pick a winner. A non-zero contradicted_count signals partial unreliability.
- Refresh on assert: a new fact for the entity invalidates the card and enqueues a background refresh.
- Refresh on decay: a confidence change in a constituent fact invalidates the card.
- Age threshold: STIGMEM_CARD_MAX_AGE_S (default 86400s).
During refresh the stale card remains readable with card_stale: true. Pass force_refresh=true to block on synchronous regeneration (bounded to 500ms).
Garden ACL on card recall (R2).
When a recall query includes a memory card, the node MUST verify the
caller's garden ACL against the card's garden_id before including
it. Cards in unauthorized gardens MUST be excluded; the response
falls back to raw facts from authorized gardens only.
Subscriptions
POST /v1/subscriptions registers a push subscription on a scope, entity, or garden. The node delivers events when matching facts change, eliminating polling loops for agents that watch shared entities.
Cross-garden leakage prevention.
The node re-evaluates the caller's garden ACL and capability
token revocation status on every event delivery. Event content is
not populated until the ACL check passes. If access has been revoked
since subscription creation, delivery is silently dropped and the
subscription is cancelled with event type
subscription_cancelled_access_revoked. Implementations MUST NOT
optimize this check away (e.g., by caching the result from
subscription creation time).
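The per-delivery check can be sketched as below. The `deliver` function, the subscription dict, the `fact_changed` event type, and the `has_access` callback (standing in for both the garden ACL and capability token revocation checks) are hypothetical; only `subscription_cancelled_access_revoked` comes from the text above.

```python
from typing import Callable, Optional


def deliver(subscription: dict, fact: dict,
            has_access: Callable[[str, str], bool],
            send: Callable[[dict], None]) -> Optional[dict]:
    """Deliver one event, re-checking access at delivery time."""
    # The check runs on every call; nothing is cached from creation time.
    if not has_access(subscription["caller"], fact["garden_id"]):
        subscription["active"] = False
        subscription["cancel_event"] = "subscription_cancelled_access_revoked"
        return None  # fact content was never placed in an event
    # Event content is populated only after the check has passed.
    event = {"type": "fact_changed", "fact": fact}
    send(event)
    return event


acl = {("example:agent-1", "garden-a")}
outbox: list[dict] = []
sub = {"caller": "example:agent-1", "active": True}
fact = {"id": "f1", "garden_id": "garden-a", "value": "secret"}


def check(caller: str, garden_id: str) -> bool:
    return (caller, garden_id) in acl


first = deliver(sub, fact, check, outbox.append)
acl.clear()  # access revoked after the subscription was created
second = deliver(sub, fact, check, outbox.append)
```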
Events are buffered for STIGMEM_SUBSCRIPTION_REPLAY_S seconds (default 3600s). Subscribers may request replay via GET /v1/subscriptions/:id/events?after={event_id} within this window.