Version: v0.9.0a2
Experimental feature
This feature is Experimental. Breaking changes may occur before it reaches Stable.

§21. Lazy Instruction Discovery


What this section covers

How agents discover and load their instructions on demand rather than preloading every instruction document at startup. Three runtime components — a boot stub, an instruction manifest, and the recall_instruction tool — plus an off-path discovery audit for continuous retrieval-quality evaluation.

Status: Experimental / opt-in source package on main

Source material: Archived evolutionary spec snapshots. This page is the maintained Spec-X home for lazy instruction discovery.

EXPERIMENTAL

The boot-stub schema and instruction-manifest format are not yet finalized and may change in a future minor release. Do not deploy lazy-discovered instructions in production agents handling sensitive data or irreversible tool use until this section reaches GA. Always pin instructions_manifest_uri to a trusted, integrity-verified source.

Section body


Status: Experimental. Implementation source is opt-in and remains outside the supported default install.


§21.1 Boot Stub

The boot stub is the minimal agent preamble loaded unconditionally at the start of every heartbeat or session. Its purpose is to give the agent enough context to function and to provide handles for lazy-loading the rest of its instructions.

§21.1.1 Required Content

| Field | Role | Description |
| --- | --- | --- |
| agent_id | identity | Stable UUID that uniquely identifies this agent within the deployment. |
| agent_role | human label | e.g. "CTO", "ResearchScientist". |
| heartbeat_contract | URI | instruction: fact URI pointing to the heartbeat procedure document. |
| manifest_uri | URI | instruction: scope URI for the instruction manifest. |
| recall_tool_schema | inline schema | JSON Schema for recall_instruction; MUST be present so the agent can invoke it without a separate fetch. |

Rules that apply unconditionally on every heartbeat MAY be embedded directly in the boot stub body.

This is the primary mitigation against chronic instruction-scope misses (§21.5.3 limitation note): a rule that is always in context cannot be silently missed by a retrieval failure. Deployments SHOULD classify each instruction unit as "always applicable" (candidate for boot stub embedding) or "task-conditional" (lazy-load via manifest).

§21.1.2 Wire Format

The boot stub MUST be serialized as a markdown document with YAML frontmatter:

---
agent_id: "8e0ed057-bcd8-4f8f-92ee-c046c55b64e9"
agent_role: "CTO"
heartbeat_contract: "instruction:acme/heartbeat-contract/v1"
manifest_uri: "instruction:acme/agent/cto/manifest/v1"
stub_version: 1
generated_at: "2026-05-04T00:00:00Z"
adapter_profile: "paperclip-claude-code"
migration_mode: "stigmem"
---

# Agent Boot Stub

You are **CTO** (id: `8e0ed057-bcd8-4f8f-92ee-c046c55b64e9`).

Your heartbeat procedure is at `instruction:acme/heartbeat-contract/v1`.
Your instruction manifest is at `instruction:acme/agent/cto/manifest/v1`.

Call `recall_instruction(intent)` to load relevant instruction sections before
performing any non-trivial task.

The body section MUST be no longer than 500 tokens as measured by a cl100k-compatible tokenizer. Implementations SHOULD target ≤ 450 tokens to leave headroom for adapter injection.
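As a rough illustration of the wire format above, the following sketch splits a stub into frontmatter and body and checks the 500-token body limit. It is not part of the spec: the function names are hypothetical, the parser only handles the flat `key: value` lines shown in the example (a real implementation would use a YAML library), and `count_tokens` is a caller-supplied stand-in for a cl100k-compatible tokenizer. Only the four fields that appear in both §21.1.1 and the example frontmatter are checked.

```python
# Hypothetical helper names; only the frontmatter keys come from the spec.
REQUIRED_KEYS = ("agent_id", "agent_role", "heartbeat_contract", "manifest_uri")


def parse_boot_stub(text):
    """Split a boot stub into (frontmatter dict, body string)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("boot stub must start with YAML frontmatter")
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("frontmatter is not closed")
    meta = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URI values keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def check_boot_stub(text, count_tokens, max_body_tokens=500):
    """Return a list of problems; an empty list means the stub passed."""
    meta, body = parse_boot_stub(text)
    problems = [f"missing field: {k}" for k in REQUIRED_KEYS if k not in meta]
    if count_tokens(body) > max_body_tokens:
        problems.append("body exceeds token limit")
    return problems
```

Passing the tokenizer in as a function keeps the check independent of any particular tokenizer package.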

§21.1.3 Adapter Profiles

| Profile | Injection | Description |
| --- | --- | --- |
| paperclip-claude-code | Paperclip | Injects Paperclip tool definitions and heartbeat harness context. |
| openai-assistants | OpenAI | Injects OpenAI Assistants tool-call shim. |
| generic | none | Stub is delivered as-is. |

Implementations MAY define additional profiles. Unknown profiles MUST be treated as generic.
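The fallback rule can be sketched in a few lines. The function name and the `extra_profiles` parameter are illustrative assumptions; only the three profile names come from the table above.

```python
KNOWN_PROFILES = {"paperclip-claude-code", "openai-assistants", "generic"}


def resolve_profile(requested=None, extra_profiles=()):
    """Return the profile to render, treating unknown names as generic."""
    known = KNOWN_PROFILES | set(extra_profiles)
    if requested in known:
        return requested
    # An absent or unknown profile is delivered as the generic stub.
    return "generic"
```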

§21.1.4 Boot Stub Delivery

GET /v1/agents/{agent_id}/boot-stub[?profile={adapter_profile}]

See §21.8.1 for the full wire contract. The boot stub MUST be regenerated whenever the agent's manifest_uri changes or the stub schema version increments; stale delivery is a correctness defect, not a warning.

§21.1.5 Task-Type Preloads

Immediately after boot stub delivery and before the agent receives any task context, the runtime MUST deliver the content of all manifest units whose required_by_task_types array contains the current heartbeat's wake reason. This is called task-type preloading. No retrieval scoring is applied; units are fetched deterministically.

(S1) The wake reason MUST be sourced from the authenticated heartbeat trigger event, e.g. the control-plane JWT or signed adapter payload. The runtime MUST NOT accept an unverified wake_reason claim originating from the agent's task context or any caller-supplied payload.

Preloading follows these rules:

  1. Compare the current wake reason against each manifest entry's required_by_task_types array. String comparison is exact and case-sensitive.
  2. All matching units MUST be fetched and injected into the agent's context before any task context is provided.
  3. Preloaded units MUST be included in the heartbeat's audit record under loaded_chunks, tagged with "source": "task_type_preload".
  4. If a preloaded unit's fact_uri is unreachable, the runtime falls back to path if present ("source": "fallback_path"); otherwise it surfaces a preload_unit_unavailable warning and continues. (S2) If the unavailable unit has guarantee_load: true, the runtime MUST treat unavailability as fatal and MUST abort the heartbeat. In all cases the warning or error MUST be written to instruction_audit.
  5. Token budget: boot stub + task-type preloads SHOULD remain under 2000 tokens. The runtime SHOULD emit preload_budget_warning when the budget is exceeded but MUST NOT silently drop units.
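A minimal sketch of the matching and budget rules above, assuming manifest entries are plain dicts shaped like §21.2.2 and that the wake reason has already been authenticated. The function name and the use of token_estimate for budgeting are assumptions, not spec requirements, and the fallback handling from rule 4 is not modeled.

```python
def select_preloads(entries, wake_reason, stub_tokens=0, budget=2000):
    """Return (units to preload, warnings) for an authenticated wake reason."""
    # `in` on a list of strings is an exact, case-sensitive comparison.
    matched = [
        e for e in entries
        if wake_reason in e.get("required_by_task_types", [])
    ]
    total = stub_tokens + sum(e.get("token_estimate", 0) for e in matched)
    warnings = []
    if total > budget:
        # Over budget: warn, but keep every matched unit.
        warnings.append("preload_budget_warning")
    loaded = [{"name": e["name"], "source": "task_type_preload"} for e in matched]
    return loaded, warnings
```

Note that no scoring or ranking is involved: every matching unit is returned, in manifest order.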

Governance:

2-task-type cap: Any entry with more than 2 required_by_task_types values MUST require explicit admin approval (task_types_approval_required).

Enum validation: Build pipelines MUST validate required_by_task_types strings against the registered wake-reason enum; unknown values MUST cause task_type_unknown.

Structural only: Task-type preloads are intended for structurally predictable critical units. They MUST NOT be used as a shortcut to load content that should be retrieved semantically.

(S3) Blast radius: Units in required_by_task_types are exposed unconditionally to all subsequent task context, including adversarial prompt injections later in the same heartbeat. Authors SHOULD NOT declare units containing content that must remain confidential.


§21.2 Instruction Manifest

§21.2.1 Token Budget

The instruction manifest MUST fit within 1000 tokens (cl100k).

Implementations MUST enforce this at write time and MUST reject a manifest update that would exceed it with error manifest_too_large.

§21.2.2 Manifest Entry Shape

{
  "name": "security-posture",
  "description": "Security constraints, escalation thresholds, and hard prohibitions.",
  "required_by_task_types": ["issue_assigned", "issue_commented"],
  "guarantee_load": false,
  "load_triggers": {
    "intents": ["security rule", "what am I not allowed to do", "escalation threshold"],
    "keywords": ["security", "escalate", "prohibited", "never"],
    "task_types": ["issue_assigned", "issue_commented", "routine_fired"]
  },
  "fact_uri": "instruction:acme/agent/cto/security-posture/v2",
  "path": null,
  "token_estimate": 320
}

| Field | Requirement | Description |
| --- | --- | --- |
| name | MUST · unique | Stable identifier; unique within the manifest. |
| description | MUST · ≤120 chars | One-line description of what this unit covers. |
| required_by_task_types | SHOULD for critical units | Wake-reason strings that cause deterministic preload (§21.1.5). |
| guarantee_load | MAY · max 5/agent | If true, unit is always appended to recall_instruction responses; requires admin approval; content MUST be safe for any authorised recall caller to observe. |
| load_triggers.intents | SHOULD | Natural-language intent phrases. |
| load_triggers.keywords | SHOULD | Exact or prefix-match keywords; MAY use BM25 matching. |
| load_triggers.task_types | MAY · semantic hint | Distinct from required_by_task_types: hint, not deterministic preload. |
| fact_uri / path | exactly one | instruction:-scope stigmem fact URI OR file path. Neither/both → manifest_entry_invalid. |
| token_estimate | SHOULD | Used for budget planning. |

required_by_task_types vs load_triggers.task_types: complementary. required_by_task_types is a deterministic preload commitment. load_triggers.task_types is a semantic hint — it does not guarantee loading.
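The publish-time checks that have named error codes in this section can be sketched as follows. This is an illustration only: the function name, the `wake_reason_enum` set, and the `approved` set of entry names with recorded admin approval are assumptions. The sketch follows the "exactly one" rule in the table; entries that carry both fact_uri and path during the co-existence period (§21.6.1) are not modeled.

```python
def validate_manifest(entries, wake_reason_enum, approved=frozenset()):
    """Return the sorted error codes a publish would be rejected with."""
    errors = set()
    for entry in entries:
        has_uri = entry.get("fact_uri") is not None
        has_path = entry.get("path") is not None
        if has_uri == has_path:  # neither set, or both set
            errors.add("manifest_entry_invalid")
        task_types = entry.get("required_by_task_types", [])
        if any(t not in wake_reason_enum for t in task_types):
            errors.add("task_type_unknown")
        if len(task_types) > 2 and entry.get("name") not in approved:
            errors.add("task_types_approval_required")
    # The guarantee_load cap is counted across the whole manifest.
    if sum(1 for e in entries if e.get("guarantee_load")) > 5:
        errors.add("guarantee_cap_exceeded")
    return sorted(errors)
```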

§21.2.3 Manifest Wire Contract

The manifest is stored as a stigmem fact under the instruction: scope (§21.4) and is also surfaced as a structured API resource. See §21.8.2 and §21.8.3.


§21.3 recall_instruction Tool Contract

§21.3.1 Request Shape

{
  "intent": "I need to check out an issue and start work",
  "max_chunks": 3,
  "token_budget": 1200,
  "manifest_hint": ["heartbeat-procedure", "checkout-procedure"]
}

| Field | Requirement | Default | Description |
| --- | --- | --- | --- |
| intent | MUST | (none) | Free-text description of what the agent is about to do. |
| max_chunks | SHOULD | 3 | Maximum number of instruction units to return. |
| token_budget | SHOULD | 2000 | Soft token budget for the combined response content. |
| manifest_hint | MAY | (none) | Explicit unit names from the manifest; loaded first before ranked retrieval. |

§21.3.2 Response Shape

{
  "chunks": [
    {
      "name": "heartbeat-procedure",
      "fact_uri": "instruction:acme/agent/cto/heartbeat-procedure/v3",
      "content": "## Heartbeat Procedure\n\n...",
      "tokens": 420,
      "valid_until": "2027-05-04T00:00:00Z",
      "version": "v3",
      "score": 0.91,
      "source": "stigmem"
    }
  ],
  "total_tokens": 420,
  "truncated": false,
  "missed_hints": [],
  "audit_token": "audi_01J..."
}

audit_token is a first-class body field, not a header.

The agent must pass it back when submitting usage feedback. Embedding it in the body ensures it cannot be silently dropped by middleware.

chunks[].source is "stigmem" or "fallback_path". missed_hints lists manifest_hint names that were not found.

§21.3.3 Backing Implementation

recall_instruction MUST be implemented as a stigmem recall call restricted to the instruction: scope:

POST /v1/recall
{
  "scope": "instruction:{deployment}/{agent_id}",
  "intent": "{agent-provided intent string}",
  "max_facts": "{max_chunks}",
  "token_budget": "{token_budget}",
  "weights": { "lexical": 0.35, "semantic": 0.50, "graph": 0.15 },
  "require_garden_ids": ["{agent_instruction_garden_id}"]
}

The require_garden_ids constraint MUST be applied so that recall_instruction cannot return facts from gardens the agent is not authorized to read. If manifest_hint is provided, the named units MUST be included before ranked retrieval; hints that are missing or inaccessible are omitted from chunks without raising an error and are listed in missed_hints.

Guaranteed units (guarantee_load: true):

  1. Position. Guaranteed units MUST be appended after ranked results by default (attention primacy). A unit with force_position: "prepend" MUST be prepended. (S6) force_position: "prepend" MUST undergo explicit content review at publish time, recorded in provenance, and MUST require a distinct admin approval record separate from the general guarantee_load approval.
  2. Budget precedence. Guaranteed units MUST NOT be silently dropped by token_budget exhaustion. Ranked results are truncated first; guaranteed units are never truncated to zero.
  3. (S4) Agent cap. At most 5 manifest units per agent may have guarantee_load: true. Publish MUST be rejected with guarantee_cap_exceeded if exceeded.
  4. Relevance threshold. SHOULD warn if a guaranteed unit has empirical P(relevant | recall_invoked) < 0.6.
  5. Governance. Requires explicit administrator approval recorded in provenance metadata.
  6. (S5) Confidentiality note. Guaranteed units are accessible to any principal authorised to invoke recall_instruction for this agent, including via prompt injection. Content in guaranteed units MUST NOT rely on retrieval difficulty for confidentiality.
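The ordering and budget-precedence rules above could be combined roughly as follows. This is a sketch under several assumptions that the section does not settle: chunks are dicts with name, tokens and (for ranked results) score; hinted units count toward max_chunks and token_budget; guaranteed units reserve their tokens up front and do not count toward max_chunks; ties in score are broken by name to keep the output deterministic. The function name is hypothetical.

```python
def assemble_chunks(hinted, ranked, guaranteed, max_chunks=3, token_budget=2000):
    """Order chunks as: manifest hints, ranked results, then guaranteed units."""
    # Reserve room for guaranteed units so ranked results are truncated first.
    reserved = sum(c["tokens"] for c in guaranteed)
    seen = {c["name"] for c in guaranteed}
    chunks, used, truncated = [], 0, False
    ordered = list(hinted) + sorted(ranked, key=lambda c: (-c["score"], c["name"]))
    for c in ordered:
        if c["name"] in seen:
            continue
        if len(chunks) >= max_chunks or used + c["tokens"] > token_budget - reserved:
            truncated = True
            continue
        chunks.append(c)
        used += c["tokens"]
        seen.add(c["name"])
    # Guaranteed units are appended unless explicitly marked for prepending.
    front = [c for c in guaranteed if c.get("force_position") == "prepend"]
    back = [c for c in guaranteed if c.get("force_position") != "prepend"]
    result = front + chunks + back
    return {
        "chunks": result,
        "total_tokens": sum(c["tokens"] for c in result),
        "truncated": truncated,
    }
```

Reserving the guaranteed units' tokens before filling the ranked slots is one simple way to make sure a tight budget squeezes out ranked results rather than guaranteed ones.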

§21.3.4 Determinism and Auditability

Given the same set of instruction facts at the same valid_until boundaries, the same (intent, manifest_hint, max_chunks, token_budget) tuple MUST produce the same ordered result. This determinism property enables replay-based audit.

Implementations MUST record every recall_instruction invocation in the discovery audit table before returning. Audit-write failure is logged as audit_write_failed but MUST NOT block the response (audit is best-effort).


§21.4 instruction: Scope Semantics

The instruction: URI scheme is a reserved stigmem scope for agent instruction artifacts. It extends §17 (Memory Garden) and §19 (Federation Trust) with instruction-specific semantics.

§21.4.1 Scope Namespace

instruction:{deployment}/{agent_id}/{unit_name}/{version}

| Segment | Constraint | Notes |
| --- | --- | --- |
| {deployment} | org root | MUST match the entity_uri root in the org manifest. |
| {agent_id} | UUID or shortname | Stable agent UUID or a well-known shortname (e.g. cto). |
| {unit_name} | manifest entry | The instruction unit name from the manifest. |
| {version} | monotonic | e.g. v1, v3; MUST be monotonically incrementing; MUST NOT be a floating alias like latest. |

The special URI instruction:{deployment}/{agent_id}/manifest/{version} addresses the agent's instruction manifest itself.

§21.4.2 Versioning

Instruction facts are mutable as a series; individual versioned facts are immutable once written.

A new version MUST be written as a new fact rather than mutating an existing fact. The previous version's valid_until MUST be set to the new version's created_at within the same transaction or a 30-second grace window.

Latest version wins: recall_instruction MUST return only the latest version (highest version string by semantic version ordering).

Per-heartbeat cache: Agents MAY cache instruction chunks for the duration of a heartbeat/session.

No cross-heartbeat cache: Cached chunks are not reused across heartbeats unless valid_until extends past the next expected heartbeat time.
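One practical detail when selecting the latest version: comparing version strings lexicographically would rank v9 above v10. The sketch below assumes versions use the "v" plus integer form seen in this section's examples (v1, v3) and compares them numerically; it does not handle multi-part semantic versions, and the helper names are hypothetical.

```python
import re

_VERSION = re.compile(r"^v(\d+)$")


def version_number(version):
    """Parse 'v3' into 3; reject floating aliases such as 'latest'."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"not a pinned version: {version!r}")
    return int(match.group(1))


def latest(facts):
    """Return the fact with the numerically highest version."""
    return max(facts, key=lambda f: version_number(f["version"]))
```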

§21.4.3 Provenance

| Field | Requirement | Notes |
| --- | --- | --- |
| source_trust | MUST · ≥0.9 if human | Instruction facts authored by verified human administrators SHOULD have source_trust ≥ 0.9. |
| attestation_chain | MUST | At least one signature from an org manifest key. Unsigned instruction facts MUST be quarantined. |
| derived_from | SHOULD | Reference the prior version hash when updating; null valid for first version. |
| authored_by | MUST | entity_uri of the human or system that created this version. |
| authored_at | MUST | ISO 8601 timestamp. |

§21.4.4 Garden Membership

All instruction facts MUST be placed in a dedicated instruction garden:

garden_id: "instruction:{deployment}:{agent_id}"

Agent: read-only. The agent itself has access via a capability token with verb read.

Admins: read + write, via admin API key.

Peer agents: no access. Peer agents MUST NOT have read access to another agent's instruction garden unless explicitly granted.

§21.4.5 Cross-Agent Confidentiality

Cross-agent instruction access is a confidentiality boundary.

Capability tokens granting read on another agent's instruction scope MUST NOT be derivable unless the requesting agent's role is a declared supervisor in the org manifest. Federation MUST NOT replicate instruction-scope facts to peer nodes unless the receiving node is in the same deployment trust domain. Cross-agent recall attempts MUST return 403 instruction_scope_denied. Audit logs MUST NOT be surfaced to peer agents.


§21.5 Discovery Audit

§21.5.1 Audit Record Shape

{
  "id": "audevent_01J...",
  "agent_id": "8e0ed057-bcd8-4f8f-92ee-c046c55b64e9",
  "heartbeat_id": "run_ad74de74...",
  "session_start": "2026-05-04T12:00:00Z",
  "intent": "I need to check out an issue and start work",
  "loaded_chunks": ["heartbeat-procedure", "checkout-procedure"],
  "used_chunks": ["heartbeat-procedure"],
  "missed_chunks": [],
  "audit_token": "audi_01J...",
  "audit_closed": "2026-05-04T12:01:05Z",
  "created_at": "2026-05-04T12:00:02Z"
}

The four-way comparison (intent, loaded_chunks, used_chunks, missed_chunks) is the raw input for the evaluation metrics defined in §21.5.3. used_chunks and missed_chunks MAY be populated by the runtime or by the agent via self-report at heartbeat end.

§21.5.2 Audit Submission API

POST /v1/instruction/audit
Authorization: Bearer <agent api-key>
Content-Type: application/json

{
  "audit_token": "audi_01J...",
  "used_chunks": ["heartbeat-procedure"],
  "missed_chunks": []
}
→ 204 No Content on success
→ 400 audit_token_invalid if token not recognized or already fully closed
→ 400 audit_token_expired if token is older than 24 hours

The audit endpoint MUST be idempotent.

A second submission with the same audit_token MUST return 204 without modifying the record.
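A minimal in-memory sketch of the submission semantics, for illustration only. The class and method names are hypothetical and timestamps are plain seconds. The status list above mentions "already fully closed" under audit_token_invalid while the idempotency rule requires 204 on a repeat; this sketch follows the idempotency rule, which is an interpretation rather than something the section states.

```python
import time

DAY_SECONDS = 24 * 60 * 60


class AuditStore:
    """Toy store keyed by audit_token; not a real persistence layer."""

    def __init__(self):
        self.records = {}

    def open(self, audit_token, now=None):
        """Create the audit row when recall_instruction is invoked."""
        self.records[audit_token] = {
            "created_at": time.time() if now is None else now,
            "audit_closed": None,
            "used_chunks": [],
            "missed_chunks": [],
        }

    def submit(self, audit_token, used_chunks, missed_chunks, now=None):
        """Return (http_status, error_code) for a feedback submission."""
        now = time.time() if now is None else now
        record = self.records.get(audit_token)
        if record is None:
            return 400, "audit_token_invalid"
        if record["audit_closed"] is not None:
            return 204, None  # repeat submission: succeed without changes
        if now - record["created_at"] > DAY_SECONDS:
            return 400, "audit_token_expired"
        record.update(
            used_chunks=list(used_chunks),
            missed_chunks=list(missed_chunks),
            audit_closed=now,
        )
        return 204, None
```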

§21.5.3 Replay-Based Evaluation

| Metric | Formula | Meaning |
| --- | --- | --- |
| Recall@k | \|used ∩ loaded@k\| / \|used\| | Fraction of used_chunks that appear in loaded_chunks within rank k. |
| Hit@k | ≥1 in loaded@k | Fraction of heartbeats where at least one used_chunk was in loaded_chunks. |
| Miss rate | \|missed\| / (\|used\| + \|missed\|) | Alert when miss_rate > 0.15 over 100+ events. |

These metrics SHOULD be computed over a rolling 7-day window. Determinism (§21.3.4) guarantees the replay is reproducible.
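The three metrics can be computed from audit records along these lines. The sketch assumes loaded_chunks is stored in rank order and that the miss rate is aggregated over all events in the window; both are assumptions, as are the function names. Each function returns None when its denominator would be zero.

```python
def recall_at_k(loaded, used, k):
    """Fraction of used chunks that were loaded within rank k."""
    used_set = set(used)
    if not used_set:
        return None
    return len(set(loaded[:k]) & used_set) / len(used_set)


def hit_at_k(events, k):
    """Fraction of events with at least one used chunk in the top k loaded."""
    if not events:
        return None
    hits = sum(
        1 for e in events
        if set(e["loaded_chunks"][:k]) & set(e["used_chunks"])
    )
    return hits / len(events)


def miss_rate(events):
    """Missed chunks as a share of used plus missed, over all events."""
    used = sum(len(e["used_chunks"]) for e in events)
    missed = sum(len(e["missed_chunks"]) for e in events)
    if used + missed == 0:
        return None
    return missed / (used + missed)
```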

Known limitation — endogeneity of used_chunks (non-normative): All three metrics are computed relative to used_chunks, which is itself derived from agent behavior during the heartbeat being measured. An agent that chronically fails to load a required instruction unit will never reference it, so the unit will never appear in used_chunks. The chronic miss is therefore invisible to all three live-audit metrics. This is an accepted limitation.

§21.5.4 Probe-Set Eval (follow-on, non-normative)

To complement the endogenous live-audit metrics with an exogenous coverage signal, implementations SHOULD maintain a probe set: a curated list of (intent, required_units) pairs administered independently of the live agent. After every manifest update and on a periodic schedule (e.g. daily), run recall_instruction(intent) against each probe and compute Probe-coverage@k and Probe-hit@k.

§21.5.5 Probe-Set Coverage Sampling with Soft Score Lift (non-normative)

  1. Probe-set construction at manifest publish time.
  2. Background coverage audit runs daily and on every embedding-model version bump.
  3. Soft score lift for coverage-critical units: score += log(1 + λ) where λ ≈ 0.15.
  4. Coverage endpoint (§21.8.6).
  5. Probe-set calibration with PII-stripped real heartbeat intents on a weekly cadence.
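To make the soft score lift concrete: with λ = 0.15 the additive term is log(1.15), roughly 0.14, assuming the natural logarithm (the section does not name the base). The sketch below applies it to candidates represented as (name, score) pairs, which is an illustrative representation rather than the spec's data model.

```python
import math


def lifted_score(score, coverage_critical, lam=0.15):
    """Add log(1 + lam) to the score of a coverage-critical unit."""
    if coverage_critical:
        return score + math.log1p(lam)
    return score


def rerank(candidates, critical_names, lam=0.15):
    """Sort (name, score) pairs by lifted score, highest first."""
    return sorted(
        candidates,
        key=lambda c: (-lifted_score(c[1], c[0] in critical_names, lam), c[0]),
    )
```

For example, a coverage-critical unit scoring 0.50 is lifted to about 0.64 and would outrank an unlifted unit at 0.60.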

§21.6 Migration Semantics

§21.6.1 Co-existence Period

  1. If a manifest entry has both fact_uri and path, fact_uri MUST take precedence.
  2. If fact_uri lookup fails, the runtime MUST fall back to path if present and MUST append "source": "fallback_path".
  3. File-path entries are read-only.
  4. The boot stub MUST indicate migration state via migration_mode: "file", "coexistence", or "stigmem".
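The precedence and fallback rules above might look like this in a runtime. `fetch_fact` and `read_path` are hypothetical caller-supplied callables, and a failed lookup is assumed to raise LookupError. Tagging a path-only entry as "fallback_path" is also an assumption, made because §21.3.2 lists only two source values.

```python
def load_unit(entry, fetch_fact, read_path):
    """Resolve one manifest entry to (content, source)."""
    uri = entry.get("fact_uri")
    path = entry.get("path")
    if uri is not None:
        try:
            # fact_uri takes precedence whenever it is present.
            return fetch_fact(uri), "stigmem"
        except LookupError:
            if path is None:
                raise
    if path is not None:
        return read_path(path), "fallback_path"
    raise LookupError(f"entry {entry.get('name')!r} has neither fact_uri nor path")
```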

§21.6.2 Deprecation Path

| Stage | Mode | Action |
| --- | --- | --- |
| 1. Seed | verify ≥5 heartbeats | Write instruction content to stigmem as instruction: facts; verify recall quality. |
| 2. Coexist | coexistence | Add fact_uri alongside existing path. |
| 3. Verify | 7-day window | Monitor audit metrics; confirm miss_rate < 0.10. |
| 4. Promote | stigmem | Remove path from the manifest entry. |
| 5. Archive | legacy folder | Move source markdown to docs/legacy-instructions/ with a redirect comment. |

Deployments MUST NOT skip Stage 3 (Verify) for agents that handle sensitive operational decisions.

The risk of an undetected miss in a security-relevant instruction unit is higher than the cost of a 7-day observation window.

§21.6.3 Bulk Migration Tooling

Implementations SHOULD provide a stigmem migrate-instructions CLI that reads existing markdown, splits at H2/H3 boundaries, writes each section as an instruction: fact, and emits a manifest entries array for review. It MUST NOT automatically update the manifest or boot stub.
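The splitting step could be approximated as below. This is not the CLI itself, only a sketch of splitting at H2/H3 boundaries; the function name and output shape are assumptions. It ignores any text before the first H2/H3 heading and does not special-case headings that appear inside fenced code blocks.

```python
import re

# Matches "## Title" and "### Title", but not "# Title" or "#### Title".
_HEADING = re.compile(r"^(#{2,3})\s+(.*)$")


def split_sections(markdown):
    """Split markdown into sections, starting a new one at each H2 or H3."""
    sections = []
    current = None
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            current = {
                "title": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
            sections.append(current)
        elif current is not None:
            current["lines"].append(line)
    return [
        {
            "title": s["title"],
            "level": s["level"],
            "content": "\n".join(s["lines"]).strip(),
        }
        for s in sections
    ]
```

Each returned section would then be reviewed by a human before being written as a fact and listed in the manifest, in keeping with the rule that the tool does not update the manifest automatically.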


§21.7 Schema Migrations

Three tables support the lazy instruction layer:

instruction_manifests: Versioned snapshots of each agent's manifest. Previous versions retained with superseded_at.

instruction_audit: Append-only log backing the discovery audit. One row per recall_instruction invocation.

boot_stubs: Caches the rendered boot stub per (agent_id, adapter_profile). Invalidated on manifest version change.

CREATE TABLE IF NOT EXISTS instruction_manifests (
  id TEXT PRIMARY KEY,
  agent_id TEXT NOT NULL,
  version TEXT NOT NULL,
  fact_uri TEXT NOT NULL,
  token_count INTEGER NOT NULL,
  body TEXT NOT NULL,
  created_at INTEGER NOT NULL,
  superseded_at INTEGER,
  UNIQUE(agent_id, version)
);
CREATE INDEX IF NOT EXISTS idx_manifests_agent ON instruction_manifests (agent_id, superseded_at NULLS FIRST);

CREATE TABLE IF NOT EXISTS instruction_audit (
  id TEXT PRIMARY KEY,
  agent_id TEXT NOT NULL,
  heartbeat_id TEXT NOT NULL,
  session_start INTEGER NOT NULL,
  intent TEXT NOT NULL,
  loaded_chunks TEXT NOT NULL,
  used_chunks TEXT NOT NULL DEFAULT '[]',
  missed_chunks TEXT NOT NULL DEFAULT '[]',
  audit_token TEXT NOT NULL UNIQUE,
  audit_closed INTEGER,
  created_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_audit_agent_session ON instruction_audit (agent_id, session_start DESC);
CREATE INDEX IF NOT EXISTS idx_audit_token ON instruction_audit (audit_token);

CREATE TABLE IF NOT EXISTS boot_stubs (
  agent_id TEXT NOT NULL,
  adapter_profile TEXT NOT NULL DEFAULT 'generic',
  stub_version INTEGER NOT NULL DEFAULT 1,
  body TEXT NOT NULL,
  token_count INTEGER NOT NULL,
  generated_at INTEGER NOT NULL,
  manifest_version TEXT NOT NULL,
  PRIMARY KEY (agent_id, adapter_profile)
);

§21.8 Wire Format Additions

§21.8.1 Get Boot Stub (MUST)

GET /v1/agents/{agent_id}/boot-stub[?profile={adapter_profile}]
Authorization: Bearer <agent api-key or admin api-key>

→ 200 Content-Type: text/markdown
X-Stub-Version: 1
X-Manifest-Version: v3
X-Token-Count: 420
[stub body]

→ 403 if caller is not the agent itself or an admin
→ 404 if agent not found or no stub generated yet

If profile is absent, MUST default to generic. Unknown profiles MUST be treated as generic.

§21.8.2 Get Instruction Manifest (MUST)

GET /v1/agents/{agent_id}/instruction-manifest
Authorization: Bearer <agent api-key or admin api-key>

→ 200 {
  "manifest_version": "v3",
  "fact_uri": "instruction:acme/agent/cto/manifest/v3",
  "token_count": 840,
  "entries": [ ...entry objects... ],
  "last_updated_at": "2026-05-04T00:00:00Z"
}
→ 403 if caller is not the agent itself or an admin
→ 404 if no manifest configured for agent

§21.8.3 Publish / Replace Instruction Manifest (MUST)

PUT /v1/agents/{agent_id}/instruction-manifest
Authorization: Bearer <admin api-key>
Content-Type: application/json

{
  "version": "v4",
  "entries": [ ...entry objects... ],
  "skip_coverage_gate": false
}

→ 200 { "fact_uri": "...", "token_count": 910, "coverage_report": [...] }
→ 400 manifest_too_large
→ 400 manifest_entry_invalid
→ 400 manifest_coverage_failure
→ 400 task_type_unknown
→ 400 guarantee_cap_exceeded
→ 400 task_types_approval_required
→ 409 manifest_version_conflict

Augmented manifest coverage gate (Approach A):

  1. For every string in load_triggers.intents, generate N=5 paraphrases using lexically and syntactically diverse augmentation (MUST NOT use the retrieval encoder's own nearest-neighbour space as the sole paraphrase source).
  2. Run recall_instruction(paraphrase) for each generated paraphrase; check whether this unit appears in top-k results (default k=3).
  3. Compute coverage_pct = (paraphrases where unit in top-k) / (total paraphrases).
  4. If coverage_pct < 0.80 for any unit, the entire publish MUST be rejected with manifest_coverage_failure.
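Steps 2 to 4 of the gate can be sketched as follows, with paraphrase generation left out. `recall` is a hypothetical callable that takes a paraphrase and returns unit names in rank order, and `units` maps each unit name to its already-generated paraphrases; both shapes are assumptions made for the example.

```python
def coverage_pct(unit_name, paraphrases, recall, k=3):
    """Share of paraphrases for which the unit appears in the top-k results."""
    if not paraphrases:
        raise ValueError("at least one paraphrase is required")
    hits = sum(1 for p in paraphrases if unit_name in recall(p)[:k])
    return hits / len(paraphrases)


def coverage_gate(units, recall, k=3, threshold=0.80):
    """Return (per-unit report, sorted names of units below the threshold)."""
    report = {
        name: coverage_pct(name, paraphrases, recall, k)
        for name, paraphrases in units.items()
    }
    failed = sorted(name for name, pct in report.items() if pct < threshold)
    return report, failed
```

A non-empty `failed` list corresponds to rejecting the entire publish with manifest_coverage_failure.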

(S7) Two-admin co-sign for skip_coverage_gate on guaranteed-unit manifests.

When skip_coverage_gate: true is used on a manifest containing any guarantee_load: true entry, the bypass provenance record MUST include co-signatures from at least two distinct administrators (two distinct authored_by entity URIs). Single-admin bypass is permitted only for manifests with no guarantee_load: true entries. (S10) Implementations SHOULD automatically schedule re-certification within 7 days when skip_coverage_gate: true is used.

(S8) Paraphrase generator data boundary.

Paraphrase generation input MUST be limited to load_triggers.intents strings only. Instruction fact content MUST NOT be sent to any external paraphrase generation service. External services MUST be listed in the deployment's trust manifest with an appropriate DPA. Implementations SHOULD prefer local, deterministic paraphrase methods for confidential instruction content.

Re-certification: When the deployment's embedding model version changes, all existing manifests MUST be re-certified through this gate before the new model version is activated for production recall.

This route MUST atomically: (1) run coverage gate, (2) write manifest fact, (3) update instruction_manifests table, (4) invalidate boot stub cache.

§21.8.4 Recall Instructions (MUST)

POST /v1/agents/{agent_id}/recall-instruction
Authorization: Bearer <agent api-key>
Content-Type: application/json

{ "intent": "...", "max_chunks": 3, "token_budget": 1200, "manifest_hint": ["heartbeat-procedure"] }

→ 200 { ...response shape per §21.3.2... }
→ 400 intent_required
→ 403 instruction_scope_denied
→ 404 if agent not found
→ 503 recall_backend_unavailable (retryable)

§21.8.5 Submit Discovery Audit (SHOULD)

The wire-level route for §21.5.2. This is a SHOULD (not MUST) because the audit is best-effort.

§21.8.6 Get Manifest Coverage Report (SHOULD)

GET /v1/agents/{agent_id}/instruction-manifest/coverage
Authorization: Bearer <agent api-key or admin api-key>

Agent-key response:
→ 200 {
  "manifest_version": "v4",
  "embedding_model_version": "nomic-embed-text-v1.5",
  "evaluated_at": "2026-05-04T06:00:00Z",
  "units": [ { "name": "...", "coverage_pct": 0.95, "hit_at_10": 0.92, "probe_count": 20, "last_evaluated_at": "..." } ]
}

Admin-key response: same as above, plus "coverage_status" field per unit.

→ 403 instruction_scope_denied
→ 404 if no manifest or no coverage report generated yet

(S9) Scope validation + (S11) Categorical label restriction.

The agent_id path parameter MUST be validated against the caller's API key scope; peer-agent queries MUST return 403. The coverage_status categorical label ("ok", "coverage_critical", "not_evaluated") SHOULD be returned only in admin-key responses. Agent-key responses SHOULD return only raw coverage_pct and hit_at_10 values, omitting the categorical label — this limits the retrieval-quality oracle surface for non-admin callers.

coverage_status values (admin-only): "ok" (hit@10 ≥ 0.4), "coverage_critical" (hit@10 < 0.4, soft-lift eligible), "not_evaluated" (probe run not yet completed).
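The thresholds and the admin-only restriction can be illustrated with a small helper. The function names are hypothetical, and representing "probe run not yet completed" as a hit_at_10 of None is an assumption of this sketch.

```python
UNIT_FIELDS = ("name", "coverage_pct", "hit_at_10", "probe_count", "last_evaluated_at")


def coverage_status(hit_at_10):
    """Map a hit@10 value to the categorical label."""
    if hit_at_10 is None:
        return "not_evaluated"
    return "ok" if hit_at_10 >= 0.4 else "coverage_critical"


def render_unit(unit, is_admin):
    """Build the per-unit response, adding the label only for admin callers."""
    view = {field: unit.get(field) for field in UNIT_FIELDS}
    if is_admin:
        view["coverage_status"] = coverage_status(unit.get("hit_at_10"))
    return view
```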


§21.9 Error Reference

| HTTP | Error code | Condition |
| --- | --- | --- |
| 400 | intent_required | intent field absent or empty. |
| 400 | manifest_too_large | Manifest exceeds 1000-token budget. |
| 400 | manifest_entry_invalid | Entry has neither fact_uri nor path, or has both. |
| 400 | manifest_coverage_failure | One or more units failed the paraphrase coverage gate. |
| 400 | task_type_unknown | required_by_task_types value not in registered enum. |
| 400 | guarantee_cap_exceeded | More than 5 entries have guarantee_load: true. |
| 400 | task_types_approval_required | Entry declares > 2 required_by_task_types without admin approval. |
| 400 | audit_token_invalid | Token not recognized or already fully closed. |
| 400 | audit_token_expired | Token older than 24 hours. |
| 403 | instruction_scope_denied | Caller's token scope does not match the agent's instruction garden. |
| 404 | manifest_not_found | No instruction manifest configured for the agent. |
| 404 | boot_stub_not_found | No boot stub generated yet. |
| 409 | manifest_version_conflict | Version string already exists; manifest versions are immutable. |
| 503 | recall_backend_unavailable | Stigmem recall backend unreachable; retryable. |
