§21. Lazy Instruction Discovery
What this section covers
How agents discover and load their instructions on demand rather
than preloading every instruction document at startup. Three
runtime components — a boot stub, an instruction manifest,
and the recall_instruction tool — plus an off-path discovery
audit for continuous retrieval-quality evaluation.
Status: Experimental / opt-in source package on main
Source material: Archived evolutionary spec snapshots. This page is the maintained Spec-X home for lazy instruction discovery.
The boot-stub schema and instruction-manifest format are not yet finalized and may change in a future minor release. Do not deploy lazy-discovered instructions in production agents handling sensitive data or irreversible tool use until this section reaches GA. Always pin instructions_manifest_uri to a trusted, integrity-verified source.
Status: Experimental. Implementation source is opt-in and remains outside the supported default install.
§21.1 Boot Stub
The boot stub is the minimal agent preamble loaded unconditionally at the start of every heartbeat or session. Its purpose is to give the agent enough context to function and to provide handles for lazy-loading the rest of its instructions.
§21.1.1 Required Content
- agent_id
- agent_role: for example "CTO", "ResearchScientist".
- heartbeat_contract: instruction: fact URI pointing to the heartbeat procedure document.
- manifest_uri: instruction: scope URI for the instruction manifest.
- recall_tool_schema: tool schema for recall_instruction; MUST be present so the agent can invoke it without a separate fetch.
Rules that apply unconditionally on every heartbeat MAY be embedded directly in the boot stub body.
This is the primary mitigation against chronic instruction-scope misses (§21.5.3 limitation note): a rule that is always in context cannot be silently missed by a retrieval failure. Deployments SHOULD classify each instruction unit as "always applicable" (candidate for boot stub embedding) or "task-conditional" (lazy-load via manifest).
§21.1.2 Wire Format
The boot stub MUST be serialized as a markdown document with YAML frontmatter:
---
agent_id: "8e0ed057-bcd8-4f8f-92ee-c046c55b64e9"
agent_role: "CTO"
heartbeat_contract: "instruction:acme/heartbeat-contract/v1"
manifest_uri: "instruction:acme/agent/cto/manifest/v1"
stub_version: 1
generated_at: "2026-05-04T00:00:00Z"
adapter_profile: "paperclip-claude-code"
migration_mode: "stigmem"
---
# Agent Boot Stub
You are **CTO** (id: `8e0ed057-bcd8-4f8f-92ee-c046c55b64e9`).
Your heartbeat procedure is at `instruction:acme/heartbeat-contract/v1`.
Your instruction manifest is at `instruction:acme/agent/cto/manifest/v1`.
Call `recall_instruction(intent)` to load relevant instruction sections before
performing any non-trivial task.
The body section MUST be no longer than 500 tokens as measured by a cl100k-compatible tokenizer. Implementations SHOULD target ≤ 450 tokens to leave headroom for adapter injection.
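The wire format and body budget can be checked mechanically. The following is a minimal sketch, not the reference implementation: `split_boot_stub`, `check_body_budget`, and `approx_tokens` are hypothetical names, the frontmatter parser only handles the flat scalar keys shown above, and the default counter is a whitespace placeholder that a real deployment would replace with a cl100k-compatible tokenizer.

```python
from typing import Callable, Dict, Tuple

BODY_TOKEN_LIMIT = 500   # hard limit from §21.1.2
BODY_TOKEN_TARGET = 450  # recommended target, leaves adapter headroom


def split_boot_stub(text: str) -> Tuple[Dict[str, str], str]:
    """Split a boot stub into (frontmatter fields, markdown body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("boot stub must start with YAML frontmatter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("unterminated frontmatter") from None
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URI values keep their colons.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def approx_tokens(text: str) -> int:
    """Placeholder counter (whitespace split); NOT cl100k-accurate."""
    return len(text.split())


def check_body_budget(
    body: str, count_tokens: Callable[[str], int] = approx_tokens
) -> str:
    """Return 'ok', 'over_target', or 'over_limit' for the stub body."""
    n = count_tokens(body)
    if n > BODY_TOKEN_LIMIT:
        return "over_limit"
    return "over_target" if n > BODY_TOKEN_TARGET else "ok"
```

Passing the tokenizer in as a callable keeps the budget check independent of any particular tokenizer library.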
§21.1.3 Adapter Profiles
- paperclip-claude-code
- openai-assistants
- generic
Implementations MAY define additional profiles. Unknown profiles MUST be treated as generic.
§21.1.4 Boot Stub Delivery
GET /v1/agents/{agent_id}/boot-stub[?profile={adapter_profile}]
See §21.8.1 for the full wire contract. The boot stub MUST be regenerated whenever the agent's manifest_uri changes or the stub schema version increments; stale delivery is a correctness defect, not a warning.
§21.1.5 Task-Type Preloads
Immediately after boot stub delivery and before the agent receives any task context, the runtime MUST deliver the content of all manifest units whose required_by_task_types array contains the current heartbeat's wake reason. This is called task-type preloading. No retrieval scoring is applied; units are fetched deterministically.
(S1) Wake reason MUST be sourced from the authenticated heartbeat trigger event, for example the control-plane JWT or signed adapter payload. The runtime MUST NOT accept an unverified wake_reason claim originating from the agent's task context or any caller-supplied payload.
- Compare the current wake reason against each manifest entry's required_by_task_types array. String comparison is exact and case-sensitive.
- All matching units MUST be fetched and injected into the agent's context before any task context is provided.
- Preloaded units MUST be included in the heartbeat's audit record under loaded_chunks, tagged with "source": "task_type_preload".
- If a preloaded unit's fact_uri is unreachable, fall back to path if present ("source": "fallback_path") or surface a preload_unit_unavailable warning and continue. (S2) If the unavailable unit has guarantee_load: true, the runtime MUST treat unavailability as fatal and MUST abort the heartbeat. In all cases the warning or error MUST be written to instruction_audit.
- Token budget: boot stub + task-type preloads SHOULD remain under 2000 tokens. Implementations SHOULD emit preload_budget_warning when exceeded but MUST NOT silently drop units.
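The preload rules above can be sketched as a single deterministic pass over the manifest. This is an illustrative sketch under stated assumptions: `preload_units`, `PreloadFatal`, and the `load` callable are hypothetical names, `load` is assumed to return `None` when neither fact_uri nor path could be read, and `wake_reason` is assumed to have already been taken from the authenticated trigger event.

```python
from typing import Callable, Dict, List, Optional


class PreloadFatal(Exception):
    """Raised when a guarantee_load unit cannot be preloaded (S2)."""


def preload_units(
    entries: List[dict],
    wake_reason: str,
    load: Callable[[dict], Optional[dict]],
) -> Dict[str, list]:
    """Deterministically preload every unit required by the wake reason."""
    loaded, warnings = [], []
    for entry in entries:
        # Exact, case-sensitive match; no retrieval scoring is involved.
        if wake_reason not in entry.get("required_by_task_types", []):
            continue
        chunk = load(entry)
        if chunk is None:
            if entry.get("guarantee_load"):
                raise PreloadFatal(entry["name"])  # abort the heartbeat
            warnings.append(
                {"code": "preload_unit_unavailable", "unit": entry["name"]}
            )
            continue
        # A "source" key returned by the loader (e.g. "fallback_path")
        # overrides the default tag.
        loaded.append(
            {"name": entry["name"], "source": "task_type_preload", **chunk}
        )
    return {"loaded_chunks": loaded, "warnings": warnings}
```

A caller would write both `loaded_chunks` and `warnings` to the audit record before handing task context to the agent.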
Governance:
2-task-type cap
Any entry with more than 2 required_by_task_types values MUST require explicit admin approval (task_types_approval_required).
Enum validation
Build pipelines MUST validate strings against the registered wake-reason enum; unknown values MUST be rejected with task_type_unknown.
Structural only
Task-type preloads are intended for structurally predictable, critical units. They MUST NOT be used as a shortcut to load content that should be retrieved semantically.
(S3) Blast radius
Units in required_by_task_types are exposed unconditionally to all subsequent task context, including adversarial prompt injections later in the same heartbeat. Authors SHOULD NOT declare units containing content that must remain confidential.
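The first two governance rules lend themselves to a build-time check. A minimal sketch, assuming a hypothetical `WAKE_REASONS` registry (populated here with the task types that appear in this section's examples) and a hypothetical `check_task_types` helper:

```python
from typing import FrozenSet, List, Optional

# Hypothetical registry; a real pipeline would load the registered enum.
WAKE_REASONS: FrozenSet[str] = frozenset(
    {"issue_assigned", "issue_commented", "routine_fired"}
)
MAX_UNAPPROVED_TASK_TYPES = 2


def check_task_types(entry: dict, admin_approved: bool = False) -> Optional[str]:
    """Return an error code for a manifest entry, or None if it passes."""
    task_types: List[str] = entry.get("required_by_task_types", [])
    # Enum validation: exact, case-sensitive membership.
    if any(t not in WAKE_REASONS for t in task_types):
        return "task_type_unknown"
    # 2-task-type cap: more than two values needs explicit admin approval.
    if len(task_types) > MAX_UNAPPROVED_TASK_TYPES and not admin_approved:
        return "task_types_approval_required"
    return None
```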
§21.2 Instruction Manifest
§21.2.1 Token Budget
The instruction manifest MUST fit within 1000 tokens (cl100k).
Implementations MUST enforce this at write time and MUST reject a
manifest update that would exceed it with error manifest_too_large.
§21.2.2 Manifest Entry Shape
{
"name": "security-posture",
"description": "Security constraints, escalation thresholds, and hard prohibitions.",
"required_by_task_types": ["issue_assigned", "issue_commented"],
"guarantee_load": false,
"load_triggers": {
"intents": ["security rule", "what am I not allowed to do", "escalation threshold"],
"keywords": ["security", "escalate", "prohibited", "never"],
"task_types": ["issue_assigned", "issue_commented", "routine_fired"]
},
"fact_uri": "instruction:acme/agent/cto/security-posture/v2",
"path": null,
"token_estimate": 320
}
- name
- description
- required_by_task_types
- guarantee_load: when true, the unit is always included in recall_instruction responses (§21.3.3); requires admin approval; content MUST be safe for any authorised recall caller to observe.
- load_triggers.intents
- load_triggers.keywords
- load_triggers.task_types: unlike required_by_task_types, a hint, not a deterministic preload.
- fact_uri / path: instruction:-scope stigmem fact URI OR file path. Neither/both → manifest_entry_invalid.
- token_estimate
required_by_task_types vs load_triggers.task_types: complementary. required_by_task_types is a deterministic preload commitment. load_triggers.task_types is a semantic hint; it does not guarantee loading.
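The "neither/both" rule for fact_uri and path can be expressed as a small validator. This is a sketch with a hypothetical helper name, `check_entry_source`; the `instruction:` prefix check is an added assumption based on the field description. It applies the rule literally, so deployments in the co-existence period of §21.6.1 would need to relax it.

```python
from typing import Optional


def check_entry_source(entry: dict) -> Optional[str]:
    """Apply the 'neither/both' rule for fact_uri and path."""
    has_fact = bool(entry.get("fact_uri"))
    has_path = bool(entry.get("path"))
    if has_fact == has_path:  # both set, or both missing/null
        return "manifest_entry_invalid"
    # Assumption: a fact_uri must live in the instruction: scope.
    if has_fact and not entry["fact_uri"].startswith("instruction:"):
        return "manifest_entry_invalid"
    return None
```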
§21.2.3 Manifest Wire Contract
The manifest is stored as a stigmem fact under the instruction: scope (§21.4) and is also surfaced as a structured API resource. See §21.8.2 and §21.8.3.
§21.3 recall_instruction Tool Contract
§21.3.1 Request Shape
{
"intent": "I need to check out an issue and start work",
"max_chunks": 3,
"token_budget": 1200,
"manifest_hint": ["heartbeat-procedure", "checkout-procedure"]
}
- intent
- max_chunks
- token_budget
- manifest_hint
§21.3.2 Response Shape
{
"chunks": [
{
"name": "heartbeat-procedure",
"fact_uri": "instruction:acme/agent/cto/heartbeat-procedure/v3",
"content": "## Heartbeat Procedure\n\n...",
"tokens": 420,
"valid_until": "2027-05-04T00:00:00Z",
"version": "v3",
"score": 0.91,
"source": "stigmem"
}
],
"total_tokens": 420,
"truncated": false,
"missed_hints": [],
"audit_token": "audi_01J..."
}
audit_token is a first-class body field, not a header.
The agent must pass it back when submitting usage feedback. Embedding it in the body ensures it cannot be silently dropped by middleware.
chunks[].source is "stigmem" or "fallback_path". missed_hints lists manifest_hint names that were not found.
§21.3.3 Backing Implementation
recall_instruction MUST be implemented as a stigmem recall call restricted to the instruction: scope:
POST /v1/recall
{
"scope": "instruction:{deployment}/{agent_id}",
"intent": "{agent-provided intent string}",
"max_facts": "{max_chunks}",
"token_budget": "{token_budget}",
"weights": { "lexical": 0.35, "semantic": 0.50, "graph": 0.15 },
"require_garden_ids": ["{agent_instruction_garden_id}"]
}
The require_garden_ids constraint MUST be applied so that recall_instruction cannot return facts from gardens the agent is not authorized to read. If manifest_hint is provided, the named units MUST be included before ranked retrieval; missing/inaccessible hints are silently omitted and named in missed_hints.
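One way to picture the translation from a recall_instruction request to the backing recall call is the sketch below. `build_recall_body` is a hypothetical helper; the fallback values for `max_chunks` and `token_budget` are assumptions borrowed from the example request in §21.3.1, not normative defaults.

```python
from typing import Any, Dict

# Weights copied from the request template above.
RECALL_WEIGHTS = {"lexical": 0.35, "semantic": 0.50, "graph": 0.15}


def build_recall_body(
    deployment: str, agent_id: str, garden_id: str, request: Dict[str, Any]
) -> Dict[str, Any]:
    """Translate a recall_instruction request into a /v1/recall body."""
    intent = (request.get("intent") or "").strip()
    if not intent:
        raise ValueError("intent_required")
    return {
        "scope": f"instruction:{deployment}/{agent_id}",
        "intent": intent,
        "max_facts": request.get("max_chunks", 3),          # assumed default
        "token_budget": request.get("token_budget", 1200),  # assumed default
        "weights": dict(RECALL_WEIGHTS),
        # Always constrain recall to the agent's own instruction garden.
        "require_garden_ids": [garden_id],
    }
```

Setting `require_garden_ids` unconditionally, rather than from caller input, is what keeps the scope restriction from being bypassed by a crafted request.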
Guaranteed units (guarantee_load: true):
- Position. Guaranteed units MUST be appended after ranked results by default (attention primacy). A unit with force_position: "prepend" MUST be prepended. (S6) force_position: "prepend" MUST undergo explicit content review at publish time, recorded in provenance, and MUST require a distinct admin approval record separate from the general guarantee_load approval.
- Budget precedence. Guaranteed units MUST NOT be silently dropped by token_budget exhaustion. Ranked results are truncated first; guaranteed units are never reduced to zero.
- (S4) Agent cap. At most 5 manifest units per agent may have guarantee_load: true. Publish MUST be rejected with guarantee_cap_exceeded if exceeded.
- Relevance threshold. Implementations SHOULD warn if a guaranteed unit has empirical P(relevant | recall_invoked) < 0.6.
- Governance. Requires explicit administrator approval recorded in provenance metadata.
- (S5) Confidentiality note. Guaranteed units are accessible to any principal authorised to invoke recall_instruction for this agent, including via prompt injection. Content in guaranteed units MUST NOT rely on retrieval difficulty for confidentiality.
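The position and budget-precedence rules can be illustrated with a small assembly routine. This is a sketch, not the normative algorithm: `assemble_chunks` is a hypothetical name, each chunk is assumed to carry a `tokens` count, and stopping at the first ranked chunk that does not fit is one possible truncation policy.

```python
from typing import List


def assemble_chunks(
    ranked: List[dict], guaranteed: List[dict], token_budget: int
) -> dict:
    """Merge ranked results with guaranteed units under a token budget."""
    # Budget precedence: guaranteed units are reserved first, never dropped.
    reserved = sum(c["tokens"] for c in guaranteed)
    remaining = max(token_budget - reserved, 0)
    guaranteed_names = {c["name"] for c in guaranteed}
    kept, truncated = [], False
    for chunk in ranked:  # assumed already ordered by descending score
        if chunk["name"] in guaranteed_names:
            continue  # avoid returning the same unit twice
        if chunk["tokens"] <= remaining:
            kept.append(chunk)
            remaining -= chunk["tokens"]
        else:
            truncated = True
            break  # keep a clean rank prefix
    prepend = [c for c in guaranteed if c.get("force_position") == "prepend"]
    append = [c for c in guaranteed if c.get("force_position") != "prepend"]
    chunks = prepend + kept + append
    return {
        "chunks": chunks,
        "total_tokens": sum(c["tokens"] for c in chunks),
        "truncated": truncated,
    }
```

Note that when guaranteed units alone exceed `token_budget`, this sketch still returns them and lets `total_tokens` exceed the budget, since the rules forbid dropping them silently.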
§21.3.4 Determinism and Auditability
The same (intent, manifest_hint, max_chunks, token_budget) tuple MUST produce the same ordered result, given the same set of instruction facts at the same valid_until boundaries. This determinism property enables replay-based audit.
Implementations MUST record every recall_instruction invocation in the discovery audit table before returning. Audit-write failure is logged as audit_write_failed but MUST NOT block the response (audit is best-effort).
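A replay harness needs a stable key for the request tuple and an order-sensitive comparison of results. The sketch below shows one possible approach; `recall_fingerprint` and `replay_matches` are hypothetical helpers, and hashing the canonical JSON form is an assumption rather than something this section prescribes.

```python
import hashlib
import json
from typing import List, Optional


def recall_fingerprint(
    intent: str,
    manifest_hint: Optional[List[str]],
    max_chunks: int,
    token_budget: int,
) -> str:
    """Stable digest of the request tuple, usable as a replay key."""
    canonical = json.dumps(
        {
            "intent": intent,
            "manifest_hint": list(manifest_hint or []),  # order is preserved
            "max_chunks": max_chunks,
            "token_budget": token_budget,
        },
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(recorded_names: List[str], replayed_names: List[str]) -> bool:
    """Determinism check: same chunks in the same order."""
    return recorded_names == replayed_names
```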
§21.4 instruction: Scope Semantics
The instruction: URI scheme is a reserved stigmem scope for agent instruction artifacts. It extends §17 (Memory Garden) and §19 (Federation Trust) with instruction-specific semantics.
§21.4.1 Scope Namespace
instruction:{deployment}/{agent_id}/{unit_name}/{version}
- {deployment}: entity_uri root in the org manifest.
- {agent_id}: agent identifier (e.g. cto).
- {unit_name}
- {version}: e.g. v1, v3; MUST be monotonically incrementing; MUST NOT be a floating alias like latest.
The special URI instruction:{deployment}/{agent_id}/manifest/{version} addresses the agent's instruction manifest itself.
§21.4.2 Versioning
Instruction facts are mutable as a series; individual versioned facts are immutable once written.
A new version MUST be written as a new fact rather than mutating an
existing fact. The previous version's valid_until MUST be set to
the new version's created_at within the same transaction or a
30-second grace window.
Latest version wins
recall_instruction MUST return only the latest version (highest version string by semantic version ordering).
Per-heartbeat cache
Agents MAY cache instruction chunks for the duration of a heartbeat/session.
No cross-heartbeat cache
Cached chunks are not reused across heartbeats unless valid_until extends past the next expected heartbeat time.
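Selecting the latest version needs a numeric comparison, because plain string ordering ranks v9 above v10. A minimal sketch, assuming versions follow the vN form from §21.4.1 (the helper names are hypothetical):

```python
import re
from typing import Iterable

_VERSION_RE = re.compile(r"^v(\d+)$")


def version_number(version: str) -> int:
    """Parse 'v3' -> 3; reject floating aliases such as 'latest'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"invalid instruction version: {version!r}")
    return int(match.group(1))


def latest_version(versions: Iterable[str]) -> str:
    """Pick the highest version numerically, so 'v10' beats 'v9'."""
    return max(versions, key=version_number)
```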
§21.4.3 Provenance
- source_trust: source_trust ≥ 0.9.
- attestation_chain
- derived_from: null valid for first version.
- authored_by: entity_uri of the human or system that created this version.
- authored_at
§21.4.4 Garden Membership
All instruction facts MUST be placed in a dedicated instruction garden:
garden_id: "instruction:{deployment}:{agent_id}"
Agent: read-only
The agent itself, via capability token with verb read.
Admins: read + write
Via admin API key.
Peer agents: no access
MUST NOT have read access to another agent's instruction garden unless explicitly granted.
§21.4.5 Cross-Agent Confidentiality
Cross-agent instruction access is a confidentiality boundary.
Capability tokens granting read on another agent's instruction
scope MUST NOT be derivable unless the requesting agent's role is a
declared supervisor in the org manifest. Federation MUST NOT
replicate instruction-scope facts to peer nodes unless the receiving
node is in the same deployment trust domain. Cross-agent recall
attempts MUST return 403 instruction_scope_denied. Audit logs MUST
NOT be surfaced to peer agents.
§21.5 Discovery Audit
§21.5.1 Audit Record Shape
{
"id": "audevent_01J...",
"agent_id": "8e0ed057-bcd8-4f8f-92ee-c046c55b64e9",
"heartbeat_id": "run_ad74de74...",
"session_start": "2026-05-04T12:00:00Z",
"intent": "I need to check out an issue and start work",
"loaded_chunks": ["heartbeat-procedure", "checkout-procedure"],
"used_chunks": ["heartbeat-procedure"],
"missed_chunks": [],
"audit_token": "audi_01J...",
"audit_closed": "2026-05-04T12:01:05Z",
"created_at": "2026-05-04T12:00:02Z"
}
The four-way comparison (intent → loaded_chunks → used_chunks → missed_chunks) is the raw input for the evaluation metrics defined in §21.5.3. used_chunks and missed_chunks MAY be populated by the runtime or by the agent via self-report at heartbeat end.
§21.5.2 Audit Submission API
POST /v1/instruction/audit
Authorization: Bearer <agent api-key>
Content-Type: application/json
{
"audit_token": "audi_01J...",
"used_chunks": ["heartbeat-procedure"],
"missed_chunks": []
}
→ 204 No Content on success
→ 400 audit_token_invalid if token not recognized or already fully closed
→ 400 audit_token_expired if token is older than 24 hours
The audit endpoint MUST be idempotent.
A second submission with the same audit_token MUST return 204
without modifying the record.
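The status codes and the idempotency rule can be sketched with an in-memory stand-in for the audit table. `AuditStore` is hypothetical, and the sketch makes one interpretive assumption: a repeat submission for an already-closed record returns 204 without changes, which is how it reads the idempotency rule alongside the "already fully closed" case.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple

TOKEN_TTL = timedelta(hours=24)


class AuditStore:
    """In-memory stand-in for the instruction_audit table."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def open(self, audit_token: str, created_at: datetime) -> None:
        """Record a recall_instruction invocation awaiting feedback."""
        self.rows[audit_token] = {
            "created_at": created_at,
            "audit_closed": None,
            "used_chunks": [],
            "missed_chunks": [],
        }

    def submit(
        self, audit_token: str, used: List[str], missed: List[str], now: datetime
    ) -> Tuple[int, Optional[str]]:
        """Return (status, error_code) for POST /v1/instruction/audit."""
        row = self.rows.get(audit_token)
        if row is None:
            return 400, "audit_token_invalid"
        if row["audit_closed"] is not None:
            return 204, None  # idempotent repeat: record left untouched
        if now - row["created_at"] > TOKEN_TTL:
            return 400, "audit_token_expired"
        row.update(
            used_chunks=list(used), missed_chunks=list(missed), audit_closed=now
        )
        return 204, None
```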
§21.5.3 Replay-Based Evaluation
- used_chunks that appear in loaded_chunks within rank k.
- used_chunk was in loaded_chunks.
- miss_rate > 0.15 over 100+ events.
These metrics SHOULD be computed over a rolling 7-day window. Determinism (§21.3.4) guarantees the replay is reproducible.
Known limitation: endogeneity of used_chunks (non-normative). All three metrics are computed relative to used_chunks, which is itself derived from agent behavior during the heartbeat being measured. An agent that chronically fails to load a required instruction unit will never reference it, so the unit will never appear in used_chunks. The chronic miss is therefore invisible to all three live-audit metrics. This is an accepted limitation.
§21.5.4 Probe-Set Eval (follow-on, non-normative)
To complement the endogenous live-audit metrics with an exogenous coverage signal, implementations SHOULD maintain a probe set: a curated list of (intent, required_units) pairs administered independently of the live agent. After every manifest update and on a periodic schedule (e.g. daily), run recall_instruction(intent) against each probe and compute Probe-coverage@k and Probe-hit@k.
§21.5.5 Probe-Set Coverage Sampling with Soft Score Lift (non-normative)
1. Probe-set construction at manifest publish time.
2. Background coverage audit runs daily and on every embedding-model version bump.
3. Soft score lift for coverage-critical units: score += log(1 + λ) where λ ≈ 0.15.
4. Coverage endpoint (§21.8.6).
5. Probe-set calibration with PII-stripped real heartbeat intents on a weekly cadence.
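The probe-set evaluation and the soft score lift can be sketched together. The exact definitions of Probe-coverage@k and Probe-hit@k are not given here, so the ones below are assumptions: coverage is the mean fraction of required units found in the top k, and hit is the fraction of probes with at least one required unit in the top k. `probe_metrics`, `soft_lift`, and the `recall` callable are hypothetical names.

```python
import math
from typing import Callable, Dict, List, Tuple

Probe = Tuple[str, List[str]]  # (intent, required_units)


def probe_metrics(
    probes: List[Probe], recall: Callable[[str], List[str]], k: int = 3
) -> Dict[str, float]:
    """Compute assumed probe coverage and hit rates over a probe set."""
    if not probes:
        raise ValueError("empty probe set")
    coverage_sum, hits = 0.0, 0
    for intent, required in probes:
        top_k = set(recall(intent)[:k])
        found = [u for u in required if u in top_k]
        # max(..., 1) guards against a probe with no required units.
        coverage_sum += len(found) / max(len(required), 1)
        hits += 1 if found else 0
    n = len(probes)
    return {"probe_coverage_at_k": coverage_sum / n, "probe_hit_at_k": hits / n}


def soft_lift(score: float, coverage_critical: bool, lam: float = 0.15) -> float:
    """Apply score += log(1 + lambda) to coverage-critical units only."""
    return score + math.log1p(lam) if coverage_critical else score
```

With λ = 0.15 the lift adds roughly 0.14 to a unit's score, enough to reorder near-ties without overriding a clearly better match.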
§21.6 Migration Semantics
§21.6.1 Co-existence Period
- If a manifest entry has both fact_uri and path, fact_uri MUST take precedence.
- If fact_uri lookup fails, the runtime MUST fall back to path if present and MUST append "source": "fallback_path".
- File-path entries are read-only.
- The boot stub MUST indicate migration state via migration_mode: "file", "coexistence", or "stigmem".
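The precedence and fallback rules above can be sketched as a single resolver. `load_unit`, `fetch_fact`, and `read_path` are hypothetical names; both callables are assumed to return `None` when the content cannot be read.

```python
from typing import Callable, Optional


def load_unit(
    entry: dict,
    fetch_fact: Callable[[str], Optional[str]],
    read_path: Callable[[str], Optional[str]],
) -> Optional[dict]:
    """Resolve a manifest entry, preferring fact_uri over path."""
    fact_uri, path = entry.get("fact_uri"), entry.get("path")
    if fact_uri:
        content = fetch_fact(fact_uri)
        if content is not None:
            return {"name": entry["name"], "content": content, "source": "stigmem"}
    if path:
        content = read_path(path)  # file-path entries are read-only
        if content is not None:
            return {
                "name": entry["name"],
                "content": content,
                "source": "fallback_path",
            }
    return None  # caller decides whether this is a warning or fatal
```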
§21.6.2 Deprecation Path
- Stage 1: write instruction: facts; verify recall quality.
- Stage 2: add fact_uri alongside existing path.
- Stage 3 (Verify): observe until miss_rate < 0.10.
- Stage 4: remove path from the manifest entry.
- Stage 5: move the legacy file to docs/legacy-instructions/ with a redirect comment.
Deployments MUST NOT skip Stage 3 (Verify) for agents that handle sensitive operational decisions.
The risk of an undetected miss in a security-relevant instruction unit is higher than the cost of a 7-day observation window.
§21.6.3 Bulk Migration Tooling
Implementations SHOULD provide a stigmem migrate-instructions CLI that reads existing markdown, splits at H2/H3 boundaries, writes each section as an instruction: fact, and emits a manifest entries array for review. It MUST NOT automatically update the manifest or boot stub.
§21.7 Schema Migrations
Three tables support the lazy instruction layer:
instruction_manifests
Versioned snapshots of each agent's manifest. Previous versions retained with superseded_at.
instruction_audit
Append-only log backing the discovery audit. One row per recall_instruction invocation.
boot_stubs
Caches the rendered boot stub per (agent_id, adapter_profile). Invalidated on manifest version change.
CREATE TABLE IF NOT EXISTS instruction_manifests (
id TEXT PRIMARY KEY,
agent_id TEXT NOT NULL,
version TEXT NOT NULL,
fact_uri TEXT NOT NULL,
token_count INTEGER NOT NULL,
body TEXT NOT NULL,
created_at INTEGER NOT NULL,
superseded_at INTEGER,
UNIQUE(agent_id, version)
);
CREATE INDEX IF NOT EXISTS idx_manifests_agent ON instruction_manifests (agent_id, superseded_at NULLS FIRST);
CREATE TABLE IF NOT EXISTS instruction_audit (
id TEXT PRIMARY KEY,
agent_id TEXT NOT NULL,
heartbeat_id TEXT NOT NULL,
session_start INTEGER NOT NULL,
intent TEXT NOT NULL,
loaded_chunks TEXT NOT NULL,
used_chunks TEXT NOT NULL DEFAULT '[]',
missed_chunks TEXT NOT NULL DEFAULT '[]',
audit_token TEXT NOT NULL UNIQUE,
audit_closed INTEGER,
created_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_audit_agent_session ON instruction_audit (agent_id, session_start DESC);
CREATE INDEX IF NOT EXISTS idx_audit_token ON instruction_audit (audit_token);
CREATE TABLE IF NOT EXISTS boot_stubs (
agent_id TEXT NOT NULL,
adapter_profile TEXT NOT NULL DEFAULT 'generic',
stub_version INTEGER NOT NULL DEFAULT 1,
body TEXT NOT NULL,
token_count INTEGER NOT NULL,
generated_at INTEGER NOT NULL,
manifest_version TEXT NOT NULL,
PRIMARY KEY (agent_id, adapter_profile)
);
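The supersession behaviour of instruction_manifests can be tried out with Python's built-in sqlite3 module. This sketch assumes a SQLite-compatible dialect, which the section does not state, and copies only the table definition (not the index). `publish_manifest` and `current_version` are hypothetical helpers; a unique-constraint failure here would plausibly map to manifest_version_conflict, though that mapping is also an assumption.

```python
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS instruction_manifests (
  id TEXT PRIMARY KEY, agent_id TEXT NOT NULL, version TEXT NOT NULL,
  fact_uri TEXT NOT NULL, token_count INTEGER NOT NULL, body TEXT NOT NULL,
  created_at INTEGER NOT NULL, superseded_at INTEGER,
  UNIQUE(agent_id, version)
);
"""


def publish_manifest(db, row_id, agent_id, version, fact_uri, token_count, body, now):
    """Insert a new snapshot and supersede the previous one atomically."""
    with db:  # one transaction: commit on success, roll back on error
        db.execute(
            "UPDATE instruction_manifests SET superseded_at = ? "
            "WHERE agent_id = ? AND superseded_at IS NULL",
            (now, agent_id),
        )
        db.execute(
            "INSERT INTO instruction_manifests "
            "(id, agent_id, version, fact_uri, token_count, body, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (row_id, agent_id, version, fact_uri, token_count, body, now),
        )


def current_version(db, agent_id):
    """Return the version whose superseded_at is still NULL, if any."""
    row = db.execute(
        "SELECT version FROM instruction_manifests "
        "WHERE agent_id = ? AND superseded_at IS NULL",
        (agent_id,),
    ).fetchone()
    return row[0] if row else None
```

Because both statements run in one transaction, a rejected insert also rolls back the supersession of the previous version.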
§21.8 Wire Format Additions
§21.8.1 Get Boot Stub (MUST)
GET /v1/agents/{agent_id}/boot-stub[?profile={adapter_profile}]
Authorization: Bearer <agent api-key or admin api-key>
→ 200 Content-Type: text/markdown
X-Stub-Version: 1
X-Manifest-Version: v3
X-Token-Count: 420
[stub body]
→ 403 if caller is not the agent itself or an admin
→ 404 if agent not found or no stub generated yet
If profile is absent, MUST default to generic. Unknown profiles MUST be treated as generic.
§21.8.2 Get Instruction Manifest (MUST)
GET /v1/agents/{agent_id}/instruction-manifest
Authorization: Bearer <agent api-key or admin api-key>
→ 200 {
"manifest_version": "v3",
"fact_uri": "instruction:acme/agent/cto/manifest/v3",
"token_count": 840,
"entries": [ ...entry objects... ],
"last_updated_at": "2026-05-04T00:00:00Z"
}
→ 403 if caller is not the agent itself or an admin
→ 404 if no manifest configured for agent
§21.8.3 Publish / Replace Instruction Manifest (MUST)
PUT /v1/agents/{agent_id}/instruction-manifest
Authorization: Bearer <admin api-key>
Content-Type: application/json
{
"version": "v4",
"entries": [ ...entry objects... ],
"skip_coverage_gate": false
}
→ 200 { "fact_uri": "...", "token_count": 910, "coverage_report": [...] }
→ 400 manifest_too_large
→ 400 manifest_entry_invalid
→ 400 manifest_coverage_failure
→ 400 task_type_unknown
→ 400 guarantee_cap_exceeded
→ 400 task_types_approval_required
→ 409 manifest_version_conflict
Augmented manifest coverage gate (Approach A):
- For every string in load_triggers.intents, generate N=5 paraphrases using lexically and syntactically diverse augmentation (MUST NOT use the retrieval encoder's own nearest-neighbour space as the sole paraphrase source).
- Run recall_instruction(paraphrase) for each generated paraphrase; check whether this unit appears in top-k results (default k=3).
- Compute coverage_pct = (paraphrases where unit in top-k) / (total paraphrases).
- If coverage_pct < 0.80 for any unit, the entire publish MUST be rejected with manifest_coverage_failure.
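The coverage computation and the all-or-nothing gate can be sketched as follows. Paraphrase generation is out of scope here: `paraphrases` is assumed to map each unit name to its already-generated paraphrases, and `recall` is a hypothetical callable returning ranked unit names.

```python
from typing import Callable, Dict, List

COVERAGE_THRESHOLD = 0.80


def coverage_report(
    paraphrases: Dict[str, List[str]],
    recall: Callable[[str], List[str]],
    k: int = 3,
) -> List[dict]:
    """Compute coverage_pct per unit from its generated paraphrases."""
    report = []
    for unit, phrases in paraphrases.items():
        hits = sum(1 for p in phrases if unit in recall(p)[:k])
        pct = hits / len(phrases) if phrases else 0.0
        report.append({"name": unit, "coverage_pct": pct})
    return report


def gate(report: List[dict]) -> str:
    """Reject the whole publish if any unit falls below the threshold."""
    if any(u["coverage_pct"] < COVERAGE_THRESHOLD for u in report):
        return "manifest_coverage_failure"
    return "ok"
```

With N=5 paraphrases per intent, a unit passes only when at least 4 of 5 paraphrases retrieve it in the top k.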
(S7) Two-admin co-sign for skip_coverage_gate on guaranteed-unit manifests.
When skip_coverage_gate: true is used on a manifest containing any
guarantee_load: true entry, the bypass provenance record MUST
include co-signatures from at least two distinct administrators
(two distinct authored_by entity URIs). Single-admin bypass is
permitted only for manifests with no guarantee_load: true entries.
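The co-sign rule reduces to counting distinct administrators. A minimal sketch, with a hypothetical `bypass_allowed` helper that takes the authored_by entity URIs from the bypass provenance record:

```python
from typing import Iterable, List


def bypass_allowed(entries: List[dict], cosigners: Iterable[str]) -> bool:
    """Check whether skip_coverage_gate may be honoured for this manifest."""
    distinct = {c for c in cosigners if c}  # duplicates count once
    has_guaranteed = any(e.get("guarantee_load") is True for e in entries)
    required = 2 if has_guaranteed else 1
    return len(distinct) >= required
```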
(S10) Implementations SHOULD automatically schedule
re-certification within 7 days when skip_coverage_gate: true is used.
(S8) Paraphrase generator data boundary.
Paraphrase generation input MUST be limited to load_triggers.intents
strings only. Instruction fact content MUST NOT be sent to any
external paraphrase generation service. External services MUST be
listed in the deployment's trust manifest with an appropriate DPA.
Implementations SHOULD prefer local, deterministic paraphrase
methods for confidential instruction content.
Re-certification: When the deployment's embedding model version changes, all existing manifests MUST be re-certified through this gate before the new model version is activated for production recall.
This route MUST atomically: (1) run coverage gate, (2) write manifest fact, (3) update instruction_manifests table, (4) invalidate boot stub cache.
§21.8.4 Recall Instructions (MUST)
POST /v1/agents/{agent_id}/recall-instruction
Authorization: Bearer <agent api-key>
Content-Type: application/json
{ "intent": "...", "max_chunks": 3, "token_budget": 1200, "manifest_hint": ["heartbeat-procedure"] }
→ 200 { ...response shape per §21.3.2... }
→ 400 intent_required
→ 403 instruction_scope_denied
→ 404 if agent not found
→ 503 recall_backend_unavailable (retryable)
§21.8.5 Submit Discovery Audit (SHOULD)
The wire-level route for §21.5.2. This is a SHOULD (not MUST) because the audit is best-effort.
§21.8.6 Get Manifest Coverage Report (SHOULD)
GET /v1/agents/{agent_id}/instruction-manifest/coverage
Authorization: Bearer <agent api-key or admin api-key>
Agent-key response:
→ 200 {
"manifest_version": "v4",
"embedding_model_version": "nomic-embed-text-v1.5",
"evaluated_at": "2026-05-04T06:00:00Z",
"units": [ { "name": "...", "coverage_pct": 0.95, "hit_at_10": 0.92, "probe_count": 20, "last_evaluated_at": "..." } ]
}
Admin-key response: same as above, plus "coverage_status" field per unit.
→ 403 instruction_scope_denied
→ 404 if no manifest or no coverage report generated yet
(S9) Scope validation + (S11) Categorical label restriction.
The agent_id path parameter MUST be validated against the caller's
API key scope; peer-agent queries MUST return 403. The
coverage_status categorical label ("ok", "coverage_critical",
"not_evaluated") SHOULD be returned only in admin-key responses.
Agent-key responses SHOULD return only raw coverage_pct and
hit_at_10 values, omitting the categorical label — this limits the
retrieval-quality oracle surface for non-admin callers.
coverage_status values (admin-only): "ok" (hit@10 ≥ 0.4), "coverage_critical" (hit@10 < 0.4, soft-lift eligible), "not_evaluated" (probe run not yet completed).
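The label thresholds and the admin-only restriction can be sketched as two small functions. The helper names are hypothetical, and a missing hit_at_10 value is assumed to mean the probe run has not completed.

```python
from typing import Optional

CRITICAL_THRESHOLD = 0.4


def coverage_status(hit_at_10: Optional[float]) -> str:
    """Map a unit's hit@10 to the admin-only categorical label."""
    if hit_at_10 is None:
        return "not_evaluated"
    return "ok" if hit_at_10 >= CRITICAL_THRESHOLD else "coverage_critical"


def render_unit(unit: dict, is_admin: bool) -> dict:
    """Shape one units[] item for an agent-key or admin-key caller."""
    # Strip any stored label first so agent-key responses never leak it.
    out = {k: v for k, v in unit.items() if k != "coverage_status"}
    if is_admin:
        out["coverage_status"] = coverage_status(unit.get("hit_at_10"))
    return out
```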
§21.9 Error Reference
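- intent_required: intent field absent or empty.
- manifest_too_large: manifest exceeds the 1000-token budget (§21.2.1).
- manifest_entry_invalid: entry has neither fact_uri nor path, or has both.
- manifest_coverage_failure: a unit's coverage_pct fell below 0.80 in the coverage gate (§21.8.3).
- task_type_unknown: required_by_task_types value not in registered enum.
- guarantee_cap_exceeded: more than 5 manifest units with guarantee_load: true.
- task_types_approval_required: more than 2 required_by_task_types without admin approval.
- audit_token_invalid: token not recognized or already fully closed (§21.5.2).
- audit_token_expired: token is older than 24 hours (§21.5.2).
- instruction_scope_denied: caller is not authorized for the agent's instruction scope (§21.4.5).
- manifest_not_found
- boot_stub_not_found
- manifest_version_conflict: returned with 409 by manifest publish (§21.8.3).
- recall_backend_unavailable: recall backend unavailable; retryable (§21.8.4).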