Élan is a BEAM-native multi-agent runtime with durable state, git-native provenance, and policy-governed tool orchestration. It is designed for long-running autonomous systems that keep their promises even when machines do not.
Name origin: Élan comes from the French word "élan", meaning momentum or spirited energy.
Working prototype. The runtime core, recovery path, agent loop, embedding API, and recursive exploration API are implemented locally. Embedded-consumer hardening is now complete; the Epistemic Integrity Layer (Phase 1–2) is live: RAG grounding, ConfabulumGate, GroundtraceRecord telemetry, and CompetenceSignal are all wired, and per-step records are written to the audit_store_<run_id>.jsonl hash chain.
Agents restart with explicit state, checkpoints, and idempotent side effects so that a crash never becomes data loss.
A full audit trail of events, decisions, and tool usage gives you a clear source of truth.
Supervised agents coordinate through durable tasks, leases, retries, and explicit provenance boundaries.
Élan runs one agent per supervised process, backed by an event log, checkpoint store, task
graph, and message log. Agents use gen_statem for explicit transitions, a bounded
tool-calling loop for reasoning, callback-based step streaming for callers, and git branches
plus worktrees for provenance isolation. The same runtime can run under the CLI/TUI or be
embedded as a library in another OTP application.
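As a rough illustration of explicit transitions combined with a bounded loop, the sketch below uses OTP's `:gen_statem` in `state_functions` mode. The module name `AgentSketch`, the states (`idle`, `reasoning`, `done`), and the `step/1` call are hypothetical and are not taken from Élan's codebase; it only shows how a step budget can be enforced by the state machine itself.

```elixir
defmodule AgentSketch do
  # Hypothetical example module; not part of Élan.
  @behaviour :gen_statem

  # max_steps is assumed to be >= 1.
  def start_link(max_steps), do: :gen_statem.start_link(__MODULE__, max_steps, [])
  def step(pid), do: :gen_statem.call(pid, :step)

  @impl true
  def callback_mode, do: :state_functions

  @impl true
  def init(max_steps), do: {:ok, :idle, %{steps_left: max_steps}}

  # idle -> reasoning on the first step request.
  def idle({:call, from}, :step, data) do
    {:next_state, :reasoning, data, [{:reply, from, :started}]}
  end

  # The last allowed step moves the agent to :done, so the loop is bounded.
  def reasoning({:call, from}, :step, %{steps_left: 1} = data) do
    {:next_state, :done, %{data | steps_left: 0}, [{:reply, from, :finished}]}
  end

  def reasoning({:call, from}, :step, %{steps_left: n} = data) do
    {:keep_state, %{data | steps_left: n - 1}, [{:reply, from, {:continue, n - 1}}]}
  end

  # Once done, further step requests are refused without changing state.
  def done({:call, from}, :step, _data) do
    {:keep_state_and_data, [{:reply, from, :already_done}]}
  end
end

{:ok, pid} = AgentSketch.start_link(2)
AgentSketch.step(pid)  # :started
AgentSketch.step(pid)  # {:continue, 1}
AgentSketch.step(pid)  # :finished
```

Because every transition is a separate function clause, an unexpected event in a given state fails loudly instead of being silently absorbed, which is one reason a state machine suits supervised agents.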
The BenchArena/stack execution layer now runs a 4-step protocol with formal epistemic guarantees at each stage:
Question
   ↓
[Decompose] ─── ConfabulumRate gate ─── halt?
   ↓
[RAG Retrieve] (Perplexity sonar × N sub-questions, citation-grounded)
   ↓
[Verify + grounded context] ─── ConfabulumRate gate ─── halt?
   ↓
[Synthesise] ─── ConfabulumRate gate ─── halt?
   ↓
Answer (confidence_score, certainty_vocab, GroundtraceRecord)
9-type taxonomy applied after every step. gate/2 returns {:pass, score} or {:halt, type, score}. Synthesis is blocked on halt. Threshold: 0.65 per type.
Per-step signed record emitted to audit_store_<run_id>.jsonl. Each record carries prev_record_hash — tamper-evident by construction.
Phases 1A–1C close the TruthfulQA regression (stack 53.3% vs standard 100%) by adding retrieval grounding, confabulation gating, and confidence propagation — transforming the stack from “formally interesting” to “demonstrably trustworthy.”
The stack adapter’s 3-step protocol was extended to a 4-step protocol: decompose → retrieve → verify → synthesise.
sonar retrieval call (up to 5 sub-questions, capped for latency).
citations URLs augment the verify-step prompt as grounded context blocks.
Promoted from post-hoc classifier to in-pipeline gate. Synthesis is blocked on any halt signal.
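The control flow of the 4-step protocol can be sketched with `Enum.reduce_while/3`: each step feeds the next, and a gate after the decompose, verify, and synthesise steps can stop the run before later steps execute. `PipelineSketch`, the step functions, and the gate function passed in are all placeholders invented for this example; the real adapter's API is not shown in this document.

```elixir
defmodule PipelineSketch do
  # Steps followed by a gate, per the protocol diagram.
  @gated [:decompose, :verify, :synthesise]

  # steps: keyword list of {name, fun}; each fun receives the previous output.
  # gate:  fun returning {:pass, score} or {:halt, type, score}.
  def run(question, steps, gate) do
    Enum.reduce_while(steps, {:ok, question}, fn {name, step}, {:ok, input} ->
      output = step.(input)

      if name in @gated do
        case gate.(output) do
          {:pass, _score} -> {:cont, {:ok, output}}
          # Halting here means no later step (including synthesis) runs.
          {:halt, type, score} -> {:halt, {:halted, name, type, score}}
        end
      else
        {:cont, {:ok, output}}
      end
    end)
  end
end

# Stub steps standing in for the real decompose/retrieve/verify/synthesise calls.
steps = [
  decompose: fn question -> [question] end,
  retrieve: fn subs -> Enum.map(subs, &{&1, "source"}) end,
  verify: fn pairs -> pairs end,
  synthesise: fn pairs -> "answer from #{length(pairs)} grounded sub-question(s)" end
]

PipelineSketch.run("q", steps, fn _output -> {:pass, 0.1} end)
# {:ok, "answer from 1 grounded sub-question(s)"}
```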
9 confabulum types:
gate/2 scores the answer across all 9 types using lightweight textual heuristics.
Returns {:pass, aggregate_score} if all types are below threshold (0.65).
Returns {:halt, worst_type, worst_score} if any type exceeds the threshold.
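The pass/halt decision can be sketched as follows. This is not the actual gate/2: the sketch starts from already-computed per-type scores, the type names are invented placeholders (the 9 real types are not listed here), and taking the aggregate as the arithmetic mean is an assumption.

```elixir
defmodule GateSketch do
  @threshold 0.65

  # scores: non-empty map of confabulum type => score in 0.0..1.0.
  # How each score is produced (the textual heuristics) is out of scope here.
  def decide(scores, threshold \\ @threshold) when map_size(scores) > 0 do
    {worst_type, worst_score} = Enum.max_by(scores, fn {_type, score} -> score end)

    if worst_score > threshold do
      {:halt, worst_type, worst_score}
    else
      # Assumption: the aggregate is the mean of the per-type scores.
      {:pass, Enum.sum(Map.values(scores)) / map_size(scores)}
    end
  end
end

# Placeholder type names, for illustration only.
GateSketch.decide(%{fabricated_citation: 0.1, unsupported_claim: 0.9})
# {:halt, :unsupported_claim, 0.9}
```

Checking only the worst type is enough to decide halting, since if the highest score is at or below the threshold then every other score is too.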
confidence_score: float() added to each SubTask, propagated as
min(own, min(deps)) — confidence degrades monotonically along dependency chains.
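A minimal sketch of the min(own, min(deps)) rule, assuming the dependencies form a DAG and that each dependency contributes its already-propagated confidence rather than its raw score. The `ConfidenceSketch` module and the `%{own: ..., deps: ...}` shape are illustrative, not the SubTask struct itself.

```elixir
defmodule ConfidenceSketch do
  # subtasks: map of id => %{own: float, deps: [id]}, assumed acyclic.
  def propagate(subtasks) do
    Map.new(subtasks, fn {id, _subtask} -> {id, effective(subtasks, id)} end)
  end

  defp effective(subtasks, id) do
    %{own: own, deps: deps} = Map.fetch!(subtasks, id)
    # min(own, min(deps)): fold min over the dependencies, starting from own.
    Enum.reduce(deps, own, fn dep, acc -> min(acc, effective(subtasks, dep)) end)
  end
end

subtasks = %{
  a: %{own: 0.9, deps: []},
  b: %{own: 0.8, deps: [:a]},
  c: %{own: 0.95, deps: [:b]}
}

ConfidenceSketch.propagate(subtasks)
# %{a: 0.9, b: 0.8, c: 0.8}
```

Subtask `c` is confident on its own (0.95) but inherits 0.8 from `b`, which is the monotonic degradation along the chain. For large graphs the recursion would want memoisation; it is omitted here for brevity.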
CompetenceSignal injects a vocabulary-appropriate prefix into the synthesis prompt, enforcing
honest hedging at the generation step.
Every BenchArena run — or production pipeline execution — produces a complete, signed, append-only GroundtraceRecord per SubTask. An auditor with the record can reconstruct the full execution path without access to the original runtime.
20-field immutable struct. Each field is content-addressed; the hash chain links every record to its predecessor.
%GroundtraceRecord{
record_id: UUID, # globally unique, content-addressed
run_id: String, # links to parent BenchArena/pipeline run
subtask_id: String, # SubTask.id from SemanticIR
adapter: atom(), # :stack | :agent_loop | :perplexity_standard
model_id: String, # e.g. "sonar-pro-20260401"
model_temperature: Float, # 0.0 for deterministic mode
prompt_hash: String, # SHA-256 of exact prompt sent
retrieved_sources: [%{url, title, retrieved_at, passage_hash}],
raw_response_hash: String, # SHA-256 of raw API response
tokens_in: Integer,
tokens_out: Integer,
latency_ms: Integer,
confabulum_verdict: ConfabulumVerdict, # Pass | Halt(type, score)
confidence_score: Float,
certainty_vocab: CertaintyVocabulary,
score: Float, # BenchArena score for this SubTask
timestamp_utc: DateTime,
prev_record_hash: String, # hash of previous record in chain
record_hash: String # SHA-256(all fields except record_hash)
}
prev_record_hash: any modification to a historical record invalidates all subsequent hashes.
valid_chain?/1 verifies the integrity of the full chain; proved in Lean 4 (chain_tamper_evident theorem).
verify_record/1 performs a single-record tamper check by recomputing and comparing the hash.
audit_store_<run_id>.jsonl: append-only JSON-Lines, written to bench_results/.
run_id, prev_record_hash, and record_count state.
emit/3 is called after each adapter step: it builds the record, appends it to AuditStore, and updates the hash chain.
[:bench_arena, :groundtrace, :emitted] telemetry event (logged at debug level).
The GroundtraceRecord schema provides the foundation. Rule 17a-4 compliance is a storage policy on top: configurable 6-year retention, non-deletable records, retrieval SLA. File-based initially; production deployment requires WORM storage (AWS S3 Object Lock or equivalent).
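To make the tamper-evidence property of the hash chain concrete, here is a small sketch of chained SHA-256 records. The function names echo verify_record/1 and valid_chain?/1, but this is not the project's implementation: the `ChainSketch` module, the all-zero genesis hash, and serialising via `:erlang.term_to_binary/1` over sorted fields are assumptions chosen to keep the example short.

```elixir
defmodule ChainSketch do
  # Assumed placeholder for the "previous hash" of the first record.
  @genesis String.duplicate("0", 64)

  # SHA-256 over all fields except :record_hash, in a deterministic key order.
  def hash(record) do
    binary = record |> Map.delete(:record_hash) |> Enum.sort() |> :erlang.term_to_binary()
    :crypto.hash(:sha256, binary) |> Base.encode16(case: :lower)
  end

  # Links the new record to the last one, then seals it with its own hash.
  def append(chain, fields) do
    prev =
      case List.last(chain) do
        nil -> @genesis
        last -> last.record_hash
      end

    record = Map.put(fields, :prev_record_hash, prev)
    chain ++ [Map.put(record, :record_hash, hash(record))]
  end

  # Single-record check: recompute the hash and compare.
  def verify_record(record), do: record.record_hash == hash(record)

  # Full-chain check: every record is intact and points at its predecessor.
  def valid_chain?(chain) do
    {ok?, _last_hash} =
      Enum.reduce(chain, {true, @genesis}, fn record, {ok?, prev} ->
        {ok? and record.prev_record_hash == prev and verify_record(record), record.record_hash}
      end)

    ok?
  end
end

chain =
  []
  |> ChainSketch.append(%{subtask_id: "s1", score: 1.0})
  |> ChainSketch.append(%{subtask_id: "s2", score: 0.5})

ChainSketch.valid_chain?(chain)  # true
```

Editing an early record fails its own check, and even if its record_hash were recomputed, the next record's prev_record_hash would no longer match.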
run_regression_truthfulqa.exs — target ≥ 95% on stack adapter; gate in CI on merge to main
5 compliance modules built into the Élan orchestration layer, covering all CRITICAL gaps identified in the Block.xyz-grade fintech compliance audit.
Immutable HMAC-chained event log with 6-year retention, chain integrity verification, and regulatory export API.
RBAC with kill-switch per agent class. Autonomous agents gated by risk tier and human review thresholds.
Full SR 11-7 model inventory: risk tiers, validation status, pre-deployment approval workflow, board reporting.
Cryptographic HMAC chain-of-custody. Every agent action is provably linked — tampering is detected at verification.
P0/P1 incidents auto-kill affected agents. Circuit breaker halts all operations. Automated RCA + SOC2-compliant reports.
4 legal-grade compliance modules built into the Élan orchestration layer, covering attorney-client privilege compartmentalisation (privilege protects confidential communications between lawyer and client), per-matter data residency (where data is physically stored and processed), EU AI Act QMS conformance, and IRAP (Information Security Registered Assessors Program, Australia's government information-security certification) Essential Eight controls.
Attorney-client privilege compartmentalisation at the orchestration layer. Zero-retention tagging for privileged context windows — privilege waiver is the #1 enterprise law firm sales blocker.
Per-matter jurisdiction tagging. Agent inference routes to the correct regional execution zone — PROTECTED matters enforced as AU-only. Magic Circle / Big Law data-residency gate satisfied.
Quality Management System for EU AI Act conformance. Risk management, training data governance, human oversight chain, conformity assessment — ahead of the August 2026 Annex III deadline.
ACSC Essential Eight at Maturity Level 3: MFA, application control, patch management, audit logging. ISM-2074 AI usage policy enforced — required for Australian government legal work.
Élan is ready for first-consumer proof. If you care about resilient agents and provable execution, explore the PRD, wire it into a host app, and help validate the live provider and persistence paths.