02 · Agentic Systems

Agentic Systems & DeFAI

Architecture, pipelines, safety, explainability: how autonomous agents plan, act, and fail.

AI Agents · DeFAI · Oversight · Pipelines · Safety
Mar 2026

By John Wright-Nyingifa · Product Designer building infrastructure for DeFi, DePIN, and autonomous agents.

ElizaOS: the WordPress for AI agents

Live Signal · March 2026

Agent token mcap: $2.79B (247 tokens, down 60-75% from peaks). ElizaOS: 17,800 stars, 343 releases. Olas: 3,302 agents, 672 daily active, 11.4M agent-to-agent txns. Fetch.ai/ASI Alliance: 2.7M agents. Anthropic research: agents exploit 56% of vulnerable smart contracts.

Agentic systems combine LLM-driven decision making with permissioned on-chain execution. The hard part isn't "can the model plan." It's how plans become safe transactions, how autonomy stays bounded by constraints humans can read, and how the system stays reliable under latency, reorgs, MEV, and cross-chain uncertainty.

Direct experience: Meko Protocol (cross-chain execution agent) and ROVA Protocol (robot ops via agents, Base Batches 003). Core tension: enough autonomy to be useful, enough constraint to be safe.

The Landscape

WHAT AGENTS DO ON-CHAIN (March 2026)

  ElizaOS          17,800 ★   "WordPress for Agents"   TS/Rust/Python
  Olas             3,302 agents  75% of Gnosis Safe txns  11.4M a2a txns
  Virtuals         $460M mcap   Agent Commerce Protocol  Physical AI
  Coinbase AgentKit  EVM + Sol  LangChain/ElizaOS        Framework-agnostic

  USE CASES
  ├─ Prediction markets        Olas dominates Gnosis
  ├─ Agent-to-agent hiring     11.4M transactions
  ├─ Trading / yield           Swaps, rebalancing
  └─ Social                    ElizaOS bots (Twitter/Discord)

Architecture

Goal → Task → Action

Strict hierarchy. Goal: "Maintain 30% stables, earn yield." Task: "Rebalance portfolio." Action: "Swap," "Approve," "Bridge." Prevents agents from inventing unsupported actions.
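A minimal sketch of the hierarchy, assuming the three actions named above form the whole action set. The class names and `parse_action` are hypothetical, not taken from any framework mentioned in this piece. It shows how a closed enum rejects anything the model invents.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Only the three actions named in the text; a real agent would list more.
    SWAP = "Swap"
    APPROVE = "Approve"
    BRIDGE = "Bridge"


@dataclass(frozen=True)
class Task:
    name: str
    actions: tuple[Action, ...]


@dataclass(frozen=True)
class Goal:
    description: str
    tasks: tuple[Task, ...]


def parse_action(raw: str) -> Action:
    """Map a model-proposed action name onto the closed set, or reject it."""
    try:
        return Action(raw)
    except ValueError:
        raise ValueError(f"unsupported action: {raw!r}") from None


goal = Goal(
    description="Maintain 30% stables, earn yield.",
    tasks=(Task("Rebalance portfolio", (Action.SWAP, Action.APPROVE, Action.BRIDGE)),),
)
```

Because the model's output is parsed into `Action` before anything else happens, an invented action such as "Liquidate" fails at the boundary instead of reaching the execution layer.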

Three planning loops

Strategic (hours/days): allocations. Tactical (minutes): venues, routes. Reactive (seconds): failures, re-quotes, nonce gaps. Fast loops only within tight constraints.

Evaluation gates

Every plan passes: Feasibility → Risk → Cost → Confidence. Four gates, not one "approve" button.
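One way to sketch the four gates is as an ordered list of named checks that stops at the first failure. The `Plan` fields and every threshold here are illustrative assumptions, not values from a real deployment.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Plan:
    feasible: bool       # e.g. balances and routes exist
    risk_score: float    # hypothetical scale: 0.0 (safe) to 1.0 (dangerous)
    cost_usd: float      # estimated gas plus fees
    confidence: float    # hypothetical scale: 0.0 to 1.0


Gate = Callable[[Plan], bool]

# Gate order follows the text; the thresholds are assumptions.
GATES: list[tuple[str, Gate]] = [
    ("Feasibility", lambda p: p.feasible),
    ("Risk", lambda p: p.risk_score <= 0.3),
    ("Cost", lambda p: p.cost_usd <= 25.0),
    ("Confidence", lambda p: p.confidence >= 0.8),
]


def first_failed_gate(plan: Plan) -> Optional[str]:
    """Return the name of the first gate the plan fails, or None if all pass."""
    for name, gate in GATES:
        if not gate(plan):
            return name
    return None
```

Returning the gate name rather than a bare boolean lets the interface say which of the four checks blocked a plan.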

EXECUTION PIPELINE

  Proposal → Simulation → Approval → Execution → Verification
      │            │            │           │            │
  Constraints   Dry-run,      Human or    Sequenced    Reconcile
  + expected    slippage,     auto-policy  + monitored  balances
  outcomes      fail prob.    thresholds

  CROSS-CHAIN (state machine)
  Source → Bridge dispatch → Dest verify → Dest exec → Settle

Agent lifecycle: Observe → Propose → Evaluate → Execute → Verify → Learn
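The five pipeline stages in the diagram above can be modeled as a small state machine. This is a sketch under the assumption that any failed stage moves the plan to a terminal `FAILED` state; the diagram does not say what happens on failure, so that state and the `advance` helper are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    PROPOSAL = "Proposal"
    SIMULATION = "Simulation"
    APPROVAL = "Approval"
    EXECUTION = "Execution"
    VERIFICATION = "Verification"
    FAILED = "Failed"  # assumption: not part of the original diagram


ORDER = [
    Stage.PROPOSAL,
    Stage.SIMULATION,
    Stage.APPROVAL,
    Stage.EXECUTION,
    Stage.VERIFICATION,
]


def advance(current: Stage, ok: bool) -> Stage:
    """Move to the next stage if the current one succeeded, else to FAILED."""
    if current in (Stage.VERIFICATION, Stage.FAILED):
        raise ValueError(f"{current.value} is terminal")
    if not ok:
        return Stage.FAILED
    return ORDER[ORDER.index(current) + 1]
```

Stages can only move forward one step at a time, so execution cannot be reached without passing simulation and approval first.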

Safety

Capability model

Explicit action set: Swap, Bridge, Lend, Stake, Claim, LP. Everything else blocked. Not a suggestion. An enforcement boundary.

Layered permissions

Wallet (keys) → Contract (protocols) → Method (functions) → Token (assets). Four layers, independently configurable.
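The four layers can be sketched as four independent allowlists checked in order. The `Call` and `Policy` types are hypothetical, and the identifiers a real system would use (addresses, function selectors) are simplified to plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    wallet: str
    contract: str
    method: str
    token: str


@dataclass(frozen=True)
class Policy:
    wallets: frozenset[str]
    contracts: frozenset[str]
    methods: frozenset[str]
    tokens: frozenset[str]

    def denied_layer(self, call: Call) -> Optional[str]:
        """Return the first layer that rejects the call, or None if allowed."""
        checks = [
            ("wallet", call.wallet, self.wallets),
            ("contract", call.contract, self.contracts),
            ("method", call.method, self.methods),
            ("token", call.token, self.tokens),
        ]
        for layer, value, allowed in checks:
            if value not in allowed:
                return layer
        return None
```

Each layer is its own set, so tightening the token list does not require touching the wallet or contract configuration.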

Spending limits in human language

"Max $50K per swap, 3% slippage ceiling, no new protocols." Not: max_amount: 50000, slippage: 0.03.
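A sketch of how the machine-readable limits might still be stored as config while the user only ever sees the sentence. The field names and the `describe` function are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendingLimits:
    max_swap_usd: int
    max_slippage: float        # fraction: 0.03 means 3%
    allow_new_protocols: bool


def describe(limits: SpendingLimits) -> str:
    """Render the limits as the sentence a user reads and approves."""
    amount = limits.max_swap_usd
    if amount >= 1000 and amount % 1000 == 0:
        amount_text = f"${amount // 1000}K"
    else:
        amount_text = f"${amount:,}"
    slippage_text = f"{limits.max_slippage * 100:g}%"
    if limits.allow_new_protocols:
        protocols_text = "new protocols allowed"
    else:
        protocols_text = "no new protocols"
    return f"Max {amount_text} per swap, {slippage_text} slippage ceiling, {protocols_text}."


print(describe(SpendingLimits(50_000, 0.03, False)))
```

Generating the sentence from the same object the enforcement layer reads keeps the displayed limit and the enforced limit from drifting apart.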

Sandbox progression

Read-only → Paper trading → Limited $ → Full execution. Trust earned, not given.

SANDBOX PROGRESSION

  Read-Only  →  Paper Trade  →  Limited $  →  Full Execution
  Analyze       Simulate        Capped         Earned trust
  No signing    No real txns    Full pipeline  Full autonomy
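The progression above can be sketched as an ordered enum with one-step promotion. The promotion rule (a count of clean runs, 50 by default) is an assumption; the source only says trust is earned.

```python
from enum import IntEnum


class Tier(IntEnum):
    READ_ONLY = 0     # analyze, no signing
    PAPER_TRADE = 1   # simulate, no real transactions
    LIMITED = 2       # capped amounts, full pipeline
    FULL = 3          # full autonomy


def can_sign(tier: Tier) -> bool:
    """Only the Limited and Full tiers may sign real transactions."""
    return tier >= Tier.LIMITED


def promote(tier: Tier, clean_runs: int, required: int = 50) -> Tier:
    """Move up exactly one tier once enough clean runs have accumulated."""
    if tier is Tier.FULL or clean_runs < required:
        return tier
    return Tier(tier + 1)
```

Promotion never skips a tier, so an agent cannot jump from read-only analysis to signing without a paper-trading record.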

What Goes Wrong

Frozen funds

Cross-chain action stalled mid-execution. User can't intervene. The agent holds tx context. Surface stuck states before users discover them.
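One way to surface stuck states early is a per-step timeout on the cross-chain state machine. The step names follow the pipeline diagram; the timeout values and the `is_stuck` helper are illustrative assumptions.

```python
# Seconds a cross-chain action may sit in each step before it is flagged.
# All values are assumptions; real bridges vary widely.
STEP_TIMEOUT_S = {
    "Source": 120,
    "Bridge dispatch": 900,
    "Dest verify": 300,
    "Dest exec": 120,
    "Settle": 120,
}


def is_stuck(step: str, entered_at: float, now: float) -> bool:
    """True when the action has been in `step` longer than its timeout."""
    return now - entered_at > STEP_TIMEOUT_S[step]
```

A monitor that calls `is_stuck` on every in-flight action can notify the user before they go looking for missing funds.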

Unexpected trades

Agent acted within constraints but surprised the user. Technically allowed, intuitively wrong. Preview/confirmation failed to set expectations.

Swarm cascades

Agent A hired Agent B (11.4M a2a txns on Olas). B failed. Both stuck in retry loops, burning gas. Distributed failure is harder to trace, stop, and explain.
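A minimal sketch of one mitigation: give every agent-to-agent call a retry budget capped on both attempts and gas. `RetryBudget`, `call_with_budget`, and the `hire` callable are hypothetical names, not part of the Olas stack.

```python
from typing import Callable


class RetryBudget:
    """Caps retries by attempt count and by total gas spent."""

    def __init__(self, max_attempts: int, max_gas_usd: float) -> None:
        self.max_attempts = max_attempts
        self.max_gas_usd = max_gas_usd
        self.attempts = 0
        self.gas_spent_usd = 0.0

    def record(self, gas_usd: float) -> None:
        self.attempts += 1
        self.gas_spent_usd += gas_usd

    def exhausted(self) -> bool:
        return (
            self.attempts >= self.max_attempts
            or self.gas_spent_usd >= self.max_gas_usd
        )


def call_with_budget(
    hire: Callable[[], tuple[bool, float]], budget: RetryBudget
) -> bool:
    """Retry `hire` until it succeeds or the budget runs out.

    `hire` returns (succeeded, gas_cost_usd) for one attempt.
    """
    while not budget.exhausted():
        ok, gas_usd = hire()
        budget.record(gas_usd)
        if ok:
            return True
    return False
```

When the budget runs out, the caller gets a definite `False` to report upstream instead of looping, which bounds the gas a failed hire can burn.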

Exploit via agent

56% of vulnerable contracts exploitable by agents (Anthropic). Automated damage is fast. Anomaly detection must be faster than the execution loop.

Kill switch tiers: pause one / pause all / emergency stop
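The three tiers can be sketched as a single guard that every execution path consults. How "pause all" differs from "emergency stop" is not spelled out above; this sketch assumes the emergency stop latches and cannot be cleared by an ordinary resume.

```python
class KillSwitch:
    """Three tiers: pause one agent, pause all agents, emergency stop."""

    def __init__(self) -> None:
        self.paused_agents: set[str] = set()
        self.all_paused = False
        self.emergency_stopped = False

    def pause_one(self, agent_id: str) -> None:
        self.paused_agents.add(agent_id)

    def pause_all(self) -> None:
        self.all_paused = True

    def emergency_stop(self) -> None:
        # Assumption: latching; resume() refuses to clear it.
        self.emergency_stopped = True

    def resume(self) -> None:
        if self.emergency_stopped:
            raise RuntimeError("emergency stop requires a manual reset")
        self.all_paused = False
        self.paused_agents.clear()

    def may_execute(self, agent_id: str) -> bool:
        return not (
            self.emergency_stopped
            or self.all_paused
            or agent_id in self.paused_agents
        )
```

Checking `may_execute` immediately before signing, rather than at planning time, means a pause takes effect even for plans already in the queue.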

Explainability

The hardest problem is showing why, not what.

English-first logs

"What I tried → what happened → what's next → what you should change." The log IS the main UI.
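A sketch of a log entry shaped around those four questions. The `LogEntry` type and its field names are assumptions, and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    tried: str
    happened: str
    next_step: str
    suggestion: str

    def render(self) -> str:
        """Format the entry as four plain-English lines."""
        return "\n".join([
            f"What I tried: {self.tried}",
            f"What happened: {self.happened}",
            f"What's next: {self.next_step}",
            f"What you should change: {self.suggestion}",
        ])


entry = LogEntry(
    tried="Swap 1,000 USDC for ETH",
    happened="Quote moved past the 3% slippage ceiling",
    next_step="Re-quote in 60 seconds",
    suggestion="Nothing yet",
)
print(entry.render())
```

Making all four fields required means an entry cannot be written without saying what comes next and what the user could change.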

Counterfactual reasoning

"Chose Route A over B: fees lower, trust tier higher." Shows the decision space, not just the decision.

Task diffs

Before state → After state → Delta. Makes complex rebalancing scannable at a glance.
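A minimal sketch of the delta step, assuming portfolio state is a mapping from asset symbol to USD value. The `task_diff` function and the sample portfolio are hypothetical.

```python
def task_diff(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    """Return per-asset change (after minus before), omitting unchanged assets."""
    assets = sorted(set(before) | set(after))
    deltas = {a: after.get(a, 0.0) - before.get(a, 0.0) for a in assets}
    return {a: d for a, d in deltas.items() if d != 0}


before = {"USDC": 2000.0, "ETH": 8000.0, "DAI": 500.0}
after = {"USDC": 3000.0, "ETH": 7000.0, "DAI": 500.0}
print(task_diff(before, after))
```

Dropping unchanged assets keeps the diff short: in this example DAI is omitted and only the USDC and ETH changes are shown.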

Five activity states

Idle · Monitoring · Queued · Active · Paused. Not just on/off.
