
AI Agent Reliability

Production patterns for building agents that survive the real world — idempotency, audit trails, approval gates, and clean architecture boundaries. Interviewers test whether you know how agents fail and how to design around those failures.

01

Agent Reliability

State Management

  • Track task progress to avoid duplicate actions and repeated onboarding steps
  • Store task state externally (DB, queue) so agents can resume after crashes
  • Use a task ID per unit of work to correlate logs, retries, and results
  • Idempotent state transitions: applying the same step twice must produce the same state
Do: Store task state in a durable external store, not in agent memory alone
Don't: Rely on in-process variables to track progress — a crash loses everything
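The checklist above can be sketched as follows. `TaskStore` and its methods are illustrative names, with an in-memory dict standing in for a real durable store such as a DB table keyed by task ID:

```python
class TaskStore:
    """Stand-in for a durable external store (e.g. a DB table keyed by task_id)."""

    def __init__(self):
        self._state = {}  # task_id -> {"step": int, "status": str}

    def load(self, task_id):
        # Resume point: a restarted agent reads progress instead of starting over.
        return self._state.get(task_id, {"step": 0, "status": "pending"})

    def complete_step(self, task_id, step):
        # Idempotent transition: applying the same step twice leaves the same state.
        record = self.load(task_id)
        if step <= record["step"]:
            return record  # already applied; no duplicate work
        record = {"step": step, "status": "running"}
        self._state[task_id] = record
        return record

store = TaskStore()
store.complete_step("task-42", 1)
store.complete_step("task-42", 1)  # replay after a crash: state unchanged
assert store.load("task-42")["step"] == 1
```

Because `complete_step` compares against the persisted step counter, a crashed-and-restarted agent can blindly replay its last step without duplicating work.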

Idempotency

  • External actions should have unique task IDs and safe retry behavior
  • Assign a stable idempotency key before any external write or API call
  • Check-then-act pattern: verify current state before performing a side effect
  • Idempotency is especially critical for emails, payments, and database writes
Do: Pass an idempotency key on every external write; test that double-submission is safe
Don't: Assume a failed request means the action didn't happen — it may have succeeded
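A minimal sketch of the key-plus-dedup pattern. `send_email`, `SEEN_KEYS`, and `SENT` are hypothetical stand-ins for a provider API and its server-side deduplication table:

```python
import uuid

SEEN_KEYS = set()  # stand-in for the provider's dedup table
SENT = []          # stand-in for actual delivered emails

def send_email(to, body, idempotency_key):
    # Server-side check-then-act: a replayed key performs no second side effect.
    if idempotency_key in SEEN_KEYS:
        return "duplicate-suppressed"
    SEEN_KEYS.add(idempotency_key)
    SENT.append((to, body))
    return "sent"

key = str(uuid.uuid4())  # generated once per logical action, reused on every retry
assert send_email("a@example.com", "hi", key) == "sent"
# Timed-out request: we don't know whether it landed, so retry with the SAME key.
assert send_email("a@example.com", "hi", key) == "duplicate-suppressed"
assert len(SENT) == 1    # exactly one email despite the retry
```

The key point is that the key is minted before the first attempt, so a retry after an ambiguous failure cannot produce a second side effect.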

Traceability

  • Record planner thoughts at a safe summary level, tool calls, outputs, and final decisions
  • Structured logs > unstructured text: include task_id, step, tool, args, result, latency
  • Trace IDs link planner, executor, and verifier spans for end-to-end visibility
  • Redact sensitive fields (PII, credentials) before logging tool inputs/outputs
Do: Log every tool call and model decision with a trace ID — debugging without traces is guesswork
Don't: Log raw tool outputs without redaction — API keys and PII end up in log stores
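A sketch of one structured, redacted log line. The field names follow the checklist above; `REDACT_FIELDS` and the tool name are illustrative:

```python
import json

REDACT_FIELDS = {"api_key", "password", "email"}  # illustrative denylist

def redact(args):
    # Scrub sensitive fields before anything touches the log store.
    return {k: ("[REDACTED]" if k in REDACT_FIELDS else v) for k, v in args.items()}

def log_tool_call(trace_id, task_id, step, tool, args, result, latency_ms):
    entry = {
        "trace_id": trace_id, "task_id": task_id, "step": step,
        "tool": tool, "args": redact(args), "result": result,
        "latency_ms": latency_ms,
    }
    return json.dumps(entry)  # one JSON object per line, greppable by trace_id

line = log_tool_call("tr-1", "task-42", 3, "crm.lookup",
                     {"user": "u1", "api_key": "sk-secret"}, "ok", 120)
assert "sk-secret" not in line
```

Because every span carries the same `trace_id`, planner, executor, and verifier entries can be joined end to end.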

Evals

  • Test task success, tool accuracy, hallucination rate, and safety violations
  • Evaluate trajectory quality (each step) and final outcome separately — both matter
  • Adversarial evals: include edge cases, ambiguous inputs, and hostile tool responses
  • Regression suite: run on every model/prompt change before deploying
Do: Measure intermediate step quality, not just final answer — agents fail mostly in the middle
Don't: Evaluate only final answer quality — agent systems fail mostly in intermediate steps
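A sketch of scoring trajectory and outcome separately. The trajectory format and the per-step checks are illustrative; real evals would use task-specific graders:

```python
def eval_run(trajectory, final_answer, expected):
    # Per-step score: did the agent use an allowed tool, and did the step succeed?
    step_scores = [1.0 if s["tool"] in s["allowed_tools"] and s["ok"] else 0.0
                   for s in trajectory]
    trajectory_score = sum(step_scores) / len(step_scores)
    outcome_score = 1.0 if final_answer == expected else 0.0
    # Reported separately: a right answer can hide a dangerous trajectory.
    return {"trajectory": trajectory_score, "outcome": outcome_score}

run = eval_run(
    trajectory=[
        {"tool": "search", "allowed_tools": {"search"}, "ok": True},
        {"tool": "email",  "allowed_tools": {"search"}, "ok": True},  # off-policy step
    ],
    final_answer="42", expected="42",
)
assert run == {"trajectory": 0.5, "outcome": 1.0}  # right answer, flawed trajectory
```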

Fallbacks & Recovery

  • Use deterministic workflows when agent confidence is low
  • Retry transient failures with exponential backoff; rollback bad actions
  • Escalate ambiguous cases to humans rather than guessing
  • Abort and alert after N consecutive failures; never loop indefinitely
Do: Design for failure: retries, deduplication, validation, rollback, and human escalation
Don't: Let agents retry forever on systematic errors — set a hard failure budget
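The retry rules above can be sketched as a bounded loop with exponential backoff. `TransientError` and the escalation return value are illustrative; a real system would page a human rather than return a string:

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (timeout, 503, rate limit)."""

def with_retries(action, max_attempts=3, base_delay=0.01):
    for attempt in range(max_attempts):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts - 1:
                return "escalated-to-human"       # budget exhausted: never loop forever
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff: 1x, 2x, 4x ...

calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TransientError
    return "ok"

def always_fails():
    raise TransientError

assert with_retries(flaky) == "ok"                       # transient: retries succeed
assert with_retries(always_fails) == "escalated-to-human"  # systematic: hard stop
```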
02

Tool Use & Permissions

Tool Registry

  • Defines available tools, schemas, permissions, cost, and risk level
  • Centralizes tool metadata so agents can discover, filter, and route calls safely
  • Tag tools by risk tier (read-only, write, destructive) to enforce policy
  • Versioned registry: changing a tool schema without bumping its version breaks agents silently
Do: Treat tools as controlled system capabilities, not 'things the model owns'
Don't: Let agents call any available API — they will eventually call the wrong one
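A sketch of a registry with versioning and risk tiers. Tool names, tiers, and the policy set are illustrative:

```python
REGISTRY = {
    "search_docs":   {"version": 1, "risk": "read-only"},
    "update_record": {"version": 1, "risk": "write"},
    "delete_record": {"version": 1, "risk": "destructive"},
}

ALLOWED_TIERS = {"read-only", "write"}  # policy: destructive tools go through approval

def resolve(tool_name):
    tool = REGISTRY.get(tool_name)
    if tool is None:
        # Off-registry APIs simply don't exist from the agent's point of view.
        raise KeyError(f"unknown tool: {tool_name}")
    if tool["risk"] not in ALLOWED_TIERS:
        return ("blocked", tool_name)
    return ("allowed", tool_name)

assert resolve("search_docs") == ("allowed", "search_docs")
assert resolve("delete_record") == ("blocked", "delete_record")
```

Routing every call through `resolve` is what makes the risk-tier policy enforceable rather than advisory.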

Function Calling

  • Structured tool invocation with typed arguments and deterministic validation
  • Validate all arguments against the schema before executing — never trust model output
  • Return structured responses; unstructured text wastes tokens and causes parse errors
  • Include an error field in every response schema so the agent knows how to recover
Do: Validate tool arguments server-side even if the model generated them
Don't: Assume the model will produce valid JSON — always parse and validate defensively
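Defensive parsing can be sketched as follows. The schema and the error strings are illustrative; the structural point is that nothing executes until the raw model output has been parsed and type-checked:

```python
import json

SCHEMA = {"to": str, "subject": str}  # illustrative argument schema

def validate_call(raw_model_output):
    # Step 1: the model output may not even be JSON.
    try:
        args = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return {"error": "invalid JSON; ask the model to retry"}
    # Step 2: valid JSON may still violate the schema.
    for field, ftype in SCHEMA.items():
        if not isinstance(args.get(field), ftype):
            return {"error": f"bad or missing field: {field}"}
    # Every response carries an error field so the agent knows how to recover.
    return {"error": None, "args": args}

ok = validate_call('{"to": "a@example.com", "subject": "hi"}')
bad = validate_call('{"to": 123}')
garbage = validate_call("Sure! Here's the JSON you asked for:")
assert ok["error"] is None
assert bad["error"] == "bad or missing field: to"
assert garbage["error"].startswith("invalid JSON")
```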

Sandboxing

  • Run code/file/network actions in isolated environments (Docker, microVMs)
  • Limit tool permissions to least privilege: read-only unless write is explicitly needed
  • Dry-run mode: simulate irreversible actions before committing
  • Network egress controls prevent exfiltration of data via unexpected outbound calls
Do: Scope tool permissions tightly and run code in isolated containers
Don't: Expose raw production APIs directly to the agent without guardrails
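The dry-run bullet can be sketched as a flag that defaults to the safe path. `delete_bucket` and the plan format are illustrative:

```python
def delete_bucket(name, dry_run=True):
    if dry_run:
        # Simulate: report what WOULD happen, with no side effect.
        return {"would_delete": name, "committed": False}
    # ... the real, sandboxed deletion would happen here ...
    return {"would_delete": name, "committed": True}

plan = delete_bucket("prod-backups")  # default is the simulation
assert plan == {"would_delete": "prod-backups", "committed": False}
```

Making `dry_run=True` the default means an agent must explicitly opt in to the irreversible path, typically only after an approval gate.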

Permission Model

  • User-level, role-level, tenant-level, and action-level access control
  • Agents should inherit, not exceed, the permissions of the user they act on behalf of
  • Scope tokens and API keys to the minimum actions needed for the task
  • Audit permission grants and revocations; alert on unexpected escalation
Do: Grant agents the minimum permissions required; re-check on every sensitive action
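The inheritance rule can be sketched as a lookup against the acting user's scopes. Scope names and the user table are illustrative:

```python
USER_SCOPES = {"alice": {"crm:read", "crm:write"}}  # illustrative user -> scopes table

def agent_can(user, required_scope):
    # Re-checked on every sensitive action, not cached at session start,
    # so a revoked grant takes effect immediately.
    return required_scope in USER_SCOPES.get(user, set())

assert agent_can("alice", "crm:write")
assert not agent_can("alice", "billing:refund")  # agent never exceeds the user
```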

Approval Gates & Audit Trail

  • Ask before sending emails, deleting data, charging money, or changing production state
  • Log every tool call, input, output, decision, and failure to an immutable audit log
  • Time-box approval windows: auto-reject if the human doesn't respond within N minutes
  • Approval gates are the last line of defense before irreversible side effects
Do: Require human approval for destructive, financial, or externally visible actions
Don't: Treat approval gates as optional UX — they are a hard safety boundary
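A sketch of a time-boxed gate that fails closed. `ask_human` stands in for a real approval UI that blocks until a reply or the timeout; all names are illustrative:

```python
def gated_action(action_name, risk, ask_human, timeout_s=300):
    if risk not in {"destructive", "financial", "external"}:
        return "executed"  # low-risk actions pass straight through
    # Blocks until the human replies, or returns None on timeout.
    decision = ask_human(action_name, timeout_s)
    if decision != "approved":
        return "rejected"  # timeout or explicit denial: fail closed
    return "executed"

assert gated_action("send_report", "read-only", ask_human=None) == "executed"
assert gated_action("refund_order", "financial",
                    ask_human=lambda name, t: "approved") == "executed"
assert gated_action("drop_table", "destructive",
                    ask_human=lambda name, t: None) == "rejected"  # no response in time
```

Auto-rejecting on timeout is the key choice: an unanswered request must never default to execution.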
03

Agent Architecture

Agent Loop

  • user goal → planner → tool selection → execution → observation → next step → final answer
  • Each iteration produces an observation that updates the agent's working state
  • Maximum step count and timeout are mandatory — unbounded loops will eventually hang
  • Checkpoint state after each step to enable resume, replay, and debugging
Do: Define the loop contract explicitly: inputs, outputs, termination conditions, and max iterations
Don't: Ship an agent loop without a step limit — it will hang or loop forever on edge cases
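The loop contract above can be sketched as follows. `plan_next` and `execute` are illustrative stand-ins for planner and executor calls, and the checkpoint is a shallow copy for illustration only:

```python
def run_agent(goal, plan_next, execute, max_steps=5):
    state = {"goal": goal, "observations": []}
    checkpoints = []
    for step in range(max_steps):          # hard bound: the loop cannot run forever
        action = plan_next(state)
        if action == "DONE":               # explicit termination condition
            return {"status": "success", "steps": step, "checkpoints": checkpoints}
        observation = execute(action)
        state["observations"].append(observation)
        checkpoints.append(dict(state))    # resume/replay point after every step
    # Budget exhausted: abort and surface it, never spin silently.
    return {"status": "aborted", "steps": max_steps, "checkpoints": checkpoints}

# A planner that never says DONE hits the hard limit instead of looping forever.
result = run_agent("demo", plan_next=lambda s: "noop", execute=lambda a: "ok")
assert result["status"] == "aborted" and result["steps"] == 5
```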

Planner

  • Breaks vague goals into concrete steps; should not directly perform risky actions
  • Outputs a structured plan (task list, DAG, or step sequence) for the executor
  • Dynamic replanning: the planner should revise the plan if an executor step fails
  • Planner prompts should be optimized for decomposition, not execution
Do: Separate planning from execution; the planner should never directly call side-effecting tools

Executor

  • Calls tools/APIs/functions under system control
  • Operates on a single step at a time; reports result (success/failure/observation) back to planner
  • Should not make planning decisions — escalate ambiguity back to the planner
  • Retry logic, timeout handling, and error formatting live in the executor layer
Do: Keep executor logic narrow: receive a step, call a tool, return a result
Don't: Let the executor improvise on tool selection — it should only call what the planner specified

Verifier

  • Checks outputs, catches errors, validates constraints before final response
  • Can be a separate LLM call, a rule-based check, or both
  • Verifier scope: factual correctness, safety policy compliance, format validity
  • Failed verification should trigger replanning, not silent delivery of bad output
Do: Separate planning, execution, and verification clearly
Don't: Let the LLM freely call tools without permissions, validation, or rollback
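The rule-based half of a verifier can be sketched as follows; an LLM judge could be layered on top. The constraint set (length cap, banned terms) is illustrative:

```python
def verify(output, max_len=200, banned=("password",)):
    failures = []
    if len(output) > max_len:
        failures.append("too long")                 # format validity check
    if any(word in output.lower() for word in banned):
        failures.append("policy violation")         # safety policy check
    # Failed verification routes back to the planner, never silently to the user.
    action = "deliver" if not failures else "replan"
    return {"ok": not failures, "failures": failures, "action": action}

assert verify("The password is hunter2")["action"] == "replan"
assert verify("All records updated.") == {"ok": True, "failures": [], "action": "deliver"}
```

Returning `"replan"` rather than raising keeps the decision in-band, so the loop can feed the failure reasons back into the next planning step.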

Memory & Human Approval

  • Memory/context stores useful state, but should be scoped and permission-aware
  • Human approval is required for destructive, financial, external, or irreversible actions
  • Memory should not store secrets or PII unless explicitly encrypted and access-controlled
  • Approval history should be persisted so agents don't re-ask for already-granted permissions
Do: Scope agent memory to the task; escalate to humans before any irreversible action