Cheatsheet
AI Agent Reliability
Production patterns for building agents that survive the real world — idempotency, audit trails, approval gates, and clean architecture boundaries. Interviewers test whether you know how agents fail and how to design around those failures.
01
Agent Reliability
State Management
- Track task progress to avoid duplicate actions and re-running already-completed steps
- Store task state externally (DB, queue) so agents can resume after crashes
- Use a task ID per unit of work to correlate logs, retries, and results
- Idempotent state transitions: applying the same step twice must produce the same state
✓
Store task state in a durable external store, not in agent memory alone
✗
Rely on in-process variables to track progress — a crash loses everything
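A minimal sketch of durable task state, assuming SQLite as the external store; the table and column names are illustrative, not a standard schema.

```python
# Durable task state so an agent can resume after a crash.
import sqlite3
import uuid

conn = sqlite3.connect("agent_state.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS tasks (
        task_id TEXT PRIMARY KEY,
        step    TEXT NOT NULL,
        status  TEXT NOT NULL   -- pending | running | done | failed
    )
""")

def start_task() -> str:
    """Create a new unit of work with a stable task ID."""
    task_id = str(uuid.uuid4())
    conn.execute("INSERT INTO tasks VALUES (?, 'init', 'pending')", (task_id,))
    conn.commit()
    return task_id

def advance(task_id: str, step: str) -> None:
    """Idempotent transition: re-applying the same step leaves the row unchanged."""
    conn.execute(
        "UPDATE tasks SET step = ?, status = 'running' WHERE task_id = ? AND step != ?",
        (step, task_id, step),
    )
    conn.commit()

def resume_pending() -> list[tuple]:
    """On restart, pick up every task that never reached 'done'."""
    return conn.execute("SELECT task_id, step FROM tasks WHERE status != 'done'").fetchall()
```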
Idempotency
- External actions should have unique task IDs and safe retry behavior
- Assign a stable idempotency key before any external write or API call
- Check-then-act pattern: verify current state before performing a side effect
- Idempotency is especially critical for emails, payments, and database writes
✓
Pass an idempotency key on every external write; test that double-submission is safe
✗
Assume a failed request means the action didn't happen — it may have succeeded
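A sketch of an idempotent external write, assuming a hypothetical payments endpoint (`api.example.com`) that honors an `Idempotency-Key` header, as Stripe-style APIs commonly do.

```python
import requests

API = "https://api.example.com/v1"   # hypothetical endpoint

def charge_once(task_id: str, amount_cents: int) -> dict:
    # Stable key derived from the task, so every retry of this step reuses it.
    key = f"{task_id}:charge:{amount_cents}"
    try:
        resp = requests.post(
            f"{API}/charges",
            json={"amount": amount_cents},
            headers={"Idempotency-Key": key},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException:
        # Check-then-act: a failed request may still have succeeded server-side.
        existing = requests.get(f"{API}/charges", params={"key": key}, timeout=10)
        if existing.ok and existing.json().get("status") == "succeeded":
            return existing.json()
        raise
```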
Traceability
- Record planner reasoning (summarized to a safe level), tool calls, outputs, and final decisions
- Structured logs > unstructured text: include task_id, step, tool, args, result, latency
- Trace IDs link planner, executor, and verifier spans for end-to-end visibility
- Redact sensitive fields (PII, credentials) before logging tool inputs/outputs
✓
Log every tool call and model decision with a trace ID — debugging without traces is guesswork
✗
Log raw tool outputs without redaction — API keys and PII end up in log stores
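A minimal structured-logging sketch; the fields mirror the bullets above, and the redaction set is illustrative rather than a complete PII policy.

```python
import json
import logging

logger = logging.getLogger("agent.trace")
SENSITIVE = {"api_key", "password", "ssn", "email"}   # illustrative field names

def redact(payload: dict) -> dict:
    return {k: ("[REDACTED]" if k in SENSITIVE else v) for k, v in payload.items()}

def log_tool_call(task_id: str, step: int, tool: str,
                  args: dict, result: dict, latency_ms: float) -> None:
    logger.info(json.dumps({
        "trace_id": task_id,          # links planner / executor / verifier spans
        "step": step,
        "tool": tool,
        "args": redact(args),
        "result": redact(result),
        "latency_ms": round(latency_ms, 1),
    }))
```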
Evals
- Test task success, tool accuracy, hallucination rate, and safety violations
- Evaluate trajectory quality (each step) and final outcome separately — both matter
- Adversarial evals: include edge cases, ambiguous inputs, and hostile tool responses
- Regression suite: run on every model/prompt change before deploying
✓
Measure intermediate step quality, not just final answer — agents fail mostly in the middle
✗
Evaluate only final answer quality — agent systems fail mostly in intermediate steps
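A sketch of scoring trajectory quality and final outcome separately; the `Step` shape, `expected_tools`, and the scoring rules are illustrative, not a standard metric.

```python
from dataclasses import dataclass

@dataclass
class Step:
    tool: str
    ok: bool                  # did the call succeed and pass validation?

def eval_trajectory(steps: list[Step], expected_tools: list[str]) -> float:
    """Fraction of steps that used an expected tool and succeeded."""
    if not steps:
        return 0.0
    good = sum(1 for s in steps if s.ok and s.tool in expected_tools)
    return good / len(steps)

def eval_outcome(final_answer: str, reference: str) -> float:
    """Crude final-answer check; swap in exact-match or an LLM judge as needed."""
    return 1.0 if reference.lower() in final_answer.lower() else 0.0

def eval_run(steps, expected_tools, final_answer, reference) -> dict:
    return {
        "trajectory_score": eval_trajectory(steps, expected_tools),
        "outcome_score": eval_outcome(final_answer, reference),
    }
```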
Fallbacks & Recovery
- Use deterministic workflows when agent confidence is low
- Retry transient failures with exponential backoff; rollback bad actions
- Escalate ambiguous cases to humans rather than guessing
- Abort and alert after N consecutive failures; never loop indefinitely
✓
Design for failure: retries, deduplication, validation, rollback, and human escalation
✗
Let agents retry forever on systematic errors — set a hard failure budget
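A sketch of retry-with-backoff plus a hard failure budget; the escalation exception is a placeholder for whatever human-in-the-loop channel the system uses.

```python
import time

class NeedsHuman(Exception):
    """Raised when the agent should stop and escalate instead of guessing."""

def run_with_recovery(action, max_attempts: int = 3, base_delay: float = 1.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TimeoutError:
            # Transient failure: back off exponentially and retry.
            if attempt == max_attempts:
                break
            time.sleep(base_delay * 2 ** (attempt - 1))
        except ValueError:
            # Systematic failure (bad input, schema mismatch): retrying won't help.
            break
    # Failure budget exhausted: abort and alert, never loop indefinitely.
    raise NeedsHuman("action failed; escalating to a human operator")
```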
02
Tool Use & Permissions
Tool Registry
- Defines available tools, schemas, permissions, cost, and risk level
- Centralizes tool metadata so agents can discover, filter, and route calls safely
- Tag tools by risk tier (read-only, write, destructive) to enforce policy
- Versioned registry: changing a tool schema without bumping its version breaks agents silently
✓
Treat tools as controlled system capabilities, not 'things the model owns'
✗
Let agents call any available API — they will eventually call the wrong one
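An illustrative registry entry; the field names mirror the bullets above and are not tied to any particular framework.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolSpec:
    name: str
    version: str                  # bump on any schema change
    schema: dict                  # JSON Schema for arguments
    risk: str                     # "read_only" | "write" | "destructive"
    cost_per_call_usd: float = 0.0
    required_permission: str = "default"

REGISTRY: dict[str, ToolSpec] = {}

def register(spec: ToolSpec) -> None:
    REGISTRY[f"{spec.name}@{spec.version}"] = spec

def tools_for(permissions: set[str], max_risk: str) -> list[ToolSpec]:
    """Filter the registry by caller permissions and allowed risk tier."""
    order = {"read_only": 0, "write": 1, "destructive": 2}
    return [
        s for s in REGISTRY.values()
        if s.required_permission in permissions and order[s.risk] <= order[max_risk]
    ]
```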
Function Calling
- Structured tool invocation with typed arguments and deterministic validation
- Validate all arguments against the schema before executing — never trust model output
- Return structured responses; unstructured text wastes tokens and causes parse errors
- Include an error field in every response schema so the agent knows how to recover
✓
Validate tool arguments server-side even if the model generated them
✗
Assume the model will produce valid JSON — always parse and validate defensively
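A sketch of defensive argument validation using the `jsonschema` package; the email tool, its schema, and the stub are hypothetical. Model output is treated as untrusted text until it parses and validates.

```python
import json
from jsonschema import validate, ValidationError   # pip install jsonschema

SEND_EMAIL_SCHEMA = {
    "type": "object",
    "properties": {
        "to": {"type": "string"},
        "subject": {"type": "string", "maxLength": 200},
        "body": {"type": "string"},
    },
    "required": ["to", "subject", "body"],
    "additionalProperties": False,
}

def send_email(to: str, subject: str, body: str) -> str:
    """Stub standing in for the real side-effecting call."""
    return f"queued email to {to}"

def call_tool(raw_model_output: str) -> dict:
    # Always return a structured response with an explicit error field.
    try:
        args = json.loads(raw_model_output)
        validate(instance=args, schema=SEND_EMAIL_SCHEMA)
    except (json.JSONDecodeError, ValidationError) as exc:
        return {"ok": False, "result": None, "error": f"invalid arguments: {exc}"}
    return {"ok": True, "result": send_email(**args), "error": None}
```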
Sandboxing
- Run code/file/network actions in isolated environments (Docker, microVMs)
- Limit tool permissions to least privilege: read-only unless write is explicitly needed
- Dry-run mode: simulate irreversible actions before committing
- Network egress controls prevent exfiltration of data via unexpected outbound calls
✓
Scope tool permissions tightly and run code in isolated containers
✗
Expose raw production APIs directly to the agent without guardrails
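A sketch of running untrusted agent-generated Python inside a throwaway container with no network egress and tight resource limits; it assumes Docker is available on the host and uses a stock `python:3.12-slim` image.

```python
import subprocess

def run_sandboxed(code: str, timeout_s: int = 10) -> subprocess.CompletedProcess:
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "--network=none",          # block egress so nothing can be exfiltrated
            "--memory=256m", "--cpus=0.5",
            "--read-only",             # no writes to the container filesystem
            "python:3.12-slim",
            "python", "-c", code,
        ],
        capture_output=True, text=True, timeout=timeout_s,
    )
```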
Permission Model
- User-level, role-level, tenant-level, and action-level access control
- Agents should inherit, not exceed, the permissions of the user they act on behalf of
- Scope tokens and API keys to the minimum actions needed for the task
- Audit permission grants and revocations; alert on unexpected escalation
✓
Grant agents the minimum permissions required; re-check on every sensitive action
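A sketch of an inherit-don't-exceed permission check; the user store, action names, and CRM example are illustrative.

```python
USER_PERMISSIONS = {                 # normally loaded from your auth system
    "alice": {"crm:read", "crm:write"},
    "bob": {"crm:read"},
}

def require(user: str, action: str) -> None:
    """Re-check on every sensitive action; the agent never holds its own grants."""
    if action not in USER_PERMISSIONS.get(user, set()):
        raise PermissionError(f"{user} may not perform {action}")

def update_record(user: str, record_id: str, fields: dict) -> None:
    require(user, "crm:write")       # checked at the moment of the side effect
    ...                              # perform the write with a user-scoped token
```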
Approval Gates & Audit Trail
- Ask before sending emails, deleting data, charging money, or changing production state
- Log every tool call, input, output, decision, and failure to an immutable audit log
- Time-box approval windows: auto-reject if the human doesn't respond within N minutes
- Approval gates are the last line of defense before irreversible side effects
✓
Require human approval for destructive, financial, or externally-visible actions
✗
Treat approval gates as optional UX — they are a hard safety boundary
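A sketch of a time-boxed approval gate with an append-only audit record; `poll_human_decision` is a stub standing in for whatever Slack, email, or UI channel delivers the human decision, and the JSONL file stands in for an immutable log store.

```python
import json
import time

AUDIT_LOG = "audit.jsonl"            # append-only in practice (e.g. WORM storage)

def audit(event: dict) -> None:
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps({"ts": time.time(), **event}) + "\n")

def poll_human_decision(action: str):
    """Stub: wire this to your approval channel; returns True/False or None."""
    return None

def request_approval(action: str, details: dict, wait_s: int = 300) -> bool:
    audit({"type": "approval_requested", "action": action, "details": details})
    deadline = time.time() + wait_s
    while time.time() < deadline:
        decision = poll_human_decision(action)
        if decision is not None:
            audit({"type": "approval_decided", "action": action, "approved": decision})
            return decision
        time.sleep(5)
    audit({"type": "approval_timeout", "action": action})
    return False                      # auto-reject when the window expires
```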
03
Agent Architecture
Agent Loop
- user goal → planner → tool selection → execution → observation → next step → final answer
- Each iteration produces an observation that updates the agent's working state
- Maximum step count and timeout are mandatory — unbounded loops will eventually hang
- Checkpoint state after each step to enable resume, replay, and debugging
✓
Define the loop contract explicitly: inputs, outputs, termination conditions, and max iterations
✗
Ship an agent loop without a step limit — it will hang or loop forever on edge cases
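A skeleton of the loop contract, assuming `planner`, `executor`, and `checkpoint` are callables supplied by the host system; the action schema and limits are illustrative.

```python
import time

MAX_STEPS = 20
TIMEOUT_S = 120

def run_agent(goal: str, planner, executor, checkpoint) -> str:
    state = {"goal": goal, "observations": []}
    start = time.time()
    for step in range(MAX_STEPS):
        if time.time() - start > TIMEOUT_S:
            raise TimeoutError("agent loop exceeded wall-clock budget")
        action = planner(state)                  # next step or a final answer
        if action["type"] == "final_answer":
            return action["content"]
        observation = executor(action)           # one tool call under system control
        state["observations"].append(observation)
        checkpoint(step, state)                  # enables resume, replay, debugging
    raise RuntimeError("max steps reached without a final answer")
```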
Planner
- Breaks vague goals into concrete steps; should not directly perform risky actions
- Outputs a structured plan (task list, DAG, or step sequence) for the executor
- Dynamic replanning: the planner should revise the plan if an executor step fails
- Planner prompts should be optimized for decomposition, not execution
✓
Separate planning from execution; the planner should never directly call side-effecting tools
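A sketch of the structured plan the planner hands to the executor, assuming a simple step/DAG schema; `replan_after_failure` is a hypothetical hook showing where dynamic replanning would sit.

```python
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    id: str
    tool: str                      # must exist in the tool registry
    args: dict
    depends_on: list[str] = field(default_factory=list)

@dataclass
class Plan:
    goal: str
    steps: list[PlanStep]

    def replan_after_failure(self, failed_step_id: str) -> "Plan":
        """Drop steps that depended on the failed one; in practice the planner
        model would be asked for a revised tail instead of pushing on blindly."""
        remaining = [s for s in self.steps if failed_step_id not in s.depends_on]
        return Plan(goal=self.goal, steps=remaining)
```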
Executor
- Calls tools/APIs/functions under system control
- Operates on a single step at a time; reports result (success/failure/observation) back to planner
- Should not make planning decisions — escalate ambiguity back to the planner
- Retry logic, timeout handling, and error formatting live in the executor layer
✓
Keep executor logic narrow: receive a step, call a tool, return a result
✗
Let the executor improvise on tool selection — it should only call what the planner specified
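A narrow executor sketch, assuming `registry` maps tool names to callables and `step` matches the plan schema above; the result fields are illustrative.

```python
import time

def execute_step(step, registry: dict, max_attempts: int = 2) -> dict:
    tool = registry.get(step.tool)
    if tool is None:
        # Unknown tool is a planning problem: report it, don't improvise.
        return {"step_id": step.id, "ok": False, "error": f"unknown tool {step.tool}"}
    for attempt in range(1, max_attempts + 1):
        try:
            started = time.time()
            output = tool(**step.args)
            return {"step_id": step.id, "ok": True, "output": output,
                    "latency_ms": (time.time() - started) * 1000}
        except TimeoutError:
            if attempt == max_attempts:
                return {"step_id": step.id, "ok": False, "error": "timed out"}
        except Exception as exc:
            return {"step_id": step.id, "ok": False, "error": str(exc)}
```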
Verifier
- Checks outputs, catches errors, validates constraints before final response
- Can be a separate LLM call, a rule-based check, or both
- Verifier scope: factual correctness, safety policy compliance, format validity
- Failed verification should trigger replanning, not silent delivery of bad output
✓
Separate planning, execution, and verification clearly
✗
Let the LLM freely call tools without permissions, validation, or rollback
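A sketch combining cheap rule-based checks with an optional LLM judge; `llm_judge` is a placeholder for a separate model call and the constraint keys are illustrative. If `passed` is False, route back to the planner rather than returning the answer.

```python
def verify(answer: str, constraints: dict, llm_judge=None) -> dict:
    problems = []
    if len(answer) > constraints.get("max_length", 10_000):
        problems.append("answer exceeds length limit")
    for banned in constraints.get("banned_phrases", []):
        if banned.lower() in answer.lower():
            problems.append(f"contains banned phrase: {banned}")
    if llm_judge is not None and not llm_judge(answer):
        problems.append("LLM judge flagged the answer")
    # Failed verification should trigger replanning, not silent delivery.
    return {"passed": not problems, "problems": problems}
```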
Memory & Human Approval
- Memory/context stores useful state, but should be scoped and permission-aware
- Human approval is required for destructive, financial, external, or irreversible actions
- Memory should not store secrets or PII unless explicitly encrypted and access-controlled
- Approval history should be persisted so agents don't re-ask for already-granted permissions
✓
Scope agent memory to the task; escalate to humans before any irreversible action
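A sketch of task-scoped memory with a persisted approval history; the in-memory dict and class name are purely illustrative, and a production system would back this with an encrypted, access-controlled store.

```python
class TaskMemory:
    def __init__(self, task_id: str, allowed_keys: set[str]):
        self.task_id = task_id
        self.allowed_keys = allowed_keys      # scope: only task-relevant fields
        self._data: dict = {}
        self._approvals: dict[str, bool] = {}

    def put(self, key: str, value) -> None:
        if key not in self.allowed_keys:
            raise KeyError(f"{key} is out of scope for task {self.task_id}")
        self._data[key] = value               # never store secrets/PII unencrypted

    def record_approval(self, action: str, approved: bool) -> None:
        self._approvals[action] = approved    # avoid re-asking for granted actions

    def already_approved(self, action: str) -> bool:
        return self._approvals.get(action, False)
```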