
AI Agent Reliability

Production patterns for building agents that survive the real world — idempotency, audit trails, approval gates, and clean architecture boundaries. Interviewers test whether you know how agents fail and how to design around those failures.

01

Agent Reliability

State Management

  • Track task progress to avoid duplicate actions and repeated onboarding steps
  • Store task state externally (DB, queue) so agents can resume after crashes
  • Use a task ID per unit of work to correlate logs, retries, and results
  • Idempotent state transitions: applying the same step twice must produce the same state
Do: Store task state in a durable external store, not in agent memory alone
Don't: Rely on in-process variables to track progress — a crash loses everything
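The checklist above can be sketched as follows. `TaskStore` and its methods are illustrative names, with an in-memory dict standing in for a real durable store such as a DB table keyed by task ID:

```python
class TaskStore:
    """Stand-in for a durable external store (e.g. a DB table keyed by task_id)."""

    def __init__(self):
        self._state = {}  # task_id -> {"step": int, "status": str}

    def load(self, task_id):
        # Resume point: a restarted agent reads progress instead of starting over.
        return self._state.get(task_id, {"step": 0, "status": "pending"})

    def complete_step(self, task_id, step):
        # Idempotent transition: applying the same step twice leaves the same state.
        record = self.load(task_id)
        if step <= record["step"]:
            return record  # already applied; no duplicate work
        record = {"step": step, "status": "running"}
        self._state[task_id] = record
        return record

store = TaskStore()
store.complete_step("task-42", 1)
store.complete_step("task-42", 1)  # replay after a crash: state unchanged
assert store.load("task-42")["step"] == 1
```

Because `complete_step` compares against the persisted step counter, a crashed-and-restarted agent can blindly replay its last step without duplicating work.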

Idempotency

  • External actions should have unique task IDs and safe retry behavior
  • Assign a stable idempotency key before any external write or API call
  • Check-then-act pattern: verify current state before performing a side effect
  • Idempotency is especially critical for emails, payments, and database writes
Do: Pass an idempotency key on every external write; test that double-submission is safe
Don't: Assume a failed request means the action didn't happen — it may have succeeded
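A minimal sketch of the key-plus-dedup pattern. `send_email`, `SEEN_KEYS`, and `SENT` are hypothetical stand-ins for a provider API and its server-side deduplication table:

```python
import uuid

SEEN_KEYS = set()  # stand-in for the provider's dedup table
SENT = []          # stand-in for actual delivered emails

def send_email(to, body, idempotency_key):
    # Server-side check-then-act: a replayed key performs no second side effect.
    if idempotency_key in SEEN_KEYS:
        return "duplicate-suppressed"
    SEEN_KEYS.add(idempotency_key)
    SENT.append((to, body))
    return "sent"

key = str(uuid.uuid4())  # generated once per logical action, reused on every retry
assert send_email("a@example.com", "hi", key) == "sent"
# Timed-out request: we don't know whether it landed, so retry with the SAME key.
assert send_email("a@example.com", "hi", key) == "duplicate-suppressed"
assert len(SENT) == 1    # exactly one email despite the retry
```

The key point is that the key is minted before the first attempt, so a retry after an ambiguous failure cannot produce a second side effect.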

Traceability

  • Record planner thoughts at a safe summary level, tool calls, outputs, and final decisions
  • Structured logs > unstructured text: include task_id, step, tool, args, result, latency
  • Trace IDs link planner, executor, and verifier spans for end-to-end visibility
  • Redact sensitive fields (PII, credentials) before logging tool inputs/outputs
Do: Log every tool call and model decision with a trace ID — debugging without traces is guesswork
Don't: Log raw tool outputs without redaction — API keys and PII end up in log stores
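A sketch of one structured, redacted log line. The field names follow the checklist above; `REDACT_FIELDS` and the tool name are illustrative:

```python
import json

REDACT_FIELDS = {"api_key", "password", "email"}  # illustrative denylist

def redact(args):
    # Scrub sensitive fields before anything touches the log store.
    return {k: ("[REDACTED]" if k in REDACT_FIELDS else v) for k, v in args.items()}

def log_tool_call(trace_id, task_id, step, tool, args, result, latency_ms):
    entry = {
        "trace_id": trace_id, "task_id": task_id, "step": step,
        "tool": tool, "args": redact(args), "result": result,
        "latency_ms": latency_ms,
    }
    return json.dumps(entry)  # one JSON object per line, greppable by trace_id

line = log_tool_call("tr-1", "task-42", 3, "crm.lookup",
                     {"user": "u1", "api_key": "sk-secret"}, "ok", 120)
assert "sk-secret" not in line
```

Because every span carries the same `trace_id`, planner, executor, and verifier entries can be joined end to end.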

Evals

  • Test task success, tool accuracy, hallucination rate, and safety violations
  • Evaluate trajectory quality (each step) and final outcome separately — both matter
  • Adversarial evals: include edge cases, ambiguous inputs, and hostile tool responses
  • Regression suite: run on every model/prompt change before deploying
Do: Measure intermediate step quality, not just final answer — agents fail mostly in the middle
Don't: Evaluate only final answer quality — agent systems fail mostly in intermediate steps
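A sketch of scoring trajectory and outcome separately. The trajectory format and the per-step checks are illustrative; real evals would use task-specific graders:

```python
def eval_run(trajectory, final_answer, expected):
    # Per-step score: did the agent use an allowed tool, and did the step succeed?
    step_scores = [1.0 if s["tool"] in s["allowed_tools"] and s["ok"] else 0.0
                   for s in trajectory]
    trajectory_score = sum(step_scores) / len(step_scores)
    outcome_score = 1.0 if final_answer == expected else 0.0
    # Reported separately: a right answer can hide a dangerous trajectory.
    return {"trajectory": trajectory_score, "outcome": outcome_score}

run = eval_run(
    trajectory=[
        {"tool": "search", "allowed_tools": {"search"}, "ok": True},
        {"tool": "email",  "allowed_tools": {"search"}, "ok": True},  # off-policy step
    ],
    final_answer="42", expected="42",
)
assert run == {"trajectory": 0.5, "outcome": 1.0}  # right answer, flawed trajectory
```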

Fallbacks & Recovery

  • Use deterministic workflows when agent confidence is low
  • Retry transient failures with exponential backoff; rollback bad actions
  • Escalate ambiguous cases to humans rather than guessing
  • Abort and alert after N consecutive failures; never loop indefinitely
Do: Design for failure: retries, deduplication, validation, rollback, and human escalation
Don't: Let agents retry forever on systematic errors — set a hard failure budget
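The retry rules above can be sketched as a bounded loop with exponential backoff. `TransientError` and the escalation return value are illustrative; a real system would page a human rather than return a string:

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (timeout, 503, rate limit)."""

def with_retries(action, max_attempts=3, base_delay=0.01):
    for attempt in range(max_attempts):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts - 1:
                return "escalated-to-human"       # budget exhausted: never loop forever
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff: 1x, 2x, 4x ...

calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TransientError
    return "ok"

def always_fails():
    raise TransientError

assert with_retries(flaky) == "ok"                       # transient: retries succeed
assert with_retries(always_fails) == "escalated-to-human"  # systematic: hard stop
```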
02

Tool Use & Permissions

Tool Registry

  • Defines available tools, schemas, permissions, cost, and risk level
  • Centralizes tool metadata so agents can discover, filter, and route calls safely
  • Tag tools by risk tier (read-only, write, destructive) to enforce policy
  • Versioned registry: changing a tool schema without bumping its version breaks agents silently
Do: Treat tools as controlled system capabilities, not 'things the model owns'
Don't: Let agents call any available API — they will eventually call the wrong one
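A sketch of a registry with versioning and risk tiers. Tool names, tiers, and the policy set are illustrative:

```python
REGISTRY = {
    "search_docs":   {"version": 1, "risk": "read-only"},
    "update_record": {"version": 1, "risk": "write"},
    "delete_record": {"version": 1, "risk": "destructive"},
}

ALLOWED_TIERS = {"read-only", "write"}  # policy: destructive tools go through approval

def resolve(tool_name):
    tool = REGISTRY.get(tool_name)
    if tool is None:
        # Off-registry APIs simply don't exist from the agent's point of view.
        raise KeyError(f"unknown tool: {tool_name}")
    if tool["risk"] not in ALLOWED_TIERS:
        return ("blocked", tool_name)
    return ("allowed", tool_name)

assert resolve("search_docs") == ("allowed", "search_docs")
assert resolve("delete_record") == ("blocked", "delete_record")
```

Routing every call through `resolve` is what makes the risk-tier policy enforceable rather than advisory.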

Function Calling

  • Structured tool invocation with typed arguments and deterministic validation
  • Validate all arguments against the schema before executing — never trust model output
  • Return structured responses; unstructured text wastes tokens and causes parse errors
  • Include an error field in every response schema so the agent knows how to recover
Do: Validate tool arguments server-side even if the model generated them
Don't: Assume the model will produce valid JSON — always parse and validate defensively
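Defensive parsing can be sketched as follows. The schema and the error strings are illustrative; the structural point is that nothing executes until the raw model output has been parsed and type-checked:

```python
import json

SCHEMA = {"to": str, "subject": str}  # illustrative argument schema

def validate_call(raw_model_output):
    # Step 1: the model output may not even be JSON.
    try:
        args = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return {"error": "invalid JSON; ask the model to retry"}
    # Step 2: valid JSON may still violate the schema.
    for field, ftype in SCHEMA.items():
        if not isinstance(args.get(field), ftype):
            return {"error": f"bad or missing field: {field}"}
    # Every response carries an error field so the agent knows how to recover.
    return {"error": None, "args": args}

ok = validate_call('{"to": "a@example.com", "subject": "hi"}')
bad = validate_call('{"to": 123}')
garbage = validate_call("Sure! Here's the JSON you asked for:")
assert ok["error"] is None
assert bad["error"] == "bad or missing field: to"
assert garbage["error"].startswith("invalid JSON")
```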

Sandboxing

  • Run code/file/network actions in isolated environments (Docker, microVMs)
  • Limit tool permissions to least privilege: read-only unless write is explicitly needed
  • Dry-run mode: simulate irreversible actions before committing
  • Network egress controls prevent exfiltration of data via unexpected outbound calls
Do: Scope tool permissions tightly and run code in isolated containers
Don't: Expose raw production APIs directly to the agent without guardrails
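The dry-run bullet can be sketched as a flag that defaults to the safe path. `delete_bucket` and the plan format are illustrative:

```python
def delete_bucket(name, dry_run=True):
    if dry_run:
        # Simulate: report what WOULD happen, with no side effect.
        return {"would_delete": name, "committed": False}
    # ... the real, sandboxed deletion would happen here ...
    return {"would_delete": name, "committed": True}

plan = delete_bucket("prod-backups")  # default is the simulation
assert plan == {"would_delete": "prod-backups", "committed": False}
```

Making `dry_run=True` the default means an agent must explicitly opt in to the irreversible path, typically only after an approval gate.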

Permission Model

  • User-level, role-level, tenant-level, and action-level access control
  • Agents should inherit, not exceed, the permissions of the user they act on behalf of
  • Scope tokens and API keys to the minimum actions needed for the task
  • Audit permission grants and revocations; alert on unexpected escalation
Do: Grant agents the minimum permissions required; re-check on every sensitive action
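The inheritance rule can be sketched as a lookup against the acting user's scopes. Scope names and the user table are illustrative:

```python
USER_SCOPES = {"alice": {"crm:read", "crm:write"}}  # illustrative user -> scopes table

def agent_can(user, required_scope):
    # Re-checked on every sensitive action, not cached at session start,
    # so a revoked grant takes effect immediately.
    return required_scope in USER_SCOPES.get(user, set())

assert agent_can("alice", "crm:write")
assert not agent_can("alice", "billing:refund")  # agent never exceeds the user
```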

Approval Gates & Audit Trail

  • Ask before sending emails, deleting data, charging money, or changing production state
  • Log every tool call, input, output, decision, and failure to an immutable audit log
  • Time-box approval windows: auto-reject if the human doesn't respond within N minutes
  • Approval gates are the last line of defense before irreversible side effects
Do: Require human approval for destructive, financial, or externally visible actions
Don't: Treat approval gates as optional UX — they are a hard safety boundary
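A sketch of a time-boxed gate that fails closed. `ask_human` stands in for a real approval UI that blocks until a reply or the timeout; all names are illustrative:

```python
def gated_action(action_name, risk, ask_human, timeout_s=300):
    if risk not in {"destructive", "financial", "external"}:
        return "executed"  # low-risk actions pass straight through
    # Blocks until the human replies, or returns None on timeout.
    decision = ask_human(action_name, timeout_s)
    if decision != "approved":
        return "rejected"  # timeout or explicit denial: fail closed
    return "executed"

assert gated_action("send_report", "read-only", ask_human=None) == "executed"
assert gated_action("refund_order", "financial",
                    ask_human=lambda name, t: "approved") == "executed"
assert gated_action("drop_table", "destructive",
                    ask_human=lambda name, t: None) == "rejected"  # no response in time
```

Auto-rejecting on timeout is the key choice: an unanswered request must never default to execution.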
03

Agent Architecture

Agent Loop

  • user goal → planner → tool selection → execution → observation → next step → final answer
  • Each iteration produces an observation that updates the agent's working state
  • Maximum step count and timeout are mandatory — unbounded loops will eventually hang
  • Checkpoint state after each step to enable resume, replay, and debugging
Do: Define the loop contract explicitly: inputs, outputs, termination conditions, and max iterations
Don't: Ship an agent loop without a step limit — it will hang or loop forever on edge cases
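The loop contract above can be sketched as follows. `plan_next` and `execute` are illustrative stand-ins for planner and executor calls, and the checkpoint is a shallow copy for illustration only:

```python
def run_agent(goal, plan_next, execute, max_steps=5):
    state = {"goal": goal, "observations": []}
    checkpoints = []
    for step in range(max_steps):          # hard bound: the loop cannot run forever
        action = plan_next(state)
        if action == "DONE":               # explicit termination condition
            return {"status": "success", "steps": step, "checkpoints": checkpoints}
        observation = execute(action)
        state["observations"].append(observation)
        checkpoints.append(dict(state))    # resume/replay point after every step
    # Budget exhausted: abort and surface it, never spin silently.
    return {"status": "aborted", "steps": max_steps, "checkpoints": checkpoints}

# A planner that never says DONE hits the hard limit instead of looping forever.
result = run_agent("demo", plan_next=lambda s: "noop", execute=lambda a: "ok")
assert result["status"] == "aborted" and result["steps"] == 5
```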

Planner

  • Breaks vague goals into concrete steps; should not directly perform risky actions
  • Outputs a structured plan (task list, DAG, or step sequence) for the executor
  • Dynamic replanning: the planner should revise the plan if an executor step fails
  • Planner prompts should be optimized for decomposition, not execution
Do: Separate planning from execution; the planner should never directly call side-effecting tools

Executor

  • Calls tools/APIs/functions under system control
  • Operates on a single step at a time; reports result (success/failure/observation) back to planner
  • Should not make planning decisions — escalate ambiguity back to the planner
  • Retry logic, timeout handling, and error formatting live in the executor layer
Do: Keep executor logic narrow: receive a step, call a tool, return a result
Don't: Let the executor improvise on tool selection — it should only call what the planner specified

Verifier

  • Checks outputs, catches errors, validates constraints before final response
  • Can be a separate LLM call, a rule-based check, or both
  • Verifier scope: factual correctness, safety policy compliance, format validity
  • Failed verification should trigger replanning, not silent delivery of bad output
Do: Separate planning, execution, and verification clearly
Don't: Let the LLM freely call tools without permissions, validation, or rollback
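The rule-based half of a verifier can be sketched as follows; an LLM judge could be layered on top. The constraint set (length cap, banned terms) is illustrative:

```python
def verify(output, max_len=200, banned=("password",)):
    failures = []
    if len(output) > max_len:
        failures.append("too long")                 # format validity check
    if any(word in output.lower() for word in banned):
        failures.append("policy violation")         # safety policy check
    # Failed verification routes back to the planner, never silently to the user.
    action = "deliver" if not failures else "replan"
    return {"ok": not failures, "failures": failures, "action": action}

assert verify("The password is hunter2")["action"] == "replan"
assert verify("All records updated.") == {"ok": True, "failures": [], "action": "deliver"}
```

Returning `"replan"` rather than raising keeps the decision in-band, so the loop can feed the failure reasons back into the next planning step.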

Memory & Human Approval

  • Memory/context stores useful state, but should be scoped and permission-aware
  • Human approval is required for destructive, financial, external, or irreversible actions
  • Memory should not store secrets or PII unless explicitly encrypted and access-controlled
  • Approval history should be persisted so agents don't re-ask for already-granted permissions
Do: Scope agent memory to the task; escalate to humans before any irreversible action