9 December 2025 · 3 min read

Five custom Claude agents I run every day

The system prompts, the tools, the failure modes. Nothing theoretical, just what I actually use.

Workflow · Technical

These are the five Claude agents I run every day, either in production or in my own workflow. For each one: where it runs, what it does, which tools it has, and the failure mode that shaped it.

1. Intake-router (Agency Genius)

Where it runs: inside Agency Genius, the CRM behind Web Hero.

What it does: every inbound ticket from a client or partner agency hits this agent first. It classifies the ticket (build, fix, content, audit, billing), pulls relevant context from the client record, and assigns it to the right delivery queue.

Tools: get_client_history, tag_ticket, assign_to_queue, escalate.

Failure mode: confidence calibration drift. When the agent isn't sure, it sometimes assigns rather than escalates. The fix was a hard-coded confidence threshold below which it must escalate and explain its uncertainty.
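
A minimal sketch of that threshold gate, in TypeScript. Only the tool names assign_to_queue and escalate come from the list above. The Classification shape, the self-reported confidence score, the 0.8 value, and the decideRoute function are assumptions for illustration, not the production code.

```typescript
// Ticket categories as listed in the article.
type Category = "build" | "fix" | "content" | "audit" | "billing";

// Hypothetical shape of the agent's classification output.
interface Classification {
  category: Category;
  confidence: number; // assumed to be a 0..1 score reported by the agent
  reasoning: string;
}

type RoutingDecision =
  | { action: "assign_to_queue"; queue: Category }
  | { action: "escalate"; explanation: string };

// Illustrative value only; the article does not state the real threshold.
const ESCALATION_THRESHOLD = 0.8;

function decideRoute(
  c: Classification,
  threshold: number = ESCALATION_THRESHOLD
): RoutingDecision {
  // NaN compares false here, so a malformed score also escalates.
  if (c.confidence >= threshold) {
    return { action: "assign_to_queue", queue: c.category };
  }
  return {
    action: "escalate",
    explanation: `Confidence ${c.confidence} is below ${threshold} for "${c.category}": ${c.reasoning}`,
  };
}
```

The point of keeping the gate in code is that the model never gets to decide whether its own uncertainty is acceptable; it only reports it.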

2. Compliance-checker (LoadSnap)

Where it runs: inside LoadSnap, the waste compliance SaaS.

What it does: validates a waste transfer note draft against DEFRA's schema before the driver hits send.

Tools: fetch_defra_schema, validate_field, request_correction.

Failure mode: schema version drift. DEFRA quietly updates the schema, the agent validates against a stale copy, the draft passes locally but fails server-side. The fix was a daily schema-refresh job, with an explicit version pin in the system prompt.
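
One way to make the version pin enforceable is a guard that runs before validation, sketched here in TypeScript. It assumes the cached schema carries a version label and a fetch timestamp; the SchemaCopy fields, the checkSchema function, and the 24-hour limit (chosen to match a daily refresh) are all hypothetical.

```typescript
// Hypothetical metadata kept alongside the cached schema.
interface SchemaCopy {
  version: string; // assumed version label, format not specified in the article
  fetchedAt: Date; // when the refresh job last pulled this copy
}

type SchemaCheck =
  | { ok: true }
  | { ok: false; reason: "version_mismatch" | "stale_copy"; detail: string };

// Assumed limit: one day, in line with a daily refresh job.
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

function checkSchema(
  copy: SchemaCopy,
  pinnedVersion: string,
  now: Date,
  maxAgeMs: number = MAX_AGE_MS
): SchemaCheck {
  // The pin is the version the prompt was written against.
  if (copy.version !== pinnedVersion) {
    return {
      ok: false,
      reason: "version_mismatch",
      detail: `cached schema is ${copy.version}, prompt is pinned to ${pinnedVersion}`,
    };
  }
  // A matching version can still be stale if the refresh job stopped running.
  const ageMs = now.getTime() - copy.fetchedAt.getTime();
  if (ageMs > maxAgeMs) {
    return {
      ok: false,
      reason: "stale_copy",
      detail: `cached schema is about ${Math.round(ageMs / 3600000)} hours old`,
    };
  }
  return { ok: true };
}
```

A failed check would block validation and surface the reason, rather than letting a draft pass locally against the wrong schema.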

3. Brand-reader (Avago)

Where it runs: inside Avago, the AI website builder.

What it does: takes whatever the user pastes in (a Facebook page, a Google Business listing, a competitor's site) and returns a structured brand brief that the page-drafter agent can populate templates from.

Tools: tavily_search, dataforseo_competitors, enrich_business_info.

Failure mode: thin input, confident output. Users who pasted in barely anything got back over-confident brand briefs. The fix was a low-data branch in the prompt that returns "I need more from you, here's what to add" rather than fabricating.
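
The fix described above lives in the prompt. The same idea can also be expressed as a pre-check in code, which is what this TypeScript sketch does instead. The BrandInput fields, the 40-character description floor, and the minimum of three signals are assumptions made up for the example.

```typescript
// Hypothetical shape of what has been extracted from the user's paste.
interface BrandInput {
  businessName?: string;
  description?: string;
  services?: string[];
  location?: string;
  sourceUrl?: string;
}

type BriefDecision =
  | { kind: "draft_brief" }
  | { kind: "need_more_input"; missing: string[]; message: string };

// Assumed floor: at least three usable signals before drafting a brief.
const MIN_SIGNALS = 3;

function assessInput(
  input: BrandInput,
  minSignals: number = MIN_SIGNALS
): BriefDecision {
  // Each entry pairs a human-readable label with whether the signal is usable.
  const checks: Array<[string, boolean]> = [
    ["business name", Boolean(input.businessName?.trim())],
    ["description", (input.description?.trim().length ?? 0) >= 40],
    ["services", (input.services?.length ?? 0) > 0],
    ["location", Boolean(input.location?.trim())],
    ["source URL", Boolean(input.sourceUrl?.trim())],
  ];
  const missing = checks.filter(([, usable]) => !usable).map(([label]) => label);
  const presentCount = checks.length - missing.length;
  if (presentCount < minSignals) {
    return {
      kind: "need_more_input",
      missing,
      message: `I need more from you, here's what to add: ${missing.join(", ")}`,
    };
  }
  return { kind: "draft_brief" };
}
```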

4. Outbound copywriter (OutPitch)

Where it runs: inside OutPitch, drafting the first message for each lead before a human reviews.

What it does: takes the lead's company profile (enriched by another agent), the campaign's positioning brief, and a tone control, and writes the opening message. Drafts go to a queue, not directly out. The human approves, edits, or rejects.

Tools: get_lead_context, get_campaign_brief, flag_for_human.

Failure mode: left unconstrained, it drifts toward a generic B2B voice. The fix was anti-pattern prompts: an explicit list of phrases to never use ("touching base", "circling back", "I hope this finds you well"). That list did more than any positive instruction.
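
A sketch of how a phrase list like that can be kept in one place, in TypeScript. The three phrases are the ones quoted above; everything else is an assumption. buildAntiPatternSection renders the list into prompt text, and findBannedPhrases is a hypothetical post-generation check (not something the article describes) that could feed flag_for_human.

```typescript
// The three phrases quoted in the article; a real list would be longer.
const BANNED_PHRASES: readonly string[] = [
  "touching base",
  "circling back",
  "I hope this finds you well",
];

// Render the list as a section of the system prompt (wording is illustrative).
function buildAntiPatternSection(
  phrases: readonly string[] = BANNED_PHRASES
): string {
  const lines = phrases.map((p) => `- "${p}"`);
  return ["Never use any of these phrases:", ...lines].join("\n");
}

// Hypothetical safety net: report banned phrases that still appear in a draft.
function findBannedPhrases(
  draft: string,
  phrases: readonly string[] = BANNED_PHRASES
): string[] {
  // Lowercase and collapse whitespace so line breaks do not hide a match.
  const normalised = draft.toLowerCase().replace(/\s+/g, " ");
  return phrases.filter((p) => normalised.indexOf(p.toLowerCase()) !== -1);
}
```

Keeping the prompt section and the check on the same list means a newly spotted phrase only has to be added once.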

5. QA-checker (Agency Genius)

Where it runs: inside Agency Genius, on every delivered build.

What it does: runs the finished work against the original brief. Flags drift. Suggests fixes. Asks the builder to confirm before the work goes to the client.

Tools: get_brief, fetch_site_snapshot, compare, flag_drift.

Failure mode: over-eagerness. Early versions flagged every cosmetic deviation as drift. The fix was an explicit "scope of drift" section in the system prompt that distinguishes acceptable interpretation from genuine miss.

Patterns across all five

A few things repeat.

Tight system prompts beat long ones. Every prompt above is under 500 words. In my experience, the shorter the prompt, the more predictable the behaviour.

Tools are the contract. A tool with a clear name, a Zod schema for input, and a typed return value is the difference between agentic software and a clever chatbot. Every agent I run treats the tool spec as the source of truth.
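
To make the contract concrete, here is a TypeScript sketch of a tool spec with a name, an input validator, and a typed return value. A hand-rolled validator stands in for the Zod schema so the example has no dependencies. The tool name tag_ticket comes from the intake-router above; its input fields, the ToolSpec interface, and the stubbed run function are assumptions.

```typescript
// Generic contract: a name, a validator for untrusted input, a typed result.
interface ToolSpec<I, O> {
  name: string;
  description: string;
  parseInput: (raw: unknown) => I; // throws on invalid input
  run: (input: I) => O;
}

// Hypothetical input and output for tag_ticket.
interface TagTicketInput {
  ticketId: string;
  tag: string;
}
interface TagTicketResult {
  ticketId: string;
  tags: string[];
}

const ALLOWED_TAGS = ["build", "fix", "content", "audit", "billing"];

// Stand-in for a Zod schema: reject anything that is not the expected shape.
function parseTagTicketInput(raw: unknown): TagTicketInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("input must be an object");
  }
  const record = raw as Record<string, unknown>;
  const ticketId = record["ticketId"];
  const tag = record["tag"];
  if (typeof ticketId !== "string" || ticketId.length === 0) {
    throw new Error("ticketId must be a non-empty string");
  }
  if (typeof tag !== "string" || ALLOWED_TAGS.indexOf(tag) === -1) {
    throw new Error(`tag must be one of: ${ALLOWED_TAGS.join(", ")}`);
  }
  return { ticketId, tag };
}

const tagTicket: ToolSpec<TagTicketInput, TagTicketResult> = {
  name: "tag_ticket",
  description: "Attach one category tag to a ticket.",
  parseInput: parseTagTicketInput,
  // Stub: a real implementation would write to the CRM.
  run: (input) => ({ ticketId: input.ticketId, tags: [input.tag] }),
};

// Every model-issued call goes through validation before it runs.
function callTool<I, O>(tool: ToolSpec<I, O>, raw: unknown): O {
  return tool.run(tool.parseInput(raw));
}
```

Because callTool only accepts unknown and only returns the typed result, a malformed tool call from the model fails at the boundary instead of somewhere inside the application.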

Failure modes are documented. Each agent has a failure-modes.md in its directory. New failure modes get appended with the date. Where I can, I handle them by changing the system prompt rather than the application code.
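
A small TypeScript sketch of what appending a dated entry could look like. The article only says entries are appended with the date, so the entry layout (a dated heading, then symptom and fix) and the function names are assumptions. The functions are pure and return the new file content; reading and writing the file is left out.

```typescript
// Hypothetical structure for one failure-mode entry.
interface FailureMode {
  title: string;
  symptom: string;
  fix: string;
}

// Format one entry with an ISO date (YYYY-MM-DD, in UTC).
function formatEntry(mode: FailureMode, date: Date): string {
  const day = date.toISOString().slice(0, 10);
  return [
    `## ${day}: ${mode.title}`,
    "",
    `Symptom: ${mode.symptom}`,
    `Fix: ${mode.fix}`,
    "",
  ].join("\n");
}

// Append to the existing file content without rewriting earlier entries.
function appendEntry(existing: string, mode: FailureMode, date: Date): string {
  const endsWithNewline =
    existing.length === 0 || existing.charAt(existing.length - 1) === "\n";
  const base = endsWithNewline ? existing : existing + "\n";
  // Leave one blank line between the previous content and the new entry.
  const separator = base.length === 0 ? "" : "\n";
  return base + separator + formatEntry(mode, date);
}
```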

Provenance is visible. Where an agent answer depends on a tool result, that result is shown alongside. No magic boxes.
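
One way to keep provenance attached to an answer is to make it part of the response type, sketched below in TypeScript. The AgentAnswer and ToolResultRecord shapes and the renderAnswer function are hypothetical. The tool name get_brief comes from the QA-checker above.

```typescript
// Hypothetical record of one tool call that the answer relied on.
interface ToolResultRecord {
  tool: string;
  input: unknown;
  output: unknown;
}

// The answer cannot be constructed without stating its provenance.
interface AgentAnswer {
  text: string;
  provenance: ToolResultRecord[];
}

// Render the answer with the tool results shown alongside it.
function renderAnswer(answer: AgentAnswer): string {
  if (answer.provenance.length === 0) {
    return answer.text + "\n\nSources: none (no tool results were used)";
  }
  const sources = answer.provenance.map(
    (p, i) =>
      `[${i + 1}] ${p.tool}(${JSON.stringify(p.input)}) -> ${JSON.stringify(p.output)}`
  );
  return [answer.text, "", "Sources:", ...sources].join("\n");
}
```

An empty provenance list is rendered explicitly, so a reader can tell the difference between "no tools were needed" and "the sources were dropped".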

That's it. Five agents, hand-built, running every day. None of them is a chatbot. Each does one job inside a workflow where a chatbot would be friction. That's the difference between agents that earn their keep and agents that drift to the bottom of someone's "AI tools" Notion page.

If you're building something where one of these patterns would help, drop me a message. I take on a small number of consultancy and build engagements each quarter.