Build a Multi-Agent Customer Support System with Handoffs in TypeScript
Friday 10/04/2026 · 15 min read

Your AI chatbot handles "what are your hours?" just fine. But the moment a customer asks about a partial refund on an invoice from three months ago, it falls apart. One monolithic prompt can't be an expert in billing, debugging technical issues, and processing returns all at once. You need specialized agents — and a smart way to hand conversations between them.
Here's how to build a multi-agent customer support system in TypeScript where a triage agent routes queries to specialized agents, each with their own tools and system prompts, and any agent can escalate to a human when it's out of its depth.
Architecture: triage → specialist → escalation
The system has four agents:
- Triage agent — classifies the query and routes to the right specialist
- Billing agent — handles invoices, payments, refunds, plan changes
- Technical agent — troubleshoots product issues, checks service status
- Returns agent — processes returns, exchanges, shipping issues
Each specialist can either resolve the issue or escalate to a human. The triage agent can also escalate directly if the query doesn't fit any category.
Install the Anthropic SDK, the only runtime dependency the code below needs:
pnpm add @anthropic-ai/sdk
Defining agent types and conversation state
First, let's set up the types that keep our multi-agent system organized:
// src/support/types.ts
import type { MessageParam } from "@anthropic-ai/sdk/resources/messages";
export type AgentName = "triage" | "billing" | "technical" | "returns";
export interface AgentDefinition {
name: AgentName;
systemPrompt: string;
tools: Tool[];
}
export interface Tool {
name: string;
description: string;
input_schema: Record<string, unknown>;
}
export interface ConversationState {
messages: MessageParam[];
currentAgent: AgentName;
customerId: string | null;
escalated: boolean;
escalationReason: string | null;
metadata: Record<string, unknown>;
}
export interface HandoffResult {
targetAgent: AgentName;
reason: string;
context: string;
}
export interface AgentResponse {
text: string;
handoff: HandoffResult | null;
escalateToHuman: boolean;
escalationReason: string | null;
}
Building the specialist agents
Each agent gets its own system prompt and tools. Here's the key insight: give each agent a handoff tool and an escalate_to_human tool. This lets the LLM decide when it's out of its depth instead of relying on brittle keyword matching.
// src/support/agents.ts
import { AgentDefinition, Tool } from "./types";
const handoffTool: Tool = {
name: "handoff_to_agent",
description:
"Transfer this conversation to a different specialized agent. Use this when the customer's issue falls outside your expertise.",
input_schema: {
type: "object" as const,
properties: {
target_agent: {
type: "string",
enum: ["triage", "billing", "technical", "returns"],
description: "The agent to transfer to",
},
reason: {
type: "string",
description: "Why you're transferring this conversation",
},
context: {
type: "string",
description:
"Summary of the conversation so far for the next agent",
},
},
required: ["target_agent", "reason", "context"],
},
};
const escalateTool: Tool = {
name: "escalate_to_human",
description:
"Escalate this conversation to a human support agent. Use when: the customer is frustrated and wants a human, the issue requires manual intervention you cannot perform, or you've failed to resolve the issue after 2 attempts.",
input_schema: {
type: "object" as const,
properties: {
reason: {
type: "string",
description: "Why this needs human attention",
},
priority: {
type: "string",
enum: ["low", "medium", "high", "urgent"],
description: "How urgent is this escalation",
},
summary: {
type: "string",
description:
"Full summary of the issue and what has been tried so far",
},
},
required: ["reason", "priority", "summary"],
},
};
export const triageAgent: AgentDefinition = {
name: "triage",
systemPrompt: `You are a customer support triage agent. Your ONLY job is to understand the customer's issue and route them to the right specialist agent.
Routing rules:
- Billing questions (invoices, charges, refunds, plan changes, payment methods) → billing
- Technical issues (bugs, errors, performance, integrations, API problems) → technical
- Returns and exchanges (product returns, shipping issues, wrong item received) → returns
Ask ONE clarifying question if the intent is truly ambiguous. Do not try to resolve issues yourself.
If the customer is angry or explicitly asks for a human, escalate immediately.`,
tools: [handoffTool, escalateTool],
};
export const billingAgent: AgentDefinition = {
name: "billing",
systemPrompt: `You are a billing support specialist. You handle invoices, payments, refunds, and plan changes.
You have access to customer billing data and can perform actions like looking up invoices and processing refunds up to $100. For refunds over $100, escalate to a human.
Always verify the customer's identity by asking for their email before accessing account data.
Be precise with dollar amounts and dates. Never guess at billing information.`,
tools: [
handoffTool,
escalateTool,
{
name: "lookup_invoices",
description:
"Look up recent invoices for a customer by their email address",
input_schema: {
type: "object" as const,
properties: {
email: {
type: "string",
description: "Customer email address",
},
limit: {
type: "number",
description:
"Number of recent invoices to return (default 5)",
},
},
required: ["email"],
},
},
{
name: "process_refund",
description:
"Process a refund for a specific invoice. Only for amounts up to $100.",
input_schema: {
type: "object" as const,
properties: {
invoice_id: {
type: "string",
description: "The invoice ID to refund",
},
amount: {
type: "number",
description: "Refund amount in dollars",
},
reason: {
type: "string",
description: "Reason for the refund",
},
},
required: ["invoice_id", "amount", "reason"],
},
},
],
};
export const technicalAgent: AgentDefinition = {
name: "technical",
systemPrompt: `You are a technical support specialist. You help customers troubleshoot product issues, API errors, and integration problems.
You can check service status and look up error codes. Walk customers through solutions step-by-step.
If the issue requires a code change on our end (a bug), escalate to human with priority "high" and include reproduction steps.`,
tools: [
handoffTool,
escalateTool,
{
name: "check_service_status",
description:
"Check the current status of our services and any ongoing incidents",
input_schema: {
type: "object" as const,
properties: {
service: {
type: "string",
enum: ["api", "dashboard", "webhooks", "auth", "all"],
description: "Which service to check",
},
},
required: ["service"],
},
},
{
name: "lookup_error_code",
description:
"Look up what a specific error code means and known solutions",
input_schema: {
type: "object" as const,
properties: {
error_code: {
type: "string",
description:
"The error code the customer is seeing (e.g., ERR_4012)",
},
},
required: ["error_code"],
},
},
],
};
export const returnsAgent: AgentDefinition = {
name: "returns",
systemPrompt: `You are a returns and exchanges specialist. You handle product returns, exchanges, and shipping issues.
You can look up orders and initiate returns. Return window is 30 days from delivery.
For items outside the return window, you can offer a 15% discount on a new order instead — use your judgment.
For lost packages, escalate to human with priority "high".`,
tools: [
handoffTool,
escalateTool,
{
name: "lookup_order",
description: "Look up an order by order ID or customer email",
input_schema: {
type: "object" as const,
properties: {
order_id: {
type: "string",
description: "Order ID (optional if email provided)",
},
email: {
type: "string",
description:
"Customer email (optional if order_id provided)",
},
},
required: [],
},
},
{
name: "initiate_return",
description:
"Start a return process for an order. Generates a return shipping label.",
input_schema: {
type: "object" as const,
properties: {
order_id: {
type: "string",
description: "The order ID to return",
},
items: {
type: "array",
items: { type: "string" },
description: "List of item IDs to return",
},
reason: {
type: "string",
description: "Reason for return",
},
},
required: ["order_id", "items", "reason"],
},
},
],
};
export const agents: Record<string, AgentDefinition> = {
triage: triageAgent,
billing: billingAgent,
technical: technicalAgent,
returns: returnsAgent,
};
The agent runner: handling tool calls and handoffs
This is where it gets interesting. The runner executes the agent loop — sending messages to Claude, processing tool calls, and handling handoffs between agents. The critical design decision: handoff context travels with the conversation so the next agent doesn't start from zero.
// src/support/runner.ts
import Anthropic from "@anthropic-ai/sdk";
import {
AgentResponse,
ConversationState,
HandoffResult,
AgentName,
} from "./types";
import { agents } from "./agents";
const client = new Anthropic();
// Simulated tool implementations — replace with your real backends
function executeToolCall(
toolName: string,
input: Record<string, unknown>
): string {
switch (toolName) {
case "lookup_invoices":
return JSON.stringify({
invoices: [
{
id: "INV-2026-1234",
amount: 49.99,
date: "2026-03-15",
status: "paid",
},
{
id: "INV-2026-1190",
amount: 49.99,
date: "2026-02-15",
status: "paid",
},
],
});
case "process_refund":
if ((input.amount as number) > 100) {
return JSON.stringify({
success: false,
error: "Amount exceeds $100 limit. Escalate to human.",
});
}
return JSON.stringify({
success: true,
refund_id: "REF-" + Date.now(),
amount: input.amount,
});
case "check_service_status":
return JSON.stringify({
status: "operational",
incidents: [],
last_checked: new Date().toISOString(),
});
case "lookup_error_code":
return JSON.stringify({
code: input.error_code,
meaning: "Authentication token expired",
solution:
"Regenerate your API key in the dashboard under Settings → API Keys",
});
case "lookup_order":
return JSON.stringify({
order_id: input.order_id ?? "ORD-5678",
status: "delivered",
delivered_at: "2026-03-28",
items: [
{ id: "ITEM-001", name: "Pro Widget", price: 79.99 },
],
});
case "initiate_return":
return JSON.stringify({
success: true,
return_id: "RET-" + Date.now(),
shipping_label_url: "https://example.com/label/RET-" + Date.now(),
});
default:
return JSON.stringify({ error: `Unknown tool: ${toolName}` });
}
}
export async function runAgent(
state: ConversationState
): Promise<AgentResponse> {
const agentDef = agents[state.currentAgent];
if (!agentDef) {
throw new Error(`Unknown agent: ${state.currentAgent}`);
}
const claudeTools = agentDef.tools.map((tool) => ({
name: tool.name,
description: tool.description,
input_schema: tool.input_schema as Anthropic.Tool["input_schema"],
}));
// Agent loop — keep processing until we get a text response, handoff, or escalation
let loopCount = 0;
const maxLoops = 10;
while (loopCount < maxLoops) {
loopCount++;
const response = await client.messages.create({
model: "claude-sonnet-4-6-20250514",
max_tokens: 1024,
system: agentDef.systemPrompt,
tools: claudeTools,
messages: state.messages,
});
// Collect text and tool use blocks
let textResponse = "";
let handoff: HandoffResult | null = null;
let escalateToHuman = false;
let escalationReason: string | null = null;
const toolResults: Array<{
type: "tool_result";
tool_use_id: string;
content: string;
}> = [];
for (const block of response.content) {
if (block.type === "text") {
textResponse += block.text;
}
if (block.type === "tool_use") {
const input = block.input as Record<string, unknown>;
if (block.name === "handoff_to_agent") {
handoff = {
targetAgent: input.target_agent as AgentName,
reason: input.reason as string,
context: input.context as string,
};
// Don't execute — we'll handle this after the loop
toolResults.push({
type: "tool_result",
tool_use_id: block.id,
content: JSON.stringify({
success: true,
message: `Transferring to ${input.target_agent}`,
}),
});
} else if (block.name === "escalate_to_human") {
escalateToHuman = true;
escalationReason = input.reason as string;
toolResults.push({
type: "tool_result",
tool_use_id: block.id,
content: JSON.stringify({
success: true,
message:
"Escalation created. A human agent will take over shortly.",
}),
});
} else {
// Execute the actual tool
const result = executeToolCall(block.name, input);
toolResults.push({
type: "tool_result",
tool_use_id: block.id,
content: result,
});
}
}
}
// If there were tool calls, add the assistant message and tool results
if (toolResults.length > 0) {
state.messages.push({ role: "assistant", content: response.content });
state.messages.push({ role: "user", content: toolResults });
// If we got a handoff or escalation, break out
if (handoff || escalateToHuman) {
// Do one more call to get the transfer/escalation message
const finalResponse = await client.messages.create({
model: "claude-sonnet-4-6-20250514",
max_tokens: 512,
system: agentDef.systemPrompt,
tools: claudeTools,
messages: state.messages,
});
const finalText = finalResponse.content
.filter((b): b is Anthropic.TextBlock => b.type === "text")
.map((b) => b.text)
.join("");
return {
text: finalText || textResponse,
handoff,
escalateToHuman,
escalationReason,
};
}
// Otherwise, continue the loop to process tool results
continue;
}
// No tool calls — we have a final text response
return {
text: textResponse,
handoff: null,
escalateToHuman: false,
escalationReason: null,
};
}
throw new Error("Agent exceeded maximum loop iterations");
}
The conversation orchestrator
The orchestrator ties everything together. It manages the conversation state and handles the handoff flow between agents:
// src/support/orchestrator.ts
import { ConversationState, AgentName } from "./types";
import { runAgent } from "./runner";
export class SupportOrchestrator {
private state: ConversationState;
constructor() {
this.state = {
messages: [],
currentAgent: "triage",
customerId: null,
escalated: false,
escalationReason: null,
metadata: {},
};
}
async handleMessage(userMessage: string): Promise<string> {
if (this.state.escalated) {
return `This conversation has been escalated to a human agent. Reason: ${this.state.escalationReason}`;
}
// Add the user's message
this.state.messages.push({ role: "user", content: userMessage });
// Run the current agent
const response = await runAgent(this.state);
// Handle escalation
if (response.escalateToHuman) {
this.state.escalated = true;
this.state.escalationReason = response.escalationReason;
// In production, you'd create a ticket in your helpdesk system here
console.log(
`[ESCALATION] Reason: ${response.escalationReason}`
);
return response.text;
}
// Handle handoff to another agent
if (response.handoff) {
const previousAgent = this.state.currentAgent;
this.state.currentAgent = response.handoff.targetAgent;
console.log(
`[HANDOFF] ${previousAgent} → ${response.handoff.targetAgent}: ${response.handoff.reason}`
);
// Inject handoff context so the new agent knows what happened
this.state.messages.push({
role: "assistant",
content: response.text,
});
this.state.messages.push({
role: "user",
content: `[System: You are now handling this conversation. It was transferred from the ${previousAgent} agent. Context: ${response.handoff.context}]\n\nPlease continue helping the customer based on the conversation history above.`,
});
// Run the new agent immediately
const handoffResponse = await runAgent(this.state);
// Add the new agent's response to history
this.state.messages.push({
role: "assistant",
content: handoffResponse.text,
});
return handoffResponse.text;
}
// Normal response — add to conversation history
this.state.messages.push({
role: "assistant",
content: response.text,
});
return response.text;
}
getCurrentAgent(): AgentName {
return this.state.currentAgent;
}
isEscalated(): boolean {
return this.state.escalated;
}
}
Wiring it up: a simple CLI demo
Let's test the whole system with a command-line interface:
// src/support/demo.ts
import * as readline from "readline";
import { SupportOrchestrator } from "./orchestrator";
async function main() {
const orchestrator = new SupportOrchestrator();
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
});
console.log("Customer Support System Ready");
console.log('Type your message (or "quit" to exit)\n');
const prompt = () => {
const agent = orchestrator.getCurrentAgent();
rl.question(`[${agent}] You: `, async (input) => {
const trimmed = input.trim();
if (trimmed.toLowerCase() === "quit") {
rl.close();
return;
}
try {
const response = await orchestrator.handleMessage(trimmed);
console.log(`\nAgent: ${response}\n`);
} catch (error) {
console.error("Error:", error);
}
if (!orchestrator.isEscalated()) {
prompt();
} else {
console.log(
"\n[Conversation escalated to human support]\n"
);
rl.close();
}
});
};
prompt();
}
main().catch(console.error);
Run it with:
npx tsx src/support/demo.ts
A typical conversation flow looks like this:
[triage] You: I was charged twice for my subscription last month
[HANDOFF] triage → billing: Customer reports duplicate charge
Agent: I'd be happy to help you with that duplicate charge. To pull up your
billing information, could you please provide the email address associated
with your account?
[billing] You: sure, it's jane@example.com
Agent: I found your recent invoices. I can see two $49.99 charges, one dated
February 15 and one dated March 15, 2026. Let me process a refund for the
duplicate charge on invoice INV-2026-1234. The refund of $49.99 has been
submitted (REF-1712764800000) and should appear on your statement within 5-7
business days.
Gotchas and things that'll trip you up
Message history grows fast. Each agent loop iteration adds messages. For long conversations, you'll blow past context windows. In production, summarize older messages or keep a sliding window of the last N turns plus a running summary.
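One pragmatic approach is a sliding window that always keeps the first message (the customer's original issue) plus the last N messages. The sketch below is an assumption, not part of the system above: `trimHistory` is a hypothetical helper that uses a minimal local message type so it stays self-contained. A production version would also maintain an LLM-generated running summary of whatever gets dropped.

```typescript
// Minimal message shape, matching the SDK's MessageParam closely enough
// for trimming purposes.
type Msg = {
  role: "user" | "assistant";
  content: string | Array<{ type: string; [k: string]: unknown }>;
};

// Keep the first message (the customer's original issue) plus the most
// recent `maxRecent` messages.
export function trimHistory(messages: Msg[], maxRecent = 20): Msg[] {
  if (messages.length <= maxRecent + 1) return messages;
  const recent = messages.slice(-maxRecent);
  // Don't let the window start on a tool_result message: keeping a
  // tool_result whose matching tool_use was dropped makes the API
  // reject the request.
  while (
    recent.length > 0 &&
    Array.isArray(recent[0].content) &&
    recent[0].content.some((b) => b.type === "tool_result")
  ) {
    recent.shift();
  }
  return [messages[0], ...recent];
}
```

You'd call `trimHistory(state.messages)` in the runner just before each `client.messages.create` call.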
Handoff loops. A billing agent might hand off to technical, which immediately hands back to billing. Track transfers across the whole conversation and escalate when there are too many:
// src/support/orchestrator.ts
// Add as a class field so it persists across messages
// (a local variable inside handleMessage would reset every turn):
private handoffHistory: AgentName[] = [];
// Then, inside handleMessage's handoff block:
if (response.handoff) {
this.handoffHistory.push(response.handoff.targetAgent);
if (this.handoffHistory.length > 3) {
// Too many handoffs — escalate to human
this.state.escalated = true;
this.state.escalationReason =
"Conversation bounced between agents too many times";
return "I'm sorry, it seems like your issue needs a specialist. Let me connect you with a human agent who can help.";
}
// ... existing handoff logic continues here
}
Tool execution errors. If your backend is down, the agent will get an error response and might hallucinate a result. Always return structured errors and instruct the agent in the system prompt to tell the customer when something went wrong instead of guessing.
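One way to enforce that is a wrapper around the tool executor that converts thrown errors into structured JSON the agent can relay honestly. A minimal sketch (`safeExecute` is a hypothetical helper, not part of the runner above):

```typescript
// Wrap any tool executor so failures come back as structured JSON the
// agent can report, rather than an unhandled exception.
export function safeExecute(
  execute: (name: string, input: Record<string, unknown>) => string,
  toolName: string,
  input: Record<string, unknown>
): string {
  try {
    return execute(toolName, input);
  } catch (error) {
    return JSON.stringify({
      success: false,
      error: `Tool ${toolName} failed: ${
        error instanceof Error ? error.message : "unknown error"
      }`,
      // An explicit instruction the model can act on instead of guessing
      instruction:
        "Tell the customer this action could not be completed right now.",
    });
  }
}
```

In the runner, you'd replace the direct `executeToolCall(block.name, input)` call with `safeExecute(executeToolCall, block.name, input)`.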
The "I want a human" escape hatch. Some customers will ask for a human immediately. The triage agent's system prompt handles this, but for extra reliability, add a pre-check before even calling the LLM:
// src/support/orchestrator.ts
const HUMAN_KEYWORDS = [
"talk to a human",
"real person",
"speak to someone",
"agent please",
"representative",
];
async handleMessage(userMessage: string): Promise<string> {
const lower = userMessage.toLowerCase();
if (HUMAN_KEYWORDS.some((kw) => lower.includes(kw))) {
this.state.escalated = true;
this.state.escalationReason = "Customer explicitly requested a human agent";
return "Absolutely — I'm connecting you with a human agent now. Someone will be with you shortly.";
}
// ... rest of the method
}
Connecting to a real helpdesk for escalations
In production, you'd push escalations to your actual ticketing system. Here's a quick integration with a webhook (works with Zendesk, Intercom, Linear, or any system with an API):
// src/support/escalation.ts
interface EscalationPayload {
reason: string;
priority: string;
conversationHistory: string;
customerId: string | null;
metadata: Record<string, unknown>;
}
export async function createEscalationTicket(
payload: EscalationPayload
): Promise<{ ticketId: string; estimatedWaitMinutes: number }> {
const response = await fetch(process.env.ESCALATION_WEBHOOK_URL!, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${process.env.ESCALATION_API_KEY!}`,
},
body: JSON.stringify({
subject: `AI Escalation: ${payload.reason}`,
priority: payload.priority,
description: payload.conversationHistory,
customer_id: payload.customerId,
tags: ["ai-escalated"],
custom_fields: payload.metadata,
}),
});
if (!response.ok) {
throw new Error(`Escalation failed: ${response.status}`);
}
return response.json() as Promise<{
ticketId: string;
estimatedWaitMinutes: number;
}>;
}
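To wire this into the orchestrator, the escalation branch needs the message history flattened into a readable transcript. A sketch with a hypothetical `formatHistory` helper, using a minimal local message type so it stands alone:

```typescript
// Minimal message shape, matching the SDK's MessageParam closely enough
// for formatting a transcript.
type HistoryMessage = {
  role: "user" | "assistant";
  content: string | unknown[];
};

// Flatten the conversation into one line per message for the ticket
// description. Tool-call blocks are serialized as JSON so the human
// agent can see what the AI actually did.
export function formatHistory(messages: HistoryMessage[]): string {
  return messages
    .map((m) => {
      const text =
        typeof m.content === "string"
          ? m.content
          : JSON.stringify(m.content);
      return `${m.role}: ${text}`;
    })
    .join("\n");
}
```

In `handleMessage`'s escalation branch you would then call `createEscalationTicket({ reason: response.escalationReason ?? "unspecified", priority: "medium", conversationHistory: formatHistory(this.state.messages), customerId: this.state.customerId, metadata: this.state.metadata })`, catching failures so a helpdesk outage doesn't take the conversation down with it.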
What's next
This system handles the core multi-agent pattern, but there's more to build for production. You'll want observability to understand which agents handle what volume and where handoffs break down — check out How to Add LLM Observability and Tracing to Your TypeScript AI App with Langfuse for that piece.
You'll also want to think about cost. Each handoff means another LLM call with the full conversation history. The techniques in How to Cache AI Responses Without Breaking Your App can help reduce redundant calls for common support queries, and The Real Cost of Running an AI Feature in Production will help you model the per-conversation economics before your support volume scales up.