October 03, 2025 • 6 min read
In this guide, I’ll show you how to create a multi-agent WhatsApp assistant using Python and LangGraph.
The goal is to have different AI agents working together: one to understand the user’s intent, another to perform the action, and a supervisor that coordinates both.
If you’ve already built WhatsApp bots with Twilio, this is the next step. Instead of a single prompt-based bot, you’ll have multiple specialized agents that can reason, delegate, and complete tasks together.
Most WhatsApp chatbots follow a simple pattern: a message comes in, a single prompt (or a set of keyword rules) produces the reply, and the reply goes back out.
That approach works, but it’s limited. You’ll quickly run into problems when a request involves several steps, external data, or decisions that a single prompt can’t handle reliably.
LangGraph solves that by letting you define stateful workflows between multiple AI agents, similar to a flowchart but fully defined in code.
We’ll create a simple multi-agent architecture that can be adapted for different use cases.
Example agents:
Interpreter: understands what the user wants
Executor: performs the action
Supervisor: coordinates both
Once you understand this structure, you can reuse it for bookings, order tracking, cancellations, lead qualification, and other WhatsApp workflows.
You’ll need an OpenAI API key, a Twilio account with WhatsApp enabled, and a few Python packages:
pip install langgraph langchain-openai fastapi python-dotenv
Create a .env file:
OPENAI_API_KEY=sk-your-key
MODEL_NAME=gpt-4o-mini
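The agent code below reads these values with os.getenv, so make sure the .env file is actually loaded before anything else runs. A minimal way to do that with python-dotenv (already included in the pip install above) looks like this:
# at the top of your entry point (or each module that reads env vars)
from dotenv import load_dotenv

# Reads .env from the working directory and populates os.environ,
# so os.getenv("OPENAI_API_KEY") and os.getenv("MODEL_NAME") return the right values.
load_dotenv()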
We’ll start by creating two agents: one to interpret the user’s message and another to execute the corresponding action.
# interpreter_agent.py
import os

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class IntentOutput(BaseModel):
    """Expected shape of the Interpreter's JSON reply (documents the contract, not enforced here)."""

    intent: str = Field(description="The user’s intent, e.g., check_status, create_booking, cancel")
    parameters: dict = Field(description="Extracted entities or parameters from the user message")


SYSTEM_PROMPT = """
You are the Interpreter Agent.
Your job is to read a WhatsApp message, detect the user’s intent, and extract structured parameters.
Respond with JSON only, no extra text.
Example:
User: "Can you book a table for two tomorrow at 8pm?"
Response (as JSON):
{
  "intent": "create_booking",
  "parameters": {"people": 2, "time": "2025-07-10T20:00:00"}
}
"""

llm = ChatOpenAI(model=os.getenv("MODEL_NAME", "gpt-4o-mini"), use_responses_api=True)

# A ReAct agent with no tools: it only reasons over the prompt and returns the JSON.
interpreter_agent = create_react_agent(
    name="interpreter_agent",
    model=llm,
    tools=[],
    prompt=SYSTEM_PROMPT,
)
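Before adding anything else, you can sanity-check the Interpreter on its own. This is just a throwaway sketch; the filename and the sample message are placeholders:
# try_interpreter.py -- quick manual check, not part of the final app
from langchain_core.messages import HumanMessage
from interpreter_agent import interpreter_agent

result = interpreter_agent.invoke(
    {"messages": [HumanMessage(content="Can you book a table for two tomorrow at 8pm?")]}
)
# The last message should be the JSON with "intent" and "parameters".
print(result["messages"][-1].content)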
# executor_agent.py
import os

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

SYSTEM_PROMPT = """
You are the Executor Agent.
You receive an intent and parameters from the Interpreter.
Your job is to perform the action (or simulate it) and return a WhatsApp-friendly message.
If intent == "create_booking", pretend to create a booking.
If intent == "check_status", pretend to look up an order or reservation.
If intent == "cancel", confirm the cancellation.
Always return a short, user-friendly message, e.g.:
"Your table for two tomorrow at 8pm has been booked ✅"
"""

llm = ChatOpenAI(model=os.getenv("MODEL_NAME", "gpt-4o-mini"), use_responses_api=True)

# Another ReAct agent with no tools for now; in a real app you would pass tools here
# (booking API, order lookup, etc.) instead of simulating the action in the prompt.
executor_agent = create_react_agent(
    name="executor_agent",
    model=llm,
    tools=[],
    prompt=SYSTEM_PROMPT,
)
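You can exercise the Executor the same way by handing it an intent and parameters directly, which is exactly what the supervisor will do later (again, a throwaway script):
# try_executor.py -- quick manual check, not part of the final app
from langchain_core.messages import HumanMessage
from executor_agent import executor_agent

prompt = 'Intent: create_booking\nParameters: {"people": 2, "time": "2025-07-10T20:00:00"}'
result = executor_agent.invoke({"messages": [HumanMessage(content=prompt)]})
# Should print a short, WhatsApp-friendly confirmation.
print(result["messages"][-1].content)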
This is the part that coordinates everything.
It defines the logic that decides which agent runs next and how the information flows between them.
# supervisor.py
import json
import operator
from typing import Annotated, Literal, TypedDict

from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage, AIMessage, BaseMessage

from interpreter_agent import interpreter_agent
from executor_agent import executor_agent


class AssistantState(TypedDict):
    # Conversation history; operator.add tells LangGraph to append instead of overwrite.
    messages: Annotated[list[BaseMessage], operator.add]
    user_prompt: str
    intent: str | None
    parameters: dict | None
    result: str | None
    next_agent: Literal["interpreter", "executor", "finish"] | None


def route_to_next(state: AssistantState) -> Literal["interpreter", "executor", "finish"]:
    """Decide which node runs next based on what the state already contains."""
    if not state.get("intent"):
        return "interpreter"
    elif not state.get("result"):
        return "executor"
    else:
        return "finish"


def call_interpreter(state: AssistantState) -> dict:
    result = interpreter_agent.invoke({
        "messages": [HumanMessage(content=state["user_prompt"])]
    })
    intent = None
    parameters = {}
    text = result["messages"][-1].content if "messages" in result else ""
    if "intent" in text:
        try:
            parsed = json.loads(text)
            intent = parsed.get("intent")
            parameters = parsed.get("parameters", {})
        except (json.JSONDecodeError, TypeError):
            pass
    # Fall back to a generic intent so the graph always advances to the executor
    # instead of trying to route back to the interpreter and failing the conditional edge.
    if not intent:
        intent = "unknown"
    return {
        "intent": intent,
        "parameters": parameters,
        "messages": [AIMessage(content=f"Detected intent: {intent}", name="interpreter")],
    }


def call_executor(state: AssistantState) -> dict:
    prompt = f"Intent: {state['intent']}\nParameters: {state['parameters']}"
    result = executor_agent.invoke({"messages": [HumanMessage(content=prompt)]})
    text = result["messages"][-1].content if "messages" in result else ""
    return {"result": text, "messages": [AIMessage(content=text, name="executor")]}


def create_assistant():
    workflow = StateGraph(AssistantState)
    workflow.add_node("interpreter", call_interpreter)
    workflow.add_node("executor", call_executor)

    workflow.add_edge(START, "interpreter")
    # After the interpreter an intent is always set, so the only valid route is the executor.
    workflow.add_conditional_edges("interpreter", route_to_next, {"executor": "executor"})
    # After the executor a result is set, so the workflow finishes.
    workflow.add_conditional_edges("executor", route_to_next, {"finish": END})
    return workflow.compile()


assistant_team = create_assistant()
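Before exposing the graph through a webhook, it’s worth invoking it locally with a hard-coded message. The script below is just a sketch of that:
# try_supervisor.py -- run the full workflow locally before wiring Twilio
from langchain_core.messages import HumanMessage
from supervisor import assistant_team

message = "Can you book a table for two tomorrow at 8pm?"
state = {"messages": [HumanMessage(content=message)], "user_prompt": message}

final_state = assistant_team.invoke(state)
# "result" is filled in by the executor node.
print(final_state["result"])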
You can use FastAPI and Twilio’s WhatsApp webhook to handle messages.
from fastapi import FastAPI, Form
from fastapi.responses import PlainTextResponse
from xml.sax.saxutils import escape

from langchain_core.messages import HumanMessage
from supervisor import assistant_team

app = FastAPI()


@app.post("/twilio/whatsapp")
async def whatsapp_webhook(From: str = Form(...), Body: str = Form("")):
    # Twilio posts the sender (From) and the message text (Body) as form fields.
    state = {"messages": [HumanMessage(content=Body)], "user_prompt": Body}
    result = assistant_team.invoke(state)
    reply = result.get("result") or "I couldn’t process that yet."

    # Respond with TwiML; escape the reply so it stays valid XML.
    twiml = f"""
    <Response>
        <Message>{escape(reply)}</Message>
    </Response>
    """
    return PlainTextResponse(twiml, media_type="application/xml")
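To test it end to end, run the app with an ASGI server and point your Twilio WhatsApp sandbox webhook at the /twilio/whatsapp route. The sketch below assumes the webhook code lives in main.py and that you install uvicorn separately; Twilio also needs a public URL, so you’ll typically tunnel your local port with something like ngrok:
# run.py -- start the FastAPI app locally (requires: pip install uvicorn)
import uvicorn

if __name__ == "__main__":
    # Assumes the webhook file above is named main.py; adjust "main:app" if yours differs.
    uvicorn.run("main:app", host="0.0.0.0", port=8000)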
You can think of it as two agents having a short conversation: one understands what needs to be done, the other executes it.
User: “Can you check my order status?”
Interpreter: { "intent": "check_status" }
Executor: “Your order is currently being prepared 🚚”
User: “Cancel my meeting at 5pm”
Interpreter: { "intent": "cancel", "parameters": {"time": "17:00"} }
Executor: “Meeting at 5pm has been cancelled ✅”
Multi-agent systems let you scale beyond simple keyword bots.
You can reuse agents, connect them to APIs, and give them specific responsibilities.
That means less prompt complexity, better reliability, and cleaner integrations.
This approach fits naturally into customer support, bookings and reservations, order tracking, and similar WhatsApp flows.
With this setup, you’ll reduce manual work, respond faster, and improve customer experience, all inside WhatsApp.
LangGraph makes it easy to build structured, multi-agent workflows.
Combined with Twilio’s WhatsApp API, you can go from a static chatbot to an assistant that understands, acts, and collaborates.
Start with two agents and one workflow. Once it works, extend it to handle your entire communication flow, from lead qualification to post-sale follow-ups.
Liked the post? Subscribe to my newsletter below for more!