
Large Language Models (LLMs) are great at reasoning, but real-world applications often require stateful, multi-step workflows. That’s where LangGraph comes in — it lets you build intelligent workflows using graphs of LLM-powered nodes.

But what if you want to expose these workflows as APIs, so other apps (or users) can call them? That’s where FastAPI comes in — a lightweight, high-performance Python web framework.

In this guide, you’ll learn how to wrap a LangGraph workflow inside FastAPI and turn it into a production-ready endpoint.

  • LangGraph: Create multi-step, stateful workflows with LLMs (e.g., multi-agent reasoning, data processing).
  • FastAPI: Easily expose these workflows as REST APIs for integration with web apps, microservices, or automation pipelines.
  • Together: Build scalable AI agents that can be accessed from anywhere.

Create a new project folder and install dependencies:

mkdir langgraph_fastapi_demo && cd langgraph_fastapi_demo
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install fastapi uvicorn langgraph langchain-openai python-dotenv tenacity

Create a .env file to store your API keys:

OPENAI_API_KEY=your_openai_key_here

Let’s build a simple LangGraph that takes a user question and returns an AI-generated answer.

workflow.py
from typing import TypedDict

from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from dotenv import load_dotenv

load_dotenv()

llm = ChatOpenAI(model="gpt-4o")  # You can switch to gpt-4o-mini for cheaper calls

# Define the graph state
class State(TypedDict, total=False):
    user_input: str
    answer: str

def answer_question(state: State) -> dict:
    user_input = state["user_input"]
    response = llm.invoke([HumanMessage(content=user_input)])
    return {"answer": response.content}

# Build the graph
workflow = StateGraph(State)
workflow.add_node("answer", answer_question)
workflow.add_edge(START, "answer")
workflow.add_edge("answer", END)
graph = workflow.compile()

This graph:

  1. Receives user_input
  2. Sends it to GPT-4o
  3. Returns the AI-generated response
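To sanity-check the graph before adding an API layer, you can invoke it directly. A quick sketch you can append to workflow.py and run with python workflow.py (it makes a real OpenAI call, so your key must be set):

if __name__ == "__main__":
    # Quick local test of the compiled graph
    result = graph.invoke({"user_input": "What is LangGraph?"})
    print(result["answer"])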

Before exposing this to the world, let’s harden it for real use cases.

LLM APIs can fail or time out. Add automatic retries with tenacity and wrap the call in try/except so the workflow degrades gracefully:

from tenacity import retry, wait_exponential, stop_after_attempt

@retry(wait=wait_exponential(multiplier=1, min=2, max=10), stop=stop_after_attempt(3))
def safe_invoke_llm(message):
    return llm.invoke([HumanMessage(content=message)])

def answer_question(state: State) -> dict:
    user_input = state["user_input"]
    try:
        response = safe_invoke_llm(user_input)
        return {"answer": response.content}
    except Exception as e:
        return {"answer": f"Error: {str(e)}"}

We don’t want someone sending huge payloads. Add Pydantic constraints:

from pydantic import BaseModel, constr

class RequestData(BaseModel):
    user_input: constr(min_length=1, max_length=500)  # limit input size
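With this model in place, FastAPI rejects out-of-range input with a 422 response before the workflow ever runs. A quick illustration of the same check in plain Pydantic, with a hypothetical oversized value:

from pydantic import ValidationError

try:
    RequestData(user_input="x" * 1000)  # over the 500-character limit
except ValidationError as e:
    print(e)  # reports that user_input is too long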

Add logging for visibility:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def answer_question(state: State) -> dict:
    logger.info(f"Received input: {state['user_input']}")
    response = safe_invoke_llm(state["user_input"])
    logger.info("LLM response generated")
    return {"answer": response.content}

Now, let’s wrap this workflow inside FastAPI.

main.py
from fastapi import FastAPI
from workflow import graph, RequestData

app = FastAPI()

@app.post("/run")
async def run_workflow(data: RequestData):
    result = graph.invoke({"user_input": data.user_input})
    return {"result": result["answer"]}

Run the server:

uvicorn main:app --reload

You can test it using curl:

curl -X POST "http://127.0.0.1:8000/run" \
-H "Content-Type: application/json" \
-d '{"user_input":"What is LangGraph?"}'

Or open http://127.0.0.1:8000/docs in your browser — FastAPI auto-generates Swagger UI for you!

This interactive UI lets you test your endpoint directly in the browser.
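You can also exercise the endpoint from Python without starting a server, using FastAPI's TestClient. A minimal sketch (it requires httpx to be installed and still makes a real OpenAI call through the graph):

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

resp = client.post("/run", json={"user_input": "What is LangGraph?"})
print(resp.status_code)  # 200 on success, 422 if validation fails
print(resp.json())       # {"result": "..."}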

A few steps to prepare for production:

  • Async execution: FastAPI is async-native. If your endpoint makes multiple LLM calls, make the functions async so requests don’t block each other (see the sketch after this list).
  • Workers: Run with multiple processes for concurrency:
uvicorn main:app --workers 4
  • Dockerization:
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
  • Authentication: Use API keys or JWT tokens to protect endpoints (Part 2 coming soon).
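For the async point above, a minimal sketch of the /run handler in main.py, assuming LangGraph’s ainvoke (the async counterpart of invoke on the compiled graph), so the endpoint doesn’t block FastAPI’s event loop while waiting on OpenAI:

@app.post("/run")
async def run_workflow(data: RequestData):
    # await the async variant instead of calling the blocking invoke()
    result = await graph.ainvoke({"user_input": data.user_input})
    return {"result": result["answer"]}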

Here’s how it all connects:

graph TD;
Client -->|POST /run| FastAPI --> LangGraph --> OpenAI_API --> Response

This simple architecture lets you turn any LangGraph into an API.

In just a few steps, we:

  • Built a LangGraph workflow
  • Exposed it as a REST API using FastAPI
  • Added production-readiness features (validation, retries, logging)
  • Laid the foundation for scalable AI microservices

This setup can power anything from chatbots to document processors to AI SaaS products.

What’s next?
I’m planning a Part 2 of this tutorial, but I want your input.

👉 Which one would you like me to cover next?

  1. Streaming responses (real-time chat)
  2. Authentication & security
  3. Docker & cloud deployment
  4. Error monitoring & observability

Comment below with your pick!

If you enjoyed this article and want more practical AI & LangGraph tutorials, follow GenAI Lab for weekly deep dives.
