Creating an AI Agent-Based System with LangGraph: Adding Persistence and Streaming (Step by Step Guide)

February 2, 2025


In our previous tutorial, we built an AI agent capable of answering queries by browsing the web. However, when building agents for longer-running tasks, two important concepts come into play: persistence and streaming. Persistence lets you save the state of an agent at any given point, so you can resume from that state in later interactions. This is crucial for long-running applications. Streaming, on the other hand, lets you emit real-time signals about what the agent is doing at any moment, providing transparency and control over its actions. In this tutorial, we’ll enhance our agent by adding these two features.

Setting Up the Agent

Let’s start by recreating our agent. We’ll install and import the required libraries, set the required environment variables, set up the Tavily search tool, define the agent state, and finally build the agent.

pip install langgraph==0.2.53 langgraph-checkpoint==2.0.6 langgraph-sdk==0.1.36 langchain-groq langchain-community langgraph-checkpoint-sqlite==2.0.1

import os
os.environ['TAVILY_API_KEY'] = ""
os.environ['GROQ_API_KEY'] = ""

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_groq import ChatGroq
from langchain_community.tools.tavily_search import TavilySearchResults

tool = TavilySearchResults(max_results=2)

class AgentState(TypedDict):
    # operator.add makes each node's returned messages append to the list
    messages: Annotated[list[AnyMessage], operator.add]

class Agent:
    def __init__(self, model, tools, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        # Route to the tool node only if the model requested a tool call
        graph.add_conditional_edges("llm", self.exists_action, {True: "action", False: END})
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        self.graph = graph.compile()
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            result = self.tools[t['name']].invoke(t['args'])
            results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
        print("Back to the model!")
        return {'messages': results}
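
The Annotated[list[AnyMessage], operator.add] annotation on the state is what causes each node's returned messages to be appended rather than to overwrite the list. The following is a minimal sketch of that merge rule in plain Python, without LangGraph. ToyState and apply_update are hypothetical names used only for illustration, and plain strings stand in for message objects.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class ToyState(TypedDict):
    # Same shape as AgentState, with plain strings standing in for messages.
    messages: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state using each key's reducer."""
    hints = get_type_hints(ToyState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            reducer = metadata[0]  # here: operator.add, i.e. list concatenation
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value  # no reducer declared: overwrite
    return merged


state = {"messages": ["human: What is the weather in Texas?"]}
state = apply_update(state, {"messages": ["ai: call search"]})
state = apply_update(state, {"messages": ["tool: sunny"]})
print(state["messages"])
```

Each node only returns the new messages it produced; the reducer is what turns those partial updates into a growing conversation history.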

Adding Persistence

To add persistence, we’ll use LangGraph’s checkpointer feature. A checkpointer saves the state of the agent after and between every node. For this tutorial, we’ll use SqliteSaver, a simple checkpointer backed by SQLite, a database engine that ships with Python. The code below stores checkpoints in a local SQLite file; you can just as easily point it at an in-memory database, or use other checkpointers such as Redis or Postgres for more robust persistence.

from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3
# check_same_thread=False allows the connection to be used from other threads
sqlite_conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
memory = SqliteSaver(sqlite_conn)
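
To make the idea of a checkpointer more concrete, here is a toy sketch that uses only the standard library: it stores a JSON snapshot of the state per thread ID in SQLite and reads the most recent one back. This is not how SqliteSaver is implemented. The table layout and the open_store, save_checkpoint, and load_checkpoint helpers are assumptions invented for this illustration.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical checkpoint table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "thread_id TEXT NOT NULL, "
        "state TEXT NOT NULL)"
    )
    return conn


def save_checkpoint(conn: sqlite3.Connection, thread_id: str, state: dict) -> None:
    """Append a snapshot of the state for this thread."""
    conn.execute(
        "INSERT INTO checkpoints (thread_id, state) VALUES (?, ?)",
        (thread_id, json.dumps(state)),
    )
    conn.commit()


def load_checkpoint(conn: sqlite3.Connection, thread_id: str):
    """Return the latest snapshot for this thread, or None if there is none."""
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ? ORDER BY id DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store()
save_checkpoint(conn, "1", {"messages": ["What is the weather in Texas?"]})
save_checkpoint(conn, "1", {"messages": ["What is the weather in Texas?", "Sunny."]})
latest = load_checkpoint(conn, "1")
missing = load_checkpoint(conn, "2")
print(latest, missing)
```

Passing a file path instead of ":memory:" to open_store would let the snapshots outlive the process, which is the property that makes resuming a long-running agent possible.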

Next, we’ll modify our agent to accept a checkpointer:

class Agent:
    def __init__(self, model, tools, checkpointer, system=""):
        # Everything else remains the same as before
        self.graph = graph.compile(checkpointer=checkpointer)
    # Everything else after this remains the same

Now we can create our agent with persistence enabled:

prompt = """You are a smart research assistant. Use the search engine to look up information.
You are allowed to make multiple calls (either together or in sequence).
Only look up information when you are sure of what you want.
If you need to look up some information before asking a follow-up question, you are allowed to do that!
"""
model = ChatGroq(model="Llama-3.3-70b-Specdec")
bot = Agent(model, [tool], system=prompt, checkpointer=memory)

Adding Streaming

Streaming is essential for real-time updates. There are two types of streaming we’ll focus on:

1. Streaming Messages: Emitting intermediate messages such as AI decisions and tool results.

2. Streaming Tokens: Streaming individual tokens from the LLM’s response.

Let’s start by streaming messages. We’ll create a human message and use the stream method to observe the agent’s actions in real time.

messages = [HumanMessage(content="What is the weather in Texas?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v['messages'])

Final output: The current weather in Texas is sunny with a temperature of 19.4°C (66.9°F) and a wind speed of 4.3 mph (6.8 kph)…..

When you run this, you’ll see a stream of results: first an AI message instructing the agent to call Tavily, then a tool message with the search results, and finally an AI message answering the question.
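
The shape of that stream, where each item maps a node name to the update the node produced, can be illustrated with a plain generator. This is only a sketch of the pattern: fake_llm, fake_tool, and stream_updates are hypothetical stand-ins rather than LangGraph functions, and the "model" is hard-coded.

```python
def fake_llm(messages):
    """Pretend model: request a tool once, then answer."""
    if not any(m.startswith("tool:") for m in messages):
        return {"messages": ["ai: call search"]}
    return {"messages": ["ai: It is sunny."]}


def fake_tool(messages):
    """Pretend search tool with a canned result."""
    return {"messages": ["tool: sunny"]}


def stream_updates(messages):
    """Yield one {node_name: update} dict as soon as each node finishes."""
    messages = list(messages)
    while True:
        update = fake_llm(messages)
        messages += update["messages"]
        yield {"llm": update}
        if not update["messages"][-1].startswith("ai: call"):
            return  # no tool requested, so the run ends here
        update = fake_tool(messages)
        messages += update["messages"]
        yield {"action": update}


events = list(stream_updates(["human: What is the weather in Texas?"]))
for event in events:
    for node, value in event.items():
        print(node, value["messages"])
```

Because the generator yields after every node, a caller can display progress (model decision, tool result, final answer) without waiting for the whole run to finish.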

Understanding Thread IDs

The thread_id is a crucial part of the thread configuration. It allows the agent to maintain separate conversations with different users or contexts. By assigning a unique thread_id to each conversation, the agent can keep track of multiple interactions simultaneously without mixing them up.

For example, let’s continue the conversation by asking, “What about in LA?” using the same thread_id:

messages = [HumanMessage(content="What about in LA?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

Final output: The current weather in Los Angeles is sunny with a temperature of 17.2°C (63.0°F) and a wind speed of 2.2 mph (3.6 kph) ….

The agent infers that we’re asking about the weather, thanks to persistence. To verify, let’s ask, “Which one is warmer?”:

messages = [HumanMessage(content="Which one is warmer?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

Final output: Texas is warmer than Los Angeles. The current temperature in Texas is 19.4°C (66.9°F), while the current temperature in Los Angeles is 17.2°C (63.0°F)

The agent correctly compares the weather in Texas and LA. To test whether persistence keeps conversations separate, let’s ask the same question with a different thread_id:

messages = [HumanMessage(content="Which one is warmer?")]
thread = {"configurable": {"thread_id": "2"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

Output: I need more information to answer that question. Can you please provide more context or specify which two things you’re comparing?

This time, the agent gets confused because it doesn’t have access to the previous conversation’s history.
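
This behaviour follows from history being looked up by thread_id. A rough sketch of that bookkeeping, with a dictionary standing in for a real checkpointer, might look like the following. ToyThreadStore is a hypothetical class written for this example and is not part of LangGraph.

```python
from collections import defaultdict


class ToyThreadStore:
    """Keeps one message history per thread_id (illustration only)."""

    def __init__(self):
        self._histories = defaultdict(list)

    def send(self, config: dict, message: str) -> list[str]:
        """Record the message and return everything this thread has seen."""
        thread_id = config["configurable"]["thread_id"]
        history = self._histories[thread_id]
        history.append(message)
        return list(history)


store = ToyThreadStore()
thread_1 = {"configurable": {"thread_id": "1"}}
thread_2 = {"configurable": {"thread_id": "2"}}

store.send(thread_1, "What is the weather in Texas?")
store.send(thread_1, "What about in LA?")
seen_by_1 = store.send(thread_1, "Which one is warmer?")
seen_by_2 = store.send(thread_2, "Which one is warmer?")
print(len(seen_by_1), len(seen_by_2))  # prints: 3 1
```

Thread "1" sees all three questions, so a model given that history can resolve "which one"; thread "2" sees only the final question and has nothing to compare.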

Streaming Tokens

To stream tokens, we’ll use the astream_events method, which is asynchronous. We’ll also switch to an async checkpointer.

from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

# Top-level "async with" works in a notebook; in a script, wrap this in an
# async function and run it with asyncio.run().
async with AsyncSqliteSaver.from_conn_string(":memory:") as checkpointer:
    abot = Agent(model, [tool], system=prompt, checkpointer=checkpointer)
    messages = [HumanMessage(content="What is the weather in SF?")]
    thread = {"configurable": {"thread_id": "4"}}
    async for event in abot.graph.astream_events({"messages": messages}, thread, version="v1"):
        kind = event["event"]
        if kind == "on_chat_model_stream":
            content = event["data"]["chunk"].content
            if content:
                # Empty content in the context of OpenAI means
                # that the model is asking for a tool to be invoked.
                # So we only print non-empty content
                print(content, end="|")

This will stream tokens in real time, giving you a live view of the agent’s response as it is generated.
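
The filtering pattern used for token streaming can be tried without a model or API key. The sketch below assumes a hypothetical fake_token_events async generator that imitates the event names; for simplicity it puts the text directly under event["data"]["content"], whereas the real events carry a chunk object with a .content attribute.

```python
import asyncio


async def fake_token_events():
    """Hypothetical event source imitating on_chat_model_stream events."""
    for text in ["", "The", " weather", "", " is", " sunny"]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {"event": "on_chat_model_stream", "data": {"content": text}}
    yield {"event": "on_chain_end", "data": {}}


async def collect_tokens():
    """Print non-empty tokens as they arrive and return them as a list."""
    tokens = []
    async for event in fake_token_events():
        if event["event"] != "on_chat_model_stream":
            continue  # ignore every other kind of event
        content = event["data"]["content"]
        if content:  # skip empty chunks, as in the loop above
            print(content, end="|")
            tokens.append(content)
    print()
    return tokens


tokens = asyncio.run(collect_tokens())
```

The "|" separator makes the chunk boundaries visible, which is a handy way to confirm that output really arrives piece by piece rather than as one final string.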

Conclusion

By adding persistence and streaming, we’ve significantly enhanced our AI agent’s capabilities. Persistence allows the agent to maintain context across interactions, while streaming provides real-time insight into its actions. These features are essential for building production-ready applications, especially those involving multiple users or human-in-the-loop interactions.

In the next tutorial, we’ll dive into human-in-the-loop interactions, where persistence plays a crucial role in enabling seamless collaboration between humans and AI agents. Stay tuned!

References:

(DeepLearning.ai) https://learn.deeplearning.ai/courses/ai-agents-in-langgraph


Vineet Kumar is a consulting intern at MarktechPost. He is currently pursuing his BS from the Indian Institute of Technology (IIT), Kanpur. He is a Machine Learning enthusiast and is passionate about research and the latest advancements in Deep Learning, Computer Vision, and related fields.
