Version: 1.0 | GitHub Repo: TheDailyAgent
Date: August 26, 2025
A complete, in-depth guide to the architecture, code, and deployment of an autonomous, multi-agent AI newspaper generator.
"The Daily Agent" is a sophisticated, multi-agent application that autonomously generates a personalized daily newspaper. Built using Python, LangChain, and the powerful LangGraph library, it interprets a user's natural language request to research, summarize, and compose a polished, newspaper-style report.
The system is designed to handle various requests, from a general overview of the day's news to a special report on specific topics, showcasing a robust and intelligent agentic workflow. This project serves as a portfolio piece demonstrating advanced concepts in Agentic AI, including state management, self-correction, structured data processing, and front-end integration.
The core of the application is a stateful graph built with LangGraph. This is not a simple linear chain; it's a cyclical graph that allows for complex logic, looping, and conditional routing. The state (`AgentState`) acts as a shared memory or "workbench" that each agent can read from and write to.
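To make that contract concrete, here is a minimal sketch (with illustrative names, not this project's actual state) of how a LangGraph node reads from the shared state and returns a partial update that LangGraph merges back in:

```python
# Minimal sketch of the node/state contract. 'MiniState' and
# 'pick_next_topic' are illustrative, not part of this project.
from typing import TypedDict

class MiniState(TypedDict):
    topics: list[str]     # shared work queue every node can see
    current_topic: str    # scratch slot the next node will read

def pick_next_topic(state: MiniState) -> dict:
    # A node reads whatever it needs from the shared state...
    next_topic = state["topics"][0]
    # ...and returns only the keys it wants to change; LangGraph
    # merges this partial dict back into the full state.
    return {"topics": state["topics"][1:], "current_topic": next_topic}
```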
The workflow proceeds through five agents:

1. **Input Parser** - The workflow begins at the `input_parser_node`. It takes the user's raw query and uses an LLM with a Pydantic model (`ParsedRequest`) to extract the user's intent. It determines whether the user wants a general newspaper, a report on specific topics, or a combination. This node's sole job is to populate the `topics_to_process` list in the state.
2. **Supervisor** - The `supervisor_node` is the primary controller, or "manager." It checks the `topics_to_process` list. If the list is not empty, it takes the first topic, shortens the list, and formulates a clear, actionable instruction for the `search_agent`. This instruction is time-sensitive, as it includes the current date to ensure the search is for timely news. If the list is empty, it adds a "Finishing" message to the state, which signals the end of the research phase.
3. **Search Agent** - The `search_agent` receives its mission and engages in a sub-loop with the `tool_executor`. It uses its search tool (Tavily), then reviews the results. Its prompt contains self-correction logic, so if the results are irrelevant, it can formulate a new query and use the tool again. This "Reason-Act" (ReAct) pattern continues until it is satisfied with the information it has gathered.
4. **Summarizer** - Once the `search_agent` is done, it passes the raw, messy data to the `summarizer`. This "analyst" agent's job is to enforce quality. It uses a `PydanticOutputParser` to reliably transform the raw text into a clean list of `ArticleSummary` objects, ensuring the data is structured and valid.
5. **Newspaper Creator** - Once the `supervisor` confirms all topics are processed, it routes the completed digests to the `newspaper_creator_node`. This "Editor-in-Chief" agent is given a creative prompt to synthesize all the structured summaries into a single, polished, human-readable newspaper.

The full control flow:

```mermaid
graph TD
    A[START] --> B(input_parser);
    B --> C(supervisor);
    C -- Is Work Done? --> D{supervisor_condition};
    D -- No --> E(search_agent);
    D -- Yes --> F(newspaper_creator);
    E -- Tool Needed? --> G{tools_condition};
    G -- Yes --> H(tool_executor);
    H --> E;
    G -- No --> I(summarizer);
    I --> C;
    F --> J[END];
```
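The "Tool Needed?" branch in the diagram is LangGraph's prebuilt `tools_condition`, which routes on whether the last message still carries pending tool calls. Conceptually, the routing logic looks like this (a sketch for intuition, not the library's source):

```python
# Sketch of what tools_condition decides for the search sub-loop.
def tools_condition_sketch(state) -> str:
    last_message = state["messages"][-1]
    # Keep looping to the tool executor while the model is still
    # requesting tool calls; otherwise the ReAct loop is finished.
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return "__end__"
```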
The project is organized into a modular structure for clarity and maintainability. Below is the complete code for each file with explanations.
### `config.py` - Centralized Configuration

This file centralizes key settings, making them easy to modify without touching the core application logic.
```python
# config.py

# The list of topics for a general news request. This acts as the
# default sections for the "Daily Edition" of the newspaper.
GENERAL_TOPICS = [
    "World News",
    "India National News",
    "Business & Economy",
    "Technology",
    "Sports",
]

# Defines the LLM models to be used. Separating them allows for using
# different models for different tasks (e.g., a faster model for logic,
# a more powerful one for creative writing).
MAIN_LLM_MODEL = "moonshotai/kimi-k2-instruct"
NEWSPAPER_CREATOR_LLM_MODEL = "openai/gpt-oss-120b"
```
### `agent/state.py` - Data Models & State Definition

This file defines all data structures. Using Pydantic models ensures data validation and reliability, while the `AgentState` TypedDict provides the blueprint for the graph's memory.
```python
# agent/state.py
from typing import Annotated, TypedDict, List, Optional

from pydantic import BaseModel, Field
from langchain_core.messages import AnyMessage
from langgraph.graph import add_messages


# Pydantic model for a single, structured news summary.
# It enforces the data types and includes descriptions for clarity.
# The 'url' field is Optional, making the agent resilient to search
# results that may not contain a source link.
class ArticleSummary(BaseModel):
    """A structured container for a single article's summary."""
    title: str = Field(description="The main headline of the news article.")
    url: Optional[str] = Field(description="The direct web link to the original article. Can be None if not found.")
    summary: str = Field(description="A detailed, AI-generated summary of the article.")


# A container model that holds a list of ArticleSummary objects.
# This is used by the PydanticOutputParser to parse the LLM's output.
class Summaries(BaseModel):
    """A container for a list of article summaries."""
    articles: List[ArticleSummary]


# Pydantic model for parsing the user's initial, unstructured request.
# It's more flexible than a simple Enum, allowing for hybrid requests.
class ParsedRequest(BaseModel):
    """A model to represent the parsed user request."""
    includes_general_news: bool = Field(description="Set to true if the user makes a general request for the newspaper or 'today's news'.")
    specific_topics: List[str] = Field(default=[], description="A list of specific news topics the user explicitly mentioned.")


# The main state definition for the entire graph.
# Each key represents a piece of memory the agents can access and modify.
# 'add_messages' is a special helper from LangGraph to easily append to
# the conversation history.
class AgentState(TypedDict):
    """Represents the state of the agent's workflow."""
    messages: Annotated[list[AnyMessage], add_messages]
    topics_to_process: List[str]
    completed_digests: dict[str, List[ArticleSummary]]
    current_topic: str
    final_output: Optional[str]
```
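As a quick illustration of the validation these models provide, an `ArticleSummary` can be constructed directly and will reject malformed data (a standalone sketch, not part of the repo):

```python
# Illustrative sketch: Pydantic validation in action.
from pydantic import ValidationError
from agent.state import ArticleSummary

# Valid: 'url' is Optional, so None is accepted.
article = ArticleSummary(
    title="Markets Rally on Rate-Cut Hopes",
    url=None,
    summary="Equities rose broadly after...",
)

# Invalid: omitting 'summary' raises a ValidationError instead of
# silently passing a half-filled object downstream.
try:
    ArticleSummary(title="Headline only", url=None)
except ValidationError as err:
    print(err)
```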
### `agent/nodes.py` - The Agent's Core Logic

This file contains all the functions that act as nodes in our graph. Each function represents a specific agent or task.
```python
# agent/nodes.py
from datetime import datetime

from langchain_core.messages import HumanMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain.output_parsers import PydanticOutputParser
from langchain_groq import ChatGroq
from langchain_tavily import TavilySearch

from .state import AgentState, Summaries, ParsedRequest, ArticleSummary
from config import GENERAL_TOPICS, MAIN_LLM_MODEL, NEWSPAPER_CREATOR_LLM_MODEL

# --- LLM AND TOOL SETUP ---
# Initialize the LLMs and tools that the nodes will use.
llm = ChatGroq(model=MAIN_LLM_MODEL)
llm_for_newspaper_creation = ChatGroq(model=NEWSPAPER_CREATOR_LLM_MODEL)

# Note: the langchain-tavily package exposes the tool as TavilySearch
# (the older TavilySearchResults class lives in langchain_community).
search_tool = TavilySearch(max_results=3)
tools = [search_tool]
llm_with_tool = llm.bind_tools(tools)


# --- NODE FUNCTIONS ---
def input_parser_node(state: AgentState):
    """Parses the user's request to handle general, specific, and hybrid news requests."""
    user_message = state['messages'][-1].content
    intent_llm = llm.with_structured_output(ParsedRequest)
    prompt = f"""
    Analyze the user's request below. Your task is to determine two things:
    1. Does the request include a general ask for the news (e.g., "get today's news", "daily newspaper")?
    2. Does the request mention any specific topics (e.g., "Formula 1", "AI developments")?
    Return the structured analysis based on this.

    User Request: "{user_message}"
    """
    parsed_request = intent_llm.invoke(prompt)

    # Use a set for automatic de-duplication of topics
    final_topics = set()
    if parsed_request.includes_general_news:
        final_topics.update(GENERAL_TOPICS)
    if parsed_request.specific_topics:
        final_topics.update(parsed_request.specific_topics)
    if not final_topics:
        final_topics.update(GENERAL_TOPICS)

    return {"topics_to_process": list(final_topics)}


def supervisor_node(state: AgentState):
    """The 'manager' of our agent. It gives the searcher a clear, timely task."""
    if state['topics_to_process']:
        next_topic = state['topics_to_process'][0]
        remaining_topics = state['topics_to_process'][1:]
        today_date = datetime.now().strftime("%B %d, %Y")
        instruction = HumanMessage(
            content=f"""
            Your task is to find relevant news articles published on or around today's date, {today_date}, on the topic: '{next_topic}'.

            **CRITICAL INSTRUCTIONS:**
            1. First, use your search tool to find relevant news articles.
            2. After the initial search, **you must review the results**.
            3. If the results are not relevant to the topic '{next_topic}', you **must formulate a new, more specific search query and search again**.
            4. Once you have a set of relevant articles, compile the search results into a single block of text for the summarizer agent.
            """
        )
        return {
            "messages": [instruction],
            "topics_to_process": remaining_topics,
            "current_topic": next_topic
        }
    else:
        return {"messages": [HumanMessage(content="All topics processed. Finishing.")]}


def search_agent_node(state: AgentState):
    """This is the ReAct agent. Its job is to use tools and return the conversation history."""
    response = llm_with_tool.invoke(state['messages'])
    # The 'add_messages' reducer expects a list, so we wrap the response in a list.
    return {"messages": [response]}


def summarizer_node(state: AgentState):
    """This node takes the final search results and transforms them into structured data."""
    last_message = state['messages'][-1].content
    parser = PydanticOutputParser(pydantic_object=Summaries)
    prompt_template = ChatPromptTemplate.from_template(
        """
        You are an expert news analyst. Your task is to take the provided text and convert it into a DETAILED, well-structured summary.
        The summary should ideally be at least 2-3 paragraphs long and cover background, the main event, and implications.

        **CRITICAL INSTRUCTION:** If the provided "Raw Data" is too sparse, create a shorter, one-paragraph summary based only on the information you have. Do NOT invent information.

        Extract the title and URL, and generate the summary.
        {format_instructions}

        Raw Data:
        {raw_data}
        """,
        partial_variables={"format_instructions": parser.get_format_instructions()},
    )
    chain = prompt_template | llm | parser
    summary_object = chain.invoke({"raw_data": last_message})

    current_topic = state['current_topic']
    existing_digests = state.get("completed_digests", {})
    existing_digests[current_topic] = summary_object.articles

    return {"completed_digests": existing_digests}


def newspaper_creator_node(state: AgentState):
    """The final node that composes all summaries into a polished, daily newspaper format."""
    all_digests = state["completed_digests"]
    user_message = state['messages'][0].content
    today_date = datetime.now().strftime("%B %d, %Y")

    formatted_digests = ""
    for topic, summaries in all_digests.items():
        formatted_digests += f"--- Section: {topic} ---\n\n"
        for summary in summaries:
            formatted_digests += f"Title: {summary.title}\n"
            if summary.url:
                formatted_digests += f"Source: {summary.url}\n"
            formatted_digests += f"Summary:\n{summary.summary}\n\n"

    prompt = f"""
    You are the Editor-in-Chief of "The Daily Agent," a futuristic, AI-powered newspaper.
    Your task is to take today's compiled news summaries and create the front page for today, {today_date}.
    The user's original request was: "{user_message}".

    **INSTRUCTIONS:**
    1. **Masthead:** Start with the newspaper's name, "The Daily Agent," and today's date.
    2. **Main Headline:** Create a powerful, engaging headline for the entire day's edition.
    3. **Editor's Note:** Write a short, 2-3 sentence introductory paragraph that summarizes the key themes of today's news.
    4. **Featured Stories (If Applicable):** Check if the user's original request mentioned specific topics. If so, create a "Today's Featured Stories" section at the top and present the summaries for those specific topics first.
    5. **Standard Sections:** After the featured stories (or if there are none), create a separate, clearly marked section for each news category (e.g., "World News," "Technology," etc.).
    6. **Tone:** Use a professional, clean, and highly readable journalistic style.

    **Today's Compiled News Summaries:**
    {formatted_digests}

    Now, generate the complete, final newspaper layout.
    """
    newspaper_content = llm_for_newspaper_creation.invoke(prompt).content
    return {"final_output": newspaper_content}
```
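Because each node is a plain function over the state, it can be exercised in isolation. A minimal sketch of testing the parser node on a hybrid request (assumes a valid `GROQ_API_KEY` is set in the environment):

```python
# Illustrative sketch: calling a single node outside the graph.
from langchain_core.messages import HumanMessage
from agent.nodes import input_parser_node

state = {"messages": [HumanMessage(content="Give me today's newspaper, plus anything on Formula 1")]}
update = input_parser_node(state)

# Expected: all GENERAL_TOPICS plus "Formula 1", de-duplicated by the set.
print(update["topics_to_process"])
```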
### `agent/graph.py` - Assembling the Agentic Workflow

This file imports all the components and wires them together into the final, compiled agent graph. It defines the structure and flow of the entire application.
```python
# agent/graph.py
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode, tools_condition

# Import from local package files
from .state import AgentState
from .nodes import (
    input_parser_node,
    supervisor_node,
    search_agent_node,
    summarizer_node,
    newspaper_creator_node,
    tools,  # Import the tools list from nodes.py
)


def create_newspaper_agent():
    """Builds and compiles the LangGraph agent."""
    builder = StateGraph(AgentState)

    # Add Nodes to the graph
    builder.add_node("input_parser", input_parser_node)
    builder.add_node("supervisor", supervisor_node)
    builder.add_node("search_agent", search_agent_node)
    builder.add_node("tool_executor", ToolNode(tools))
    builder.add_node("summarizer", summarizer_node)
    builder.add_node("newspaper_creator", newspaper_creator_node)

    # Define the graph's edges and conditional logic
    builder.add_edge(START, "input_parser")
    builder.add_edge("input_parser", "supervisor")

    def supervisor_condition(state: AgentState):
        """Checks if the research phase is complete."""
        if "Finishing" in state['messages'][-1].content:
            return "newspaper_creator"
        return "search_agent"

    builder.add_conditional_edges("supervisor", supervisor_condition, {
        "newspaper_creator": "newspaper_creator",
        "search_agent": "search_agent"
    })

    # The ReAct sub-loop for searching
    builder.add_conditional_edges("search_agent", tools_condition, {
        "tools": "tool_executor",
        "__end__": "summarizer"
    })
    builder.add_edge("tool_executor", "search_agent")

    # After summarizing a topic, loop back to the supervisor
    builder.add_edge("summarizer", "supervisor")

    # The final step before ending
    builder.add_edge("newspaper_creator", END)

    # Compile the graph into a runnable object
    return builder.compile()
```
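The compiled graph is a standard runnable, so it can also be driven from a plain script without Streamlit. A minimal sketch (`run_once.py` is a hypothetical helper, not part of the repo; assumes the API keys are in `.env`):

```python
# run_once.py - drive the agent without the UI (illustrative).
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage

from agent.graph import create_newspaper_agent

load_dotenv()  # load GROQ_API_KEY / TAVILY_API_KEY from .env

graph = create_newspaper_agent()
result = graph.invoke(
    {"messages": [HumanMessage(content="Generate today's newspaper")]},
    config={"recursion_limit": 50},  # the research loop can take many steps
)
print(result["final_output"])
```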
### `app.py` - The Streamlit User Interface

The user-facing part of the application, providing a simple web interface for interaction.
```python
# app.py
import os

import streamlit as st
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage

from agent.graph import create_newspaper_agent

# Load environment variables from .env file
load_dotenv()

# --- PAGE CONFIGURATION ---
st.set_page_config(page_title="The Daily Agent 📰", layout="wide")
st.title("The Daily Agent 📰")
st.subheader("Your AI-Powered Newspaper, Generated on Demand")


# --- AGENT INITIALIZATION ---
# Use st.cache_resource to create and cache the compiled agent graph.
# This ensures the agent is only built once, speeding up the app.
@st.cache_resource
def get_agent_graph():
    """Creates and returns the compiled newspaper agent graph."""
    # Check for necessary API keys
    if not os.getenv("GROQ_API_KEY") or not os.getenv("TAVILY_API_KEY"):
        st.error("API keys for Groq and Tavily not found. Please set them in your .env file.")
        return None
    return create_newspaper_agent()


graph = get_agent_graph()

# --- USER INTERACTION ---
if graph:
    user_request = st.text_input(
        "What news are you interested in today?",
        placeholder="e.g., 'Generate today's newspaper' or 'Tell me about AI and Formula 1'"
    )

    if st.button("Generate Newspaper"):
        if not user_request:
            st.warning("Please enter a request to generate your newspaper.")
        else:
            initial_input = {"messages": [HumanMessage(content=user_request)]}
            st.markdown("---")

            with st.spinner("🤖 The Daily Agent is writing your newspaper... This may take a moment."):
                final_response = None
                # Use an expander for the agent's "thoughts" to keep the UI clean
                with st.expander("Show Agent Thoughts 🧠"):
                    for event in graph.stream(initial_input, stream_mode="values", config={"recursion_limit": 50}):
                        if "messages" in event and event["messages"]:
                            latest_message = event["messages"][-1]
                            if hasattr(latest_message, 'content') and latest_message.content:
                                st.write(f"**Step: {type(latest_message).__name__}**")
                                st.write(latest_message.content)
                                st.write("---")
                        # stream_mode="values" yields the full state after each
                        # step, so the last event holds the finished state.
                        final_response = event

            st.markdown("---")
            st.header("Your Newspaper Is Ready! 🗞️")

            # Display the final, polished newspaper
            if final_response and final_response.get("final_output"):
                st.markdown(final_response["final_output"])
            else:
                st.error("Sorry, the agent finished but a newspaper could not be generated.")
```
### `requirements.txt` - Project Dependencies

This file lists all the libraries required to run the project.
```
# requirements.txt
langchain
langgraph
langchain-groq
langchain-google-genai
langchain-tavily
pydantic
python-dotenv
streamlit
requests
beautifulsoup4
```
Follow these steps to run the project locally.

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/ChidambaraRaju/langgraph-news-agent.git
   cd langgraph-news-agent
   ```

2. **Create a Virtual Environment (using `uv`):**

   ```bash
   uv venv
   ```

3. **Activate the Environment:**

   ```bash
   source .venv/bin/activate
   ```

   (On Windows, use `.venv\Scripts\activate`.)

4. **Install Dependencies:**

   ```bash
   uv pip install -r requirements.txt
   ```

5. **Set Up API Keys:** Create a file named `.env` in the root of your project folder and add your API keys:

   ```
   GROQ_API_KEY="YOUR_GROQ_API_KEY_HERE"
   TAVILY_API_KEY="YOUR_TAVILY_API_KEY_HERE"
   ```

6. **Run the App:** Execute the following command from the project's root directory:

   ```bash
   streamlit run app.py
   ```

Your web browser will open a new tab with the application running.
This project is licensed under the MIT License.