
There's Coffee In That Nebula. Part 7: Exploring the potential of emergent LLM behaviours

Written by
Mariano Cigliano
Published on
August 13, 2024
TL;DR

Explore how Project LEDA is revolutionising retail customer analysis with LLM-driven Exploratory Data Analysis (EDA). This series follows the development of a Conversational Retail Customer Analysis system that combines the power of AI and human expertise. LEDA leverages autonomous agents to make data analysis as simple as chatting with a colleague. From single-agent systems to multi-agent teams, discover the challenges, innovations, and potential of AI in transforming data-driven decision-making.


Welcome back to our "There's Coffee In That Nebula" series! 

Following the exploration of Mobegí, our Retrieval-Augmented Generation (RAG) system, we're now embarking on a new project that pushes the boundaries of AI-assisted data analysis.

Today, we're excited to invite you on our next expedition: Project LEDA (LLM-driven Exploratory Data Analysis). Imagine a world where retail analysts can converse with their data as easily as chatting with a knowledgeable colleague. That's the promise of LEDA – a Conversational Retail Customer Analysis system that leverages the power of autonomous LLM agents to revolutionise how we approach exploratory data analysis.

In this three-part series, we'll document our 13-week journey from concept to prototype. We'll begin by outlining our assumptions and the project's foundation, then explore the capabilities of a single-agent system, and finally, investigate the potential of a multi-agent team approach. As we progress through our exploration of LEDA, we'll delve into the technical challenges and strategic implications of implementing autonomous agents for data analysis. We'll examine how this technology can potentially streamline operations, enhance decision-making processes, and create new opportunities for innovation within your organisations.

Buckle up for a thrilling ride as we navigate the complexities of natural language interfaces, contextual reasoning, and autonomous agents. 

Who knows? By the end of this journey, you might find yourself looking at your spreadsheets and databases in a whole new light, asking, "Is there coffee in that data nebula?" Let's find out together!

Assumptions

About

Our primary goal was to develop a system capable of performing Exploratory Data Analysis (EDA) on retail datasets, with a particular focus on customer segmentation and behaviour analysis. We envisioned a tool that could allow retail analysts to interact with their data through natural language queries, democratising access to complex analytical capabilities.

While traditional statistical methods and Machine Learning (ML) techniques have been widely used for EDA and customer segmentation, we chose to explore an emerging approach leveraging Large Language Models (LLMs).
This decision was driven by our particular interest in exploring:

  • the emergent reasoning capabilities of LLMs when applied to structured data analysis
  • the flexibility of an agentic system to adapt to various types of retail data and queries without extensive reprogramming

We were intrigued by the possibility that these models could develop novel approaches to data exploration, potentially revolutionising how we conduct EDA in retail contexts.

We recognized that a fully automated EDA solution would likely be less effective than an interactive approach. By keeping humans in the loop, we aimed to combine the advanced capabilities of LLMs with human expertise and intuition. This interactive model allows for real-time adjustments, incorporation of domain knowledge, and nuanced exploration that a purely automated system might miss, ensuring our analysis remains both innovative and practically grounded.

The system components are based on Python 3.11.*. 

Our main dependencies include:

  • Pydantic, a powerful Python library for defining data structures, ensuring data conformity, and validating/deserializing JSON data (see the sketch after this list).
  • Pandas, used for dataframe operations and data manipulation.
  • Matplotlib, employed for data visualisation and plotting.
  • LangChain 0.1.5 and its Expression Language declarative syntax, through which all LLM-based components are implemented (more details are provided in the Stack section).
  • LangGraph, a library for building stateful, multi-actor applications, which we used to orchestrate the flow of the solution.
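
To make the Pydantic piece concrete, here is a minimal sketch (assuming Pydantic v2) of validating an LLM's JSON output into a typed structure; the CustomerSegment schema and its fields are invented purely for illustration:

```python
from pydantic import BaseModel, Field

# Hypothetical schema for a customer segment an agent might return.
class CustomerSegment(BaseModel):
    name: str = Field(description="Human-readable segment label")
    size: int = Field(ge=0, description="Number of customers in the segment")
    avg_basket_value: float

# Validate and deserialize raw JSON, e.g. an LLM response.
raw = '{"name": "bargain hunters", "size": 1200, "avg_basket_value": 14.5}'
segment = CustomerSegment.model_validate_json(raw)
print(segment.name, segment.size)
```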

As the client, we picked Streamlit, a popular Python library for creating web applications with minimal code, particularly suited for data science and machine learning projects. However, our system's architecture emphasises separation of concerns, which allows for client flexibility.

We named our system LEDA, which stands for LLM-driven Exploratory Data Analysis. This acronym not only describes the core functionality of our tool but also draws inspiration from Greek mythology. In myth, Leda was a queen of Sparta known for her beauty and intelligence. 

Just as Leda's encounter with divinity led to extraordinary outcomes, our LEDA system aims to bring together human expertise and the 'divine-like' capabilities of LLMs, potentially giving birth to remarkable insights in data analysis.

First thoughts

Our team's previous experiences with agent-based systems, dating back to the end of 2022, informed our approach to LEDA and helped shape our expectations.

  • Stability and Consistency: We anticipated challenges in maintaining stable performance and consistent outputs, especially when dealing with structured data. 
  • Avoiding Loops: In our initial work with agents, we observed their tendency to occasionally get stuck in repetitive patterns or logical loops. We knew this would be a critical area to address, especially in the context of exploratory data analysis where diverse, non-repetitive insights are crucial.
  • Evaluation Complexity: Our experience taught us that traditional evaluation metrics for AI systems often fall short when applied to autonomous agents. The open-ended nature of their operations, and the potential for emergent behaviours, meant we needed to imagine more nuanced evaluation frameworks.
  • Balancing Autonomy and Human Oversight: While the power of agents lies in their autonomy, that same autonomy can easily become a challenge to manage. We recognized the importance of keeping humans in the loop to de-risk this factor and, overall, provide a more efficient solution.
  • Cost Predictability: We acknowledged that the agentic approach, with its potentially iterative nature, could lead to less predictable costs compared to traditional LLM usage. 

Much like our approach with Mobegí, we viewed LEDA as more than just a prototype. We saw it as an opportunity to grasp the complexities that would inevitably arise in real-world deployments of such systems. 

What are agents, anyway?

In the context of Generative AI, agents are autonomous or semi-autonomous software entities powered by large language models (LLMs). They're designed to perceive their environment (in our case, datasets and user queries), make decisions, and take actions to achieve specific goals. Unlike simple chatbots or query systems, agents can plan, reason, and even use tools to accomplish complex tasks.
At the end of 2022, the paper “ReAct: Synergizing Reasoning and Acting in Language Models” by Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao outlined a framework that interleaves reasoning traces with task-specific actions, mirroring human problem-solving methodologies. Agent architectures in this lineage typically rest on three building blocks: tools, memory, and reasoning.

Tools

External capabilities and integrations that expand an agent's functionality beyond its core language processing abilities. These can include APIs for data retrieval, computational services for complex calculations, code generation utilities, and various other task-specific interfaces. By leveraging these tools, LLM Agents can interact with the external world, manipulate data, and perform actions that would be impossible through text generation alone.
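
As an illustration, a tool can be as simple as a decorated Python function the agent may choose to call. The describe_column helper and the toy dataframe below are hypothetical, not part of LEDA itself:

```python
import pandas as pd
from langchain_core.tools import tool

# Toy stand-in for the retail dataset under analysis.
df = pd.DataFrame({"basket_value": [12.5, 40.0, 7.2, 9.9]})

@tool
def describe_column(column: str) -> str:
    """Return descriptive statistics for one dataframe column."""
    return df[column].describe().to_string()

# The decorated function exposes a name, description, and argument schema
# that an agent can inspect when deciding which tool to call.
print(describe_column.name)
print(describe_column.invoke({"column": "basket_value"}))
```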

Memory

It encompasses both short-term and long-term information storage mechanisms that allow the agent to maintain context, learn from experiences, and adapt over time. Short-term memory typically handles immediate context within a conversation or task, while long-term memory stores more persistent information, including learned strategies, frequently accessed data, and outcomes from past interactions. Advanced memory systems may include experience learners that process and preserve significant experiences, and memory retrievers that can recall relevant information when faced with new but similar situations.
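
A minimal sketch of the short-term side, using LangChain's conversation buffer; the exchange shown is invented for illustration:

```python
from langchain.memory import ConversationBufferMemory

# Short-term memory: accumulates the running conversation so the agent
# keeps context across turns.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
memory.save_context(
    {"input": "Which segment spends the most per visit?"},
    {"output": "High-frequency urban shoppers."},
)
print(memory.load_memory_variables({})["chat_history"])
```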

Reasoning

It refers to the cognitive processes that enable problem-solving, decision-making, and task completion. This includes the ability to break down complex problems into manageable subtasks, a function often handled by a planning module. Reasoning also encompasses the agent's capacity to analyse information, draw inferences, and generate logical conclusions based on available data and prior knowledge.

It's important to understand that reasoning, like many advanced capabilities of Large Language Models, is not an ability explicitly encoded or designed into the model's architecture or training objectives. 

Instead, it's an emergent property that manifests as these models increase in size and are exposed to more diverse training data. This phenomenon of emergence, where complex behaviours arise without being directly programmed, is a fascinating aspect of LLM development. Reasoning, along with abilities like arithmetic, question-answering, and summarization, surfaces as the model scales up, seemingly arising from the intricate patterns learned from vast amounts of text data. 

This emergence challenges our traditional understanding of AI capabilities and raises intriguing questions about the nature of intelligence and learning. For those interested in exploring the topic further, Wei et al.'s “Emergent Abilities of Large Language Models” (2022) is a valuable read.

The stack

MongoDB

While our usual practice is to deploy on AWS, for this project we transitioned to MongoDB as our database solution. Its scalability and robust querying capabilities provide a solid foundation for handling the complex data needs of our LLM agents. 
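
A sketch of how analysis artefacts might be persisted; the connection string, database, and collection names are assumptions, not LEDA's actual configuration:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed local instance
runs = client["leda"]["analysis_runs"]             # hypothetical db/collection

runs.insert_one({"query": "segment customers by basket value", "status": "done"})
print(runs.count_documents({"status": "done"}))
```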

Pandas

We incorporated Pandas into our stack to enhance our data manipulation and analysis capabilities. Pandas excels at handling structured data, offering powerful tools for data cleaning, transformation, and exploration. Its integration with other Python libraries made it an ideal choice for preprocessing data before feeding it into our LLM agents.
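
For instance, a preprocessing step ahead of the agents might look like the following sketch; the column names and toy data are hypothetical:

```python
import pandas as pd

# Toy stand-in for raw retail transactions.
df = pd.DataFrame({
    "customer_id": [1, 1, 2, None],
    "visit_date": ["2024-06-01", "2024-06-08", "2024-06-02", "2024-06-03"],
    "basket_value": [12.5, 40.0, 7.2, 9.9],
})
df["visit_date"] = pd.to_datetime(df["visit_date"])
df = df.dropna(subset=["customer_id"])

# Per-customer aggregates the agents can reason over.
summary = df.groupby("customer_id").agg(
    visits=("visit_date", "nunique"),
    total_spend=("basket_value", "sum"),
)
print(summary)
```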

LangChain

LangChain's continuous improvements and community-driven development have kept it at the forefront of LLM application development, validating our choice to stick with this technology. The abstraction it provides allowed us to focus on developing sophisticated agent behaviours rather than reinventing lower-level functionalities. 

A good starting point is their brand-new documentation.

LCEL (LangChain Expression Language)

Building on our previous experience with LCEL, we fully embraced it in this project. The composability it offers for creating complex flows, together with out-of-the-box support for batch processing, asynchronous operations, and streaming, sped up development and improved code readability. 
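
To give a flavour of that composability, here is a minimal LCEL chain; the prompt, model choice, and column names are illustrative rather than taken from LEDA, and an OpenAI API key is assumed:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Suggest one EDA step for a retail dataset with columns: {columns}"
)

# The pipe operator composes runnables; batching, async, and streaming
# then come for free on the resulting chain.
chain = prompt | ChatOpenAI(temperature=0) | StrOutputParser()
print(chain.invoke({"columns": "customer_id, basket_value, visit_date"}))
```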

LangGraph

As a new addition to our stack, LangGraph provided us with powerful tools for building and managing multi-agent systems. Its ability to model agent interactions as a graph allowed us to create more nuanced and dynamic agent behaviours.
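
A minimal sketch of the idea with a single-node graph; the state fields and the analyst node are invented for illustration:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class EDAState(TypedDict):
    question: str
    answer: str

def analyst(state: EDAState) -> dict:
    # Placeholder for an LLM-backed analysis step.
    return {"answer": f"(analysis of: {state['question']})"}

graph = StateGraph(EDAState)
graph.add_node("analyst", analyst)
graph.set_entry_point("analyst")
graph.add_edge("analyst", END)

app = graph.compile()
print(app.invoke({"question": "How do weekend baskets differ?"}))
```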

LangSmith

We leveraged LangSmith for fine-tuning our prompts, tracking model performance over time, and identifying areas for improvement in our agent's decision-making processes. The insights gained from LangSmith's analytics helped us iteratively refine our agents, resulting in more efficient and effective AI systems.
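
Enabling tracing is largely a matter of environment configuration; a sketch follows, with a hypothetical project name:

```python
import os

# Assumed LangSmith configuration via environment variables.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-key>"
os.environ["LANGCHAIN_PROJECT"] = "leda-eda"  # hypothetical project name

# Any LangChain or LangGraph run executed after this point is traced,
# so prompts, latencies, and agent decisions can be inspected in LangSmith.
```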

Streamlit

Streamlit's simplicity in turning data scripts into shareable web apps allowed us to rapidly prototype and iterate on our user interface designs. This approach enabled us to gather feedback quickly and make adjustments on the fly. 
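
A stripped-down sketch of what such a chat client can look like; the echoed answer stands in for the actual agent call:

```python
import streamlit as st

st.title("Conversational EDA")

# Keep the conversation across Streamlit reruns.
if "history" not in st.session_state:
    st.session_state.history = []

for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if question := st.chat_input("Ask something about your customer data"):
    st.chat_message("user").write(question)
    answer = f"(agent response to: {question})"  # placeholder for the agent call
    st.chat_message("assistant").write(answer)
    st.session_state.history += [("user", question), ("assistant", answer)]
```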

A note about LlamaIndex and PandasAI

We did explore these technologies during our development process as alternatives to LangChain's implementation of a Pandas ReAct agent.
We ultimately decided against fully adopting PandasAI, despite its advanced response handling.
The insights gained from evaluating these technologies informed some of our design decisions and may influence future iterations of our system.
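
For reference, a LangChain Pandas ReAct agent of the kind mentioned above can be spun up roughly like this; the toy dataframe and question are hypothetical, and the agent lives in the langchain-experimental package:

```python
import pandas as pd
from langchain_experimental.agents import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI

# Toy stand-in for a retail dataset.
df = pd.DataFrame({"customer_id": [1, 2, 2, 3], "basket_value": [12.5, 40.0, 7.2, 9.9]})

# A ReAct-style agent that writes and executes pandas code against df.
# Newer langchain-experimental versions also require allow_dangerous_code=True.
agent = create_pandas_dataframe_agent(ChatOpenAI(temperature=0), df, verbose=True)
agent.invoke({"input": "How many distinct customers are in the dataset?"})
```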

More information about the LlamaIndex implementation is available in their documentation.

Coming next

Thank you for joining us! We hope this overview has provided a solid foundation for understanding the context and approach behind LEDA. The intersection of AI agents and data analysis presents fascinating possibilities for enhancing how we extract insights from complex datasets.

In the next article, we'll dive deep into the first phase of our journey: developing and testing a single agent for EDA. We'll explore the various paths we investigated, the challenges we encountered, and the valuable findings that shaped our approach moving forward.

Stay tuned as we continue to unravel the potential of autonomous AI agents in the realm of data analysis!