From Retrieval-Augmented Generation (RAG) to Function Calls: Building an Express Delivery AI Assistant

Yesterday, we explored using Retrieval-Augmented Generation (RAG) to build a database AI assistant. Today, we shift gears and use function calls to build a more streamlined and efficient express delivery AI assistant. The core business logic remains the same: users interact with a Large Language Model (LLM) through conversational prompts to retrieve and display their express delivery details. The key difference lies in how the LLM accesses this data.

Understanding Function Calls (Tool Calls) in Large Language Models

Function calls, also known as tool calls, represent a significant advancement in LLM capabilities. They allow LLMs to interact directly with external APIs or tools, extending their functionality beyond their inherent knowledge base. This contrasts with RAG, which relies on retrieving relevant information from a pre-indexed database. Function calls offer a more dynamic and efficient approach, especially when dealing with real-time data or complex interactions. Think of function calls as giving the LLM the ability to "call" upon specialized tools to perform specific tasks, receiving the results and integrating them into its response.

Other techniques, such as the Model Context Protocol (MCP) and RAG, also serve to enhance the capabilities of LLMs, but each approaches this enhancement differently. MCP standardizes how an LLM application connects to external data sources and tools, whereas RAG relies on external document retrieval. Function calls, in contrast, let the model itself request an interaction with an external service or tool. All these methods ultimately contribute to pushing the boundaries of AI's capabilities.

The Function Call Workflow

The execution of a function call typically follows these steps:

  1. User Input: The user provides a natural language query to the LLM.

  2. LLM Interpretation: The LLM interprets the query and identifies the need for external information or a specific action.

  3. Function Selection: The LLM selects the appropriate function or API based on its understanding of the query.

  4. Function Execution: The LLM sends the necessary parameters to the selected function and executes it.

  5. Result Retrieval: The function returns the results of its execution.

  6. Response Generation: The LLM integrates the results into a coherent and informative response to the user.

This process significantly enhances the LLM's ability to handle complex tasks and access up-to-date information, resulting in more accurate and helpful responses.
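For concreteness, here is a minimal sketch of this loop in Python, assuming an OpenAI-compatible tools endpoint (Bailian/Tongyi Qianwen exposes one). The endpoint URL, model name, environment variable, and the stubbed get_transported_express helper are illustrative assumptions; the real data-access function is defined in the next section.

```python
# Sketch of steps 1-6 of the function-call workflow against an assumed
# OpenAI-compatible endpoint. Endpoint, model name, and helper are illustrative.
import json
import os

from openai import OpenAI


def get_transported_express():
    # Stub standing in for the data-access function defined later in this article.
    return [{"tracking_number": "1Z999AA10123456789", "status": "Transported", "location": "Shanghai"}]


client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

# Describe the callable tool to the model (steps 2-3 rely on this schema).
tools = [{
    "type": "function",
    "function": {
        "name": "get_transported_express",
        "description": "Return the user's packages whose status is 'Transported'.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "Which of my deliveries were transported?"}]  # step 1
first = client.chat.completions.create(model="qwen-plus", messages=messages, tools=tools)
call = first.choices[0].message.tool_calls[0]  # steps 2-3: the model selects a function

result = get_transported_express(**json.loads(call.function.arguments or "{}"))  # step 4: we execute it
messages.append(first.choices[0].message)  # keep the assistant's tool-call turn in the history
messages.append({"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)})  # step 5

final = client.chat.completions.create(model="qwen-plus", messages=messages, tools=tools)
print(final.choices[0].message.content)  # step 6: natural-language answer
```

Note that the application, not the model, actually runs the function: the LLM only emits a structured request naming the function and its arguments, and the application feeds the result back for the final response.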

Alibaba Cloud's Bailian (Tongyi Qianwen) as an Example

To illustrate the implementation of function calls, let's consider Alibaba Cloud's Bailian (Tongyi Qianwen) large language model. While connecting to a live database to query express information is essential for a real-world application, we'll demonstrate the core functionality here using simulated test data. This keeps the focus on the function call mechanism itself, without the complexities of database integration.

Simulated Data and Code Implementation

We'll create a simplified representation of express delivery data. This data would normally be fetched from a database, but for demonstration purposes, we'll use a Python dictionary:

```python
express_data = {
    "tracking_numbers": [
        {"tracking_number": "1Z999AA10123456789", "status": "Transported", "location": "Shanghai"},
        {"tracking_number": "1Z999BB10123456780", "status": "Transported", "location": "Beijing"},
        {"tracking_number": "1Z999CC10123456781", "status": "In Transit", "location": "Guangzhou"},
    ]
}

def get_transported_express():
    # Simulate a database query: keep only packages whose status is "Transported".
    transported = []
    for item in express_data["tracking_numbers"]:
        if item["status"] == "Transported":
            transported.append(item)
    return transported
```

This get_transported_express function simulates fetching data from a database: it filters express_data and returns only the packages whose status is "Transported".
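Calling it directly against the test data returns only the two delivered packages:

```python
for pkg in get_transported_express():
    print(pkg["tracking_number"], pkg["location"])
# 1Z999AA10123456789 Shanghai
# 1Z999BB10123456780 Beijing
```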

Next, we demonstrate interacting with the large language model using a hypothetical ChatClient. The actual implementation would depend on the specific LLM API.

```python
# Hypothetical ChatClient interaction

class ChatClient:
    def send_message(self, message, functions):
        # Placeholder: a real ChatClient would hand the query and function
        # definitions to the LLM and relay its tool calls.
        if "transported" in message.lower():
            return {"transported_packages": get_transported_express()}
        return {"error": "Invalid query"}

client = ChatClient()
user_query = "I have several 'transported' express deliveries."
response = client.send_message(user_query, None)  # Replace None with actual function definitions if needed

if "transported_packages" in response:
    print("Transported Packages:")
    for package in response["transported_packages"]:
        print(f"- Tracking Number: {package['tracking_number']}, Location: {package['location']}")
```

This code snippet simulates the interaction. The ChatClient receives the user query, determines the appropriate function call (in this case get_transported_express), executes it, and presents the results. A real-world implementation would involve detailed error handling and more sophisticated interaction with the LLM API.
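To go beyond the keyword matching above, a real implementation would read the structured tool calls the LLM returns and dispatch on the function name. Here is a minimal sketch of such a dispatcher, reusing get_transported_express from above; the AVAILABLE_FUNCTIONS registry and the name/arguments parameters are assumptions about how the tool call arrives:

```python
import json

# Hypothetical registry mapping tool names the LLM may emit to local callables.
AVAILABLE_FUNCTIONS = {"get_transported_express": get_transported_express}

def dispatch(name, arguments):
    # Execute the function the LLM selected, guarding against unknown
    # function names and malformed JSON argument strings.
    fn = AVAILABLE_FUNCTIONS.get(name)
    if fn is None:
        return {"error": f"Unknown function: {name}"}
    try:
        return fn(**json.loads(arguments or "{}"))
    except (TypeError, json.JSONDecodeError) as exc:
        return {"error": str(exc)}
```

Centralizing dispatch like this also gives you a single place to add the error handling and input validation discussed in the next section.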

Further Enhancements and Considerations

This simplified example lays the foundation for a more complex AI assistant. Real-world improvements would include:

  • Robust Database Integration: Replace the simulated express_data with a connection to a real database (e.g., using SQLAlchemy or another ORM); a sketch follows this list.

  • Error Handling: Implement comprehensive error handling to manage potential issues like database connection failures or invalid user queries.

  • Input Validation: Validate user input to ensure data integrity and prevent errors.

  • Advanced Query Processing: Implement natural language understanding (NLU) techniques to better interpret user queries and extract relevant information.

  • Multi-modal Capabilities: Integrate other data types such as images or location data.
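As promised in the first bullet, here is a minimal sketch of the database-backed variant. It assumes a hypothetical express table with tracking_number, status, and location columns; the connection URL and schema are illustrative, not prescriptive:

```python
from sqlalchemy import create_engine, text

# Illustrative connection URL; swap in your real database.
engine = create_engine("sqlite:///express.db")

def get_transported_express():
    # Same contract as the in-memory version, but the filtering happens in SQL
    # via a parameterized query rather than in Python.
    query = text(
        "SELECT tracking_number, status, location FROM express WHERE status = :status"
    )
    with engine.connect() as conn:
        rows = conn.execute(query, {"status": "Transported"})
        return [dict(row._mapping) for row in rows]
```

Because the function's signature and return shape are unchanged, the tool schema exposed to the LLM and the dispatcher above need no modification.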

The Future of Programming and Large Language Models

Large language model application development represents a paradigm shift in software development. It's crucial for programmers to adapt and embrace this new technology. Mastering the skills necessary for developing LLM applications, including understanding techniques like function calls, RAG, MCP, prompt engineering, and working with vector databases and embedding models, will be critical for career advancement and staying competitive in the evolving tech landscape.

The continuous learning aspect is paramount. The field of AI evolves rapidly, demanding constant upskilling and adaptation. This inherent need for ongoing learning underscores the importance of embracing this technological revolution and leveraging it to build more intelligent and powerful applications. By participating in this AI revolution, developers can enhance their skills, contribute to cutting-edge advancements, and help shape the future of technology.

Resources and Further Learning

For those interested in diving deeper into the world of large language models and their applications, I highly recommend exploring these areas:

  • Spring AI: Frameworks and libraries designed specifically for integrating LLMs into Spring applications.

  • Multimodal AI: Techniques that combine different data modalities (text, images, audio) within LLM applications.

  • Vector Databases: Specialized databases optimized for storing and querying vector embeddings, crucial for semantic search and similarity analysis within LLM applications.

  • Embedding Models: Models that generate vector representations of text, enabling semantic comparison and analysis.

  • Prompt Engineering: The art and science of crafting effective prompts to elicit the desired responses from LLMs.

This constant exploration and learning is not merely an adaptation to new technology; it's a continuous process of self-improvement and professional growth, and a necessity for anyone who wishes to thrive in the ever-evolving landscape of technology. That commitment to lifelong learning translates directly into enhanced capabilities, increased market value, and greater success in the professional sphere. So let's collectively embrace this AI feast and shape the future of technology together.
