Build Your First RAG System
Learn Retrieval-Augmented Generation with Local Tools
Master the fundamentals of RAG systems using ChromaDB for vector storage and Ollama for local LLM hosting. Perfect for beginners who want to understand how RAG works before moving to production.
What You'll Build
A complete RAG system that runs entirely on your local machine:
- ChromaDB for storing and searching food data via vector embeddings
- The Llama3.2 model running locally via Ollama for generating responses
- mxbai-embed-large for converting text into numerical vectors for similarity search
- A rich JSON dataset of Indian foods, fruits, and nutritional information
Workshop Steps
Follow these detailed steps to build your RAG system from scratch
AI's Secret Language: Vector Embeddings
This video explains how AI systems convert text into numerical vectors that capture semantic meaning. Understanding embeddings is crucial for RAG systems, as they enable similarity search across documents.
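To make this concrete, here is a minimal sketch you can run once Ollama and the mxbai-embed-large model are installed (both are covered in the steps below). It embeds three phrases via Ollama's local API and compares them with cosine similarity; the similarity function is written out by hand for clarity:

import requests

def embed(text):
    # Ask the local Ollama server to turn text into a vector
    r = requests.post("http://localhost:11434/api/embeddings",
                      json={"model": "mxbai-embed-large", "prompt": text})
    return r.json()["embedding"]

def cosine(a, b):
    # Cosine similarity: close to 1 for similar meanings, lower for unrelated ones
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

mango, banana, cricket = embed("mango"), embed("banana"), embed("cricket bat")
print(cosine(mango, banana))   # two fruits: relatively high similarity
print(cosine(mango, cricket))  # unrelated concepts: noticeably lower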
Note: To access the chapters below, log in to the LMS via Google authentication; access is granted through the coupon provided during enrollment.
Chapter 19: Install LLM on your laptop
Learn how to install and set up Ollama for running large language models locally
Chapter 20: Using LLM locally using Python
Understand how to interact with local LLMs through Python code
Chapter 21: Using LLM with Python (Code samples)
Practical code examples for integrating LLMs into your applications
Ollama is a tool that allows you to run large language models locally on your computer. This means you don't need to send your data to external services, providing better privacy and control.
Installation Steps:
- Visit the Ollama website and download the installer for your operating system
- Run the installer and follow the setup wizard
- Open a terminal/command prompt to verify installation
- Test that Ollama is working by running the version command
ollama --version
Check if Ollama is installed correctly
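Under the hood, Ollama runs a local HTTP server on port 11434, and this is what our Python code will talk to later. As an optional sanity check (a sketch assuming the default port, and that the requests package from a later step is installed), you can list the models the server knows about:

import requests

# Ollama exposes a local REST API; /api/tags lists the installed models
resp = requests.get("http://localhost:11434/api/tags")
for model in resp.json().get("models", []):
    print(model["name"])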
We need two types of models: an embedding model (mxbai-embed-large) to convert text into vectors, and a language model (llama3.2) to generate responses.
mxbai-embed-large
Text embedding model: converts text into numerical vectors for similarity search
Size: ~670MB
ollama pull mxbai-embed-large
llama3.2
Language model: generates human-like text responses based on context
Size: ~2GB
ollama pull llama3.2
Test Your Installation
ollama run llama3.2
Test the language model by starting an interactive chat
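Both models can also be driven from Python through the same local API, which is exactly what the RAG application does later. Here is a minimal sketch; /api/embeddings and /api/generate are Ollama's standard REST endpoints, and "stream": False makes the server return the whole answer in one response:

import requests

# Embed a sentence with the embedding model
emb = requests.post("http://localhost:11434/api/embeddings",
                    json={"model": "mxbai-embed-large",
                          "prompt": "Tandoori chicken is a spiced grilled dish."}).json()
print(len(emb["embedding"]))  # mxbai-embed-large produces 1024-dimensional vectors

# Generate text with the language model
gen = requests.post("http://localhost:11434/api/generate",
                    json={"model": "llama3.2",
                          "prompt": "In one sentence, what is RAG?",
                          "stream": False}).json()
print(gen["response"])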
Python is the programming language we'll use to build our RAG application. If you already have Python installed, you can skip this step.
Installation Steps:
- Download Python from the official website
- Run the installer (make sure to check 'Add Python to PATH')
- Verify installation by checking the version
- Install pip (Python package manager) if not included
python --version
Check Python version (should be 3.8 or higher)
pip --version
Check pip (package manager) version
VS Code Insiders is the preview version of Visual Studio Code with the latest features, including enhanced AI capabilities and GitHub Copilot integration.
Installation Steps:
- Download VS Code Insiders from the official website
- Install the application following the setup wizard
- Launch VS Code Insiders
- Familiarize yourself with the interface
GitHub is a platform for code hosting and collaboration. GitHub Copilot is an AI coding assistant that helps you write code faster and more efficiently.
Install GitHub Copilot Extension
Add the Copilot extension to VS Code for AI-powered coding assistance
Tip: GitHub Copilot significantly speeds up coding by suggesting completions and entire functions based on context.
Git is a version control system that tracks changes in your code and allows you to collaborate with others. We'll use it to download the RAG project code.
Installation Steps:
- Download Git from the official website
- Run the installer with default settings
- Open a new terminal/command prompt
- Verify Git installation
git --version
Check if Git is installed correctly
We'll clone the RAG-Food repository, which contains a complete working example of a Retrieval-Augmented Generation system using local models.
RAG-Food
Simple Retrieval-Augmented Generation with ChromaDB + Ollama
Features:
- Local LLM via Ollama
- Local embeddings via mxbai-embed-large
- ChromaDB as the vector database
- Simple food dataset in JSON (Indian foods, fruits, etc.)
git clone https://github.com/gocallum/ragfood
Clone the RAG-Food repository to your local machine
cd ragfood
Navigate into the project directory
Python packages are pre-written code libraries that provide specific functionality. We need ChromaDB for vector storage and requests for HTTP communication.
chromadb
Vector database for storing and searching embeddings: handles the storage and retrieval of text embeddings
requests
HTTP library for making API calls: communicates with the Ollama API to get embeddings and responses
pip install chromadb requests
Install the required Python packages
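To see how the two packages fit together before running the full app, here is a minimal sketch; the collection name and sample documents are illustrative, and embed() calls the same Ollama endpoint used throughout this workshop:

import chromadb
import requests

def embed(text):
    # Convert text to a vector with the local embedding model
    r = requests.post("http://localhost:11434/api/embeddings",
                      json={"model": "mxbai-embed-large", "prompt": text})
    return r.json()["embedding"]

client = chromadb.Client()  # in-memory vector database
foods = client.get_or_create_collection(name="foods_demo")

# Store two documents together with their embeddings
docs = ["Mango is a sweet tropical fruit.", "Vindaloo is a very spicy curry."]
foods.add(ids=["1", "2"], documents=docs, embeddings=[embed(d) for d in docs])

# Query by meaning, not by keywords
hits = foods.query(query_embeddings=[embed("Which food is hot?")], n_results=1)
print(hits["documents"][0])  # expect the vindaloo document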
Now we'll run the RAG application and test it with questions about food. The system will search through the food database and provide relevant answers.
python rag_run.py
Start the RAG application
Try These Sample Questions:
"What is tandoori chicken?"
Tests the system's knowledge about specific Indian dishes
"Which foods are spicy and vegetarian?"
Tests the system's ability to filter and categorize foods
"Tell me about healthy fruits"
Tests retrieval of nutritional information
How It Works (see the code sketch after this list):
- Your question is converted into a vector embedding using mxbai-embed-large
- ChromaDB searches for similar food items in the vector database
- Relevant food information is retrieved and sent to llama3.2
- The language model generates a comprehensive answer based on the context
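Put together, those four steps fit in a short function. This is a conceptual sketch of the pipeline rather than the actual code in rag_run.py; the function name and prompt wording are illustrative:

import requests

OLLAMA = "http://localhost:11434"

def ask(question, collection):
    # 1. Embed the question with mxbai-embed-large
    q = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "mxbai-embed-large",
                            "prompt": question}).json()["embedding"]
    # 2. Retrieve the most similar food entries from ChromaDB
    hits = collection.query(query_embeddings=[q], n_results=3)
    context = "\n".join(hits["documents"][0])
    # 3. Hand the retrieved context plus the question to llama3.2
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    out = requests.post(f"{OLLAMA}/api/generate",
                        json={"model": "llama3.2", "prompt": prompt,
                              "stream": False}).json()
    # 4. The model's grounded answer
    return out["response"]

Here collection is a ChromaDB collection built as in the earlier package sketch.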
Congratulations! 🎉
You've successfully completed the basic RAG workshop! You now understand the fundamentals of Retrieval-Augmented Generation systems, including:
- Local vector database with ChromaDB
- Local LLM hosting with Ollama
- Understanding of embedding generation
- RAG query processing pipeline
- Interactive question-answering system