RAG Workshop

Build Your First RAG System

Learn Retrieval-Augmented Generation with Local Tools

Master the fundamentals of RAG systems using ChromaDB for vector storage and Ollama for local LLM hosting. Perfect for beginners who want to understand how RAG works before moving to production.

What You'll Build

A complete RAG system that runs entirely on your local machine

Vector Database

ChromaDB stores and searches through food data using vector embeddings

Local LLM

The Llama 3.2 model runs locally via Ollama to generate responses

Embeddings

mxbai-embed-large converts text into numerical vectors for similarity search

Food Dataset

Rich JSON dataset with Indian foods, fruits, and nutritional information

Workshop Steps

Follow these detailed steps to build your RAG system from scratch

Step 1: Understanding Vector Embeddings
Learn the fundamental concepts behind RAG and vector embeddings

This video explains how AI systems convert text into numerical vectors that capture semantic meaning. Understanding embeddings is crucial for RAG, because embeddings are what make similarity search across documents possible.

Video: AI's Secret Language: Vector Embeddings (watch on YouTube)
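
Before moving on, it helps to see the idea in code. The toy Python sketch below uses made-up 3-dimensional vectors purely for illustration (real embedding models such as mxbai-embed-large output vectors with roughly a thousand dimensions); it shows how cosine similarity scores how close two pieces of text are in meaning:

import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, for illustration only.
spicy_curry = [0.9, 0.1, 0.3]   # "spicy curry"
hot_masala  = [0.8, 0.2, 0.4]   # "hot masala dish"
laptop      = [0.1, 0.9, 0.7]   # "laptop computer"

print(cosine_similarity(spicy_curry, hot_masala))  # high score: similar meaning
print(cosine_similarity(spicy_curry, laptop))      # low score: unrelated meaning
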
Step 2: LMS Training Videos
Watch comprehensive tutorials on local LLM installation and usage

Note: You need to log in to the LMS via Google authentication and have access via the coupon provided during enrollment.

Chapter 19: Install LLM on your laptop

Learn how to install and set up Ollama for running large language models locally

Chapter 20: Using LLM locally using Python

Understand how to interact with local LLMs through Python code

Chapter 21: Using LLM with Python (Code samples)

Practical code examples for integrating LLMs into your applications

Step 3: Install Ollama
Set up Ollama to run large language models locally on your machine

Ollama is a tool that allows you to run large language models locally on your computer. This means you don't need to send your data to external services, providing better privacy and control.

Installation Steps:

  1. Visit the Ollama website and download the installer for your operating system
  2. Run the installer and follow the setup wizard
  3. Open a terminal/command prompt to verify installation
  4. Test that Ollama is working by running the version command
ollama --version

Check if Ollama is installed correctly

Step 4: Download AI Models
Install the required language and embedding models

We need two types of models: an embedding model (mxbai-embed-large) to convert text into vectors, and a language model (llama3.2) to generate responses.

mxbai-embed-large

Text Embedding Model

Converts text into numerical vectors for similarity search

Size: ~670MB

ollama pull mxbai-embed-large

llama3.2

Language Model

Generates human-like text responses based on context

Size: ~2GB

ollama pull llama3.2

Test Your Installation

ollama run llama3.2

Test the language model by starting an interactive chat (type /bye to exit)
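
Once Python is installed (Step 5) and the requests library is available (Step 10), you can also verify both models programmatically. Here is a minimal sketch, assuming Ollama's default local HTTP API on port 11434:

import requests

# Ask the embedding model for a vector.
emb = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "What is tandoori chicken?"},
).json()
print(len(emb["embedding"]))  # the embedding vector's dimensionality

# Ask the language model for a single, non-streamed completion.
gen = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Say hello in one sentence.",
          "stream": False},
).json()
print(gen["response"])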

Step 5: Install Python
Set up Python programming environment (if not already installed)

Python is the programming language we'll use to build our RAG application. If you already have Python installed, you can skip this step.

Installation Steps:

  1. Download Python from the official website
  2. Run the installer (make sure to check 'Add Python to PATH')
  3. Verify installation by checking the version
  4. Install pip (Python package manager) if not included
python --version

Check Python version (should be 3.8 or higher)

pip --version

Check pip (package manager) version

Step 6: Install VS Code Insiders
Set up the development environment with the latest VS Code features

VS Code Insiders is the preview version of Visual Studio Code with the latest features, including enhanced AI capabilities and GitHub Copilot integration.

Installation Steps:

  1. Download VS Code Insiders from the official website
  2. Install the application following the setup wizard
  3. Launch VS Code Insiders
  4. Familiarize yourself with the interface
Step 7: Set Up a GitHub Account
Create a GitHub account and set up GitHub Copilot for AI assistance

GitHub is a platform for code hosting and collaboration. GitHub Copilot is an AI coding assistant that helps you write code faster and more efficiently.

Create GitHub Account

Sign up for a free GitHub account if you don't have one

Install GitHub Copilot Extension

Add the Copilot extension to VS Code for AI-powered coding assistance

Start Free Trial

Get a 30-day free trial of GitHub Copilot ($10/month after the trial)

Tip: GitHub Copilot significantly speeds up coding by suggesting completions and entire functions based on context.

Step 8: Install Git
Set up Git version control system to clone and manage code repositories

Git is a version control system that tracks changes in your code and allows you to collaborate with others. We'll use it to download the RAG project code.

Installation Steps:

  1. Download Git from the official website
  2. Run the installer with default settings
  3. Open a new terminal/command prompt
  4. Verify Git installation
git --version

Check if Git is installed correctly

Step 9: Clone RAG Project
Download the RAG-Food project code from GitHub

We'll clone the RAG-Food repository, which contains a complete working example of a Retrieval-Augmented Generation system using local models.

RAG-Food

Simple Retrieval-Augmented Generation with ChromaDB + Ollama

Features:
  • Local LLM via Ollama
  • Local embeddings via mxbai-embed-large
  • ChromaDB as the vector database
  • Simple food dataset in JSON (Indian foods, fruits, etc.)
git clone https://github.com/gocallum/ragfood

Clone the RAG-Food repository to your local machine

cd ragfood

Navigate into the project directory

Step 10: Install Python Dependencies
Install the required Python packages for the RAG application

Python packages are pre-written code libraries that provide specific functionality. We need ChromaDB for vector storage and requests for HTTP communication.

chromadb

Vector database for storing and searching embeddings

Handles the storage and retrieval of text embeddings

requests

HTTP library for making API calls

Communicates with the Ollama API to get embeddings and responses

pip install chromadb requests

Install the required Python packages
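
To see what ChromaDB gives you, here is a minimal, self-contained sketch using a hypothetical two-item collection (the actual project loads its documents from the JSON food dataset instead):

import chromadb

client = chromadb.Client()  # in-memory client, fine for a quick experiment
collection = client.create_collection(name="foods_demo")

# Add two documents. With no explicit embeddings supplied, ChromaDB falls
# back to its built-in default embedding function to vectorize the text.
collection.add(
    ids=["1", "2"],
    documents=[
        "Tandoori chicken is a spicy roasted chicken dish from India.",
        "Mango is a sweet tropical fruit rich in vitamin C.",
    ],
)

# Query by meaning: the question is embedded and compared against the store.
results = collection.query(query_texts=["Which food is spicy?"], n_results=1)
print(results["documents"])  # the tandoori chicken entry should rank first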

Step 11: Run the RAG Application
Execute the RAG system and test it with sample questions

Now we'll run the RAG application and test it with questions about food. The system will search through the food database and provide relevant answers.

python rag_run.py

Start the RAG application

Try These Sample Questions:

"What is tandoori chicken?"

Tests the system's knowledge about specific Indian dishes

"Which foods are spicy and vegetarian?"

Tests the system's ability to filter and categorize foods

"Tell me about healthy fruits"

Tests retrieval of nutritional information

How It Works (sketched in code after this list):

  1. Your question is converted into a vector embedding using mxbai-embed-large
  2. ChromaDB searches for similar food items in the vector database
  3. Relevant food information is retrieved and sent to llama3.2
  4. The language model generates a comprehensive answer based on the context
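
Condensed into code, the loop looks roughly like the sketch below. This is an illustration of the pipeline, not the project's actual source; rag_run.py also handles loading the JSON dataset and indexing it into ChromaDB. The sketch assumes a ChromaDB collection already populated with the food data:

import requests

OLLAMA = "http://localhost:11434"

def embed(text):
    # Step 1: convert text into a vector with mxbai-embed-large.
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "mxbai-embed-large", "prompt": text})
    return r.json()["embedding"]

def answer(question, collection):
    # Step 2: retrieve the most similar food entries from ChromaDB.
    hits = collection.query(query_embeddings=[embed(question)], n_results=3)
    # Step 3: assemble the retrieved documents into a context block.
    context = "\n".join(hits["documents"][0])
    # Step 4: let llama3.2 answer using only the retrieved context.
    prompt = (f"Answer the question using only this context:\n{context}\n\n"
              f"Question: {question}")
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": "llama3.2", "prompt": prompt,
                            "stream": False})
    return r.json()["response"]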

Congratulations! 🎉

You've successfully completed the basic RAG workshop! You now understand the fundamentals of Retrieval-Augmented Generation systems, including:

  • Local vector database with ChromaDB
  • Local LLM hosting with Ollama
  • Understanding of embedding generation
  • RAG query processing pipeline
  • Interactive question-answering system