HPC Training Series - Course 15 "A Practical Guide to LLM Deployment and RAG Systems / Materials"

Description

This repository contains all the material presented during the training event.

You can find the material of Dr. Bakas from NCC Greece here on Google Colab.

Running these notebooks

To run these notebooks you need a device with a CUDA-capable GPU. You can use Google Colab, a local PC, or a hosting service you have access to.
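
A quick way to confirm that a machine exposes a CUDA GPU before opening the notebooks is to query the NVIDIA driver. The sketch below assumes the driver's nvidia-smi tool is on the PATH whenever a GPU is usable; check_gpu is a hypothetical helper name, not part of this repository.

```shell
#!/usr/bin/env bash
# Report whether an NVIDIA GPU is visible on this machine.
# Assumption: nvidia-smi is installed together with the NVIDIA driver.
check_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: no NVIDIA driver detected"
        return 1
    fi
    # Print the name and total memory of each visible GPU, one per line.
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}

check_gpu || echo "No CUDA GPU available; consider Google Colab or an HPC node."
```

If the function reports no GPU, the notebooks will most likely need to be run on Google Colab or another machine.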

Create the Virtual Environment using UV

To install the required libraries, simply go into the directory that contains the uv.lock and pyproject.toml files and type:

uv sync
 

If you do not have UV installed, follow these instructions.

UV creates a .venv folder inside that directory.

Once the virtual environment is ready, you can start running the Jupyter notebooks.
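
The setup steps above can be guarded by a small preflight check. This is a minimal sketch, not part of the repository: preflight is a hypothetical helper that only verifies that uv is on the PATH and that pyproject.toml and uv.lock exist in the target directory before uv sync is run.

```shell
#!/usr/bin/env bash
# Check the prerequisites of `uv sync` for a given project directory.
# Prints one message per problem and returns non-zero if anything is missing.
preflight() {
    local dir="${1:-.}"
    local ok=0
    local f
    command -v uv >/dev/null 2>&1 || { echo "uv is not installed"; ok=1; }
    for f in pyproject.toml uv.lock; do
        [ -f "$dir/$f" ] || { echo "missing $dir/$f"; ok=1; }
    done
    return "$ok"
}

# Only sync when the preflight passes; uv then creates .venv in this directory.
if preflight .; then
    uv sync
fi
```

Run it from the directory that holds the two files; the same check applies to the vllm directory.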

Running VLLM

Before running the launch scripts, you also need to create a virtual environment for them. The procedure is the same as the one for the notebooks:

  1. Install UV if not installed
  2. Run uv sync inside the vllm directory where the uv.lock and pyproject.toml files are located.

To run the VLLM scripts you need to modify them a bit.

Specifically:

  • If you are not running on an HPC facility, you need to remove the SBATCH part of the script (the lines starting with #SBATCH).

  • You need to change these directories:

    1. Source command: Point it to the directory of your .venv.
       source /nvme/h/${USER}/vllm_launcher/.venv/bin/activate

    2. OUTPUT_FILE directory:
       OUTPUT_FILE=/nvme/scratch/edu29/llm.env

    3. HF_HOME directory. This is the directory where the HF models will be installed.
       export HF_HOME=/nvme/scratch/p255/HF_cache #HF_HOME=/nvme/h/${USER}/scratch/HF_cache
       mkdir -p ${HF_HOME}

  • Feel free to change the vllm serve parameters as well. You can find the documentation here.
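
The edits listed above could be made less manual along the following lines. This is a sketch under assumptions, not the course's launch script: strip_sbatch is a hypothetical helper, VENV_DIR is a hypothetical variable name, and the environment-variable overrides are an added convenience. The default paths are the ones shown above and must be replaced with your own.

```shell
#!/usr/bin/env bash
# Site-specific paths, overridable from the environment.
# Defaults are the values from the course scripts; they will not exist on your machine.
VENV_DIR="${VENV_DIR:-/nvme/h/${USER}/vllm_launcher/.venv}"
OUTPUT_FILE="${OUTPUT_FILE:-/nvme/scratch/edu29/llm.env}"
export HF_HOME="${HF_HOME:-/nvme/scratch/p255/HF_cache}"

# Write a copy of a launch script with all #SBATCH directive lines removed,
# for running outside a Slurm-managed HPC facility.
strip_sbatch() {
    local src="$1" dst="$2"
    sed '/^#SBATCH/d' "$src" > "$dst"
}

# In the launch script itself, the three settings would then be used as:
#   source "${VENV_DIR}/bin/activate"
#   mkdir -p "${HF_HOME}"
```

For example, strip_sbatch launch.sh launch_local.sh (both file names are placeholders) writes a local copy without the scheduler directives and leaves the original untouched.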