Please use this session to ensure you can access the HPC system.
This presentation covers how to deploy large language models on local machines and high-performance computing (HPC) systems. It focuses on the tools and workflows needed to run models efficiently without relying on cloud infrastructure.
The talk will include practical tips for setting up environments, managing resources, and avoiding common issues during deployment. It will also...
In this hands-on session, participants will deploy large language models on Cyclone, the National High Performance Computing (HPC) infrastructure, using tools such as vLLM for efficient inference and Haystack for building retrieval-augmented generation (RAG) pipelines. The session will guide attendees through the end-to-end process of setting up model environments, running local inference, and...