Session: Training and Serving LLMs on Kubernetes: A Beginner’s Guide

Large Language Models (LLMs) are revolutionizing natural language processing, but their size and complexity can make them challenging to deploy and manage. In this talk, we’ll provide a beginner-friendly introduction to using Kubernetes for training and serving LLMs.

We’ll cover:

  • The Basics of Kubernetes: A quick overview of core Kubernetes concepts (pods, containers, deployments, services) essential for understanding LLM deployment.
  • LLMs and Resource Demands: A discussion of the unique computational resource requirements of LLMs and how Kubernetes helps manage them effectively.
  • Training LLMs on Kubernetes: Practical guidance on setting up training pipelines, addressing data distribution, and model optimization within a Kubernetes environment.
  • Serving LLMs for Inference: A walkthrough of strategies for deploying LLMs as services, load balancing, and scaling to handle real-world traffic.

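To give a flavor of the serving topic, a GPU-backed inference server on Kubernetes is typically described with a Deployment and a Service. The sketch below is a hypothetical minimal example: the image name, port, and resource figures are illustrative assumptions, not recommendations from the session.

```yaml
# Hypothetical example: a minimal Deployment and Service for an LLM
# inference server. Image name, port, and resource requests are
# illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-server
spec:
  replicas: 2                # run two replicas behind the Service
  selector:
    matchLabels:
      app: llm-server
  template:
    metadata:
      labels:
        app: llm-server
    spec:
      containers:
      - name: server
        image: example.com/llm-server:latest   # placeholder image
        ports:
        - containerPort: 8080
        resources:
          limits:
            nvidia.com/gpu: 1  # schedule one GPU per replica
---
apiVersion: v1
kind: Service
metadata:
  name: llm-server
spec:
  selector:
    app: llm-server          # routes traffic to the pods above
  ports:
  - port: 80
    targetPort: 8080
```

The Service load-balances requests across the replicas, and scaling to handle more traffic is then a matter of raising `replicas` (manually or via an autoscaler).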
If you’re interested in harnessing the power of LLMs for your own projects, this talk will provide a solid foundation for utilizing Kubernetes to streamline your workflow.

This session will be recorded.

Presenters: