Lectures
Lecture recordings are available on YouTube.
Hands-On Tutorials are available on GitHub.
Detailed instructions for different stages of course projects are available here.
Lecture 1: Machine Learning Systems: Course Overview
tl;dr: This lecture gives a brief overview of the course, its requirements, learning goals, policies, and expectations.
Lecture 2: Machine Learning Systems in Production
tl;dr: This lecture contrasts the challenges of building and deploying real-world ML systems in production with those in research. We also discuss the evolution of ML systems from deep learning (Software 2.0) to LLMs (Software 3.0).
Lecture 3: Designing Agentic AI Systems
tl;dr: In this lecture, we discuss design patterns for building Agentic AI systems using LLMs.
Lecture 4: Designing Agentic AI Systems (Case Study)
tl;dr: In this lecture, we discuss a case study of building an Agentic AI system using LLMs.
Lecture 5: LLM Inference (LLMOps)
tl;dr: In this lecture, we discuss the operational aspects of LLM inference, including optimization and deployment strategies.
Lecture 6: Monitoring, Debugging, and Optimizing the Performance of ML and LLM Pipelines
tl;dr: In this lecture, we discuss techniques for monitoring, debugging, and optimizing the performance of ML and LLM pipelines.
Lecture 7: Perceptrons and Logistic Regression
tl;dr: This lecture covers simple models and algorithms for supervised learning of binary and multi-class classifiers.
Lecture 8: Optimization and Neural Networks
tl;dr: This lecture covers optimization techniques for training models and shows how to build deep neural networks from multi-class logistic regression models.
Lecture 9: Backpropagation and Automatic Differentiation
tl;dr: This lecture reviews backpropagation and automatic differentiation.
Lecture 10: Machine Learning System Stack
tl;dr: This lecture reviews full-stack machine learning system development.
Lecture 11: Hardware for Machine Learning Systems
tl;dr: In this lecture, we look at the interface between software and hardware, and at opportunities for optimization at that interface using ideas such as co-design.
Lecture 12: Security of ML Systems (Adversarial ML)
tl;dr: This lecture discusses security challenges in ML systems, focusing on adversarial machine learning techniques and defenses.
Lecture 13: Scalable & Distributed Machine Learning
tl;dr: This lecture introduces variations of gradient descent and ideas for how to scale it up using parallel computing.
Lecture 14: Trustworthy AI and Fairness in ML Systems
tl;dr: This lecture discusses important aspects of AI trustworthiness, including fairness, transparency, accountability, and robustness.
Lecture 15: Building the Next Impactful ML System: Lessons, Strategies, and Inspirations from CSCE 585
tl;dr: This final lecture discusses how to build the next impactful ML systems, drawing inspiration from the course materials.
Bonus Lecture 1: How to Read an MLSys Paper?
tl;dr: In this lecture, we discuss a systematic approach for understanding both the high-level ideas and the technical details in MLSys papers.
Bonus Lecture 2: Designing and Motivating (ML) Systems Experiments
tl;dr: This lecture uses InferLine as a concrete example to offer students both theoretical understanding and practical guidance, and gives them a clear roadmap for motivating their own projects experimentally.
Bonus Lecture 3: Replicating Results in Machine Learning Systems Research
tl;dr: This lecture discusses the importance of replication in machine learning systems research and how you can integrate it into your projects. -
Bonus Lecture 4: Reconciling Accuracy, Cost, and Latency of Inference Serving Systems
tl;dr: This lecture reviews three related works from the AISys lab to set the context for the course and to serve as an example of MLSys research.