Numerical Optimization and Machine Learning

COURSE DESCRIPTION

This course introduces modern numerical techniques drawn from linear algebra, optimization, and machine learning, so that undergraduate students can approach data science and machine learning problems in an integrated manner.

Prerequisites

Common requirements for the semester and some knowledge of discrete mathematics.

COURSE GOALS

On completion of the course, students will:

understand and apply the most important concepts of numerical methods and numerical optimization, and appreciate how deeply machine learning is rooted in these areas;
be proficient with numerical tools for developing intuition in data science and statistics, in contrast to purely analytical methods;
understand the importance of implementing numerical solutions for data exploration;
be able to take a multidisciplinary approach to data science projects, involving methodologies from computational statistics and numerical analysis.

COURSE CONTENT

Numerical optimization and data science (7 weeks)
1. Linear systems: Iterative methods (e.g., Gauss-Seidel), matrix spectrum (eigenvalues and eigenvectors), matrix factorization (e.g., SVD), overdetermined systems (least squares). Interpolation and curve fitting (splines, optional)
2. General concepts in numerical optimization and machine learning. Gradient descent. Stochastic gradient descent. Conjugate gradient. Metaheuristics: search spaces, neighborhoods, sampling of the search space (exploration and exploitation), meta-modeling
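
As an illustration of how the two topics in this block connect, the sketch below solves one small linear system in two ways: with Gauss-Seidel sweeps, and with gradient descent on the quadratic f(x) = 0.5 x^T A x - b^T x, whose minimizer satisfies A x = b when A is symmetric positive definite. It is a minimal sketch and not course material: the matrix, right-hand side, step size, and iteration counts are assumptions chosen only for this example.

```python
def gauss_seidel(A, b, sweeps=100):
    """Approximate the solution of A x = b with Gauss-Seidel sweeps.

    Assumes A is strictly diagonally dominant (or symmetric positive
    definite), which guarantees convergence.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Updated entries of x are reused immediately within the sweep.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diagonal) / A[i][i]
    return x


def gradient_descent(A, b, step=0.1, iterations=2000):
    """Minimize f(x) = 0.5 x^T A x - b^T x with a fixed step size.

    The gradient of f is A x - b, so the minimizer solves A x = b.
    The step must be below 2 / (largest eigenvalue of A) to converge.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        gradient = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        x = [x[i] - step * gradient[i] for i in range(n)]
    return x


# Illustrative symmetric, strictly diagonally dominant system (assumed data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

x_gs = gauss_seidel(A, b)
x_gd = gradient_descent(A, b)
print("Gauss-Seidel:    ", x_gs)
print("Gradient descent:", x_gd)
```

Both routines should agree to within rounding error on this system. The fixed step of 0.1 is safe here because the eigenvalues of this particular A lie between 1 and 5; a different matrix would need a different step.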

Introduction to machine learning (7 weeks)
1. Overview of models and challenges in machine learning and data science: geometric vs. probabilistic approaches; predictive models vs. inference models
2. Probabilistic methods; Bayesian classification
3. Geometrical methods: K-NN, decision trees
4. Margin-based methods, kernel methods
5. Introduction to Neural networks
6. Introduction to ensemble methods
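
To give a flavor of the geometrical methods listed in this block, the sketch below implements a k-nearest-neighbors (K-NN) classifier with Euclidean distance and majority voting. It is a minimal sketch under stated assumptions: the function name `knn_predict`, the toy points, and the labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumed): two well-separated clusters in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

An odd k avoids tied votes in the two-class case; with more classes, or an even k, a tie-breaking rule would be needed.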

Bibliography

Eldén, Lars (2007). Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms), SIAM.
Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization, 2nd Edition, Springer.
Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning, Springer.
Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016). Deep Learning, MIT Press.
James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert (2023). An Introduction to Statistical Learning, Springer.

Support Sessions

2 hours a week with a teaching assistant.

Grading

Partial exams in each block (25%), projects in each block (25%), homework assignments (50%).