Numerical Optimization and Machine Learning


This course provides undergraduate students with modern numerical techniques from linear algebra, optimization and machine learning, so that they can approach data science and machine learning problems in an integrated manner.


Basic knowledge of discrete mathematics, calculus and linear algebra. At least one introductory programming course.


On completion of the course, students will:

  • understand and use the most important concepts from numerical methods and numerical optimization, and appreciate the deep roots of machine learning in these areas;
  • be proficient with numerical tools for developing intuition in data science and statistics, in contrast to purely analytical methods;
  • understand the importance of implementing numerical solutions for data exploration;
  • be able to take a multidisciplinary approach to data science projects, involving methodologies from computational statistics and numerical analysis.
  1. Numerical optimization and data science (7 weeks)
    1. Linear systems: Iterative methods (e.g., Gauss-Seidel), matrix spectrum (eigenvalues and eigenvectors), matrix factorization (e.g., SVD), overdetermined systems (least squares). Interpolation and curve fitting (splines, optional)
    2. Generalities about numerical optimization and machine learning. Gradient descent. Stochastic gradient descent. Conjugate gradient. Metaheuristics: search spaces, neighborhoods, sampling of the search space (exploration and exploitation), meta-modeling
  2. Introduction to machine learning (7 weeks)
    1. Overview of models and challenges in machine learning and data science: geometric vs. probabilistic approaches; predictive models vs. inference models
    2. Probabilistic methods: Bayesian classification (naive/optimal)
    3. Geometrical methods: k-NN (k-nearest neighbors), decision trees
    4. Margin-based methods: SVMs, kernel methods
    5. Neural networks
    6. Introduction to ensemble methods


  1. Eldén, Lars (2007). Matrix Methods in Data Mining and Pattern Recognition (Fundamentals of Algorithms), SIAM.
  2. Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization, 2nd Edition, Springer.
  3. Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning, Springer.
  4. Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016). Deep Learning, MIT Press.
  5. James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert (2013). An Introduction to Statistical Learning, Springer.

Support Sessions

2 hours a week with a teaching assistant.


Partial exams in each block (25%), projects in each block (25%), homework assignments (50%).