I am an assistant professor in the Department of Mathematics and Statistics at McGill University. I am a CIFAR AI Chair and an active member of the Montreal Machine Learning Optimization Group (MTL MLOpt) at Mila. Moreover, I am the lead organizer of the OPT-ML Workshop for NeurIPS 2020. Previously, I was a research scientist at Google Brain, Montreal. You can view my CV here if you are interested in more details.

I received my Ph.D. from the Mathematics Department at the University of Washington (2017) under Prof. Dmitriy Drusvyatskiy. I then held a postdoctoral position in the Department of Industrial and Systems Engineering at Lehigh University, where I worked with Prof. Katya Scheinberg, followed by an NSF postdoctoral fellowship (2018-2019) under Prof. Stephen Vavasis in the Department of Combinatorics and Optimization at the University of Waterloo.

My research broadly focuses on designing and analyzing algorithms for large-scale optimization problems, motivated by applications in data science. The techniques I use draw from a variety of fields including probability, complexity theory, and convex and nonsmooth analysis.

The University of Washington, Lehigh University, the University of Waterloo, McGill University, and Mila have strong optimization groups spanning many departments: Math, Stats, CSE, EE, and ISE. If you are interested in optimization talks at these institutions, check out the following seminars:

EMAIL: yumiko88(at)uw(dot)edu or yumiko88(at)u(dot)washington(dot)edu or courtney(dot)paquette(at)mcgill(dot)ca



My research interests lie at the frontier of large-scale continuous optimization. Nonconvexity, nonsmooth analysis, complexity bounds, and interactions with random matrix theory and high-dimensional statistics appear throughout my work. Modern applications of machine learning demand these advanced tools and motivate me to develop theoretical guarantees with an eye toward immediate practical value. My current research program develops a coherent mathematical framework for analyzing the average-case (typical) complexity and exact dynamics of learning algorithms in the high-dimensional setting.
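To illustrate the average-case viewpoint, here is a small numerical sketch (my own toy demo, not code from any of the papers below): for SGD on a random least-squares problem with Gaussian data and a planted signal, the loss trajectory concentrates around a single deterministic curve as the dimension grows, so independent problem instances produce nearly identical learning curves. All problem parameters (the ratio n/d, step size, planted model) are illustrative assumptions.

```python
import numpy as np

def sgd_least_squares(d, steps, stepsize, seed):
    """Run SGD on a random least-squares problem f(x) = ||Ax - b||^2 / (2n)."""
    rng = np.random.default_rng(seed)
    n = 2 * d                            # keep the aspect ratio n/d fixed as d grows
    A = rng.standard_normal((n, d)) / np.sqrt(d)   # rows have norm roughly 1
    b = A @ rng.standard_normal(d)       # planted signal, so the problem is realizable
    x = np.zeros(d)
    losses = []
    for _ in range(steps):
        i = rng.integers(n)              # sample one data point per iteration
        x -= stepsize * (A[i] @ x - b[i]) * A[i]
        losses.append(np.mean((A @ x - b) ** 2) / 2)
    return np.array(losses)

# Two independent problem instances: in high dimensions the two loss curves
# nearly coincide, illustrating concentration around a deterministic trajectory.
c1 = sgd_least_squares(d=400, steps=2000, stepsize=0.5, seed=0)
c2 = sgd_least_squares(d=400, steps=2000, stepsize=0.5, seed=1)
print(np.max(np.abs(c1 - c2)))           # small deviation between the two runs
```

Rerunning with a larger d shrinks the gap between the two curves further, which is the qualitative content of the average-case analyses: halting times and loss dynamics become predictable, not instance-dependent.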


You can view my thesis, titled Structure and complexity in non-convex and nonsmooth optimization.


  • C. Paquette and E. Paquette. Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models. (2021) (to appear at NeurIPS 2021), arXiv pdf
  • C. Paquette, K. Lee, F. Pedregosa, and E. Paquette. SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality. Proceedings of the Thirty-Fourth Conference on Learning Theory (COLT) (2021) no. 134, 3548--3626, pdf
  • C. Paquette, B. van Merrienboer, F. Pedregosa, and E. Paquette. Halting time is predictable for large models: A Universality Property and Average-case Analysis. (2020) (to appear in Found. Comput. Math.), arXiv pdf
  • S. Baghal, C. Paquette, and S. A. Vavasis. A termination criterion for stochastic gradient for binary classification. (2020) (submitted), arXiv pdf
  • C. Paquette and S. Vavasis. Potential-based analyses of first-order methods for constrained and composite optimization. (2019) (submitted), arXiv pdf
  • C. Paquette and K. Scheinberg. A stochastic line-search method with convergence rate. SIAM J. Optim. (30) (2020) no. 1, 349-376, doi:10.1137/18M1216250, arXiv pdf
  • D. Davis, D. Drusvyatskiy, K. MacPhee, and C. Paquette. Subgradient methods for sharp weakly convex functions. J. Optim. Theory Appl. (179) (2018) no. 3, 962-982, doi:10.1007/s10957-018-1372-8, arXiv pdf
  • D. Davis, D. Drusvyatskiy, and C. Paquette. The nonsmooth landscape of phase retrieval. IMA J. Numer. Anal. (40) (2020) no.4, 2652-2695, doi:10.1093/imanum/drz031, arXiv pdf
  • C. Paquette, H. Lin, D. Drusvyatskiy, J. Mairal, and Z. Harchaoui. Acceleration for Gradient-Based Non-Convex Optimization. 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), arXiv pdf
  • D. Drusvyatskiy and C. Paquette. Efficiency of minimizing compositions of convex functions and smooth maps. Math. Program. 178 (2019), no. 1-2, Ser. A, 503-558, doi:10.1007/s10107-018-1311-3, arXiv pdf
  • D. Drusvyatskiy and C. Paquette. Variational analysis of spectral functions simplified. J. Convex Anal. (25) (2018) no. 1, arXiv pdf


I have given talks on the research above at the following conferences:


Current Courses

  • Math 315 (Ordinary Differential Equations) Website
  • Math 597 (Topics course: Convex Analysis and Optimization) Website

Past Courses

I have taught the following courses:

    McGill University, Mathematics and Statistics Department
  • Math 560 (graduate, instructor): Numerical Optimization, Spring 2021
  • Math 315 (undergraduate, instructor): Ordinary Differential Equations, Fall 2020
    Lehigh University, Industrial and Systems Engineering
  • ISE 417 (graduate, instructor): Nonlinear Optimization, Spring 2018
    University of Washington, Mathematics Department
  • Math 125 BC/BD (undergraduate, TA): Calculus II Quiz Section, Winter 2017
  • Math 307 E (undergraduate, instructor): Intro to Differential Equations, Winter 2016
  • Math 124 CC (undergraduate, TA): Calculus I, Autumn 2015
  • Math 307 I (undergraduate, instructor): Intro to Differential Equations, Spring 2015
  • Math 125 BA/BC (undergraduate, TA): Calculus II, Winter 2015
  • Math 307 K (undergraduate, instructor): Intro to Differential Equations, Autumn 2014
  • Math 307 L (undergraduate, instructor): Intro to Differential Equations, Spring 2014