AI4OPT Seminar Series

Date: Thursday, January 19, 2023

Time: Noon – 1:00 pm

Location: Instructional Center 115 (Scale Up Room) - (759 Ferst Dr, Atlanta, GA 30318)

Join Virtually: https://gatech.zoom.us/j/99381428980

Speaker: Anirbit Mukherjee


Provable Training of Neural Nets With One Layer of Activation

Abstract: Provable neural training is a fundamental challenge in deep-learning theory, and it remains largely an open question for almost any neural net of practical relevance. The quest for provable convergence of neural training algorithms almost always leads to exciting new questions in mathematics. In this talk I shall give an overview of three of our convergence proofs in this territory: (1) In 2016 we showed the first deterministic algorithm that converges to the exact global minima of any convex loss function for any depth-2 ReLU neural net, for any training data, in time that is only polynomial in the training data size. (2) In 2020 we showed the first stochastic algorithm that converges to the global minima of a single ReLU gate in linear time (exponentially fast convergence) for realizable data, without assuming any specific distribution for the inputs. (3) In 2022, in a first-of-its-kind result, we leveraged the theory of SDEs and Villani functions to show that SGD converges to the global minima of an appropriately Frobenius-norm-regularized squared loss on any depth-2 neural net with tanh or sigmoid activations, for arbitrary width and data. We shall end the talk by delineating various open questions in this direction that could be tackled in the near future.

Bio: Anirbit Mukherjee obtained his undergraduate and master’s degrees in physics at the Chennai Mathematical Institute (CMI) and the Tata Institute of Fundamental Research (TIFR), respectively. In 2020, he completed his Ph.D. in the Department of Applied Mathematics and Statistics at Johns Hopkins University (JHU). His doctoral research was recognized by two fellowships from JHU: the “MINDS Data Science Fellowship” and the “Walter L. Robb Fellowship.” Mukherjee is currently an assistant professor at The University of Manchester. Prior to that, he was a postdoc in the Department of Statistics at the Wharton School of the University of Pennsylvania. Mukherjee specializes in provable training algorithms and generalization guarantees for neural networks. Most recently, he was selected as a member of the European Laboratory for Learning and Intelligent Systems (ELLIS) Society.

Lunch will be served at the seminar. Please arrive 15 minutes early to pick it up.

To receive AI4OPT seminar announcements, please sign up for our mailing list at https://lists.isye.gatech.edu/mailman/listinfo/ai4opt-seminars.
Past seminars can be found at https://www.ai4opt.org/seminars/past-seminars.