{"citation":{"short":"S. Bombari, M.H. Amani, M. Mondelli, in: 36th Conference on Neural Information Processing Systems, Curran Associates, 2022, pp. 7628–7640.","ista":"Bombari S, Amani MH, Mondelli M. 2022. Memorization and optimization in deep neural networks with minimum over-parameterization. 36th Conference on Neural Information Processing Systems. vol. 35, 7628–7640.","apa":"Bombari, S., Amani, M. H., & Mondelli, M. (2022). Memorization and optimization in deep neural networks with minimum over-parameterization. In 36th Conference on Neural Information Processing Systems (Vol. 35, pp. 7628–7640). Curran Associates.","chicago":"Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” In 36th Conference on Neural Information Processing Systems, 35:7628–40. Curran Associates, 2022.","ama":"Bombari S, Amani MH, Mondelli M. Memorization and optimization in deep neural networks with minimum over-parameterization. In: 36th Conference on Neural Information Processing Systems. Vol 35. Curran Associates; 2022:7628-7640.","ieee":"S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in deep neural networks with minimum over-parameterization,” in 36th Conference on Neural Information Processing Systems, 2022, vol. 35, pp. 7628–7640.","mla":"Bombari, Simone, et al. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” 36th Conference on Neural Information Processing Systems, vol. 35, Curran Associates, 2022, pp. 7628–40."},"_id":"12537","language":[{"iso":"eng"}],"publication":"36th Conference on Neural Information Processing Systems","author":[{"id":"ca726dda-de17-11ea-bc14-f9da834f63aa","full_name":"Bombari, Simone","first_name":"Simone","last_name":"Bombari"},{"full_name":"Amani, Mohammad Hossein","last_name":"Amani","first_name":"Mohammad Hossein"},{"first_name":"Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"}],"page":"7628-7640","external_id":{"arxiv":["2205.10217"]},"publication_identifier":{"isbn":["9781713871088"]},"acknowledgement":"The authors were partially supported by the 2019 Lopez-Loreta prize, and they would like to thank Quynh Nguyen, Mahdi Soltanolkotabi and Adel Javanmard for helpful discussions.","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"intvolume":"35","volume":35,"year":"2022","publication_status":"published","date_published":"2022-07-24T00:00:00Z","department":[{"_id":"MaMo"}],"abstract":[{"lang":"eng","text":"The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks. A line of work has studied the NTK spectrum for two-layer and deep networks with at least a layer with Ω(N) neurons, N being the number of training samples. Furthermore, there is increasing evidence suggesting that deep networks with sub-linear layer widths are powerful memorizers and optimizers, as long as the number of parameters exceeds the number of samples. Thus, a natural open question is whether the NTK is well conditioned in such a challenging sub-linear setup. In this paper, we answer this question in the affirmative. Our key technical contribution is a lower bound on the smallest NTK eigenvalue for deep networks with the minimum possible over-parameterization: the number of parameters is roughly Ω(N) and, hence, the number of neurons is as little as Ω(√N). To showcase the applicability of our NTK bounds, we provide two results concerning memorization capacity and optimization guarantees for gradient descent training."}],"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2205.10217"}],"type":"conference","day":"24","article_processing_charge":"No","month":"07","date_updated":"2024-09-10T13:03:19Z","title":"Memorization and optimization in deep neural networks with minimum over-parameterization","quality_controlled":"1","status":"public","oa_version":"Preprint","oa":1,"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","date_created":"2023-02-10T13:46:37Z","publisher":"Curran Associates"}