---
_id: '14924'
abstract:
- lang: eng
  text: "The stochastic heavy ball method (SHB), also known as stochastic gradient
    descent (SGD) with Polyak's momentum, is widely used in training neural networks.
    However, despite the remarkable success of this algorithm in practice, its theoretical
    characterization remains limited. In this paper, we focus on neural networks with
    two and three layers and provide a rigorous understanding of the properties of
    the solutions found by SHB: (i) stability after dropping out part of the
    neurons, (ii) connectivity along a low-loss path, and (iii) convergence
    to the global optimum.\nTo achieve this goal, we take a mean-field view and
    relate the SHB dynamics to a certain partial differential equation in the limit
    of large network widths. This mean-field perspective has inspired a recent line
    of work focusing on SGD while, in contrast, our paper considers an algorithm with
    momentum. More specifically, after proving existence and uniqueness of the limit
    differential equations, we show convergence to the global optimum and give a quantitative
    bound between the mean-field limit and the SHB dynamics of a finite-width network.
    Armed with this last bound, we are able to establish the dropout-stability and
    connectivity of SHB solutions."
acknowledgement: D. Wu and M. Mondelli are partially supported by the 2019 Lopez-Loreta
  Prize. V. Kungurtsev was supported by the OP VVV project CZ.02.1.01/0.0/0.0/16_019/0000765
  "Research Center for Informatics".
alternative_title:
- TMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Diyuan
  full_name: Wu, Diyuan
  id: 1a5914c2-896a-11ed-bdf8-fb80621a0635
  last_name: Wu
- first_name: Vyacheslav
  full_name: Kungurtsev, Vyacheslav
  last_name: Kungurtsev
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Wu D, Kungurtsev V, Mondelli M. Mean-field analysis for heavy ball methods:
    Dropout-stability, connectivity, and global convergence. In: <i>Transactions on
    Machine Learning Research</i>. ML Research Press; 2023.'
  apa: 'Wu, D., Kungurtsev, V., &#38; Mondelli, M. (2023). Mean-field analysis for
    heavy ball methods: Dropout-stability, connectivity, and global convergence. In
    <i>Transactions on Machine Learning Research</i>. ML Research Press.'
  chicago: 'Wu, Diyuan, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis
    for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.”
    In <i>Transactions on Machine Learning Research</i>. ML Research Press, 2023.'
  ieee: 'D. Wu, V. Kungurtsev, and M. Mondelli, “Mean-field analysis for heavy ball
    methods: Dropout-stability, connectivity, and global convergence,” in <i>Transactions
    on Machine Learning Research</i>, 2023.'
  ista: 'Wu D, Kungurtsev V, Mondelli M. 2023. Mean-field analysis for heavy ball
    methods: Dropout-stability, connectivity, and global convergence. Transactions
    on Machine Learning Research, TMLR.'
  mla: 'Wu, Diyuan, et al. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability,
    Connectivity, and Global Convergence.” <i>Transactions on Machine Learning Research</i>,
    ML Research Press, 2023.'
  short: 'D. Wu, V. Kungurtsev, M. Mondelli, in: Transactions on Machine Learning
    Research, ML Research Press, 2023.'
date_created: 2024-02-02T11:21:56Z
date_published: 2023-02-28T00:00:00Z
date_updated: 2024-09-10T13:03:20Z
day: '28'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2210.06819'
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2210.06819
month: '02'
oa: 1
oa_version: Published Version
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: Transactions on Machine Learning Research
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: 'Mean-field analysis for heavy ball methods: Dropout-stability, connectivity,
  and global convergence'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
