---
_id: '14459'
abstract:
- lang: eng
  text: Autoencoders are a popular model in many branches of machine learning and
    lossy data compression. However, their fundamental limits, the performance of
    gradient methods and the features learnt during optimization remain poorly understood,
    even in the two-layer setting. In fact, earlier work has considered either linear
    autoencoders or specific training regimes (leading to vanishing or diverging compression
    rates). Our paper addresses this gap by focusing on non-linear two-layer autoencoders
    trained in the challenging proportional regime in which the input dimension scales
    linearly with the size of the representation. Our results characterize the minimizers
    of the population risk, and show that such minimizers are achieved by gradient
    methods; their structure is also unveiled, thus leading to a concise description
    of the features obtained via training. For the special case of a sign activation
    function, our analysis establishes the fundamental limits for the lossy compression
    of Gaussian sources via (shallow) autoencoders. Finally, while the results are
    proved for Gaussian data, numerical simulations on standard datasets display the
    universality of the theoretical predictions.
acknowledgement: Aleksandr Shevchenko, Kevin Kögler and Marco Mondelli are supported
  by the 2019 Lopez-Loreta Prize. Hamed Hassani acknowledges the support by the NSF
  CIF award (1910056) and the NSF Institute for CORE Emerging Methods in Data Science
  (EnCORE).
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Aleksandr
  full_name: Shevchenko, Aleksandr
  id: F2B06EC2-C99E-11E9-89F0-752EE6697425
  last_name: Shevchenko
- first_name: Kevin
  full_name: Kögler, Kevin
  id: 94ec913c-dc85-11ea-9058-e5051ab2428b
  last_name: Kögler
- first_name: Hamed
  full_name: Hassani, Hamed
  last_name: Hassani
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Shevchenko A, Kögler K, Hassani H, Mondelli M. Fundamental limits of two-layer
    autoencoders, and achieving them with gradient methods. In: <i>Proceedings of
    the 40th International Conference on Machine Learning</i>. Vol 202. ML Research
    Press; 2023:31151-31209.'
  apa: 'Shevchenko, A., Kögler, K., Hassani, H., &#38; Mondelli, M. (2023). Fundamental
    limits of two-layer autoencoders, and achieving them with gradient methods. In
    <i>Proceedings of the 40th International Conference on Machine Learning</i> (Vol.
    202, pp. 31151–31209). Honolulu, HI, United States: ML Research Press.'
  chicago: Shevchenko, Aleksandr, Kevin Kögler, Hamed Hassani, and Marco Mondelli.
    “Fundamental Limits of Two-Layer Autoencoders, and Achieving Them with Gradient
    Methods.” In <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    202:31151–209. ML Research Press, 2023.
  ieee: A. Shevchenko, K. Kögler, H. Hassani, and M. Mondelli, “Fundamental limits
    of two-layer autoencoders, and achieving them with gradient methods,” in <i>Proceedings
    of the 40th International Conference on Machine Learning</i>, Honolulu, HI,
    United States, 2023, vol. 202, pp. 31151–31209.
  ista: 'Shevchenko A, Kögler K, Hassani H, Mondelli M. 2023. Fundamental limits of
    two-layer autoencoders, and achieving them with gradient methods. Proceedings
    of the 40th International Conference on Machine Learning. ICML: International
    Conference on Machine Learning, PMLR, vol. 202, 31151–31209.'
  mla: Shevchenko, Aleksandr, et al. “Fundamental Limits of Two-Layer Autoencoders,
    and Achieving Them with Gradient Methods.” <i>Proceedings of the 40th International
    Conference on Machine Learning</i>, vol. 202, ML Research Press, 2023, pp. 31151–209.
  short: A. Shevchenko, K. Kögler, H. Hassani, M. Mondelli, in:, Proceedings of the
    40th International Conference on Machine Learning, ML Research Press, 2023, pp.
    31151–31209.
conference:
  end_date: 2023-07-29
  location: Honolulu, HI, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2023-07-23
date_created: 2023-10-29T23:01:17Z
date_published: 2023-07-30T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '30'
department:
- _id: MaMo
- _id: DaAl
external_id:
  arxiv:
  - '2212.13468'
intvolume: '202'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2212.13468
month: '07'
oa: 1
oa_version: Preprint
page: 31151-31209
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the 40th International Conference on Machine Learning
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Fundamental limits of two-layer autoencoders, and achieving them with gradient
  methods
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '14921'
abstract:
- lang: eng
  text: Neural collapse (NC) refers to the surprising structure of the last layer
    of deep neural networks in the terminal phase of gradient descent training. Recently,
    an increasing amount of experimental evidence has pointed to the propagation of
    NC to earlier layers of neural networks. However, while the NC in the last layer
    is well studied theoretically, much less is known about its multi-layered counterpart
    - deep neural collapse (DNC). In particular, existing work focuses either on linear
    layers or only on the last two layers at the price of an extra assumption. Our
    paper fills this gap by generalizing the established analytical framework for
    NC - the unconstrained features model - to multiple non-linear layers. Our key
    technical contribution is to show that, in a deep unconstrained features model,
    the unique global optimum for binary classification exhibits all the properties
    typical of DNC. This explains the existing experimental evidence of DNC. We also
    empirically show that (i) by optimizing deep unconstrained features models via
    gradient descent, the resulting solution agrees well with our theory, and (ii)
    trained networks recover the unconstrained features suitable for the occurrence
    of DNC, thus supporting the validity of this modeling principle.
acknowledgement: M. M. is partially supported by the 2019 Lopez-Loreta Prize. The
  authors would like to thank Eugenia Iofinova, Bernd Prach and Simone Bombari for
  valuable feedback on the manuscript.
alternative_title:
- NeurIPS
article_processing_charge: No
arxiv: 1
author:
- first_name: Peter
  full_name: Súkeník, Peter
  id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c
  last_name: Súkeník
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal
    for the deep unconstrained features model. In: <i>37th Annual Conference on Neural
    Information Processing Systems</i>.'
  apa: Súkeník, P., Mondelli, M., &#38; Lampert, C. (n.d.). Deep neural collapse is
    provably optimal for the deep unconstrained features model. In <i>37th Annual
    Conference on Neural Information Processing Systems</i>. New Orleans, LA, United
    States.
  chicago: Súkeník, Peter, Marco Mondelli, and Christoph Lampert. “Deep Neural Collapse
    Is Provably Optimal for the Deep Unconstrained Features Model.” In <i>37th Annual
    Conference on Neural Information Processing Systems</i>, n.d.
  ieee: P. Súkeník, M. Mondelli, and C. Lampert, “Deep neural collapse is provably
    optimal for the deep unconstrained features model,” in <i>37th Annual Conference
    on Neural Information Processing Systems</i>, New Orleans, LA, United States.
  ista: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal
    for the deep unconstrained features model. 37th Annual Conference on Neural Information
    Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS.'
  mla: Súkeník, Peter, et al. “Deep Neural Collapse Is Provably Optimal for the Deep
    Unconstrained Features Model.” <i>37th Annual Conference on Neural Information
    Processing Systems</i>.
  short: P. Súkeník, M. Mondelli, C. Lampert, in:, 37th Annual Conference on Neural
    Information Processing Systems, n.d.
conference:
  end_date: 2023-12-16
  location: New Orleans, LA, United States
  name: 'NeurIPS: Neural Information Processing Systems'
  start_date: 2023-12-10
date_created: 2024-02-02T11:17:41Z
date_published: 2023-12-15T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '15'
department:
- _id: MaMo
- _id: ChLa
external_id:
  arxiv:
  - '2305.13165'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2305.13165
month: '12'
oa: 1
oa_version: Preprint
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 37th Annual Conference on Neural Information Processing Systems
publication_status: inpress
quality_controlled: '1'
status: public
title: Deep neural collapse is provably optimal for the deep unconstrained features
  model
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14922'
abstract:
- lang: eng
  text: 'We propose a novel approach to concentration for non-independent random variables.
    The main idea is to "pretend" that the random variables are independent and
    pay a multiplicative price measuring how far they are from actually being independent.
    This price is encapsulated in the Hellinger integral between the joint and the
    product of the marginals, which is then upper bounded leveraging tensorisation
    properties. Our bounds represent a natural generalisation of concentration inequalities
    in the presence of dependence: we recover exactly the classical bounds (McDiarmid''s
    inequality) when the random variables are independent. Furthermore, in a "large
    deviations" regime, we obtain the same decay in the probability as for the
    independent case, even when the random variables display non-trivial dependencies.
    To show this, we consider a number of applications of interest. First, we provide
    a bound for Markov chains with finite state space. Then, we consider the Simple
    Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian
    setting in which the stochastic process depends on its entire past. To conclude,
    we propose an application to Markov Chain Monte Carlo methods, where our approach
    leads to an improved lower bound on the minimum burn-in period required to reach
    a certain accuracy. In all of these settings, we provide a regime of parameters
    in which our bound fares better than what the state of the art can provide.'
acknowledgement: The authors are partially supported by the 2019 Lopez-Loreta Prize.
  They would also like to thank Professor Jan Maas for providing valuable suggestions
  and comments on an early version of the work.
article_processing_charge: No
arxiv: 1
author:
- first_name: Amedeo Roberto
  full_name: Esposito, Amedeo Roberto
  id: 9583e921-e1ad-11ec-9862-cef099626dc9
  last_name: Esposito
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Esposito AR, Mondelli M. Concentration without independence via information
    measures. In: <i>Proceedings of 2023 IEEE International Symposium on Information
    Theory</i>. IEEE. doi:<a href="https://doi.org/10.1109/isit54713.2023.10206899">10.1109/isit54713.2023.10206899</a>'
  apa: 'Esposito, A. R., &#38; Mondelli, M. (n.d.). Concentration without independence
    via information measures. In <i>Proceedings of 2023 IEEE International Symposium
    on Information Theory</i>. Taipei, Taiwan: IEEE. <a href="https://doi.org/10.1109/isit54713.2023.10206899">https://doi.org/10.1109/isit54713.2023.10206899</a>'
  chicago: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence
    via Information Measures.” In <i>Proceedings of 2023 IEEE International Symposium
    on Information Theory</i>. IEEE, n.d. <a href="https://doi.org/10.1109/isit54713.2023.10206899">https://doi.org/10.1109/isit54713.2023.10206899</a>.
  ieee: A. R. Esposito and M. Mondelli, “Concentration without independence via information
    measures,” in <i>Proceedings of 2023 IEEE International Symposium on Information
    Theory</i>, Taipei, Taiwan.
  ista: 'Esposito AR, Mondelli M. Concentration without independence via information
    measures. Proceedings of 2023 IEEE International Symposium on Information Theory.
    ISIT: IEEE International Symposium on Information Theory.'
  mla: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence
    via Information Measures.” <i>Proceedings of 2023 IEEE International Symposium
    on Information Theory</i>, IEEE, doi:<a href="https://doi.org/10.1109/isit54713.2023.10206899">10.1109/isit54713.2023.10206899</a>.
  short: A.R. Esposito, M. Mondelli, in:, Proceedings of 2023 IEEE International Symposium
    on Information Theory, IEEE, n.d.
conference:
  end_date: 2023-06-30
  location: Taipei, Taiwan
  name: 'ISIT: IEEE International Symposium on Information Theory'
  start_date: 2023-06-25
date_created: 2024-02-02T11:18:40Z
date_published: 2023-06-30T00:00:00Z
date_updated: 2024-02-14T14:24:25Z
day: '30'
department:
- _id: MaMo
doi: 10.1109/isit54713.2023.10206899
external_id:
  arxiv:
  - '2303.07245'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2303.07245
month: '06'
oa: 1
oa_version: Preprint
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of 2023 IEEE International Symposium on Information Theory
publication_status: inpress
publisher: IEEE
quality_controlled: '1'
status: public
title: Concentration without independence via information measures
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14924'
abstract:
- lang: eng
  text: "The stochastic heavy ball method (SHB), also known as stochastic gradient
    descent (SGD) with Polyak's momentum, is widely used in training neural networks.
    However, despite the remarkable success of this algorithm in practice, its theoretical
    characterization remains limited. In this paper, we focus on neural networks with
    two and three layers and provide a rigorous understanding of the properties of
    the solutions found by SHB: (i) stability after dropping out part of the
    neurons, (ii) connectivity along a low-loss path, and (iii) convergence
    to the global optimum. To achieve this goal, we take a mean-field view and
    relate the SHB dynamics to a certain partial differential equation in the limit
    of large network widths. This mean-field perspective has inspired a recent line
    of work focusing on SGD while, in contrast, our paper considers an algorithm with
    momentum. More specifically, after proving existence and uniqueness of the limit
    differential equations, we show convergence to the global optimum and give a quantitative
    bound between the mean-field limit and the SHB dynamics of a finite-width network.
    Armed with this last bound, we are able to establish the dropout-stability and
    connectivity of SHB solutions."
acknowledgement: D. Wu and M. Mondelli are partially supported by the 2019 Lopez-Loreta
  Prize. V. Kungurtsev was supported by the OP VVV project CZ.02.1.01/0.0/0.0/16_019/0000765
  "Research Center for Informatics".
alternative_title:
- TMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Diyuan
  full_name: Wu, Diyuan
  id: 1a5914c2-896a-11ed-bdf8-fb80621a0635
  last_name: Wu
- first_name: Vyacheslav
  full_name: Kungurtsev, Vyacheslav
  last_name: Kungurtsev
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Wu D, Kungurtsev V, Mondelli M. Mean-field analysis for heavy ball methods:
    Dropout-stability, connectivity, and global convergence. In: <i>Transactions on
    Machine Learning Research</i>. ML Research Press; 2023.'
  apa: 'Wu, D., Kungurtsev, V., &#38; Mondelli, M. (2023). Mean-field analysis for
    heavy ball methods: Dropout-stability, connectivity, and global convergence. In
    <i>Transactions on Machine Learning Research</i>. ML Research Press.'
  chicago: 'Wu, Diyuan, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis
    for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.”
    In <i>Transactions on Machine Learning Research</i>. ML Research Press, 2023.'
  ieee: 'D. Wu, V. Kungurtsev, and M. Mondelli, “Mean-field analysis for heavy ball
    methods: Dropout-stability, connectivity, and global convergence,” in <i>Transactions
    on Machine Learning Research</i>, 2023.'
  ista: 'Wu D, Kungurtsev V, Mondelli M. 2023. Mean-field analysis for heavy ball
    methods: Dropout-stability, connectivity, and global convergence. Transactions
    on Machine Learning Research, TMLR.'
  mla: 'Wu, Diyuan, et al. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability,
    Connectivity, and Global Convergence.” <i>Transactions on Machine Learning Research</i>,
    ML Research Press, 2023.'
  short: D. Wu, V. Kungurtsev, M. Mondelli, in:, Transactions on Machine Learning
    Research, ML Research Press, 2023.
date_created: 2024-02-02T11:21:56Z
date_published: 2023-02-28T00:00:00Z
date_updated: 2024-09-10T13:03:20Z
day: '28'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2210.06819'
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2210.06819
month: '02'
oa: 1
oa_version: Published Version
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Transactions on Machine Learning Research
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: 'Mean-field analysis for heavy ball methods: Dropout-stability, connectivity,
  and global convergence'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '13315'
abstract:
- lang: eng
  text: How do statistical dependencies in measurement noise influence high-dimensional
    inference? To answer this, we study the paradigmatic spiked matrix model of principal
    components analysis (PCA), where a rank-one matrix is corrupted by additive noise.
    We go beyond the usual independence assumption on the noise entries, by drawing
    the noise from a low-order polynomial orthogonal matrix ensemble. The resulting
    noise correlations make the setting relevant for applications but analytically
    challenging. We provide a characterization of the Bayes optimal limits of inference
    in this model. If the spike is rotation invariant, we show that standard spectral
    PCA is optimal. However, for more general priors, both PCA and the existing approximate
    message-passing algorithm (AMP) fall short of achieving the information-theoretic
    limits, which we compute using the replica method from statistical physics. We
    thus propose an AMP, inspired by the theory of adaptive Thouless–Anderson–Palmer
    equations, which is empirically observed to saturate the conjectured theoretical
    limit. This AMP comes with a rigorous state evolution analysis tracking its performance.
    Although we focus on specific noise distributions, our methodology can be generalized
    to a wide class of trace matrix ensembles at the cost of more involved expressions.
    Finally, despite the seemingly strong assumption of rotation-invariant noise,
    our theory empirically predicts algorithmic performance on real data, pointing
    at strong universality properties.
acknowledgement: J.B. was funded by the European Union (ERC, CHORAL, project number
  101039794). Views and opinions expressed are however those of the author(s) only
  and do not necessarily reflect those of the European Union or the European Research
  Council. Neither the European Union nor the granting authority can be held responsible
  for them. M.M. was supported by the 2019 Lopez-Loreta Prize. We would like to thank
  the reviewers for the insightful comments and, in particular, for suggesting the
  BAMP-inspired denoisers leading to AMP-AP.
article_number: e2302028120
article_processing_charge: Yes (in subscription journal)
article_type: original
author:
- first_name: Jean
  full_name: Barbier, Jean
  last_name: Barbier
- first_name: Francesco
  full_name: Camilli, Francesco
  last_name: Camilli
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Manuel
  full_name: Sáenz, Manuel
  last_name: Sáenz
citation:
  ama: Barbier J, Camilli F, Mondelli M, Sáenz M. Fundamental limits in structured
    principal component analysis and how to reach them. <i>Proceedings of the National
    Academy of Sciences of the United States of America</i>. 2023;120(30). doi:<a
    href="https://doi.org/10.1073/pnas.2302028120">10.1073/pnas.2302028120</a>
  apa: Barbier, J., Camilli, F., Mondelli, M., &#38; Sáenz, M. (2023). Fundamental
    limits in structured principal component analysis and how to reach them. <i>Proceedings
    of the National Academy of Sciences of the United States of America</i>. National
    Academy of Sciences. <a href="https://doi.org/10.1073/pnas.2302028120">https://doi.org/10.1073/pnas.2302028120</a>
  chicago: Barbier, Jean, Francesco Camilli, Marco Mondelli, and Manuel Sáenz. “Fundamental
    Limits in Structured Principal Component Analysis and How to Reach Them.” <i>Proceedings
    of the National Academy of Sciences of the United States of America</i>. National
    Academy of Sciences, 2023. <a href="https://doi.org/10.1073/pnas.2302028120">https://doi.org/10.1073/pnas.2302028120</a>.
  ieee: J. Barbier, F. Camilli, M. Mondelli, and M. Sáenz, “Fundamental limits in
    structured principal component analysis and how to reach them,” <i>Proceedings
    of the National Academy of Sciences of the United States of America</i>, vol.
    120, no. 30. National Academy of Sciences, 2023.
  ista: Barbier J, Camilli F, Mondelli M, Sáenz M. 2023. Fundamental limits in structured
    principal component analysis and how to reach them. Proceedings of the National
    Academy of Sciences of the United States of America. 120(30), e2302028120.
  mla: Barbier, Jean, et al. “Fundamental Limits in Structured Principal Component
    Analysis and How to Reach Them.” <i>Proceedings of the National Academy of Sciences
    of the United States of America</i>, vol. 120, no. 30, e2302028120, National Academy
    of Sciences, 2023, doi:<a href="https://doi.org/10.1073/pnas.2302028120">10.1073/pnas.2302028120</a>.
  short: J. Barbier, F. Camilli, M. Mondelli, M. Sáenz, Proceedings of the National
    Academy of Sciences of the United States of America 120 (2023).
date_created: 2023-07-30T22:01:02Z
date_published: 2023-07-25T00:00:00Z
date_updated: 2024-09-10T13:03:18Z
day: '25'
ddc:
- '000'
department:
- _id: MaMo
doi: 10.1073/pnas.2302028120
external_id:
  pmid:
  - '37463204'
file:
- access_level: open_access
  checksum: 1fc06228afdb3aa80cf8e7766bcf9dc5
  content_type: application/pdf
  creator: dernst
  date_created: 2023-07-31T07:30:48Z
  date_updated: 2023-07-31T07:30:48Z
  file_id: '13323'
  file_name: 2023_PNAS_Barbier.pdf
  file_size: 995933
  relation: main_file
  success: 1
file_date_updated: 2023-07-31T07:30:48Z
has_accepted_license: '1'
intvolume: '120'
issue: '30'
language:
- iso: eng
month: '07'
oa: 1
oa_version: Published Version
pmid: 1
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the National Academy of Sciences of the United States
  of America
publication_identifier:
  eissn:
  - 1091-6490
publication_status: published
publisher: National Academy of Sciences
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/fcamilli95/Structured-PCA-
scopus_import: '1'
status: public
title: Fundamental limits in structured principal component analysis and how to reach
  them
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 120
year: '2023'
...
---
_id: '13321'
abstract:
- lang: eng
  text: We consider the problem of reconstructing the signal and the hidden variables
    from observations coming from a multi-layer network with rotationally invariant
    weight matrices. The multi-layer structure models inference from deep generative
    priors, and the rotational invariance imposed on the weights generalizes the i.i.d.
    Gaussian assumption by allowing for a complex correlation structure, which is
    typical in applications. In this work, we present a new class of approximate message
    passing (AMP) algorithms and give a state evolution recursion which precisely
    characterizes their performance in the large system limit. In contrast with the
    existing multi-layer VAMP (ML-VAMP) approach, our proposed AMP – dubbed multilayer
    rotationally invariant generalized AMP (ML-RI-GAMP) – provides a natural generalization
    beyond Gaussian designs, in the sense that it recovers the existing Gaussian AMP
    as a special case. Furthermore, ML-RI-GAMP exhibits a significantly lower complexity
    than ML-VAMP, as the computationally intensive singular value decomposition is
    replaced by an estimation of the moments of the design matrices. Finally, our
    numerical results show that this complexity gain comes at little to no cost in
    the performance of the algorithm.
acknowledgement: Marco Mondelli was partially supported by the 2019 Lopez-Loreta prize.
article_processing_charge: No
arxiv: 1
author:
- first_name: Yizhou
  full_name: Xu, Yizhou
  last_name: Xu
- first_name: Tian Qi
  full_name: Hou, Tian Qi
  last_name: Hou
- first_name: Shan Suo
  full_name: Liang, Shan Suo
  last_name: Liang
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Xu Y, Hou TQ, Liang SS, Mondelli M. Approximate message passing for multi-layer
    estimation in rotationally invariant models. In: <i>2023 IEEE Information Theory
    Workshop</i>. Institute of Electrical and Electronics Engineers; 2023:294-298.
    doi:<a href="https://doi.org/10.1109/ITW55543.2023.10160238">10.1109/ITW55543.2023.10160238</a>'
  apa: 'Xu, Y., Hou, T. Q., Liang, S. S., &#38; Mondelli, M. (2023). Approximate message
    passing for multi-layer estimation in rotationally invariant models. In <i>2023
    IEEE Information Theory Workshop</i> (pp. 294–298). Saint-Malo, France: Institute
    of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/ITW55543.2023.10160238">https://doi.org/10.1109/ITW55543.2023.10160238</a>'
  chicago: Xu, Yizhou, Tian Qi Hou, Shan Suo Liang, and Marco Mondelli. “Approximate
    Message Passing for Multi-Layer Estimation in Rotationally Invariant Models.”
    In <i>2023 IEEE Information Theory Workshop</i>, 294–98. Institute of Electrical
    and Electronics Engineers, 2023. <a href="https://doi.org/10.1109/ITW55543.2023.10160238">https://doi.org/10.1109/ITW55543.2023.10160238</a>.
  ieee: Y. Xu, T. Q. Hou, S. S. Liang, and M. Mondelli, “Approximate message passing
    for multi-layer estimation in rotationally invariant models,” in <i>2023 IEEE
    Information Theory Workshop</i>, Saint-Malo, France, 2023, pp. 294–298.
  ista: 'Xu Y, Hou TQ, Liang SS, Mondelli M. 2023. Approximate message passing for
    multi-layer estimation in rotationally invariant models. 2023 IEEE Information
    Theory Workshop. ITW: Information Theory Workshop, 294–298.'
  mla: Xu, Yizhou, et al. “Approximate Message Passing for Multi-Layer Estimation
    in Rotationally Invariant Models.” <i>2023 IEEE Information Theory Workshop</i>,
    Institute of Electrical and Electronics Engineers, 2023, pp. 294–98, doi:<a href="https://doi.org/10.1109/ITW55543.2023.10160238">10.1109/ITW55543.2023.10160238</a>.
  short: Y. Xu, T.Q. Hou, S.S. Liang, M. Mondelli, in:, 2023 IEEE Information Theory
    Workshop, Institute of Electrical and Electronics Engineers, 2023, pp. 294–298.
conference:
  end_date: 2023-04-28
  location: Saint-Malo, France
  name: 'ITW: Information Theory Workshop'
  start_date: 2023-04-23
date_created: 2023-07-30T22:01:04Z
date_published: 2023-05-01T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '01'
department:
- _id: MaMo
doi: 10.1109/ITW55543.2023.10160238
external_id:
  arxiv:
  - '2212.01572'
  isi:
  - '001031733100053'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2212.01572
month: '05'
oa: 1
oa_version: Preprint
page: 294-298
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 2023 IEEE Information Theory Workshop
publication_identifier:
  eissn:
  - 2475-4218
  isbn:
  - '9798350301496'
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
scopus_import: '1'
status: public
title: Approximate message passing for multi-layer estimation in rotationally invariant
  models
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '12859'
abstract:
- lang: eng
  text: 'Machine learning models are vulnerable to adversarial perturbations, and
    a thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon through
    the lens of over-parameterization: smoothly interpolating the data requires significantly
    more parameters than simply memorizing it. However, this "universal" law provides
    only a necessary condition for robustness, and it is unable to discriminate between
    models. In this paper, we address these gaps by focusing on empirical risk minimization
    in two prototypical settings, namely, random features and the neural tangent kernel
    (NTK). We prove that, for random features, the model is not robust for any degree
    of over-parameterization, even when the necessary condition coming from the universal
    law of robustness is satisfied. In contrast, for even activations, the NTK model
    meets the universal lower bound, and it is robust as soon as the necessary condition
    on over-parameterization is fulfilled. This also addresses a conjecture in prior
    work by Bubeck, Li and Nagaraj. Our analysis decouples the effect of the kernel
    of the model from an "interaction matrix", which describes the interaction with
    the test data and captures the effect of the activation. Our theoretical results
    are corroborated by numerical evidence on both synthetic and standard datasets
    (MNIST, CIFAR-10).'
acknowledgement: "Simone Bombari and Marco Mondelli were partially supported by the
  2019 Lopez-Loreta prize, and\r\nthe authors would like to thank Hamed Hassani for
  helpful discussions.\r\n"
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Shayan
  full_name: Kiyani, Shayan
  id: f5a2b424-e339-11ed-8435-ff3b4fe70cf8
  last_name: Kiyani
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Bombari S, Kiyani S, Mondelli M. Beyond the universal law of robustness: Sharper
    laws for random features and neural tangent kernels. In: <i>Proceedings of the
    40th International Conference on Machine Learning</i>. Vol 202. ML Research Press;
    2023:2738-2776.'
  apa: 'Bombari, S., Kiyani, S., &#38; Mondelli, M. (2023). Beyond the universal law
    of robustness: Sharper laws for random features and neural tangent kernels. In
    <i>Proceedings of the 40th International Conference on Machine Learning</i> (Vol.
    202, pp. 2738–2776). Honolulu, HI, United States: ML Research Press.'
  chicago: 'Bombari, Simone, Shayan Kiyani, and Marco Mondelli. “Beyond the Universal
    Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.”
    In <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    202:2738–76. ML Research Press, 2023.'
  ieee: 'S. Bombari, S. Kiyani, and M. Mondelli, “Beyond the universal law of robustness:
    Sharper laws for random features and neural tangent kernels,” in <i>Proceedings
    of the 40th International Conference on Machine Learning</i>, Honolulu, HI, United
    States, 2023, vol. 202, pp. 2738–2776.'
  ista: 'Bombari S, Kiyani S, Mondelli M. 2023. Beyond the universal law of robustness:
    Sharper laws for random features and neural tangent kernels. Proceedings of the
    40th International Conference on Machine Learning. ICML: International Conference
    on Machine Learning, PMLR, vol. 202, 2738–2776.'
  mla: 'Bombari, Simone, et al. “Beyond the Universal Law of Robustness: Sharper Laws
    for Random Features and Neural Tangent Kernels.” <i>Proceedings of the 40th International
    Conference on Machine Learning</i>, vol. 202, ML Research Press, 2023, pp. 2738–76.'
  short: S. Bombari, S. Kiyani, M. Mondelli, in:, Proceedings of the 40th International
    Conference on Machine Learning, ML Research Press, 2023, pp. 2738–2776.
conference:
  end_date: 2023-07-29
  location: Honolulu, HI, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2023-07-23
date_created: 2023-04-23T16:11:03Z
date_published: 2023-10-27T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '27'
department:
- _id: GradSch
- _id: MaMo
external_id:
  arxiv:
  - '2302.01629'
intvolume: '       202'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2302.01629
month: '10'
oa: 1
oa_version: Preprint
page: 2738-2776
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: Proceedings of the 40th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/simone-bombari/beyond-universal-robustness
status: public
title: 'Beyond the universal law of robustness: Sharper laws for random features and
  neural tangent kernels'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '11420'
abstract:
- lang: eng
  text: 'Understanding the properties of neural networks trained via stochastic gradient
    descent (SGD) is at the heart of the theory of deep learning. In this work, we
    take a mean-field view, and consider a two-layer ReLU network trained via noisy-SGD
    for a univariate regularized regression problem. Our main result is that SGD with
    vanishingly small noise injected in the gradients is biased towards a simple solution:
    at convergence, the ReLU network implements a piecewise linear map of the inputs,
    and the number of “knot” points -- i.e., points where the tangent of the ReLU
    network estimator changes -- between two consecutive training inputs is at most
    three. In particular, as the number of neurons of the network grows, the SGD dynamics
    is captured by the solution of a gradient flow and, at convergence, the distribution
    of the weights approaches the unique minimizer of a related free energy, which
    has a Gibbs form. Our key technical contribution consists in the analysis of the
    estimator resulting from this minimizer: we show that its second derivative vanishes
    everywhere, except at some specific locations which represent the “knot” points.
    We also provide empirical evidence that knots at locations distinct from the data
    points might occur, as predicted by our theory.'
acknowledgement: "We would like to thank Mert Pilanci for several exploratory discussions
  in the early stage\r\nof the project, Jan Maas for clarifications about Jordan et
  al. (1998), and Max Zimmer for\r\nsuggestive numerical experiments. A. Shevchenko
  and M. Mondelli are partially supported\r\nby the 2019 Lopez-Loreta Prize. V. Kungurtsev
  acknowledges support to the OP VVV\r\nproject CZ.02.1.01/0.0/0.0/16 019/0000765
  Research Center for Informatics.\r\n"
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Aleksandr
  full_name: Shevchenko, Aleksandr
  id: F2B06EC2-C99E-11E9-89F0-752EE6697425
  last_name: Shevchenko
- first_name: Vyacheslav
  full_name: Kungurtsev, Vyacheslav
  last_name: Kungurtsev
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: Shevchenko A, Kungurtsev V, Mondelli M. Mean-field analysis of piecewise linear
    solutions for wide ReLU networks. <i>Journal of Machine Learning Research</i>.
    2022;23(130):1-55.
  apa: Shevchenko, A., Kungurtsev, V., &#38; Mondelli, M. (2022). Mean-field analysis
    of piecewise linear solutions for wide ReLU networks. <i>Journal of Machine Learning
    Research</i>. Journal of Machine Learning Research.
  chicago: Shevchenko, Aleksandr, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field
    Analysis of Piecewise Linear Solutions for Wide ReLU Networks.” <i>Journal of
    Machine Learning Research</i>. Journal of Machine Learning Research, 2022.
  ieee: A. Shevchenko, V. Kungurtsev, and M. Mondelli, “Mean-field analysis of piecewise
    linear solutions for wide ReLU networks,” <i>Journal of Machine Learning Research</i>,
    vol. 23, no. 130. Journal of Machine Learning Research, pp. 1–55, 2022.
  ista: Shevchenko A, Kungurtsev V, Mondelli M. 2022. Mean-field analysis of piecewise
    linear solutions for wide ReLU networks. Journal of Machine Learning Research.
    23(130), 1–55.
  mla: Shevchenko, Aleksandr, et al. “Mean-Field Analysis of Piecewise Linear Solutions
    for Wide ReLU Networks.” <i>Journal of Machine Learning Research</i>, vol. 23,
    no. 130, Journal of Machine Learning Research, 2022, pp. 1–55.
  short: A. Shevchenko, V. Kungurtsev, M. Mondelli, Journal of Machine Learning Research
    23 (2022) 1–55.
date_created: 2022-05-29T22:01:54Z
date_published: 2022-04-01T00:00:00Z
date_updated: 2024-09-10T13:03:17Z
day: '01'
ddc:
- '000'
department:
- _id: MaMo
- _id: DaAl
external_id:
  arxiv:
  - '2111.02278'
file:
- access_level: open_access
  checksum: d4ff5d1affb34848b5c5e4002483fc62
  content_type: application/pdf
  creator: cchlebak
  date_created: 2022-05-30T08:22:55Z
  date_updated: 2022-05-30T08:22:55Z
  file_id: '11422'
  file_name: 21-1365.pdf
  file_size: 1521701
  relation: main_file
  success: 1
file_date_updated: 2022-05-30T08:22:55Z
has_accepted_license: '1'
intvolume: '        23'
issue: '130'
language:
- iso: eng
month: '04'
oa: 1
oa_version: Published Version
page: 1-55
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: Journal of Machine Learning Research
publication_identifier:
  eissn:
  - 1533-7928
  issn:
  - 1532-4435
publication_status: published
publisher: Journal of Machine Learning Research
quality_controlled: '1'
related_material:
  link:
  - relation: other
    url: https://www.jmlr.org/papers/v23/21-1365.html
scopus_import: '1'
status: public
title: Mean-field analysis of piecewise linear solutions for wide ReLU networks
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
volume: 23
year: '2022'
...
---
_id: '10364'
abstract:
- lang: eng
  text: 'This paper characterizes the latency of the simplified successive-cancellation
    (SSC) decoding scheme for polar codes under hardware resource constraints. In
    particular, when the number of processing elements P that can perform SSC decoding
    operations in parallel is limited, as is the case in practice, the latency of
    SSC decoding is O(N^(1-1/μ) + N/P log2 log2 N/P), where N is the block length
    of the code and μ is the scaling exponent of the channel. Three direct consequences
    of this bound are presented. First, in a fully-parallel implementation where P
    = N/2, the latency of SSC decoding is O(N^(1-1/μ)), which is sublinear in the
    block length. This recovers a result from our earlier work. Second, in a fully-serial
    implementation where P = 1, the latency of SSC decoding scales as O(N log2 log2
    N). The multiplicative constant is also calculated: we show that the latency of
    SSC decoding when P = 1 is given by (2 + o(1))N log2 log2 N. Third, in a semi-parallel
    implementation, the smallest P that gives the same latency as that of the fully-parallel
    implementation is P = N^(1/μ). The tightness of our bound on SSC decoding latency
    and the applicability of the foregoing results is validated through extensive
    simulations.'
acknowledgement: "S. A. Hashemi is supported by a Postdoctoral Fellowship from the
  Natural Sciences and\r\nEngineering Research Council of Canada (NSERC) and by Huawei.
  M. Mondelli is partially\r\nsupported by the 2019 Lopez-Loreta Prize. A. Fazeli
  and A. Vardy were supported in part by\r\nthe National Science Foundation under
  Grant CCF-1764104."
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Seyyed Ali
  full_name: Hashemi, Seyyed Ali
  last_name: Hashemi
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Arman
  full_name: Fazeli, Arman
  last_name: Fazeli
- first_name: Alexander
  full_name: Vardy, Alexander
  last_name: Vardy
- first_name: John
  full_name: Cioffi, John
  last_name: Cioffi
- first_name: Andrea
  full_name: Goldsmith, Andrea
  last_name: Goldsmith
citation:
  ama: Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. Parallelism
    versus latency in simplified successive-cancellation decoding of polar codes.
    <i>IEEE Transactions on Wireless Communications</i>. 2022;21(6):3909-3920. doi:<a
    href="https://doi.org/10.1109/TWC.2021.3125626">10.1109/TWC.2021.3125626</a>
  apa: Hashemi, S. A., Mondelli, M., Fazeli, A., Vardy, A., Cioffi, J., &#38; Goldsmith,
    A. (2022). Parallelism versus latency in simplified successive-cancellation decoding
    of polar codes. <i>IEEE Transactions on Wireless Communications</i>. Institute
    of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/TWC.2021.3125626">https://doi.org/10.1109/TWC.2021.3125626</a>
  chicago: Hashemi, Seyyed Ali, Marco Mondelli, Arman Fazeli, Alexander Vardy, John
    Cioffi, and Andrea Goldsmith. “Parallelism versus Latency in Simplified Successive-Cancellation
    Decoding of Polar Codes.” <i>IEEE Transactions on Wireless Communications</i>.
    Institute of Electrical and Electronics Engineers, 2022. <a href="https://doi.org/10.1109/TWC.2021.3125626">https://doi.org/10.1109/TWC.2021.3125626</a>.
  ieee: S. A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, and A. Goldsmith,
    “Parallelism versus latency in simplified successive-cancellation decoding of
    polar codes,” <i>IEEE Transactions on Wireless Communications</i>, vol. 21, no.
    6. Institute of Electrical and Electronics Engineers, pp. 3909–3920, 2022.
  ista: Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. 2022. Parallelism
    versus latency in simplified successive-cancellation decoding of polar codes.
    IEEE Transactions on Wireless Communications. 21(6), 3909–3920.
  mla: Hashemi, Seyyed Ali, et al. “Parallelism versus Latency in Simplified Successive-Cancellation
    Decoding of Polar Codes.” <i>IEEE Transactions on Wireless Communications</i>,
    vol. 21, no. 6, Institute of Electrical and Electronics Engineers, 2022, pp. 3909–20,
    doi:<a href="https://doi.org/10.1109/TWC.2021.3125626">10.1109/TWC.2021.3125626</a>.
  short: S.A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, A. Goldsmith,
    IEEE Transactions on Wireless Communications 21 (2022) 3909–3920.
date_created: 2021-11-28T23:01:29Z
date_published: 2022-06-01T00:00:00Z
date_updated: 2024-09-10T13:03:18Z
day: '01'
department:
- _id: MaMo
doi: 10.1109/TWC.2021.3125626
external_id:
  arxiv:
  - '2012.13378'
  isi:
  - '000809406400028'
intvolume: '        21'
isi: 1
issue: '6'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2012.13378
month: '06'
oa: 1
oa_version: Preprint
page: 3909-3920
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: IEEE Transactions on Wireless Communications
publication_identifier:
  eissn:
  - 1558-2248
  issn:
  - 1536-1276
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
related_material:
  record:
  - id: '10053'
    relation: earlier_version
    status: public
scopus_import: '1'
status: public
title: Parallelism versus latency in simplified successive-cancellation decoding of
  polar codes
type: journal_article
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
volume: 21
year: '2022'
...
---
_id: '12016'
abstract:
- lang: eng
  text: We consider the problem of coded distributed computing using polar codes.
    The average execution time of a coded computing system is related to the error
    probability for transmission over the binary erasure channel in recent work by
    Soleymani, Jamali and Mahdavifar, where the performance of binary linear codes
    is investigated. In this paper, we focus on polar codes and unveil a connection
    between the average execution time and the scaling exponent μ of the family of
    codes. In the finite-length characterization of polar codes, the scaling exponent
    is a key object capturing the speed of convergence to capacity. In particular,
    we show that (i) the gap between the normalized average execution time of polar
    codes and that of optimal MDS codes is O(n^(-1/μ)), and (ii) this upper bound can
    be improved to roughly O(n^(-1/2)) by considering polar codes with large kernels.
    We conjecture that these bounds could be improved to O(n^(-2/μ)) and O(n^(-1)),
    respectively, and provide a heuristic argument as well as numerical evidence supporting
    this view.
acknowledgement: D. Fathollahi and M. Mondelli were partially supported by the 2019
  Lopez-Loreta Prize. The authors thank Hamed Hassani and Hessam Mahdavifar for helpful
  discussions.
article_processing_charge: No
arxiv: 1
author:
- first_name: Dorsa
  full_name: Fathollahi, Dorsa
  last_name: Fathollahi
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Fathollahi D, Mondelli M. Polar coded computing: The role of the scaling exponent.
    In: <i>2022 IEEE International Symposium on Information Theory</i>. Vol 2022.
    IEEE; 2022:2154-2159. doi:<a href="https://doi.org/10.1109/ISIT50566.2022.9834712">10.1109/ISIT50566.2022.9834712</a>'
  apa: 'Fathollahi, D., &#38; Mondelli, M. (2022). Polar coded computing: The role
    of the scaling exponent. In <i>2022 IEEE International Symposium on Information
    Theory</i> (Vol. 2022, pp. 2154–2159). Espoo, Finland: IEEE. <a href="https://doi.org/10.1109/ISIT50566.2022.9834712">https://doi.org/10.1109/ISIT50566.2022.9834712</a>'
  chicago: 'Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role
    of the Scaling Exponent.” In <i>2022 IEEE International Symposium on Information
    Theory</i>, 2022:2154–59. IEEE, 2022. <a href="https://doi.org/10.1109/ISIT50566.2022.9834712">https://doi.org/10.1109/ISIT50566.2022.9834712</a>.'
  ieee: 'D. Fathollahi and M. Mondelli, “Polar coded computing: The role of the scaling
    exponent,” in <i>2022 IEEE International Symposium on Information Theory</i>,
    Espoo, Finland, 2022, vol. 2022, pp. 2154–2159.'
  ista: 'Fathollahi D, Mondelli M. 2022. Polar coded computing: The role of the scaling
    exponent. 2022 IEEE International Symposium on Information Theory. ISIT: International
    Symposium on Information Theory vol. 2022, 2154–2159.'
  mla: 'Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role of
    the Scaling Exponent.” <i>2022 IEEE International Symposium on Information Theory</i>,
    vol. 2022, IEEE, 2022, pp. 2154–59, doi:<a href="https://doi.org/10.1109/ISIT50566.2022.9834712">10.1109/ISIT50566.2022.9834712</a>.'
  short: D. Fathollahi, M. Mondelli, in:, 2022 IEEE International Symposium on Information
    Theory, IEEE, 2022, pp. 2154–2159.
conference:
  end_date: 2022-07-01
  location: Espoo, Finland
  name: 'ISIT: International Symposium on Information Theory'
  start_date: 2022-06-26
date_created: 2022-09-04T22:02:05Z
date_published: 2022-08-03T00:00:00Z
date_updated: 2024-09-10T13:03:17Z
day: '03'
department:
- _id: MaMo
doi: 10.1109/ISIT50566.2022.9834712
external_id:
  arxiv:
  - '2201.10082'
intvolume: '      2022'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2201.10082
month: '08'
oa: 1
oa_version: Preprint
page: 2154-2159
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: 2022 IEEE International Symposium on Information Theory
publication_identifier:
  isbn:
  - '9781665421591'
  issn:
  - 2157-8095
publication_status: published
publisher: IEEE
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'Polar coded computing: The role of the scaling exponent'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 2022
year: '2022'
...
---
_id: '12480'
abstract:
- lang: eng
  text: 'We consider the problem of estimating a signal from measurements obtained
    via a generalized linear model. We focus on estimators based on approximate message
    passing (AMP), a family of iterative algorithms with many appealing features:
    the performance of AMP in the high-dimensional limit can be succinctly characterized
    under suitable model assumptions; AMP can also be tailored to the empirical distribution
    of the signal entries, and for a wide class of estimation problems, AMP is conjectured
    to be optimal among all polynomial-time algorithms. However, a major issue of
    AMP is that in many models (such as phase retrieval), it requires an initialization
    correlated with the ground-truth signal and independent from the measurement matrix.
    Assuming that such an initialization is available is typically not realistic.
    In this paper, we solve this problem by proposing an AMP algorithm initialized
    with a spectral estimator. With such an initialization, the standard AMP analysis
    fails since the spectral estimator depends in a complicated way on the design
    matrix. Our main contribution is a rigorous characterization of the performance
    of AMP with spectral initialization in the high-dimensional limit. The key technical
    idea is to define and analyze a two-phase artificial AMP algorithm that first
    produces the spectral estimator, and then closely approximates the iterates of
    the true AMP. We also provide numerical results that demonstrate the validity
    of the proposed approach.'
acknowledgement: "The authors would like to thank Andrea Montanari for helpful discussions.\r\nM
  Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R Venkataramanan
  was partially supported by the Alan Turing Institute under the EPSRC Grant\r\nEP/N510129/1."
article_number: '114003'
article_processing_charge: Yes (via OA deal)
article_type: original
author:
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Ramji
  full_name: Venkataramanan, Ramji
  last_name: Venkataramanan
citation:
  ama: 'Mondelli M, Venkataramanan R. Approximate message passing with spectral initialization
    for generalized linear models. <i>Journal of Statistical Mechanics: Theory and
    Experiment</i>. 2022;2022(11). doi:<a href="https://doi.org/10.1088/1742-5468/ac9828">10.1088/1742-5468/ac9828</a>'
  apa: 'Mondelli, M., &#38; Venkataramanan, R. (2022). Approximate message passing
    with spectral initialization for generalized linear models. <i>Journal of Statistical
    Mechanics: Theory and Experiment</i>. IOP Publishing. <a href="https://doi.org/10.1088/1742-5468/ac9828">https://doi.org/10.1088/1742-5468/ac9828</a>'
  chicago: 'Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing
    with Spectral Initialization for Generalized Linear Models.” <i>Journal of Statistical
    Mechanics: Theory and Experiment</i>. IOP Publishing, 2022. <a href="https://doi.org/10.1088/1742-5468/ac9828">https://doi.org/10.1088/1742-5468/ac9828</a>.'
  ieee: 'M. Mondelli and R. Venkataramanan, “Approximate message passing with spectral
    initialization for generalized linear models,” <i>Journal of Statistical Mechanics:
    Theory and Experiment</i>, vol. 2022, no. 11. IOP Publishing, 2022.'
  ista: 'Mondelli M, Venkataramanan R. 2022. Approximate message passing with spectral
    initialization for generalized linear models. Journal of Statistical Mechanics:
    Theory and Experiment. 2022(11), 114003.'
  mla: 'Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with
    Spectral Initialization for Generalized Linear Models.” <i>Journal of Statistical
    Mechanics: Theory and Experiment</i>, vol. 2022, no. 11, 114003, IOP Publishing,
    2022, doi:<a href="https://doi.org/10.1088/1742-5468/ac9828">10.1088/1742-5468/ac9828</a>.'
  short: 'M. Mondelli, R. Venkataramanan, Journal of Statistical Mechanics: Theory
    and Experiment 2022 (2022).'
date_created: 2023-02-02T08:31:57Z
date_published: 2022-11-24T00:00:00Z
date_updated: 2024-03-07T10:36:52Z
day: '24'
ddc:
- '510'
- '530'
department:
- _id: MaMo
doi: 10.1088/1742-5468/ac9828
external_id:
  isi:
  - '000889589900001'
file:
- access_level: open_access
  checksum: 01411ffa76d3e380a0446baeb89b1ef7
  content_type: application/pdf
  creator: dernst
  date_created: 2023-02-02T08:35:52Z
  date_updated: 2023-02-02T08:35:52Z
  file_id: '12481'
  file_name: 2022_JourStatisticalMechanics_Mondelli.pdf
  file_size: 1729997
  relation: main_file
  success: 1
file_date_updated: 2023-02-02T08:35:52Z
has_accepted_license: '1'
intvolume: '      2022'
isi: 1
issue: '11'
keyword:
- Statistics
- Probability and Uncertainty
- Statistics and Probability
- Statistical and Nonlinear Physics
language:
- iso: eng
month: '11'
oa: 1
oa_version: Published Version
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: 'Journal of Statistical Mechanics: Theory and Experiment'
publication_identifier:
  issn:
  - 1742-5468
publication_status: published
publisher: IOP Publishing
quality_controlled: '1'
related_material:
  record:
  - id: '10598'
    relation: earlier_version
    status: public
scopus_import: '1'
status: public
title: Approximate message passing with spectral initialization for generalized linear
  models
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
volume: 2022
year: '2022'
...
---
_id: '12537'
abstract:
- lang: eng
  text: 'The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide
    memorization, optimization and generalization guarantees in deep neural networks.
    A line of work has studied the NTK spectrum for two-layer and deep networks with
    at least a layer with Ω(N) neurons, N being the number of training samples. Furthermore,
    there is increasing evidence suggesting that deep networks with sub-linear layer
    widths are powerful memorizers and optimizers, as long as the number of parameters
    exceeds the number of samples. Thus, a natural open question is whether the NTK
    is well conditioned in such a challenging sub-linear setup. In this paper, we
    answer this question in the affirmative. Our key technical contribution is a lower
    bound on the smallest NTK eigenvalue for deep networks with the minimum possible
    over-parameterization: the number of parameters is roughly Ω(N) and, hence, the
    number of neurons is as little as Ω(√N). To showcase the applicability of our
    NTK bounds, we provide two results concerning memorization capacity and optimization
    guarantees for gradient descent training.'
acknowledgement: "The authors were partially supported by the 2019 Lopez-Loreta prize,
  and they would like to thank\r\nQuynh Nguyen, Mahdi Soltanolkotabi and Adel Javanmard
  for helpful discussions.\r\n"
article_processing_charge: No
arxiv: 1
author:
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Mohammad Hossein
  full_name: Amani, Mohammad Hossein
  last_name: Amani
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Bombari S, Amani MH, Mondelli M. Memorization and optimization in deep neural
    networks with minimum over-parameterization. In: <i>36th Conference on Neural
    Information Processing Systems</i>. Vol 35. Curran Associates; 2022:7628-7640.'
  apa: Bombari, S., Amani, M. H., &#38; Mondelli, M. (2022). Memorization and optimization
    in deep neural networks with minimum over-parameterization. In <i>36th Conference
    on Neural Information Processing Systems</i> (Vol. 35, pp. 7628–7640). Curran
    Associates.
  chicago: Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli. “Memorization
    and Optimization in Deep Neural Networks with Minimum Over-Parameterization.”
    In <i>36th Conference on Neural Information Processing Systems</i>, 35:7628–40.
    Curran Associates, 2022.
  ieee: S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in
    deep neural networks with minimum over-parameterization,” in <i>36th Conference
    on Neural Information Processing Systems</i>, 2022, vol. 35, pp. 7628–7640.
  ista: Bombari S, Amani MH, Mondelli M. 2022. Memorization and optimization in deep
    neural networks with minimum over-parameterization. 36th Conference on Neural
    Information Processing Systems. vol. 35, 7628–7640.
  mla: Bombari, Simone, et al. “Memorization and Optimization in Deep Neural Networks
    with Minimum Over-Parameterization.” <i>36th Conference on Neural Information
    Processing Systems</i>, vol. 35, Curran Associates, 2022, pp. 7628–40.
  short: S. Bombari, M.H. Amani, M. Mondelli, in:, 36th Conference on Neural Information
    Processing Systems, Curran Associates, 2022, pp. 7628–7640.
date_created: 2023-02-10T13:46:37Z
date_published: 2022-07-24T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '24'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2205.10217'
intvolume: '        35'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2205.10217
month: '07'
oa: 1
oa_version: Preprint
page: 7628-7640
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loreta 2019 - Marco Mondelli
publication: 36th Conference on Neural Information Processing Systems
publication_identifier:
  isbn:
  - '9781713871088'
publication_status: published
publisher: Curran Associates
quality_controlled: '1'
status: public
title: Memorization and optimization in deep neural networks with minimum over-parameterization
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 35
year: '2022'
...
---
_id: '12540'
abstract:
- lang: eng
  text: We consider the problem of signal estimation in generalized linear models
    defined via rotationally invariant design matrices. Since these matrices can have
    an arbitrary spectral distribution, this model is well suited for capturing complex
    correlation structures which often arise in applications. We propose a novel family
    of approximate message passing (AMP) algorithms for signal estimation, and rigorously
    characterize their performance in the high-dimensional limit via a state evolution
    recursion. Our rotationally invariant AMP has complexity of the same order as
    the existing AMP derived under the restrictive assumption of a Gaussian design;
    our algorithm also recovers this existing AMP as a special case. Numerical results
    showcase a performance close to Vector AMP (which is conjectured to be Bayes-optimal
    in some settings), but obtained with a much lower complexity, as the proposed
    algorithm does not require a computationally expensive singular value decomposition.
acknowledgement: The authors would like to thank the anonymous reviewers for their
  helpful comments. KK and MM were partially supported by the 2019 Lopez-Loreta Prize.
article_number: '22'
article_processing_charge: No
author:
- first_name: Ramji
  full_name: Venkataramanan, Ramji
  last_name: Venkataramanan
- first_name: Kevin
  full_name: Kögler, Kevin
  id: 94ec913c-dc85-11ea-9058-e5051ab2428b
  last_name: Kögler
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Venkataramanan R, Kögler K, Mondelli M. Estimation in rotationally invariant
    generalized linear models via approximate message passing. In: <i>Proceedings
    of the 39th International Conference on Machine Learning</i>. Vol 162. ML Research
    Press; 2022.'
  apa: 'Venkataramanan, R., Kögler, K., &#38; Mondelli, M. (2022). Estimation in rotationally
    invariant generalized linear models via approximate message passing. In <i>Proceedings
    of the 39th International Conference on Machine Learning</i> (Vol. 162). Baltimore,
    MD, United States: ML Research Press.'
  chicago: Venkataramanan, Ramji, Kevin Kögler, and Marco Mondelli. “Estimation in
    Rotationally Invariant Generalized Linear Models via Approximate Message Passing.”
    In <i>Proceedings of the 39th International Conference on Machine Learning</i>,
    Vol. 162. ML Research Press, 2022.
  ieee: R. Venkataramanan, K. Kögler, and M. Mondelli, “Estimation in rotationally
    invariant generalized linear models via approximate message passing,” in <i>Proceedings
    of the 39th International Conference on Machine Learning</i>, Baltimore, MD, United
    States, 2022, vol. 162.
  ista: 'Venkataramanan R, Kögler K, Mondelli M. 2022. Estimation in rotationally
    invariant generalized linear models via approximate message passing. Proceedings
    of the 39th International Conference on Machine Learning. ICML: International
    Conference on Machine Learning vol. 162, 22.'
  mla: Venkataramanan, Ramji, et al. “Estimation in Rotationally Invariant Generalized
    Linear Models via Approximate Message Passing.” <i>Proceedings of the 39th International
    Conference on Machine Learning</i>, vol. 162, 22, ML Research Press, 2022.
  short: R. Venkataramanan, K. Kögler, M. Mondelli, in:, Proceedings of the 39th International
    Conference on Machine Learning, ML Research Press, 2022.
conference:
  end_date: 2022-07-23
  location: Baltimore, MD, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2022-07-17
date_created: 2023-02-10T13:49:04Z
date_published: 2022-01-01T00:00:00Z
date_updated: 2024-09-10T13:03:17Z
ddc:
- '000'
department:
- _id: MaMo
file:
- access_level: open_access
  checksum: 67436eb0a660789514cdf9db79e84683
  content_type: application/pdf
  creator: dernst
  date_created: 2023-02-13T10:53:11Z
  date_updated: 2023-02-13T10:53:11Z
  file_id: '12547'
  file_name: 2022_PMLR_Venkataramanan.pdf
  file_size: 2341343
  relation: main_file
  success: 1
file_date_updated: 2023-02-13T10:53:11Z
has_accepted_license: '1'
intvolume: '162'
language:
- iso: eng
oa: 1
oa_version: Published Version
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the 39th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: Estimation in rotationally invariant generalized linear models via approximate
  message passing
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 162
year: '2022'
...
---
_id: '13146'
abstract:
- lang: eng
  text: 'A recent line of work has analyzed the theoretical properties of deep neural
    networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue
    of the NTK has been related to the memorization capacity, the global convergence
    of gradient descent algorithms and the generalization of deep nets. However, existing
    results either provide bounds in the two-layer setting or assume that the spectrum
    of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper,
    we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU
    nets, both in the limiting case of infinite widths and for finite widths. In the
    finite-width setting, the network architectures we consider are fairly general:
    we require the existence of a wide layer with roughly order of N neurons, N being
    the number of data samples; and the scaling of the remaining layer widths is arbitrary
    (up to logarithmic factors). To obtain our results, we analyze various quantities
    of independent interest: we give lower bounds on the smallest singular value of
    hidden feature matrices, and upper bounds on the Lipschitz constant of input-output
    feature maps.'
acknowledgement: The authors would like to thank the anonymous reviewers for their
  helpful comments. MM was partially supported by the 2019 Lopez-Loreta Prize. QN
  and GM acknowledge support from the European Research Council (ERC) under the European
  Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).
article_processing_charge: No
arxiv: 1
author:
- first_name: Quynh
  full_name: Nguyen, Quynh
  last_name: Nguyen
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Guido
  full_name: Montufar, Guido
  last_name: Montufar
citation:
  ama: 'Nguyen Q, Mondelli M, Montufar G. Tight bounds on the smallest Eigenvalue
    of the neural tangent kernel for deep ReLU networks. In: <i>Proceedings of the
    38th International Conference on Machine Learning</i>. Vol 139. ML Research Press;
    2021:8119-8129.'
  apa: 'Nguyen, Q., Mondelli, M., &#38; Montufar, G. (2021). Tight bounds on the smallest
    Eigenvalue of the neural tangent kernel for deep ReLU networks. In <i>Proceedings
    of the 38th International Conference on Machine Learning</i> (Vol. 139, pp. 8119–8129).
    Virtual: ML Research Press.'
  chicago: Nguyen, Quynh, Marco Mondelli, and Guido Montufar. “Tight Bounds on the
    Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” In <i>Proceedings
    of the 38th International Conference on Machine Learning</i>, 139:8119–29. ML
    Research Press, 2021.
  ieee: Q. Nguyen, M. Mondelli, and G. Montufar, “Tight bounds on the smallest Eigenvalue
    of the neural tangent kernel for deep ReLU networks,” in <i>Proceedings of the
    38th International Conference on Machine Learning</i>, Virtual, 2021, vol. 139,
    pp. 8119–8129.
  ista: Nguyen Q, Mondelli M, Montufar G. 2021. Tight bounds on the smallest Eigenvalue
    of the neural tangent kernel for deep ReLU networks. Proceedings of the 38th International
    Conference on Machine Learning. International Conference on Machine Learning vol.
    139, 8119–8129.
  mla: Nguyen, Quynh, et al. “Tight Bounds on the Smallest Eigenvalue of the Neural
    Tangent Kernel for Deep ReLU Networks.” <i>Proceedings of the 38th International
    Conference on Machine Learning</i>, vol. 139, ML Research Press, 2021, pp. 8119–29.
  short: Q. Nguyen, M. Mondelli, G. Montufar, in:, Proceedings of the 38th International
    Conference on Machine Learning, ML Research Press, 2021, pp. 8119–8129.
conference:
  end_date: 2021-07-24
  location: Virtual
  name: International Conference on Machine Learning
  start_date: 2021-07-18
date_created: 2023-06-18T22:00:48Z
date_published: 2021-07-01T00:00:00Z
date_updated: 2024-09-10T13:03:17Z
day: '01'
ddc:
- '000'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2012.11654'
file:
- access_level: open_access
  checksum: 19489cf5e16a0596b1f92e317d97c9b0
  content_type: application/pdf
  creator: dernst
  date_created: 2023-06-19T10:49:12Z
  date_updated: 2023-06-19T10:49:12Z
  file_id: '13155'
  file_name: 2021_PMLR_Nguyen.pdf
  file_size: 591332
  relation: main_file
  success: 1
file_date_updated: 2023-06-19T10:49:12Z
has_accepted_license: '1'
intvolume: '139'
language:
- iso: eng
month: '07'
oa: 1
oa_version: Published Version
page: 8119-8129
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the 38th International Conference on Machine Learning
publication_identifier:
  eissn:
  - 2640-3498
  isbn:
  - '9781713845065'
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep
  ReLU networks
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 139
year: '2021'
...
---
_id: '10053'
abstract:
- lang: eng
  text: 'This paper characterizes the latency of the simplified successive-cancellation
    (SSC) decoding scheme for polar codes under hardware resource constraints. In
    particular, when the number of processing elements P that can perform SSC decoding
    operations in parallel is limited, as is the case in practice, the latency of
    SSC decoding is O(N^{1-1/μ} + (N/P) log₂ log₂ (N/P)), where N is the block length
    of the code and μ is the scaling exponent of polar codes for the channel. Three
    direct consequences of this bound are presented. First, in a fully-parallel implementation
    where P = N/2, the latency of SSC decoding is O(N^{1-1/μ}), which is sublinear
    in the block length. This recovers a result from an earlier work. Second, in a
    fully-serial implementation where P = 1, the latency of SSC decoding scales as
    O(N log₂ log₂ N). The multiplicative constant is also calculated: we show that
    the latency of SSC decoding when P = 1 is given by (2 + o(1)) N log₂ log₂ N.
    Third, in a semi-parallel implementation, the smallest P that gives the same
    latency as that of the fully-parallel implementation is P = N^{1/μ}. The tightness
    of our bound on SSC decoding latency and the applicability of the foregoing results
    are validated through extensive simulations.'
acknowledgement: "S. A. Hashemi is supported by a Postdoctoral Fellowship from the
  Natural Sciences and Engineering Research Council\r\nof Canada (NSERC) and by Huawei.
  M. Mondelli is partially supported by the 2019 Lopez-Loreta Prize. A. Fazeli and
  A. Vardy were supported in part by the National Science Foundation under Grant CCF-1764104."
article_processing_charge: No
arxiv: 1
author:
- first_name: Seyyed Ali
  full_name: Hashemi, Seyyed Ali
  last_name: Hashemi
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Arman
  full_name: Fazeli, Arman
  last_name: Fazeli
- first_name: Alexander
  full_name: Vardy, Alexander
  last_name: Vardy
- first_name: John
  full_name: Cioffi, John
  last_name: Cioffi
- first_name: Andrea
  full_name: Goldsmith, Andrea
  last_name: Goldsmith
citation:
  ama: 'Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. Parallelism
    versus latency in simplified successive-cancellation decoding of polar codes.
    In: <i>2021 IEEE International Symposium on Information Theory</i>. Institute
    of Electrical and Electronics Engineers; 2021:2369-2374. doi:<a href="https://doi.org/10.1109/ISIT45174.2021.9518153">10.1109/ISIT45174.2021.9518153</a>'
  apa: 'Hashemi, S. A., Mondelli, M., Fazeli, A., Vardy, A., Cioffi, J., &#38; Goldsmith,
    A. (2021). Parallelism versus latency in simplified successive-cancellation decoding
    of polar codes. In <i>2021 IEEE International Symposium on Information Theory</i>
    (pp. 2369–2374). Melbourne, Australia: Institute of Electrical and Electronics
    Engineers. <a href="https://doi.org/10.1109/ISIT45174.2021.9518153">https://doi.org/10.1109/ISIT45174.2021.9518153</a>'
  chicago: Hashemi, Seyyed Ali, Marco Mondelli, Arman Fazeli, Alexander Vardy, John
    Cioffi, and Andrea Goldsmith. “Parallelism versus Latency in Simplified Successive-Cancellation
    Decoding of Polar Codes.” In <i>2021 IEEE International Symposium on Information
    Theory</i>, 2369–74. Institute of Electrical and Electronics Engineers, 2021.
    <a href="https://doi.org/10.1109/ISIT45174.2021.9518153">https://doi.org/10.1109/ISIT45174.2021.9518153</a>.
  ieee: S. A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, and A. Goldsmith,
    “Parallelism versus latency in simplified successive-cancellation decoding of
    polar codes,” in <i>2021 IEEE International Symposium on Information Theory</i>,
    Melbourne, Australia, 2021, pp. 2369–2374.
  ista: 'Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. 2021. Parallelism
    versus latency in simplified successive-cancellation decoding of polar codes.
    2021 IEEE International Symposium on Information Theory. ISIT: International Symposium
    on Information Theory, 2369–2374.'
  mla: Hashemi, Seyyed Ali, et al. “Parallelism versus Latency in Simplified Successive-Cancellation
    Decoding of Polar Codes.” <i>2021 IEEE International Symposium on Information
    Theory</i>, Institute of Electrical and Electronics Engineers, 2021, pp. 2369–74,
    doi:<a href="https://doi.org/10.1109/ISIT45174.2021.9518153">10.1109/ISIT45174.2021.9518153</a>.
  short: S.A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, A. Goldsmith,
    in:, 2021 IEEE International Symposium on Information Theory, Institute of Electrical
    and Electronics Engineers, 2021, pp. 2369–2374.
conference:
  end_date: 2021-07-20
  location: Melbourne, Australia
  name: 'ISIT: International Symposium on Information Theory'
  start_date: 2021-07-12
date_created: 2021-09-27T14:33:14Z
date_published: 2021-09-01T00:00:00Z
date_updated: 2024-09-10T13:03:18Z
day: '01'
department:
- _id: MaMo
doi: 10.1109/ISIT45174.2021.9518153
external_id:
  arxiv:
  - '2012.13378'
  isi:
  - '000701502202078'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2012.13378
month: '09'
oa: 1
oa_version: Preprint
page: 2369-2374
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 2021 IEEE International Symposium on Information Theory
publication_identifier:
  eisbn:
  - 978-1-5386-8209-8
  isbn:
  - 978-1-5386-8210-4
  issn:
  - 2157-8095
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
related_material:
  record:
  - id: '10364'
    relation: later_version
    status: public
scopus_import: '1'
status: public
title: Parallelism versus latency in simplified successive-cancellation decoding of
  polar codes
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2021'
...
---
_id: '10593'
abstract:
- lang: eng
  text: 'We study the problem of estimating a rank-$1$ signal in the presence of rotationally
    invariant noise, a class of perturbations more general than Gaussian noise. Principal
    Component Analysis (PCA) provides a natural estimator, and sharp results on its
    performance have been obtained in the high-dimensional regime. Recently, an Approximate
    Message Passing (AMP) algorithm has been proposed as an alternative estimator
    with the potential to improve the accuracy of PCA. However, the existing analysis
    of AMP requires an initialization that is both correlated with the signal and
    independent of the noise, which is often unrealistic in practice. In this work,
    we combine the two methods, and propose to initialize AMP with PCA. Our main result
    is a rigorous asymptotic characterization of the performance of this estimator.
    Both the AMP algorithm and its analysis differ from those previously derived in
    the Gaussian setting: at every iteration, our AMP algorithm requires a specific
    term to account for PCA initialization, while in the Gaussian case, PCA initialization
    affects only the first iteration of AMP. The proof is based on a two-phase artificial
    AMP that first approximates the PCA estimator and then mimics the true AMP. Our
    numerical simulations show an excellent agreement between AMP results and theoretical
    predictions, and suggest an interesting open direction on achieving Bayes-optimal
    performance.'
acknowledgement: "M. Mondelli would like to thank László Erdős for helpful discussions.
  M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R. Venkataramanan
  was partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1.\r\n"
article_processing_charge: No
arxiv: 1
author:
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Ramji
  full_name: Venkataramanan, Ramji
  last_name: Venkataramanan
citation:
  ama: 'Mondelli M, Venkataramanan R. PCA initialization for approximate message passing
    in rotationally invariant models. In: <i>35th Conference on Neural Information
    Processing Systems</i>. Vol 35. Neural Information Processing Systems Foundation;
    2021:29616-29629.'
  apa: 'Mondelli, M., &#38; Venkataramanan, R. (2021). PCA initialization for approximate
    message passing in rotationally invariant models. In <i>35th Conference on Neural
    Information Processing Systems</i> (Vol. 35, pp. 29616–29629). Virtual: Neural
    Information Processing Systems Foundation.'
  chicago: Mondelli, Marco, and Ramji Venkataramanan. “PCA Initialization for Approximate
    Message Passing in Rotationally Invariant Models.” In <i>35th Conference on Neural
    Information Processing Systems</i>, 35:29616–29. Neural Information Processing
    Systems Foundation, 2021.
  ieee: M. Mondelli and R. Venkataramanan, “PCA initialization for approximate message
    passing in rotationally invariant models,” in <i>35th Conference on Neural Information
    Processing Systems</i>, Virtual, 2021, vol. 35, pp. 29616–29629.
  ista: 'Mondelli M, Venkataramanan R. 2021. PCA initialization for approximate message
    passing in rotationally invariant models. 35th Conference on Neural Information
    Processing Systems. NeurIPS: Neural Information Processing Systems vol. 35, 29616–29629.'
  mla: Mondelli, Marco, and Ramji Venkataramanan. “PCA Initialization for Approximate
    Message Passing in Rotationally Invariant Models.” <i>35th Conference on Neural
    Information Processing Systems</i>, vol. 35, Neural Information Processing Systems
    Foundation, 2021, pp. 29616–29.
  short: M. Mondelli, R. Venkataramanan, in:, 35th Conference on Neural Information
    Processing Systems, Neural Information Processing Systems Foundation, 2021, pp.
    29616–29629.
conference:
  end_date: 2021-12-14
  location: Virtual
  name: 'NeurIPS: Neural Information Processing Systems'
  start_date: 2021-12-06
date_created: 2022-01-03T10:50:02Z
date_published: 2021-12-01T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '01'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2106.02356'
intvolume: '35'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2106.02356
month: '12'
oa: 1
oa_version: Preprint
page: 29616-29629
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 35th Conference on Neural Information Processing Systems
publication_identifier:
  isbn:
  - '9781713845393'
  issn:
  - 1049-5258
publication_status: published
publisher: Neural Information Processing Systems Foundation
quality_controlled: '1'
scopus_import: '1'
status: public
title: PCA initialization for approximate message passing in rotationally invariant
  models
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 35
year: '2021'
...
---
_id: '10594'
abstract:
- lang: eng
  text: 'The question of how and why the phenomenon of mode connectivity occurs in
    training deep neural networks has gained remarkable attention in the research
    community. From a theoretical perspective, two possible explanations have been
    proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions
    found by stochastic gradient descent are dropout stable. While these explanations
    provide insights into the phenomenon, their assumptions are not always satisfied
    in practice. In particular, the first approach requires the network to have one
    layer with order of N neurons (N being the number of training samples), while
    the second one requires the loss to be almost invariant after removing half of
    the neurons at each layer (up to some rescaling of the remaining ones). In this
    work, we improve both conditions by exploiting the quality of the features at
    every intermediate layer together with a milder over-parameterization condition.
    More specifically, we show that: (i) under generic assumptions on the features
    of intermediate layers, it suffices that the last two hidden layers have order
    of √N neurons, and (ii) if subsets of features at each layer are linearly separable,
    then no over-parameterization is needed to show the connectivity. Our experiments
    confirm that the proposed condition ensures the connectivity of solutions found
    by stochastic gradient descent, even in settings where the previous requirements
    do not hold.'
acknowledgement: MM was partially supported by the 2019 Lopez-Loreta Prize. QN and
  PB acknowledge support from the European Research Council (ERC) under the European
  Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).
article_processing_charge: No
arxiv: 1
author:
- first_name: Quynh
  full_name: Nguyen, Quynh
  last_name: Nguyen
- first_name: Pierre
  full_name: Bréchet, Pierre
  last_name: Bréchet
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks?
    In: <i>35th Conference on Neural Information Processing Systems</i>. Vol 35. Neural
    Information Processing Systems Foundation; 2021.'
  apa: 'Nguyen, Q., Bréchet, P., &#38; Mondelli, M. (2021). When are solutions connected
    in deep networks? In <i>35th Conference on Neural Information Processing Systems</i>
    (Vol. 35). Virtual: Neural Information Processing Systems Foundation.'
  chicago: Nguyen, Quynh, Pierre Bréchet, and Marco Mondelli. “When Are Solutions
    Connected in Deep Networks?” In <i>35th Conference on Neural Information Processing
    Systems</i>, Vol. 35. Neural Information Processing Systems Foundation, 2021.
  ieee: Q. Nguyen, P. Bréchet, and M. Mondelli, “When are solutions connected in deep
    networks?,” in <i>35th Conference on Neural Information Processing Systems</i>,
    Virtual, 2021, vol. 35.
  ista: Nguyen Q, Bréchet P, Mondelli M. 2021. When are solutions connected in deep
    networks? 35th Conference on Neural Information Processing Systems. 35th Conference
    on Neural Information Processing Systems vol. 35.
  mla: Nguyen, Quynh, et al. “When Are Solutions Connected in Deep Networks?” <i>35th
    Conference on Neural Information Processing Systems</i>, vol. 35, Neural Information
    Processing Systems Foundation, 2021.
  short: Q. Nguyen, P. Bréchet, M. Mondelli, in:, 35th Conference on Neural Information
    Processing Systems, Neural Information Processing Systems Foundation, 2021.
conference:
  end_date: 2021-12-14
  location: Virtual
  name: 35th Conference on Neural Information Processing Systems
  start_date: 2021-12-06
date_created: 2022-01-03T10:56:20Z
date_published: 2021-12-01T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '01'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2102.09671'
intvolume: '35'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2102.09671
month: '12'
oa: 1
oa_version: Preprint
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 35th Conference on Neural Information Processing Systems
publication_identifier:
  isbn:
  - '9781713845393'
  issn:
  - 1049-5258
publication_status: published
publisher: Neural Information Processing Systems Foundation
quality_controlled: '1'
status: public
title: When are solutions connected in deep networks?
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 35
year: '2021'
...
---
_id: '10595'
abstract:
- lang: eng
  text: 'A recent line of work has analyzed the theoretical properties of deep neural
    networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue
    of the NTK has been related to the memorization capacity, the global convergence
    of gradient descent algorithms and the generalization of deep nets. However, existing
    results either provide bounds in the two-layer setting or assume that the spectrum
    of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper,
    we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU
    nets, both in the limiting case of infinite widths and for finite widths. In the
    finite-width setting, the network architectures we consider are fairly general:
    we require the existence of a wide layer with roughly order of $N$ neurons, $N$
    being the number of data samples; and the scaling of the remaining layer widths
    is arbitrary (up to logarithmic factors). To obtain our results, we analyze various
    quantities of independent interest: we give lower bounds on the smallest singular
    value of hidden feature matrices, and upper bounds on the Lipschitz constant of
    input-output feature maps.'
acknowledgement: "The authors would like to thank the anonymous reviewers for their
  helpful comments. MM was partially supported\r\nby the 2019 Lopez-Loreta Prize.
  QN and GM acknowledge support from the European Research Council (ERC) under\r\nthe
  European Union’s Horizon 2020 research and innovation programme (grant agreement
  no 757983)."
alternative_title:
- Proceedings of Machine Learning Research
article_processing_charge: No
arxiv: 1
author:
- first_name: Quynh
  full_name: Nguyen, Quynh
  last_name: Nguyen
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Guido F
  full_name: Montufar, Guido F
  last_name: Montufar
citation:
  ama: 'Nguyen Q, Mondelli M, Montufar GF. Tight bounds on the smallest eigenvalue
    of the neural tangent kernel for deep ReLU networks. In: Meila M, Zhang T, eds.
    <i>Proceedings of the 38th International Conference on Machine Learning</i>. Vol
    139. ML Research Press; 2021:8119-8129.'
  apa: 'Nguyen, Q., Mondelli, M., &#38; Montufar, G. F. (2021). Tight bounds on the
    smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In M.
    Meila &#38; T. Zhang (Eds.), <i>Proceedings of the 38th International Conference
    on Machine Learning</i> (Vol. 139, pp. 8119–8129). Virtual: ML Research Press.'
  chicago: Nguyen, Quynh, Marco Mondelli, and Guido F Montufar. “Tight Bounds on the
    Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” In <i>Proceedings
    of the 38th International Conference on Machine Learning</i>, edited by Marina
    Meila and Tong Zhang, 139:8119–29. ML Research Press, 2021.
  ieee: Q. Nguyen, M. Mondelli, and G. F. Montufar, “Tight bounds on the smallest
    eigenvalue of the neural tangent kernel for deep ReLU networks,” in <i>Proceedings
    of the 38th International Conference on Machine Learning</i>, Virtual, 2021, vol.
    139, pp. 8119–8129.
  ista: 'Nguyen Q, Mondelli M, Montufar GF. 2021. Tight bounds on the smallest eigenvalue
    of the neural tangent kernel for deep ReLU networks. Proceedings of the 38th International
    Conference on Machine Learning. ICML: International Conference on Machine Learning,
    Proceedings of Machine Learning Research, vol. 139, 8119–8129.'
  mla: Nguyen, Quynh, et al. “Tight Bounds on the Smallest Eigenvalue of the Neural
    Tangent Kernel for Deep ReLU Networks.” <i>Proceedings of the 38th International
    Conference on Machine Learning</i>, edited by Marina Meila and Tong Zhang, vol.
    139, ML Research Press, 2021, pp. 8119–29.
  short: Q. Nguyen, M. Mondelli, G.F. Montufar, in:, M. Meila, T. Zhang (Eds.), Proceedings
    of the 38th International Conference on Machine Learning, ML Research Press, 2021,
    pp. 8119–8129.
conference:
  end_date: 2021-07-24
  location: Virtual
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2021-07-18
date_created: 2022-01-03T10:57:49Z
date_published: 2021-01-01T00:00:00Z
date_updated: 2024-09-10T13:03:17Z
department:
- _id: MaMo
editor:
- first_name: Marina
  full_name: Meila, Marina
  last_name: Meila
- first_name: Tong
  full_name: Zhang, Tong
  last_name: Zhang
external_id:
  arxiv:
  - '2012.11654'
intvolume: '139'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: http://proceedings.mlr.press/v139/nguyen21g.html
oa: 1
oa_version: Published Version
page: 8119-8129
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the 38th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep
  ReLU networks
type: conference
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
volume: 139
year: '2021'
...
---
_id: '10597'
acknowledgement: We thank Emmanuel Abbe and Min Ye for providing us the implementation
  of RPA decoding. D. Fathollahi and M. Mondelli are partially supported by the 2019
  Lopez-Loreta Prize. N. Farsad is supported by Discovery Grant from the Natural Sciences
  and Engineering Research Council of Canada (NSERC) and Canada Foundation for Innovation
  (CFI), John R. Evans Leader Fund. S. A. Hashemi is supported by a Postdoctoral
  Fellowship from NSERC.
article_processing_charge: No
arxiv: 1
author:
- first_name: Dorsa
  full_name: Fathollahi, Dorsa
  last_name: Fathollahi
- first_name: Nariman
  full_name: Farsad, Nariman
  last_name: Farsad
- first_name: Seyyed Ali
  full_name: Hashemi, Seyyed Ali
  last_name: Hashemi
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Fathollahi D, Farsad N, Hashemi SA, Mondelli M. Sparse multi-decoder recursive
    projection aggregation for Reed-Muller codes. In: <i>2021 IEEE International Symposium
    on Information Theory</i>. Institute of Electrical and Electronics Engineers;
    2021:1082-1087. doi:<a href="https://doi.org/10.1109/isit45174.2021.9517887">10.1109/isit45174.2021.9517887</a>'
  apa: 'Fathollahi, D., Farsad, N., Hashemi, S. A., &#38; Mondelli, M. (2021). Sparse
    multi-decoder recursive projection aggregation for Reed-Muller codes. In <i>2021
    IEEE International Symposium on Information Theory</i> (pp. 1082–1087). Virtual,
    Melbourne, Australia: Institute of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/isit45174.2021.9517887">https://doi.org/10.1109/isit45174.2021.9517887</a>'
  chicago: Fathollahi, Dorsa, Nariman Farsad, Seyyed Ali Hashemi, and Marco Mondelli.
    “Sparse Multi-Decoder Recursive Projection Aggregation for Reed-Muller Codes.”
    In <i>2021 IEEE International Symposium on Information Theory</i>, 1082–87. Institute
    of Electrical and Electronics Engineers, 2021. <a href="https://doi.org/10.1109/isit45174.2021.9517887">https://doi.org/10.1109/isit45174.2021.9517887</a>.
  ieee: D. Fathollahi, N. Farsad, S. A. Hashemi, and M. Mondelli, “Sparse multi-decoder
    recursive projection aggregation for Reed-Muller codes,” in <i>2021 IEEE International
    Symposium on Information Theory</i>, Virtual, Melbourne, Australia, 2021, pp.
    1082–1087.
  ista: 'Fathollahi D, Farsad N, Hashemi SA, Mondelli M. 2021. Sparse multi-decoder
    recursive projection aggregation for Reed-Muller codes. 2021 IEEE International
    Symposium on Information Theory. ISIT: International Symposium on Information
    Theory, 1082–1087.'
  mla: Fathollahi, Dorsa, et al. “Sparse Multi-Decoder Recursive Projection Aggregation
    for Reed-Muller Codes.” <i>2021 IEEE International Symposium on Information Theory</i>,
    Institute of Electrical and Electronics Engineers, 2021, pp. 1082–87, doi:<a href="https://doi.org/10.1109/isit45174.2021.9517887">10.1109/isit45174.2021.9517887</a>.
  short: D. Fathollahi, N. Farsad, S.A. Hashemi, M. Mondelli, in:, 2021 IEEE International
    Symposium on Information Theory, Institute of Electrical and Electronics Engineers,
    2021, pp. 1082–1087.
conference:
  end_date: 2021-07-20
  location: Virtual, Melbourne, Australia
  name: 'ISIT: International Symposium on Information Theory'
  start_date: 2021-07-12
date_created: 2022-01-03T11:31:26Z
date_published: 2021-09-01T00:00:00Z
date_updated: 2024-09-10T13:03:18Z
day: '01'
department:
- _id: MaMo
doi: 10.1109/isit45174.2021.9517887
external_id:
  arxiv:
  - '2011.12882'
  isi:
  - '000701502201029'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2011.12882
month: '09'
oa: 1
oa_version: Preprint
page: 1082-1087
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 2021 IEEE International Symposium on Information Theory
publication_identifier:
  eisbn:
  - 978-1-5386-8209-8
  isbn:
  - 978-1-5386-8210-4
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
scopus_import: '1'
status: public
title: Sparse multi-decoder recursive projection aggregation for Reed-Muller codes
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2021'
...
---
_id: '10598'
abstract:
- lang: eng
  text: 'We consider the problem of estimating a signal from measurements obtained
    via a generalized linear model. We focus on estimators based on approximate message
    passing (AMP), a family of iterative algorithms with many appealing features:
    the performance of AMP in the high-dimensional limit can be succinctly characterized
    under suitable model assumptions; AMP can also be tailored to the empirical distribution
    of the signal entries, and for a wide class of estimation problems, AMP is conjectured
    to be optimal among all polynomial-time algorithms. However, a major issue of
    AMP is that in many models (such as phase retrieval), it requires an initialization
    correlated with the ground-truth signal and independent from the measurement matrix.
    Assuming that such an initialization is available is typically not realistic.
    In this paper, we solve this problem by proposing an AMP algorithm initialized
    with a spectral estimator. With such an initialization, the standard AMP analysis
    fails since the spectral estimator depends in a complicated way on the design
    matrix. Our main contribution is a rigorous characterization of the performance
    of AMP with spectral initialization in the high-dimensional limit. The key technical
    idea is to define and analyze a two-phase artificial AMP algorithm that first
    produces the spectral estimator, and then closely approximates the iterates of
    the true AMP. We also provide numerical results that demonstrate the validity
    of the proposed approach.'
acknowledgement: The authors would like to thank Andrea Montanari for helpful discussions.
  M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R. Venkataramanan
  was partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1.
alternative_title:
- Proceedings of Machine Learning Research
article_processing_charge: Yes (via OA deal)
arxiv: 1
author:
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Ramji
  full_name: Venkataramanan, Ramji
  last_name: Venkataramanan
citation:
  ama: 'Mondelli M, Venkataramanan R. Approximate message passing with spectral initialization
    for generalized linear models. In: Banerjee A, Fukumizu K, eds. <i>Proceedings
    of The 24th International Conference on Artificial Intelligence and Statistics</i>.
    Vol 130. ML Research Press; 2021:397-405.'
  apa: 'Mondelli, M., &#38; Venkataramanan, R. (2021). Approximate message passing
    with spectral initialization for generalized linear models. In A. Banerjee &#38;
    K. Fukumizu (Eds.), <i>Proceedings of The 24th International Conference on Artificial
    Intelligence and Statistics</i> (Vol. 130, pp. 397–405). Virtual, San Diego, CA,
    United States: ML Research Press.'
  chicago: Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing
    with Spectral Initialization for Generalized Linear Models.” In <i>Proceedings
    of The 24th International Conference on Artificial Intelligence and Statistics</i>,
    edited by Arindam Banerjee and Kenji Fukumizu, 130:397–405. ML Research Press,
    2021.
  ieee: M. Mondelli and R. Venkataramanan, “Approximate message passing with spectral
    initialization for generalized linear models,” in <i>Proceedings of The 24th International
    Conference on Artificial Intelligence and Statistics</i>, Virtual, San Diego,
    CA, United States, 2021, vol. 130, pp. 397–405.
  ista: 'Mondelli M, Venkataramanan R. 2021. Approximate message passing with spectral
    initialization for generalized linear models. Proceedings of The 24th International
    Conference on Artificial Intelligence and Statistics. AISTATS: Artificial Intelligence
    and Statistics, Proceedings of Machine Learning Research, vol. 130, 397–405.'
  mla: Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with
    Spectral Initialization for Generalized Linear Models.” <i>Proceedings of The
    24th International Conference on Artificial Intelligence and Statistics</i>, edited
    by Arindam Banerjee and Kenji Fukumizu, vol. 130, ML Research Press, 2021, pp.
    397–405.
  short: M. Mondelli, R. Venkataramanan, in:, A. Banerjee, K. Fukumizu (Eds.), Proceedings
    of The 24th International Conference on Artificial Intelligence and Statistics,
    ML Research Press, 2021, pp. 397–405.
conference:
  end_date: 2021-04-15
  location: Virtual, San Diego, CA, United States
  name: 'AISTATS: Artificial Intelligence and Statistics'
  start_date: 2021-04-13
date_created: 2022-01-03T11:34:22Z
date_published: 2021-04-01T00:00:00Z
date_updated: 2024-03-07T10:36:53Z
day: '01'
department:
- _id: MaMo
editor:
- first_name: Arindam
  full_name: Banerjee, Arindam
  last_name: Banerjee
- first_name: Kenji
  full_name: Fukumizu, Kenji
  last_name: Fukumizu
external_id:
  arxiv:
  - '2010.03460'
intvolume: '130'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://proceedings.mlr.press/v130/mondelli21a.html
month: '04'
oa: 1
oa_version: Preprint
page: 397-405
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of The 24th International Conference on Artificial Intelligence
  and Statistics
publication_identifier:
  issn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '12480'
    relation: later_version
    status: public
scopus_import: '1'
status: public
title: Approximate message passing with spectral initialization for generalized linear
  models
type: conference
user_id: 3E5EF7F0-F248-11E8-B48F-1D18A9856A87
volume: 130
year: '2021'
...
