---
_id: '12859'
abstract:
- lang: eng
  text: 'Machine learning models are vulnerable to adversarial perturbations, and
    a thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon through
    the lens of over-parameterization: smoothly interpolating the data requires significantly
    more parameters than simply memorizing it. However, this "universal" law provides
    only a necessary condition for robustness, and it is unable to discriminate between
    models. In this paper, we address these gaps by focusing on empirical risk minimization
    in two prototypical settings, namely, random features and the neural tangent kernel
    (NTK). We prove that, for random features, the model is not robust for any degree
    of over-parameterization, even when the necessary condition coming from the universal
    law of robustness is satisfied. In contrast, for even activations, the NTK model
    meets the universal lower bound, and it is robust as soon as the necessary condition
    on over-parameterization is fulfilled. This also addresses a conjecture in prior
    work by Bubeck, Li and Nagaraj. Our analysis decouples the effect of the kernel
    of the model from an "interaction matrix", which describes the interaction with
    the test data and captures the effect of the activation. Our theoretical results
    are corroborated by numerical evidence on both synthetic and standard datasets
    (MNIST, CIFAR-10).'
acknowledgement: "Simone Bombari and Marco Mondelli were partially supported by the
  2019 Lopez-Loreta prize, and\r\nthe authors would like to thank Hamed Hassani for
  helpful discussions.\r\n"
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Shayan
  full_name: Kiyani, Shayan
  id: f5a2b424-e339-11ed-8435-ff3b4fe70cf8
  last_name: Kiyani
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Bombari S, Kiyani S, Mondelli M. Beyond the universal law of robustness: Sharper
    laws for random features and neural tangent kernels. In: <i>Proceedings of the
    40th International Conference on Machine Learning</i>. Vol 202. ML Research Press;
    2023:2738-2776.'
  apa: 'Bombari, S., Kiyani, S., &#38; Mondelli, M. (2023). Beyond the universal law
    of robustness: Sharper laws for random features and neural tangent kernels. In
    <i>Proceedings of the 40th International Conference on Machine Learning</i> (Vol.
    202, pp. 2738–2776). Honolulu, HI, United States: ML Research Press.'
  chicago: 'Bombari, Simone, Shayan Kiyani, and Marco Mondelli. “Beyond the Universal
    Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.”
    In <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    202:2738–76. ML Research Press, 2023.'
  ieee: 'S. Bombari, S. Kiyani, and M. Mondelli, “Beyond the universal law of robustness:
    Sharper laws for random features and neural tangent kernels,” in <i>Proceedings
    of the 40th International Conference on Machine Learning</i>, Honolulu, HI, United
    States, 2023, vol. 202, pp. 2738–2776.'
  ista: 'Bombari S, Kiyani S, Mondelli M. 2023. Beyond the universal law of robustness:
    Sharper laws for random features and neural tangent kernels. Proceedings of the
    40th International Conference on Machine Learning. ICML: International Conference
    on Machine Learning, PMLR, vol. 202, 2738–2776.'
  mla: 'Bombari, Simone, et al. “Beyond the Universal Law of Robustness: Sharper Laws
    for Random Features and Neural Tangent Kernels.” <i>Proceedings of the 40th International
    Conference on Machine Learning</i>, vol. 202, ML Research Press, 2023, pp. 2738–76.'
  short: 'S. Bombari, S. Kiyani, M. Mondelli, in: Proceedings of the 40th International
    Conference on Machine Learning, ML Research Press, 2023, pp. 2738–2776.'
conference:
  end_date: 2023-07-29
  location: Honolulu, HI, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2023-07-23
date_created: 2023-04-23T16:11:03Z
date_published: 2023-10-27T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '27'
department:
- _id: GradSch
- _id: MaMo
external_id:
  arxiv:
  - '2302.01629'
intvolume: '202'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2302.01629
month: '10'
oa: 1
oa_version: Preprint
page: 2738-2776
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: Proceedings of the 40th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/simone-bombari/beyond-universal-robustness
status: public
title: 'Beyond the universal law of robustness: Sharper laws for random features and
  neural tangent kernels'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '12537'
abstract:
- lang: eng
  text: 'The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide
    memorization, optimization and generalization guarantees in deep neural networks.
    A line of work has studied the NTK spectrum for two-layer and deep networks with
    at least one layer with Ω(N) neurons, N being the number of training samples. Furthermore,
    there is increasing evidence suggesting that deep networks with sub-linear layer
    widths are powerful memorizers and optimizers, as long as the number of parameters
    exceeds the number of samples. Thus, a natural open question is whether the NTK
    is well conditioned in such a challenging sub-linear setup. In this paper, we
    answer this question in the affirmative. Our key technical contribution is a lower
    bound on the smallest NTK eigenvalue for deep networks with the minimum possible
    over-parameterization: the number of parameters is roughly Ω(N) and, hence, the
    number of neurons is as little as Ω(√N). To showcase the applicability of our
    NTK bounds, we provide two results concerning memorization capacity and optimization
    guarantees for gradient descent training.'
acknowledgement: "The authors were partially supported by the 2019 Lopez-Loreta prize,
  and they would like to thank\r\nQuynh Nguyen, Mahdi Soltanolkotabi and Adel Javanmard
  for helpful discussions.\r\n"
article_processing_charge: No
arxiv: 1
author:
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Mohammad Hossein
  full_name: Amani, Mohammad Hossein
  last_name: Amani
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
citation:
  ama: 'Bombari S, Amani MH, Mondelli M. Memorization and optimization in deep neural
    networks with minimum over-parameterization. In: <i>36th Conference on Neural
    Information Processing Systems</i>. Vol 35. Curran Associates; 2022:7628-7640.'
  apa: Bombari, S., Amani, M. H., &#38; Mondelli, M. (2022). Memorization and optimization
    in deep neural networks with minimum over-parameterization. In <i>36th Conference
    on Neural Information Processing Systems</i> (Vol. 35, pp. 7628–7640). Curran
    Associates.
  chicago: Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli. “Memorization
    and Optimization in Deep Neural Networks with Minimum Over-Parameterization.”
    In <i>36th Conference on Neural Information Processing Systems</i>, 35:7628–40.
    Curran Associates, 2022.
  ieee: S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in
    deep neural networks with minimum over-parameterization,” in <i>36th Conference
    on Neural Information Processing Systems</i>, 2022, vol. 35, pp. 7628–7640.
  ista: Bombari S, Amani MH, Mondelli M. 2022. Memorization and optimization in deep
    neural networks with minimum over-parameterization. 36th Conference on Neural
    Information Processing Systems. vol. 35, 7628–7640.
  mla: Bombari, Simone, et al. “Memorization and Optimization in Deep Neural Networks
    with Minimum Over-Parameterization.” <i>36th Conference on Neural Information
    Processing Systems</i>, vol. 35, Curran Associates, 2022, pp. 7628–40.
  short: 'S. Bombari, M.H. Amani, M. Mondelli, in: 36th Conference on Neural Information
    Processing Systems, Curran Associates, 2022, pp. 7628–7640.'
date_created: 2023-02-10T13:46:37Z
date_published: 2022-07-24T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '24'
department:
- _id: MaMo
external_id:
  arxiv:
  - '2205.10217'
intvolume: '35'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2205.10217
month: '07'
oa: 1
oa_version: Preprint
page: 7628-7640
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 36th Conference on Neural Information Processing Systems
publication_identifier:
  isbn:
  - '9781713871088'
publication_status: published
publisher: Curran Associates
quality_controlled: '1'
status: public
title: Memorization and optimization in deep neural networks with minimum over-parameterization
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 35
year: '2022'
...
---
_id: '12538'
abstract:
- lang: eng
  text: In this paper, we study the compression of a target two-layer neural network
    with N nodes into a compressed network with M<N nodes. More precisely, we consider
    the setting in which the weights of the target network are i.i.d. sub-Gaussian,
    and we minimize the population L_2 loss between the outputs of the target and
    of the compressed network, under the assumption of Gaussian inputs. By using tools
    from high-dimensional probability, we show that this non-convex problem can be
    simplified when the target network is sufficiently over-parameterized, and provide
    the error rate of this approximation as a function of the input dimension and
    N. In this mean-field limit, the simplified objective, as well as the optimal
    weights of the compressed network, does not depend on the realization of the target
    network, but only on expected scaling factors. Furthermore, for networks with
    ReLU activation, we conjecture that the optimum of the simplified optimization
    problem is achieved by taking weights on the Equiangular Tight Frame (ETF), while
    the scaling of the weights and the orientation of the ETF depend on the parameters
    of the target network. Numerical evidence is provided to support this conjecture.
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Mohammad Hossein
  full_name: Amani, Mohammad Hossein
  last_name: Amani
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Rattana
  full_name: Pukdee, Rattana
  last_name: Pukdee
- first_name: Stefano
  full_name: Rini, Stefano
  last_name: Rini
citation:
  ama: Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. Sharp asymptotics on the
    compression of two-layer neural networks. <i>IEEE Information Theory Workshop</i>.
    2022:588-593. doi:<a href="https://doi.org/10.1109/ITW54588.2022.9965870">10.1109/ITW54588.2022.9965870</a>
  apa: 'Amani, M. H., Bombari, S., Mondelli, M., Pukdee, R., &#38; Rini, S. (2022).
    Sharp asymptotics on the compression of two-layer neural networks. <i>IEEE Information
    Theory Workshop</i>. Mumbai, India: IEEE. <a href="https://doi.org/10.1109/ITW54588.2022.9965870">https://doi.org/10.1109/ITW54588.2022.9965870</a>'
  chicago: Amani, Mohammad Hossein, Simone Bombari, Marco Mondelli, Rattana Pukdee,
    and Stefano Rini. “Sharp Asymptotics on the Compression of Two-Layer Neural Networks.”
    <i>IEEE Information Theory Workshop</i>. IEEE, 2022. <a href="https://doi.org/10.1109/ITW54588.2022.9965870">https://doi.org/10.1109/ITW54588.2022.9965870</a>.
  ieee: M. H. Amani, S. Bombari, M. Mondelli, R. Pukdee, and S. Rini, “Sharp asymptotics
    on the compression of two-layer neural networks,” <i>IEEE Information Theory Workshop</i>.
    IEEE, pp. 588–593, 2022.
  ista: Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. 2022. Sharp asymptotics
    on the compression of two-layer neural networks. IEEE Information Theory Workshop,
    588–593.
  mla: Amani, Mohammad Hossein, et al. “Sharp Asymptotics on the Compression of Two-Layer
    Neural Networks.” <i>IEEE Information Theory Workshop</i>, IEEE, 2022, pp. 588–93,
    doi:<a href="https://doi.org/10.1109/ITW54588.2022.9965870">10.1109/ITW54588.2022.9965870</a>.
  short: M.H. Amani, S. Bombari, M. Mondelli, R. Pukdee, S. Rini, IEEE Information
    Theory Workshop (2022) 588–593.
conference:
  end_date: 2022-11-09
  location: Mumbai, India
  name: 'ITW: Information Theory Workshop'
  start_date: 2022-11-01
date_created: 2023-02-10T13:47:56Z
date_published: 2022-11-16T00:00:00Z
date_updated: 2023-12-18T11:31:47Z
day: '16'
department:
- _id: MaMo
doi: 10.1109/ITW54588.2022.9965870
external_id:
  arxiv:
  - '2205.08199'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2205.08199
month: '11'
oa: 1
oa_version: Preprint
page: 588-593
publication: IEEE Information Theory Workshop
publication_identifier:
  isbn:
  - '9781665483414'
publication_status: published
publisher: IEEE
quality_controlled: '1'
scopus_import: '1'
status: public
title: Sharp asymptotics on the compression of two-layer neural networks
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '12860'
abstract:
- lang: eng
  text: 'Memorization of the relation between entities in a dataset can lead to privacy
    issues when using a trained model for question answering. We introduce Relational
    Memorization (RM) to understand, quantify and control this phenomenon. While bounding
    general memorization can have detrimental effects on the performance of a trained
    model, bounding RM does not prevent effective learning. The difference is most
    pronounced when the data distribution is long-tailed, with many queries having
    only a few training examples: Impeding general memorization prevents effective learning,
    while impeding only relational memorization still allows learning general properties
    of the underlying concepts. We formalize the notion of Relational Privacy (RP)
    and, inspired by Differential Privacy (DP), we provide a possible definition of
    Differential Relational Privacy (DrP). These notions can be used to describe and
    compute bounds on the amount of RM in a trained model. We illustrate Relational
    Privacy concepts in experiments with large-scale models for Question Answering.'
article_number: '2203.16701'
article_processing_charge: No
arxiv: 1
author:
- first_name: Simone
  full_name: Bombari, Simone
  id: ca726dda-de17-11ea-bc14-f9da834f63aa
  last_name: Bombari
- first_name: Alessandro
  full_name: Achille, Alessandro
  last_name: Achille
- first_name: Zijian
  full_name: Wang, Zijian
  last_name: Wang
- first_name: Yu-Xiang
  full_name: Wang, Yu-Xiang
  last_name: Wang
- first_name: Yusheng
  full_name: Xie, Yusheng
  last_name: Xie
- first_name: Kunwar Yashraj
  full_name: Singh, Kunwar Yashraj
  last_name: Singh
- first_name: Srikar
  full_name: Appalaraju, Srikar
  last_name: Appalaraju
- first_name: Vijay
  full_name: Mahadevan, Vijay
  last_name: Mahadevan
- first_name: Stefano
  full_name: Soatto, Stefano
  last_name: Soatto
citation:
  ama: Bombari S, Achille A, Wang Z, et al. Towards differential relational privacy
    and its use in question answering. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2203.16701">10.48550/arXiv.2203.16701</a>
  apa: Bombari, S., Achille, A., Wang, Z., Wang, Y.-X., Xie, Y., Singh, K. Y., … Soatto,
    S. (n.d.). Towards differential relational privacy and its use in question answering.
    <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2203.16701">https://doi.org/10.48550/arXiv.2203.16701</a>
  chicago: Bombari, Simone, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng
    Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, and Stefano Soatto.
    “Towards Differential Relational Privacy and Its Use in Question Answering.” <i>ArXiv</i>,
    n.d. <a href="https://doi.org/10.48550/arXiv.2203.16701">https://doi.org/10.48550/arXiv.2203.16701</a>.
  ieee: S. Bombari <i>et al.</i>, “Towards differential relational privacy and its
    use in question answering,” <i>arXiv</i>.
  ista: Bombari S, Achille A, Wang Z, Wang Y-X, Xie Y, Singh KY, Appalaraju S, Mahadevan
    V, Soatto S. Towards differential relational privacy and its use in question answering.
    arXiv, 2203.16701.
  mla: Bombari, Simone, et al. “Towards Differential Relational Privacy and Its Use
    in Question Answering.” <i>ArXiv</i>, 2203.16701, doi:<a href="https://doi.org/10.48550/arXiv.2203.16701">10.48550/arXiv.2203.16701</a>.
  short: S. Bombari, A. Achille, Z. Wang, Y.-X. Wang, Y. Xie, K.Y. Singh, S. Appalaraju,
    V. Mahadevan, S. Soatto, ArXiv (n.d.).
date_created: 2023-04-23T16:11:48Z
date_published: 2022-03-30T00:00:00Z
date_updated: 2023-04-25T07:34:49Z
day: '30'
department:
- _id: GradSch
- _id: MaMo
doi: 10.48550/arXiv.2203.16701
external_id:
  arxiv:
  - '2203.16701'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2203.16701
month: '03'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Towards differential relational privacy and its use in question answering
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
