---
_id: '14320'
abstract:
- lang: eng
  text: The development of two-dimensional materials has resulted in a diverse range
    of novel, high-quality compounds with increasing complexity. A key requirement
    for a comprehensive quantitative theory is the accurate determination of these
    materials' band structure parameters. However, this task is challenging due to
    the intricate band structures and the indirect nature of experimental probes.
    In this work, we introduce a general framework to derive band structure parameters
    from experimental data using deep neural networks. We apply our method to the
    penetration field capacitance measurement of trilayer graphene, an effective probe
    of its density of states. First, we demonstrate that a trained deep network gives
    accurate predictions for the penetration field capacitance as a function of tight-binding
    parameters. Next, we use the fast and accurate predictions from the trained network
    to automatically determine tight-binding parameters directly from experimental
    data, with the extracted parameters in good agreement with values from the literature.
    We conclude by discussing potential applications of our method to other materials
    and experimental techniques beyond penetration field capacitance.
acknowledgement: A.F.Y. acknowledges primary support from the Department of Energy
  under award DE-SC0020043, and additional support from the Gordon and Betty Moore
  Foundation under award GBMF9471 for group operations.
article_number: '125411'
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Paul M
  full_name: Henderson, Paul M
  id: 13C09E74-18D9-11E9-8878-32CFE5697425
  last_name: Henderson
  orcid: 0000-0002-5198-7445
- first_name: Areg
  full_name: Ghazaryan, Areg
  id: 4AF46FD6-F248-11E8-B48F-1D18A9856A87
  last_name: Ghazaryan
  orcid: 0000-0001-9666-3543
- first_name: Alexander A.
  full_name: Zibrov, Alexander A.
  last_name: Zibrov
- first_name: Andrea F.
  full_name: Young, Andrea F.
  last_name: Young
- first_name: Maksym
  full_name: Serbyn, Maksym
  id: 47809E7E-F248-11E8-B48F-1D18A9856A87
  last_name: Serbyn
  orcid: 0000-0002-2399-5827
citation:
  ama: 'Henderson PM, Ghazaryan A, Zibrov AA, Young AF, Serbyn M. Deep learning extraction
    of band structure parameters from density of states: A case study on trilayer
    graphene. <i>Physical Review B</i>. 2023;108(12). doi:<a href="https://doi.org/10.1103/physrevb.108.125411">10.1103/physrevb.108.125411</a>'
  apa: 'Henderson, P. M., Ghazaryan, A., Zibrov, A. A., Young, A. F., &#38; Serbyn,
    M. (2023). Deep learning extraction of band structure parameters from density
    of states: A case study on trilayer graphene. <i>Physical Review B</i>. American
    Physical Society. <a href="https://doi.org/10.1103/physrevb.108.125411">https://doi.org/10.1103/physrevb.108.125411</a>'
  chicago: 'Henderson, Paul M, Areg Ghazaryan, Alexander A. Zibrov, Andrea F. Young,
    and Maksym Serbyn. “Deep Learning Extraction of Band Structure Parameters from
    Density of States: A Case Study on Trilayer Graphene.” <i>Physical Review B</i>.
    American Physical Society, 2023. <a href="https://doi.org/10.1103/physrevb.108.125411">https://doi.org/10.1103/physrevb.108.125411</a>.'
  ieee: 'P. M. Henderson, A. Ghazaryan, A. A. Zibrov, A. F. Young, and M. Serbyn,
    “Deep learning extraction of band structure parameters from density of states:
    A case study on trilayer graphene,” <i>Physical Review B</i>, vol. 108, no. 12.
    American Physical Society, 2023.'
  ista: 'Henderson PM, Ghazaryan A, Zibrov AA, Young AF, Serbyn M. 2023. Deep learning
    extraction of band structure parameters from density of states: A case study on
    trilayer graphene. Physical Review B. 108(12), 125411.'
  mla: 'Henderson, Paul M., et al. “Deep Learning Extraction of Band Structure Parameters
    from Density of States: A Case Study on Trilayer Graphene.” <i>Physical Review
    B</i>, vol. 108, no. 12, 125411, American Physical Society, 2023, doi:<a href="https://doi.org/10.1103/physrevb.108.125411">10.1103/physrevb.108.125411</a>.'
  short: P.M. Henderson, A. Ghazaryan, A.A. Zibrov, A.F. Young, M. Serbyn, Physical
    Review B 108 (2023).
date_created: 2023-09-12T07:12:12Z
date_published: 2023-09-15T00:00:00Z
date_updated: 2023-09-20T09:38:24Z
day: '15'
department:
- _id: MaSe
- _id: ChLa
- _id: MiLe
doi: 10.1103/physrevb.108.125411
external_id:
  arxiv:
  - '2210.06310'
intvolume: '108'
issue: '12'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2210.06310
month: '09'
oa: 1
oa_version: Preprint
publication: Physical Review B
publication_identifier:
  eissn:
  - 2469-9969
  issn:
  - 2469-9950
publication_status: published
publisher: American Physical Society
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'Deep learning extraction of band structure parameters from density of states:
  A case study on trilayer graphene'
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 108
year: '2023'
...
---
_id: '14410'
abstract:
- lang: eng
  text: This paper focuses on the implementation details of the baseline methods and
    of LIMES [5], a recent lightweight conditional model extrapolation algorithm for
    streaming data under class-prior shift. LIMES achieves superior performance over
    the baseline methods, especially with respect to the minimum-across-day accuracy,
    which is important for the users of the system. In this work, we describe the key
    measures taken to facilitate reproducibility and enhance the credibility of the results.
alternative_title:
- LNCS
article_processing_charge: No
author:
- first_name: Paulina
  full_name: Tomaszewska, Paulina
  last_name: Tomaszewska
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Tomaszewska P, Lampert C. On the implementation of baselines and lightweight
    conditional model extrapolation (LIMES) under class-prior shift. In: <i>International
    Workshop on Reproducible Research in Pattern Recognition</i>. Vol 14068. Springer
    Nature; 2023:67-73. doi:<a href="https://doi.org/10.1007/978-3-031-40773-4_6">10.1007/978-3-031-40773-4_6</a>'
  apa: 'Tomaszewska, P., &#38; Lampert, C. (2023). On the implementation of baselines
    and lightweight conditional model extrapolation (LIMES) under class-prior shift.
    In <i>International Workshop on Reproducible Research in Pattern Recognition</i>
    (Vol. 14068, pp. 67–73). Montreal, Canada: Springer Nature. <a href="https://doi.org/10.1007/978-3-031-40773-4_6">https://doi.org/10.1007/978-3-031-40773-4_6</a>'
  chicago: Tomaszewska, Paulina, and Christoph Lampert. “On the Implementation of Baselines
    and Lightweight Conditional Model Extrapolation (LIMES) under Class-Prior Shift.”
    In <i>International Workshop on Reproducible Research in Pattern Recognition</i>,
    14068:67–73. Springer Nature, 2023. <a href="https://doi.org/10.1007/978-3-031-40773-4_6">https://doi.org/10.1007/978-3-031-40773-4_6</a>.
  ieee: P. Tomaszewska and C. Lampert, “On the implementation of baselines and lightweight
    conditional model extrapolation (LIMES) under class-prior shift,” in <i>International
    Workshop on Reproducible Research in Pattern Recognition</i>, Montreal, Canada,
    2023, vol. 14068, pp. 67–73.
  ista: 'Tomaszewska P, Lampert C. 2023. On the implementation of baselines and lightweight
    conditional model extrapolation (LIMES) under class-prior shift. International
    Workshop on Reproducible Research in Pattern Recognition. RRPR: Reproducible Research
    in Pattern Recognition, LNCS, vol. 14068, 67–73.'
  mla: Tomaszewska, Paulina, and Christoph Lampert. “On the Implementation of Baselines
    and Lightweight Conditional Model Extrapolation (LIMES) under Class-Prior Shift.”
    <i>International Workshop on Reproducible Research in Pattern Recognition</i>,
    vol. 14068, Springer Nature, 2023, pp. 67–73, doi:<a href="https://doi.org/10.1007/978-3-031-40773-4_6">10.1007/978-3-031-40773-4_6</a>.
  short: P. Tomaszewska, C. Lampert, in:, International Workshop on Reproducible Research
    in Pattern Recognition, Springer Nature, 2023, pp. 67–73.
conference:
  end_date: 2022-08-21
  location: Montreal, Canada
  name: 'RRPR: Reproducible Research in Pattern Recognition'
  start_date: 2022-08-21
date_created: 2023-10-08T22:01:18Z
date_published: 2023-08-20T00:00:00Z
date_updated: 2023-10-09T06:48:02Z
day: '20'
department:
- _id: ChLa
doi: 10.1007/978-3-031-40773-4_6
intvolume: '14068'
language:
- iso: eng
month: '08'
oa_version: None
page: 67-73
publication: International Workshop on Reproducible Research in Pattern Recognition
publication_identifier:
  eissn:
  - 1611-3349
  isbn:
  - '9783031407727'
  issn:
  - 0302-9743
publication_status: published
publisher: Springer Nature
quality_controlled: '1'
scopus_import: '1'
status: public
title: On the implementation of baselines and lightweight conditional model extrapolation
  (LIMES) under class-prior shift
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 14068
year: '2023'
...
---
_id: '14446'
abstract:
- lang: eng
  text: Recent work has paid close attention to the first principle of Granger causality,
    according to which cause precedes effect. In this context, the question may arise
    whether the detected direction of causality also reverses after the time reversal
    of unidirectionally coupled data. Recently, it has been shown that for unidirectionally
    causally connected autoregressive (AR) processes X → Y, after time reversal of
    data, the opposite causal direction Y → X is indeed detected, although typically
    as part of the bidirectional X ↔ Y link. As we argue here, the answer is different
    when the measured data are not from AR processes but from linked deterministic
    systems. When the goal is the usual forward data analysis, cross-mapping-like
    approaches correctly detect X → Y, while Granger causality-like approaches, which
    should not be used for deterministic time series, detect causal independence X
    ↛ Y. The results of backward causal analysis depend on the predictability of the
    reversed data. Unlike AR processes, observables from deterministic dynamical systems,
    even complex nonlinear ones, can be predicted well forward, while backward predictions
    can be difficult (notably when the time reversal of a function leads to one-to-many
    relations). To address this problem, we propose an approach based on models that
    provide multiple candidate predictions for the target, combined with a loss function
    that considers only the best candidate. The resulting good forward and backward
    predictability supports the view that unidirectionally causally linked deterministic
    dynamical systems X → Y can be expected to detect the same link both before and
    after time reversal.
acknowledgement: The work was supported by the Scientific Grant Agency of the Ministry
  of Education of the Slovak Republic and the Slovak Academy of Sciences, projects
  APVV-21-0216, VEGA 2-0096-21 and VEGA 2-0023-22.
article_processing_charge: Yes
article_type: original
author:
- first_name: Jozef
  full_name: Jakubík, Jozef
  last_name: Jakubík
- first_name: Phuong
  full_name: Bui Thi Mai, Phuong
  id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87
  last_name: Bui Thi Mai
- first_name: Martina
  full_name: Chvosteková, Martina
  last_name: Chvosteková
- first_name: Anna
  full_name: Krakovská, Anna
  last_name: Krakovská
citation:
  ama: Jakubík J, Phuong M, Chvosteková M, Krakovská A. Against the flow of time with
    multi-output models. <i>Measurement Science Review</i>. 2023;23(4):175-183. doi:<a
    href="https://doi.org/10.2478/msr-2023-0023">10.2478/msr-2023-0023</a>
  apa: Jakubík, J., Phuong, M., Chvosteková, M., &#38; Krakovská, A. (2023). Against
    the flow of time with multi-output models. <i>Measurement Science Review</i>.
    Sciendo. <a href="https://doi.org/10.2478/msr-2023-0023">https://doi.org/10.2478/msr-2023-0023</a>
  chicago: Jakubík, Jozef, Mary Phuong, Martina Chvosteková, and Anna Krakovská. “Against
    the Flow of Time with Multi-Output Models.” <i>Measurement Science Review</i>.
    Sciendo, 2023. <a href="https://doi.org/10.2478/msr-2023-0023">https://doi.org/10.2478/msr-2023-0023</a>.
  ieee: J. Jakubík, M. Phuong, M. Chvosteková, and A. Krakovská, “Against the flow
    of time with multi-output models,” <i>Measurement Science Review</i>, vol. 23,
    no. 4. Sciendo, pp. 175–183, 2023.
  ista: Jakubík J, Phuong M, Chvosteková M, Krakovská A. 2023. Against the flow of
    time with multi-output models. Measurement Science Review. 23(4), 175–183.
  mla: Jakubík, Jozef, et al. “Against the Flow of Time with Multi-Output Models.”
    <i>Measurement Science Review</i>, vol. 23, no. 4, Sciendo, 2023, pp. 175–83,
    doi:<a href="https://doi.org/10.2478/msr-2023-0023">10.2478/msr-2023-0023</a>.
  short: J. Jakubík, M. Phuong, M. Chvosteková, A. Krakovská, Measurement Science
    Review 23 (2023) 175–183.
date_created: 2023-10-22T22:01:15Z
date_published: 2023-08-01T00:00:00Z
date_updated: 2023-10-31T12:12:47Z
day: '01'
ddc:
- '510'
department:
- _id: ChLa
doi: 10.2478/msr-2023-0023
file:
- access_level: open_access
  checksum: b069cc10fa6a7c96b2bc9f728165f9e6
  content_type: application/pdf
  creator: dernst
  date_created: 2023-10-31T12:07:23Z
  date_updated: 2023-10-31T12:07:23Z
  file_id: '14476'
  file_name: 2023_MeasurementScienceRev_Jakubik.pdf
  file_size: 2639783
  relation: main_file
  success: 1
file_date_updated: 2023-10-31T12:07:23Z
has_accepted_license: '1'
intvolume: '23'
issue: '4'
language:
- iso: eng
month: '08'
oa: 1
oa_version: Published Version
page: 175-183
publication: Measurement Science Review
publication_identifier:
  eissn:
  - 1335-8871
publication_status: published
publisher: Sciendo
quality_controlled: '1'
scopus_import: '1'
status: public
title: Against the flow of time with multi-output models
tmp:
  image: /images/cc_by_nc_nd.png
  legal_code_url: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
  name: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
    (CC BY-NC-ND 4.0)
  short: CC BY-NC-ND (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 23
year: '2023'
...
---
_id: '14771'
abstract:
- lang: eng
  text: Pruning—that is, setting a significant subset of the parameters of a neural
    network to zero—is one of the most popular methods of model compression. Yet,
    several recent works have raised the issue that pruning may induce or exacerbate
    bias in the output of the compressed model. Despite existing evidence for this
    phenomenon, the relationship between neural network pruning and induced bias is
    not well-understood. In this work, we systematically investigate and characterize
    this phenomenon in Convolutional Neural Networks for computer vision. First, we
    show that it is in fact possible to obtain highly sparse models, e.g. with less
    than 10% remaining weights, which neither decrease in accuracy nor substantially
    increase in bias when compared to dense models. At the same time, we also find
    that, at higher sparsities, pruned models exhibit higher uncertainty in their
    outputs, as well as increased correlations, which we directly link to increased
    bias. We propose easy-to-use criteria which, based only on the uncompressed model,
    establish whether bias will increase with pruning, and identify the samples most
    susceptible to biased predictions post-compression. Our code can be found at https://github.com/IST-DASLab/pruned-vision-model-bias.
acknowledgement: The authors would like to sincerely thank Sara Hooker for her feedback
  during the development of this work. EI was supported in part by the FWF DK VGSCO,
  grant agreement number W1260-N35. AP and DA acknowledge generous ERC support, via
  Starting Grant 805223 ScaleML.
article_processing_charge: No
arxiv: 1
author:
- first_name: Eugenia B
  full_name: Iofinova, Eugenia B
  id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
  last_name: Iofinova
  orcid: 0000-0002-7778-3221
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Iofinova EB, Peste E-A, Alistarh D-A. Bias in pruned vision models: In-depth
    analysis and countermeasures. In: <i>2023 IEEE/CVF Conference on Computer Vision
    and Pattern Recognition</i>. IEEE; 2023:24364-24373. doi:<a href="https://doi.org/10.1109/cvpr52729.2023.02334">10.1109/cvpr52729.2023.02334</a>'
  apa: 'Iofinova, E. B., Peste, E.-A., &#38; Alistarh, D.-A. (2023). Bias in pruned
    vision models: In-depth analysis and countermeasures. In <i>2023 IEEE/CVF Conference
    on Computer Vision and Pattern Recognition</i> (pp. 24364–24373). Vancouver, BC,
    Canada: IEEE. <a href="https://doi.org/10.1109/cvpr52729.2023.02334">https://doi.org/10.1109/cvpr52729.2023.02334</a>'
  chicago: 'Iofinova, Eugenia B, Elena-Alexandra Peste, and Dan-Adrian Alistarh. “Bias
    in Pruned Vision Models: In-Depth Analysis and Countermeasures.” In <i>2023 IEEE/CVF
    Conference on Computer Vision and Pattern Recognition</i>, 24364–73. IEEE, 2023.
    <a href="https://doi.org/10.1109/cvpr52729.2023.02334">https://doi.org/10.1109/cvpr52729.2023.02334</a>.'
  ieee: 'E. B. Iofinova, E.-A. Peste, and D.-A. Alistarh, “Bias in pruned vision models:
    In-depth analysis and countermeasures,” in <i>2023 IEEE/CVF Conference on Computer
    Vision and Pattern Recognition</i>, Vancouver, BC, Canada, 2023, pp. 24364–24373.'
  ista: 'Iofinova EB, Peste E-A, Alistarh D-A. 2023. Bias in pruned vision models:
    In-depth analysis and countermeasures. 2023 IEEE/CVF Conference on Computer Vision
    and Pattern Recognition. CVPR: Conference on Computer Vision and Pattern Recognition,
    24364–24373.'
  mla: 'Iofinova, Eugenia B., et al. “Bias in Pruned Vision Models: In-Depth Analysis
    and Countermeasures.” <i>2023 IEEE/CVF Conference on Computer Vision and Pattern
    Recognition</i>, IEEE, 2023, pp. 24364–73, doi:<a href="https://doi.org/10.1109/cvpr52729.2023.02334">10.1109/cvpr52729.2023.02334</a>.'
  short: E.B. Iofinova, E.-A. Peste, D.-A. Alistarh, in:, 2023 IEEE/CVF Conference
    on Computer Vision and Pattern Recognition, IEEE, 2023, pp. 24364–24373.
conference:
  end_date: 2023-06-24
  location: Vancouver, BC, Canada
  name: 'CVPR: Conference on Computer Vision and Pattern Recognition'
  start_date: 2023-06-17
date_created: 2024-01-10T08:42:40Z
date_published: 2023-08-22T00:00:00Z
date_updated: 2024-01-10T08:59:26Z
day: '22'
department:
- _id: DaAl
- _id: ChLa
doi: 10.1109/cvpr52729.2023.02334
ec_funded: 1
external_id:
  arxiv:
  - '2304.12622'
  isi:
  - '001062531308068'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2304.12622
month: '08'
oa: 1
oa_version: Preprint
page: 24364-24373
project:
- _id: 9B9290DE-BA93-11EA-9121-9846C619BF3A
  grant_number: 'W1260-N35'
  name: Vienna Graduate School on Computational Optimization
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition
publication_identifier:
  eisbn:
  - '9798350301298'
  eissn:
  - 2575-7075
publication_status: published
publisher: IEEE
quality_controlled: '1'
related_material:
  link:
  - relation: software
    url: https://github.com/IST-DASLab/pruned-vision-model-bias
status: public
title: 'Bias in pruned vision models: In-depth analysis and countermeasures'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14921'
abstract:
- lang: eng
  text: Neural collapse (NC) refers to the surprising structure of the last layer
    of deep neural networks in the terminal phase of gradient descent training. Recently,
    an increasing amount of experimental evidence has pointed to the propagation of
    NC to earlier layers of neural networks. However, while the NC in the last layer
    is well studied theoretically, much less is known about its multi-layered counterpart
    - deep neural collapse (DNC). In particular, existing work focuses either on linear
    layers or only on the last two layers at the price of an extra assumption. Our
    paper fills this gap by generalizing the established analytical framework for
    NC - the unconstrained features model - to multiple non-linear layers. Our key
    technical contribution is to show that, in a deep unconstrained features model,
    the unique global optimum for binary classification exhibits all the properties
    typical of DNC. This explains the existing experimental evidence of DNC. We also
    empirically show that (i) by optimizing deep unconstrained features models via
    gradient descent, the resulting solution agrees well with our theory, and (ii)
    trained networks recover the unconstrained features suitable for the occurrence
    of DNC, thus supporting the validity of this modeling principle.
acknowledgement: M. M. is partially supported by the 2019 Lopez-Loreta Prize. The
  authors would like to thank Eugenia Iofinova, Bernd Prach and Simone Bombari for
  valuable feedback on the manuscript.
alternative_title:
- NeurIPS
article_processing_charge: No
arxiv: 1
author:
- first_name: Peter
  full_name: Súkeník, Peter
  id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c
  last_name: Súkeník
- first_name: Marco
  full_name: Mondelli, Marco
  id: 27EB676C-8706-11E9-9510-7717E6697425
  last_name: Mondelli
  orcid: 0000-0002-3242-7020
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal
    for the deep unconstrained features model. In: <i>37th Annual Conference on Neural
    Information Processing Systems</i>.'
  apa: Súkeník, P., Mondelli, M., &#38; Lampert, C. (n.d.). Deep neural collapse is
    provably optimal for the deep unconstrained features model. In <i>37th Annual
    Conference on Neural Information Processing Systems</i>. New Orleans, LA, United
    States.
  chicago: Súkeník, Peter, Marco Mondelli, and Christoph Lampert. “Deep Neural Collapse
    Is Provably Optimal for the Deep Unconstrained Features Model.” In <i>37th Annual
    Conference on Neural Information Processing Systems</i>, n.d.
  ieee: P. Súkeník, M. Mondelli, and C. Lampert, “Deep neural collapse is provably
    optimal for the deep unconstrained features model,” in <i>37th Annual Conference
    on Neural Information Processing Systems</i>, New Orleans, LA, United States.
  ista: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal
    for the deep unconstrained features model. 37th Annual Conference on Neural Information
    Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS.'
  mla: Súkeník, Peter, et al. “Deep Neural Collapse Is Provably Optimal for the Deep
    Unconstrained Features Model.” <i>37th Annual Conference on Neural Information
    Processing Systems</i>.
  short: P. Súkeník, M. Mondelli, C. Lampert, in:, 37th Annual Conference on Neural
    Information Processing Systems, n.d.
conference:
  end_date: 2023-12-16
  location: New Orleans, LA, United States
  name: 'NeurIPS: Neural Information Processing Systems'
  start_date: 2023-12-10
date_created: 2024-02-02T11:17:41Z
date_published: 2023-12-15T00:00:00Z
date_updated: 2024-09-10T13:03:19Z
day: '15'
department:
- _id: MaMo
- _id: ChLa
external_id:
  arxiv:
  - '2305.13165'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2305.13165
month: '12'
oa: 1
oa_version: Preprint
project:
- _id: 059876FA-7A3F-11EA-A408-12923DDC885E
  name: Prix Lopez-Loretta 2019 - Marco Mondelli
publication: 37th Annual Conference on Neural Information Processing Systems
publication_status: inpress
quality_controlled: '1'
status: public
title: Deep neural collapse is provably optimal for the deep unconstrained features
  model
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '15039'
abstract:
- lang: eng
  text: 'A crucial property for achieving secure, trustworthy and interpretable deep
    learning systems is their robustness: small changes to a system''s inputs should
    not result in large changes to its outputs. Mathematically, this means one strives
    for networks with a small Lipschitz constant. Several recent works have focused
    on how to construct such Lipschitz networks, typically by imposing constraints
    on the weight matrices. In this work, we study an orthogonal aspect, namely the
    role of the activation function. We show that commonly used activation functions,
    such as MaxMin, as well as all piece-wise linear ones with two segments, unnecessarily
    restrict the class of representable functions, even in the simplest one-dimensional
    setting. We furthermore introduce the new N-activation function that is provably
    more expressive than currently popular activation functions. We provide code at
    this https URL.'
article_number: '2311.06103'
article_processing_charge: No
arxiv: 1
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Prach B, Lampert C. 1-Lipschitz neural networks are more expressive with N-activations.
    <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/ARXIV.2311.06103">10.48550/ARXIV.2311.06103</a>
  apa: Prach, B., &#38; Lampert, C. (n.d.). 1-Lipschitz neural networks are more expressive
    with N-activations. <i>arXiv</i>. <a href="https://doi.org/10.48550/ARXIV.2311.06103">https://doi.org/10.48550/ARXIV.2311.06103</a>
  chicago: Prach, Bernd, and Christoph Lampert. “1-Lipschitz Neural Networks Are More
    Expressive with N-Activations.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/ARXIV.2311.06103">https://doi.org/10.48550/ARXIV.2311.06103</a>.
  ieee: B. Prach and C. Lampert, “1-Lipschitz neural networks are more expressive
    with N-activations,” <i>arXiv</i>.
  ista: Prach B, Lampert C. 1-Lipschitz neural networks are more expressive with N-activations.
    arXiv, 2311.06103.
  mla: Prach, Bernd, and Christoph Lampert. “1-Lipschitz Neural Networks Are More
    Expressive with N-Activations.” <i>ArXiv</i>, 2311.06103, doi:<a href="https://doi.org/10.48550/ARXIV.2311.06103">10.48550/ARXIV.2311.06103</a>.
  short: B. Prach, C. Lampert, ArXiv (n.d.).
date_created: 2024-02-28T17:59:32Z
date_published: 2023-11-10T00:00:00Z
date_updated: 2024-03-04T07:02:39Z
day: '10'
department:
- _id: GradSch
- _id: ChLa
doi: 10.48550/ARXIV.2311.06103
external_id:
  arxiv:
  - '2311.06103'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2311.06103
month: '11'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: 1-Lipschitz neural networks are more expressive with N-activations
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '13053'
abstract:
- lang: eng
  text: 'Deep neural networks (DNNs) often have to be compressed, via pruning and/or
    quantization, before they can be deployed in practical settings. In this work
    we propose a new compression-aware minimizer dubbed CrAM that modifies the optimization
    step in a principled way, in order to produce models whose local loss behavior
    is stable under compression operations such as pruning. Thus, dense models trained
    via CrAM should be compressible post-training, in a single step, without significant
    accuracy loss. Experimental results on standard benchmarks, such as residual networks
    for ImageNet classification and BERT models for language modelling, show that
    CrAM produces dense models that can be more accurate than the standard SGD/Adam-based
    baselines, but which are stable under weight pruning: specifically, we can prune
    models in one-shot to 70-80% sparsity with almost no accuracy loss, and to 90%
    with reasonable (∼1%) accuracy loss, which is competitive with gradual compression
    methods. Additionally, CrAM can produce sparse models which perform well for transfer
    learning, and it also works for semi-structured 2:4 pruning patterns supported
    by GPU hardware. The code for reproducing the results is available at this https
    URL.'
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "AP, EK, DA received funding from the European Research Council (ERC)
  under the European\r\nUnion’s Horizon 2020 research and innovation programme (grant
  agreement No 805223 ScaleML). AV acknowledges the support of the French Agence Nationale
  de la Recherche (ANR), under grant ANR-21-CE48-0016 (project COMCOPT). We further
  acknowledge the support from the Scientific Service Units (SSU) of ISTA through
  resources provided by Scientific Computing (SciComp)."
article_processing_charge: No
arxiv: 1
author:
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
- first_name: Adrian
  full_name: Vladu, Adrian
  last_name: Vladu
- first_name: Eldar
  full_name: Kurtic, Eldar
  id: 47beb3a5-07b5-11eb-9b87-b108ec578218
  last_name: Kurtic
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Peste E-A, Vladu A, Kurtic E, Lampert C, Alistarh D-A. CrAM: A Compression-Aware
    Minimizer. In: <i>11th International Conference on Learning Representations</i>.'
  apa: 'Peste, E.-A., Vladu, A., Kurtic, E., Lampert, C., &#38; Alistarh, D.-A. (n.d.).
    CrAM: A Compression-Aware Minimizer. In <i>11th International Conference on Learning
    Representations</i>. Kigali, Rwanda.'
  chicago: 'Peste, Elena-Alexandra, Adrian Vladu, Eldar Kurtic, Christoph Lampert,
    and Dan-Adrian Alistarh. “CrAM: A Compression-Aware Minimizer.” In <i>11th International
    Conference on Learning Representations</i>, n.d.'
  ieee: 'E.-A. Peste, A. Vladu, E. Kurtic, C. Lampert, and D.-A. Alistarh, “CrAM:
    A Compression-Aware Minimizer,” in <i>11th International Conference on Learning
    Representations</i>, Kigali, Rwanda.'
  ista: 'Peste E-A, Vladu A, Kurtic E, Lampert C, Alistarh D-A. CrAM: A Compression-Aware
    Minimizer. 11th International Conference on Learning Representations. ICLR: International
    Conference on Learning Representations.'
  mla: 'Peste, Elena-Alexandra, et al. “CrAM: A Compression-Aware Minimizer.” <i>11th
    International Conference on Learning Representations</i>.'
  short: E.-A. Peste, A. Vladu, E. Kurtic, C. Lampert, D.-A. Alistarh, in:, 11th International
    Conference on Learning Representations, n.d.
conference:
  end_date: 2023-05-05
  location: Kigali, Rwanda
  name: 'ICLR: International Conference on Learning Representations'
  start_date: 2023-05-01
date_created: 2023-05-23T11:36:18Z
date_published: 2023-05-01T00:00:00Z
date_updated: 2023-06-01T12:54:45Z
department:
- _id: GradSch
- _id: DaAl
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '2207.14200'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://openreview.net/pdf?id=_eTZBs-yedr
month: '05'
oa: 1
oa_version: Preprint
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication: 11th International Conference on Learning Representations
publication_status: accepted
quality_controlled: '1'
related_material:
  record:
  - id: '13074'
    relation: dissertation_contains
    status: public
status: public
title: 'CrAM: A Compression-Aware Minimizer'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '13074'
abstract:
- lang: eng
  text: "Deep learning has become an integral part of a large number of important
    applications, and many of the recent breakthroughs have been enabled by the ability
    to train very large models, capable of capturing complex patterns and relationships
    from the data. At the same time, the massive sizes of modern deep learning models
    have made their deployment to smaller devices more challenging; this is particularly
    important, as in many applications the users rely on accurate deep learning predictions,
    but they only have access to devices with limited memory and compute power. One
    solution to this problem is to prune neural networks, by setting as many of their
    parameters as possible to zero, to obtain accurate sparse models with lower memory
    footprint. Despite the great research progress in obtaining sparse models that
    preserve accuracy, while satisfying memory and computational constraints, there
    are still many challenges associated with efficiently training sparse models,
    as well as understanding their generalization properties.\r\n\r\nThe focus of
    this thesis is to investigate how the training process of sparse models can be
    made more efficient, and to understand the differences between sparse and dense
    models in terms of how well they can generalize to changes in the data distribution.
    We first study a method for co-training sparse and dense models, at a lower cost
    compared to regular training. With our method we can obtain very accurate sparse
    networks, and dense models that can recover the baseline accuracy. Furthermore,
    we are able to more easily analyze the differences, at prediction level, between
    the sparse-dense model pairs. Next, we investigate the generalization properties
    of sparse neural networks in more detail, by studying how well different sparse
    models trained on a larger task can adapt to smaller, more specialized tasks,
    in a transfer learning scenario. Our analysis across multiple pruning methods
    and sparsity levels reveals that sparse models provide features that can transfer
    similarly to or better than the dense baseline. However, the choice of the pruning
    method plays an important role, and can influence the results when the features
    are fixed (linear finetuning), or when they are allowed to adapt to the new task
    (full finetuning). Using sparse models with fixed masks for finetuning on new
    tasks has an important practical advantage, as it enables training neural networks
    on smaller devices. However, one drawback of current pruning methods is that the
    entire training cycle has to be repeated to obtain the initial sparse model, for
    every sparsity target; in consequence, the entire training process is costly and
    also multiple models need to be stored. In the last part of the thesis we propose
    a method that can train accurate dense models that are compressible in a single
    step, to multiple sparsity levels, without additional finetuning. Our method results
    in sparse models that can be competitive with existing pruning methods, and which
    can also successfully generalize to new tasks."
acknowledged_ssus:
- _id: ScienComp
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
citation:
  ama: Peste E-A. Efficiency and generalization of sparse neural networks. 2023. doi:<a
    href="https://doi.org/10.15479/at:ista:13074">10.15479/at:ista:13074</a>
  apa: Peste, E.-A. (2023). <i>Efficiency and generalization of sparse neural networks</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/at:ista:13074">https://doi.org/10.15479/at:ista:13074</a>
  chicago: Peste, Elena-Alexandra. “Efficiency and Generalization of Sparse Neural
    Networks.” Institute of Science and Technology Austria, 2023. <a href="https://doi.org/10.15479/at:ista:13074">https://doi.org/10.15479/at:ista:13074</a>.
  ieee: E.-A. Peste, “Efficiency and generalization of sparse neural networks,” Institute
    of Science and Technology Austria, 2023.
  ista: Peste E-A. 2023. Efficiency and generalization of sparse neural networks.
    Institute of Science and Technology Austria.
  mla: Peste, Elena-Alexandra. <i>Efficiency and Generalization of Sparse Neural Networks</i>.
    Institute of Science and Technology Austria, 2023, doi:<a href="https://doi.org/10.15479/at:ista:13074">10.15479/at:ista:13074</a>.
  short: E.-A. Peste, Efficiency and Generalization of Sparse Neural Networks, Institute
    of Science and Technology Austria, 2023.
date_created: 2023-05-23T17:07:53Z
date_published: 2023-05-23T00:00:00Z
date_updated: 2023-08-04T10:33:27Z
day: '23'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: DaAl
- _id: ChLa
doi: 10.15479/at:ista:13074
ec_funded: 1
file:
- access_level: open_access
  checksum: 6b3354968403cb9d48cc5a83611fb571
  content_type: application/pdf
  creator: epeste
  date_created: 2023-05-24T16:11:16Z
  date_updated: 2023-05-24T16:11:16Z
  file_id: '13087'
  file_name: PhD_Thesis_Alexandra_Peste_final.pdf
  file_size: 2152072
  relation: main_file
  success: 1
- access_level: closed
  checksum: 8d0df94bbcf4db72c991f22503b3fd60
  content_type: application/zip
  creator: epeste
  date_created: 2023-05-24T16:12:59Z
  date_updated: 2023-05-24T16:12:59Z
  file_id: '13088'
  file_name: PhD_Thesis_APeste.zip
  file_size: 1658293
  relation: source_file
file_date_updated: 2023-05-24T16:12:59Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '147'
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '11458'
    relation: part_of_dissertation
    status: public
  - id: '13053'
    relation: part_of_dissertation
    status: public
  - id: '12299'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
title: Efficiency and generalization of sparse neural networks
type: dissertation
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2023'
...
---
_id: '10752'
abstract:
- lang: eng
  text: 'The digitalization of almost all aspects of our everyday lives has led to
    unprecedented amounts of data being freely available on the Internet. In particular,
    social media platforms provide rich sources of user-generated data, though typically
    in unstructured form and with high diversity, for example written in many different
    languages. Automatically identifying meaningful information in such big data resources
    and extracting it efficiently is one of the ongoing challenges of our time. A
    common step for this is sentiment analysis, which forms the foundation for tasks
    such as opinion mining or trend prediction. Unfortunately, publicly available
    tools for this task are available almost exclusively for English-language texts.
    Consequently, a large fraction of Internet users, who do not communicate in
    English, are ignored in automated studies, a phenomenon called rare-language
    discrimination. In this work we propose a technique to overcome this problem with
    a truly multi-lingual model, which can be trained automatically without linguistic
    knowledge or even the ability to read the many target languages. The main step
    is to combine self-annotation, specifically the use of emoticons as a proxy for
    labels, with multi-lingual sentence representations.To evaluate our method we
    curated several large datasets from data obtained via the free Twitter streaming
    API. The results show that our proposed multi-lingual training achieves
    sentiment predictions at the same quality level for rare languages as for frequent
    ones, and in particular clearly better than what mono-lingual training achieves
    on the same data.'
article_processing_charge: No
author:
- first_name: Jasmin
  full_name: Lampert, Jasmin
  last_name: Lampert
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Lampert J, Lampert C. Overcoming rare-language discrimination in multi-lingual
    sentiment analysis. In: <i>2021 IEEE International Conference on Big Data</i>.
    IEEE; 2022:5185-5192. doi:<a href="https://doi.org/10.1109/bigdata52589.2021.9672003">10.1109/bigdata52589.2021.9672003</a>'
  apa: 'Lampert, J., &#38; Lampert, C. (2022). Overcoming rare-language discrimination
    in multi-lingual sentiment analysis. In <i>2021 IEEE International Conference
    on Big Data</i> (pp. 5185–5192). Orlando, FL, United States: IEEE. <a href="https://doi.org/10.1109/bigdata52589.2021.9672003">https://doi.org/10.1109/bigdata52589.2021.9672003</a>'
  chicago: Lampert, Jasmin, and Christoph Lampert. “Overcoming Rare-Language Discrimination
    in Multi-Lingual Sentiment Analysis.” In <i>2021 IEEE International Conference
    on Big Data</i>, 5185–92. IEEE, 2022. <a href="https://doi.org/10.1109/bigdata52589.2021.9672003">https://doi.org/10.1109/bigdata52589.2021.9672003</a>.
  ieee: J. Lampert and C. Lampert, “Overcoming rare-language discrimination in multi-lingual
    sentiment analysis,” in <i>2021 IEEE International Conference on Big Data</i>,
    Orlando, FL, United States, 2022, pp. 5185–5192.
  ista: 'Lampert J, Lampert C. 2022. Overcoming rare-language discrimination in multi-lingual
    sentiment analysis. 2021 IEEE International Conference on Big Data. Big Data:
    International Conference on Big Data, 5185–5192.'
  mla: Lampert, Jasmin, and Christoph Lampert. “Overcoming Rare-Language Discrimination
    in Multi-Lingual Sentiment Analysis.” <i>2021 IEEE International Conference on
    Big Data</i>, IEEE, 2022, pp. 5185–92, doi:<a href="https://doi.org/10.1109/bigdata52589.2021.9672003">10.1109/bigdata52589.2021.9672003</a>.
  short: J. Lampert, C. Lampert, in:, 2021 IEEE International Conference on Big Data,
    IEEE, 2022, pp. 5185–5192.
conference:
  end_date: 2021-12-18
  location: Orlando, FL, United States
  name: 'Big Data: International Conference on Big Data'
  start_date: 2021-12-15
date_created: 2022-02-10T14:08:23Z
date_published: 2022-01-13T00:00:00Z
date_updated: 2023-08-02T14:27:50Z
day: '13'
department:
- _id: ChLa
doi: 10.1109/bigdata52589.2021.9672003
external_id:
  isi:
  - '000800559505036'
isi: 1
language:
- iso: eng
month: '01'
oa_version: None
page: 5185-5192
publication: 2021 IEEE International Conference on Big Data
publication_identifier:
  isbn:
  - '9781665439022'
publication_status: published
publisher: IEEE
quality_controlled: '1'
status: public
title: Overcoming rare-language discrimination in multi-lingual sentiment analysis
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
---
_id: '10799'
abstract:
- lang: eng
  text: "Because of the increasing popularity of machine learning methods, it is becoming
    important to understand the impact of learned components on automated decision-making
    systems and to guarantee that their consequences are beneficial to society. In
    other words, it is necessary to ensure that machine learning is sufficiently trustworthy
    to be used in real-world applications. This thesis studies two properties of machine
    learning models that are highly desirable for the\r\nsake of reliability: robustness
    and fairness. In the first part of the thesis we study the robustness of learning
    algorithms to training data corruption. Previous work has shown that machine learning
    models are vulnerable to a range\r\nof training set issues, varying from label
    noise through systematic biases to worst-case data manipulations. This is an especially
    relevant problem from a present perspective, since modern machine learning methods
    are particularly data hungry and therefore practitioners often have to rely on
    data collected from various external sources, e.g. from the Internet, from app
    users or via crowdsourcing. Naturally, such sources vary greatly in the quality
    and reliability of the\r\ndata they provide. With these considerations in mind,
    we study the problem of designing machine learning algorithms that are robust
    to corruptions in data coming from multiple sources. We show that, in contrast
    to the case of a single dataset with outliers, successful learning within this
    model is possible both theoretically and practically, even under worst-case data
    corruptions. The second part of this thesis deals with fairness-aware machine
    learning. There are multiple areas where machine learning models have shown promising
    results, but where careful considerations are required in order to avoid discriminatory
    decisions taken by such learned components. Ensuring fairness can be particularly
    challenging, because real-world training datasets are expected to contain various
    forms of historical bias that may affect the learning process. In this thesis
    we show that data corruption can indeed render the problem of achieving fairness
    impossible, by tightly characterizing the theoretical limits of fair learning
    under worst-case data manipulations. However, assuming access to clean data, we
    also show how fairness-aware learning can be made practical in contexts beyond
    binary classification, in particular in the challenging learning to rank setting."
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
citation:
  ama: Konstantinov NH. Robustness and fairness in machine learning. 2022. doi:<a
    href="https://doi.org/10.15479/at:ista:10799">10.15479/at:ista:10799</a>
  apa: Konstantinov, N. H. (2022). <i>Robustness and fairness in machine learning</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/at:ista:10799">https://doi.org/10.15479/at:ista:10799</a>
  chicago: Konstantinov, Nikola H. “Robustness and Fairness in Machine Learning.”
    Institute of Science and Technology Austria, 2022. <a href="https://doi.org/10.15479/at:ista:10799">https://doi.org/10.15479/at:ista:10799</a>.
  ieee: N. H. Konstantinov, “Robustness and fairness in machine learning,” Institute
    of Science and Technology Austria, 2022.
  ista: Konstantinov NH. 2022. Robustness and fairness in machine learning. Institute
    of Science and Technology Austria.
  mla: Konstantinov, Nikola H. <i>Robustness and Fairness in Machine Learning</i>.
    Institute of Science and Technology Austria, 2022, doi:<a href="https://doi.org/10.15479/at:ista:10799">10.15479/at:ista:10799</a>.
  short: N.H. Konstantinov, Robustness and Fairness in Machine Learning, Institute
    of Science and Technology Austria, 2022.
date_created: 2022-02-28T13:03:49Z
date_published: 2022-03-08T00:00:00Z
date_updated: 2023-10-17T12:31:54Z
day: '08'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: ChLa
doi: 10.15479/at:ista:10799
ec_funded: 1
file:
- access_level: open_access
  checksum: 626bc523ae8822d20e635d0e2d95182e
  content_type: application/pdf
  creator: nkonstan
  date_created: 2022-03-06T11:42:54Z
  date_updated: 2022-03-06T11:42:54Z
  file_id: '10823'
  file_name: thesis.pdf
  file_size: 4204905
  relation: main_file
  success: 1
- access_level: closed
  checksum: e2ca2b88350ac8ea1515b948885cbcb1
  content_type: application/x-zip-compressed
  creator: nkonstan
  date_created: 2022-03-06T11:42:57Z
  date_updated: 2022-03-10T12:11:48Z
  file_id: '10824'
  file_name: thesis.zip
  file_size: 22841103
  relation: source_file
file_date_updated: 2022-03-10T12:11:48Z
has_accepted_license: '1'
keyword:
- robustness
- fairness
- machine learning
- PAC learning
- adversarial learning
language:
- iso: eng
month: '03'
oa: 1
oa_version: Published Version
page: '176'
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
publication_identifier:
  isbn:
  - 978-3-99078-015-2
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '8724'
    relation: part_of_dissertation
    status: public
  - id: '10803'
    relation: part_of_dissertation
    status: public
  - id: '10802'
    relation: part_of_dissertation
    status: public
  - id: '6590'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Robustness and fairness in machine learning
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2022'
...
---
_id: '10802'
abstract:
- lang: eng
  text: "Addressing fairness concerns about machine learning models is a crucial step
    towards their long-term adoption in real-world automated systems. While many approaches
    have been developed for training fair models from data, little is known about
    the robustness of these methods to data corruption. In this work we consider fairness-aware
    learning under worst-case data manipulations. We show that an adversary can in
    some situations force any learner to return an overly biased classifier, regardless
    of the sample size and with or without degrading\r\naccuracy, and that the strength
    of the excess bias increases for learning problems with underrepresented protected
    groups in the data. We also prove that our hardness results are tight up to constant
    factors. To this end, we study two natural learning algorithms that optimize for
    both accuracy and fairness and show that these algorithms enjoy guarantees that
    are order-optimal in terms of the corruption ratio and the protected group frequencies
    in the large data\r\nlimit."
acknowledgement: The authors thank Eugenia Iofinova and Bernd Prach for providing
  feedback on early versions of this paper. This publication was made possible by
  an ETH AI Center postdoctoral fellowship to Nikola Konstantinov.
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Konstantinov NH, Lampert C. Fairness-aware PAC learning from corrupted data.
    <i>Journal of Machine Learning Research</i>. 2022;23:1-60.
  apa: Konstantinov, N. H., &#38; Lampert, C. (2022). Fairness-aware PAC learning
    from corrupted data. <i>Journal of Machine Learning Research</i>. ML Research
    Press.
  chicago: Konstantinov, Nikola H, and Christoph Lampert. “Fairness-Aware PAC Learning
    from Corrupted Data.” <i>Journal of Machine Learning Research</i>. ML Research
    Press, 2022.
  ieee: N. H. Konstantinov and C. Lampert, “Fairness-aware PAC learning from corrupted
    data,” <i>Journal of Machine Learning Research</i>, vol. 23. ML Research Press,
    pp. 1–60, 2022.
  ista: Konstantinov NH, Lampert C. 2022. Fairness-aware PAC learning from corrupted
    data. Journal of Machine Learning Research. 23, 1–60.
  mla: Konstantinov, Nikola H., and Christoph Lampert. “Fairness-Aware PAC Learning
    from Corrupted Data.” <i>Journal of Machine Learning Research</i>, vol. 23, ML
    Research Press, 2022, pp. 1–60.
  short: N.H. Konstantinov, C. Lampert, Journal of Machine Learning Research 23 (2022)
    1–60.
date_created: 2022-02-28T14:05:42Z
date_published: 2022-05-01T00:00:00Z
date_updated: 2023-09-26T10:44:37Z
day: '01'
ddc:
- '004'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2102.06004'
file:
- access_level: open_access
  checksum: 9cac897b54a0ddf3a553a2c33e88cfda
  content_type: application/pdf
  creator: kschuh
  date_created: 2022-07-12T15:08:28Z
  date_updated: 2022-07-12T15:08:28Z
  file_id: '11570'
  file_name: 2022_JournalMachineLearningResearch_Konstantinov.pdf
  file_size: 551862
  relation: main_file
  success: 1
file_date_updated: 2022-07-12T15:08:28Z
has_accepted_license: '1'
intvolume: '23'
keyword:
- Fairness
- robustness
- data poisoning
- trustworthy machine learning
- PAC learning
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: 1-60
publication: Journal of Machine Learning Research
publication_identifier:
  eissn:
  - 1533-7928
  issn:
  - 1532-4435
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '10799'
    relation: dissertation_contains
    status: public
  - id: '13241'
    relation: shorter_version
    status: public
scopus_import: '1'
status: public
title: Fairness-aware PAC learning from corrupted data
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 23
year: '2022'
...
---
_id: '13241'
abstract:
- lang: eng
  text: Addressing fairness concerns about machine learning models is a crucial step
    towards their long-term adoption in real-world automated systems. Many approaches
    for training fair models from data have been developed and an implicit assumption
    about such algorithms is that they are able to recover a fair model, despite potential
    historical biases in the data. In this work we show a number of impossibility
    results that indicate that there is no learning algorithm that can recover a fair
    model when a proportion of the dataset is subject to arbitrary manipulations.
    Specifically, we prove that there are situations in which an adversary can force
    any learner to return a biased classifier, with or without degrading accuracy,
    and that the strength of this bias increases for learning problems with underrepresented
    protected groups in the data. Our results emphasize the importance of studying
    further data corruption models of various strengths and of establishing stricter
    data collection practices for fairness-aware learning.
acknowledgement: "This paper is a shortened, workshop version of Konstantinov and
  Lampert (2021),\r\nhttps://arxiv.org/abs/2102.06004. For further results, including
  an analysis of algorithms achieving the lower bounds from this paper, we refer to
  the full version."
article_processing_charge: No
arxiv: 1
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Konstantinov NH, Lampert C. On the impossibility of fairness-aware learning
    from corrupted data. In: <i>Proceedings of Machine Learning Research</i>. Vol
    171. ML Research Press; 2022:59-83.'
  apa: Konstantinov, N. H., &#38; Lampert, C. (2022). On the impossibility of fairness-aware
    learning from corrupted data. In <i>Proceedings of Machine Learning Research</i>
    (Vol. 171, pp. 59–83). ML Research Press.
  chicago: Konstantinov, Nikola H, and Christoph Lampert. “On the Impossibility of
    Fairness-Aware Learning from Corrupted Data.” In <i>Proceedings of Machine Learning
    Research</i>, 171:59–83. ML Research Press, 2022.
  ieee: N. H. Konstantinov and C. Lampert, “On the impossibility of fairness-aware
    learning from corrupted data,” in <i>Proceedings of Machine Learning Research</i>,
    2022, vol. 171, pp. 59–83.
  ista: Konstantinov NH, Lampert C. 2022. On the impossibility of fairness-aware learning
    from corrupted data. Proceedings of Machine Learning Research. vol. 171, 59–83.
  mla: Konstantinov, Nikola H., and Christoph Lampert. “On the Impossibility of Fairness-Aware
    Learning from Corrupted Data.” <i>Proceedings of Machine Learning Research</i>,
    vol. 171, ML Research Press, 2022, pp. 59–83.
  short: N.H. Konstantinov, C. Lampert, in:, Proceedings of Machine Learning Research,
    ML Research Press, 2022, pp. 59–83.
date_created: 2023-07-16T22:01:13Z
date_published: 2022-12-01T00:00:00Z
date_updated: 2023-09-26T10:44:37Z
day: '01'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2102.06004'
intvolume: '171'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2102.06004
month: '12'
oa: 1
oa_version: Preprint
page: 59-83
publication: Proceedings of Machine Learning Research
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '10802'
    relation: extended_version
    status: public
scopus_import: '1'
status: public
title: On the impossibility of fairness-aware learning from corrupted data
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 171
year: '2022'
...
---
_id: '11839'
abstract:
- lang: eng
  text: "It is a highly desirable property for deep networks to be robust against\r\nsmall
    input changes. One popular way to achieve this property is by designing\r\nnetworks
    with a small Lipschitz constant. In this work, we propose a new\r\ntechnique for
    constructing such Lipschitz networks that has a number of\r\ndesirable properties:
    it can be applied to any linear network layer\r\n(fully-connected or convolutional),
    it provides formal guarantees on the\r\nLipschitz constant, it is easy to implement
    and efficient to run, and it can be\r\ncombined with any training objective and
    optimization method. In fact, our\r\ntechnique is the first one in the literature
    that achieves all of these\r\nproperties simultaneously. Our main contribution
    is a rescaling-based weight\r\nmatrix parametrization that guarantees each network
    layer to have a Lipschitz\r\nconstant of at most 1 and results in the learned
    weight matrices to be close to\r\northogonal. Hence we call such layers almost-orthogonal
    Lipschitz (AOL).\r\nExperiments and ablation studies in the context of image classification
    with\r\ncertified robust accuracy confirm that AOL layers achieve results that
    are on\r\npar with most existing methods. Yet, they are simpler to implement and
    more\r\nbroadly applicable, because they do not require computationally expensive\r\nmatrix
    orthogonalization or inversion steps as part of the network\r\narchitecture. We
    provide code at https://github.com/berndprach/AOL."
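# A minimal NumPy sketch of the rescaling-based parametrization described in the
# abstract above, assuming the AOL rescaling W -> W diag(d) with
# d_j = (sum_i |W^T W|_ij)^(-1/2); names and the eps term are illustrative, not
# taken from the authors' repository (https://github.com/berndprach/AOL).
#
#   import numpy as np
#
#   def aol_rescale(W, eps=1e-6):
#       # Column sums of |W^T W| bound the squared spectral norm contribution
#       # of each input dimension, so rescaling by their inverse square roots
#       # yields a matrix with spectral norm (Lipschitz constant) at most 1.
#       T = np.abs(W.T @ W)                      # (d_in, d_in)
#       d = 1.0 / np.sqrt(T.sum(axis=0) + eps)   # per-column rescaling factors
#       return W * d                             # equals W @ np.diag(d)
#
#   rng = np.random.default_rng(0)
#   W = rng.normal(size=(64, 32))
#   assert np.linalg.norm(aol_rescale(W), 2) <= 1.0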
alternative_title:
- LNCS
article_processing_charge: No
arxiv: 1
author:
- first_name: Bernd
  full_name: Prach, Bernd
  id: 2D561D42-C427-11E9-89B4-9C1AE6697425
  last_name: Prach
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Prach B, Lampert C. Almost-orthogonal layers for efficient general-purpose
    Lipschitz networks. In: <i>Computer Vision – ECCV 2022</i>. Vol 13681. Springer
    Nature; 2022:350-365. doi:<a href="https://doi.org/10.1007/978-3-031-19803-8_21">10.1007/978-3-031-19803-8_21</a>'
  apa: 'Prach, B., &#38; Lampert, C. (2022). Almost-orthogonal layers for efficient
    general-purpose Lipschitz networks. In <i>Computer Vision – ECCV 2022</i> (Vol.
    13681, pp. 350–365). Tel Aviv, Israel: Springer Nature. <a href="https://doi.org/10.1007/978-3-031-19803-8_21">https://doi.org/10.1007/978-3-031-19803-8_21</a>'
  chicago: Prach, Bernd, and Christoph Lampert. “Almost-Orthogonal Layers for Efficient
    General-Purpose Lipschitz Networks.” In <i>Computer Vision – ECCV 2022</i>, 13681:350–65.
    Springer Nature, 2022. <a href="https://doi.org/10.1007/978-3-031-19803-8_21">https://doi.org/10.1007/978-3-031-19803-8_21</a>.
  ieee: B. Prach and C. Lampert, “Almost-orthogonal layers for efficient general-purpose
    Lipschitz networks,” in <i>Computer Vision – ECCV 2022</i>, Tel Aviv, Israel,
    2022, vol. 13681, pp. 350–365.
  ista: 'Prach B, Lampert C. 2022. Almost-orthogonal layers for efficient general-purpose
    Lipschitz networks. Computer Vision – ECCV 2022. ECCV: European Conference on
    Computer Vision, LNCS, vol. 13681, 350–365.'
  mla: Prach, Bernd, and Christoph Lampert. “Almost-Orthogonal Layers for Efficient
    General-Purpose Lipschitz Networks.” <i>Computer Vision – ECCV 2022</i>, vol.
    13681, Springer Nature, 2022, pp. 350–65, doi:<a href="https://doi.org/10.1007/978-3-031-19803-8_21">10.1007/978-3-031-19803-8_21</a>.
  short: B. Prach, C. Lampert, in:, Computer Vision – ECCV 2022, Springer Nature,
    2022, pp. 350–365.
conference:
  end_date: 2022-10-27
  location: Tel Aviv, Israel
  name: 'ECCV: European Conference on Computer Vision'
  start_date: 2022-10-23
date_created: 2022-08-12T15:09:47Z
date_published: 2022-10-23T00:00:00Z
date_updated: 2023-05-03T08:00:46Z
day: '23'
department:
- _id: GradSch
- _id: ChLa
doi: 10.1007/978-3-031-19803-8_21
external_id:
  arxiv:
  - '2208.03160'
intvolume: '13681'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2208.03160
month: '10'
oa: 1
oa_version: Preprint
page: 350-365
publication: Computer Vision – ECCV 2022
publication_identifier:
  eisbn:
  - '9783031198038'
  isbn:
  - '9783031198021'
publication_status: published
publisher: Springer Nature
quality_controlled: '1'
scopus_import: '1'
status: public
title: Almost-orthogonal layers for efficient general-purpose Lipschitz networks
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 13681
year: '2022'
...
---
_id: '12161'
abstract:
- lang: eng
  text: 'We introduce LIMES, a new method for learning with non-stationary streaming
    data, inspired by the recent success of meta-learning. The main idea is not to
    attempt to learn a single classifier that would have to work well across all occurring
    data distributions, nor many separate classifiers, but to exploit a hybrid strategy:
    we learn a single set of model parameters from which a specific classifier for
    any specific data distribution is derived via classifier adaptation. Assuming
    a multiclass classification setting with class-prior shift, the adaptation step
    can be performed analytically with only the classifier’s bias terms being affected.
    Another contribution of our work is an extrapolation step that predicts suitable
    adaptation parameters for future time steps based on the previous data. In combination,
    we obtain a lightweight procedure for learning from streaming data with varying
    class distribution that adds no trainable parameters and almost no memory or computational
    overhead compared to training a single model. Experiments on a set of exemplary
    tasks using Twitter data show that LIMES achieves higher accuracy than alternative
    approaches, especially with respect to the relevant real-world metric of lowest
    within-day accuracy.'
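# A minimal sketch of the analytic adaptation step summarized above, assuming
# the standard Bayes correction for class-prior shift (only p(y) changes, so a
# softmax classifier needs only a per-class bias update); the priors for future
# time steps would come from the paper's extrapolation step, and all names here
# are illustrative rather than taken from the authors' code.
#
#   import numpy as np
#
#   def adapt_bias(b, train_priors, target_priors):
#       # Shifting from priors pi to pi' changes each class logit by
#       # log(pi'_k / pi_k), i.e. an additive bias correction.
#       return b + np.log(target_priors) - np.log(train_priors)
#
#   b = np.zeros(3)                          # biases of a 3-class model
#   pi_train = np.array([1/3, 1/3, 1/3])
#   pi_next = np.array([0.6, 0.3, 0.1])      # predicted priors for the next step
#   b_adapted = adapt_bias(b, pi_train, pi_next)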
article_processing_charge: No
arxiv: 1
author:
- first_name: Paulina
  full_name: Tomaszewska, Paulina
  last_name: Tomaszewska
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Tomaszewska P, Lampert C. Lightweight conditional model extrapolation for
    streaming data under class-prior shift. In: <i>26th International Conference on
    Pattern Recognition</i>. Vol 2022. Institute of Electrical and Electronics Engineers;
    2022:2128-2134. doi:<a href="https://doi.org/10.1109/icpr56361.2022.9956195">10.1109/icpr56361.2022.9956195</a>'
  apa: 'Tomaszewska, P., &#38; Lampert, C. (2022). Lightweight conditional model extrapolation
    for streaming data under class-prior shift. In <i>26th International Conference
    on Pattern Recognition</i> (Vol. 2022, pp. 2128–2134). Montreal, Canada: Institute
    of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/icpr56361.2022.9956195">https://doi.org/10.1109/icpr56361.2022.9956195</a>'
  chicago: Tomaszewska, Paulina, and Christoph Lampert. “Lightweight Conditional Model
    Extrapolation for Streaming Data under Class-Prior Shift.” In <i>26th International
    Conference on Pattern Recognition</i>, 2022:2128–34. Institute of Electrical and
    Electronics Engineers, 2022. <a href="https://doi.org/10.1109/icpr56361.2022.9956195">https://doi.org/10.1109/icpr56361.2022.9956195</a>.
  ieee: P. Tomaszewska and C. Lampert, “Lightweight conditional model extrapolation
    for streaming data under class-prior shift,” in <i>26th International Conference
    on Pattern Recognition</i>, Montreal, Canada, 2022, vol. 2022, pp. 2128–2134.
  ista: 'Tomaszewska P, Lampert C. 2022. Lightweight conditional model extrapolation
    for streaming data under class-prior shift. 26th International Conference on Pattern
    Recognition. ICPR: International Conference on Pattern Recognition vol. 2022,
    2128–2134.'
  mla: Tomaszewska, Paulina, and Christoph Lampert. “Lightweight Conditional Model
    Extrapolation for Streaming Data under Class-Prior Shift.” <i>26th International
    Conference on Pattern Recognition</i>, vol. 2022, Institute of Electrical and
    Electronics Engineers, 2022, pp. 2128–34, doi:<a href="https://doi.org/10.1109/icpr56361.2022.9956195">10.1109/icpr56361.2022.9956195</a>.
  short: P. Tomaszewska, C. Lampert, in:, 26th International Conference on Pattern
    Recognition, Institute of Electrical and Electronics Engineers, 2022, pp. 2128–2134.
conference:
  end_date: 2022-08-25
  location: Montreal, Canada
  name: 'ICPR: International Conference on Pattern Recognition'
  start_date: 2022-08-21
date_created: 2023-01-12T12:09:38Z
date_published: 2022-11-29T00:00:00Z
date_updated: 2023-08-04T09:06:34Z
day: '29'
department:
- _id: ChLa
doi: 10.1109/icpr56361.2022.9956195
external_id:
  arxiv:
  - '2206.05181'
  isi:
  - '000897707602018'
intvolume: '2022'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2206.05181
month: '11'
oa: 1
oa_version: Preprint
page: 2128-2134
publication: 26th International Conference on Pattern Recognition
publication_identifier:
  eisbn:
  - '9781665490627'
  eissn:
  - 2831-7475
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
scopus_import: '1'
status: public
title: Lightweight conditional model extrapolation for streaming data under class-prior
  shift
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
volume: 2022
year: '2022'
...
---
_id: '12299'
abstract:
- lang: eng
  text: 'Transfer learning is a classic paradigm by which models pretrained on large
    “upstream” datasets are adapted to yield good results on “downstream” specialized
    datasets. Generally, more accurate models on the “upstream” dataset tend to provide
    better transfer accuracy “downstream”. In this work, we perform an in-depth investigation
    of this phenomenon in the context of convolutional neural networks (CNNs) trained
    on the ImageNet dataset, which have been pruned, that is, compressed by sparsifying
    their connections. We consider transfer using unstructured pruned models obtained
    by applying several state-of-the-art pruning methods, including magnitude-based,
    second-order, regrowth, lottery-ticket, and regularization approaches, in the
    context of twelve standard transfer tasks. In a nutshell, our study shows that
    sparse models can match or even outperform the transfer performance of dense models,
    even at high sparsities, and, while doing so, can lead to significant inference
    and even training speedups. At the same time, we observe and analyze significant
    differences in the behaviour of different pruning methods. The code is available
    at: https://github.com/IST-DASLab/sparse-imagenet-transfer.'
acknowledgement: The authors would like to sincerely thank Christoph Lampert and Nir
  Shavit for fruitful discussions during the development of this work, and Eldar Kurtic
  for experimental support. EI was supported in part by the FWF DK VGSCO, grant agreement
  number W1260-N35, while AP and DA acknowledge generous support by the ERC, via Starting
  Grant 805223 ScaleML.
article_processing_charge: No
arxiv: 1
author:
- first_name: Eugenia B
  full_name: Iofinova, Eugenia B
  id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
  last_name: Iofinova
  orcid: 0000-0002-7778-3221
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
- first_name: Mark
  full_name: Kurtz, Mark
  last_name: Kurtz
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Iofinova EB, Peste E-A, Kurtz M, Alistarh D-A. How well do sparse ImageNet
    models transfer? In: <i>2022 IEEE/CVF Conference on Computer Vision and Pattern
    Recognition</i>. Institute of Electrical and Electronics Engineers; 2022:12256-12266.
    doi:<a href="https://doi.org/10.1109/cvpr52688.2022.01195">10.1109/cvpr52688.2022.01195</a>'
  apa: 'Iofinova, E. B., Peste, E.-A., Kurtz, M., &#38; Alistarh, D.-A. (2022). How
    well do sparse ImageNet models transfer? In <i>2022 IEEE/CVF Conference on Computer
    Vision and Pattern Recognition</i> (pp. 12256–12266). New Orleans, LA, United
    States: Institute of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/cvpr52688.2022.01195">https://doi.org/10.1109/cvpr52688.2022.01195</a>'
  chicago: Iofinova, Eugenia B, Elena-Alexandra Peste, Mark Kurtz, and Dan-Adrian
    Alistarh. “How Well Do Sparse ImageNet Models Transfer?” In <i>2022 IEEE/CVF Conference
    on Computer Vision and Pattern Recognition</i>, 12256–66. Institute of Electrical
    and Electronics Engineers, 2022. <a href="https://doi.org/10.1109/cvpr52688.2022.01195">https://doi.org/10.1109/cvpr52688.2022.01195</a>.
  ieee: E. B. Iofinova, E.-A. Peste, M. Kurtz, and D.-A. Alistarh, “How well do sparse
    ImageNet models transfer?,” in <i>2022 IEEE/CVF Conference on Computer Vision
    and Pattern Recognition</i>, New Orleans, LA, United States, 2022, pp. 12256–12266.
  ista: 'Iofinova EB, Peste E-A, Kurtz M, Alistarh D-A. 2022. How well do sparse ImageNet
    models transfer? 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
    CVPR: Computer Vision and Pattern Recognition, 12256–12266.'
  mla: Iofinova, Eugenia B., et al. “How Well Do Sparse ImageNet Models Transfer?”
    <i>2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition</i>, Institute
    of Electrical and Electronics Engineers, 2022, pp. 12256–66, doi:<a href="https://doi.org/10.1109/cvpr52688.2022.01195">10.1109/cvpr52688.2022.01195</a>.
  short: E.B. Iofinova, E.-A. Peste, M. Kurtz, D.-A. Alistarh, in:, 2022 IEEE/CVF
    Conference on Computer Vision and Pattern Recognition, Institute of Electrical
    and Electronics Engineers, 2022, pp. 12256–12266.
conference:
  end_date: 2022-06-24
  location: New Orleans, LA, United States
  name: 'CVPR: Computer Vision and Pattern Recognition'
  start_date: 2022-06-18
date_created: 2023-01-16T10:06:00Z
date_published: 2022-09-27T00:00:00Z
date_updated: 2023-08-04T10:33:28Z
day: '27'
department:
- _id: DaAl
- _id: ChLa
doi: 10.1109/cvpr52688.2022.01195
ec_funded: 1
external_id:
  arxiv:
  - '2111.13445'
  isi:
  - '000870759105034'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2111.13445
month: '09'
oa: 1
oa_version: Preprint
page: 12256-12266
project:
- _id: 9B9290DE-BA93-11EA-9121-9846C619BF3A
  grant_number: 'W1260-N35'
  name: Vienna Graduate School on Computational Optimization
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition
publication_identifier:
  eissn:
  - 2575-7075
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
related_material:
  record:
  - id: '13074'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: How well do sparse ImageNet models transfer?
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
---
_id: '12495'
abstract:
- lang: eng
  text: "Fairness-aware learning aims at constructing classifiers that not only make
    accurate predictions, but also do not discriminate against specific groups. It
    is a fast-growing area of\r\nmachine learning with far-reaching societal impact.
    However, existing fair learning methods\r\nare vulnerable to accidental or malicious
    artifacts in the training data, which can cause\r\nthem to unknowingly produce
    unfair classifiers. In this work we address the problem of\r\nfair learning from
    unreliable training data in the robust multisource setting, where the\r\navailable
    training data comes from multiple sources, a fraction of which might not be representative
    of the true data distribution. We introduce FLEA, a filtering-based algorithm\r\nthat
    identifies and suppresses those data sources that would have a negative impact
    on\r\nfairness or accuracy if they were used for training. As such, FLEA is not
    a replacement of\r\nprior fairness-aware learning methods but rather an augmentation
    that makes any of them\r\nrobust against unreliable training data. We show the
    effectiveness of our approach by a\r\ndiverse range of experiments on multiple
    datasets. Additionally, we prove formally that\r\n–given enough data– FLEA protects
    the learner against corruptions as long as the fraction of\r\naffected data sources
    is less than half. Our source code and documentation are available at\r\nhttps://github.com/ISTAustria-CVML/FLEA."
acknowledged_ssus:
- _id: ScienComp
acknowledgement: 'The authors would like to thank Bernd Prach, Elias Frantar, Alexandra
  Peste, Mahdi Nikdan, and Peter Súkeník for their helpful feedback. This research
  was supported by the Scientific Service Units (SSU) of IST Austria through resources
  provided by Scientific Computing (SciComp). This publication was made possible by
  an ETH AI Center postdoctoral fellowship granted to Nikola Konstantinov. Eugenia
  Iofinova was supported in part by the FWF DK VGSCO, grant agreement number W1260-N35.'
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Eugenia B
  full_name: Iofinova, Eugenia B
  id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
  last_name: Iofinova
  orcid: 0000-0002-7778-3221
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Iofinova EB, Konstantinov NH, Lampert C. FLEA: Provably robust fair multisource
    learning from unreliable training data. <i>Transactions on Machine Learning Research</i>.
    2022.'
  apa: 'Iofinova, E. B., Konstantinov, N. H., &#38; Lampert, C. (2022). FLEA: Provably
    robust fair multisource learning from unreliable training data. <i>Transactions
    on Machine Learning Research</i>. ML Research Press.'
  chicago: 'Iofinova, Eugenia B, Nikola H Konstantinov, and Christoph Lampert. “FLEA:
    Provably Robust Fair Multisource Learning from Unreliable Training Data.” <i>Transactions
    on Machine Learning Research</i>. ML Research Press, 2022.'
  ieee: 'E. B. Iofinova, N. H. Konstantinov, and C. Lampert, “FLEA: Provably robust
    fair multisource learning from unreliable training data,” <i>Transactions on Machine
    Learning Research</i>. ML Research Press, 2022.'
  ista: 'Iofinova EB, Konstantinov NH, Lampert C. 2022. FLEA: Provably robust fair
    multisource learning from unreliable training data. Transactions on Machine Learning
    Research.'
  mla: 'Iofinova, Eugenia B., et al. “FLEA: Provably Robust Fair Multisource Learning
    from Unreliable Training Data.” <i>Transactions on Machine Learning Research</i>,
    ML Research Press, 2022.'
  short: E.B. Iofinova, N.H. Konstantinov, C. Lampert, Transactions on Machine Learning
    Research (2022).
date_created: 2023-02-02T20:29:57Z
date_published: 2022-12-22T00:00:00Z
date_updated: 2023-02-23T10:30:54Z
day: '22'
ddc:
- '000'
department:
- _id: ChLa
external_id:
  arxiv:
  - '2106.11732'
file:
- access_level: open_access
  checksum: 97c8a8470759cab597abb973ca137a3b
  content_type: application/pdf
  creator: dernst
  date_created: 2023-02-23T10:30:04Z
  date_updated: 2023-02-23T10:30:04Z
  file_id: '12673'
  file_name: 2022_TMLR_Iofinova.pdf
  file_size: 1948063
  relation: main_file
  success: 1
file_date_updated: 2023-02-23T10:30:04Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://openreview.net/forum?id=XsPopigZXV
month: '12'
oa: 1
oa_version: Published Version
project:
- _id: 9B9290DE-BA93-11EA-9121-9846C619BF3A
  grant_number: 'W1260-N35'
  name: Vienna Graduate School on Computational Optimization
publication: Transactions on Machine Learning Research
publication_identifier:
  issn:
  - 2835-8856
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  link:
  - description: source code
    relation: software
    url: https://github.com/ISTAustria-CVML/FLEA
status: public
title: 'FLEA: Provably robust fair multisource learning from unreliable training data'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '12660'
abstract:
- lang: eng
  text: 'We present Cross-Client Label Propagation (XCLP), a new method for transductive
    federated learning. XCLP estimates a data graph jointly from the data of multiple
    clients and computes labels for the unlabeled data by propagating label information
    across the graph. To avoid clients having to share their data with anyone, XCLP
    employs two cryptographically secure protocols: secure Hamming distance computation
    and secure summation. We demonstrate two distinct applications of XCLP within
    federated learning. In the first, we use it in a one-shot way to predict labels
    for unseen test points. In the second, we use it to repeatedly pseudo-label unlabeled
    training data in a federated semi-supervised setting. Experiments on both real
    federated and standard benchmark datasets show that in both applications XCLP
    achieves higher classification accuracy than alternative approaches.'
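# A minimal sketch of the (non-secure) label propagation step underlying the
# method above, using the classic closed form F = (I - alpha*S)^(-1) Y; the
# cryptographic protocols XCLP relies on (secure Hamming distance computation
# and secure summation) to build the joint graph across clients are omitted,
# and all names are illustrative.
#
#   import numpy as np
#
#   def propagate_labels(S, Y, alpha=0.9):
#       # S: normalized similarity graph (n, n); Y: one-hot label matrix with
#       # all-zero rows for unlabeled points (n, c).
#       n = S.shape[0]
#       F = np.linalg.solve(np.eye(n) - alpha * S, Y)  # diffuse label mass
#       return F.argmax(axis=1)                        # predicted class per point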
article_number: '2210.06434'
article_processing_charge: No
arxiv: 1
author:
- first_name: Jonathan A
  full_name: Scott, Jonathan A
  id: e499926b-f6e0-11ea-865d-9c63db0031e8
  last_name: Scott
- first_name: Michelle X
  full_name: Yeo, Michelle X
  id: 2D82B818-F248-11E8-B48F-1D18A9856A87
  last_name: Yeo
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Scott JA, Yeo MX, Lampert C. Cross-client Label Propagation for transductive
    federated learning. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2210.06434">10.48550/arXiv.2210.06434</a>
  apa: Scott, J. A., Yeo, M. X., &#38; Lampert, C. (n.d.). Cross-client Label Propagation
    for transductive federated learning. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2210.06434">https://doi.org/10.48550/arXiv.2210.06434</a>
  chicago: Scott, Jonathan A, Michelle X Yeo, and Christoph Lampert. “Cross-Client
    Label Propagation for Transductive Federated Learning.” <i>ArXiv</i>, n.d. <a
    href="https://doi.org/10.48550/arXiv.2210.06434">https://doi.org/10.48550/arXiv.2210.06434</a>.
  ieee: J. A. Scott, M. X. Yeo, and C. Lampert, “Cross-client Label Propagation for
    transductive federated learning,” <i>arXiv</i>.
  ista: Scott JA, Yeo MX, Lampert C. Cross-client Label Propagation for transductive
    federated learning. arXiv, 2210.06434.
  mla: Scott, Jonathan A., et al. “Cross-Client Label Propagation for Transductive
    Federated Learning.” <i>ArXiv</i>, 2210.06434, doi:<a href="https://doi.org/10.48550/arXiv.2210.06434">10.48550/arXiv.2210.06434</a>.
  short: J.A. Scott, M.X. Yeo, C. Lampert, ArXiv (n.d.).
date_created: 2023-02-20T08:21:50Z
date_published: 2022-10-12T00:00:00Z
date_updated: 2023-02-21T08:20:18Z
day: '12'
ddc:
- '004'
department:
- _id: ChLa
doi: 10.48550/arXiv.2210.06434
external_id:
  arxiv:
  - '2210.06434'
file:
- access_level: open_access
  checksum: 7ab20543fd4393f14fb857ce2e4f03c6
  content_type: application/pdf
  creator: chl
  date_created: 2023-02-20T08:21:35Z
  date_updated: 2023-02-20T08:21:35Z
  file_id: '12661'
  file_name: 2210.06434.pdf
  file_size: 291893
  relation: main_file
  success: 1
file_date_updated: 2023-02-20T08:21:35Z
has_accepted_license: '1'
language:
- iso: eng
month: '10'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Cross-client Label Propagation for transductive federated learning
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '12662'
abstract:
- lang: eng
  text: 'Modern machine learning tasks often require considering not just one but
    multiple objectives. For example, besides the prediction quality, this could be
    the efficiency, robustness or fairness of the learned models, or any of their
    combinations. Multi-objective learning offers a natural framework for handling
    such problems without having to commit to early trade-offs. Surprisingly, statistical
    learning theory so far offers almost no insight into the generalization properties
    of multi-objective learning. In this work, we take first steps to fill this gap:
    we establish foundational generalization bounds for the multi-objective setting
    as well as generalization and excess bounds for learning with scalarizations.
    We also provide the first theoretical analysis of the relation between the Pareto-optimal
    sets of the true objectives and the Pareto-optimal sets of their empirical approximations
    from training data. In particular, we show a surprising asymmetry: all Pareto-optimal
    solutions can be approximated by empirically Pareto-optimal ones, but not vice
    versa.'
article_number: '2208.13499'
article_processing_charge: No
arxiv: 1
author:
- first_name: Peter
  full_name: Súkeník, Peter
  id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c
  last_name: Súkeník
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Súkeník P, Lampert C. Generalization in Multi-objective machine learning. <i>arXiv</i>.
    doi:<a href="https://doi.org/10.48550/arXiv.2208.13499">10.48550/arXiv.2208.13499</a>
  apa: Súkeník, P., &#38; Lampert, C. (n.d.). Generalization in Multi-objective machine
    learning. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2208.13499">https://doi.org/10.48550/arXiv.2208.13499</a>
  chicago: Súkeník, Peter, and Christoph Lampert. “Generalization in Multi-Objective
    Machine Learning.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2208.13499">https://doi.org/10.48550/arXiv.2208.13499</a>.
  ieee: P. Súkeník and C. Lampert, “Generalization in Multi-objective machine learning,”
    <i>arXiv</i>.
  ista: Súkeník P, Lampert C. Generalization in Multi-objective machine learning.
    arXiv, 2208.13499.
  mla: Súkeník, Peter, and Christoph Lampert. “Generalization in Multi-Objective Machine
    Learning.” <i>ArXiv</i>, 2208.13499, doi:<a href="https://doi.org/10.48550/arXiv.2208.13499">10.48550/arXiv.2208.13499</a>.
  short: P. Súkeník, C. Lampert, ArXiv (n.d.).
date_created: 2023-02-20T08:23:06Z
date_published: 2022-08-29T00:00:00Z
date_updated: 2023-02-21T08:24:55Z
day: '29'
ddc:
- '004'
department:
- _id: ChLa
doi: 10.48550/arXiv.2208.13499
external_id:
  arxiv:
  - '2208.13499'
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2208.13499
month: '08'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Generalization in Multi-objective machine learning
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '10803'
abstract:
- lang: eng
  text: Given the abundance of applications of ranking in recent years, addressing
    fairness concerns around automated ranking systems becomes necessary for increasing
    the trust among end-users. Previous work on fair ranking has mostly focused on
    application-specific fairness notions, often tailored to online advertising, and
    it rarely considers learning as part of the process. In this work, we show how
    to transfer numerous fairness notions from binary classification to a learning
    to rank setting. Our formalism allows us to design methods for incorporating fairness
    objectives with provable generalization guarantees. An extensive experimental
    evaluation shows that our method can improve ranking fairness substantially with
    no or only little loss of model quality.
article_number: '2102.05996'
article_processing_charge: No
arxiv: 1
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: Konstantinov NH, Lampert C. Fairness through regularization for learning to
    rank. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2102.05996">10.48550/arXiv.2102.05996</a>
  apa: Konstantinov, N. H., &#38; Lampert, C. (n.d.). Fairness through regularization
    for learning to rank. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2102.05996">https://doi.org/10.48550/arXiv.2102.05996</a>
  chicago: Konstantinov, Nikola H, and Christoph Lampert. “Fairness through Regularization
    for Learning to Rank.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2102.05996">https://doi.org/10.48550/arXiv.2102.05996</a>.
  ieee: N. H. Konstantinov and C. Lampert, “Fairness through regularization for learning
    to rank,” <i>arXiv</i>.
  ista: Konstantinov NH, Lampert C. Fairness through regularization for learning to
    rank. arXiv, 2102.05996.
  mla: Konstantinov, Nikola H., and Christoph Lampert. “Fairness through Regularization
    for Learning to Rank.” <i>ArXiv</i>, 2102.05996, doi:<a href="https://doi.org/10.48550/arXiv.2102.05996">10.48550/arXiv.2102.05996</a>.
  short: N.H. Konstantinov, C. Lampert, ArXiv (n.d.).
date_created: 2022-02-28T14:13:59Z
date_published: 2021-06-07T00:00:00Z
date_updated: 2023-09-07T13:42:08Z
day: '07'
department:
- _id: ChLa
doi: 10.48550/arXiv.2102.05996
external_id:
  arxiv:
  - '2102.05996'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2102.05996
month: '06'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
related_material:
  record:
  - id: '10799'
    relation: dissertation_contains
    status: public
status: public
title: Fairness through regularization for learning to rank
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2021'
...
---
_id: '14987'
abstract:
- lang: eng
  text: "The goal of zero-shot learning is to construct a classifier that can identify
    object classes for which no training examples are available. When training data
    for some of the object classes is available but not for others, the name generalized
    zero-shot learning is commonly used.\r\nIn a wider sense, the phrase zero-shot
    is also used to describe other machine learning-based approaches that require
    no training data from the problem of interest, such as zero-shot action recognition
    or zero-shot machine translation."
article_processing_charge: No
author:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Lampert C. Zero-Shot Learning. In: Ikeuchi K, ed. <i>Computer Vision</i>.
    2nd ed. Cham: Springer; 2021:1395-1397. doi:<a href="https://doi.org/10.1007/978-3-030-63416-2_874">10.1007/978-3-030-63416-2_874</a>'
  apa: 'Lampert, C. (2021). Zero-Shot Learning. In K. Ikeuchi (Ed.), <i>Computer Vision</i>
    (2nd ed., pp. 1395–1397). Cham: Springer. <a href="https://doi.org/10.1007/978-3-030-63416-2_874">https://doi.org/10.1007/978-3-030-63416-2_874</a>'
  chicago: 'Lampert, Christoph. “Zero-Shot Learning.” In <i>Computer Vision</i>, edited
    by Katsushi Ikeuchi, 2nd ed., 1395–97. Cham: Springer, 2021. <a href="https://doi.org/10.1007/978-3-030-63416-2_874">https://doi.org/10.1007/978-3-030-63416-2_874</a>.'
  ieee: 'C. Lampert, “Zero-Shot Learning,” in <i>Computer Vision</i>, 2nd ed., K.
    Ikeuchi, Ed. Cham: Springer, 2021, pp. 1395–1397.'
  ista: 'Lampert C. 2021. Zero-Shot Learning. In: Computer Vision, 1395–1397.'
  mla: Lampert, Christoph. “Zero-Shot Learning.” <i>Computer Vision</i>, edited by
    Katsushi Ikeuchi, 2nd ed., Springer, 2021, pp. 1395–97, doi:<a href="https://doi.org/10.1007/978-3-030-63416-2_874">10.1007/978-3-030-63416-2_874</a>.
  short: C. Lampert, in:, K. Ikeuchi (Ed.), Computer Vision, 2nd ed., Springer, Cham,
    2021, pp. 1395–1397.
date_created: 2024-02-14T14:05:32Z
date_published: 2021-10-13T00:00:00Z
date_updated: 2024-02-19T10:59:04Z
day: '13'
department:
- _id: ChLa
doi: 10.1007/978-3-030-63416-2_874
edition: '2'
editor:
- first_name: Katsushi
  full_name: Ikeuchi, Katsushi
  last_name: Ikeuchi
language:
- iso: eng
month: '10'
oa_version: None
page: 1395-1397
place: Cham
publication: Computer Vision
publication_identifier:
  eisbn:
  - '9783030634162'
  isbn:
  - '9783030634155'
publication_status: published
publisher: Springer
quality_controlled: '1'
status: public
title: Zero-Shot Learning
type: book_chapter
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2021'
...
