---
_id: '14461'
abstract:
- lang: eng
  text: 'Communication-reduction techniques are a popular way to improve scalability
    in data-parallel training of deep neural networks (DNNs). The recent emergence
    of large language models such as GPT has created the need for new approaches to
    exploit data-parallelism. Among these, fully-sharded data parallel (FSDP) training
    is highly popular, yet it still encounters scalability bottlenecks. One reason
    is that applying compression techniques to FSDP is challenging: as the vast majority
    of the communication involves the model’s weights, direct compression alters convergence
    and leads to accuracy loss. We present QSDP, a variant of FSDP which supports
    both gradient and weight quantization with theoretical guarantees, is simple to
    implement and has essentially no overheads. To derive QSDP we prove that a natural
    modification of SGD achieves convergence even when we only maintain quantized
    weights, and thus the domain over which we train consists of quantized points
    and is, therefore, highly non-convex. We validate this approach by training GPT-family
    models with up to 1.3 billion parameters on a multi-node cluster. Experiments
    show that QSDP preserves model accuracy, while completely removing the communication
    bottlenecks of FSDP, providing end-to-end speedups of up to 2.2x.'
acknowledged_ssus:
- _id: ScienComp
acknowledgement: The authors gratefully acknowledge funding from the European Research
  Council (ERC) under the European Union’s Horizon 2020 research and innovation programme
  (grant agreement No 805223 ScaleML), as well as experimental support from the IST
  Austria IT department, in particular Stefano Elefante, Andrei Hornoiu, and Alois
  Schloegl. AV acknowledges the support of the French Agence Nationale de la Recherche
  (ANR), under grant ANR-21-CE48-0016 (project COMCOPT), the support of Fondation
  Hadamard with a PRMO grant, and the support of CNRS with a CoopIntEER IEA grant
  (project ALFRED).
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Ilia
  full_name: Markov, Ilia
  id: D0CF4148-C985-11E9-8066-0BDEE5697425
  last_name: Markov
- first_name: Adrian
  full_name: Vladu, Adrian
  last_name: Vladu
- first_name: Qi
  full_name: Guo, Qi
  last_name: Guo
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Markov I, Vladu A, Guo Q, Alistarh D-A. Quantized distributed training of
    large models with convergence guarantees. In: <i>Proceedings of the 40th International
    Conference on Machine Learning</i>. Vol 202. ML Research Press; 2023:24020-24044.'
  apa: 'Markov, I., Vladu, A., Guo, Q., &#38; Alistarh, D.-A. (2023). Quantized distributed
    training of large models with convergence guarantees. In <i>Proceedings of the
    40th International Conference on Machine Learning</i> (Vol. 202, pp. 24020–24044).
    Honolulu, HI, United States: ML Research Press.'
  chicago: Markov, Ilia, Adrian Vladu, Qi Guo, and Dan-Adrian Alistarh. “Quantized
    Distributed Training of Large Models with Convergence Guarantees.” In <i>Proceedings
    of the 40th International Conference on Machine Learning</i>, 202:24020–44. ML
    Research Press, 2023.
  ieee: I. Markov, A. Vladu, Q. Guo, and D.-A. Alistarh, “Quantized distributed training
    of large models with convergence guarantees,” in <i>Proceedings of the 40th International
    Conference on Machine Learning</i>, Honolulu, HI, United States, 2023,
    vol. 202, pp. 24020–24044.
  ista: 'Markov I, Vladu A, Guo Q, Alistarh D-A. 2023. Quantized distributed training
    of large models with convergence guarantees. Proceedings of the 40th International
    Conference on Machine Learning. ICML: International Conference on Machine Learning,
    PMLR, vol. 202, 24020–24044.'
  mla: Markov, Ilia, et al. “Quantized Distributed Training of Large Models with Convergence
    Guarantees.” <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    vol. 202, ML Research Press, 2023, pp. 24020–44.
  short: I. Markov, A. Vladu, Q. Guo, D.-A. Alistarh, in:, Proceedings of the 40th
    International Conference on Machine Learning, ML Research Press, 2023, pp. 24020–24044.
conference:
  end_date: 2023-07-29
  location: Honolulu, HI, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2023-07-23
date_created: 2023-10-29T23:01:17Z
date_published: 2023-07-30T00:00:00Z
date_updated: 2023-10-31T09:40:45Z
day: '30'
department:
- _id: DaAl
ec_funded: 1
external_id:
  arxiv:
  - '2302.02390'
intvolume: '202'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2302.02390
month: '07'
oa: 1
oa_version: Preprint
page: 24020-24044
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the 40th International Conference on Machine Learning
publication_identifier:
  eissn:
  - 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Quantized distributed training of large models with convergence guarantees
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '12780'
abstract:
- lang: eng
  text: "The ability to scale out training workloads has been one of the key performance
    enablers of deep learning. The main scaling approach is data-parallel GPU-based
    training, which has been boosted by hardware and software support for highly efficient
    point-to-point communication, and in particular via hardware bandwidth over-provisioning.
    Over-provisioning comes at a cost: there is an order-of-magnitude price difference
    between \"cloud-grade\" servers with such support and their popular \"consumer-grade\"
    counterparts, even though individual server-grade and consumer-grade GPUs can have
    similar computational envelopes.\r\n\r\nIn this paper, we show that the costly hardware
    over-provisioning approach can be supplanted via algorithmic and system design,
    and propose a framework called CGX, which provides efficient software support
    for compressed communication in ML applications, for both multi-GPU single-node
    training, as well as larger-scale multi-node training. CGX is based on two technical
    advances: At the system level, it relies on a re-developed communication stack
    for ML frameworks, which provides flexible, highly efficient support for compressed
    communication. At the application level, it provides seamless, parameter-free
    integration with popular frameworks, so that end-users do not have to modify training
    recipes, nor make significant changes to training code. This is complemented by a layer-wise adaptive
    compression technique which dynamically balances compression gains with accuracy
    preservation. CGX integrates with popular ML frameworks, providing up to 3X speedups
    for multi-GPU nodes based on commodity hardware, and order-of-magnitude improvements
    in the multi-node setting, with negligible impact on accuracy."
acknowledgement: The authors sincerely thank Nikoli Dryden, Tal Ben-Nun, Torsten Hoefler
  and Bapi Chatterjee for useful discussions throughout the development of this project.
article_processing_charge: Yes (via OA deal)
arxiv: 1
author:
- first_name: Ilia
  full_name: Markov, Ilia
  id: D0CF4148-C985-11E9-8066-0BDEE5697425
  last_name: Markov
- first_name: Hamidreza
  full_name: Ramezanikebrya, Hamidreza
  last_name: Ramezanikebrya
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Markov I, Ramezanikebrya H, Alistarh D-A. CGX: Adaptive system support for
    communication-efficient deep learning. In: <i>Proceedings of the 23rd ACM/IFIP
    International Middleware Conference</i>. Association for Computing Machinery;
    2022:241-254. doi:<a href="https://doi.org/10.1145/3528535.3565248">10.1145/3528535.3565248</a>'
  apa: 'Markov, I., Ramezanikebrya, H., &#38; Alistarh, D.-A. (2022). CGX: Adaptive
    system support for communication-efficient deep learning. In <i>Proceedings of
    the 23rd ACM/IFIP International Middleware Conference</i> (pp. 241–254). Quebec,
    QC, Canada: Association for Computing Machinery. <a href="https://doi.org/10.1145/3528535.3565248">https://doi.org/10.1145/3528535.3565248</a>'
  chicago: 'Markov, Ilia, Hamidreza Ramezanikebrya, and Dan-Adrian Alistarh. “CGX:
    Adaptive System Support for Communication-Efficient Deep Learning.” In <i>Proceedings
    of the 23rd ACM/IFIP International Middleware Conference</i>, 241–54. Association
    for Computing Machinery, 2022. <a href="https://doi.org/10.1145/3528535.3565248">https://doi.org/10.1145/3528535.3565248</a>.'
  ieee: 'I. Markov, H. Ramezanikebrya, and D.-A. Alistarh, “CGX: Adaptive system support
    for communication-efficient deep learning,” in <i>Proceedings of the 23rd ACM/IFIP
    International Middleware Conference</i>, Quebec, QC, Canada, 2022, pp. 241–254.'
  ista: 'Markov I, Ramezanikebrya H, Alistarh D-A. 2022. CGX: Adaptive system support
    for communication-efficient deep learning. Proceedings of the 23rd ACM/IFIP International
    Middleware Conference. Middleware: International Middleware Conference, 241–254.'
  mla: 'Markov, Ilia, et al. “CGX: Adaptive System Support for Communication-Efficient
    Deep Learning.” <i>Proceedings of the 23rd ACM/IFIP International Middleware Conference</i>,
    Association for Computing Machinery, 2022, pp. 241–54, doi:<a href="https://doi.org/10.1145/3528535.3565248">10.1145/3528535.3565248</a>.'
  short: I. Markov, H. Ramezanikebrya, D.-A. Alistarh, in:, Proceedings of the 23rd
    ACM/IFIP International Middleware Conference, Association for Computing Machinery,
    2022, pp. 241–254.
conference:
  end_date: 2022-11-11
  location: Quebec, QC, Canada
  name: 'Middleware: International Middleware Conference'
  start_date: 2022-11-07
date_created: 2023-03-31T06:17:00Z
date_published: 2022-11-01T00:00:00Z
date_updated: 2023-04-03T06:21:04Z
day: '01'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1145/3528535.3565248
external_id:
  arxiv:
  - '2111.08617'
file:
- access_level: open_access
  checksum: 1a397746235f245da5468819247ff663
  content_type: application/pdf
  creator: dernst
  date_created: 2023-04-03T06:17:58Z
  date_updated: 2023-04-03T06:17:58Z
  file_id: '12795'
  file_name: 2022_ACMMiddleware_Markov.pdf
  file_size: 1514169
  relation: main_file
  success: 1
file_date_updated: 2023-04-03T06:17:58Z
has_accepted_license: '1'
language:
- iso: eng
month: '11'
oa: 1
oa_version: Published Version
page: 241-254
publication: Proceedings of the 23rd ACM/IFIP International Middleware Conference
publication_identifier:
  isbn:
  - '9781450393409'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
status: public
title: 'CGX: Adaptive system support for communication-efficient deep learning'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '10049'
abstract:
- lang: eng
  text: While messaging systems with strong security guarantees are widely used in
    practice, designing a protocol that scales efficiently to large groups and enjoys
    similar security guarantees remains largely open. The two existing proposals to
    date are ART (Cohn-Gordon et al., CCS18) and TreeKEM (IETF, The Messaging Layer
    Security Protocol, draft). TreeKEM is the candidate currently considered by the
    IETF MLS working group, but dynamic group operations (i.e., adding and removing
    users) can cause efficiency issues. In this paper, we formalize and analyze a variant
    of TreeKEM which we term Tainted TreeKEM (TTKEM for short). The basic idea underlying
    TTKEM was suggested by Millican (MLS mailing list, February 2018). This version
    is more efficient than TreeKEM for some natural distributions of group operations;
    we quantify this through simulations. Our second contribution is two security proofs
    for TTKEM which establish post-compromise and forward secrecy even against adaptive
    attackers. The security loss (to the underlying PKE) in the Random Oracle Model
    is a polynomial factor, and a quasipolynomial one in the Standard Model. Our proofs
    can be adapted to TreeKEM as well. Before our work, no security proof establishing
    tight security against an adversary who can adaptively choose the sequence of operations
    was known for any TreeKEM-like protocol. We are also the first to prove (or even
    formalize) active security, where the server can arbitrarily deviate from the protocol
    specification. Proving fully active security, where the users can also arbitrarily
    deviate, remains open.
acknowledgement: The first three authors contributed equally to this work. Funded
  by the European Research Council (ERC) under the European Union’s Horizon 2020 research
  and innovation programme (682815-TOCNeT). Funded by the European Union’s Horizon
  2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement
  No. 665385.
article_processing_charge: No
author:
- first_name: Karen
  full_name: Klein, Karen
  id: 3E83A2F8-F248-11E8-B48F-1D18A9856A87
  last_name: Klein
- first_name: Guillermo
  full_name: Pascual Perez, Guillermo
  id: 2D7ABD02-F248-11E8-B48F-1D18A9856A87
  last_name: Pascual Perez
  orcid: 0000-0001-8630-415X
- first_name: Michael
  full_name: Walter, Michael
  id: 488F98B0-F248-11E8-B48F-1D18A9856A87
  last_name: Walter
  orcid: 0000-0003-3186-2482
- first_name: Chethan
  full_name: Kamath Hosdurg, Chethan
  id: 4BD3F30E-F248-11E8-B48F-1D18A9856A87
  last_name: Kamath Hosdurg
- first_name: Margarita
  full_name: Capretto, Margarita
  last_name: Capretto
- first_name: Miguel
  full_name: Cueto Noval, Miguel
  id: ffc563a3-f6e0-11ea-865d-e3cce03d17cc
  last_name: Cueto Noval
- first_name: Ilia
  full_name: Markov, Ilia
  id: D0CF4148-C985-11E9-8066-0BDEE5697425
  last_name: Markov
- first_name: Michelle X
  full_name: Yeo, Michelle X
  id: 2D82B818-F248-11E8-B48F-1D18A9856A87
  last_name: Yeo
- first_name: Joel F
  full_name: Alwen, Joel F
  id: 2A8DFA8C-F248-11E8-B48F-1D18A9856A87
  last_name: Alwen
- first_name: Krzysztof Z
  full_name: Pietrzak, Krzysztof Z
  id: 3E04A7AA-F248-11E8-B48F-1D18A9856A87
  last_name: Pietrzak
  orcid: 0000-0002-9139-1654
citation:
  ama: 'Klein K, Pascual Perez G, Walter M, et al. Keep the dirt: tainted TreeKEM,
    adaptively and actively secure continuous group key agreement. In: <i>2021 IEEE
    Symposium on Security and Privacy </i>. IEEE; 2021:268-284. doi:<a href="https://doi.org/10.1109/sp40001.2021.00035">10.1109/sp40001.2021.00035</a>'
  apa: 'Klein, K., Pascual Perez, G., Walter, M., Kamath Hosdurg, C., Capretto, M.,
    Cueto Noval, M., … Pietrzak, K. Z. (2021). Keep the dirt: tainted TreeKEM, adaptively
    and actively secure continuous group key agreement. In <i>2021 IEEE Symposium
    on Security and Privacy </i> (pp. 268–284). San Francisco, CA, United States:
    IEEE. <a href="https://doi.org/10.1109/sp40001.2021.00035">https://doi.org/10.1109/sp40001.2021.00035</a>'
  chicago: 'Klein, Karen, Guillermo Pascual Perez, Michael Walter, Chethan Kamath
    Hosdurg, Margarita Capretto, Miguel Cueto Noval, Ilia Markov, Michelle X Yeo,
    Joel F Alwen, and Krzysztof Z Pietrzak. “Keep the Dirt: Tainted TreeKEM, Adaptively
    and Actively Secure Continuous Group Key Agreement.” In <i>2021 IEEE Symposium
    on Security and Privacy </i>, 268–84. IEEE, 2021. <a href="https://doi.org/10.1109/sp40001.2021.00035">https://doi.org/10.1109/sp40001.2021.00035</a>.'
  ieee: 'K. Klein <i>et al.</i>, “Keep the dirt: tainted TreeKEM, adaptively and actively
    secure continuous group key agreement,” in <i>2021 IEEE Symposium on Security
    and Privacy </i>, San Francisco, CA, United States, 2021, pp. 268–284.'
  ista: 'Klein K, Pascual Perez G, Walter M, Kamath Hosdurg C, Capretto M, Cueto Noval
    M, Markov I, Yeo MX, Alwen JF, Pietrzak KZ. 2021. Keep the dirt: tainted TreeKEM,
    adaptively and actively secure continuous group key agreement. 2021 IEEE Symposium
    on Security and Privacy . SP: Symposium on Security and Privacy, 268–284.'
  mla: 'Klein, Karen, et al. “Keep the Dirt: Tainted TreeKEM, Adaptively and Actively
    Secure Continuous Group Key Agreement.” <i>2021 IEEE Symposium on Security and
    Privacy </i>, IEEE, 2021, pp. 268–84, doi:<a href="https://doi.org/10.1109/sp40001.2021.00035">10.1109/sp40001.2021.00035</a>.'
  short: K. Klein, G. Pascual Perez, M. Walter, C. Kamath Hosdurg, M. Capretto, M.
    Cueto Noval, I. Markov, M.X. Yeo, J.F. Alwen, K.Z. Pietrzak, in:, 2021 IEEE Symposium
    on Security and Privacy , IEEE, 2021, pp. 268–284.
conference:
  end_date: 2021-05-27
  location: San Francisco, CA, United States
  name: 'SP: Symposium on Security and Privacy'
  start_date: 2021-05-24
date_created: 2021-09-27T13:46:27Z
date_published: 2021-08-26T00:00:00Z
date_updated: 2023-09-07T13:32:11Z
day: '26'
department:
- _id: KrPi
- _id: DaAl
doi: 10.1109/sp40001.2021.00035
ec_funded: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://eprint.iacr.org/2019/1489
month: '08'
oa: 1
oa_version: Preprint
page: 268-284
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
- _id: 258AA5B2-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '682815'
  name: Teaching Old Crypto New Tricks
publication: 2021 IEEE Symposium on Security and Privacy
publication_status: published
publisher: IEEE
quality_controlled: '1'
related_material:
  record:
  - id: '10035'
    relation: dissertation_contains
    status: public
status: public
title: 'Keep the dirt: tainted TreeKEM, adaptively and actively secure continuous
  group key agreement'
type: conference
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2021'
...
---
_id: '10432'
abstract:
- lang: eng
  text: One key element behind the recent progress of machine learning has been the
    ability to train machine learning models in large-scale distributed shared-memory
    and message-passing environments. Most of these models are trained using variants
    of stochastic gradient descent (SGD)-based optimization, but most methods involve
    some type of consistency relaxation relative to sequential SGD to mitigate its
    large communication or synchronization costs at scale. In this paper, we introduce
    a general consistency condition covering communication-reduced and asynchronous
    distributed SGD implementations. Our framework, called elastic consistency, decouples
    the system-specific aspects of the implementation from the SGD convergence requirements,
    giving a general way to obtain convergence bounds for a wide variety of distributed
    SGD methods used in practice. Elastic consistency can be used not only to re-derive
    or improve several previous convergence bounds in message-passing and shared-memory
    settings, but also to analyze new models and distribution schemes. As a direct
    application, we propose and analyze a new synchronization-avoiding scheduling
    scheme for distributed SGD, and show that it can be used to efficiently train
    deep convolutional models for image classification.
acknowledgement: "We would like to thank Christopher De Sa for his feedback on an
  earlier draft of this paper, as well as the anonymous AAAI reviewers for their useful
  comments. This project has received funding from the European Research Council
  (ERC) under the European Union’s Horizon 2020 research and innovation programme
  (grant agreement No 805223 ScaleML). Bapi Chatterjee was supported by the European
  Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie
  grant agreement No. 754411 (ISTPlus)."
article_processing_charge: No
arxiv: 1
author:
- first_name: Giorgi
  full_name: Nadiradze, Giorgi
  id: 3279A00C-F248-11E8-B48F-1D18A9856A87
  last_name: Nadiradze
  orcid: 0000-0001-5634-0731
- first_name: Ilia
  full_name: Markov, Ilia
  id: D0CF4148-C985-11E9-8066-0BDEE5697425
  last_name: Markov
- first_name: Bapi
  full_name: Chatterjee, Bapi
  id: 3C41A08A-F248-11E8-B48F-1D18A9856A87
  last_name: Chatterjee
  orcid: 0000-0002-2742-4028
- first_name: Vyacheslav
  full_name: Kungurtsev, Vyacheslav
  last_name: Kungurtsev
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
citation:
  ama: 'Nadiradze G, Markov I, Chatterjee B, Kungurtsev V, Alistarh D-A. Elastic consistency:
    A practical consistency model for distributed stochastic gradient descent. In:
    <i>Proceedings of the AAAI Conference on Artificial Intelligence</i>. Vol 35.
    ; 2021:9037-9045.'
  apa: 'Nadiradze, G., Markov, I., Chatterjee, B., Kungurtsev, V., &#38; Alistarh,
    D.-A. (2021). Elastic consistency: A practical consistency model for distributed
    stochastic gradient descent. In <i>Proceedings of the AAAI Conference on Artificial
    Intelligence</i> (Vol. 35, pp. 9037–9045). Virtual.'
  chicago: 'Nadiradze, Giorgi, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev,
    and Dan-Adrian Alistarh. “Elastic Consistency: A Practical Consistency Model for
    Distributed Stochastic Gradient Descent.” In <i>Proceedings of the AAAI Conference
    on Artificial Intelligence</i>, 35:9037–45, 2021.'
  ieee: 'G. Nadiradze, I. Markov, B. Chatterjee, V. Kungurtsev, and D.-A. Alistarh,
    “Elastic consistency: A practical consistency model for distributed stochastic
    gradient descent,” in <i>Proceedings of the AAAI Conference on Artificial Intelligence</i>,
    Virtual, 2021, vol. 35, no. 10, pp. 9037–9045.'
  ista: 'Nadiradze G, Markov I, Chatterjee B, Kungurtsev V, Alistarh D-A. 2021. Elastic
    consistency: A practical consistency model for distributed stochastic gradient
    descent. Proceedings of the AAAI Conference on Artificial Intelligence. AAAI:
    Association for the Advancement of Artificial Intelligence vol. 35, 9037–9045.'
  mla: 'Nadiradze, Giorgi, et al. “Elastic Consistency: A Practical Consistency Model
    for Distributed Stochastic Gradient Descent.” <i>Proceedings of the AAAI Conference
    on Artificial Intelligence</i>, vol. 35, no. 10, 2021, pp. 9037–45.'
  short: G. Nadiradze, I. Markov, B. Chatterjee, V. Kungurtsev, D.-A. Alistarh, in:,
    Proceedings of the AAAI Conference on Artificial Intelligence, 2021, pp. 9037–9045.
conference:
  end_date: 2021-02-09
  location: Virtual
  name: 'AAAI: Association for the Advancement of Artificial Intelligence'
  start_date: 2021-02-02
date_created: 2021-12-09T09:21:35Z
date_published: 2021-05-18T00:00:00Z
date_updated: 2023-09-07T13:31:39Z
day: '18'
department:
- _id: DaAl
ec_funded: 1
external_id:
  arxiv:
  - '2001.05918'
intvolume: '35'
issue: '10'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://ojs.aaai.org/index.php/AAAI/article/view/17092
month: '05'
oa: 1
oa_version: Published Version
page: 9037-9045
project:
- _id: 260C2330-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '754411'
  name: ISTplus - Postdoctoral Fellowships
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the AAAI Conference on Artificial Intelligence
publication_status: published
quality_controlled: '1'
related_material:
  record:
  - id: '10429'
    relation: dissertation_contains
    status: public
status: public
title: 'Elastic consistency: A practical consistency model for distributed stochastic
  gradient descent'
type: conference
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
volume: 35
year: '2021'
...
