NUQSGD: Provably communication-efficient data-parallel SGD via nonuniform quantization

Ramezani-Kebrya A, Faghri F, Markov I, Aksenov V, Alistarh D-A, Roy DM. 2021. NUQSGD: Provably communication-efficient data-parallel SGD via nonuniform quantization. Journal of Machine Learning Research. 22(114), 1–43.

Journal Article | Published | English

Scopus indexed
Author
Ramezani-Kebrya, Ali; Faghri, Fartash; Markov, Ilya; Aksenov, Vitalii (ISTA); Alistarh, Dan-Adrian (ISTA); Roy, Daniel M.
Abstract
As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD provides strong theoretical guarantees; for practical purposes, however, the authors proposed a heuristic variant, which we call QSGDinf, that demonstrated impressive empirical gains for distributed training of large neural networks. In this paper, we build on this work to propose a new gradient quantization scheme, and show that it both has stronger theoretical guarantees than QSGD and matches or exceeds the empirical performance of the QSGDinf heuristic and of other compression methods.
Publishing Year
2021
Date Published
2021-04-01
Journal Title
Journal of Machine Learning Research
Publisher
Journal of Machine Learning Research
Volume
22
Issue
114
Page
1–43

All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
Main File(s)
Access Level
Open Access
Date Uploaded
2021-06-23
MD5 Checksum
6428aa8bcb67768b6949c99b55d5281d


Sources

arXiv 1908.06077