---
_id: '6569'
abstract:
- lang: eng
  text: 'Knowledge distillation, i.e. one classifier being trained on the outputs
    of another classifier, is an empirically very successful technique for knowledge
    transfer between classifiers. It has even been observed that classifiers learn
    much faster and more reliably if trained with the outputs of another classifier
    as soft labels, instead of from ground truth data. So far, however, there is no
    satisfactory theoretical explanation of this phenomenon. In this work, we provide
    the first insights into the working mechanisms of distillation by studying the
    special case of linear and deep linear classifiers. Specifically, we prove a
    generalization bound that establishes fast convergence of the expected risk of
    a distillation-trained linear classifier. From the bound and its proof we extract
    three key factors that determine the success of distillation: data geometry – geometric
    properties of the data distribution, in particular class separation, have an immediate
    influence on the convergence speed of the risk; optimization bias – gradient descent
    optimization finds a very favorable minimum of the distillation objective; and strong monotonicity –
    the expected risk of the student classifier always decreases when the size of
    the training set grows.'
article_processing_charge: No
author:
- first_name: Phuong
  full_name: Bui Thi Mai, Phuong
  id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87
  last_name: Bui Thi Mai
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Phuong M, Lampert C. Towards understanding knowledge distillation. In: <i>Proceedings
    of the 36th International Conference on Machine Learning</i>. Vol 97. ML Research
    Press; 2019:5142-5151.'
  apa: 'Phuong, M., &#38; Lampert, C. (2019). Towards understanding knowledge distillation.
    In <i>Proceedings of the 36th International Conference on Machine Learning</i>
    (Vol. 97, pp. 5142–5151). Long Beach, CA, United States: ML Research Press.'
  chicago: Phuong, Mary, and Christoph Lampert. “Towards Understanding Knowledge Distillation.”
    In <i>Proceedings of the 36th International Conference on Machine Learning</i>,
    97:5142–51. ML Research Press, 2019.
  ieee: M. Phuong and C. Lampert, “Towards understanding knowledge distillation,”
    in <i>Proceedings of the 36th International Conference on Machine Learning</i>,
    Long Beach, CA, United States, 2019, vol. 97, pp. 5142–5151.
  ista: 'Phuong M, Lampert C. 2019. Towards understanding knowledge distillation.
    Proceedings of the 36th International Conference on Machine Learning. ICML: International
    Conference on Machine Learning vol. 97, 5142–5151.'
  mla: Phuong, Mary, and Christoph Lampert. “Towards Understanding Knowledge Distillation.”
    <i>Proceedings of the 36th International Conference on Machine Learning</i>, vol.
    97, ML Research Press, 2019, pp. 5142–51.
  short: M. Phuong, C. Lampert, in:, Proceedings of the 36th International Conference
    on Machine Learning, ML Research Press, 2019, pp. 5142–5151.
conference:
  end_date: 2019-06-15
  location: Long Beach, CA, United States
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2019-06-10
date_created: 2019-06-20T18:23:03Z
date_published: 2019-06-13T00:00:00Z
date_updated: 2023-10-17T12:31:38Z
day: '13'
ddc:
- '000'
department:
- _id: ChLa
file:
- access_level: open_access
  checksum: a66d00e2694d749250f8507f301320ca
  content_type: application/pdf
  creator: bphuong
  date_created: 2019-06-20T18:22:56Z
  date_updated: 2020-07-14T12:47:33Z
  file_id: '6570'
  file_name: paper.pdf
  file_size: 686432
  relation: main_file
file_date_updated: 2020-07-14T12:47:33Z
has_accepted_license: '1'
intvolume: '97'
language:
- iso: eng
month: '06'
oa: 1
oa_version: Published Version
page: 5142-5151
publication: Proceedings of the 36th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Towards understanding knowledge distillation
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 97
year: '2019'
...
---
_id: '6590'
abstract:
- lang: eng
  text: 'Modern machine learning methods often require more data for training than
    a single expert can provide. Therefore, it has become a standard procedure to
    collect data from external sources, e.g. via crowdsourcing. Unfortunately, the
    quality of these sources is not always guaranteed. As additional complications,
    the data might be stored in a distributed way, or might even have to remain private.
    In this work, we address the question of how to learn robustly in such scenarios.
    Studying the problem through the lens of statistical learning theory, we derive
    a procedure that allows for learning from all available sources, yet automatically
    suppresses irrelevant or corrupted data. We show by extensive experiments that
    our method provides significant improvements over alternative approaches from
    robust statistics and distributed optimization.'
article_processing_charge: No
arxiv: 1
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Konstantinov NH, Lampert C. Robust learning from untrusted sources. In: <i>Proceedings
    of the 36th International Conference on Machine Learning</i>. Vol 97. ML Research
    Press; 2019:3488-3498.'
  apa: 'Konstantinov, N. H., &#38; Lampert, C. (2019). Robust learning from untrusted
    sources. In <i>Proceedings of the 36th International Conference on Machine Learning</i>
    (Vol. 97, pp. 3488–3498). Long Beach, CA, USA: ML Research Press.'
  chicago: Konstantinov, Nikola H, and Christoph Lampert. “Robust Learning from Untrusted
    Sources.” In <i>Proceedings of the 36th International Conference on Machine Learning</i>,
    97:3488–98. ML Research Press, 2019.
  ieee: N. H. Konstantinov and C. Lampert, “Robust learning from untrusted sources,”
    in <i>Proceedings of the 36th International Conference on Machine Learning</i>,
    Long Beach, CA, USA, 2019, vol. 97, pp. 3488–3498.
  ista: 'Konstantinov NH, Lampert C. 2019. Robust learning from untrusted sources.
    Proceedings of the 36th International Conference on Machine Learning. ICML: International
    Conference on Machine Learning vol. 97, 3488–3498.'
  mla: Konstantinov, Nikola H., and Christoph Lampert. “Robust Learning from Untrusted
    Sources.” <i>Proceedings of the 36th International Conference on Machine Learning</i>,
    vol. 97, ML Research Press, 2019, pp. 3488–98.
  short: N.H. Konstantinov, C. Lampert, in:, Proceedings of the 36th International
    Conference on Machine Learning, ML Research Press, 2019, pp. 3488–3498.
conference:
  end_date: 2019-06-15
  location: Long Beach, CA, USA
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2019-06-10
date_created: 2019-06-27T14:18:23Z
date_published: 2019-06-01T00:00:00Z
date_updated: 2023-10-17T12:31:55Z
day: '01'
department:
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1901.10310'
intvolume: '97'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1901.10310
month: '06'
oa: 1
oa_version: Preprint
page: 3488-3498
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
publication: Proceedings of the 36th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  record:
  - id: '10799'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: Robust learning from untrusted sources
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 97
year: '2019'
...
---
_id: '321'
abstract:
- lang: eng
  text: The twelve papers in this special section focus on learning systems with shared
    information for computer vision and multimedia communication analysis. In the
    real world, a realistic setting for computer vision or multimedia recognition
    problems is that we have some classes containing lots of training data and many
    classes containing a small amount of training data. Therefore, how to use frequent
    classes to help learn rare classes for which it is harder to collect the training
    data is an open question. Learning with shared information is an emerging topic
    in machine learning, computer vision and multimedia analysis. There are different
    levels of components that can be shared during concept modeling and machine learning
    stages, such as sharing generic object parts, sharing attributes, sharing transformations,
    sharing regularization parameters and sharing training examples, etc. Regarding
    the specific methods, multi-task learning, transfer learning and deep learning
    can be seen as using different strategies to share information. These methods for
    learning with shared information are very effective in solving large-scale real-world
    problems.
article_processing_charge: No
article_type: original
author:
- first_name: Trevor
  full_name: Darrell, Trevor
  last_name: Darrell
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Nico
  full_name: Sebe, Nico
  last_name: Sebe
- first_name: Ying
  full_name: Wu, Ying
  last_name: Wu
- first_name: Yan
  full_name: Yan, Yan
  last_name: Yan
citation:
  ama: Darrell T, Lampert C, Sebe N, Wu Y, Yan Y. Guest editors’ introduction to the
    special section on learning with Shared information for computer vision and multimedia
    analysis. <i>IEEE Transactions on Pattern Analysis and Machine Intelligence</i>.
    2018;40(5):1029-1031. doi:<a href="https://doi.org/10.1109/TPAMI.2018.2804998">10.1109/TPAMI.2018.2804998</a>
  apa: Darrell, T., Lampert, C., Sebe, N., Wu, Y., &#38; Yan, Y. (2018). Guest editors’
    introduction to the special section on learning with Shared information for computer
    vision and multimedia analysis. <i>IEEE Transactions on Pattern Analysis and Machine
    Intelligence</i>. IEEE. <a href="https://doi.org/10.1109/TPAMI.2018.2804998">https://doi.org/10.1109/TPAMI.2018.2804998</a>
  chicago: Darrell, Trevor, Christoph Lampert, Nico Sebe, Ying Wu, and Yan Yan. “Guest
    Editors’ Introduction to the Special Section on Learning with Shared Information
    for Computer Vision and Multimedia Analysis.” <i>IEEE Transactions on Pattern
    Analysis and Machine Intelligence</i>. IEEE, 2018. <a href="https://doi.org/10.1109/TPAMI.2018.2804998">https://doi.org/10.1109/TPAMI.2018.2804998</a>.
  ieee: T. Darrell, C. Lampert, N. Sebe, Y. Wu, and Y. Yan, “Guest editors’ introduction
    to the special section on learning with Shared information for computer vision
    and multimedia analysis,” <i>IEEE Transactions on Pattern Analysis and Machine
    Intelligence</i>, vol. 40, no. 5. IEEE, pp. 1029–1031, 2018.
  ista: Darrell T, Lampert C, Sebe N, Wu Y, Yan Y. 2018. Guest editors’ introduction
    to the special section on learning with Shared information for computer vision
    and multimedia analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence.
    40(5), 1029–1031.
  mla: Darrell, Trevor, et al. “Guest Editors’ Introduction to the Special Section
    on Learning with Shared Information for Computer Vision and Multimedia Analysis.”
    <i>IEEE Transactions on Pattern Analysis and Machine Intelligence</i>, vol. 40,
    no. 5, IEEE, 2018, pp. 1029–31, doi:<a href="https://doi.org/10.1109/TPAMI.2018.2804998">10.1109/TPAMI.2018.2804998</a>.
  short: T. Darrell, C. Lampert, N. Sebe, Y. Wu, Y. Yan, IEEE Transactions on Pattern
    Analysis and Machine Intelligence 40 (2018) 1029–1031.
date_created: 2018-12-11T11:45:48Z
date_published: 2018-05-01T00:00:00Z
date_updated: 2023-09-11T14:07:54Z
day: '01'
ddc:
- '000'
department:
- _id: ChLa
doi: 10.1109/TPAMI.2018.2804998
external_id:
  isi:
  - '000428901200001'
file:
- access_level: open_access
  checksum: b19c75da06faf3291a3ca47dfa50ef63
  content_type: application/pdf
  creator: dernst
  date_created: 2020-05-14T12:50:48Z
  date_updated: 2020-07-14T12:46:03Z
  file_id: '7835'
  file_name: 2018_IEEE_Darrell.pdf
  file_size: 141724
  relation: main_file
file_date_updated: 2020-07-14T12:46:03Z
has_accepted_license: '1'
intvolume: '40'
isi: 1
issue: '5'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: 1029-1031
publication: IEEE Transactions on Pattern Analysis and Machine Intelligence
publication_status: published
publisher: IEEE
publist_id: '7544'
quality_controlled: '1'
scopus_import: '1'
status: public
title: Guest editors' introduction to the special section on learning with shared
  information for computer vision and multimedia analysis
type: journal_article
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
volume: 40
year: '2018'
...
---
_id: '10882'
abstract:
- lang: eng
  text: 'We introduce Intelligent Annotation Dialogs for bounding box annotation.
    We train an agent to automatically choose a sequence of actions for a human annotator
    to produce a bounding box in a minimal amount of time. Specifically, we consider
    two actions: box verification [34], where the annotator verifies a box generated
    by an object detector, and manual box drawing. We explore two kinds of agents,
    one based on predicting the probability that a box will be positively verified,
    and the other based on reinforcement learning. We demonstrate that (1) our agents
    are able to learn efficient annotation strategies in several scenarios, automatically
    adapting to the image difficulty, the desired quality of the boxes, and the detector
    strength; (2) in all scenarios the resulting annotation dialogs speed up annotation
    compared to manual box drawing alone and box verification alone, while also outperforming
    any fixed combination of verification and drawing in most scenarios; (3) in a
    realistic scenario where the detector is iteratively re-trained, our agents evolve
    a series of strategies that reflect the shifting trade-off between verification
    and drawing as the detector grows stronger.'
article_processing_charge: No
arxiv: 1
author:
- first_name: Jasper
  full_name: Uijlings, Jasper
  last_name: Uijlings
- first_name: Ksenia
  full_name: Konyushkova, Ksenia
  last_name: Konyushkova
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Vittorio
  full_name: Ferrari, Vittorio
  last_name: Ferrari
citation:
  ama: 'Uijlings J, Konyushkova K, Lampert C, Ferrari V. Learning intelligent dialogs
    for bounding box annotation. In: <i>2018 IEEE/CVF Conference on Computer Vision
    and Pattern Recognition</i>. IEEE; 2018:9175-9184. doi:<a href="https://doi.org/10.1109/cvpr.2018.00956">10.1109/cvpr.2018.00956</a>'
  apa: 'Uijlings, J., Konyushkova, K., Lampert, C., &#38; Ferrari, V. (2018). Learning
    intelligent dialogs for bounding box annotation. In <i>2018 IEEE/CVF Conference
    on Computer Vision and Pattern Recognition</i> (pp. 9175–9184). Salt Lake City,
    UT, United States: IEEE. <a href="https://doi.org/10.1109/cvpr.2018.00956">https://doi.org/10.1109/cvpr.2018.00956</a>'
  chicago: Uijlings, Jasper, Ksenia Konyushkova, Christoph Lampert, and Vittorio Ferrari.
    “Learning Intelligent Dialogs for Bounding Box Annotation.” In <i>2018 IEEE/CVF
    Conference on Computer Vision and Pattern Recognition</i>, 9175–84. IEEE, 2018.
    <a href="https://doi.org/10.1109/cvpr.2018.00956">https://doi.org/10.1109/cvpr.2018.00956</a>.
  ieee: J. Uijlings, K. Konyushkova, C. Lampert, and V. Ferrari, “Learning intelligent
    dialogs for bounding box annotation,” in <i>2018 IEEE/CVF Conference on Computer
    Vision and Pattern Recognition</i>, Salt Lake City, UT, United States, 2018, pp.
    9175–9184.
  ista: 'Uijlings J, Konyushkova K, Lampert C, Ferrari V. 2018. Learning intelligent
    dialogs for bounding box annotation. 2018 IEEE/CVF Conference on Computer Vision
    and Pattern Recognition. CVF: Conference on Computer Vision and Pattern Recognition,
    9175–9184.'
  mla: Uijlings, Jasper, et al. “Learning Intelligent Dialogs for Bounding Box Annotation.”
    <i>2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition</i>, IEEE,
    2018, pp. 9175–84, doi:<a href="https://doi.org/10.1109/cvpr.2018.00956">10.1109/cvpr.2018.00956</a>.
  short: J. Uijlings, K. Konyushkova, C. Lampert, V. Ferrari, in:, 2018 IEEE/CVF Conference
    on Computer Vision and Pattern Recognition, IEEE, 2018, pp. 9175–9184.
conference:
  end_date: 2018-06-23
  location: Salt Lake City, UT, United States
  name: 'CVF: Conference on Computer Vision and Pattern Recognition'
  start_date: 2018-06-18
date_created: 2022-03-18T12:45:09Z
date_published: 2018-12-17T00:00:00Z
date_updated: 2023-09-19T15:11:49Z
day: '17'
department:
- _id: ChLa
doi: 10.1109/cvpr.2018.00956
external_id:
  arxiv:
  - '1712.08087'
  isi:
  - '000457843609036'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.1712.08087
month: '12'
oa: 1
oa_version: Preprint
page: 9175-9184
publication: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
publication_identifier:
  eissn:
  - 2575-7075
  isbn:
  - '9781538664209'
publication_status: published
publisher: IEEE
quality_controlled: '1'
scopus_import: '1'
status: public
title: Learning intelligent dialogs for bounding box annotation
type: conference
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2018'
...
---
_id: '197'
abstract:
- lang: eng
  text: Modern computer vision systems heavily rely on statistical machine learning
    models, which typically require large amounts of labeled data to be learned reliably.
    Moreover, computer vision research has recently widely adopted techniques for
    representation learning, which further increase the demand for labeled data. However,
    for many important practical problems only a relatively small amount of labeled
    data is available, so it is problematic to leverage the full potential of representation
    learning methods. One way to overcome this obstacle is to invest substantial resources
    into producing large labeled datasets. Unfortunately, this can be prohibitively
    expensive in practice. In this thesis we focus on the alternative way of tackling
    the aforementioned issue. We concentrate on methods that make use of weakly-labeled
    or even unlabeled data. Specifically, the first half of the thesis is dedicated
    to the semantic image segmentation task. We develop a technique that achieves
    competitive segmentation performance and only requires annotations in the form of
    global image-level labels instead of dense segmentation masks. Subsequently, we
    present a new methodology, which further improves segmentation performance by
    leveraging a tiny amount of additional feedback from a human annotator. Using our
    methods, practitioners can greatly reduce the data annotation effort
    required to learn modern image segmentation models. In the second half of the
    thesis we focus on methods for learning from unlabeled visual data. We study a
    family of autoregressive models for modeling the structure of natural images and discuss
    potential applications of these models. Moreover, we conduct an in-depth study of
    one of these applications, where we develop the state-of-the-art model for the
    probabilistic image colorization task.
acknowledgement: I also gratefully acknowledge the support of NVIDIA Corporation with
  the donation of the GPUs used for this research.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
citation:
  ama: Kolesnikov A. Weakly-Supervised Segmentation and Unsupervised Modeling of Natural
    Images. 2018. doi:<a href="https://doi.org/10.15479/AT:ISTA:th_1021">10.15479/AT:ISTA:th_1021</a>
  apa: Kolesnikov, A. (2018). <i>Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images</i>. Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:th_1021">https://doi.org/10.15479/AT:ISTA:th_1021</a>
  chicago: Kolesnikov, Alexander. “Weakly-Supervised Segmentation and Unsupervised
    Modeling of Natural Images.” Institute of Science and Technology Austria, 2018.
    <a href="https://doi.org/10.15479/AT:ISTA:th_1021">https://doi.org/10.15479/AT:ISTA:th_1021</a>.
  ieee: A. Kolesnikov, “Weakly-Supervised Segmentation and Unsupervised Modeling of
    Natural Images,” Institute of Science and Technology Austria, 2018.
  ista: Kolesnikov A. 2018. Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images. Institute of Science and Technology Austria.
  mla: Kolesnikov, Alexander. <i>Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images</i>. Institute of Science and Technology Austria, 2018, doi:<a
    href="https://doi.org/10.15479/AT:ISTA:th_1021">10.15479/AT:ISTA:th_1021</a>.
  short: A. Kolesnikov, Weakly-Supervised Segmentation and Unsupervised Modeling of
    Natural Images, Institute of Science and Technology Austria, 2018.
date_created: 2018-12-11T11:45:09Z
date_published: 2018-05-25T00:00:00Z
date_updated: 2023-09-07T12:51:46Z
day: '25'
ddc:
- '004'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:th_1021
ec_funded: 1
file:
- access_level: open_access
  checksum: bc678e02468d8ebc39dc7267dfb0a1c4
  content_type: application/pdf
  creator: system
  date_created: 2018-12-12T10:14:57Z
  date_updated: 2020-07-14T12:45:22Z
  file_id: '5113'
  file_name: IST-2018-1021-v1+1_thesis-unsigned-pdfa.pdf
  file_size: 12918758
  relation: main_file
- access_level: closed
  checksum: bc66973b086da5a043f1162dcfb1fde4
  content_type: application/zip
  creator: dernst
  date_created: 2019-04-05T09:34:49Z
  date_updated: 2020-07-14T12:45:22Z
  file_id: '6225'
  file_name: 2018_Thesis_Kolesnikov_source.zip
  file_size: 55973760
  relation: source_file
file_date_updated: 2020-07-14T12:45:22Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '113'
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '7718'
pubrep_id: '1021'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2018'
...
---
_id: '68'
abstract:
- lang: eng
  text: The most common assumption made in statistical learning theory is the assumption
    of independent and identically distributed (i.i.d.) data. While being very
    convenient mathematically, it is often very clearly violated in practice. This
    disparity between machine learning theory and applications underlies a growing
    demand for the development of algorithms that learn from dependent data and theory
    that can provide generalization guarantees similar to the independent setting.
    This thesis is dedicated to two variants of dependencies that can arise in practice.
    One is dependence at the level of samples within a single learning task. Another
    dependency type arises in the multi-task setting when the tasks are dependent
    on each other even though the data for them can be i.i.d. In both cases we model
    the data (samples or tasks) as stochastic processes and introduce new algorithms
    for both settings that take into account and exploit the resulting dependencies.
    We prove the theoretical guarantees on the performance of the introduced algorithms
    under different evaluation criteria and, in addition, we complement the theoretical
    study with an empirical one, where we evaluate some of the algorithms on two
    real-world datasets to highlight their practical applicability.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Alexander
  full_name: Zimin, Alexander
  id: 37099E9C-F248-11E8-B48F-1D18A9856A87
  last_name: Zimin
citation:
  ama: Zimin A. Learning from dependent data. 2018. doi:<a href="https://doi.org/10.15479/AT:ISTA:TH1048">10.15479/AT:ISTA:TH1048</a>
  apa: Zimin, A. (2018). <i>Learning from dependent data</i>. Institute of Science
    and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:TH1048">https://doi.org/10.15479/AT:ISTA:TH1048</a>
  chicago: Zimin, Alexander. “Learning from Dependent Data.” Institute of Science
    and Technology Austria, 2018. <a href="https://doi.org/10.15479/AT:ISTA:TH1048">https://doi.org/10.15479/AT:ISTA:TH1048</a>.
  ieee: A. Zimin, “Learning from dependent data,” Institute of Science and Technology
    Austria, 2018.
  ista: Zimin A. 2018. Learning from dependent data. Institute of Science and Technology
    Austria.
  mla: Zimin, Alexander. <i>Learning from Dependent Data</i>. Institute of Science
    and Technology Austria, 2018, doi:<a href="https://doi.org/10.15479/AT:ISTA:TH1048">10.15479/AT:ISTA:TH1048</a>.
  short: A. Zimin, Learning from Dependent Data, Institute of Science and Technology
    Austria, 2018.
date_created: 2018-12-11T11:44:27Z
date_published: 2018-09-01T00:00:00Z
date_updated: 2023-09-07T12:29:07Z
day: '01'
ddc:
- '004'
- '519'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:TH1048
ec_funded: 1
file:
- access_level: open_access
  checksum: e849dd40a915e4d6c5572b51b517f098
  content_type: application/pdf
  creator: dernst
  date_created: 2019-04-09T07:32:47Z
  date_updated: 2020-07-14T12:47:40Z
  file_id: '6253'
  file_name: 2018_Thesis_Zimin.pdf
  file_size: 1036137
  relation: main_file
- access_level: closed
  checksum: da092153cec55c97461bd53c45c5d139
  content_type: application/zip
  creator: dernst
  date_created: 2019-04-09T07:32:47Z
  date_updated: 2020-07-14T12:47:40Z
  file_id: '6254'
  file_name: 2018_Thesis_Zimin_Source.zip
  file_size: 637490
  relation: source_file
file_date_updated: 2020-07-14T12:47:40Z
has_accepted_license: '1'
language:
- iso: eng
month: '09'
oa: 1
oa_version: Published Version
page: '92'
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '7986'
pubrep_id: '1048'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Learning from dependent data
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2018'
...
---
_id: '5584'
abstract:
- lang: eng
  text: "This package contains data for the publication \"Nonlinear decoding of a
    complex movie from the mammalian retina\" by Deny S. et al., PLOS Comput Biol (2018).
    \r\n\r\nThe data consists of\r\n(i) 91 spike-sorted, isolated rat retinal ganglion
    cells that pass stability and quality criteria, recorded on the multi-electrode
    array, in response to the presentation of a complex movie with many randomly
    moving dark discs. The responses are represented as a 648000 x 91 binary matrix,
    where the first index indicates the time bin of duration 12.5 ms, and the second
    index the neural identity. The matrix entry is 0/1 if the neuron didn't/did spike
    in the particular time bin.\r\n(ii) README file and a graphical illustration of
    the structure of the experiment, specifying how the 648000 time bins are split
    into epochs where 1, 2, 4, or 10 discs were displayed, and which stimulus segments
    are exact repeats or unique ball trajectories.\r\n(iii) a 648000 x 400 matrix
    of luminance traces for each of the 20 x 20 positions (\"sites\") in the movie
    frame, with time that is locked to the recorded raster. The luminance traces are
    produced as described in the manuscript by filtering the raw disc movie with a
    small Gaussian spatial kernel."
article_processing_charge: No
author:
- first_name: Stephane
  full_name: Deny, Stephane
  last_name: Deny
- first_name: Olivier
  full_name: Marre, Olivier
  last_name: Marre
- first_name: Vicente
  full_name: Botella-Soler, Vicente
  last_name: Botella-Soler
- first_name: Georg S
  full_name: Martius, Georg S
  id: 3A276B68-F248-11E8-B48F-1D18A9856A87
  last_name: Martius
- first_name: Gasper
  full_name: Tkacik, Gasper
  id: 3D494DCA-F248-11E8-B48F-1D18A9856A87
  last_name: Tkacik
  orcid: 0000-0002-6699-1455
citation:
  ama: Deny S, Marre O, Botella-Soler V, Martius GS, Tkačik G. Nonlinear decoding
    of a complex movie from the mammalian retina. 2018. doi:<a href="https://doi.org/10.15479/AT:ISTA:98">10.15479/AT:ISTA:98</a>
  apa: Deny, S., Marre, O., Botella-Soler, V., Martius, G. S., &#38; Tkačik, G. (2018).
    Nonlinear decoding of a complex movie from the mammalian retina. Institute of
    Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:98">https://doi.org/10.15479/AT:ISTA:98</a>
  chicago: Deny, Stephane, Olivier Marre, Vicente Botella-Soler, Georg S Martius,
    and Gašper Tkačik. “Nonlinear Decoding of a Complex Movie from the Mammalian Retina.”
    Institute of Science and Technology Austria, 2018. <a href="https://doi.org/10.15479/AT:ISTA:98">https://doi.org/10.15479/AT:ISTA:98</a>.
  ieee: S. Deny, O. Marre, V. Botella-Soler, G. S. Martius, and G. Tkačik, “Nonlinear
    decoding of a complex movie from the mammalian retina.” Institute of Science and
    Technology Austria, 2018.
  ista: Deny S, Marre O, Botella-Soler V, Martius GS, Tkačik G. 2018. Nonlinear decoding
    of a complex movie from the mammalian retina, Institute of Science and Technology
    Austria, <a href="https://doi.org/10.15479/AT:ISTA:98">10.15479/AT:ISTA:98</a>.
  mla: Deny, Stephane, et al. <i>Nonlinear Decoding of a Complex Movie from the Mammalian
    Retina</i>. Institute of Science and Technology Austria, 2018, doi:<a href="https://doi.org/10.15479/AT:ISTA:98">10.15479/AT:ISTA:98</a>.
  short: S. Deny, O. Marre, V. Botella-Soler, G.S. Martius, G. Tkačik, (2018).
datarep_id: '98'
date_created: 2018-12-12T12:31:39Z
date_published: 2018-03-29T00:00:00Z
date_updated: 2024-02-21T13:45:26Z
day: '29'
ddc:
- '570'
department:
- _id: ChLa
- _id: GaTk
doi: 10.15479/AT:ISTA:98
file:
- access_level: open_access
  checksum: 6808748837b9afbbbabc2a356ca2b88a
  content_type: application/octet-stream
  creator: system
  date_created: 2018-12-12T13:02:24Z
  date_updated: 2020-07-14T12:47:07Z
  file_id: '5590'
  file_name: IST-2018-98-v1+1_BBalls_area2_tile2_20x20.mat
  file_size: 1142543971
  relation: main_file
- access_level: open_access
  checksum: d6d6cd07743038fe3a12352983fcf9dd
  content_type: application/pdf
  creator: system
  date_created: 2018-12-12T13:02:25Z
  date_updated: 2020-07-14T12:47:07Z
  file_id: '5591'
  file_name: IST-2018-98-v1+2_ExperimentStructure.pdf
  file_size: 702336
  relation: main_file
- access_level: open_access
  checksum: 0c9cfb4dab35bb3dc25a04395600b1c8
  content_type: application/octet-stream
  creator: system
  date_created: 2018-12-12T13:02:26Z
  date_updated: 2020-07-14T12:47:07Z
  file_id: '5592'
  file_name: IST-2018-98-v1+3_GoodLocations_area2_20x20.mat
  file_size: 432
  relation: main_file
- access_level: open_access
  checksum: 2a83b011012e21e934b4596285b1a183
  content_type: text/plain
  creator: system
  date_created: 2018-12-12T13:02:26Z
  date_updated: 2020-07-14T12:47:07Z
  file_id: '5593'
  file_name: IST-2018-98-v1+4_README.txt
  file_size: 986
  relation: main_file
file_date_updated: 2020-07-14T12:47:07Z
has_accepted_license: '1'
keyword:
- retina
- decoding
- regression
- neural networks
- complex stimulus
license: https://creativecommons.org/publicdomain/zero/1.0/
month: '03'
oa: 1
oa_version: Published Version
project:
- _id: 254D1A94-B435-11E9-9278-68D0E5697425
  call_identifier: FWF
  grant_number: P 25651-N26
  name: Sensitivity to higher-order statistics in natural scenes
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '292'
    relation: used_in_publication
    status: public
status: public
title: Nonlinear decoding of a complex movie from the mammalian retina
tmp:
  image: /images/cc_0.png
  legal_code_url: https://creativecommons.org/publicdomain/zero/1.0/legalcode
  name: Creative Commons Public Domain Dedication (CC0 1.0)
  short: CC0 (1.0)
type: research_data
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2018'
...
---
_id: '563'
abstract:
- lang: eng
  text: "In continuous populations with local migration, nearby pairs of individuals
    have on average more similar genotypes than geographically well separated pairs.
    A barrier to gene flow distorts this classical pattern of isolation by distance.
    Genetic similarity is decreased for sample pairs on different sides of the barrier
    and increased for pairs on the same side near the barrier. Here, we introduce
    an inference scheme that utilizes this signal to detect and estimate the strength
    of a linear barrier to gene flow in two dimensions. We use a diffusion approximation
    to model the effects of a barrier on the geographical spread of ancestry backwards
    in time. This approach allows us to calculate the chance of recent coalescence
    and probability of identity by descent. We introduce an inference scheme that
    fits these theoretical results to the geographical covariance structure of biallelic
    genetic markers. It can estimate the strength of the barrier as well as several
    demographic parameters. We investigate the power of our inference scheme to detect
    barriers by applying it to a wide range of simulated data. We also showcase an
    example application to an Antirrhinum majus (snapdragon) flower color hybrid zone,
    where we do not detect any signal of a strong genome-wide barrier to gene flow."
article_processing_charge: No
author:
- first_name: Harald
  full_name: Ringbauer, Harald
  id: 417FCFF4-F248-11E8-B48F-1D18A9856A87
  last_name: Ringbauer
  orcid: 0000-0002-4884-9682
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
- first_name: David
  full_name: Field, David
  last_name: Field
- first_name: Nicholas H
  full_name: Barton, Nicholas H
  id: 4880FE40-F248-11E8-B48F-1D18A9856A87
  last_name: Barton
  orcid: 0000-0002-8548-5240
citation:
  ama: Ringbauer H, Kolesnikov A, Field D, Barton NH. Estimating barriers to gene
    flow from distorted isolation-by-distance patterns. <i>Genetics</i>. 2018;208(3):1231-1245.
    doi:<a href="https://doi.org/10.1534/genetics.117.300638">10.1534/genetics.117.300638</a>
  apa: Ringbauer, H., Kolesnikov, A., Field, D., &#38; Barton, N. H. (2018). Estimating
    barriers to gene flow from distorted isolation-by-distance patterns. <i>Genetics</i>.
    Genetics Society of America. <a href="https://doi.org/10.1534/genetics.117.300638">https://doi.org/10.1534/genetics.117.300638</a>
  chicago: Ringbauer, Harald, Alexander Kolesnikov, David Field, and Nicholas H Barton.
    “Estimating Barriers to Gene Flow from Distorted Isolation-by-Distance Patterns.”
    <i>Genetics</i>. Genetics Society of America, 2018. <a href="https://doi.org/10.1534/genetics.117.300638">https://doi.org/10.1534/genetics.117.300638</a>.
  ieee: H. Ringbauer, A. Kolesnikov, D. Field, and N. H. Barton, “Estimating barriers
    to gene flow from distorted isolation-by-distance patterns,” <i>Genetics</i>,
    vol. 208, no. 3. Genetics Society of America, pp. 1231–1245, 2018.
  ista: Ringbauer H, Kolesnikov A, Field D, Barton NH. 2018. Estimating barriers to
    gene flow from distorted isolation-by-distance patterns. Genetics. 208(3), 1231–1245.
  mla: Ringbauer, Harald, et al. “Estimating Barriers to Gene Flow from Distorted
    Isolation-by-Distance Patterns.” <i>Genetics</i>, vol. 208, no. 3, Genetics Society
    of America, 2018, pp. 1231–45, doi:<a href="https://doi.org/10.1534/genetics.117.300638">10.1534/genetics.117.300638</a>.
  short: H. Ringbauer, A. Kolesnikov, D. Field, N.H. Barton, Genetics 208 (2018) 1231–1245.
date_created: 2018-12-11T11:47:12Z
date_published: 2018-03-01T00:00:00Z
date_updated: 2023-09-11T13:42:38Z
day: '01'
department:
- _id: NiBa
- _id: ChLa
doi: 10.1534/genetics.117.300638
external_id:
  isi:
  - '000426219600025'
intvolume: '208'
isi: 1
issue: '3'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.biorxiv.org/content/10.1101/205484v1
month: '03'
oa: 1
oa_version: Preprint
page: 1231-1245
publication: Genetics
publication_status: published
publisher: Genetics Society of America
publist_id: '7251'
quality_controlled: '1'
related_material:
  record:
  - id: '200'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: Estimating barriers to gene flow from distorted isolation-by-distance patterns
type: journal_article
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
volume: 208
year: '2018'
...
---
_id: '6011'
abstract:
- lang: eng
  text: 'We establish a data-dependent notion of algorithmic stability for Stochastic
    Gradient Descent (SGD), and employ it to develop novel generalization bounds.
    This is in contrast to previous distribution-free algorithmic stability results
    for SGD which depend on the worst-case constants. By virtue of the data-dependent
    argument, our bounds provide new insights into learning with SGD on convex and
    non-convex problems. In the convex case, we show that the bound on the generalization
    error depends on the risk at the initialization point. In the non-convex case,
    we prove that the expected curvature of the objective function around the initialization
    point has crucial influence on the generalization error. In both cases, our results
    suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization.
    As a corollary, our results allow us to show optimistic generalization bounds
    that exhibit fast convergence rates for SGD subject to a vanishing empirical risk
    and low noise of stochastic gradient. '
article_processing_charge: No
arxiv: 1
author:
- first_name: Ilja
  full_name: Kuzborskij, Ilja
  last_name: Kuzborskij
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Kuzborskij I, Lampert C. Data-dependent stability of stochastic gradient descent.
    In: <i>Proceedings of the 35th International Conference on Machine Learning</i>.
    Vol 80. ML Research Press; 2018:2815-2824.'
  apa: 'Kuzborskij, I., &#38; Lampert, C. (2018). Data-dependent stability of stochastic
    gradient descent. In <i>Proceedings of the 35th International Conference on Machine
    Learning</i> (Vol. 80, pp. 2815–2824). Stockholm, Sweden: ML Research Press.'
  chicago: Kuzborskij, Ilja, and Christoph Lampert. “Data-Dependent Stability of Stochastic
    Gradient Descent.” In <i>Proceedings of the 35th International Conference on
    Machine Learning</i>, 80:2815–24. ML Research Press, 2018.
  ieee: I. Kuzborskij and C. Lampert, “Data-dependent stability of stochastic gradient
    descent,” in <i>Proceedings of the 35th International Conference on Machine Learning</i>,
    Stockholm, Sweden, 2018, vol. 80, pp. 2815–2824.
  ista: 'Kuzborskij I, Lampert C. 2018. Data-dependent stability of stochastic gradient
    descent. Proceedings of the 35th International Conference on Machine Learning.
    ICML: International Conference on Machine Learning vol. 80, 2815–2824.'
  mla: Kuzborskij, Ilja, and Christoph Lampert. “Data-Dependent Stability of Stochastic
    Gradient Descent.” <i>Proceedings of the 35th International Conference on Machine
    Learning</i>, vol. 80, ML Research Press, 2018, pp. 2815–24.
  short: I. Kuzborskij, C. Lampert, in:, Proceedings of the 35th International Conference
    on Machine Learning, ML Research Press, 2018, pp. 2815–2824.
conference:
  end_date: 2018-07-15
  location: Stockholm, Sweden
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2018-07-10
date_created: 2019-02-14T14:51:57Z
date_published: 2018-02-01T00:00:00Z
date_updated: 2023-10-17T09:51:13Z
day: '01'
department:
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1703.01678'
  isi:
  - '000683379202095'
intvolume: '80'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1703.01678
month: '02'
oa: 1
oa_version: Preprint
page: 2815-2824
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication: Proceedings of the 35th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Data-dependent stability of stochastic gradient descent
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 80
year: '2018'
...
---
_id: '6012'
abstract:
- lang: eng
  text: We present an approach to identify concise equations from data using a shallow
    neural network approach. In contrast to ordinary black-box regression, this approach
    allows understanding functional relations and generalizing them from observed
    data to unseen parts of the parameter space. We show how to extend the class of
    learnable equations for a recently proposed equation learning network to include
    divisions, and we improve the learning and model selection strategy to be useful
    for challenging real-world data. For systems governed by analytical expressions,
    our method can in many cases identify the true underlying equation and extrapolate
    to unseen domains. We demonstrate its effectiveness by experiments on a cart-pendulum
    system, where only 2 random rollouts are required to learn the forward dynamics
    and successfully achieve the swing-up task.
article_processing_charge: No
arxiv: 1
author:
- first_name: Subham
  full_name: Sahoo, Subham
  last_name: Sahoo
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Georg S
  full_name: Martius, Georg S
  id: 3A276B68-F248-11E8-B48F-1D18A9856A87
  last_name: Martius
citation:
  ama: 'Sahoo S, Lampert C, Martius GS. Learning equations for extrapolation and control.
    In: <i>Proceedings of the 35th International Conference on Machine Learning</i>.
    Vol 80. ML Research Press; 2018:4442-4450.'
  apa: 'Sahoo, S., Lampert, C., &#38; Martius, G. S. (2018). Learning equations for
    extrapolation and control. In <i>Proceedings of the 35th International Conference
    on Machine Learning</i> (Vol. 80, pp. 4442–4450). Stockholm, Sweden: ML Research
    Press.'
  chicago: Sahoo, Subham, Christoph Lampert, and Georg S Martius. “Learning Equations
    for Extrapolation and Control.” In <i>Proceedings of the 35th International Conference
    on Machine Learning</i>, 80:4442–50. ML Research Press, 2018.
  ieee: S. Sahoo, C. Lampert, and G. S. Martius, “Learning equations for extrapolation
    and control,” in <i>Proceedings of the 35th International Conference on Machine
    Learning</i>, Stockholm, Sweden, 2018, vol. 80, pp. 4442–4450.
  ista: 'Sahoo S, Lampert C, Martius GS. 2018. Learning equations for extrapolation
    and control. Proceedings of the 35th International Conference on Machine Learning.
    ICML: International Conference on Machine Learning vol. 80, 4442–4450.'
  mla: Sahoo, Subham, et al. “Learning Equations for Extrapolation and Control.” <i>Proceedings
    of the 35th International Conference on Machine Learning</i>, vol. 80, ML Research
    Press, 2018, pp. 4442–50.
  short: S. Sahoo, C. Lampert, G.S. Martius, in:, Proceedings of the 35th International
    Conference on Machine Learning, ML Research Press, 2018, pp. 4442–4450.
conference:
  end_date: 2018-07-15
  location: Stockholm, Sweden
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2018-07-10
date_created: 2019-02-14T15:21:07Z
date_published: 2018-02-01T00:00:00Z
date_updated: 2023-10-17T09:50:53Z
day: '01'
department:
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1806.07259'
  isi:
  - '000683379204058'
intvolume: '80'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1806.07259
month: '02'
oa: 1
oa_version: Preprint
page: 4442-4450
project:
- _id: 25681D80-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '291734'
  name: International IST Postdoc Fellowship Programme
publication: Proceedings of the 35th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
related_material:
  link:
  - description: News on IST Homepage
    relation: press_release
    url: https://ist.ac.at/en/news/first-machine-learning-method-capable-of-accurate-extrapolation/
scopus_import: '1'
status: public
title: Learning equations for extrapolation and control
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 80
year: '2018'
...
---
_id: '6589'
abstract:
- lang: eng
  text: Distributed training of massive machine learning models, in particular deep
    neural networks, via Stochastic Gradient Descent (SGD) is becoming commonplace.
    Several families of communication-reduction methods, such as quantization, large-batch
    methods, and gradient sparsification, have been proposed. To date, gradient sparsification
    methods--where each node sorts gradients by magnitude, and only communicates a
    subset of the components, accumulating the rest locally--are known to yield some
    of the largest practical gains. Such methods can reduce the amount of communication
    per step by up to three orders of magnitude, while preserving model accuracy.
    Yet, this family of methods currently has no theoretical justification. This is
    the question we address in this paper. We prove that, under analytic assumptions,
    sparsifying gradients by magnitude with local error correction provides convergence
    guarantees, for both convex and non-convex smooth objectives, for data-parallel
    SGD. The main insight is that sparsification methods implicitly maintain bounds
    on the maximum impact of stale updates, thanks to selection by magnitude. Our
    analysis and empirical validation also reveal that these methods do require analytical
    conditions to converge well, justifying existing heuristics.
article_processing_charge: No
arxiv: 1
author:
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
- first_name: Torsten
  full_name: Hoefler, Torsten
  last_name: Hoefler
- first_name: Mikael
  full_name: Johansson, Mikael
  last_name: Johansson
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
- first_name: Sarit
  full_name: Khirirat, Sarit
  last_name: Khirirat
- first_name: Cedric
  full_name: Renggli, Cedric
  last_name: Renggli
citation:
  ama: 'Alistarh D-A, Hoefler T, Johansson M, Konstantinov NH, Khirirat S, Renggli
    C. The convergence of sparsified gradient methods. In: <i>Advances in Neural Information
    Processing Systems 31</i>. Vol 2018. Neural Information Processing Systems
    Foundation; 2018:5973-5983.'
  apa: 'Alistarh, D.-A., Hoefler, T., Johansson, M., Konstantinov, N. H., Khirirat,
    S., &#38; Renggli, C. (2018). The convergence of sparsified gradient methods.
    In <i>Advances in Neural Information Processing Systems 31</i> (Vol. 2018,
    pp. 5973–5983). Montreal, Canada: Neural Information Processing Systems Foundation.'
  chicago: Alistarh, Dan-Adrian, Torsten Hoefler, Mikael Johansson, Nikola H Konstantinov,
    Sarit Khirirat, and Cedric Renggli. “The Convergence of Sparsified Gradient Methods.”
    In <i>Advances in Neural Information Processing Systems 31</i>, 2018:5973–83.
    Neural Information Processing Systems Foundation, 2018.
  ieee: D.-A. Alistarh, T. Hoefler, M. Johansson, N. H. Konstantinov, S. Khirirat,
    and C. Renggli, “The convergence of sparsified gradient methods,” in <i>Advances
    in Neural Information Processing Systems 31</i>, Montreal, Canada, 2018, vol.
    2018, pp. 5973–5983.
  ista: 'Alistarh D-A, Hoefler T, Johansson M, Konstantinov NH, Khirirat S, Renggli
    C. 2018. The convergence of sparsified gradient methods. Advances in Neural Information
    Processing Systems 31. NeurIPS: Conference on Neural Information Processing Systems
    vol. 2018, 5973–5983.'
  mla: Alistarh, Dan-Adrian, et al. “The Convergence of Sparsified Gradient Methods.”
    <i>Advances in Neural Information Processing Systems 31</i>, vol. 2018,
    Neural Information Processing Systems Foundation, 2018, pp. 5973–83.
  short: D.-A. Alistarh, T. Hoefler, M. Johansson, N.H. Konstantinov, S. Khirirat,
    C. Renggli, in:, Advances in Neural Information Processing Systems 31, Neural
    Information Processing Systems Foundation, 2018, pp. 5973–5983.
conference:
  end_date: 2018-12-08
  location: Montreal, Canada
  name: 'NeurIPS: Conference on Neural Information Processing Systems'
  start_date: 2018-12-02
date_created: 2019-06-27T09:32:55Z
date_published: 2018-12-01T00:00:00Z
date_updated: 2023-10-17T11:47:20Z
day: '01'
department:
- _id: DaAl
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1809.10505'
  isi:
  - '000461852000047'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1809.10505
month: '12'
oa: 1
oa_version: Preprint
page: 5973-5983
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
publication: Advances in Neural Information Processing Systems 31
publication_status: published
publisher: Neural Information Processing Systems Foundation
quality_controlled: '1'
scopus_import: '1'
status: public
title: The convergence of sparsified gradient methods
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: '2018'
year: '2018'
...
---
_id: '1108'
abstract:
- lang: eng
  text: In this work we study the learnability of stochastic processes with respect
    to the conditional risk, i.e. the existence of a learning algorithm that improves
    its next-step performance with the amount of observed data. We introduce a notion
    of pairwise discrepancy between conditional distributions at different time steps
    and show how certain properties of these discrepancies can be used to construct
    a successful learning algorithm. Our main results are two theorems that establish
    criteria for learnability for many classes of stochastic processes, including
    all special cases studied previously in the literature.
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Alexander
  full_name: Zimin, Alexander
  id: 37099E9C-F248-11E8-B48F-1D18A9856A87
  last_name: Zimin
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Zimin A, Lampert C. Learning theory for conditional risk minimization. In:
    Vol 54. ML Research Press; 2017:213-222.'
  apa: 'Zimin, A., &#38; Lampert, C. (2017). Learning theory for conditional risk
    minimization (Vol. 54, pp. 213–222). Presented at the AISTATS: Artificial Intelligence
    and Statistics, Fort Lauderdale, FL, United States: ML Research Press.'
  chicago: Zimin, Alexander, and Christoph Lampert. “Learning Theory for Conditional
    Risk Minimization,” 54:213–22. ML Research Press, 2017.
  ieee: 'A. Zimin and C. Lampert, “Learning theory for conditional risk minimization,”
    presented at the AISTATS: Artificial Intelligence and Statistics, Fort Lauderdale,
    FL, United States, 2017, vol. 54, pp. 213–222.'
  ista: 'Zimin A, Lampert C. 2017. Learning theory for conditional risk minimization.
    AISTATS: Artificial Intelligence and Statistics, PMLR, vol. 54, 213–222.'
  mla: Zimin, Alexander, and Christoph Lampert. <i>Learning Theory for Conditional
    Risk Minimization</i>. Vol. 54, ML Research Press, 2017, pp. 213–22.
  short: A. Zimin, C. Lampert, in:, ML Research Press, 2017, pp. 213–222.
conference:
  end_date: 2017-04-22
  location: Fort Lauderdale, FL, United States
  name: 'AISTATS: Artificial Intelligence and Statistics'
  start_date: 2017-04-20
date_created: 2018-12-11T11:50:11Z
date_published: 2017-04-01T00:00:00Z
date_updated: 2023-10-17T10:01:12Z
day: '01'
department:
- _id: ChLa
ec_funded: 1
external_id:
  isi:
  - '000509368500024'
intvolume: '54'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: http://proceedings.mlr.press/v54/zimin17a/zimin17a.pdf
month: '04'
oa: 1
oa_version: Submitted Version
page: 213-222
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_status: published
publisher: ML Research Press
publist_id: '6261'
quality_controlled: '1'
status: public
title: Learning theory for conditional risk minimization
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 54
year: '2017'
...
---
_id: '6841'
abstract:
- lang: eng
  text: In classical machine learning, regression is treated as a black box process
    of identifying a suitable function from a hypothesis set without attempting to
    gain insight into the mechanism connecting inputs and outputs. In the natural
    sciences, however, finding an interpretable function for a phenomenon is the prime
    goal, as it allows one to understand and generalize results. This paper proposes a
    novel type of function learning network, called equation learner (EQL), that can
    learn analytical expressions and is able to extrapolate to unseen domains. It
    is implemented as an end-to-end differentiable feed-forward network and allows
    for efficient gradient based training. Due to sparsity regularization concise
    interpretable expressions can be obtained. Often the true underlying source expression
    is identified.
arxiv: 1
author:
- first_name: Georg S
  full_name: Martius, Georg S
  id: 3A276B68-F248-11E8-B48F-1D18A9856A87
  last_name: Martius
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Martius GS, Lampert C. Extrapolation and learning equations. In: <i>5th International
    Conference on Learning Representations, ICLR 2017 - Workshop Track Proceedings</i>.
    International Conference on Learning Representations; 2017.'
  apa: 'Martius, G. S., &#38; Lampert, C. (2017). Extrapolation and learning equations.
    In <i>5th International Conference on Learning Representations, ICLR 2017 - Workshop
    Track Proceedings</i>. Toulon, France: International Conference on Learning Representations.'
  chicago: Martius, Georg S, and Christoph Lampert. “Extrapolation and Learning Equations.”
    In <i>5th International Conference on Learning Representations, ICLR 2017 - Workshop
    Track Proceedings</i>. International Conference on Learning Representations, 2017.
  ieee: G. S. Martius and C. Lampert, “Extrapolation and learning equations,” in <i>5th
    International Conference on Learning Representations, ICLR 2017 - Workshop Track
    Proceedings</i>, Toulon, France, 2017.
  ista: 'Martius GS, Lampert C. 2017. Extrapolation and learning equations. 5th International
    Conference on Learning Representations, ICLR 2017 - Workshop Track Proceedings.
    ICLR: International Conference on Learning Representations.'
  mla: Martius, Georg S., and Christoph Lampert. “Extrapolation and Learning Equations.”
    <i>5th International Conference on Learning Representations, ICLR 2017 - Workshop
    Track Proceedings</i>, International Conference on Learning Representations, 2017.
  short: G.S. Martius, C. Lampert, in:, 5th International Conference on Learning Representations,
    ICLR 2017 - Workshop Track Proceedings, International Conference on Learning Representations,
    2017.
conference:
  end_date: 2017-04-26
  location: Toulon, France
  name: 'ICLR: International Conference on Learning Representations'
  start_date: 2017-04-24
date_created: 2019-09-01T22:01:00Z
date_published: 2017-02-21T00:00:00Z
date_updated: 2021-01-12T08:09:17Z
day: '21'
department:
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1610.02995'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1610.02995
month: '02'
oa: 1
oa_version: Preprint
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication: 5th International Conference on Learning Representations, ICLR 2017 -
  Workshop Track Proceedings
publication_status: published
publisher: International Conference on Learning Representations
quality_controlled: '1'
scopus_import: 1
status: public
title: Extrapolation and learning equations
type: conference
user_id: 3E5EF7F0-F248-11E8-B48F-1D18A9856A87
year: '2017'
...
---
_id: '750'
abstract:
- lang: eng
  text: Modern communication technologies allow first responders to contact thousands
    of potential volunteers simultaneously for support during a crisis or disaster
    event. However, such volunteer efforts must be well coordinated and monitored,
    in order to offer an effective relief to the professionals. In this paper we extend
    earlier work on optimally assigning volunteers to selected landmark locations.
    In particular, we emphasize the aspect that obtaining good assignments requires
    not only advanced computational tools, but also a realistic measure of distance
    between volunteers and landmarks. Specifically, we propose the use of the Open
    Street Map (OSM) driving distance instead of the previously used flight distance.
    We find the OSM driving distance to be better aligned with the interests of volunteers
    and first responders. Furthermore, we show that relying on the flying distance
    leads to a substantial underestimation of the number of required volunteers, causing
    negative side effects in case of an actual crisis situation.
author:
- first_name: Jasmin
  full_name: Pielorz, Jasmin
  id: 49BC895A-F248-11E8-B48F-1D18A9856A87
  last_name: Pielorz
- first_name: Matthias
  full_name: Prandtstetter, Matthias
  last_name: Prandtstetter
- first_name: Markus
  full_name: Straub, Markus
  last_name: Straub
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Pielorz J, Prandtstetter M, Straub M, Lampert C. Optimal geospatial volunteer
    allocation needs realistic distances. In: <i>2017 IEEE International Conference
    on Big Data</i>. IEEE; 2017:3760-3763. doi:<a href="https://doi.org/10.1109/BigData.2017.8258375">10.1109/BigData.2017.8258375</a>'
  apa: 'Pielorz, J., Prandtstetter, M., Straub, M., &#38; Lampert, C. (2017). Optimal
    geospatial volunteer allocation needs realistic distances. In <i>2017 IEEE International
    Conference on Big Data</i> (pp. 3760–3763). Boston, MA, United States: IEEE. <a
    href="https://doi.org/10.1109/BigData.2017.8258375">https://doi.org/10.1109/BigData.2017.8258375</a>'
  chicago: Pielorz, Jasmin, Matthias Prandtstetter, Markus Straub, and Christoph Lampert.
    “Optimal Geospatial Volunteer Allocation Needs Realistic Distances.” In <i>2017
    IEEE International Conference on Big Data</i>, 3760–63. IEEE, 2017. <a href="https://doi.org/10.1109/BigData.2017.8258375">https://doi.org/10.1109/BigData.2017.8258375</a>.
  ieee: J. Pielorz, M. Prandtstetter, M. Straub, and C. Lampert, “Optimal geospatial
    volunteer allocation needs realistic distances,” in <i>2017 IEEE International
    Conference on Big Data</i>, Boston, MA, United States, 2017, pp. 3760–3763.
  ista: Pielorz J, Prandtstetter M, Straub M, Lampert C. 2017. Optimal geospatial
    volunteer allocation needs realistic distances. 2017 IEEE International Conference
    on Big Data. Big Data, 3760–3763.
  mla: Pielorz, Jasmin, et al. “Optimal Geospatial Volunteer Allocation Needs Realistic
    Distances.” <i>2017 IEEE International Conference on Big Data</i>, IEEE, 2017,
    pp. 3760–63, doi:<a href="https://doi.org/10.1109/BigData.2017.8258375">10.1109/BigData.2017.8258375</a>.
  short: J. Pielorz, M. Prandtstetter, M. Straub, C. Lampert, in:, 2017 IEEE International
    Conference on Big Data, IEEE, 2017, pp. 3760–3763.
conference:
  end_date: 2017-12-14
  location: Boston, MA, United States
  name: Big Data
  start_date: 2017-12-11
date_created: 2018-12-11T11:48:18Z
date_published: 2017-12-01T00:00:00Z
date_updated: 2021-01-12T08:13:55Z
day: '01'
department:
- _id: ChLa
doi: 10.1109/BigData.2017.8258375
language:
- iso: eng
month: '12'
oa_version: None
page: 3760-3763
publication: 2017 IEEE International Conference on Big Data
publication_identifier:
  isbn:
  - 978-153862714-3
publication_status: published
publisher: IEEE
publist_id: '6906'
quality_controlled: '1'
scopus_import: 1
status: public
title: Optimal geospatial volunteer allocation needs realistic distances
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2017'
...
---
_id: '652'
abstract:
- lang: eng
  text: 'We present an approach that enables robots to self-organize their sensorimotor
    behavior from scratch without providing specific information about either the
    robot or its environment. This is achieved by a simple neural control law that
    increases the consistency between external sensor dynamics and internal neural
    dynamics of the utterly simple controller. In this way, the embodiment and the
    agent-environment coupling are the only source of individual development. We show
    how an anthropomorphic tendon driven arm-shoulder system develops different behaviors
    depending on that coupling. For instance: Given a bottle half-filled with water,
    the arm starts to shake it, driven by the physical response of the water. When
    attaching a brush, the arm can be manipulated into wiping a table, and when connected
    to a revolvable wheel it finds out how to rotate it. Thus, the robot may be said
    to discover the affordances of the world. When allowing two (simulated) humanoid
    robots to interact physically, they engage in a joint behavior development leading
    to, for instance, spontaneous cooperation. More social effects are observed if
    the robots can visually perceive each other. Although, as an observer, it is tempting
    to attribute an apparent intentionality, there is nothing of the kind put in.
    As a conclusion, we argue that emergent behavior may be much less rooted in explicit
    intentions, internal motivations, or specific reward systems than is commonly
    believed.'
article_number: '7846789'
author:
- first_name: Ralf
  full_name: Der, Ralf
  last_name: Der
- first_name: Georg S
  full_name: Martius, Georg S
  id: 3A276B68-F248-11E8-B48F-1D18A9856A87
  last_name: Martius
citation:
  ama: 'Der R, Martius GS. Dynamical self consistency leads to behavioral development
    and emergent social interactions in robots. In: IEEE; 2017. doi:<a href="https://doi.org/10.1109/DEVLRN.2016.7846789">10.1109/DEVLRN.2016.7846789</a>'
  apa: 'Der, R., &#38; Martius, G. S. (2017). Dynamical self consistency leads to
    behavioral development and emergent social interactions in robots. Presented at
    the ICDL EpiRob: International Conference on Development and Learning and Epigenetic
    Robotics , Cergy-Pontoise, France: IEEE. <a href="https://doi.org/10.1109/DEVLRN.2016.7846789">https://doi.org/10.1109/DEVLRN.2016.7846789</a>'
  chicago: Der, Ralf, and Georg S Martius. “Dynamical Self Consistency Leads to Behavioral
    Development and Emergent Social Interactions in Robots.” IEEE, 2017. <a href="https://doi.org/10.1109/DEVLRN.2016.7846789">https://doi.org/10.1109/DEVLRN.2016.7846789</a>.
  ieee: 'R. Der and G. S. Martius, “Dynamical self consistency leads to behavioral
    development and emergent social interactions in robots,” presented at the ICDL
    EpiRob: International Conference on Development and Learning and Epigenetic Robotics
    , Cergy-Pontoise, France, 2017.'
  ista: 'Der R, Martius GS. 2017. Dynamical self consistency leads to behavioral development
    and emergent social interactions in robots. ICDL EpiRob: International Conference
    on Development and Learning and Epigenetic Robotics , 7846789.'
  mla: Der, Ralf, and Georg S. Martius. <i>Dynamical Self Consistency Leads to Behavioral
    Development and Emergent Social Interactions in Robots</i>. 7846789, IEEE, 2017,
    doi:<a href="https://doi.org/10.1109/DEVLRN.2016.7846789">10.1109/DEVLRN.2016.7846789</a>.
  short: R. Der, G.S. Martius, in:, IEEE, 2017.
conference:
  end_date: 2016-09-22
  location: Cergy-Pontoise, France
  name: 'ICDL EpiRob: International Conference on Development and Learning and Epigenetic
    Robotics '
  start_date: 2016-09-19
date_created: 2018-12-11T11:47:43Z
date_published: 2017-02-07T00:00:00Z
date_updated: 2021-01-12T08:07:51Z
day: '07'
department:
- _id: ChLa
- _id: GaTk
doi: 10.1109/DEVLRN.2016.7846789
language:
- iso: eng
month: '02'
oa_version: None
publication_identifier:
  isbn:
  - 978-150905069-7
publication_status: published
publisher: IEEE
publist_id: '7100'
quality_controlled: '1'
scopus_import: 1
status: public
title: Dynamical self consistency leads to behavioral development and emergent social
  interactions in robots
type: conference
user_id: 3E5EF7F0-F248-11E8-B48F-1D18A9856A87
year: '2017'
...
---
_id: '658'
abstract:
- lang: eng
  text: 'With the accelerated development of robot technologies, control becomes one
    of the central themes of research. In traditional approaches, the controller,
    by its internal functionality, finds appropriate actions on the basis of specific
    objectives for the task at hand. While such approaches are very successful in many
    applications, self-organized control schemes seem to be favored in large complex systems with unknown dynamics
    or which are difficult to model. Reasons are the expected scalability, robustness,
    and resilience of self-organizing systems. The paper presents a self-learning
    neurocontroller based on extrinsic differential plasticity introduced recently,
    applying it to an anthropomorphic musculoskeletal robot arm with attached objects
    of unknown physical dynamics. The central finding of the paper is the following
    effect: by the mere feedback through the internal dynamics of the object, the
    robot is learning to relate each of the objects with a very specific sensorimotor
    pattern. Specifically, an attached pendulum pilots the arm into a circular motion,
    a half-filled bottle produces axis oriented shaking behavior, a wheel is getting
    rotated, and wiping patterns emerge automatically in a table-plus-brush setting.
    By these object-specific dynamical patterns, the robot may be said to recognize
    the object''s identity, or in other words, it discovers dynamical affordances
    of objects. Furthermore, when including hand coordinates obtained from a camera,
    a dedicated hand-eye coordination self-organizes spontaneously. These phenomena
    are discussed from a specific dynamical system perspective. Central is the dedicated
    working regime at the border to instability with its potentially infinite reservoir
    of (limit cycle) attractors "waiting" to be excited. Besides converging
    toward one of these attractors, variate behavior is also arising from a self-induced
    attractor morphing driven by the learning rule. We claim that experimental investigations
    with this anthropomorphic, self-learning robot not only generate interesting and
    potentially useful behaviors, but may also help to better understand what subjective
    human muscle feelings are, how they can be rooted in sensorimotor patterns, and
    how these concepts may feed back on robotics.'
article_number: '00008'
article_processing_charge: Yes
author:
- first_name: Ralf
  full_name: Der, Ralf
  last_name: Der
- first_name: Georg S
  full_name: Martius, Georg S
  id: 3A276B68-F248-11E8-B48F-1D18A9856A87
  last_name: Martius
citation:
  ama: Der R, Martius GS. Self organized behavior generation for musculoskeletal robots.
    <i>Frontiers in Neurorobotics</i>. 2017;11(MAR). doi:<a href="https://doi.org/10.3389/fnbot.2017.00008">10.3389/fnbot.2017.00008</a>
  apa: Der, R., &#38; Martius, G. S. (2017). Self organized behavior generation for
    musculoskeletal robots. <i>Frontiers in Neurorobotics</i>. Frontiers Research
    Foundation. <a href="https://doi.org/10.3389/fnbot.2017.00008">https://doi.org/10.3389/fnbot.2017.00008</a>
  chicago: Der, Ralf, and Georg S Martius. “Self Organized Behavior Generation for
    Musculoskeletal Robots.” <i>Frontiers in Neurorobotics</i>. Frontiers Research
    Foundation, 2017. <a href="https://doi.org/10.3389/fnbot.2017.00008">https://doi.org/10.3389/fnbot.2017.00008</a>.
  ieee: R. Der and G. S. Martius, “Self organized behavior generation for musculoskeletal
    robots,” <i>Frontiers in Neurorobotics</i>, vol. 11, no. MAR. Frontiers Research
    Foundation, 2017.
  ista: Der R, Martius GS. 2017. Self organized behavior generation for musculoskeletal
    robots. Frontiers in Neurorobotics. 11(MAR), 00008.
  mla: Der, Ralf, and Georg S. Martius. “Self Organized Behavior Generation for Musculoskeletal
    Robots.” <i>Frontiers in Neurorobotics</i>, vol. 11, no. MAR, 00008, Frontiers
    Research Foundation, 2017, doi:<a href="https://doi.org/10.3389/fnbot.2017.00008">10.3389/fnbot.2017.00008</a>.
  short: R. Der, G.S. Martius, Frontiers in Neurorobotics 11 (2017).
date_created: 2018-12-11T11:47:45Z
date_published: 2017-03-16T00:00:00Z
date_updated: 2021-01-12T08:08:04Z
day: '16'
ddc:
- '006'
department:
- _id: ChLa
- _id: GaTk
doi: 10.3389/fnbot.2017.00008
ec_funded: 1
file:
- access_level: open_access
  checksum: b1bc43f96d1df3313c03032c2a46388d
  content_type: application/pdf
  creator: system
  date_created: 2018-12-12T10:18:49Z
  date_updated: 2020-07-14T12:47:33Z
  file_id: '5371'
  file_name: IST-2017-903-v1+1_fnbot-11-00008.pdf
  file_size: 8439566
  relation: main_file
file_date_updated: 2020-07-14T12:47:33Z
has_accepted_license: '1'
intvolume: '        11'
issue: MAR
language:
- iso: eng
month: '03'
oa: 1
oa_version: Published Version
project:
- _id: 25681D80-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '291734'
  name: International IST Postdoc Fellowship Programme
publication: Frontiers in Neurorobotics
publication_identifier:
  issn:
  - '16625218'
publication_status: published
publisher: Frontiers Research Foundation
publist_id: '7078'
pubrep_id: '903'
quality_controlled: '1'
scopus_import: 1
status: public
title: Self organized behavior generation for musculoskeletal robots
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2EBD1598-F248-11E8-B48F-1D18A9856A87
volume: 11
year: '2017'
...
---
_id: '911'
abstract:
- lang: eng
  text: We develop a probabilistic technique for colorizing grayscale natural images.
    In light of the intrinsic uncertainty of this task, the proposed probabilistic
    framework has numerous desirable properties. In particular, our model is able
    to produce multiple plausible and vivid colorizations for a given grayscale image
    and is one of the first colorization models to provide a proper stochastic sampling
    scheme. Moreover, our training procedure is supported by a rigorous theoretical
    framework that does not require any ad hoc heuristics and allows for efficient
    modeling and learning of the joint pixel color distribution. We demonstrate strong
    quantitative and qualitative experimental results on the CIFAR-10 dataset and
    the challenging ILSVRC 2012 dataset.
article_processing_charge: No
arxiv: 1
author:
- first_name: Amélie
  full_name: Royer, Amélie
  id: 3811D890-F248-11E8-B48F-1D18A9856A87
  last_name: Royer
  orcid: 0000-0002-8407-0705
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Royer A, Kolesnikov A, Lampert C. Probabilistic image colorization. In: BMVA
    Press; 2017:85.1-85.12. doi:<a href="https://doi.org/10.5244/c.31.85">10.5244/c.31.85</a>'
  apa: 'Royer, A., Kolesnikov, A., &#38; Lampert, C. (2017). Probabilistic image colorization
    (p. 85.1-85.12). Presented at the BMVC: British Machine Vision Conference, London,
    United Kingdom: BMVA Press. <a href="https://doi.org/10.5244/c.31.85">https://doi.org/10.5244/c.31.85</a>'
  chicago: Royer, Amélie, Alexander Kolesnikov, and Christoph Lampert. “Probabilistic
    Image Colorization,” 85.1-85.12. BMVA Press, 2017. <a href="https://doi.org/10.5244/c.31.85">https://doi.org/10.5244/c.31.85</a>.
  ieee: 'A. Royer, A. Kolesnikov, and C. Lampert, “Probabilistic image colorization,”
    presented at the BMVC: British Machine Vision Conference, London, United Kingdom,
    2017, p. 85.1-85.12.'
  ista: 'Royer A, Kolesnikov A, Lampert C. 2017. Probabilistic image colorization.
    BMVC: British Machine Vision Conference, 85.1-85.12.'
  mla: Royer, Amélie, et al. <i>Probabilistic Image Colorization</i>. BMVA Press,
    2017, p. 85.1-85.12, doi:<a href="https://doi.org/10.5244/c.31.85">10.5244/c.31.85</a>.
  short: A. Royer, A. Kolesnikov, C. Lampert, in:, BMVA Press, 2017, p. 85.1-85.12.
conference:
  end_date: 2017-09-07
  location: London, United Kingdom
  name: 'BMVC: British Machine Vision Conference'
  start_date: 2017-09-04
date_created: 2018-12-11T11:49:09Z
date_published: 2017-09-01T00:00:00Z
date_updated: 2023-10-16T10:04:02Z
day: '01'
ddc:
- '000'
department:
- _id: ChLa
doi: 10.5244/c.31.85
ec_funded: 1
external_id:
  arxiv:
  - '1705.04258'
file:
- access_level: open_access
  content_type: application/pdf
  creator: dernst
  date_created: 2020-08-10T07:14:33Z
  date_updated: 2020-08-10T07:14:33Z
  file_id: '8224'
  file_name: 2017_BMVC_Royer.pdf
  file_size: 1625363
  relation: main_file
  success: 1
file_date_updated: 2020-08-10T07:14:33Z
has_accepted_license: '1'
language:
- iso: eng
month: '09'
oa: 1
oa_version: Published Version
page: 85.1-85.12
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  eisbn:
  - 190172560X
publication_status: published
publisher: BMVA Press
publist_id: '6532'
quality_controlled: '1'
related_material:
  record:
  - id: '8390'
    relation: dissertation_contains
    status: public
scopus_import: '1'
status: public
title: Probabilistic image colorization
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2017'
...
---
_id: '1000'
abstract:
- lang: eng
  text: 'We study probabilistic models of natural images and extend the autoregressive
    family of PixelCNN models by incorporating latent variables. Subsequently, we
    describe two new generative image models that exploit different image transformations
    as latent variables: a quantized grayscale view of the image or a multi-resolution
    image pyramid. The proposed models tackle two known shortcomings of existing PixelCNN
    models: 1) their tendency to focus on low-level image details, while largely ignoring
    high-level image information, such as object shapes, and 2) their computationally
    costly procedure for image sampling. We experimentally demonstrate benefits of
    our LatentPixelCNN models, in particular showing that they produce much more
    realistic-looking image samples than previous state-of-the-art probabilistic models. '
acknowledgement: We thank Tim Salimans for spotting a mistake in our preliminary arXiv
  manuscript. This work was funded by the European Research Council under the European
  Unions Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 308036.
article_processing_charge: No
arxiv: 1
author:
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Kolesnikov A, Lampert C. PixelCNN models with auxiliary variables for natural
    image modeling. In: <i>34th International Conference on Machine Learning</i>.
    Vol 70. JMLR; 2017:1905-1914.'
  apa: 'Kolesnikov, A., &#38; Lampert, C. (2017). PixelCNN models with auxiliary variables
    for natural image modeling. In <i>34th International Conference on Machine Learning</i>
    (Vol. 70, pp. 1905–1914). Sydney, Australia: JMLR.'
  chicago: Kolesnikov, Alexander, and Christoph Lampert. “PixelCNN Models with Auxiliary
    Variables for Natural Image Modeling.” In <i>34th International Conference on
    Machine Learning</i>, 70:1905–14. JMLR, 2017.
  ieee: A. Kolesnikov and C. Lampert, “PixelCNN models with auxiliary variables for
    natural image modeling,” in <i>34th International Conference on Machine Learning</i>,
    Sydney, Australia, 2017, vol. 70, pp. 1905–1914.
  ista: 'Kolesnikov A, Lampert C. 2017. PixelCNN models with auxiliary variables for
    natural image modeling. 34th International Conference on Machine Learning. ICML:
    International Conference on Machine Learning vol. 70, 1905–1914.'
  mla: Kolesnikov, Alexander, and Christoph Lampert. “PixelCNN Models with Auxiliary
    Variables for Natural Image Modeling.” <i>34th International Conference on Machine
    Learning</i>, vol. 70, JMLR, 2017, pp. 1905–14.
  short: A. Kolesnikov, C. Lampert, in:, 34th International Conference on Machine
    Learning, JMLR, 2017, pp. 1905–1914.
conference:
  end_date: 2017-08-11
  location: Sydney, Australia
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2017-08-06
date_created: 2018-12-11T11:49:37Z
date_published: 2017-08-01T00:00:00Z
date_updated: 2023-09-22T09:50:41Z
day: '01'
department:
- _id: ChLa
ec_funded: 1
external_id:
  arxiv:
  - '1612.08185'
  isi:
  - '000683309501102'
has_accepted_license: '1'
intvolume: '        70'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1612.08185
month: '08'
oa: 1
oa_version: Submitted Version
page: 1905 - 1914
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication: 34th International Conference on Machine Learning
publication_identifier:
  isbn:
  - 978-151085514-4
publication_status: published
publisher: JMLR
publist_id: '6398'
quality_controlled: '1'
scopus_import: '1'
status: public
title: PixelCNN models with auxiliary variables for natural image modeling
type: conference
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
volume: 70
year: '2017'
...
---
_id: '998'
abstract:
- lang: eng
  text: 'A major open problem on the road to artificial intelligence is the development
    of incrementally learning systems that learn about more and more concepts over
    time from a stream of data. In this work, we introduce a new training strategy,
    iCaRL, that allows learning in such a class-incremental way: only the training
    data for a small number of classes has to be present at the same time and new
    classes can be added progressively. iCaRL learns strong classifiers and a data
    representation simultaneously. This distinguishes it from earlier works that were
    fundamentally limited to fixed data representations and therefore incompatible
    with deep learning architectures. We show by experiments on CIFAR-100 and ImageNet
    ILSVRC 2012 data that iCaRL can learn many classes incrementally over a long period
    of time where other strategies quickly fail. '
article_processing_charge: No
author:
- first_name: Sylvestre Alvise
  full_name: Rebuffi, Sylvestre Alvise
  last_name: Rebuffi
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
- first_name: Georg
  full_name: Sperl, Georg
  id: 4DD40360-F248-11E8-B48F-1D18A9856A87
  last_name: Sperl
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Rebuffi SA, Kolesnikov A, Sperl G, Lampert C. iCaRL: Incremental classifier
    and representation learning. In: Vol 2017. IEEE; 2017:5533-5542. doi:<a href="https://doi.org/10.1109/CVPR.2017.587">10.1109/CVPR.2017.587</a>'
  apa: 'Rebuffi, S. A., Kolesnikov, A., Sperl, G., &#38; Lampert, C. (2017). iCaRL:
    Incremental classifier and representation learning (Vol. 2017, pp. 5533–5542).
    Presented at the CVPR: Computer Vision and Pattern Recognition, Honolulu, HA,
    United States: IEEE. <a href="https://doi.org/10.1109/CVPR.2017.587">https://doi.org/10.1109/CVPR.2017.587</a>'
  chicago: 'Rebuffi, Sylvestre Alvise, Alexander Kolesnikov, Georg Sperl, and Christoph
    Lampert. “ICaRL: Incremental Classifier and Representation Learning,” 2017:5533–42.
    IEEE, 2017. <a href="https://doi.org/10.1109/CVPR.2017.587">https://doi.org/10.1109/CVPR.2017.587</a>.'
  ieee: 'S. A. Rebuffi, A. Kolesnikov, G. Sperl, and C. Lampert, “iCaRL: Incremental
    classifier and representation learning,” presented at the CVPR: Computer Vision
    and Pattern Recognition, Honolulu, HA, United States, 2017, vol. 2017, pp. 5533–5542.'
  ista: 'Rebuffi SA, Kolesnikov A, Sperl G, Lampert C. 2017. iCaRL: Incremental classifier
    and representation learning. CVPR: Computer Vision and Pattern Recognition vol.
    2017, 5533–5542.'
  mla: 'Rebuffi, Sylvestre Alvise, et al. <i>ICaRL: Incremental Classifier and Representation
    Learning</i>. Vol. 2017, IEEE, 2017, pp. 5533–42, doi:<a href="https://doi.org/10.1109/CVPR.2017.587">10.1109/CVPR.2017.587</a>.'
  short: S.A. Rebuffi, A. Kolesnikov, G. Sperl, C. Lampert, in:, IEEE, 2017, pp. 5533–5542.
conference:
  end_date: 2017-07-26
  location: Honolulu, HA, United States
  name: 'CVPR: Computer Vision and Pattern Recognition'
  start_date: 2017-07-21
date_created: 2018-12-11T11:49:37Z
date_published: 2017-04-14T00:00:00Z
date_updated: 2023-09-22T09:51:58Z
day: '14'
department:
- _id: ChLa
- _id: ChWo
doi: 10.1109/CVPR.2017.587
ec_funded: 1
external_id:
  isi:
  - '000418371405066'
intvolume: '      2017'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1611.07725
month: '04'
oa: 1
oa_version: Submitted Version
page: 5533 - 5542
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  isbn:
  - 978-153860457-1
publication_status: published
publisher: IEEE
publist_id: '6400'
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'iCaRL: Incremental classifier and representation learning'
type: conference
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
volume: 2017
year: '2017'
...
---
_id: '999'
abstract:
- lang: eng
  text: 'In multi-task learning, a learner is given a collection of prediction tasks
    and needs to solve all of them. In contrast to previous work, which required that
    annotated training data must be available for all tasks, we consider a new setting,
    in which for some tasks, potentially most of them, only unlabeled training data
    is provided. Consequently, to solve all tasks, information must be transferred
    between tasks with labels and tasks without labels. Focusing on an instance-based
    transfer method we analyze two variants of this setting: when the set of labeled
    tasks is fixed, and when it can be actively selected by the learner. We state
    and prove a generalization bound that covers both scenarios and derive from it
    an algorithm for making the choice of labeled tasks (in the active case) and for
    transferring information between the tasks in a principled way. We also illustrate
    the effectiveness of the algorithm on synthetic and real data. '
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Anastasia
  full_name: Pentina, Anastasia
  id: 42E87FC6-F248-11E8-B48F-1D18A9856A87
  last_name: Pentina
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
citation:
  ama: 'Pentina A, Lampert C. Multi-task learning with labeled and unlabeled tasks.
    In: Vol 70. ML Research Press; 2017:2807-2816.'
  apa: 'Pentina, A., &#38; Lampert, C. (2017). Multi-task learning with labeled and
    unlabeled tasks (Vol. 70, pp. 2807–2816). Presented at the ICML: International
    Conference on Machine Learning, Sydney, Australia: ML Research Press.'
  chicago: Pentina, Anastasia, and Christoph Lampert. “Multi-Task Learning with Labeled
    and Unlabeled Tasks,” 70:2807–16. ML Research Press, 2017.
  ieee: 'A. Pentina and C. Lampert, “Multi-task learning with labeled and unlabeled
    tasks,” presented at the ICML: International Conference on Machine Learning, Sydney,
    Australia, 2017, vol. 70, pp. 2807–2816.'
  ista: 'Pentina A, Lampert C. 2017. Multi-task learning with labeled and unlabeled
    tasks. ICML: International Conference on Machine Learning, PMLR, vol. 70, 2807–2816.'
  mla: Pentina, Anastasia, and Christoph Lampert. <i>Multi-Task Learning with Labeled
    and Unlabeled Tasks</i>. Vol. 70, ML Research Press, 2017, pp. 2807–16.
  short: A. Pentina, C. Lampert, in:, ML Research Press, 2017, pp. 2807–2816.
conference:
  end_date: 2017-08-11
  location: Sydney, Australia
  name: 'ICML: International Conference on Machine Learning'
  start_date: 2017-08-06
date_created: 2018-12-11T11:49:37Z
date_published: 2017-06-08T00:00:00Z
date_updated: 2023-10-17T11:53:32Z
day: '08'
department:
- _id: ChLa
ec_funded: 1
external_id:
  isi:
  - '000683309502093'
intvolume: '        70'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/1602.06518
month: '06'
oa: 1
oa_version: Submitted Version
page: 2807 - 2816
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  isbn:
  - '9781510855144'
publication_status: published
publisher: ML Research Press
publist_id: '6399'
quality_controlled: '1'
scopus_import: '1'
status: public
title: Multi-task learning with labeled and unlabeled tasks
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 70
year: '2017'
...
