---
_id: '13074'
abstract:
- lang: eng
  text: "Deep learning has become an integral part of a large number of important
    applications, and many of the recent breakthroughs have been enabled by the ability
    to train very large models, capable of capturing complex patterns and relationships
    from the data. At the same time, the massive sizes of modern deep learning models
    have made their deployment to smaller devices more challenging; this is particularly
    important, as in many applications the users rely on accurate deep learning predictions,
    but they only have access to devices with limited memory and compute power. One
    solution to this problem is to prune neural networks, by setting as many of their
    parameters as possible to zero, to obtain accurate sparse models with a lower memory
    footprint. Despite the great research progress in obtaining sparse models that
    preserve accuracy, while satisfying memory and computational constraints, there
    are still many challenges associated with efficiently training sparse models,
    as well as understanding their generalization properties.\r\n\r\nThe focus of
    this thesis is to investigate how the training process of sparse models can be
    made more efficient, and to understand the differences between sparse and dense
    models in terms of how well they can generalize to changes in the data distribution.
    We first study a method for co-training sparse and dense models, at a lower cost
    compared to regular training. With our method we can obtain very accurate sparse
    networks, and dense models that can recover the baseline accuracy. Furthermore,
    we are able to more easily analyze the differences, at the prediction level, between
    the sparse-dense model pairs. Next, we investigate the generalization properties
    of sparse neural networks in more detail, by studying how well different sparse
    models trained on a larger task can adapt to smaller, more specialized tasks,
    in a transfer learning scenario. Our analysis across multiple pruning methods
    and sparsity levels reveals that sparse models provide features that can transfer
    similarly to or better than the dense baseline. However, the choice of the pruning
    method plays an important role, and can influence the results when the features
    are fixed (linear finetuning), or when they are allowed to adapt to the new task
    (full finetuning). Using sparse models with fixed masks for finetuning on new
    tasks has an important practical advantage, as it enables training neural networks
    on smaller devices. However, one drawback of current pruning methods is that the
    entire training cycle has to be repeated to obtain the initial sparse model, for
    every sparsity target; as a consequence, the entire training process is costly,
    and multiple models need to be stored. In the last part of the thesis we propose
    a method that can train accurate dense models that are compressible in a single
    step, to multiple sparsity levels, without additional finetuning. Our method results
    in sparse models that can be competitive with existing pruning methods, and which
    can also successfully generalize to new tasks."
acknowledged_ssus:
- _id: ScienComp
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
citation:
  ama: Peste E-A. Efficiency and generalization of sparse neural networks. 2023. doi:<a
    href="https://doi.org/10.15479/at:ista:13074">10.15479/at:ista:13074</a>
  apa: Peste, E.-A. (2023). <i>Efficiency and generalization of sparse neural networks</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/at:ista:13074">https://doi.org/10.15479/at:ista:13074</a>
  chicago: Peste, Elena-Alexandra. “Efficiency and Generalization of Sparse Neural
    Networks.” Institute of Science and Technology Austria, 2023. <a href="https://doi.org/10.15479/at:ista:13074">https://doi.org/10.15479/at:ista:13074</a>.
  ieee: E.-A. Peste, “Efficiency and generalization of sparse neural networks,” Institute
    of Science and Technology Austria, 2023.
  ista: Peste E-A. 2023. Efficiency and generalization of sparse neural networks.
    Institute of Science and Technology Austria.
  mla: Peste, Elena-Alexandra. <i>Efficiency and Generalization of Sparse Neural Networks</i>.
    Institute of Science and Technology Austria, 2023, doi:<a href="https://doi.org/10.15479/at:ista:13074">10.15479/at:ista:13074</a>.
  short: E.-A. Peste, Efficiency and Generalization of Sparse Neural Networks, Institute
    of Science and Technology Austria, 2023.
date_created: 2023-05-23T17:07:53Z
date_published: 2023-05-23T00:00:00Z
date_updated: 2023-08-04T10:33:27Z
day: '23'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: DaAl
- _id: ChLa
doi: 10.15479/at:ista:13074
ec_funded: 1
file:
- access_level: open_access
  checksum: 6b3354968403cb9d48cc5a83611fb571
  content_type: application/pdf
  creator: epeste
  date_created: 2023-05-24T16:11:16Z
  date_updated: 2023-05-24T16:11:16Z
  file_id: '13087'
  file_name: PhD_Thesis_Alexandra_Peste_final.pdf
  file_size: 2152072
  relation: main_file
  success: 1
- access_level: closed
  checksum: 8d0df94bbcf4db72c991f22503b3fd60
  content_type: application/zip
  creator: epeste
  date_created: 2023-05-24T16:12:59Z
  date_updated: 2023-05-24T16:12:59Z
  file_id: '13088'
  file_name: PhD_Thesis_APeste.zip
  file_size: 1658293
  relation: source_file
file_date_updated: 2023-05-24T16:12:59Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '147'
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '805223'
  name: Elastic Coordination for Scalable Machine Learning
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '11458'
    relation: part_of_dissertation
    status: public
  - id: '13053'
    relation: part_of_dissertation
    status: public
  - id: '12299'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
title: Efficiency and generalization of sparse neural networks
type: dissertation
user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9
year: '2023'
...
---
_id: '10799'
abstract:
- lang: eng
  text: "Because of the increasing popularity of machine learning methods, it is becoming
    important to understand the impact of learned components on automated decision-making
    systems and to guarantee that their consequences are beneficial to society. In
    other words, it is necessary to ensure that machine learning is sufficiently trustworthy
    to be used in real-world applications. This thesis studies two properties of machine
    learning models that are highly desirable for the sake of reliability: robustness
    and fairness. In the first part of the thesis we study the robustness of learning
    algorithms to training data corruption. Previous work has shown that machine learning
    models are vulnerable to a range of training set issues, varying from label
    noise through systematic biases to worst-case data manipulations. This is an especially
    relevant problem from a present perspective, since modern machine learning methods
    are particularly data hungry and therefore practitioners often have to rely on
    data collected from various external sources, e.g. from the Internet, from app
    users or via crowdsourcing. Naturally, such sources vary greatly in the quality
    and reliability of the data they provide. With these considerations in mind,
    we study the problem of designing machine learning algorithms that are robust
    to corruptions in data coming from multiple sources. We show that, in contrast
    to the case of a single dataset with outliers, successful learning within this
    model is possible both theoretically and practically, even under worst-case data
    corruptions. The second part of this thesis deals with fairness-aware machine
    learning. There are multiple areas where machine learning models have shown promising
    results, but where careful considerations are required, in order to avoid discriminatory
    decisions taken by such learned components. Ensuring fairness can be particularly
    challenging, because real-world training datasets are expected to contain various
    forms of historical bias that may affect the learning process. In this thesis
    we show that data corruption can indeed render the problem of achieving fairness
    impossible, by tightly characterizing the theoretical limits of fair learning
    under worst-case data manipulations. However, assuming access to clean data, we
    also show how fairness-aware learning can be made practical in contexts beyond
    binary classification, in particular in the challenging learning to rank setting."
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Nikola H
  full_name: Konstantinov, Nikola H
  id: 4B9D76E4-F248-11E8-B48F-1D18A9856A87
  last_name: Konstantinov
citation:
  ama: Konstantinov NH. Robustness and fairness in machine learning. 2022. doi:<a
    href="https://doi.org/10.15479/at:ista:10799">10.15479/at:ista:10799</a>
  apa: Konstantinov, N. H. (2022). <i>Robustness and fairness in machine learning</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/at:ista:10799">https://doi.org/10.15479/at:ista:10799</a>
  chicago: Konstantinov, Nikola H. “Robustness and Fairness in Machine Learning.”
    Institute of Science and Technology Austria, 2022. <a href="https://doi.org/10.15479/at:ista:10799">https://doi.org/10.15479/at:ista:10799</a>.
  ieee: N. H. Konstantinov, “Robustness and fairness in machine learning,” Institute
    of Science and Technology Austria, 2022.
  ista: Konstantinov NH. 2022. Robustness and fairness in machine learning. Institute
    of Science and Technology Austria.
  mla: Konstantinov, Nikola H. <i>Robustness and Fairness in Machine Learning</i>.
    Institute of Science and Technology Austria, 2022, doi:<a href="https://doi.org/10.15479/at:ista:10799">10.15479/at:ista:10799</a>.
  short: N.H. Konstantinov, Robustness and Fairness in Machine Learning, Institute
    of Science and Technology Austria, 2022.
date_created: 2022-02-28T13:03:49Z
date_published: 2022-03-08T00:00:00Z
date_updated: 2023-10-17T12:31:54Z
day: '08'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: ChLa
doi: 10.15479/at:ista:10799
ec_funded: 1
file:
- access_level: open_access
  checksum: 626bc523ae8822d20e635d0e2d95182e
  content_type: application/pdf
  creator: nkonstan
  date_created: 2022-03-06T11:42:54Z
  date_updated: 2022-03-06T11:42:54Z
  file_id: '10823'
  file_name: thesis.pdf
  file_size: 4204905
  relation: main_file
  success: 1
- access_level: closed
  checksum: e2ca2b88350ac8ea1515b948885cbcb1
  content_type: application/x-zip-compressed
  creator: nkonstan
  date_created: 2022-03-06T11:42:57Z
  date_updated: 2022-03-10T12:11:48Z
  file_id: '10824'
  file_name: thesis.zip
  file_size: 22841103
  relation: source_file
file_date_updated: 2022-03-10T12:11:48Z
has_accepted_license: '1'
keyword:
- robustness
- fairness
- machine learning
- PAC learning
- adversarial learning
language:
- iso: eng
month: '03'
oa: 1
oa_version: Published Version
page: '176'
project:
- _id: 2564DBCA-B435-11E9-9278-68D0E5697425
  call_identifier: H2020
  grant_number: '665385'
  name: International IST Doctoral Program
publication_identifier:
  isbn:
  - 978-3-99078-015-2
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '8724'
    relation: part_of_dissertation
    status: public
  - id: '10803'
    relation: part_of_dissertation
    status: public
  - id: '10802'
    relation: part_of_dissertation
    status: public
  - id: '6590'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Robustness and fairness in machine learning
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2022'
...
---
_id: '9418'
abstract:
- lang: eng
  text: "Deep learning is best known for its empirical success across a wide range
    of applications\r\nspanning computer vision, natural language processing and speech.
    Of equal significance,\r\nthough perhaps less known, are its ramifications for
    learning theory: deep networks have\r\nbeen observed to perform surprisingly well
    in the high-capacity regime, aka the overfitting\r\nor underspecified regime.
    Classically, this regime on the far right of the bias-variance curve\r\nis associated
    with poor generalisation; however, recent experiments with deep networks\r\nchallenge
    this view.\r\n\r\nThis thesis is devoted to investigating various aspects of underspecification
    in deep learning.\r\nFirst, we argue that deep learning models are underspecified
    on two levels: a) any given\r\ntraining dataset can be fit by many different functions,
    and b) any given function can be\r\nexpressed by many different parameter configurations.
    We refer to the second kind of\r\nunderspecification as parameterisation redundancy
    and we precisely characterise its extent.\r\nSecond, we characterise the implicit
    criteria (the inductive bias) that guide learning in the\r\nunderspecified regime.
    Specifically, we consider a nonlinear but tractable classification\r\nsetting,
    and show that given the choice, neural networks learn classifiers with a large
    margin.\r\nThird, we consider learning scenarios where the inductive bias is not
    by itself sufficient to\r\ndeal with underspecification. We then study different
    ways of ‘tightening the specification’: i)\r\nIn the setting of representation
    learning with variational autoencoders, we propose a hand-\r\ncrafted regulariser
    based on mutual information. ii) In the setting of binary classification, we\r\nconsider
    soft-label (real-valued) supervision. We derive a generalisation bound for linear\r\nnetworks
    supervised in this way and verify that soft labels facilitate fast learning. Finally,
    we\r\nexplore an application of soft-label supervision to the training of multi-exit
    models."
acknowledged_ssus:
- _id: ScienComp
- _id: CampIT
- _id: E-Lib
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Phuong
  full_name: Bui Thi Mai, Phuong
  id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87
  last_name: Bui Thi Mai
citation:
  ama: Phuong M. Underspecification in deep learning. 2021. doi:<a href="https://doi.org/10.15479/AT:ISTA:9418">10.15479/AT:ISTA:9418</a>
  apa: Phuong, M. (2021). <i>Underspecification in deep learning</i>. Institute of
    Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:9418">https://doi.org/10.15479/AT:ISTA:9418</a>
  chicago: Phuong, Mary. “Underspecification in Deep Learning.” Institute of Science
    and Technology Austria, 2021. <a href="https://doi.org/10.15479/AT:ISTA:9418">https://doi.org/10.15479/AT:ISTA:9418</a>.
  ieee: M. Phuong, “Underspecification in deep learning,” Institute of Science and
    Technology Austria, 2021.
  ista: Phuong M. 2021. Underspecification in deep learning. Institute of Science
    and Technology Austria.
  mla: Phuong, Mary. <i>Underspecification in Deep Learning</i>. Institute of Science
    and Technology Austria, 2021, doi:<a href="https://doi.org/10.15479/AT:ISTA:9418">10.15479/AT:ISTA:9418</a>.
  short: M. Phuong, Underspecification in Deep Learning, Institute of Science and
    Technology Austria, 2021.
date_created: 2021-05-24T13:06:23Z
date_published: 2021-05-30T00:00:00Z
date_updated: 2023-09-08T11:11:12Z
day: '30'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: GradSch
- _id: ChLa
doi: 10.15479/AT:ISTA:9418
file:
- access_level: open_access
  checksum: 4f0abe64114cfed264f9d36e8d1197e3
  content_type: application/pdf
  creator: bphuong
  date_created: 2021-05-24T11:22:29Z
  date_updated: 2021-05-24T11:22:29Z
  file_id: '9419'
  file_name: mph-thesis-v519-pdfimages.pdf
  file_size: 2673905
  relation: main_file
  success: 1
- access_level: closed
  checksum: f5699e876bc770a9b0df8345a77720a2
  content_type: application/zip
  creator: bphuong
  date_created: 2021-05-24T11:56:02Z
  date_updated: 2021-05-24T11:56:02Z
  file_id: '9420'
  file_name: thesis.zip
  file_size: 92995100
  relation: source_file
file_date_updated: 2021-05-24T11:56:02Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '125'
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '7435'
    relation: part_of_dissertation
    status: deleted
  - id: '7481'
    relation: part_of_dissertation
    status: public
  - id: '9416'
    relation: part_of_dissertation
    status: public
  - id: '7479'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Underspecification in deep learning
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2021'
...
---
_id: '8390'
abstract:
- lang: eng
  text: "Deep neural networks have established a new standard for data-dependent feature
    extraction pipelines in the Computer Vision literature. Despite their remarkable
    performance in the standard supervised learning scenario, i.e. when models are
    trained with labeled data and tested on samples that follow a similar distribution,
    neural networks have been shown to struggle with more advanced generalization
    abilities, such as transferring knowledge across visually different domains, or
    generalizing to new unseen combinations of known concepts. In this thesis we argue
    that, in contrast to the usual black-box behavior of neural networks, leveraging
    more structured internal representations is a promising direction\r\nfor tackling
    such problems. In particular, we focus on two forms of structure. First, we tackle
    modularity: We show that (i) compositional architectures are a natural tool for
    modeling reasoning tasks, in that they efficiently capture their combinatorial
    nature, which is key for generalizing beyond the compositions seen during training.
    We investigate how to to learn such models, both formally and experimentally,
    for the task of abstract visual reasoning. Then, we show that (ii) in some settings,
    modularity allows us to efficiently break down complex tasks into smaller, easier,
    modules, thereby improving computational efficiency; We study this behavior in
    the context of generative models for colorization, as well as for small objects
    detection. Secondly, we investigate the inherently layered structure of representations
    learned by neural networks, and analyze its role in the context of transfer learning
    and domain adaptation across visually\r\ndissimilar domains. "
acknowledged_ssus:
- _id: CampIT
- _id: ScienComp
acknowledgement: Last but not least, I would like to acknowledge the support of the
  IST IT and scientific computing team for helping provide a great work environment.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Amélie
  full_name: Royer, Amélie
  id: 3811D890-F248-11E8-B48F-1D18A9856A87
  last_name: Royer
  orcid: 0000-0002-8407-0705
citation:
  ama: Royer A. Leveraging structure in Computer Vision tasks for flexible Deep Learning
    models. 2020. doi:<a href="https://doi.org/10.15479/AT:ISTA:8390">10.15479/AT:ISTA:8390</a>
  apa: Royer, A. (2020). <i>Leveraging structure in Computer Vision tasks for flexible
    Deep Learning models</i>. Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:8390">https://doi.org/10.15479/AT:ISTA:8390</a>
  chicago: Royer, Amélie. “Leveraging Structure in Computer Vision Tasks for Flexible
    Deep Learning Models.” Institute of Science and Technology Austria, 2020. <a href="https://doi.org/10.15479/AT:ISTA:8390">https://doi.org/10.15479/AT:ISTA:8390</a>.
  ieee: A. Royer, “Leveraging structure in Computer Vision tasks for flexible Deep
    Learning models,” Institute of Science and Technology Austria, 2020.
  ista: Royer A. 2020. Leveraging structure in Computer Vision tasks for flexible
    Deep Learning models. Institute of Science and Technology Austria.
  mla: Royer, Amélie. <i>Leveraging Structure in Computer Vision Tasks for Flexible
    Deep Learning Models</i>. Institute of Science and Technology Austria, 2020, doi:<a
    href="https://doi.org/10.15479/AT:ISTA:8390">10.15479/AT:ISTA:8390</a>.
  short: A. Royer, Leveraging Structure in Computer Vision Tasks for Flexible Deep
    Learning Models, Institute of Science and Technology Austria, 2020.
date_created: 2020-09-14T13:42:09Z
date_published: 2020-09-14T00:00:00Z
date_updated: 2023-10-16T10:04:02Z
day: '14'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:8390
file:
- access_level: open_access
  checksum: c914d2f88846032f3d8507734861b6ee
  content_type: application/pdf
  creator: dernst
  date_created: 2020-09-14T13:39:14Z
  date_updated: 2020-09-14T13:39:14Z
  file_id: '8391'
  file_name: 2020_Thesis_Royer.pdf
  file_size: 30224591
  relation: main_file
  success: 1
- access_level: closed
  checksum: ae98fb35d912cff84a89035ae5794d3c
  content_type: application/x-zip-compressed
  creator: dernst
  date_created: 2020-09-14T13:39:17Z
  date_updated: 2020-09-14T13:39:17Z
  file_id: '8392'
  file_name: thesis_sources.zip
  file_size: 74227627
  relation: source_file
file_date_updated: 2020-09-14T13:39:17Z
has_accepted_license: '1'
language:
- iso: eng
license: https://creativecommons.org/licenses/by-nc-sa/4.0/
month: '09'
oa: 1
oa_version: Published Version
page: '197'
publication_identifier:
  isbn:
  - 978-3-99078-007-7
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '7936'
    relation: part_of_dissertation
    status: public
  - id: '7937'
    relation: part_of_dissertation
    status: public
  - id: '8193'
    relation: part_of_dissertation
    status: public
  - id: '8092'
    relation: part_of_dissertation
    status: public
  - id: '911'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Leveraging structure in Computer Vision tasks for flexible Deep Learning models
tmp:
  image: /images/cc_by_nc_sa.png
  legal_code_url: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
  name: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC
    BY-NC-SA 4.0)
  short: CC BY-NC-SA (4.0)
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2020'
...
---
_id: '197'
abstract:
- lang: eng
  text: Modern computer vision systems heavily rely on statistical machine learning
    models, which typically require large amounts of labeled data to be learned reliably.
    Moreover, computer vision research has recently widely adopted techniques for
    representation learning, which further increase the demand for labeled data. However,
    for many important practical problems only a relatively small amount of labeled
    data is available, so it is problematic to leverage the full potential of representation
    learning methods. One way to overcome this obstacle is to invest substantial resources
    into producing large labeled datasets. Unfortunately, this can be prohibitively
    expensive in practice. In this thesis we focus on an alternative way of tackling
    the aforementioned issue. We concentrate on methods that make use of weakly-labeled
    or even unlabeled data. Specifically, the first half of the thesis is dedicated
    to the semantic image segmentation task. We develop a technique that achieves
    competitive segmentation performance and only requires annotations in the form
    of global image-level labels instead of dense segmentation masks. Subsequently,
    we present a new methodology that further improves segmentation performance by
    leveraging tiny amounts of additional feedback from a human annotator. By using
    our methods, practitioners can greatly reduce the data annotation effort required
    to learn modern image segmentation models. In the second half of the thesis we
    focus on methods for learning from unlabeled visual data. We study a family of
    autoregressive models for modeling the structure of natural images and discuss
    potential applications of these models. Moreover, we conduct an in-depth study
    of one of these applications, for which we develop a state-of-the-art model for
    the probabilistic image colorization task.
acknowledgement: I also gratefully acknowledge the support of NVIDIA Corporation with
  the donation of the GPUs used for this research.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Alexander
  full_name: Kolesnikov, Alexander
  id: 2D157DB6-F248-11E8-B48F-1D18A9856A87
  last_name: Kolesnikov
citation:
  ama: Kolesnikov A. Weakly-Supervised Segmentation and Unsupervised Modeling of Natural
    Images. 2018. doi:<a href="https://doi.org/10.15479/AT:ISTA:th_1021">10.15479/AT:ISTA:th_1021</a>
  apa: Kolesnikov, A. (2018). <i>Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images</i>. Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:th_1021">https://doi.org/10.15479/AT:ISTA:th_1021</a>
  chicago: Kolesnikov, Alexander. “Weakly-Supervised Segmentation and Unsupervised
    Modeling of Natural Images.” Institute of Science and Technology Austria, 2018.
    <a href="https://doi.org/10.15479/AT:ISTA:th_1021">https://doi.org/10.15479/AT:ISTA:th_1021</a>.
  ieee: A. Kolesnikov, “Weakly-Supervised Segmentation and Unsupervised Modeling of
    Natural Images,” Institute of Science and Technology Austria, 2018.
  ista: Kolesnikov A. 2018. Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images. Institute of Science and Technology Austria.
  mla: Kolesnikov, Alexander. <i>Weakly-Supervised Segmentation and Unsupervised Modeling
    of Natural Images</i>. Institute of Science and Technology Austria, 2018, doi:<a
    href="https://doi.org/10.15479/AT:ISTA:th_1021">10.15479/AT:ISTA:th_1021</a>.
  short: A. Kolesnikov, Weakly-Supervised Segmentation and Unsupervised Modeling of
    Natural Images, Institute of Science and Technology Austria, 2018.
date_created: 2018-12-11T11:45:09Z
date_published: 2018-05-25T00:00:00Z
date_updated: 2023-09-07T12:51:46Z
day: '25'
ddc:
- '004'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:th_1021
ec_funded: 1
file:
- access_level: open_access
  checksum: bc678e02468d8ebc39dc7267dfb0a1c4
  content_type: application/pdf
  creator: system
  date_created: 2018-12-12T10:14:57Z
  date_updated: 2020-07-14T12:45:22Z
  file_id: '5113'
  file_name: IST-2018-1021-v1+1_thesis-unsigned-pdfa.pdf
  file_size: 12918758
  relation: main_file
- access_level: closed
  checksum: bc66973b086da5a043f1162dcfb1fde4
  content_type: application/zip
  creator: dernst
  date_created: 2019-04-05T09:34:49Z
  date_updated: 2020-07-14T12:45:22Z
  file_id: '6225'
  file_name: 2018_Thesis_Kolesnikov_source.zip
  file_size: 55973760
  relation: source_file
file_date_updated: 2020-07-14T12:45:22Z
has_accepted_license: '1'
language:
- iso: eng
month: '05'
oa: 1
oa_version: Published Version
page: '113'
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '7718'
pubrep_id: '1021'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2018'
...
---
_id: '68'
abstract:
- lang: eng
  text: The most common assumption made in statistical learning theory is that of
    independent and identically distributed (i.i.d.) data. While being very convenient
    mathematically, it is often very clearly violated in practice. This disparity
    between machine learning theory and applications underlies a growing demand for
    the development of algorithms that learn from dependent data, and for theory that
    can provide generalization guarantees similar to those in the independent situation.
    This thesis is dedicated to two variants of dependence that can arise in practice.
    One is dependence at the level of samples within a single learning task. The other
    type of dependence arises in the multi-task setting, when the tasks depend on
    each other even though the data for each of them can be i.i.d. In both cases we
    model the data (samples or tasks) as stochastic processes and introduce new algorithms
    for both settings that take into account and exploit the resulting dependencies.
    We prove theoretical guarantees on the performance of the introduced algorithms
    under different evaluation criteria and, in addition, we complement the theoretical
    study with an empirical one, in which we evaluate some of the algorithms on two
    real-world datasets to highlight their practical applicability.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Alexander
  full_name: Zimin, Alexander
  id: 37099E9C-F248-11E8-B48F-1D18A9856A87
  last_name: Zimin
citation:
  ama: Zimin A. Learning from dependent data. 2018. doi:<a href="https://doi.org/10.15479/AT:ISTA:TH1048">10.15479/AT:ISTA:TH1048</a>
  apa: Zimin, A. (2018). <i>Learning from dependent data</i>. Institute of Science
    and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:TH1048">https://doi.org/10.15479/AT:ISTA:TH1048</a>
  chicago: Zimin, Alexander. “Learning from Dependent Data.” Institute of Science
    and Technology Austria, 2018. <a href="https://doi.org/10.15479/AT:ISTA:TH1048">https://doi.org/10.15479/AT:ISTA:TH1048</a>.
  ieee: A. Zimin, “Learning from dependent data,” Institute of Science and Technology
    Austria, 2018.
  ista: Zimin A. 2018. Learning from dependent data. Institute of Science and Technology
    Austria.
  mla: Zimin, Alexander. <i>Learning from Dependent Data</i>. Institute of Science
    and Technology Austria, 2018, doi:<a href="https://doi.org/10.15479/AT:ISTA:TH1048">10.15479/AT:ISTA:TH1048</a>.
  short: A. Zimin, Learning from Dependent Data, Institute of Science and Technology
    Austria, 2018.
date_created: 2018-12-11T11:44:27Z
date_published: 2018-09-01T00:00:00Z
date_updated: 2023-09-07T12:29:07Z
day: '01'
ddc:
- '004'
- '519'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:TH1048
ec_funded: 1
file:
- access_level: open_access
  checksum: e849dd40a915e4d6c5572b51b517f098
  content_type: application/pdf
  creator: dernst
  date_created: 2019-04-09T07:32:47Z
  date_updated: 2020-07-14T12:47:40Z
  file_id: '6253'
  file_name: 2018_Thesis_Zimin.pdf
  file_size: 1036137
  relation: main_file
- access_level: closed
  checksum: da092153cec55c97461bd53c45c5d139
  content_type: application/zip
  creator: dernst
  date_created: 2019-04-09T07:32:47Z
  date_updated: 2020-07-14T12:47:40Z
  file_id: '6254'
  file_name: 2018_Thesis_Zimin_Source.zip
  file_size: 637490
  relation: source_file
file_date_updated: 2020-07-14T12:47:40Z
has_accepted_license: '1'
language:
- iso: eng
month: '09'
oa: 1
oa_version: Published Version
page: '92'
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '7986'
pubrep_id: '1048'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Learning from dependent data
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2018'
...
---
_id: '1126'
abstract:
- lang: eng
  text: "Traditionally, machine learning has focused on the problem of solving a single\r\ntask
    in isolation. While being quite well understood, this approach
    disregards an\r\nimportant aspect of human learning: when facing a new problem,
    humans are able to\r\nexploit knowledge acquired from previously learned tasks.
    Intuitively, access to several\r\nproblems simultaneously or sequentially could
    also be advantageous for a machine\r\nlearning system, especially if these tasks
    are closely related. Indeed, results of many\r\nempirical studies have provided
    justification for this intuition. However, theoretical\r\njustifications of this
    idea are rather limited.\r\nThe focus of this thesis is to expand the understanding
    of potential benefits of information\r\ntransfer between several related learning
    problems. We provide theoretical\r\nanalysis for three scenarios of multi-task
    learning - multiple kernel learning, sequential\r\nlearning and active task selection.
    We also provide a PAC-Bayesian perspective on\r\nlifelong learning and investigate
    how the task generation process influences the generalization\r\nguarantees in
    this scenario. In addition, we show how some of the obtained\r\ntheoretical results
    can be used to derive principled multi-task and lifelong learning\r\nalgorithms
    and illustrate their performance on various synthetic and real-world datasets."
acknowledgement: "First and foremost I would like to express my gratitude to my supervisor,
  Christoph\r\nLampert. Thank you for your patience in teaching me all aspects of
  doing research\r\n(including English grammar), for your trust in my capabilities
  and endless support. Thank\r\nyou for granting me freedom in my research and, at
  the same time, having time and\r\nhelping me cope with the consequences whenever
  I needed it. Thank you for creating\r\nan excellent atmosphere in the group, it
  was a great pleasure and honor to be a part of\r\nit. There could not have been
  a better and more inspiring adviser and mentor.\r\nI thank Shai Ben-David for welcoming
  me into his group at the University of Waterloo,\r\nfor inspiring discussions and
  support. It was a great pleasure to work together. I am\r\nalso thankful to Ruth
  Urner for hosting me at the Max-Planck Institute Tübingen, for the\r\nfruitful
  collaboration and for taking care of me during that not-so-sunny month of May.\r\nI
  thank Jan Maas for kindly joining my thesis committee despite the short notice and\r\nproviding
  me with insightful comments.\r\nI would like to thank my colleagues for their support,
  entertaining conversations and\r\nendless table soccer games we shared together:
  Georg, Jan, Amelie and Emilie, Michal\r\nand Alex, Alex K. and Alex Z., Thomas,
  Sameh, Vlad, Mayu, Nathaniel, Silvester, Neel,\r\nCsaba, Vladimir, Morten. Thank
  you, Mabel and Ram, for the wonderful time we spent\r\ntogether. I am thankful to
  Shrinu and Samira for taking care of me during my stay at the\r\nUniversity of Waterloo.
  Special thanks to Viktoriia for her never-ending optimism and for\r\nbeing so inspiring
  and supportive, especially at the beginning of my PhD journey.\r\nThanks to IST
  administration, in particular, Vlad and Elisabeth for shielding me from\r\nmost
  of the bureaucratic paperwork.\r\n\r\nThis dissertation would not have been possible
  without funding from the European\r\nResearch Council under the European Union's
  Seventh Framework Programme\r\n(FP7/2007-2013)/ERC grant agreement no 308036."
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Anastasia
  full_name: Pentina, Anastasia
  id: 42E87FC6-F248-11E8-B48F-1D18A9856A87
  last_name: Pentina
citation:
  ama: Pentina A. Theoretical foundations of multi-task lifelong learning. 2016. doi:<a
    href="https://doi.org/10.15479/AT:ISTA:TH_776">10.15479/AT:ISTA:TH_776</a>
  apa: Pentina, A. (2016). <i>Theoretical foundations of multi-task lifelong learning</i>.
    Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:TH_776">https://doi.org/10.15479/AT:ISTA:TH_776</a>
  chicago: Pentina, Anastasia. “Theoretical Foundations of Multi-Task Lifelong Learning.”
    Institute of Science and Technology Austria, 2016. <a href="https://doi.org/10.15479/AT:ISTA:TH_776">https://doi.org/10.15479/AT:ISTA:TH_776</a>.
  ieee: A. Pentina, “Theoretical foundations of multi-task lifelong learning,” Institute
    of Science and Technology Austria, 2016.
  ista: Pentina A. 2016. Theoretical foundations of multi-task lifelong learning.
    Institute of Science and Technology Austria.
  mla: Pentina, Anastasia. <i>Theoretical Foundations of Multi-Task Lifelong Learning</i>.
    Institute of Science and Technology Austria, 2016, doi:<a href="https://doi.org/10.15479/AT:ISTA:TH_776">10.15479/AT:ISTA:TH_776</a>.
  short: A. Pentina, Theoretical Foundations of Multi-Task Lifelong Learning, Institute
    of Science and Technology Austria, 2016.
date_created: 2018-12-11T11:50:17Z
date_published: 2016-11-01T00:00:00Z
date_updated: 2023-09-07T11:52:03Z
day: '01'
ddc:
- '006'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:TH_776
ec_funded: 1
file:
- access_level: open_access
  content_type: application/pdf
  creator: system
  date_created: 2018-12-12T10:14:07Z
  date_updated: 2018-12-12T10:14:07Z
  file_id: '5056'
  file_name: IST-2017-776-v1+1_Pentina_Thesis_2016.pdf
  file_size: 2140062
  relation: main_file
file_date_updated: 2018-12-12T10:14:07Z
has_accepted_license: '1'
language:
- iso: eng
month: '11'
oa: 1
oa_version: Published Version
page: '127'
project:
- _id: 2532554C-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '308036'
  name: Lifelong Learning of Visual Scene Understanding
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '6234'
pubrep_id: '776'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Theoretical foundations of multi-task lifelong learning
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2016'
...
---
_id: '1401'
abstract:
- lang: eng
  text: 'The human ability to recognize objects in complex scenes has driven research
    in the computer vision field over a couple of decades. This thesis focuses on the
    object recognition task in images: that is, given an image, we want the computer
    system to be able to predict the class of the object that appears in the image.
    A recent successful attempt to bridge semantic understanding of the image perceived
    by humans and by computers uses attribute-based models. Attributes are semantic
    properties of the objects shared across different categories, which humans and
    computers can decide on. To explore the attribute-based models we take a statistical
    machine learning approach, and address two key learning challenges in view of the
    object recognition task: learning augmented attributes as mid-level discriminative
    feature representation, and learning with attributes as privileged information.
    Our main contributions are parametric and non-parametric models and algorithms
    to solve these frameworks. In the parametric approach, we explore an autoencoder
    model combined with the large margin nearest neighbor principle for mid-level
    feature learning, and linear support vector machines for learning with privileged
    information. In the non-parametric approach, we propose a supervised Indian Buffet
    Process for automatic augmentation of semantic attributes, and explore the Gaussian
    Processes classification framework for learning with privileged information. A
    thorough experimental analysis shows the effectiveness of the proposed models
    in both parametric and non-parametric views.'
acknowledgement: "I would like to thank my supervisor, Christoph Lampert, for guidance
  throughout my studies and for patience in transforming me into a scientist, and
  my thesis committee, Chris Wojtan and Horst Bischof, for their help and advice.
  \r\n\r\nI would like to thank Elisabeth Hacker who perfectly assisted all my administrative
  needs and was always nice and friendly to me, and the campus team for making the
  IST Austria campus my second home. \r\nI was honored to collaborate with brilliant
  researchers and to learn from their experience. Undoubtedly, I learned most of all
  from Novi Quadrianto: brainstorming our projects and getting exciting results was
  the most enjoyable part of my work – thank you! I am also grateful to David Knowles,
  Zoubin Ghahramani, Daniel Hernández-Lobato, Kristian Kersting and Anastasia Pentina
  for the fantastic projects we worked on together, and to Kristen Grauman and Adriana
  Kovashka for the exceptional experience working with user studies. I would like
  to thank my colleagues at IST Austria and my office mates who shared their happy
  moods, scientific breakthroughs and thought-provoking conversations with me: Chao,
  Filip, Rustem, Asya, Sameh, Alex, Vlad, Mayu, Neel, Csaba, Thomas, Vladimir, Cristina,
  Alex Z., Avro, Amelie and Emilie, Andreas H. and Andreas E., Chris, Lena, Michael,
  Ali and Ipek, Vera, Igor, Katia. Special thanks to Morten for the countless games
  of table soccer we played together and the tournaments we teamed up for: we will
  definitely win next time:) A very warm hug to Asya for always being so inspiring
  and supportive to me, and for helping me to increase the proportion of female computer
  scientists in our group. "
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Viktoriia
  full_name: Sharmanska, Viktoriia
  id: 2EA6D09E-F248-11E8-B48F-1D18A9856A87
  last_name: Sharmanska
  orcid: 0000-0003-0192-9308
citation:
  ama: 'Sharmanska V. Learning with attributes for object recognition: Parametric
    and non-parametrics views. 2015. doi:<a href="https://doi.org/10.15479/at:ista:1401">10.15479/at:ista:1401</a>'
  apa: 'Sharmanska, V. (2015). <i>Learning with attributes for object recognition:
    Parametric and non-parametrics views</i>. Institute of Science and Technology
    Austria. <a href="https://doi.org/10.15479/at:ista:1401">https://doi.org/10.15479/at:ista:1401</a>'
  chicago: 'Sharmanska, Viktoriia. “Learning with Attributes for Object Recognition:
    Parametric and Non-Parametrics Views.” Institute of Science and Technology Austria,
    2015. <a href="https://doi.org/10.15479/at:ista:1401">https://doi.org/10.15479/at:ista:1401</a>.'
  ieee: 'V. Sharmanska, “Learning with attributes for object recognition: Parametric
    and non-parametrics views,” Institute of Science and Technology Austria, 2015.'
  ista: 'Sharmanska V. 2015. Learning with attributes for object recognition: Parametric
    and non-parametrics views. Institute of Science and Technology Austria.'
  mla: 'Sharmanska, Viktoriia. <i>Learning with Attributes for Object Recognition:
    Parametric and Non-Parametrics Views</i>. Institute of Science and Technology
    Austria, 2015, doi:<a href="https://doi.org/10.15479/at:ista:1401">10.15479/at:ista:1401</a>.'
  short: 'V. Sharmanska, Learning with Attributes for Object Recognition: Parametric
    and Non-Parametrics Views, Institute of Science and Technology Austria, 2015.'
date_created: 2018-12-11T11:51:48Z
date_published: 2015-04-01T00:00:00Z
date_updated: 2023-09-07T11:40:11Z
day: '01'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: ChLa
- _id: GradSch
doi: 10.15479/at:ista:1401
file:
- access_level: open_access
  checksum: 3605b402bb6934e09ae4cf672c84baf7
  content_type: application/pdf
  creator: dernst
  date_created: 2021-02-22T11:33:17Z
  date_updated: 2021-02-22T11:33:17Z
  file_id: '9177'
  file_name: 2015_Thesis_Sharmanska.pdf
  file_size: 7964342
  relation: main_file
  success: 1
- access_level: closed
  checksum: e37593b3ee75bf3180629df2d6ca8f4e
  content_type: application/pdf
  creator: cchlebak
  date_created: 2021-11-16T14:40:45Z
  date_updated: 2021-11-17T13:47:24Z
  file_id: '10297'
  file_name: 2015_Thesis_Sharmanska_pdfa.pdf
  file_size: 7372241
  relation: main_file
file_date_updated: 2021-11-17T13:47:24Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- url: http://users.sussex.ac.uk/~nq28/viktoriia/Thesis_Sharmanska.pdf
month: '04'
oa: 1
oa_version: Published Version
page: '144'
publication_identifier:
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
publist_id: '5806'
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: 'Learning with attributes for object recognition: Parametric and non-parametrics
  views'
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2015'
...
