---
_id: '8390'
abstract:
- lang: eng
  text: "Deep neural networks have established a new standard for data-dependent feature
    extraction pipelines in the Computer Vision literature. Despite their remarkable
    performance in the standard supervised learning scenario, i.e. when models are
    trained with labeled data and tested on samples that follow a similar distribution,
    neural networks have been shown to struggle with more advanced generalization
    abilities, such as transferring knowledge across visually different domains, or
    generalizing to new unseen combinations of known concepts. In this thesis we argue
    that, in contrast to the usual black-box behavior of neural networks, leveraging
    more structured internal representations is a promising direction for tackling
    such problems. In particular, we focus on two forms of structure. First, we tackle
    modularity: We show that (i) compositional architectures are a natural tool for
    modeling reasoning tasks, in that they efficiently capture their combinatorial
    nature, which is key for generalizing beyond the compositions seen during training.
    We investigate how to learn such models, both formally and experimentally,
    for the task of abstract visual reasoning. Then, we show that (ii) in some settings,
    modularity allows us to efficiently break down complex tasks into smaller, easier
    modules, thereby improving computational efficiency. We study this behavior in
    the context of generative models for colorization, as well as for small-object
    detection. Secondly, we investigate the inherently layered structure of representations
    learned by neural networks, and analyze its role in the context of transfer learning
    and domain adaptation across visually dissimilar domains."
acknowledged_ssus:
- _id: CampIT
- _id: ScienComp
acknowledgement: Last but not least, I would like to acknowledge the support of the
  IST IT and scientific computing team for helping provide a great work environment.
alternative_title:
- ISTA Thesis
article_processing_charge: No
author:
- first_name: Amélie
  full_name: Royer, Amélie
  id: 3811D890-F248-11E8-B48F-1D18A9856A87
  last_name: Royer
  orcid: 0000-0002-8407-0705
citation:
  ama: Royer A. Leveraging structure in Computer Vision tasks for flexible Deep Learning
    models. 2020. doi:<a href="https://doi.org/10.15479/AT:ISTA:8390">10.15479/AT:ISTA:8390</a>
  apa: Royer, A. (2020). <i>Leveraging structure in Computer Vision tasks for flexible
    Deep Learning models</i>. Institute of Science and Technology Austria. <a href="https://doi.org/10.15479/AT:ISTA:8390">https://doi.org/10.15479/AT:ISTA:8390</a>
  chicago: Royer, Amélie. “Leveraging Structure in Computer Vision Tasks for Flexible
    Deep Learning Models.” Institute of Science and Technology Austria, 2020. <a href="https://doi.org/10.15479/AT:ISTA:8390">https://doi.org/10.15479/AT:ISTA:8390</a>.
  ieee: A. Royer, “Leveraging structure in Computer Vision tasks for flexible Deep
    Learning models,” Institute of Science and Technology Austria, 2020.
  ista: Royer A. 2020. Leveraging structure in Computer Vision tasks for flexible
    Deep Learning models. Institute of Science and Technology Austria.
  mla: Royer, Amélie. <i>Leveraging Structure in Computer Vision Tasks for Flexible
    Deep Learning Models</i>. Institute of Science and Technology Austria, 2020, doi:<a
    href="https://doi.org/10.15479/AT:ISTA:8390">10.15479/AT:ISTA:8390</a>.
  short: A. Royer, Leveraging Structure in Computer Vision Tasks for Flexible Deep
    Learning Models, Institute of Science and Technology Austria, 2020.
date_created: 2020-09-14T13:42:09Z
date_published: 2020-09-14T00:00:00Z
date_updated: 2023-10-16T10:04:02Z
day: '14'
ddc:
- '000'
degree_awarded: PhD
department:
- _id: ChLa
doi: 10.15479/AT:ISTA:8390
file:
- access_level: open_access
  checksum: c914d2f88846032f3d8507734861b6ee
  content_type: application/pdf
  creator: dernst
  date_created: 2020-09-14T13:39:14Z
  date_updated: 2020-09-14T13:39:14Z
  file_id: '8391'
  file_name: 2020_Thesis_Royer.pdf
  file_size: 30224591
  relation: main_file
  success: 1
- access_level: closed
  checksum: ae98fb35d912cff84a89035ae5794d3c
  content_type: application/x-zip-compressed
  creator: dernst
  date_created: 2020-09-14T13:39:17Z
  date_updated: 2020-09-14T13:39:17Z
  file_id: '8392'
  file_name: thesis_sources.zip
  file_size: 74227627
  relation: main_file
file_date_updated: 2020-09-14T13:39:17Z
has_accepted_license: '1'
language:
- iso: eng
license: https://creativecommons.org/licenses/by-nc-sa/4.0/
month: '09'
oa: 1
oa_version: Published Version
page: '197'
publication_identifier:
  isbn:
  - 978-3-99078-007-7
  issn:
  - 2663-337X
publication_status: published
publisher: Institute of Science and Technology Austria
related_material:
  record:
  - id: '7936'
    relation: part_of_dissertation
    status: public
  - id: '7937'
    relation: part_of_dissertation
    status: public
  - id: '8193'
    relation: part_of_dissertation
    status: public
  - id: '8092'
    relation: part_of_dissertation
    status: public
  - id: '911'
    relation: part_of_dissertation
    status: public
status: public
supervisor:
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0001-8622-7887
title: Leveraging structure in Computer Vision tasks for flexible Deep Learning models
tmp:
  image: /images/cc_by_nc_sa.png
  legal_code_url: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
  name: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC
    BY-NC-SA 4.0)
  short: CC BY-NC-SA (4.0)
type: dissertation
user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1
year: '2020'
...