---
_id: '14213'
abstract:
- lang: eng
  text: We introduce a method to segment the visual field into independently moving
    regions, trained with no ground truth or supervision. It consists of an adversarial
    conditional encoder-decoder architecture based on Slot Attention, modified to
    use the image as context to decode optical flow without attempting to reconstruct
    the image itself. In the resulting multi-modal representation, one modality (flow)
    feeds the encoder to produce separate latent codes (slots), whereas the other
    modality (image) conditions the decoder to generate the first (flow) from the
    slots. This design frees the representation from having to encode complex nuisance
    variability in the image due to, for instance, illumination and reflectance properties
    of the scene. Since customary autoencoding based on minimizing the reconstruction
    error does not preclude the entire flow from being encoded into a single slot,
    we modify the loss to an adversarial criterion based on Contextual Information
    Separation. The resulting min-max optimization fosters the separation of objects
    and their assignment to different attention slots, leading to Divided Attention,
    or DivA. DivA outperforms recent unsupervised multi-object motion segmentation
    methods while tripling run-time speed up to 104 FPS and reducing the performance
    gap from supervised methods to 12% or less. DivA can handle different numbers
    of objects and different image sizes at training and test time, is invariant to
    permutation of object labels, and does not require explicit regularization.
article_processing_charge: No
arxiv: 1
author:
- first_name: Dong
  full_name: Lao, Dong
  last_name: Lao
- first_name: Zhengyang
  full_name: Hu, Zhengyang
  last_name: Hu
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Yanchao
  full_name: Yang, Yanchao
  last_name: Yang
- first_name: Stefano
  full_name: Soatto, Stefano
  last_name: Soatto
conference:
  end_date: 2024-01-03
  location: Hong Kong, China
  name: 'CPAL: Conference on Parsimony and Learning'
  start_date: 2024-01-03
date_created: 2023-08-22T14:19:59Z
date_published: 2024-01-03T00:00:00Z
date_updated: 2025-02-13T08:10:28Z
day: '03'
ddc:
- '000'
department:
- _id: FrLo
external_id:
  arxiv:
  - '2304.01430'
file:
- access_level: open_access
  checksum: 8fad894c34f1b3d5a14fb8ffb12f7277
  content_type: application/pdf
  creator: dernst
  date_created: 2024-02-12T08:40:36Z
  date_updated: 2024-02-12T08:40:36Z
  file_id: '14978'
  file_name: 2024_CPAL_Lao.pdf
  file_size: 8038511
  relation: main_file
  success: 1
file_date_updated: 2024-02-12T08:40:36Z
has_accepted_license: '1'
language:
- iso: eng
month: '01'
oa: 1
oa_version: Published Version
publication: 1st Conference on Parsimony and Learning
publication_status: published
quality_controlled: '1'
status: public
title: 'Divided attention: Unsupervised multi-object discovery with contextually separated
  slots'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2024'
...
---
_id: '14333'
abstract:
- lang: eng
  text: "As causal ground truth is incredibly rare, causal discovery algorithms are\r\ncommonly
    only evaluated on simulated data. This is concerning, given that\r\nsimulations
    reflect common preconceptions about generating processes regarding\r\nnoise distributions,
    model classes, and more. In this work, we propose a novel\r\nmethod for falsifying
    the output of a causal discovery algorithm in the absence\r\nof ground truth.
    Our key insight is that while statistical learning seeks\r\nstability across subsets
    of data points, causal learning should seek stability\r\nacross subsets of variables.
    Motivated by this insight, our method relies on a\r\nnotion of compatibility between
    causal graphs learned on different subsets of\r\nvariables. We prove that detecting
    incompatibilities can falsify wrongly\r\ninferred causal relations due to violation
    of assumptions or errors from finite\r\nsample effects. Although passing such
    compatibility tests is only a necessary\r\ncriterion for good performance, we
    argue that it provides strong evidence for\r\nthe causal models whenever compatibility
    entails strong implications for the\r\njoint distribution. We also demonstrate
    experimentally that detection of\r\nincompatibilities can aid in causal model
    selection."
article_number: '2307.09552'
article_processing_charge: No
arxiv: 1
author:
- first_name: Philipp M.
  full_name: Faller, Philipp M.
  last_name: Faller
- first_name: Leena Chennuru
  full_name: Vankadara, Leena Chennuru
  last_name: Vankadara
- first_name: Atalanti A.
  full_name: Mastakouri, Atalanti A.
  last_name: Mastakouri
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Dominik
  full_name: Janzing, Dominik
  last_name: Janzing
citation:
  ama: 'Faller PM, Vankadara LC, Mastakouri AA, Locatello F, Janzing D. Self-compatibility:
    Evaluating causal discovery without ground truth. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2307.09552">10.48550/arXiv.2307.09552</a>'
  apa: 'Faller, P. M., Vankadara, L. C., Mastakouri, A. A., Locatello, F., &#38; Janzing,
    D. (n.d.). Self-compatibility: Evaluating causal discovery without ground truth.
    <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2307.09552">https://doi.org/10.48550/arXiv.2307.09552</a>'
  chicago: 'Faller, Philipp M., Leena Chennuru Vankadara, Atalanti A. Mastakouri,
    Francesco Locatello, and Dominik Janzing. “Self-Compatibility: Evaluating Causal
    Discovery without Ground Truth.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2307.09552">https://doi.org/10.48550/arXiv.2307.09552</a>.'
  ieee: 'P. M. Faller, L. C. Vankadara, A. A. Mastakouri, F. Locatello, and D. Janzing,
    “Self-compatibility: Evaluating causal discovery without ground truth,” <i>arXiv</i>.'
  ista: 'Faller PM, Vankadara LC, Mastakouri AA, Locatello F, Janzing D. Self-compatibility:
    Evaluating causal discovery without ground truth. arXiv, 2307.09552.'
  mla: 'Faller, Philipp M., et al. “Self-Compatibility: Evaluating Causal Discovery
    without Ground Truth.” <i>ArXiv</i>, 2307.09552, doi:<a href="https://doi.org/10.48550/arXiv.2307.09552">10.48550/arXiv.2307.09552</a>.'
  short: P.M. Faller, L.C. Vankadara, A.A. Mastakouri, F. Locatello, D. Janzing, ArXiv
    (n.d.).
date_created: 2023-09-13T12:44:59Z
date_published: 2023-07-18T00:00:00Z
date_updated: 2023-09-13T12:47:53Z
day: '18'
department:
- _id: FrLo
doi: 10.48550/arXiv.2307.09552
extern: '1'
external_id:
  arxiv:
  - '2307.09552'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2307.09552
month: '07'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: 'Self-compatibility: Evaluating causal discovery without ground truth'
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14946'
abstract:
- lang: eng
  text: "We present a unified framework for studying the identifiability of\r\nrepresentations
    learned from simultaneously observed views, such as different\r\ndata modalities.
    We allow a partially observed setting in which each view\r\nconstitutes a nonlinear
    mixture of a subset of underlying latent variables,\r\nwhich can be causally related.
    We prove that the information shared across all\r\nsubsets of any number of views
    can be learned up to a smooth bijection using\r\ncontrastive learning and a single
    encoder per view. We also provide graphical\r\ncriteria indicating which latent
    variables can be identified through a simple\r\nset of rules, which we refer to
    as identifiability algebra. Our general\r\nframework and theoretical results unify
    and extend several previous works on\r\nmulti-view nonlinear ICA, disentanglement,
    and causal representation learning.\r\nWe experimentally validate our claims on
    numerical, image, and multi-modal data\r\nsets. Further, we demonstrate that the
    performance of prior methods is\r\nrecovered in different special cases of our
    setup. Overall, we find that access\r\nto multiple partial views enables us to
    identify a more fine-grained\r\nrepresentation, under the generally milder assumption
    of partial observability."
acknowledgement: "This work was initiated at the Second Bellairs Workshop on Causality
  held at the Bellairs Research Institute, January 6–13, 2022; we thank all workshop
  participants for providing a stimulating research environment. Further, we thank
  Cian Eastwood, Luigi Gresele, Stefano Soatto, Marco Bagatella, and A. René Geist
  for helpful discussion. GM is a member of the Machine Learning Cluster of Excellence,
  EXC number 2064/1 – Project number 390727645. JvK and GM acknowledge support from
  the German Federal Ministry of Education and Research (BMBF) through the Tübingen
  AI Center (FKZ: 01IS18039B). The research of DX and SM was supported by the Air
  Force Office of Scientific Research under award number FA8655-22-1-7155. Any opinions,
  findings, and conclusions or recommendations expressed in this material are those
  of the author(s) and do not necessarily reflect the views of the United States Air
  Force. We also thank SURF for the support in using the Dutch National Supercomputer
  Snellius. DY was supported by an Amazon fellowship and the International Max Planck
  Research School for Intelligent Systems (IMPRS-IS). Work done outside of Amazon.
  SL was supported by an IVADO excellence PhD scholarship and by Samsung Electronics
  Co., Ldt."
article_number: '2311.04056'
article_processing_charge: No
arxiv: 1
author:
- first_name: Dingling
  full_name: Yao, Dingling
  id: d3e02e50-48a8-11ee-8f62-c108061797fa
  last_name: Yao
- first_name: Danru
  full_name: Xu, Danru
  last_name: Xu
- first_name: Sébastien
  full_name: Lachapelle, Sébastien
  last_name: Lachapelle
- first_name: Sara
  full_name: Magliacane, Sara
  last_name: Magliacane
- first_name: Perouz
  full_name: Taslakian, Perouz
  last_name: Taslakian
- first_name: Georg
  full_name: Martius, Georg
  last_name: Martius
- first_name: Julius von
  full_name: Kügelgen, Julius von
  last_name: Kügelgen
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: Yao D, Xu D, Lachapelle S, et al. Multi-view causal representation learning
    with partial observability. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2311.04056">10.48550/arXiv.2311.04056</a>
  apa: Yao, D., Xu, D., Lachapelle, S., Magliacane, S., Taslakian, P., Martius, G.,
    … Locatello, F. (n.d.). Multi-view causal representation learning with partial
    observability. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2311.04056">https://doi.org/10.48550/arXiv.2311.04056</a>
  chicago: Yao, Dingling, Danru Xu, Sébastien Lachapelle, Sara Magliacane, Perouz
    Taslakian, Georg Martius, Julius von Kügelgen, and Francesco Locatello. “Multi-View
    Causal Representation Learning with Partial Observability.” <i>ArXiv</i>, n.d.
    <a href="https://doi.org/10.48550/arXiv.2311.04056">https://doi.org/10.48550/arXiv.2311.04056</a>.
  ieee: D. Yao <i>et al.</i>, “Multi-view causal representation learning with partial
    observability,” <i>arXiv</i>.
  ista: Yao D, Xu D, Lachapelle S, Magliacane S, Taslakian P, Martius G, Kügelgen
    J von, Locatello F. Multi-view causal representation learning with partial observability.
    arXiv, 2311.04056.
  mla: Yao, Dingling, et al. “Multi-View Causal Representation Learning with Partial
    Observability.” <i>ArXiv</i>, 2311.04056, doi:<a href="https://doi.org/10.48550/arXiv.2311.04056">10.48550/arXiv.2311.04056</a>.
  short: D. Yao, D. Xu, S. Lachapelle, S. Magliacane, P. Taslakian, G. Martius, J.
    von Kügelgen, F. Locatello, ArXiv (n.d.).
date_created: 2024-02-07T14:28:34Z
date_published: 2023-11-07T00:00:00Z
date_updated: 2024-02-12T08:07:33Z
day: '07'
department:
- _id: FrLo
doi: 10.48550/arXiv.2311.04056
external_id:
  arxiv:
  - '2311.04056'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2311.04056
month: '11'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Multi-view causal representation learning with partial observability
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14948'
abstract:
- lang: eng
  text: "The extraction of modular object-centric representations for downstream tasks\r\nis
    an emerging area of research. Learning grounded representations of objects\r\nthat
    are guaranteed to be stable and invariant promises robust performance\r\nacross
    different tasks and environments. Slot Attention (SA) learns\r\nobject-centric
    representations by assigning objects to \\textit{slots}, but\r\npresupposes a
    \\textit{single} distribution from which all slots are randomly\r\ninitialised.
    This results in an inability to learn \\textit{specialized} slots\r\nwhich bind
    to specific object types and remain invariant to identity-preserving\r\nchanges
    in object appearance. To address this, we present\r\n\\emph{\\textsc{Co}nditional
    \\textsc{S}lot \\textsc{A}ttention} (\\textsc{CoSA})\r\nusing a novel concept
    of \\emph{Grounded Slot Dictionary} (GSD) inspired by\r\nvector quantization.
    Our proposed GSD comprises (i) canonical object-level\r\nproperty vectors and
    (ii) parametric Gaussian distributions, which define a\r\nprior over the slots.
    We demonstrate the benefits of our method in multiple\r\ndownstream tasks such
    as scene generation, composition, and task adaptation,\r\nwhilst remaining competitive
    with SA in popular object discovery benchmarks."
acknowledgement: "This work was supported by supported by UKRI (grant agreement no.
  EP/S023356/1), in the UKRI\r\nCentre for Doctoral Training in Safe and Trusted AI
  via A. Kori."
article_number: '2307.09437'
article_processing_charge: No
arxiv: 1
author:
- first_name: Avinash
  full_name: Kori, Avinash
  last_name: Kori
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Fabio De Sousa
  full_name: Ribeiro, Fabio De Sousa
  last_name: Ribeiro
- first_name: Francesca
  full_name: Toni, Francesca
  last_name: Toni
- first_name: Ben
  full_name: Glocker, Ben
  last_name: Glocker
citation:
  ama: Kori A, Locatello F, Ribeiro FDS, Toni F, Glocker B. Grounded object centric
    learning. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2307.09437">10.48550/arXiv.2307.09437</a>
  apa: Kori, A., Locatello, F., Ribeiro, F. D. S., Toni, F., &#38; Glocker, B. (n.d.).
    Grounded object centric learning. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2307.09437">https://doi.org/10.48550/arXiv.2307.09437</a>
  chicago: Kori, Avinash, Francesco Locatello, Fabio De Sousa Ribeiro, Francesca Toni,
    and Ben Glocker. “Grounded Object Centric Learning.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2307.09437">https://doi.org/10.48550/arXiv.2307.09437</a>.
  ieee: A. Kori, F. Locatello, F. D. S. Ribeiro, F. Toni, and B. Glocker, “Grounded
    object centric learning,” <i>arXiv</i>.
  ista: Kori A, Locatello F, Ribeiro FDS, Toni F, Glocker B. Grounded object centric
    learning. arXiv, 2307.09437.
  mla: Kori, Avinash, et al. “Grounded Object Centric Learning.” <i>ArXiv</i>, 2307.09437,
    doi:<a href="https://doi.org/10.48550/arXiv.2307.09437">10.48550/arXiv.2307.09437</a>.
  short: A. Kori, F. Locatello, F.D.S. Ribeiro, F. Toni, B. Glocker, ArXiv (n.d.).
date_created: 2024-02-07T14:47:04Z
date_published: 2023-07-18T00:00:00Z
date_updated: 2024-02-12T08:13:12Z
day: '18'
department:
- _id: FrLo
doi: 10.48550/arXiv.2307.09437
external_id:
  arxiv:
  - '2307.09437'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2307.09437
month: '07'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Grounded object centric learning
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14949'
abstract:
- lang: eng
  text: Many approaches have been proposed to use diffusion models to augment training
    datasets for downstream tasks, such as classification. However, diffusion models
    are themselves trained on large datasets, often with noisy annotations, and it
    remains an open question to which extent these models contribute to downstream
    classification performance. In particular, it remains unclear if they generalize
    enough to improve over directly using the additional data of their pre-training
    process for augmentation. We systematically evaluate a range of existing methods
    to generate images from diffusion models and study new extensions to assess their
    benefit for data augmentation. Personalizing diffusion models towards the target
    data outperforms simpler prompting strategies. However, using the pre-training
    data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure,
    leads to even stronger downstream performance. Our study explores the potential
    of diffusion models in generating new training data, and surprisingly finds that
    these sophisticated models are not yet able to beat a simple and strong image
    retrieval baseline on simple downstream vision tasks.
acknowledgement: The authors would like to thank Varad Gunjal and Vishaal Udandarao.
  MFB thanks the International Max Planck Research School for Intelligent Systems
  (IMPRS-IS).
alternative_title:
- TMLR
article_processing_charge: No
article_type: original
author:
- first_name: Max
  full_name: Burg, Max
  last_name: Burg
- first_name: Florian
  full_name: Wenzel, Florian
  last_name: Wenzel
- first_name: Dominik
  full_name: Zietlow, Dominik
  last_name: Zietlow
- first_name: Max
  full_name: Horn, Max
  last_name: Horn
- first_name: Osama
  full_name: Makansi, Osama
  last_name: Makansi
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Chris
  full_name: Russell, Chris
  last_name: Russell
citation:
  ama: Burg M, Wenzel F, Zietlow D, et al. Image retrieval outperforms diffusion models
    on data augmentation. <i>Journal of Machine Learning Research</i>. 2023.
  apa: Burg, M., Wenzel, F., Zietlow, D., Horn, M., Makansi, O., Locatello, F., &#38;
    Russell, C. (2023). Image retrieval outperforms diffusion models on data augmentation.
    <i>Journal of Machine Learning Research</i>. ML Research Press.
  chicago: Burg, Max, Florian Wenzel, Dominik Zietlow, Max Horn, Osama Makansi, Francesco
    Locatello, and Chris Russell. “Image Retrieval Outperforms Diffusion Models on
    Data Augmentation.” <i>Journal of Machine Learning Research</i>. ML Research Press,
    2023.
  ieee: M. Burg <i>et al.</i>, “Image retrieval outperforms diffusion models on data
    augmentation,” <i>Journal of Machine Learning Research</i>. ML Research Press,
    2023.
  ista: Burg M, Wenzel F, Zietlow D, Horn M, Makansi O, Locatello F, Russell C. 2023.
    Image retrieval outperforms diffusion models on data augmentation. Journal of
    Machine Learning Research.
  mla: Burg, Max, et al. “Image Retrieval Outperforms Diffusion Models on Data Augmentation.”
    <i>Journal of Machine Learning Research</i>, ML Research Press, 2023.
  short: M. Burg, F. Wenzel, D. Zietlow, M. Horn, O. Makansi, F. Locatello, C. Russell,
    Journal of Machine Learning Research (2023).
date_created: 2024-02-07T14:57:39Z
date_published: 2023-12-10T00:00:00Z
date_updated: 2024-02-12T08:30:21Z
day: '10'
ddc:
- '000'
department:
- _id: FrLo
file:
- access_level: open_access
  checksum: af87ddea7908923426365347b9c87ba7
  content_type: application/pdf
  creator: ptazenko
  date_created: 2024-02-07T14:57:32Z
  date_updated: 2024-02-07T14:57:32Z
  file_id: '14950'
  file_name: Burg_et_al_2023_Image_retrieval_outperforms.pdf
  file_size: 27325153
  relation: main_file
file_date_updated: 2024-02-07T14:57:32Z
has_accepted_license: '1'
language:
- iso: eng
license: https://creativecommons.org/licenses/by/4.0/
main_file_link:
- open_access: '1'
  url: https://openreview.net/forum?id=xflYdGZMpv
month: '12'
oa: 1
oa_version: Published Version
publication: Journal of Machine Learning Research
publication_identifier:
  eissn:
  - 2835-8856
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: Image retrieval outperforms diffusion models on data augmentation
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14952'
abstract:
- lang: eng
  text: "While different neural models often exhibit latent spaces that are alike
    when exposed to semantically related data, this intrinsic similarity is not always
    immediately discernible. Towards a better understanding of this phenomenon, our
    work shows how representations learned from these neural modules can be translated
    between different pre-trained networks via simpler transformations than previously
    thought. An advantage of this approach is the ability to estimate these transformations
    using standard, well-understood algebraic procedures that have closed-form solutions.
    Our method directly estimates a transformation between two given latent spaces,
    thereby enabling effective stitching of encoders and decoders without additional
    training. We extensively validate the adaptability of this translation procedure
    in different experimental settings: across various trainings, domains, architectures
    (e.g., ResNet, CNN, ViT), and in multiple downstream tasks (classification, reconstruction).
    Notably, we show how it is possible to zero-shot stitch text encoders and vision
    decoders, or vice-versa, yielding surprisingly good classification performance
    in this multimodal setting."
acknowledgement: "This work is supported by the ERC grant no.802554 (SPECGEO), PRIN
  2020 project no.2020TA3K9N (LEGO.AI), and PNRR MUR project PE0000013-FAIR. Francesco\r\nLocatello
  did not contribute to this work at Amazon."
article_number: '2311.00664'
article_processing_charge: No
arxiv: 1
author:
- first_name: Valentino
  full_name: Maiorca, Valentino
  last_name: Maiorca
- first_name: Luca
  full_name: Moschella, Luca
  last_name: Moschella
- first_name: Antonio
  full_name: Norelli, Antonio
  last_name: Norelli
- first_name: Marco
  full_name: Fumero, Marco
  last_name: Fumero
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Emanuele
  full_name: Rodolà, Emanuele
  last_name: Rodolà
citation:
  ama: Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent
    space translation via semantic alignment. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2311.00664">10.48550/arXiv.2311.00664</a>
  apa: Maiorca, V., Moschella, L., Norelli, A., Fumero, M., Locatello, F., &#38; Rodolà,
    E. (n.d.). Latent space translation via semantic alignment. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2311.00664">https://doi.org/10.48550/arXiv.2311.00664</a>
  chicago: Maiorca, Valentino, Luca Moschella, Antonio Norelli, Marco Fumero, Francesco
    Locatello, and Emanuele Rodolà. “Latent Space Translation via Semantic Alignment.”
    <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2311.00664">https://doi.org/10.48550/arXiv.2311.00664</a>.
  ieee: V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, and E. Rodolà,
    “Latent space translation via semantic alignment,” <i>arXiv</i>.
  ista: Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent
    space translation via semantic alignment. arXiv, 2311.00664.
  mla: Maiorca, Valentino, et al. “Latent Space Translation via Semantic Alignment.”
    <i>ArXiv</i>, 2311.00664, doi:<a href="https://doi.org/10.48550/arXiv.2311.00664">10.48550/arXiv.2311.00664</a>.
  short: V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, E. Rodolà,
    ArXiv (n.d.).
date_created: 2024-02-07T15:08:55Z
date_published: 2023-11-01T00:00:00Z
date_updated: 2024-02-12T09:40:23Z
day: '01'
department:
- _id: FrLo
doi: 10.48550/arXiv.2311.00664
external_id:
  arxiv:
  - '2311.00664'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2311.00664
month: '11'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Latent space translation via semantic alignment
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14953'
abstract:
- lang: eng
  text: This paper provides statistical sample complexity bounds for score-matching
    and its applications in causal discovery. We demonstrate that accurate estimation
    of the score function is achievable by training a standard deep ReLU neural network
    using stochastic gradient descent. We establish bounds on the error rate of recovering
    causal relationships using the score-matching-based causal discovery method of
    Rolland et al. [2022], assuming a sufficiently good estimation of the score function.
    Finally, we analyze the upper bound of score-matching estimation within score-based
    generative modeling, which has been applied to causal discovery but is also of
    independent interest within the domain of generative models.
acknowledgement: 'We are thankful to the reviewers for providing constructive feedback,
  and to Kun Zhang and Dominik Janzing for helpful discussion on the special case
  of deterministic children. This work was supported by the Hasler Foundation Program:
  Hasler Responsible AI (project number 21043) and by the Swiss National Science Foundation
  (SNSF) under grant number 200021_205011. Francesco Locatello did not contribute
  to this work at Amazon.'
article_number: '2310.18123'
article_processing_charge: No
arxiv: 1
author:
- first_name: Zhenyu
  full_name: Zhu, Zhenyu
  last_name: Zhu
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Volkan
  full_name: Cevher, Volkan
  last_name: Cevher
citation:
  ama: 'Zhu Z, Locatello F, Cevher V. Sample complexity bounds for score-matching:
    Causal discovery and generative modeling. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2310.18123">10.48550/arXiv.2310.18123</a>'
  apa: 'Zhu, Z., Locatello, F., &#38; Cevher, V. (n.d.). Sample complexity bounds
    for score-matching: Causal discovery and generative modeling. <i>arXiv</i>. <a
    href="https://doi.org/10.48550/arXiv.2310.18123">https://doi.org/10.48550/arXiv.2310.18123</a>'
  chicago: 'Zhu, Zhenyu, Francesco Locatello, and Volkan Cevher. “Sample Complexity
    Bounds for Score-Matching: Causal Discovery and Generative Modeling.” <i>ArXiv</i>,
    n.d. <a href="https://doi.org/10.48550/arXiv.2310.18123">https://doi.org/10.48550/arXiv.2310.18123</a>.'
  ieee: 'Z. Zhu, F. Locatello, and V. Cevher, “Sample complexity bounds for score-matching:
    Causal discovery and generative modeling,” <i>arXiv</i>.'
  ista: 'Zhu Z, Locatello F, Cevher V. Sample complexity bounds for score-matching:
    Causal discovery and generative modeling. arXiv, 2310.18123.'
  mla: 'Zhu, Zhenyu, et al. “Sample Complexity Bounds for Score-Matching: Causal Discovery
    and Generative Modeling.” <i>ArXiv</i>, 2310.18123, doi:<a href="https://doi.org/10.48550/arXiv.2310.18123">10.48550/arXiv.2310.18123</a>.'
  short: Z. Zhu, F. Locatello, V. Cevher, ArXiv (n.d.).
date_created: 2024-02-07T15:11:11Z
date_published: 2023-10-27T00:00:00Z
date_updated: 2024-02-12T09:45:58Z
day: '27'
department:
- _id: FrLo
doi: 10.48550/arXiv.2310.18123
external_id:
  arxiv:
  - '2310.18123'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2310.18123
month: '10'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: 'Sample complexity bounds for score-matching: Causal discovery and generative
  modeling'
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14954'
abstract:
- lang: eng
  text: "When domain knowledge is limited and experimentation is restricted by ethical,
    financial, or time constraints, practitioners turn to observational causal discovery
    methods to recover the causal structure, exploiting the statistical properties
    of their data. Because causal discovery without further assumptions is an ill-posed
    problem, each algorithm comes with its own set of\r\nusually untestable assumptions,
    some of which are hard to meet in real datasets. Motivated by these considerations,
    this paper extensively benchmarks the empirical performance of recent causal discovery
    methods on observational i.i.d. data generated under different background conditions,
    allowing for violations of the critical assumptions required by each selected
    approach. Our experimental findings show that score matching-based methods demonstrate\r\nsurprising
    performance in the false positive and false negative rate of the inferred graph
    in these challenging scenarios, and we provide theoretical insights into their
    performance. This work is also the first effort to benchmark the stability of
    causal discovery algorithms with respect to the values of their hyperparameters.
    Finally, we hope this paper will set a new standard for the evaluation of causal
    discovery methods and can serve as an accessible entry point for practitioners
    interested in the field, highlighting the empirical implications of different
    algorithm choices."
acknowledgement: "We thank Kun Zhang and Carl-Johann Simon-Gabriel for the insightful
  discussions. This work\r\nhas been supported by AFOSR, grant n. FA8655-20-1-7035.
  FM is supported by Programma\r\nOperativo Nazionale ricerca e innovazione 2014-2020.
  FM partially contributed to this work during an internship at Amazon Web Services
  with FL. FL partially contributed while at AWS."
article_number: '2310.13387'
article_processing_charge: No
arxiv: 1
author:
- first_name: Francesco
  full_name: Montagna, Francesco
  last_name: Montagna
- first_name: Atalanti A.
  full_name: Mastakouri, Atalanti A.
  last_name: Mastakouri
- first_name: Elias
  full_name: Eulig, Elias
  last_name: Eulig
- first_name: Nicoletta
  full_name: Noceti, Nicoletta
  last_name: Noceti
- first_name: Lorenzo
  full_name: Rosasco, Lorenzo
  last_name: Rosasco
- first_name: Dominik
  full_name: Janzing, Dominik
  last_name: Janzing
- first_name: Bryon
  full_name: Aragam, Bryon
  last_name: Aragam
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: Montagna F, Mastakouri AA, Eulig E, et al. Assumption violations in causal
    discovery and the robustness of score matching. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2310.13387">10.48550/arXiv.2310.13387</a>
  apa: Montagna, F., Mastakouri, A. A., Eulig, E., Noceti, N., Rosasco, L., Janzing,
    D., … Locatello, F. (n.d.). Assumption violations in causal discovery and the
    robustness of score matching. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2310.13387">https://doi.org/10.48550/arXiv.2310.13387</a>
  chicago: Montagna, Francesco, Atalanti A. Mastakouri, Elias Eulig, Nicoletta Noceti,
    Lorenzo Rosasco, Dominik Janzing, Bryon Aragam, and Francesco Locatello. “Assumption
    Violations in Causal Discovery and the Robustness of Score Matching.” <i>ArXiv</i>,
    n.d. <a href="https://doi.org/10.48550/arXiv.2310.13387">https://doi.org/10.48550/arXiv.2310.13387</a>.
  ieee: F. Montagna <i>et al.</i>, “Assumption violations in causal discovery and
    the robustness of score matching,” <i>arXiv</i>.
  ista: Montagna F, Mastakouri AA, Eulig E, Noceti N, Rosasco L, Janzing D, Aragam
    B, Locatello F. Assumption violations in causal discovery and the robustness of
    score matching. arXiv, 2310.13387.
  mla: Montagna, Francesco, et al. “Assumption Violations in Causal Discovery and
    the Robustness of Score Matching.” <i>ArXiv</i>, 2310.13387, doi:<a href="https://doi.org/10.48550/arXiv.2310.13387">10.48550/arXiv.2310.13387</a>.
  short: F. Montagna, A.A. Mastakouri, E. Eulig, N. Noceti, L. Rosasco, D. Janzing,
    B. Aragam, F. Locatello, ArXiv (n.d.).
date_created: 2024-02-07T15:11:56Z
date_published: 2023-10-20T00:00:00Z
date_updated: 2024-02-12T09:51:15Z
day: '20'
department:
- _id: FrLo
doi: 10.48550/arXiv.2310.13387
external_id:
  arxiv:
  - '2310.13387'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2310.13387
month: '10'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Assumption violations in causal discovery and the robustness of score matching
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14958'
abstract:
- lang: eng
  text: Causal representation learning (CRL) aims at identifying high-level causal
    variables from low-level data, e.g. images. Current methods usually assume that
    all causal variables are captured in the high-dimensional observations. In this
    work, we focus on learning causal representations from data under partial observability,
    i.e., when some of the causal variables are not observed in the measurements,
    and the set of masked variables changes across the different samples. We introduce
    some initial theoretical results for identifying causal variables under partial
    observability by exploiting a sparsity regularizer, focusing in particular on
    the linear and piecewise linear mixing function case. We provide a theorem that
    allows us to identify the causal variables up to permutation and element-wise
    linear transformations in the linear case and a lemma that allows us to identify
    causal variables up to linear transformation in the piecewise case. Finally, we
    provide a conjecture that would allow us to identify the causal variables up to
    permutation and element-wise linear transformations also in the piecewise linear
    case. We test the theorem and conjecture on simulated data, showing the effectiveness
    of our method.
acknowledgement: "This work was initiated at the Second Bellairs Workshop on Causality
  held at the Bellairs Research Institute, January 6–13, 2022; we thank all workshop
  participants for providing a stimulating research environment. The research of DX
  and SM was supported by the Air Force Office of Scientific Research under award
  number FA8655-22-1-7155. Any opinions, findings, and conclusions or recommendations
  expressed in this material are those of the author(s) and do not necessarily reflect
  the views of the United States Air Force. We also thank SURF for the support in
  using the Dutch National Supercomputer Snellius. DY was supported by an Amazon fellowship
  and the International Max Planck Research School for Intelligent Systems (IMPRS-IS).
  Work done outside of Amazon. SL was supported by an IVADO excellence PhD scholarship
  and by Samsung Electronics Co., Ltd. JvK acknowledges support from the German Federal
  Ministry of Education and Research (BMBF) through the Tübingen AI Center (FKZ: 01IS18039B)."
article_number: '54'
article_processing_charge: No
author:
- first_name: Danru
  full_name: Xu, Danru
  last_name: Xu
- first_name: Dingling
  full_name: Yao, Dingling
  id: d3e02e50-48a8-11ee-8f62-c108061797fa
  last_name: Yao
- first_name: Sebastien
  full_name: Lachapelle, Sebastien
  last_name: Lachapelle
- first_name: Perouz
  full_name: Taslakian, Perouz
  last_name: Taslakian
- first_name: Julius
  full_name: von Kügelgen, Julius
  last_name: von Kügelgen
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Sara
  full_name: Magliacane, Sara
  last_name: Magliacane
citation:
  ama: 'Xu D, Yao D, Lachapelle S, et al. A sparsity principle for partially observable
    causal representation learning. In: <i>Causal Representation Learning Workshop
    at NeurIPS 2023</i>. OpenReview; 2023.'
  apa: 'Xu, D., Yao, D., Lachapelle, S., Taslakian, P., von Kügelgen, J., Locatello,
    F., &#38; Magliacane, S. (2023). A sparsity principle for partially observable
    causal representation learning. In <i>Causal Representation Learning Workshop
    at NeurIPS 2023</i>. New Orleans, LA, United States: OpenReview.'
  chicago: Xu, Danru, Dingling Yao, Sebastien Lachapelle, Perouz Taslakian, Julius
    von Kügelgen, Francesco Locatello, and Sara Magliacane. “A Sparsity Principle
    for Partially Observable Causal Representation Learning.” In <i>Causal Representation
    Learning Workshop at NeurIPS 2023</i>. OpenReview, 2023.
  ieee: D. Xu <i>et al.</i>, “A sparsity principle for partially observable causal
    representation learning,” in <i>Causal Representation Learning Workshop at NeurIPS
    2023</i>, New Orleans, LA, United States, 2023.
  ista: 'Xu D, Yao D, Lachapelle S, Taslakian P, von Kügelgen J, Locatello F, Magliacane
    S. 2023. A sparsity principle for partially observable causal representation learning.
    Causal Representation Learning Workshop at NeurIPS 2023. CRL: Causal Representation
    Learning Workshop at NeurIPS, 54.'
  mla: Xu, Danru, et al. “A Sparsity Principle for Partially Observable Causal Representation
    Learning.” <i>Causal Representation Learning Workshop at NeurIPS 2023</i>, 54,
    OpenReview, 2023.
  short: D. Xu, D. Yao, S. Lachapelle, P. Taslakian, J. von Kügelgen, F. Locatello,
    S. Magliacane, in:, Causal Representation Learning Workshop at NeurIPS 2023, OpenReview,
    2023.
conference:
  end_date: 2023-12-15
  location: New Orleans, LA, United States
  name: 'CRL: Causal Representation Learning Workshop at NeurIPS'
  start_date: 2023-12-15
date_created: 2024-02-07T15:17:51Z
date_published: 2023-12-05T00:00:00Z
date_updated: 2024-02-13T08:59:27Z
day: '05'
ddc:
- '000'
department:
- _id: FrLo
file:
- access_level: open_access
  checksum: 484efc27bda75ed6666044989695d9b6
  content_type: application/pdf
  creator: dernst
  date_created: 2024-02-13T08:50:53Z
  date_updated: 2024-02-13T08:50:53Z
  file_id: '14982'
  file_name: 2023_CRL_Xu.pdf
  file_size: 552357
  relation: main_file
  success: 1
file_date_updated: 2024-02-13T08:50:53Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://openreview.net/forum?id=Whr6uobelR
month: '12'
oa: 1
oa_version: Published Version
publication: Causal Representation Learning Workshop at NeurIPS 2023
publication_status: published
publisher: OpenReview
quality_controlled: '1'
status: public
title: A sparsity principle for partially observable causal representation learning
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14961'
abstract:
- lang: eng
  text: "The use of simulated data in the field of causal discovery is ubiquitous
    due to the scarcity of annotated real data. Recently, Reisach et al., 2021 highlighted
    the emergence of patterns in simulated linear data, which displays increasing
    marginal variance in the casual direction. As an ablation in their experiments,
    Montagna et al., 2023 found that similar patterns may emerge in\r\nnonlinear models
    for the variance of the score vector $\\nabla \\log p_{\\mathbf{X}}$, and introduced
    the ScoreSort algorithm. In this work, we formally define and characterize this
    score-sortability pattern of nonlinear additive noise models. We find that it
    defines a class of identifiable (bivariate) causal models overlapping with nonlinear
    additive noise models. We\r\ntheoretically demonstrate the advantages of ScoreSort
    in terms of statistical efficiency compared to prior state-of-the-art score matching-based
    methods and empirically show the score-sortability of the most common synthetic
    benchmarks in the literature. Our findings remark (1) the lack of diversity in
    the data as an important limitation in the evaluation of nonlinear causal discovery
    approaches, (2) the importance of thoroughly testing different settings within
    a problem class, and (3) the importance of analyzing statistical properties in\r\ncausal
    discovery, where research is often limited to defining identifiability conditions
    of the model. "
article_number: '2310.14246'
article_processing_charge: No
arxiv: 1
author:
- first_name: Francesco
  full_name: Montagna, Francesco
  last_name: Montagna
- first_name: Nicoletta
  full_name: Noceti, Nicoletta
  last_name: Noceti
- first_name: Lorenzo
  full_name: Rosasco, Lorenzo
  last_name: Rosasco
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: Montagna F, Noceti N, Rosasco L, Locatello F. Shortcuts for causal discovery
    of nonlinear models by score matching. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2310.14246">10.48550/arXiv.2310.14246</a>
  apa: Montagna, F., Noceti, N., Rosasco, L., &#38; Locatello, F. (n.d.). Shortcuts
    for causal discovery of nonlinear models by score matching. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2310.14246">https://doi.org/10.48550/arXiv.2310.14246</a>
  chicago: Montagna, Francesco, Nicoletta Noceti, Lorenzo Rosasco, and Francesco Locatello.
    “Shortcuts for Causal Discovery of Nonlinear Models by Score Matching.” <i>ArXiv</i>,
    n.d. <a href="https://doi.org/10.48550/arXiv.2310.14246">https://doi.org/10.48550/arXiv.2310.14246</a>.
  ieee: F. Montagna, N. Noceti, L. Rosasco, and F. Locatello, “Shortcuts for causal
    discovery of nonlinear models by score matching,” <i>arXiv</i>.
  ista: Montagna F, Noceti N, Rosasco L, Locatello F. Shortcuts for causal discovery
    of nonlinear models by score matching. arXiv, 2310.14246.
  mla: Montagna, Francesco, et al. “Shortcuts for Causal Discovery of Nonlinear Models
    by Score Matching.” <i>ArXiv</i>, 2310.14246, doi:<a href="https://doi.org/10.48550/arXiv.2310.14246">10.48550/arXiv.2310.14246</a>.
  short: F. Montagna, N. Noceti, L. Rosasco, F. Locatello, ArXiv (n.d.).
date_created: 2024-02-08T15:31:46Z
date_published: 2023-10-22T00:00:00Z
date_updated: 2024-02-12T10:03:33Z
day: '22'
department:
- _id: FrLo
doi: 10.48550/arXiv.2310.14246
external_id:
  arxiv:
  - '2310.14246'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2310.14246
month: '10'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Shortcuts for causal discovery of nonlinear models by score matching
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14962'
abstract:
- lang: eng
  text: "In this paper, we show that recent advances in video representation learning\r\nand
    pre-trained vision-language models allow for substantial improvements in\r\nself-supervised
    video object localization. We propose a method that first\r\nlocalizes objects
    in videos via a slot attention approach and then assigns text\r\nto the obtained
    slots. The latter is achieved by an unsupervised way to read\r\nlocalized semantic
    information from the pre-trained CLIP model. The resulting\r\nvideo object localization
    is entirely unsupervised apart from the implicit\r\nannotation contained in CLIP,
    and it is effectively the first unsupervised\r\napproach that yields good results
    on regular video benchmarks."
article_number: '2309.09858'
article_processing_charge: No
arxiv: 1
author:
- first_name: Ke
  full_name: Fan, Ke
  last_name: Fan
- first_name: Zechen
  full_name: Bai, Zechen
  last_name: Bai
- first_name: Tianjun
  full_name: Xiao, Tianjun
  last_name: Xiao
- first_name: Dominik
  full_name: Zietlow, Dominik
  last_name: Zietlow
- first_name: Max
  full_name: Horn, Max
  last_name: Horn
- first_name: Zixu
  full_name: Zhao, Zixu
  last_name: Zhao
- first_name: Carl-Johann
  full_name: Simon-Gabriel, Carl-Johann
  last_name: Simon-Gabriel
- first_name: Mike Zheng
  full_name: Shou, Mike Zheng
  last_name: Shou
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Bernt
  full_name: Schiele, Bernt
  last_name: Schiele
- first_name: Thomas
  full_name: Brox, Thomas
  last_name: Brox
- first_name: Zheng
  full_name: Zhang, Zheng
  last_name: Zhang
- first_name: Yanwei
  full_name: Fu, Yanwei
  last_name: Fu
- first_name: Tong
  full_name: He, Tong
  last_name: He
citation:
  ama: Fan K, Bai Z, Xiao T, et al. Unsupervised open-vocabulary object localization
    in videos. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2309.09858">10.48550/arXiv.2309.09858</a>
  apa: Fan, K., Bai, Z., Xiao, T., Zietlow, D., Horn, M., Zhao, Z., … He, T. (n.d.).
    Unsupervised open-vocabulary object localization in videos. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2309.09858">https://doi.org/10.48550/arXiv.2309.09858</a>
  chicago: Fan, Ke, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao,
    Carl-Johann Simon-Gabriel, et al. “Unsupervised Open-Vocabulary
    Object Localization in Videos.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2309.09858">https://doi.org/10.48550/arXiv.2309.09858</a>.
  ieee: K. Fan <i>et al.</i>, “Unsupervised open-vocabulary object localization in
    videos,” <i>arXiv</i>.
  ista: Fan K, Bai Z, Xiao T, Zietlow D, Horn M, Zhao Z, Simon-Gabriel C-J, Shou MZ,
    Locatello F, Schiele B, Brox T, Zhang Z, Fu Y, He T. Unsupervised
    open-vocabulary object localization in videos. arXiv, 2309.09858.
  mla: Fan, Ke, et al. “Unsupervised Open-Vocabulary Object Localization in Videos.”
    <i>ArXiv</i>, 2309.09858, doi:<a href="https://doi.org/10.48550/arXiv.2309.09858">10.48550/arXiv.2309.09858</a>.
  short: K. Fan, Z. Bai, T. Xiao, D. Zietlow, M. Horn, Z. Zhao, C.-J. Simon-Gabriel,
    M.Z. Shou, F. Locatello, B. Schiele, T. Brox, Z. Zhang, Y. Fu,
    T. He, ArXiv (n.d.).
date_created: 2024-02-08T15:33:39Z
date_published: 2023-09-18T00:00:00Z
date_updated: 2024-02-12T10:12:22Z
day: '18'
department:
- _id: FrLo
doi: 10.48550/arXiv.2309.09858
extern: '1'
external_id:
  arxiv:
  - '2309.09858'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2309.09858
month: '09'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Unsupervised open-vocabulary object localization in videos
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14963'
abstract:
- lang: eng
  text: "Unsupervised object-centric learning methods allow the partitioning of scenes\r\ninto
    entities without additional localization information and are excellent\r\ncandidates
    for reducing the annotation burden of multiple-object tracking (MOT)\r\npipelines.
    Unfortunately, they lack two key properties: objects are often split\r\ninto parts
    and are not consistently tracked over time. In fact,\r\nstate-of-the-art models
    achieve pixel-level accuracy and temporal consistency\r\nby relying on supervised
    object detection with additional ID labels for the\r\nassociation through time.
    This paper proposes a video object-centric model for\r\nMOT. It consists of an
    index-merge module that adapts the object-centric slots\r\ninto detection outputs
    and an object memory module that builds complete object\r\nprototypes to handle
    occlusions. Benefited from object-centric learning, we\r\nonly require sparse
    detection labels (0%-6.25%) for object localization and\r\nfeature binding. Relying
    on our self-supervised\r\nExpectation-Maximization-inspired loss for object association,
    our approach\r\nrequires no ID labels. Our experiments significantly narrow the
    gap between the\r\nexisting object-centric model and the fully supervised state-of-the-art
    and\r\noutperform several unsupervised trackers."
article_number: '2309.00233'
article_processing_charge: No
arxiv: 1
author:
- first_name: Zixu
  full_name: Zhao, Zixu
  last_name: Zhao
- first_name: Jiaze
  full_name: Wang, Jiaze
  last_name: Wang
- first_name: Max
  full_name: Horn, Max
  last_name: Horn
- first_name: Yizhuo
  full_name: Ding, Yizhuo
  last_name: Ding
- first_name: Tong
  full_name: He, Tong
  last_name: He
- first_name: Zechen
  full_name: Bai, Zechen
  last_name: Bai
- first_name: Dominik
  full_name: Zietlow, Dominik
  last_name: Zietlow
- first_name: Carl-Johann
  full_name: Simon-Gabriel, Carl-Johann
  last_name: Simon-Gabriel
- first_name: Bing
  full_name: Shuai, Bing
  last_name: Shuai
- first_name: Zhuowen
  full_name: Tu, Zhuowen
  last_name: Tu
- first_name: Thomas
  full_name: Brox, Thomas
  last_name: Brox
- first_name: Bernt
  full_name: Schiele, Bernt
  last_name: Schiele
- first_name: Yanwei
  full_name: Fu, Yanwei
  last_name: Fu
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Zheng
  full_name: Zhang, Zheng
  last_name: Zhang
- first_name: Tianjun
  full_name: Xiao, Tianjun
  last_name: Xiao
citation:
  ama: Zhao Z, Wang J, Horn M, et al. Object-centric multiple object tracking. <i>arXiv</i>.
    doi:<a href="https://doi.org/10.48550/arXiv.2309.00233">10.48550/arXiv.2309.00233</a>
  apa: Zhao, Z., Wang, J., Horn, M., Ding, Y., He, T., Bai, Z., … Xiao, T. (n.d.).
    Object-centric multiple object tracking. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2309.00233">https://doi.org/10.48550/arXiv.2309.00233</a>
  chicago: Zhao, Zixu, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik
    Zietlow, et al. “Object-Centric Multiple Object Tracking.” <i>ArXiv</i>, n.d.
    <a href="https://doi.org/10.48550/arXiv.2309.00233">https://doi.org/10.48550/arXiv.2309.00233</a>.
  ieee: Z. Zhao <i>et al.</i>, “Object-centric multiple object tracking,” <i>arXiv</i>.
  ista: Zhao Z, Wang J, Horn M, Ding Y, He T, Bai Z, Zietlow D, Simon-Gabriel C-J,
    Shuai B, Tu Z, Brox T, Schiele B, Fu Y, Locatello F, Zhang Z, Xiao T.
    Object-centric multiple object tracking. arXiv, 2309.00233.
  mla: Zhao, Zixu, et al. “Object-Centric Multiple Object Tracking.” <i>ArXiv</i>,
    2309.00233, doi:<a href="https://doi.org/10.48550/arXiv.2309.00233">10.48550/arXiv.2309.00233</a>.
  short: Z. Zhao, J. Wang, M. Horn, Y. Ding, T. He, Z. Bai, D. Zietlow, C.-J. Simon-Gabriel,
    B. Shuai, Z. Tu, T. Brox, B. Schiele, Y. Fu, F. Locatello,
    Z. Zhang, T. Xiao, ArXiv (n.d.).
date_created: 2024-02-08T15:34:43Z
date_published: 2023-09-01T00:00:00Z
date_updated: 2024-02-12T10:16:21Z
day: '01'
department:
- _id: FrLo
doi: 10.48550/arXiv.2309.00233
extern: '1'
external_id:
  arxiv:
  - '2309.00233'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2309.00233
month: '09'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Object-centric multiple object tracking
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14105'
abstract:
- lang: eng
  text: "Despite their recent success, deep neural networks continue to perform poorly
    when they encounter distribution shifts at test time. Many recently proposed approaches
    try to counter this by aligning the model to the new distribution prior to inference.
    With no labels available this requires unsupervised objectives to adapt the model
    on the observed test data. In this paper, we propose Test-Time SelfTraining (TeST):
    a technique that takes as input a model trained on some source data and a novel
    data distribution at test time, and learns invariant and robust representations
    using a student-teacher framework. We find that models adapted using TeST significantly
    improve over baseline testtime adaptation algorithms. TeST achieves competitive
    performance to modern domain adaptation algorithms [4, 43], while having access
    to 5-10x less data at time of adaption. We thoroughly evaluate a variety of baselines
    on two tasks:\r\nobject detection and image segmentation and find that models
    adapted with TeST. We find that TeST sets the new stateof-the art for test-time
    domain adaptation algorithms. "
article_processing_charge: No
arxiv: 1
author:
- first_name: Samarth
  full_name: Sinha, Samarth
  last_name: Sinha
- first_name: Peter
  full_name: Gehler, Peter
  last_name: Gehler
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Bernt
  full_name: Schiele, Bernt
  last_name: Schiele
citation:
  ama: 'Sinha S, Gehler P, Locatello F, Schiele B. TeST: Test-time Self-Training under
    distribution shift. In: <i>2023 IEEE/CVF Winter Conference on Applications of
    Computer Vision</i>. Institute of Electrical and Electronics Engineers; 2023.
    doi:<a href="https://doi.org/10.1109/wacv56688.2023.00278">10.1109/wacv56688.2023.00278</a>'
  apa: 'Sinha, S., Gehler, P., Locatello, F., &#38; Schiele, B. (2023). TeST: Test-time
    Self-Training under distribution shift. In <i>2023 IEEE/CVF Winter Conference
    on Applications of Computer Vision</i>. Waikoloa, HI, United States: Institute
    of Electrical and Electronics Engineers. <a href="https://doi.org/10.1109/wacv56688.2023.00278">https://doi.org/10.1109/wacv56688.2023.00278</a>'
  chicago: 'Sinha, Samarth, Peter Gehler, Francesco Locatello, and Bernt Schiele.
    “TeST: Test-Time Self-Training under Distribution Shift.” In <i>2023 IEEE/CVF
    Winter Conference on Applications of Computer Vision</i>. Institute of Electrical
    and Electronics Engineers, 2023. <a href="https://doi.org/10.1109/wacv56688.2023.00278">https://doi.org/10.1109/wacv56688.2023.00278</a>.'
  ieee: 'S. Sinha, P. Gehler, F. Locatello, and B. Schiele, “TeST: Test-time Self-Training
    under distribution shift,” in <i>2023 IEEE/CVF Winter Conference on Applications
    of Computer Vision</i>, Waikoloa, HI, United States, 2023.'
  ista: 'Sinha S, Gehler P, Locatello F, Schiele B. 2023. TeST: Test-time Self-Training
    under distribution shift. 2023 IEEE/CVF Winter Conference on Applications of Computer
    Vision. WACV: Winter Conference on Applications of Computer Vision.'
  mla: 'Sinha, Samarth, et al. “TeST: Test-Time Self-Training under Distribution Shift.”
    <i>2023 IEEE/CVF Winter Conference on Applications of Computer Vision</i>, Institute
    of Electrical and Electronics Engineers, 2023, doi:<a href="https://doi.org/10.1109/wacv56688.2023.00278">10.1109/wacv56688.2023.00278</a>.'
  short: S. Sinha, P. Gehler, F. Locatello, B. Schiele, in:, 2023 IEEE/CVF Winter
    Conference on Applications of Computer Vision, Institute of Electrical and Electronics
    Engineers, 2023.
conference:
  end_date: 2023-01-07
  location: Waikoloa, HI, United States
  name: 'WACV: Winter Conference on Applications of Computer Vision'
  start_date: 2023-01-02
date_created: 2023-08-21T12:11:38Z
date_published: 2023-02-06T00:00:00Z
date_updated: 2023-09-06T10:26:56Z
day: '06'
department:
- _id: FrLo
doi: 10.1109/wacv56688.2023.00278
extern: '1'
external_id:
  arxiv:
  - '2209.11459'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2209.11459
month: '02'
oa: 1
oa_version: Preprint
publication: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision
publication_identifier:
  eissn:
  - 2642-9381
  isbn:
  - '9781665493475'
publication_status: published
publisher: Institute of Electrical and Electronics Engineers
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'TeST: Test-time Self-Training under distribution shift'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14207'
abstract:
- lang: eng
  text: The binding problem in human cognition, concerning how the brain represents
    and connects objects within a fixed network of neural connections, remains a subject
    of intense debate. Most machine learning efforts addressing this issue in an unsupervised
    setting have focused on slot-based methods, which may be limiting due to their
    discrete nature and difficulty in expressing uncertainty. Recently, the Complex AutoEncoder
    was proposed as an alternative that learns continuous and distributed object-centric
    representations. However, it is only applicable to simple toy data. In this paper,
    we present Rotating Features, a generalization of complex-valued features to higher
    dimensions, and a new evaluation procedure for extracting objects from distributed
    representations. Additionally, we show the applicability of our approach to pre-trained
    features. Together, these advancements enable us to scale distributed object-centric
    representations from simple toy to real-world data. We believe this work advances
    a new paradigm for addressing the binding problem in machine learning and has
    the potential to inspire further innovation in the field.
article_number: '2306.00600'
article_processing_charge: No
arxiv: 1
author:
- first_name: Sindy
  full_name: Löwe, Sindy
  last_name: Löwe
- first_name: Phillip
  full_name: Lippe, Phillip
  last_name: Lippe
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Max
  full_name: Welling, Max
  last_name: Welling
citation:
  ama: Löwe S, Lippe P, Locatello F, Welling M. Rotating features for object discovery.
    <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2306.00600">10.48550/arXiv.2306.00600</a>
  apa: Löwe, S., Lippe, P., Locatello, F., &#38; Welling, M. (n.d.). Rotating features
    for object discovery. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2306.00600">https://doi.org/10.48550/arXiv.2306.00600</a>
  chicago: Löwe, Sindy, Phillip Lippe, Francesco Locatello, and Max Welling. “Rotating
    Features for Object Discovery.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2306.00600">https://doi.org/10.48550/arXiv.2306.00600</a>.
  ieee: S. Löwe, P. Lippe, F. Locatello, and M. Welling, “Rotating features for object
    discovery,” <i>arXiv</i>.
  ista: Löwe S, Lippe P, Locatello F, Welling M. Rotating features for object discovery.
    arXiv, 2306.00600.
  mla: Löwe, Sindy, et al. “Rotating Features for Object Discovery.” <i>ArXiv</i>,
    2306.00600, doi:<a href="https://doi.org/10.48550/arXiv.2306.00600">10.48550/arXiv.2306.00600</a>.
  short: S. Löwe, P. Lippe, F. Locatello, M. Welling, ArXiv (n.d.).
date_created: 2023-08-22T14:18:00Z
date_published: 2023-06-01T00:00:00Z
date_updated: 2024-02-12T09:53:44Z
day: '01'
department:
- _id: FrLo
doi: 10.48550/arXiv.2306.00600
external_id:
  arxiv:
  - '2306.00600'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2306.00600
month: '06'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Rotating features for object discovery
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14208'
abstract:
- lang: eng
  text: This paper focuses on over-parameterized deep neural networks (DNNs) with
    ReLU activation functions and proves that when the data distribution is well-separated,
    DNNs can achieve Bayes-optimal test error for classification while obtaining (nearly)
    zero training error under the lazy training regime. For this purpose, we unify
    three interrelated concepts of overparameterization, benign overfitting, and the
    Lipschitz constant of DNNs. Our results indicate that interpolating with smoother
    functions leads to better generalization. Furthermore, we investigate the special
    case where interpolating smooth ground-truth functions is performed by DNNs under
    the Neural Tangent Kernel (NTK) regime for generalization. Our result demonstrates
    that the generalization error converges to a constant order that only depends
    on label noise and initialization noise, which theoretically verifies benign overfitting.
    Our analysis provides a tight lower bound on the normalized margin under non-smooth
    activation functions, as well as the minimum eigenvalue of NTK under high-dimensional
    settings, which is of independent interest in learning theory.
alternative_title:
- PMLR
article_processing_charge: No
arxiv: 1
author:
- first_name: Zhenyu
  full_name: Zhu, Zhenyu
  last_name: Zhu
- first_name: Fanghui
  full_name: Liu, Fanghui
  last_name: Liu
- first_name: Grigorios G
  full_name: Chrysos, Grigorios G
  last_name: Chrysos
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Volkan
  full_name: Cevher, Volkan
  last_name: Cevher
citation:
  ama: 'Zhu Z, Liu F, Chrysos GG, Locatello F, Cevher V. Benign overfitting in deep
    neural networks under lazy training. In: <i>Proceedings of the 40th International
    Conference on Machine Learning</i>. Vol 202. ML Research Press; 2023:43105-43128.'
  apa: 'Zhu, Z., Liu, F., Chrysos, G. G., Locatello, F., &#38; Cevher, V. (2023).
    Benign overfitting in deep neural networks under lazy training. In <i>Proceedings
    of the 40th International Conference on Machine Learning</i> (Vol. 202, pp. 43105–43128).
    Honolulu, Hawaii, United States: ML Research Press.'
  chicago: Zhu, Zhenyu, Fanghui Liu, Grigorios G Chrysos, Francesco Locatello, and
    Volkan Cevher. “Benign Overfitting in Deep Neural Networks under Lazy Training.”
    In <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    202:43105–28. ML Research Press, 2023.
  ieee: Z. Zhu, F. Liu, G. G. Chrysos, F. Locatello, and V. Cevher, “Benign overfitting
    in deep neural networks under lazy training,” in <i>Proceedings of the 40th International
    Conference on Machine Learning</i>, Honolulu, Hawaii, United States, 2023, vol.
    202, pp. 43105–43128.
  ista: Zhu Z, Liu F, Chrysos GG, Locatello F, Cevher V. 2023. Benign overfitting
    in deep neural networks under lazy training. Proceedings of the 40th International
    Conference on Machine Learning. International Conference on Machine Learning,
    PMLR, vol. 202, 43105–43128.
  mla: Zhu, Zhenyu, et al. “Benign Overfitting in Deep Neural Networks under Lazy
    Training.” <i>Proceedings of the 40th International Conference on Machine Learning</i>,
    vol. 202, ML Research Press, 2023, pp. 43105–28.
  short: Z. Zhu, F. Liu, G.G. Chrysos, F. Locatello, V. Cevher, in:, Proceedings of
    the 40th International Conference on Machine Learning, ML Research Press, 2023,
    pp. 43105–43128.
conference:
  end_date: 2023-07-29
  location: Honolulu, Hawaii, United States
  name: International Conference on Machine Learning
  start_date: 2023-07-23
date_created: 2023-08-22T14:18:18Z
date_published: 2023-05-30T00:00:00Z
date_updated: 2023-09-13T08:46:46Z
day: '30'
department:
- _id: FrLo
extern: '1'
external_id:
  arxiv:
  - '2305.19377'
intvolume: '202'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2305.19377
month: '05'
oa: 1
oa_version: Preprint
page: 43105-43128
publication: Proceedings of the 40th International Conference on Machine Learning
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
status: public
title: Benign overfitting in deep neural networks under lazy training
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '14209'
abstract:
- lang: eng
  text: Diffusion models excel at generating photorealistic images from text-queries.
    Naturally, many approaches have been proposed to use these generative abilities
    to augment training datasets for downstream tasks, such as classification. However,
    diffusion models are themselves trained on large, noisily supervised but nonetheless
    annotated datasets. It is an open question whether the generalization capabilities
    of diffusion models beyond using the additional data of the pre-training process
    for augmentation lead to improved downstream performance. We perform a systematic
    evaluation of existing methods to generate images from diffusion models and study
    new extensions to assess their benefit for data augmentation. While we find that
    personalizing diffusion models towards the target data outperforms simpler prompting
    strategies, we also show that using the training data of the diffusion model alone,
    via a simple nearest neighbor retrieval procedure, leads to even stronger downstream
    performance. Overall, our study probes the limitations of diffusion models for
    data augmentation but also highlights their potential in generating new training
    data to improve performance on simple downstream vision tasks.
article_number: '2304.10253'
article_processing_charge: No
arxiv: 1
author:
- first_name: Max F.
  full_name: Burg, Max F.
  last_name: Burg
- first_name: Florian
  full_name: Wenzel, Florian
  last_name: Wenzel
- first_name: Dominik
  full_name: Zietlow, Dominik
  last_name: Zietlow
- first_name: Max
  full_name: Horn, Max
  last_name: Horn
- first_name: Osama
  full_name: Makansi, Osama
  last_name: Makansi
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
- first_name: Chris
  full_name: Russell, Chris
  last_name: Russell
citation:
  ama: Burg MF, Wenzel F, Zietlow D, et al. A data augmentation perspective on diffusion
    models and retrieval. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2304.10253">10.48550/arXiv.2304.10253</a>
  apa: Burg, M. F., Wenzel, F., Zietlow, D., Horn, M., Makansi, O., Locatello, F.,
    &#38; Russell, C. (n.d.). A data augmentation perspective on diffusion models
    and retrieval. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2304.10253">https://doi.org/10.48550/arXiv.2304.10253</a>
  chicago: Burg, Max F., Florian Wenzel, Dominik Zietlow, Max Horn, Osama Makansi,
    Francesco Locatello, and Chris Russell. “A Data Augmentation Perspective on Diffusion
    Models and Retrieval.” <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2304.10253">https://doi.org/10.48550/arXiv.2304.10253</a>.
  ieee: M. F. Burg <i>et al.</i>, “A data augmentation perspective on diffusion models
    and retrieval,” <i>arXiv</i>.
  ista: Burg MF, Wenzel F, Zietlow D, Horn M, Makansi O, Locatello F, Russell C. A
    data augmentation perspective on diffusion models and retrieval. arXiv, 2304.10253.
  mla: Burg, Max F., et al. “A Data Augmentation Perspective on Diffusion Models and
    Retrieval.” <i>ArXiv</i>, 2304.10253, doi:<a href="https://doi.org/10.48550/arXiv.2304.10253">10.48550/arXiv.2304.10253</a>.
  short: M.F. Burg, F. Wenzel, D. Zietlow, M. Horn, O. Makansi, F. Locatello, C. Russell,
    ArXiv (n.d.).
date_created: 2023-08-22T14:18:43Z
date_published: 2023-04-20T00:00:00Z
date_updated: 2023-09-13T08:51:56Z
day: '20'
department:
- _id: FrLo
doi: 10.48550/arXiv.2304.10253
extern: '1'
external_id:
  arxiv:
  - '2304.10253'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2304.10253
month: '04'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: A data augmentation perspective on diffusion models and retrieval
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14210'
abstract:
- lang: eng
  text: Recovering the latent factors of variation of high dimensional data has so
    far focused on simple synthetic settings. Mostly building on unsupervised and
    weakly-supervised objectives, prior work missed out on the positive implications
    for representation learning on real world data. In this work, we propose to leverage
    knowledge extracted from a diversified set of supervised tasks to learn a common
    disentangled representation. Assuming each supervised task only depends on an
    unknown subset of the factors of variation, we disentangle the feature space of
    a supervised multi-task model, with features activating sparsely across different
    tasks and information being shared as appropriate. Importantly, we never directly
    observe the factors of variation but establish that access to multiple tasks
    is sufficient for identifiability under sufficiency and minimality assumptions.
    We validate our approach on six real world distribution shift benchmarks, and
    different data modalities (images, text), demonstrating how disentangled representations
    can be transferred to real settings.
article_number: '2304.07939'
article_processing_charge: No
arxiv: 1
author:
- first_name: Marco
  full_name: Fumero, Marco
  last_name: Fumero
- first_name: Florian
  full_name: Wenzel, Florian
  last_name: Wenzel
- first_name: Luca
  full_name: Zancato, Luca
  last_name: Zancato
- first_name: Alessandro
  full_name: Achille, Alessandro
  last_name: Achille
- first_name: Emanuele
  full_name: Rodolà, Emanuele
  last_name: Rodolà
- first_name: Stefano
  full_name: Soatto, Stefano
  last_name: Soatto
- first_name: Bernhard
  full_name: Schölkopf, Bernhard
  last_name: Schölkopf
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: Fumero M, Wenzel F, Zancato L, et al. Leveraging sparse and shared feature
    activations for disentangled representation learning. <i>arXiv</i>. doi:<a href="https://doi.org/10.48550/arXiv.2304.07939">10.48550/arXiv.2304.07939</a>
  apa: Fumero, M., Wenzel, F., Zancato, L., Achille, A., Rodolà, E., Soatto, S., …
    Locatello, F. (n.d.). Leveraging sparse and shared feature activations for disentangled
    representation learning. <i>arXiv</i>. <a href="https://doi.org/10.48550/arXiv.2304.07939">https://doi.org/10.48550/arXiv.2304.07939</a>
  chicago: Fumero, Marco, Florian Wenzel, Luca Zancato, Alessandro Achille, Emanuele
    Rodolà, Stefano Soatto, Bernhard Schölkopf, and Francesco Locatello. “Leveraging
    Sparse and Shared Feature Activations for Disentangled Representation Learning.”
    <i>ArXiv</i>, n.d. <a href="https://doi.org/10.48550/arXiv.2304.07939">https://doi.org/10.48550/arXiv.2304.07939</a>.
  ieee: M. Fumero <i>et al.</i>, “Leveraging sparse and shared feature activations
    for disentangled representation learning,” <i>arXiv</i>.
  ista: Fumero M, Wenzel F, Zancato L, Achille A, Rodolà E, Soatto S, Schölkopf B,
    Locatello F. Leveraging sparse and shared feature activations for disentangled
    representation learning. arXiv, 2304.07939.
  mla: Fumero, Marco, et al. “Leveraging Sparse and Shared Feature Activations for
    Disentangled Representation Learning.” <i>ArXiv</i>, 2304.07939, doi:<a href="https://doi.org/10.48550/arXiv.2304.07939">10.48550/arXiv.2304.07939</a>.
  short: M. Fumero, F. Wenzel, L. Zancato, A. Achille, E. Rodolà, S. Soatto, B. Schölkopf,
    F. Locatello, ArXiv (n.d.).
date_created: 2023-08-22T14:19:03Z
date_published: 2023-04-17T00:00:00Z
date_updated: 2024-02-12T09:55:48Z
day: '17'
department:
- _id: FrLo
doi: 10.48550/arXiv.2304.07939
external_id:
  arxiv:
  - '2304.07939'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2304.07939
month: '04'
oa: 1
oa_version: Preprint
publication: arXiv
publication_status: submitted
status: public
title: Leveraging sparse and shared feature activations for disentangled representation
  learning
type: preprint
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14211'
abstract:
- lang: eng
  text: 'Causal discovery methods are intrinsically constrained by the set of assumptions
    needed to ensure structure identifiability. Moreover, additional restrictions are
    often imposed in order to simplify the inference task: this is the case for the
    Gaussian noise assumption on additive non-linear models, which is common to many
    causal discovery approaches. In this paper we show the shortcomings of inference
    under this hypothesis, analyzing the risk of edge inversion under violation of
    Gaussianity of the noise terms. Then, we propose a novel method for inferring
    the topological ordering of the variables in the causal graph, from data generated
    according to an additive non-linear model with a generic noise distribution. This
    leads to NoGAM (Not only Gaussian Additive noise Models), a causal discovery algorithm
    with a minimal set of assumptions and state-of-the-art performance, experimentally
    benchmarked on synthetic data.'
article_processing_charge: No
arxiv: 1
author:
- first_name: Francesco
  full_name: Montagna, Francesco
  last_name: Montagna
- first_name: Nicoletta
  full_name: Noceti, Nicoletta
  last_name: Noceti
- first_name: Lorenzo
  full_name: Rosasco, Lorenzo
  last_name: Rosasco
- first_name: Kun
  full_name: Zhang, Kun
  last_name: Zhang
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: 'Montagna F, Noceti N, Rosasco L, Zhang K, Locatello F. Causal discovery with
    score matching on additive models with arbitrary noise. In: <i>2nd Conference
    on Causal Learning and Reasoning</i>. ; 2023.'
  apa: Montagna, F., Noceti, N., Rosasco, L., Zhang, K., &#38; Locatello, F. (2023).
    Causal discovery with score matching on additive models with arbitrary noise.
    In <i>2nd Conference on Causal Learning and Reasoning</i>. Tübingen, Germany.
  chicago: Montagna, Francesco, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, and
    Francesco Locatello. “Causal Discovery with Score Matching on Additive Models
    with Arbitrary Noise.” In <i>2nd Conference on Causal Learning and Reasoning</i>,
    2023.
  ieee: F. Montagna, N. Noceti, L. Rosasco, K. Zhang, and F. Locatello, “Causal discovery
    with score matching on additive models with arbitrary noise,” in <i>2nd Conference
    on Causal Learning and Reasoning</i>, Tübingen, Germany, 2023.
  ista: 'Montagna F, Noceti N, Rosasco L, Zhang K, Locatello F. 2023. Causal discovery
    with score matching on additive models with arbitrary noise. 2nd Conference on
    Causal Learning and Reasoning. CLeaR: Conference on Causal Learning and Reasoning.'
  mla: Montagna, Francesco, et al. “Causal Discovery with Score Matching on Additive
    Models with Arbitrary Noise.” <i>2nd Conference on Causal Learning and Reasoning</i>,
    2023.
  short: F. Montagna, N. Noceti, L. Rosasco, K. Zhang, F. Locatello, in:, 2nd Conference
    on Causal Learning and Reasoning, 2023.
conference:
  end_date: 2023-04-14
  location: Tübingen, Germany
  name: 'CLeaR: Conference on Causal Learning and Reasoning'
  start_date: 2023-04-11
date_created: 2023-08-22T14:19:21Z
date_published: 2023-04-01T00:00:00Z
date_updated: 2023-09-13T09:00:31Z
day: '01'
department:
- _id: FrLo
extern: '1'
external_id:
  arxiv:
  - '2304.03265'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2304.03265
month: '04'
oa: 1
oa_version: Preprint
publication: 2nd Conference on Causal Learning and Reasoning
publication_status: published
quality_controlled: '1'
scopus_import: '1'
status: public
title: Causal discovery with score matching on additive models with arbitrary noise
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14212'
abstract:
- lang: eng
  text: This paper demonstrates how to discover the whole causal graph from the second
    derivative of the log-likelihood in observational non-linear additive Gaussian
    noise models. Leveraging scalable machine learning approaches to approximate the
    score function ∇logp(X), we extend the work of Rolland et al. (2022) that only
    recovers the topological order from the score and requires an expensive pruning
    step removing spurious edges among those admitted by the ordering. Our analysis
    leads to DAS (acronym for Discovery At Scale), a practical algorithm that reduces
    the complexity of the pruning by a factor proportional to the graph size. In practice,
    DAS achieves accuracy competitive with the current state of the art while being over
    an order of magnitude faster. Overall, our approach enables principled and scalable
    causal discovery, significantly lowering the compute bar.
article_processing_charge: No
arxiv: 1
author:
- first_name: Francesco
  full_name: Montagna, Francesco
  last_name: Montagna
- first_name: Nicoletta
  full_name: Noceti, Nicoletta
  last_name: Noceti
- first_name: Lorenzo
  full_name: Rosasco, Lorenzo
  last_name: Rosasco
- first_name: Kun
  full_name: Zhang, Kun
  last_name: Zhang
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: 'Montagna F, Noceti N, Rosasco L, Zhang K, Locatello F. Scalable causal discovery
    with score matching. In: <i>2nd Conference on Causal Learning and Reasoning</i>.
    ; 2023.'
  apa: Montagna, F., Noceti, N., Rosasco, L., Zhang, K., &#38; Locatello, F. (2023).
    Scalable causal discovery with score matching. In <i>2nd Conference on Causal
    Learning and Reasoning</i>. Tübingen, Germany.
  chicago: Montagna, Francesco, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, and
    Francesco Locatello. “Scalable Causal Discovery with Score Matching.” In <i>2nd
    Conference on Causal Learning and Reasoning</i>, 2023.
  ieee: F. Montagna, N. Noceti, L. Rosasco, K. Zhang, and F. Locatello, “Scalable
    causal discovery with score matching,” in <i>2nd Conference on Causal Learning
    and Reasoning</i>, Tübingen, Germany, 2023.
  ista: 'Montagna F, Noceti N, Rosasco L, Zhang K, Locatello F. 2023. Scalable causal
    discovery with score matching. 2nd Conference on Causal Learning and Reasoning.
    CLeaR: Conference on Causal Learning and Reasoning.'
  mla: Montagna, Francesco, et al. “Scalable Causal Discovery with Score Matching.”
    <i>2nd Conference on Causal Learning and Reasoning</i>, 2023.
  short: F. Montagna, N. Noceti, L. Rosasco, K. Zhang, F. Locatello, in:, 2nd Conference
    on Causal Learning and Reasoning, 2023.
conference:
  end_date: 2023-04-14
  location: Tübingen, Germany
  name: 'CLeaR: Conference on Causal Learning and Reasoning'
  start_date: 2023-04-11
date_created: 2023-08-22T14:19:40Z
date_published: 2023-04-01T00:00:00Z
date_updated: 2023-09-13T09:03:24Z
day: '01'
department:
- _id: FrLo
extern: '1'
external_id:
  arxiv:
  - '2304.03382'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2304.03382
month: '04'
oa: 1
oa_version: Preprint
publication: 2nd Conference on Causal Learning and Reasoning
publication_status: published
quality_controlled: '1'
scopus_import: '1'
status: public
title: Scalable causal discovery with score matching
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14214'
abstract:
- lang: eng
  text: 'Recent years have seen a surge of interest in learning high-level causal
    representations from low-level image pairs under interventions. Yet, existing
    efforts are largely limited to simple synthetic settings that are far removed from
    real-world problems. In this paper, we present Causal Triplet, a causal representation
    learning benchmark featuring not only visually more complex scenes, but also two
    crucial desiderata commonly overlooked in previous works: (i) an actionable counterfactual
    setting, where only certain object-level variables allow for counterfactual observations
    whereas others do not; (ii) an interventional downstream task with an emphasis
    on out-of-distribution robustness from the independent causal mechanisms principle.
    Through extensive experiments, we find that models built with the knowledge of
    disentangled or object-centric representations significantly outperform their
    distributed counterparts. However, recent causal representation learning methods
    still struggle to identify such latent structures, indicating substantial challenges
    and opportunities for future work.'
article_processing_charge: No
arxiv: 1
author:
- first_name: Yuejiang
  full_name: Liu, Yuejiang
  last_name: Liu
- first_name: Alexandre
  full_name: Alahi, Alexandre
  last_name: Alahi
- first_name: Chris
  full_name: Russell, Chris
  last_name: Russell
- first_name: Max
  full_name: Horn, Max
  last_name: Horn
- first_name: Dominik
  full_name: Zietlow, Dominik
  last_name: Zietlow
- first_name: Bernhard
  full_name: Schölkopf, Bernhard
  last_name: Schölkopf
- first_name: Francesco
  full_name: Locatello, Francesco
  id: 26cfd52f-2483-11ee-8040-88983bcc06d4
  last_name: Locatello
  orcid: 0000-0002-4850-0683
citation:
  ama: 'Liu Y, Alahi A, Russell C, et al. Causal triplet: An open challenge for intervention-centric
    causal representation learning. In: <i>2nd Conference on Causal Learning and Reasoning</i>.
    ; 2023.'
  apa: 'Liu, Y., Alahi, A., Russell, C., Horn, M., Zietlow, D., Schölkopf, B., &#38;
    Locatello, F. (2023). Causal triplet: An open challenge for intervention-centric
    causal representation learning. In <i>2nd Conference on Causal Learning and Reasoning</i>.
    Tübingen, Germany.'
  chicago: 'Liu, Yuejiang, Alexandre Alahi, Chris Russell, Max Horn, Dominik Zietlow,
    Bernhard Schölkopf, and Francesco Locatello. “Causal Triplet: An Open Challenge
    for Intervention-Centric Causal Representation Learning.” In <i>2nd Conference
    on Causal Learning and Reasoning</i>, 2023.'
  ieee: 'Y. Liu <i>et al.</i>, “Causal triplet: An open challenge for intervention-centric
    causal representation learning,” in <i>2nd Conference on Causal Learning and Reasoning</i>,
    Tübingen, Germany, 2023.'
  ista: 'Liu Y, Alahi A, Russell C, Horn M, Zietlow D, Schölkopf B, Locatello F. 2023.
    Causal triplet: An open challenge for intervention-centric causal representation
    learning. 2nd Conference on Causal Learning and Reasoning. CLeaR: Conference on
    Causal Learning and Reasoning.'
  mla: 'Liu, Yuejiang, et al. “Causal Triplet: An Open Challenge for Intervention-Centric
    Causal Representation Learning.” <i>2nd Conference on Causal Learning and Reasoning</i>,
    2023.'
  short: Y. Liu, A. Alahi, C. Russell, M. Horn, D. Zietlow, B. Schölkopf, F. Locatello,
    in:, 2nd Conference on Causal Learning and Reasoning, 2023.
conference:
  end_date: 2023-04-14
  location: Tübingen, Germany
  name: 'CLeaR: Conference on Causal Learning and Reasoning'
  start_date: 2023-04-11
date_created: 2023-08-22T14:20:18Z
date_published: 2023-04-12T00:00:00Z
date_updated: 2023-09-13T09:23:08Z
day: '12'
department:
- _id: FrLo
extern: '1'
external_id:
  arxiv:
  - '2301.05169'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://doi.org/10.48550/arXiv.2301.05169
month: '04'
oa: 1
oa_version: Preprint
publication: 2nd Conference on Causal Learning and Reasoning
publication_status: published
quality_controlled: '1'
status: public
title: 'Causal triplet: An open challenge for intervention-centric causal representation
  learning'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
