{"date_created":"2024-02-07T14:47:04Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","oa_version":"Preprint","oa":1,"year":"2023","status":"public","acknowledgement":"This work was supported by UKRI (grant agreement no. EP/S023356/1), in the UKRI\r\nCentre for Doctoral Training in Safe and Trusted AI via A. Kori.","title":"Grounded object centric learning","day":"18","publication":"arXiv","author":[{"full_name":"Kori, Avinash","first_name":"Avinash","last_name":"Kori"},{"id":"26cfd52f-2483-11ee-8040-88983bcc06d4","full_name":"Locatello, Francesco","first_name":"Francesco","last_name":"Locatello","orcid":"0000-0002-4850-0683"},{"first_name":"Fabio De Sousa","last_name":"Ribeiro","full_name":"Ribeiro, Fabio De Sousa"},{"first_name":"Francesca","last_name":"Toni","full_name":"Toni, Francesca"},{"full_name":"Glocker, Ben","last_name":"Glocker","first_name":"Ben"}],"doi":"10.48550/arXiv.2307.09437","type":"preprint","date_updated":"2024-02-12T08:13:12Z","article_processing_charge":"No","month":"07","external_id":{"arxiv":["2307.09437"]},"citation":{"ieee":"A. Kori, F. Locatello, F. D. S. Ribeiro, F. Toni, and B. Glocker, “Grounded object centric learning,” arXiv. .","mla":"Kori, Avinash, et al. “Grounded Object Centric Learning.” ArXiv, 2307.09437, doi:10.48550/arXiv.2307.09437.","short":"A. Kori, F. Locatello, F.D.S. Ribeiro, F. Toni, B. Glocker, ArXiv (n.d.).","ista":"Kori A, Locatello F, Ribeiro FDS, Toni F, Glocker B. Grounded object centric learning. arXiv, 2307.09437.","chicago":"Kori, Avinash, Francesco Locatello, Fabio De Sousa Ribeiro, Francesca Toni, and Ben Glocker. “Grounded Object Centric Learning.” ArXiv, n.d. https://doi.org/10.48550/arXiv.2307.09437.","apa":"Kori, A., Locatello, F., Ribeiro, F. D. S., Toni, F., & Glocker, B. (n.d.). Grounded object centric learning. arXiv. https://doi.org/10.48550/arXiv.2307.09437","ama":"Kori A, Locatello F, Ribeiro FDS, Toni F, Glocker B. Grounded object centric learning. arXiv. doi:10.48550/arXiv.2307.09437"},"date_published":"2023-07-18T00:00:00Z","publication_status":"submitted","abstract":[{"text":"The extraction of modular object-centric representations for downstream tasks\r\nis an emerging area of research. Learning grounded representations of objects\r\nthat are guaranteed to be stable and invariant promises robust performance\r\nacross different tasks and environments. Slot Attention (SA) learns\r\nobject-centric representations by assigning objects to \\textit{slots}, but\r\npresupposes a \\textit{single} distribution from which all slots are randomly\r\ninitialised. This results in an inability to learn \\textit{specialized} slots\r\nwhich bind to specific object types and remain invariant to identity-preserving\r\nchanges in object appearance. To address this, we present\r\n\\emph{\\textsc{Co}nditional \\textsc{S}lot \\textsc{A}ttention} (\\textsc{CoSA})\r\nusing a novel concept of \\emph{Grounded Slot Dictionary} (GSD) inspired by\r\nvector quantization. Our proposed GSD comprises (i) canonical object-level\r\nproperty vectors and (ii) parametric Gaussian distributions, which define a\r\nprior over the slots. We demonstrate the benefits of our method in multiple\r\ndownstream tasks such as scene generation, composition, and task adaptation,\r\nwhilst remaining competitive with SA in popular object discovery benchmarks.","lang":"eng"}],"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2307.09437"}],"_id":"14948","article_number":"2307.09437","language":[{"iso":"eng"}],"department":[{"_id":"FrLo"}]}