{"date_published":"2023-11-01T00:00:00Z","citation":{"ama":"Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent space translation via semantic alignment. arXiv. doi:10.48550/arXiv.2311.00664","ista":"Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent space translation via semantic alignment. arXiv, 2311.00664.","apa":"Maiorca, V., Moschella, L., Norelli, A., Fumero, M., Locatello, F., & Rodolà, E. (2023). Latent space translation via semantic alignment. arXiv. https://doi.org/10.48550/arXiv.2311.00664","chicago":"Maiorca, Valentino, Luca Moschella, Antonio Norelli, Marco Fumero, Francesco Locatello, and Emanuele Rodolà. “Latent Space Translation via Semantic Alignment.” ArXiv, 2023. https://doi.org/10.48550/arXiv.2311.00664.","short":"V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, E. Rodolà, ArXiv (2023).","mla":"Maiorca, Valentino, et al. “Latent Space Translation via Semantic Alignment.” ArXiv, 2311.00664, doi:10.48550/arXiv.2311.00664.","ieee":"V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, and E. Rodolà, “Latent space translation via semantic alignment,” arXiv, 2023."},"publication_status":"submitted","abstract":[{"lang":"eng","text":"While different neural models often exhibit latent spaces that are alike when exposed to semantically related data, this intrinsic similarity is not always immediately discernible. Towards a better understanding of this phenomenon, our work shows how representations learned from these neural modules can be translated between different pre-trained networks via simpler transformations than previously thought. An advantage of this approach is the ability to estimate these transformations using standard, well-understood algebraic procedures that have closed-form solutions. Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training. We extensively validate the adaptability of this translation procedure in different experimental settings: across various trainings, domains, architectures (e.g., ResNet, CNN, ViT), and in multiple downstream tasks (classification, reconstruction). Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting."}],"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2311.00664"}],"article_number":"2311.00664","_id":"14952","department":[{"_id":"FrLo"}],"language":[{"iso":"eng"}],"author":[{"full_name":"Maiorca, Valentino","last_name":"Maiorca","first_name":"Valentino"},{"full_name":"Moschella, Luca","last_name":"Moschella","first_name":"Luca"},{"first_name":"Antonio","last_name":"Norelli","full_name":"Norelli, Antonio"},{"full_name":"Fumero, Marco","first_name":"Marco","last_name":"Fumero"},{"id":"26cfd52f-2483-11ee-8040-88983bcc06d4","full_name":"Locatello, Francesco","last_name":"Locatello","first_name":"Francesco","orcid":"0000-0002-4850-0683"},{"last_name":"Rodolà","first_name":"Emanuele","full_name":"Rodolà, Emanuele"}],"day":"01","publication":"arXiv","doi":"10.48550/arXiv.2311.00664","type":"preprint","date_updated":"2024-02-12T09:40:23Z","article_processing_charge":"No","month":"11","external_id":{"arxiv":["2311.00664"]},"status":"public","acknowledgement":"This work is supported by the ERC grant no.802554 (SPECGEO), PRIN 2020 project no.2020TA3K9N (LEGO.AI), and PNRR MUR project PE0000013-FAIR. Francesco Locatello did not contribute to this work at Amazon.","title":"Latent space translation via semantic alignment","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","date_created":"2024-02-07T15:08:55Z","oa_version":"Preprint","oa":1,"year":"2023"}