{"acknowledgement":"We would like to thank Abby Schantz, Abe Ittycheriah, Aliaksei Severyn, Allan Heydon, Aly Grealish, Andrey Vlasov, Arkaitz Zubiaga, Ashwin Kakarla, Chen Sun, Clayton Williams, Cong Yu, Cordelia Schmid, Da-Cheng Juan, Dan Finnie, Dani Valevski, Daniel Rocha, David Price, David Sklar, Devi Krishna, Elena Kochkina, Enrique Alfonseca, Françoise Beaufays, Isabelle Augenstein, Jialu Liu, John Cantwell, John Palowitch, Jordan Boyd-Graber, Lei Shi, Luis Valente, Maria Voitovich, Mehmet Aktuna, Mogan Brown, Mor Naaman, Natalia P, Nidhi Hebbar, Pete Aykroyd, Rahul Sukthankar, Richa Dixit, Steve Pucci, Tania Bedrax-Weiss, Tobias Kaufmann, Tom Boulos, Tu Tsao, Vladimir Chtchetkine, Yair Kurzion, Yifan Xu and Zach Hynes.","file_date_updated":"2021-11-29T08:41:00Z","ddc":["000"],"year":"2021","citation":{"apa":"Ilharco, C., Shirazi, A., Gopalan, A., Nagrani, A., Bratanič, B., Bregler, C., … Imbrasaite, V. (2021). Recognizing multimodal entailment. In 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts (pp. 29–30). Bangkok, Thailand: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-tutorials.6","ista":"Ilharco C, Shirazi A, Gopalan A, Nagrani A, Bratanič B, Bregler C, Liu C, Ferreira F, Barcik G, Ilharco G, Osang GF, Bulian J, Frank J, Smaira L, Cao Q, Marino R, Patel R, Leung T, Imbrasaite V. 2021. Recognizing multimodal entailment. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts. ACL: Association for Computational Linguistics ; IJCNLP: International Joint Conference on Natural Language Processing, 29–30.","chicago":"Ilharco, Cesar, Afsaneh Shirazi, Arjun Gopalan, Arsha Nagrani, Blaž Bratanič, Chris Bregler, Christina Liu, et al. 
“Recognizing Multimodal Entailment.” In 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, 29–30. Association for Computational Linguistics, 2021. https://doi.org/10.18653/v1/2021.acl-tutorials.6.","ama":"Ilharco C, Shirazi A, Gopalan A, et al. Recognizing multimodal entailment. In: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts. Association for Computational Linguistics; 2021:29-30. doi:10.18653/v1/2021.acl-tutorials.6","short":"C. Ilharco, A. Shirazi, A. Gopalan, A. Nagrani, B. Bratanič, C. Bregler, C. Liu, F. Ferreira, G. Barcik, G. Ilharco, G.F. Osang, J. Bulian, J. Frank, L. Smaira, Q. Cao, R. Marino, R. Patel, T. Leung, V. Imbrasaite, in:, 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, Association for Computational Linguistics, 2021, pp. 29–30.","mla":"Ilharco, Cesar, et al. “Recognizing Multimodal Entailment.” 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, Association for Computational Linguistics, 2021, pp. 29–30, doi:10.18653/v1/2021.acl-tutorials.6.","ieee":"C. Ilharco et al., “Recognizing multimodal entailment,” in 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, Bangkok, Thailand, 2021, pp. 
29–30."},"conference":{"location":"Bangkok, Thailand","end_date":"2021-08-06","name":"ACL: Association for Computational Linguistics ; IJCNLP: International Joint Conference on Natural Language Processing","start_date":"2021-08-01"},"language":[{"iso":"eng"}],"_id":"10367","doi":"10.18653/v1/2021.acl-tutorials.6","page":"29-30","has_accepted_license":"1","publication":"59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts","author":[{"full_name":"Ilharco, Cesar","last_name":"Ilharco","first_name":"Cesar"},{"full_name":"Shirazi, Afsaneh","first_name":"Afsaneh","last_name":"Shirazi"},{"first_name":"Arjun","last_name":"Gopalan","full_name":"Gopalan, Arjun"},{"last_name":"Nagrani","first_name":"Arsha","full_name":"Nagrani, Arsha"},{"last_name":"Bratanič","first_name":"Blaž","full_name":"Bratanič, Blaž"},{"full_name":"Bregler, Chris","first_name":"Chris","last_name":"Bregler"},{"last_name":"Liu","first_name":"Christina","full_name":"Liu, Christina"},{"last_name":"Ferreira","first_name":"Felipe","full_name":"Ferreira, Felipe"},{"full_name":"Barcik, Gabriek","last_name":"Barcik","first_name":"Gabriek"},{"full_name":"Ilharco, Gabriel","last_name":"Ilharco","first_name":"Gabriel"},{"first_name":"Georg F","last_name":"Osang","id":"464B40D6-F248-11E8-B48F-1D18A9856A87","full_name":"Osang, Georg F"},{"full_name":"Bulian, Jannis","first_name":"Jannis","last_name":"Bulian"},{"full_name":"Frank, Jared","first_name":"Jared","last_name":"Frank"},{"first_name":"Lucas","last_name":"Smaira","full_name":"Smaira, Lucas"},{"first_name":"Qin","last_name":"Cao","full_name":"Cao, Qin"},{"full_name":"Marino, Ricardo","last_name":"Marino","first_name":"Ricardo"},{"full_name":"Patel, Roma","last_name":"Patel","first_name":"Roma"},{"full_name":"Leung, Thomas","last_name":"Leung","first_name":"Thomas"},{"full_name":"Imbrasaite, 
Vaiva","first_name":"Vaiva","last_name":"Imbrasaite"}],"publication_identifier":{"isbn":["9-781-9540-8557-2"]},"status":"public","quality_controlled":"1","title":"Recognizing multimodal entailment","date_created":"2021-11-28T23:01:30Z","user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","oa":1,"oa_version":"Published Version","scopus_import":"1","publisher":"Association for Computational Linguistics","publication_status":"published","date_published":"2021-08-01T00:00:00Z","abstract":[{"text":"How information is created, shared and consumed has changed rapidly in recent decades, in part thanks to new social platforms and technologies on the web. With ever-larger amounts of unstructured data and limited labels, organizing and reconciling information from different sources and modalities is a central challenge in machine learning. This cutting-edge tutorial aims to introduce the multimodal entailment task, which can be useful for detecting semantic alignments when a single modality alone does not suffice for a whole content understanding. Starting with a brief overview of natural language processing, computer vision, structured data and neural graph learning, we lay the foundations for the multimodal sections to follow. We then discuss recent multimodal learning literature covering visual, audio and language streams, and explore case studies focusing on tasks which require fine-grained understanding of visual and linguistic semantics, such as question answering, veracity and hatred classification. Finally, we introduce a new dataset for recognizing multimodal entailment, exploring it in a hands-on collaborative section. 
Overall, this tutorial gives an overview of multimodal learning, introduces a multimodal entailment dataset, and encourages future research in the topic.","lang":"eng"}],"main_file_link":[{"open_access":"1","url":"https://aclanthology.org/2021.acl-tutorials.6/"}],"department":[{"_id":"HeEd"}],"day":"01","file":[{"relation":"main_file","checksum":"b14052a025a6ecf675bdfe51db98c0d7","file_name":"2021_ACL_Ilharco.pdf","access_level":"open_access","creator":"cchlebak","content_type":"application/pdf","date_updated":"2021-11-29T08:41:00Z","date_created":"2021-11-29T08:41:00Z","file_size":1227703,"file_id":"10368","success":1}],"type":"conference","tmp":{"image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)","legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode"},"date_updated":"2022-01-26T14:26:36Z","article_processing_charge":"No","month":"08"}