{"abstract":[{"text":"Deep learning is best known for its empirical success across a wide range of applications\r\nspanning computer vision, natural language processing and speech. Of equal significance,\r\nthough perhaps less known, are its ramifications for learning theory: deep networks have\r\nbeen observed to perform surprisingly well in the high-capacity regime, aka the overfitting\r\nor underspecified regime. Classically, this regime on the far right of the bias-variance curve\r\nis associated with poor generalisation; however, recent experiments with deep networks\r\nchallenge this view.\r\n\r\nThis thesis is devoted to investigating various aspects of underspecification in deep learning.\r\nFirst, we argue that deep learning models are underspecified on two levels: a) any given\r\ntraining dataset can be fit by many different functions, and b) any given function can be\r\nexpressed by many different parameter configurations. We refer to the second kind of\r\nunderspecification as parameterisation redundancy and we precisely characterise its extent.\r\nSecond, we characterise the implicit criteria (the inductive bias) that guide learning in the\r\nunderspecified regime. Specifically, we consider a nonlinear but tractable classification\r\nsetting, and show that given the choice, neural networks learn classifiers with a large margin.\r\nThird, we consider learning scenarios where the inductive bias is not by itself sufficient to\r\ndeal with underspecification. We then study different ways of ‘tightening the specification’: i)\r\nIn the setting of representation learning with variational autoencoders, we propose a hand-\r\ncrafted regulariser based on mutual information. ii) In the setting of binary classification, we\r\nconsider soft-label (real-valued) supervision. We derive a generalisation bound for linear\r\nnetworks supervised in this way and verify that soft labels facilitate fast learning. Finally, we\r\nexplore an application of soft-label supervision to the training of multi-exit models.","lang":"eng"}],"department":[{"_id":"GradSch"},{"_id":"ChLa"}],"publication_status":"published","date_published":"2021-05-30T00:00:00Z","date_updated":"2023-09-08T11:11:12Z","supervisor":[{"full_name":"Lampert, Christoph","id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","last_name":"Lampert","first_name":"Christoph","orcid":"0000-0001-8622-7887"}],"article_processing_charge":"No","month":"05","degree_awarded":"PhD","day":"30","type":"dissertation","file":[{"content_type":"application/pdf","access_level":"open_access","creator":"bphuong","file_name":"mph-thesis-v519-pdfimages.pdf","checksum":"4f0abe64114cfed264f9d36e8d1197e3","relation":"main_file","success":1,"file_id":"9419","date_created":"2021-05-24T11:22:29Z","file_size":2673905,"date_updated":"2021-05-24T11:22:29Z"},{"relation":"source_file","checksum":"f5699e876bc770a9b0df8345a77720a2","content_type":"application/zip","file_name":"thesis.zip","access_level":"closed","creator":"bphuong","date_updated":"2021-05-24T11:56:02Z","file_id":"9420","file_size":92995100,"date_created":"2021-05-24T11:56:02Z"}],"alternative_title":["ISTA Thesis"],"status":"public","related_material":{"record":[{"relation":"part_of_dissertation","id":"7435","status":"deleted"},{"id":"7481","relation":"part_of_dissertation","status":"public"},{"status":"public","id":"9416","relation":"part_of_dissertation"},{"id":"7479","relation":"part_of_dissertation","status":"public"}]},"title":"Underspecification in deep learning","publisher":"Institute of Science and Technology Austria","date_created":"2021-05-24T13:06:23Z","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","oa_version":"Published Version","oa":1,"_id":"9418","language":[{"iso":"eng"}],"citation":{"mla":"Phuong, Mary. Underspecification in Deep Learning. Institute of Science and Technology Austria, 2021, doi:10.15479/AT:ISTA:9418.","ieee":"M. Phuong, “Underspecification in deep learning,” Institute of Science and Technology Austria, 2021.","ama":"Phuong M. Underspecification in deep learning. 2021. doi:10.15479/AT:ISTA:9418","apa":"Phuong, M. (2021). Underspecification in deep learning. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:9418","chicago":"Phuong, Mary. “Underspecification in Deep Learning.” Institute of Science and Technology Austria, 2021. https://doi.org/10.15479/AT:ISTA:9418.","ista":"Phuong M. 2021. Underspecification in deep learning. Institute of Science and Technology Austria.","short":"M. Phuong, Underspecification in Deep Learning, Institute of Science and Technology Austria, 2021."},"publication_identifier":{"issn":["2663-337X"]},"author":[{"last_name":"Bui Thi Mai","first_name":"Phuong","full_name":"Bui Thi Mai, Phuong","id":"3EC6EE64-F248-11E8-B48F-1D18A9856A87"}],"has_accepted_license":"1","page":"125","doi":"10.15479/AT:ISTA:9418","ddc":["000"],"file_date_updated":"2021-05-24T11:56:02Z","acknowledged_ssus":[{"_id":"ScienComp"},{"_id":"CampIT"},{"_id":"E-Lib"}],"year":"2021"}