Lecture2Go maintenance on 30 September 2021

Due to urgent maintenance work, Lecture2Go will unfortunately be available only to a limited extent on 30 September 2021 between 9:00 and 21:00. Temporary outages are to be expected, and logins and uploads will not be possible during the maintenance. We will try to keep the actual downtime in this period as short as possible and apologize for the interruption.

Your Lecture2Go team at the RRZ



Jointly Representing Images and Text: Dependency Graphs, Word Senses, and Multimodal Embeddings

Informatisches Kolloquium

In this presentation, I will argue that we can make progress on language/vision tasks if we represent images in structured ways, rather than just labeling objects, actions, or attributes. In particular, deploying structured representations from natural language processing is fruitful: I will discuss how visual dependency representations (VDRs), which borrow ideas from dependency parsing, can be used to capture how the objects in a scene interact with each other. VDRs are useful for tasks such as image retrieval and image description. Secondly, I will argue that much more fine-grained representations of actions are needed for most language/vision tasks. Again, ideas from NLP can be leveraged: I will introduce algorithms that use multimodal embeddings to perform verb sense disambiguation in a visual context.
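The two ideas in the abstract can be illustrated with a toy sketch: a VDR represented as head-relation-dependent triples over image regions (analogous to a dependency parse), and verb sense disambiguation that picks the sense whose embedding is closest to the image embedding. All structures, relation names, and embedding values below are invented for illustration; real systems would learn embeddings from multimodal data.

```python
import math

# Toy visual dependency representation (VDR): spatial/functional relations
# between image regions, analogous to head->dependent arcs in a parse.
# (Hypothetical regions and relation labels, not from the talk itself.)
vdr = [
    ("person", "holds", "racket"),  # (head region, relation, dependent region)
    ("person", "on", "court"),
]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def disambiguate(image_embedding, sense_embeddings):
    """Return the verb sense whose embedding is most similar to the image."""
    return max(sense_embeddings,
               key=lambda s: cosine(image_embedding, sense_embeddings[s]))

# Hypothetical sense and image embeddings in a shared multimodal space.
senses = {
    "play (sport)":      [0.9, 0.1, 0.0],
    "play (instrument)": [0.1, 0.9, 0.0],
}
image = [0.8, 0.2, 0.1]
print(disambiguate(image, senses))  # -> play (sport)
```

The visual context (the image embedding) resolves the ambiguity of "play": the sport sense scores higher here because its embedding points in nearly the same direction as the image's.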

This video may be embedded in other websites. Copy the embedding code and paste it at the desired location in the HTML of your web page. Please always credit the source and link to Lecture2Go!
