Overview

The interaction between language and vision, despite attracting growing attention of late, remains largely unexplored. The topic is particularly relevant to the vision community because humans routinely perform tasks that involve both modalities, largely without even noticing. Every time you ask for an object, ask someone to imagine a scene, or describe what you are seeing, you perform a task that bridges a linguistic and a visual representation. The importance of vision-language interaction can also be seen in the numerous approaches that cross domains, such as the popularity of image grammars. More concretely, we have recently seen renewed interest in one-shot learning for object and event models. Humans go further still by exploiting our linguistic abilities: we perform zero-shot learning without seeing a single example. You can recognize a picture of a zebra after hearing the description "horse-like animal with black and white stripes" without ever having seen one.
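To make the zebra example concrete, below is a minimal sketch of attribute-based zero-shot recognition. Everything here is invented for illustration: the attribute vocabulary, the class definitions, and the `predict_attributes` stub, which stands in for per-attribute classifiers trained only on seen classes.

```python
# Minimal sketch of attribute-based zero-shot recognition (toy data).
import numpy as np

# Shared attribute vocabulary bridging language and vision (illustrative).
# Attribute order: horse-like, black-and-white, striped, spotted, humped.
CLASS_ATTRIBUTES = {
    "horse": np.array([1, 0, 0, 0, 0], dtype=float),
    "camel": np.array([0, 0, 0, 0, 1], dtype=float),
    # "zebra" is defined purely by its textual description --
    # "horse-like animal with black and white stripes" -- no images needed.
    "zebra": np.array([1, 1, 1, 0, 0], dtype=float),
}

def predict_attributes(image_features: np.ndarray) -> np.ndarray:
    """Stand-in for per-attribute classifiers trained on seen classes only.

    Here we pretend the image features already are attribute scores.
    """
    return image_features

def zero_shot_classify(image_features: np.ndarray) -> str:
    """Pick the class whose description best matches the predicted attributes."""
    scores = predict_attributes(image_features)

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    return max(CLASS_ATTRIBUTES, key=lambda c: cosine(scores, CLASS_ATTRIBUTES[c]))

# An image whose attribute detectors fire on "horse-like", "black-and-white",
# and "striped" is classified as a zebra, with zero zebra training examples.
print(zero_shot_classify(np.array([0.9, 0.8, 0.85, 0.1, 0.0])))  # -> zebra
```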

Furthermore, integrating language with vision opens up new horizons and tasks for the vision community. We have seen significant growth in image- and video-to-text tasks, but many other potential applications of such integration, among them question answering, dialogue systems, and grounded language acquisition, remain largely unexplored. Going beyond such novel tasks, language can make a deeper contribution to vision: it provides a prism through which to understand the world. A major difference between human and machine vision is that humans form a coherent, global understanding of a scene. This process is facilitated by our ability to shape perception with high-level knowledge, which lends resilience in the face of errors from low-level perception. Language also provides a framework through which to learn about the world: it can describe many phenomena succinctly, thereby helping to filter out irrelevant details.
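The claim that high-level knowledge lends resilience to low-level errors can be illustrated with a toy example. All numbers below are invented, and the prior stands in for co-occurrence statistics one might mine from text corpora.

```python
# Toy example: a language-derived prior corrects a noisy detector (made-up numbers).
import numpy as np

labels = ["dog", "wolf"]
detector_scores = np.array([0.48, 0.52])    # low-level evidence slightly favors "wolf"
living_room_prior = np.array([0.95, 0.05])  # text says wolves rarely appear in living rooms

# Combine the low-level evidence with the high-level prior and renormalize.
posterior = detector_scores * living_room_prior
posterior /= posterior.sum()

print(labels[int(np.argmax(posterior))], posterior.round(3))  # -> dog [0.946 0.054]
```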

Call for Papers

Important Dates

  • May 31, 2017: Paper submission deadline
  • June 15, 2017: Acceptance notification
  • July 21, 2017: Registration deadline
  • July 21, 2017: Workshop date

Scope

Topics covered (non-exhaustive):

  • language as a mechanism to structure and reason about visual perception

  • language as a learning bias to aid vision in both machines and humans

  • novel tasks which combine language and vision

  • dialogue as a means of sharing knowledge about visual perception

  • stories as a means of abstraction

  • transfer learning across language and vision

  • understanding the relationship between language and vision in humans

  • reasoning visually about language problems

  • visual captioning, dialogue, and question answering

  • visual synthesis from language

  • sequence learning towards bridging vision and language

  • joint video and language alignment and parsing

  • video sentiment analysis
