When an event occurs in real life, little information may be available at first. However, as pictures and textual descriptions about the event arrive, they form an evolving story that is of critical value to affected or interested persons. This evolving story corresponds to a semantic topic that can be tracked over time with both textual and visual information from multiple social-media sources (i.e., end-users or online services). Moreover, as social-media sources continue to publish information about the story, it becomes critical to select the most relevant information. Thus, from all of this collaborative audio-visual and textual information, one can create summaries or stories of real-world events.
The ICCV 2017 Workshop on Collaborative Visual Stories seeks contributions in the area of visual story creation from collaborative videos, images, and texts provided by professional media and social-media users. The challenges in creating such visual timelines are manifold: alignment of videos and images through sound or timestamps, detection of video intervals of high interest, and caption generation for groups of pictures, among others.
The workshop also welcomes contributions that address specific real-world use cases (e.g., news, life-logs, festivals), where visual and textual content provide complementary evidence that pushes the development of new cross-modal storytelling research. These contributions should use public datasets (e.g., MediaEval, LifeLogs) made available by the authors.
Contributions on the following topics are expected for the CoViStories workshop:
a. Image concept detection
b. Collaborative event-episodes annotation
c. Visual forensics
d. Collaborative event-episodes detection
e. Temporal and semantic media alignment
f. Information freshness
g. Multimodal event timelines
h. Event summary diversity
i. QoE in visual summaries
j. Affect and emotion in visual summaries and caption generation
k. Multimodal caption generation
l. Collaborative media summaries
m. Non-linear and multiple-view summarization
We will accept contributions as full papers and position papers on the above topics. Submissions are limited to 6 pages.
Workshop date: October 23, 2017