October 17th, 8:00 - 11:00 am EDT
Time (EDT) | Speaker | Topic |
---|---|---|
8:00-8:10 am EDT | Organizers | Welcome & introduction |
8:10-8:40 am EDT | Alan Yuille | Invited talks topic 1 |
8:40-8:50 am EDT | OVIS 3rd place team | A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings |
8:50-9:00 am EDT | OVIS 1st place team | Limited Sampling Reference Frame for MaskTrack R-CNN |
9:00-9:10 am EDT | Khalid J. Almalki | Oral session - Characterizing Scattered Occlusions for Effective Dense-Mode Crowd Counting |
9:10-9:20 am EDT | Heechul Bae | Oral session - Occluded Video Instance Segmentation with Set Prediction Approach |
9:20-9:30 am EDT | Shane Gilroy | Oral session - Pedestrian Occlusion Level Classification using Keypoint Detection and 2D Body Surface Area Estimation |
9:30-10:00 am EDT | Hanwang Zhang | Invited talks topic 2 |
In the visual world, objects rarely occur in isolation. Psychophysical and computational studies have demonstrated that the human visual system can perceive heavily occluded objects through contextual reasoning and association. The question then becomes: can our video understanding systems perceive objects that are severely occluded? The OVIS competition will be hosted on an online platform, and presentations will be delivered on Zoom.
We use average precision (AP) at different intersection-over-union (IoU) thresholds and average recall (AR) as our evaluation metrics, following YouTube-VIS. The IoU in video instance segmentation is the sum of the per-frame intersection areas divided by the sum of the per-frame union areas across the video.
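The video-level IoU described above can be illustrated with a minimal sketch. This is not the official evaluation code: the function name `video_iou`, the representation of a mask as a set of `(row, col)` pixel coordinates, and the use of `None` for frames where an instance is absent are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Set, Tuple

Pixel = Tuple[int, int]
# Hypothetical mask representation: a set of pixel coordinates,
# or None when the instance does not appear in that frame.
Mask = Optional[Set[Pixel]]


def video_iou(pred: Sequence[Mask], gt: Sequence[Mask]) -> float:
    """Sum of per-frame intersections divided by sum of per-frame unions."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must cover the same number of frames")

    intersection = 0
    union = 0
    for p, g in zip(pred, gt):
        p = p or set()  # treat a missing mask as an empty mask
        g = g or set()
        intersection += len(p & g)
        union += len(p | g)

    # Assumption: an instance that is empty in every frame scores 0.
    return intersection / union if union > 0 else 0.0


# Three-frame toy example; the prediction misses the object in frame 2.
pred = [{(0, 0), (0, 1)}, None, {(1, 1)}]
gt = [{(0, 1), (0, 2)}, {(2, 2)}, {(1, 1)}]
score = video_iou(pred, gt)  # (1 + 0 + 1) / (3 + 1 + 1) = 0.4
```

Because the areas are accumulated over all frames before dividing, a frame in which the prediction misses the object entirely still adds to the union and lowers the score, which is what makes the metric sensitive to objects lost under occlusion.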
Competition | Date |
---|---|
Competition Phase 1 (open the submission of the val results) | June 1, 2021 (11:59PM Pacific Time) |
Competition Phase 2 (open the submission of the test results) | July 25, 2021 (11:59PM Pacific Time) |
Deadline for Submitting the Final Predictions | August 1, 2021 (11:59PM Pacific Time) |
Decisions to Participants | August 6, 2021 (11:59PM Pacific Time) |
Rank | Team Name | Team Members | Organization | Technical Report |
---|---|---|---|---|
1st | Ach | Zhuang Li, Leilei Cao, Hongbin Wang | Ant Group | |
2nd | huapohen | Wenbo Li, Xuesheng Li, Qiwei Xu, Chen Li, Jiaxue Wang, Zongxiang Fu | University of Electronic Science and Technology of China, Chengdu DELU Dynamics Ltd | |
3rd | Ali2500 | Ali Athar¹, Sabarinath Mahadevan¹, Aljosa Osep², Bastian Leibe¹ | ¹ RWTH Aachen University, ² Carnegie Mellon University | |
Although deep learning methods have achieved strong video object recognition performance in recent years, perceiving objects in heavily occluded video scenes is still a very challenging task. The difficulty of precisely localizing and reasoning about heavily occluded objects in videos reveals that current deep learning models behave differently from the human visual system, and it confirms the urgent need to design new paradigms for video understanding.
Paper | Date |
---|---|
Submission Deadline | July 25, 2021 (11:59PM Pacific Time) |
Author Notification | August 9, 2021 (11:59PM Pacific Time) - Extended |
Camera-ready Due | August 16, 2021 (11:59PM Pacific Time) - Extended |