
You Only Look Once in Panorama: Object Detection for 360° Videos with MLaaS

88 Citations · 2024
Linfeng Shen, Miao Zhang, Cong Zhang
Proceedings of the 34th edition of the Workshop on Network and Operating System Support for Digital Audio and Video

A novel MLaaS-based system that partitions 360° frames into distortion-free 2D regions using dynamic region-of-interest prediction, seamlessly combines the 2D regions into a unified frame, and demonstrates its effectiveness in 360° video object detection tasks.

Abstract

360° videos are gaining popularity, but immersive analytics, particularly object detection, face challenges from complex scenes and high data volume. These challenges impose significant burdens on individual users and resource-limited edge devices. Fortunately, Machine Learning as a Service (MLaaS) offers an economical solution for quick deployment without specific hardware or expertise. However, current MLaaS offerings are mostly designed for 2D images and are not optimized for the distinctive characteristics of raw 360° video frames. In this paper, we propose a novel MLaaS-based system to address this challenge. Our solution partitions 360° frames into distortion-free 2D regions using dynamic region-of-interest prediction. We then present an image-stitching algorithm featuring Skyline representation, which seamlessly combines all the 2D regions into a unified frame. This frame is then transmitted to the MLaaS platform, and the detected objects are back-projected to yield the final results. Our experiments demonstrate the superiority of this system over baselines, showing its effectiveness in 360° video object detection tasks.
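The abstract does not spell out the stitching algorithm, so the following is only a minimal sketch of the general idea, assuming a generic bottom-left skyline packing heuristic rather than the authors' actual method. It packs rectangular 2D regions into one fixed-width canvas and then maps a detection box reported on that canvas back to the region it came from. The names `skyline_pack` and `to_region_coords`, the integer pixel sizes, and the centre-point rule for assigning a box to a region are all illustrative assumptions. Mapping region-local coordinates back onto the sphere is not shown, since it depends on the projection used for each region.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]         # (x, y, width, height) on the canvas
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def skyline_pack(sizes: List[Tuple[int, int]], canvas_width: int) -> Tuple[List[Rect], int]:
    """Place each (width, height) region using a bottom-left skyline heuristic.

    The skyline is a left-to-right list of (x, y, width) segments that records
    the current top edge of everything placed so far. Hypothetical helper,
    not the paper's algorithm.
    """
    skyline = [(0, 0, canvas_width)]
    placements: List[Rect] = []
    for w, h in sizes:
        if w > canvas_width:
            raise ValueError("region is wider than the canvas")
        best = None  # (top edge after placement, x, y)
        for i, (sx, _, _) in enumerate(skyline):
            if sx + w > canvas_width:
                break  # segments are sorted by x, so later ones cannot fit either
            # The region must rest on the highest segment it spans.
            y = 0
            for tx, ty, _ in skyline[i:]:
                if tx >= sx + w:
                    break
                y = max(y, ty)
            candidate = (y + h, sx, y)
            if best is None or candidate < best:
                best = candidate
        top, x, y = best
        placements.append((x, y, w, h))
        # Rebuild the skyline: keep what lies left of the region, add the
        # region's top edge, then keep whatever sticks out to its right.
        updated = [s for s in skyline if s[0] + s[2] <= x]
        updated.append((x, top, w))
        for tx, ty, tw in skyline:
            end = tx + tw
            if end > x + w:
                start = max(tx, x + w)
                updated.append((start, ty, end - start))
        skyline = updated
    canvas_height = max((y + h for _, y, _, h in placements), default=0)
    return placements, canvas_height


def to_region_coords(box: Box, placements: List[Rect]) -> Optional[Tuple[int, Box]]:
    """Map a canvas-space detection box to (region index, region-local box).

    Assumption: a box belongs to the region that contains its centre.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    for idx, (px, py, pw, ph) in enumerate(placements):
        if px <= cx < px + pw and py <= cy < py + ph:
            return idx, (x1 - px, y1 - py, x2 - px, y2 - py)
    return None  # the centre fell into unused canvas space


if __name__ == "__main__":
    regions = [(4, 3), (3, 2), (5, 1)]  # made-up region sizes in pixels
    placements, height = skyline_pack(regions, canvas_width=8)
    print(placements, height)
    print(to_region_coords((5, 0, 6, 1), placements))
```

Tracking only the top contour keeps each placement cheap, which matters when the set of regions changes from frame to frame. A production version would likely also merge adjacent skyline segments of equal height and handle boxes that straddle two regions.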