Channel Attention Is All You Need for Video Frame Interpolation
Authors
- Myungsub Choi, Seoul National University
- Heewon Kim, Seoul National University
- Bohyung Han, Seoul National University
- Ning Xu, Amazon Go
- Kyoung Mu Lee, Seoul National University
DOI:
https://doi.org/10.1609/aaai.v34i07.6693
Abstract
Prevailing video frame interpolation techniques rely heavily on optical flow estimation, which adds model complexity and computational cost; they are also susceptible to error propagation in challenging scenarios with large motion and heavy occlusion. To alleviate these limitations, we propose a simple but effective deep neural network for video frame interpolation, which is end-to-end trainable and free from a motion estimation network component. Our algorithm employs a special feature reshaping operation, referred to as PixelShuffle, together with channel attention, which replaces the optical flow computation module. The main idea behind the design is to distribute the information in a feature map into multiple channels and extract motion information by attending to the channels for pixel-level frame synthesis. The model given by this principle turns out to be effective in the presence of challenging motion and occlusion. We construct a comprehensive evaluation benchmark and demonstrate that the proposed approach achieves outstanding performance compared to existing models with a component for optical flow computation.
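The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: spatial information is redistributed into channels, and the channels are then reweighted by an attention gate. It assumes a space-to-depth rearrangement (the reverse direction of sub-pixel shuffling) and a squeeze-and-excitation style gate with random, untrained weights. The function names, the shuffle factor `r = 2`, and the layer sizes are hypothetical choices for illustration, not the authors' architecture.

```python
import numpy as np

def space_to_depth(x, r):
    """Move each r x r spatial block into channels: (C, H, W) -> (C*r*r, H//r, W//r)."""
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)           # (C, r, r, H//r, W//r)
    return x.reshape(c * r * r, h // r, w // r)

def depth_to_space(x, r):
    """Inverse of space_to_depth: (C*r*r, H, W) -> (C, H*r, W*r)."""
    c, h, w = x.shape
    assert c % (r * r) == 0, "channel count must be divisible by r*r"
    x = x.reshape(c // (r * r), r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)           # (C, H, r, W, r)
    return x.reshape(c // (r * r), h * r, w * r)

def channel_attention(x, w1, b1, w2, b2):
    """Assumed squeeze-and-excitation style gate: one weight in (0, 1) per channel."""
    squeezed = x.mean(axis=(1, 2))                      # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))    # sigmoid -> (C,)
    return x * gate[:, None, None], gate

# Two hypothetical 8x8 RGB input frames, stacked along the channel axis.
rng = np.random.default_rng(0)
frames = np.concatenate([rng.random((3, 8, 8)), rng.random((3, 8, 8))], axis=0)

feats = space_to_depth(frames, 2)            # (6, 8, 8) -> (24, 4, 4)

# Random (untrained) gate parameters with a bottleneck of 6 units.
w1, b1 = 0.1 * rng.standard_normal((6, 24)), np.zeros(6)
w2, b2 = 0.1 * rng.standard_normal((24, 6)), np.zeros(24)
attended, gate = channel_attention(feats, w1, b1, w2, b2)

restored = depth_to_space(attended, 2)       # back to (6, 8, 8)
```

After the rearrangement, pixels that were spatial neighbours sit in different channels at the same location, so a per-channel gate can emphasise some spatial offsets over others. In a trained network the gate parameters would be learned end to end; here they are random and only demonstrate the shapes involved.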
How to Cite
Choi, M., Kim, H., Han, B., Xu, N., & Lee, K. M. (2020). Channel Attention Is All You Need for Video Frame Interpolation. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 10663-10671. https://doi.org/10.1609/aaai.v34i07.6693
Section
AAAI Technical Track: Vision