CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow

Title:
CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow
Journal Title:
The IEEE / CVF Computer Vision and Pattern Recognition Conference
DOI:
Publication Date:
24 June 2022
Citation:
Xiuchao Sui, Shaohua Li, Xue Geng et al. CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 17602-17611
Abstract:
Optical flow estimation aims to find the 2D motion field by identifying corresponding pixels between two images. Despite the tremendous progress of deep learning-based optical flow methods, it remains a challenge to accurately estimate large displacements with motion blur. This is mainly because the correlation volume, the basis of pixel matching, is computed as the dot product of the convolutional features of the two images. The locality of convolutional features makes the computed correlations susceptible to various noises. On large displacements with motion blur, noisy correlations could cause severe errors in the estimated flow. To overcome this challenge, we propose a new architecture "CRoss-Attentional Flow Transformer" (CRAFT), aiming to revitalize the correlation volume computation. In CRAFT, a Semantic Smoothing Transformer layer transforms the features of one frame, making them more global and semantically stable. In addition, the dot-product correlations are replaced with transformer Cross-Frame Attention. This layer filters out feature noises through the Query and Key projections, and computes more accurate correlations. On Sintel (Final) and KITTI (foreground) benchmarks, CRAFT has achieved new state-of-the-art performance. Moreover, to test the robustness of different models on large motions, we designed an image shifting attack that shifts input images to generate large artificial motions. Under this attack, CRAFT performs much more robustly than two representative methods, RAFT and GMA. The code of CRAFT is available at https://github.com/askerlee/craft.
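Illustrative sketch:
The abstract contrasts a plain dot-product correlation volume with correlations computed after Query and Key projections, and describes an attack that shifts an input image. The NumPy sketch below shows these three ideas in their simplest form. It is not the authors' implementation (the repository linked in the abstract is the authoritative source). The function names, the projection matrices wq and wk, the square-root scaling, and the zero padding in the shift are all assumptions made for illustration.

```python
import numpy as np


def correlation_volume(f1, f2):
    """All-pairs dot-product correlation of two (H, W, C) feature maps.

    Returns an (H, W, H, W) array; entry [i, j, k, l] compares pixel (i, j)
    of frame 1 with pixel (k, l) of frame 2. The 1/sqrt(C) scaling is an
    assumed convention, not taken from the record above.
    """
    h, w, c = f1.shape
    a = f1.reshape(h * w, c)
    b = f2.reshape(h * w, c)
    return (a @ b.T / np.sqrt(c)).reshape(h, w, h, w)


def cross_frame_correlation(f1, f2, wq, wk):
    """Correlation computed after hypothetical Query/Key projections.

    wq and wk have shape (C, D). Frame 1 features act as queries and
    frame 2 features as keys, as in a single attention head.
    """
    h, w, c = f1.shape
    q = f1.reshape(h * w, c) @ wq
    k = f2.reshape(h * w, c) @ wk
    return (q @ k.T / np.sqrt(wq.shape[1])).reshape(h, w, h, w)


def shift_image(img, dx, dy):
    """Shift an image by (dx, dy) pixels, filling exposed borders with zeros.

    Assumes |dx| < width and |dy| < height. The zero fill is an assumption;
    the record does not say how the attack handles borders.
    """
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out
```

With identity matrices for wq and wk, cross_frame_correlation reduces to correlation_volume, which makes the role of the learned projections explicit: they are the only place where the features can be filtered before matching. Shifting the second frame by (dx, dy) adds the same offset to every true displacement, which is one plausible way to produce the large artificial motions the abstract mentions.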
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the A*STAR - Career Development Fund
Grant Reference no. : C210812035

This research / project is supported by the A*STAR - Career Development Fund
Grant Reference no. : C210112016

This research / project is supported by the A*STAR - Human-Robot Collaborative AI for Advanced Manufacturing and Engineering programme
Grant Reference no. : A18A2b0046
Description:
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Pages:
17602-17611