URCDC-Depth: Uncertainty Rectified Cross-Distillation With CutFlip for Monocular Depth Estimation

Title:
URCDC-Depth: Uncertainty Rectified Cross-Distillation With CutFlip for Monocular Depth Estimation
Journal Title:
IEEE Transactions on Multimedia
Publication Date:
30 August 2023
Citation:
Shao, S., Pei, Z., Chen, W., Li, R., Liu, Z., & Li, Z. (2024). URCDC-Depth: Uncertainty Rectified Cross-Distillation With CutFlip for Monocular Depth Estimation. IEEE Transactions on Multimedia, 26, 3341–3353. https://doi.org/10.1109/tmm.2023.3310259
Abstract:
This work aims to estimate a high-quality depth map from a single RGB image. Due to the lack of depth clues, making full use of the long-range correlation and local information is critical for accurate depth estimation. To this end, we introduce an uncertainty rectified cross-distillation between the Transformer and convolutional neural network (CNN) to achieve a comprehensive depth estimator. Specifically, we utilize the depth estimates from the Transformer branch and CNN branch as pseudo labels to teach each other. At the same time, the pixel-wise depth uncertainty is modeled to mitigate the negative impact of noisy pseudo labels. To prevent the large capacity gap induced by the strong Transformer branch from degrading the cross-distillation, we transfer the feature maps from the Transformer to the CNN and develop coupling units to assist the weak CNN branch in leveraging the transferred features. Furthermore, we introduce CutFlip, a surprisingly simple yet highly effective data augmentation technique, which forces the model to focus on more valuable depth reasoning clues apart from the vertical image position. Extensive experiments demonstrate that our model, termed URCDC-Depth, outperforms previous state-of-the-art approaches on the KITTI, NYU-Depthv2 and SUN RGB-D datasets, with no additional computational burden in the evaluation phase. The source code is available at https://github.com/ShuweiShao/URCDC-Depth.
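The abstract names CutFlip but does not spell out the procedure. The following is a minimal sketch under the assumption that CutFlip cuts a training sample at a random row and swaps the upper and lower parts, so that vertical image position stops being a reliable depth cue. The function name `cut_flip`, the probability parameter `p`, and the list-of-rows representation are illustrative choices, not taken from the authors' code.

```python
import random


def cut_flip(image, depth, p=0.5, rng=random):
    """Swap the upper and lower parts of an image/depth pair at a random row.

    Assumed behaviour (not confirmed by the abstract): the same cut row is
    applied to the image and its depth map so they stay pixel-aligned.
    `image` and `depth` are lists of rows with the same height.
    `p` is a hypothetical probability of applying the augmentation.
    """
    height = len(image)
    if len(depth) != height:
        raise ValueError("image and depth must have the same height")
    # Skip samples that are too small to cut, or when the coin flip says no.
    if height < 2 or rng.random() >= p:
        return image, depth
    # Choose a cut row that leaves both parts non-empty.
    cut = rng.randint(1, height - 1)
    # The lower part moves to the top and the upper part to the bottom.
    return image[cut:] + image[:cut], depth[cut:] + depth[:cut]
```

Because the augmentation would only be applied to training samples, a sketch like this is consistent with the abstract's statement that the method adds no computational burden in the evaluation phase.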
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the A*STAR - MTC Programmatic Funds grant, Grant Reference No. M23L7b0021.

This work was supported in part by the National Natural Science Foundation of China under Grants 61620106012 and 61573048, in part by the Key Research and Development Program of Zhejiang Province under Grant 2021C03050, and in part by the Scientific Research Project of Agriculture and Social Development of Hangzhou under Grant 20212013B11.
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
1941-0077
1520-9210
Files uploaded:

File             Size     Format
tmm-revised.pdf  5.80 MB  PDF