Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation

Title:
Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation
Journal Title:
Proceedings of the 30th ACM International Conference on Multimedia
Publication Date:
10 October 2022
Citation:
Wang, X., Xu, Y., Yang, J., & Mao, K. (2022). Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation. Proceedings of the 30th ACM International Conference on Multimedia. https://doi.org/10.1145/3503161.3548095
Abstract:
Assuming the source label space subsumes the target one, Partial Video Domain Adaptation (PVDA) is a more general and practical scenario for cross-domain video classification problems. The key challenge of PVDA is to mitigate the negative transfer caused by the source-only outlier classes. To tackle this challenge, a crucial step is to aggregate target predictions into class weights that up-weigh target classes and down-weigh outlier classes. However, incorrect class-weight predictions can mislead the network and cause negative transfer. Previous works improve class weight accuracy by utilizing temporal features and attention mechanisms, but these methods may fall short of generating accurate class weights when domain shifts are significant, as in most real-world scenarios. To deal with these challenges, we first propose the Multi-modality partial Adversarial Network (MAN), which utilizes multi-scale and multi-modal information to enhance PVDA performance. Based on MAN, we then propose the Multi-modality Cluster-calibrated partial Adversarial Network (MCAN). It utilizes a novel class weight calibration method to alleviate the negative transfer caused by incorrect class weights. Specifically, the calibration method tries to identify and weigh correct and incorrect predictions using distributional information implied by unsupervised clustering. Extensive experiments are conducted on prevailing PVDA benchmarks, and the proposed MCAN achieves significant improvements when compared to state-of-the-art PVDA methods.
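The abstract's core step, aggregating target predictions into per-class weights, can be illustrated with a minimal sketch. This is not the paper's MCAN implementation (which additionally calibrates weights with clustering); it only shows the basic aggregation idea common to partial domain adaptation, with the function name and normalization choice being illustrative assumptions:

```python
import numpy as np

def estimate_class_weights(target_probs):
    """Illustrative sketch (not the paper's code): aggregate per-sample
    softmax predictions on the target domain into per-class weights.
    Classes receiving little probability mass (likely source-only
    outlier classes) are down-weighed."""
    # target_probs: array of shape (num_samples, num_classes)
    weights = target_probs.mean(axis=0)   # average predicted probability per class
    weights = weights / weights.max()     # rescale so the largest weight is 1
    return weights

# Toy example: 4 target samples, 3 source classes; class 2 is a
# source-only outlier that no target sample is confidently assigned to.
probs = np.array([
    [0.70, 0.25, 0.05],
    [0.60, 0.35, 0.05],
    [0.20, 0.75, 0.05],
    [0.50, 0.45, 0.05],
])
w = estimate_class_weights(probs)
# w = [1.0, 0.9, 0.1]: the outlier class 2 receives a small weight
```

In practice such weights would re-weigh the classification and adversarial losses; MCAN's contribution is to calibrate them with clustering-derived distributional information before use.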
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the A*STAR AME Programmatic Funds
Grant Reference no.: A20H6b0151

This research/project is supported by the Nanyang Technological University NTU Presidential Postdoctoral Fellowship
Grant Reference no.: NA
Description:
© Author | ACM 2022. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 30th ACM International Conference on Multimedia, http://dx.doi.org/10.1145/3503161.3548095
ISBN:
978-1-4503-9203-7
Files uploaded:

File: mcan-final-amended.pdf (1.64 MB, PDF)