Contrastive Adversarial Knowledge Distillation for Deep Model Compression in Time-Series Regression Tasks

Title:
Contrastive Adversarial Knowledge Distillation for Deep Model Compression in Time-Series Regression Tasks
Other Titles:
Neurocomputing
Publication Date:
04 November 2021
Citation:
Qing Xu, Zhenghua Chen, Mohamed Ragab, Chao Wang, Min Wu, Xiaoli Li, Contrastive Adversarial Knowledge Distillation for Deep Model Compression in Time-Series Regression Tasks, Neurocomputing, 2021.
Abstract:
Knowledge distillation (KD) compresses a deep teacher model into a shallow student model by training the student to mimic the teacher's outputs. However, conventional KD approaches have two shortcomings. First, existing KD approaches align only the global feature distribution between the teacher and student models and overlook fine-grained features. Second, most existing approaches focus on classification tasks and require the teacher and student architectures to be similar. To address these limitations, we propose a contrastive adversarial knowledge distillation (CAKD) method for time-series regression tasks in which the student and teacher use different architectures. Specifically, we first apply adversarial adaptation to automatically align the feature distributions of the student and teacher networks. However, adversarial adaptation alone aligns only the global feature distribution without considering fine-grained features. To mitigate this issue, we employ a novel contrastive loss for instance-wise alignment between the student and teacher: we maximize the similarity between teacher and student features that originate from the same sample. Lastly, a KD loss transfers knowledge between the two architecturally different networks. We evaluate the model on a turbofan engine dataset consisting of four sub-datasets. The results show that the proposed CAKD method consistently outperforms state-of-the-art methods on two different metrics.
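The instance-wise contrastive alignment described in the abstract can be sketched as an InfoNCE-style loss over a batch: teacher and student features from the same sample form the positive pair, and all other cross-network pairs in the batch act as negatives. Everything below (the function name, the temperature value, and the assumption that both networks' features have already been projected to a common dimension) is illustrative, not the paper's exact formulation.

```python
import numpy as np

def contrastive_alignment_loss(student_feats, teacher_feats, temperature=0.5):
    """Hypothetical sketch of an instance-wise contrastive alignment loss.

    student_feats, teacher_feats: (B, D) arrays of batch features, assumed
    already projected to the same dimension D. Same-index rows (features of
    the same input sample) are positives; all other pairs are negatives.
    """
    # L2-normalise both feature sets so the dot product is cosine similarity.
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    logits = s @ t.T / temperature            # (B, B) similarity matrix
    # Softmax cross-entropy with the diagonal (same-sample pairs) as targets.
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Minimizing this loss pulls same-sample teacher/student features together while pushing apart features from different samples, which is the fine-grained alignment the adversarial term alone does not provide; in CAKD this term would be combined with the adversarial and KD losses in the overall objective.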
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
This research / project is supported by the National Research Foundation of Singapore - Industrial Internet of Things Research Program
Grant Reference no. : A1788a0023

This research / project is supported by the National Natural Science Foundation of China - NA
Grant Reference no. : 61976200

This research / project is supported by the National Research Foundation of Singapore - AME Young Individual Research Grant (YIRG)
Grant Reference no. : A2084c0167
Description:
ISSN:
0925-2312
Files uploaded:

File: contrastive-and-adversarial-knowledge-distillation-for-model-compression-pp.pdf (1.54 MB, PDF)