SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension

Title:
SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension
Journal Title:
Proceedings of the 30th ACM International Conference on Multimedia
Publication Date:
10 October 2022
Citation:
Weerakoon, D., Subbaraju, V., Tran, T., & Misra, A. (2022). SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension. Proceedings of the 30th ACM International Conference on Multimedia. https://doi.org/10.1145/3503161.3548432
Abstract:
Supporting real-time referring expression comprehension (REC) on pervasive devices is an important capability for human-AI collaborative tasks. Model pruning techniques, applied to DNN models, can enable real-time execution even on resource-constrained devices. However, existing pruning strategies are designed principally for uni-modal applications and suffer a significant loss of accuracy when applied to REC tasks that require fusion of textual and visual inputs. We thus present a multi-modal pruning model, LGMDP, which uses language as a pivot to dynamically and judiciously select the relevant computational blocks that need to be executed. LGMDP also introduces a new SoftSkip mechanism, whereby 'skipped' visual scales are not completely eliminated but approximated with minimal additional computation. Experimental evaluation, using three benchmark REC datasets and an embedded device implementation, shows that LGMDP can achieve 33% latency savings with an accuracy loss of 0.5% to 2%.
License type:
Publisher Copyright
Funding Info:
This research is supported by the A*STAR - AME Programmatic
Grant Reference no.: A18A2b0046

This research is supported by the National Research Foundation - NRF Investigatorship
Grant Reference no.: NRF-NRFI05-2019-0007

This research is supported by the Ministry of Education - AcRF Tier-1 grant
Grant Reference no.: 19-C220-SMU-008
Description:
© Author | ACM 2022. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 30th ACM International Conference on Multimedia, http://dx.doi.org/10.1145/3503161.3548432
ISBN:
978-1-4503-9203-7/22/10