Audio–Visual Segmentation based on robust principal component analysis

Title:
Audio–Visual Segmentation based on robust principal component analysis
Journal Title:
Expert Systems with Applications
Publication Date:
05 December 2024
Citation:
Fang, S., Zhu, Q., Wu, Q., Wu, S., & Xie, S. (2024). Audio–Visual Segmentation based on robust principal component analysis. Expert Systems with Applications, 256, 124885. https://doi.org/10.1016/j.eswa.2024.124885
Abstract:
Audio–Visual Segmentation (AVS) aims to extract the sounding objects from a video. Current learning-based AVS methods are typically supervised, relying on task-specific data annotations and expensive model training. Recognizing that the background of a video captured by a static camera can be represented as a low-rank matrix, we introduce non-convex robust principal component analysis into the AVS task in this paper. This approach is unsupervised and relies only on patterns in the input data. Specifically, the proposed method decomposes each modality into the sum of two parts: a low-rank part that represents the background audio and visual information, and a sparse part that represents the foreground information. Furthermore, CUR decomposition is employed at each iteration to reduce the computational complexity of the optimization. The experimental results also show that the proposed method outperforms supervised methods on the AVS-Bench Single-Source datasets.
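For illustration, the background/foreground split described in the abstract can be sketched with a generic convex RPCA solved by singular value thresholding plus soft thresholding (an inexact augmented Lagrangian scheme). This is only a minimal sketch of the underlying decomposition M = L + S, not the paper's non-convex, CUR-accelerated algorithm; the function names, default lam and mu choices, and stopping rule below are illustrative assumptions.

# Minimal sketch of RPCA-style low-rank + sparse decomposition (illustrative,
# not the paper's non-convex, CUR-accelerated method).
import numpy as np

def soft_threshold(X, tau):
    # Elementwise shrinkage: proximal operator of the l1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    # Singular value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt

def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    # Decompose M into a low-rank part L (background) and a sparse part S
    # (foreground), i.e. M ~= L + S. Defaults follow common convex-RPCA choices.
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multipliers
    norm_M = np.linalg.norm(M, "fro") + 1e-12
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)   # update low-rank part
        S = soft_threshold(M - L + Y / mu, lam / mu)  # update sparse part
        residual = M - L - S
        Y = Y + mu * residual                         # dual ascent step
        if np.linalg.norm(residual, "fro") / norm_M < tol:
            break
    return L, S

# Usage idea: stack vectorized frames of a static-camera video as the columns
# of M; L then captures the low-rank background and S the sparse foreground.

In the paper's setting, a CUR decomposition is used at each iteration in place of a full SVD to reduce the computational cost; the full SVD above is kept only for simplicity of the sketch.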
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
No specific funding was received for this research.
Description:
© 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
ISSN:
0957-4174
Files uploaded:

accepted-version-audio-visual-segmentation-based-on-robust-principal-component-analysis.pdf (6.10 MB, PDF)