Weakly supervised action segmentation with effective use of attention and self-attention

Title:
Weakly supervised action segmentation with effective use of attention and self-attention
Journal Title:
Computer Vision and Image Understanding
Publication Date:
12 October 2021
Citation:
Ng, Y. B., & Fernando, B. (2021). Weakly supervised action segmentation with effective use of attention and self-attention. Computer Vision and Image Understanding, 213, 103298. doi:10.1016/j.cviu.2021.103298
Abstract:
This paper presents a novel hybrid sequence-to-sequence model that generates human action sequences, outputting actions in the chronological order in which they are performed within the longer activity of a given video. At test time, our models generate an action label for each frame using only weak supervision. We evaluate several sequence-to-sequence models on this task and demonstrate that they can solve action segment generation on three challenging action recognition datasets. We show how to use self-attention and standard attention mechanisms with known sequence-to-sequence models for weakly supervised video action segmentation. Our new architecture, a combination of recurrent and transformer-based sequence-to-sequence models, is effective for weakly supervised action segmentation: it uses Transformer and GRU encoders to encode temporal information, and it applies both self-attention and standard attention during decoding. We also introduce an effective positional weight prior that further improves segmentation performance. Using this architecture, the two types of attention, and the positional weight prior, we obtain state-of-the-art results on the Breakfast and 50Salads datasets for weakly supervised action segmentation.
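The idea of biasing decoder attention with a positional weight prior can be sketched as follows. This is a minimal illustration only, assuming a Gaussian prior over relative frame position that is combined with scaled dot-product attention scores in log space; the function names, the exact prior form, and the `sigma` parameter are our assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend_with_positional_prior(query, keys, values, step, n_steps, sigma=0.1):
    """Scaled dot-product attention biased by a Gaussian positional prior.

    The prior encourages decoding step `step` (of `n_steps`) to attend to
    frames near the proportionally matching position in the video -- one
    plausible reading of a 'positional weight prior' (an assumption here).
    """
    T, d = keys.shape
    scores = keys @ query / np.sqrt(d)            # content scores, shape (T,)
    centre = (step + 0.5) / n_steps               # expected relative position of this step
    frame_pos = (np.arange(T) + 0.5) / T          # relative position of each frame
    log_prior = -((frame_pos - centre) ** 2) / (2 * sigma ** 2)
    weights = softmax(scores + log_prior)         # combine score and prior in log space
    return weights @ values, weights

# Toy example: 8 frames of 4-dim features, decoding the first of 4 action steps.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))
context, w = attend_with_positional_prior(rng.normal(size=4), feats, feats,
                                          step=0, n_steps=4)
```

With a small `sigma`, the first decoding step concentrates its attention mass on early frames regardless of the (random) content scores, which is the intended regularising effect of such a prior.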
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
This research / project is supported by the National Research Foundation, Singapore - AI Singapore Programme
Grant Reference no. : AISG-RP-2019-010
Description:
ISSN:
1077-3142
Files uploaded:

main.pdf (1.10 MB, PDF)