A Log-Likelihood Regularized KL Divergence for Video Prediction With a 3D Convolutional Variational Recurrent Network

Title:
A Log-Likelihood Regularized KL Divergence for Video Prediction With a 3D Convolutional Variational Recurrent Network
Journal Title:
Winter Conference on Applications of Computer Vision 2021 - Generation of Human Behavior
DOI:
Publication URL:
Publication Date:
06 January 2021
Citation:
Abstract:
Latent variable models have proven to be a powerful tool for modeling probability distributions over sequences. In this paper, we introduce a new variational model that extends the recurrent network in two ways for the task of video frame prediction. First, we introduce 3D convolutions inside all modules, including the recurrent model for future frame prediction, so that each timestep inputs and outputs a sequence of video frames. This lets us better exploit spatiotemporal information inside the variational recurrent model and generate high-quality predictions. Second, we enhance the latent loss of the variational model by introducing a maximum likelihood estimate in addition to the KL divergence commonly used in variational models. This simple extension acts as a stronger regularizer in the variational autoencoder loss function and yields better results and generalizability. Experiments show that our model outperforms existing video prediction methods on several benchmarks while requiring fewer parameters.
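The latent loss described in the abstract combines the standard KL divergence with an additional log-likelihood term. The following is a minimal PyTorch sketch of one plausible reading, assuming the extra term is the negative log-likelihood of the reparameterized posterior sample under the (learned) prior; the function name latent_loss, the weighting factor beta, and the exact form of the regularizer are illustrative assumptions, not taken from the paper.

# Hypothetical sketch of a log-likelihood regularized KL latent loss.
# The regularizer here is the negative log-likelihood of the posterior
# sample under the prior; weighting and form are assumptions.
import torch
import torch.distributions as D

def latent_loss(mu_q, logvar_q, mu_p, logvar_p, beta=1.0):
    q = D.Normal(mu_q, torch.exp(0.5 * logvar_q))  # approximate posterior
    p = D.Normal(mu_p, torch.exp(0.5 * logvar_p))  # (learned) prior
    z = q.rsample()                                # reparameterized sample
    kl = D.kl_divergence(q, p).sum(dim=-1)         # standard KL term
    nll = -p.log_prob(z).sum(dim=-1)               # log-likelihood regularizer
    return (kl + beta * nll).mean()

In a full variational recurrent model, mu_p and logvar_p would presumably come from a prior network conditioned on past frames, and this latent term would be added to the frame reconstruction loss.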
License type:
Publisher Copyrights
Funding Info:
This research is supported by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-RP-2019-010).
Description:
ISBN:

Files uploaded:
There are no attached files.