Human Action Recognition Using Hybrid Deep Evolving Neural Networks

Pavan Dasari, Li Zhang, Yonghong Yu, Haoqian Huang, Rong Gao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Human action recognition can be applied in a multitude of diverse domains, such as large-scale active surveillance, threat detection, personal safety in hazardous environments, human assistance, health monitoring, and intelligent robotics. Owing to its high demand in real-world applications, it has drawn significant attention. In this research, we propose hybrid deep neural networks, namely Convolutional Long Short-Term Memory (ConvLSTM) networks and Long-term Recurrent Convolutional Networks (LRCN), for tackling video action classification. In particular, for the LRCN model, different CNN encoder architectures, such as VGG16, ResNet50, DenseNet121 and MobileNet, as well as several Long Short-Term Memory (LSTM) variant decoder architectures, such as LSTM, bidirectional LSTM (BiLSTM) and Gated Recurrent Unit (GRU), are used for spatial-temporal feature extraction to test model performance. We adopt diverse experimental settings, including different numbers of frames per video and different learning configurations, to optimize performance. The empirical results indicate the superiority of MobileNet in combination with a BiLSTM network over other hybrid network settings for action classification on the UCF50 dataset. Owing to the lightweight MobileNet encoder, this LRCN model also achieves a better trade-off between performance and training and inference computational costs, while outperforming existing state-of-the-art methods.
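The LRCN design described above applies a CNN encoder to each video frame and feeds the resulting feature sequence to a recurrent decoder. The following is a minimal PyTorch sketch of that encoder-decoder pattern; the toy CNN and `nn.LSTM(bidirectional=True)` are stand-ins for the paper's MobileNet encoder and BiLSTM decoder, and all layer sizes here are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class TinyLRCN(nn.Module):
    """Hypothetical minimal LRCN: a small per-frame CNN encoder
    followed by a bidirectional LSTM decoder (stand-ins for the
    MobileNet encoder and BiLSTM decoder used in the paper)."""
    def __init__(self, num_classes=50, feat_dim=64, hidden=32):
        super().__init__()
        # Per-frame spatial encoder (toy CNN, not MobileNet)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Temporal decoder over the sequence of frame features
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, clips):
        # clips: (batch, frames, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.encoder(clips.flatten(0, 1))  # (b*t, feat_dim)
        feats = feats.view(b, t, -1)               # (b, t, feat_dim)
        out, _ = self.lstm(feats)                  # (b, t, 2*hidden)
        return self.head(out[:, -1])               # class logits

model = TinyLRCN()
# Two clips of 16 frames at 64x64 resolution; UCF50 has 50 classes
logits = model(torch.randn(2, 16, 3, 64, 64))
print(tuple(logits.shape))  # (2, 50)
```

Flattening the batch and frame dimensions lets one encoder process all frames at once, after which the features are reshaped back into sequences for the recurrent decoder; this is the usual way such spatial-then-temporal hybrids are implemented.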
Original language: English
Title of host publication: International Joint Conference on Neural Networks (IJCNN)
Place of publication: Italy
Publisher: IEEE
Number of pages: 8
ISBN (Electronic): 978-1-7281-8671-9
ISBN (Print): 978-1-6654-9526-4
DOIs
Publication status: Published - 30 Sept 2022
