TY - JOUR
T1 - Intelligent human action recognition using an ensemble model of evolving deep networks with swarm-based optimization
AU - Zhang, Li
AU - Lim, Chee Peng
AU - Yu, Yonghong
PY - 2021/5/23
Y1 - 2021/5/23
N2 - Automatic interpretation of human actions from realistic videos attracts increasing research attention owing to its growing demand in real-world deployments such as biometrics, intelligent robotics, and surveillance. In this research, we propose an ensemble model of evolving deep networks comprising Convolutional Neural Networks (CNNs) and bidirectional Long Short-Term Memory (BLSTM) networks for human action recognition. A swarm intelligence (SI)-based algorithm is also proposed for identifying the optimal hyper-parameters of the deep networks. The SI algorithm plays a crucial role for determining the BLSTM network and learning configurations such as the learning and dropout rates and the number of hidden neurons, in order to establish effective deep features that accurately represent the temporal dynamics of human actions. The proposed SI algorithm incorporates hybrid crossover operators implemented by sine, cosine, and tanh functions for multiple elite offspring signal generation, as well as geometric search coefficients extracted from a three-dimensional super-ellipse surface. Moreover, it employs a versatile search process led by the yielded promising offspring solutions to overcome stagnation. Diverse CNN–BLSTM networks with distinctive hyper-parameter settings are devised. An ensemble model is subsequently constructed by aggregating a set of three optimized CNN–BLSTM networks based on the average prediction probabilities. Evaluated using several publicly available human action data sets, our evolving ensemble deep networks illustrate statistically significant superiority over those with default and optimal settings identified by other search methods. The proposed SI algorithm also shows great superiority over several other methods for solving diverse high-dimensional unimodal and multimodal optimization functions with artificial landscapes.
AB - Automatic interpretation of human actions from realistic videos attracts increasing research attention owing to its growing demand in real-world deployments such as biometrics, intelligent robotics, and surveillance. In this research, we propose an ensemble model of evolving deep networks comprising Convolutional Neural Networks (CNNs) and bidirectional Long Short-Term Memory (BLSTM) networks for human action recognition. A swarm intelligence (SI)-based algorithm is also proposed for identifying the optimal hyper-parameters of the deep networks. The SI algorithm plays a crucial role for determining the BLSTM network and learning configurations such as the learning and dropout rates and the number of hidden neurons, in order to establish effective deep features that accurately represent the temporal dynamics of human actions. The proposed SI algorithm incorporates hybrid crossover operators implemented by sine, cosine, and tanh functions for multiple elite offspring signal generation, as well as geometric search coefficients extracted from a three-dimensional super-ellipse surface. Moreover, it employs a versatile search process led by the yielded promising offspring solutions to overcome stagnation. Diverse CNN–BLSTM networks with distinctive hyper-parameter settings are devised. An ensemble model is subsequently constructed by aggregating a set of three optimized CNN–BLSTM networks based on the average prediction probabilities. Evaluated using several publicly available human action data sets, our evolving ensemble deep networks illustrate statistically significant superiority over those with default and optimal settings identified by other search methods. The proposed SI algorithm also shows great superiority over several other methods for solving diverse high-dimensional unimodal and multimodal optimization functions with artificial landscapes.
U2 - 10.1016/j.knosys.2021.106918
DO - 10.1016/j.knosys.2021.106918
M3 - Article
SN - 0950-7051
VL - 220
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 106918
ER -