Journal on Communications, 2022, Vol. 43, Issue (1): 138-148. doi: 10.11959/j.issn.1000-436x.2022020
Hongyan WANG1,2, Hai YUAN2
Revised: 2021-12-13
Online: 2022-01-25
Published: 2022-01-01
Hongyan WANG, Hai YUAN. Action recognition method based on fusion of skeleton and apparent features[J]. Journal on Communications, 2022, 43(1): 138-148.
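| Data | Feature | Method | Cross subject | Cross view |
|---|---|---|---|---|
| Skeleton sequences | Hand-crafted | LARP | 50.08% | 52.76% |
| Skeleton sequences | Hand-crafted | Dynamic skeletons | 60.23% | 65.22% |
| Skeleton sequences | CNN | Multi temporal 3D CNN | 66.85% | 72.58% |
| Skeleton sequences | LSTM | ST-LSTM+Trust Gate | 69.20% | 77.70% |
| Skeleton sequences | RNN | Two-Stream RNN | 71.30% | 79.50% |
| Skeleton sequences | CNN | TSRJI | 73.30% | 80.30% |
| Skeleton sequences | LSTM | STA-LSTM | 73.40% | 81.20% |
| Skeleton sequences | LSTM | DS-LSTM | 77.80% | 87.33% |
| Skeleton sequences | CNN | Fuzzy fusion+CNN | 84.22% | 89.71% |
|  | LSTM/hand-crafted | Proposed method |  |  |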