Telecommunications Science, 2018, Vol. 34, Issue (6): 72-79. doi: 10.11959/j.issn.1000-0801.2018161

• Research and Development •

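Video crowd counting method based on conv-pooling deep spatial and temporal features

Qiang LI, Zilu KANG

  1. Information Science Academy, China Electronics Technology Group Corporation, Beijing 100086, China
  • Revised: 2018-04-18 Online: 2018-06-01 Published: 2018-07-03
  • About the authors: Qiang LI (born 1984), male, Ph.D., is an engineer at the Institute of Internet of Things Technology, Information Science Academy, China Electronics Technology Group Corporation. His main research interests are video/image processing, pattern recognition, and machine learning. | Zilu KANG (born 1972), male, is a senior engineer at the Institute of Internet of Things Technology, Information Science Academy, China Electronics Technology Group Corporation. His main research interests are the Internet of Things and data architecture.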

Abstract:

Because of camera angle, background, crowd density distribution, and occlusion, traditional video crowd counting methods based on low-level visual features often fail to achieve satisfactory results. The proposed method builds high-level visual features from the spatio-temporal features of the video using a conv-pooling approach, then uses local feature aggregation descriptors for quantization and codebook computation to obtain an accurate description of the crowd information in the video. The method makes full use of both the motion and the appearance information in the video, and the convolutional neural network and pooling operations improve its ability to describe the intrinsic attributes and features of the video. Experimental results show that the proposed method achieves higher accuracy and better robustness than traditional video crowd counting methods.

Key words: crowd counting, convolutional neural network, deep spatial and temporal feature, conv-pooling
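
The quantization and codebook step mentioned in the abstract can be illustrated with a rough sketch. It assumes that "local feature aggregation descriptors" refers to a VLAD-style encoding (vector of locally aggregated descriptors), which the abstract does not confirm. The function name `vlad_encode`, the array shapes, the way the codebook is built, and the normalization steps are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def vlad_encode(descriptors, codebook):
    """VLAD-style encoding (hypothetical helper, not the authors' code).

    descriptors: (N, D) array of local features, e.g. pooled CNN activations.
    codebook:    (K, D) array of codewords, e.g. k-means centres.
    Returns a (K * D,) vector of normalized, per-codeword residual sums.
    """
    # Squared Euclidean distance from every descriptor to every codeword: (N, K).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)

    # Hard-assign each descriptor to its nearest codeword (quantization).
    nearest = dists.argmin(axis=1)

    k, d = codebook.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Accumulate residuals between descriptors and their codeword.
            vlad[i] = (members - codebook[i]).sum(axis=0)

    vlad = vlad.ravel()
    # Signed square-root normalization followed by L2 normalization
    # (a common choice for VLAD, assumed here).
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


# Toy usage with random stand-in features; real inputs would be deep
# spatio-temporal features extracted from video frames.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))
toy_codebook = features[:4].copy()  # stand-in for a learned codebook
encoded = vlad_encode(features, toy_codebook)
print(encoded.shape)
```

In a counting pipeline of this kind, the fixed-length vector would typically be passed to a regressor that predicts the number of people. That final stage is not described in the abstract and is omitted here.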
