Chinese Journal of Intelligent Science and Technology ›› 2023, Vol. 5 ›› Issue (2): 143-162. doi: 10.11959/j.issn.2096-6652.202315
Quancheng DU1, Xiao WANG2,3, Lingxi LI4, Huansheng NING1
Revised: 2023-03-24
Online: 2023-06-15
Published: 2023-06-10
About the author:
Quancheng DU (1994- ), male, is a Ph.D. candidate at the School of Computer and Communication Engineering, University of Science and Technology Beijing. His main research interests are trajectory prediction and vehicle planning and decision-making.
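Abstract:
Pedestrian trajectory prediction aims to predict the future positions of a target pedestrian from the observed historical trajectories of people and information about the surrounding environment. The research has significant practical value because it can reduce the collision risk of autonomous vehicles under social interaction. However, traditional model-driven methods struggle to predict pedestrian trajectories in complex, highly dynamic scenes. In contrast, data-driven methods rely on large-scale datasets and can better capture and model more complex pedestrian interactions, thereby achieving more accurate predictions; they have become a research hotspot in autonomous driving, robot navigation, video surveillance and other fields. To give a macroscopic view of the research status and key problems of pedestrian trajectory prediction methods, this survey takes the classification of pedestrian trajectory prediction techniques and methods as its starting point. First, it details the research progress of existing methods and summarizes the key problems and challenges that remain. Second, according to differences in how the prediction models are built, it divides existing methods into model-driven and data-driven methods and summarizes the advantages, disadvantages and applicable scenarios of each. Then, it summarizes the mainstream datasets used for the task and compares the performance metrics of different algorithms. Finally, it discusses future research directions for pedestrian trajectory prediction.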
Quancheng DU, Xiao WANG, Lingxi LI, et al. Key problems and progress of pedestrian trajectory prediction methods: the state of the art and prospects[J]. Chinese Journal of Intelligent Science and Technology, 2023, 5(2): 143-162.
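The task described in the abstract, predicting future positions from an observed track, can be illustrated with the simplest possible baseline. The sketch below is not a method proposed in this survey: it assumes a constant-velocity extrapolation (the kind of baseline studied in reference [15]) and an average Euclidean displacement error as the evaluation measure. The function names `constant_velocity_forecast` and `mean_displacement_error` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def constant_velocity_forecast(observed: List[Point], horizon: int) -> List[Point]:
    """Repeat the last observed per-step displacement for `horizon` future steps."""
    if len(observed) < 2:
        raise ValueError("need at least two observed positions")
    (x0, y0), (x1, y1) = observed[-2], observed[-1]
    vx, vy = x1 - x0, y1 - y0  # displacement per time step (assumed constant)
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


def mean_displacement_error(pred: List[Point], truth: List[Point]) -> float:
    """Average Euclidean distance between predicted and true positions."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    total = sum(math.hypot(px - tx, py - ty)
                for (px, py), (tx, ty) in zip(pred, truth))
    return total / len(pred)


# Hypothetical example: two observed positions, three predicted steps.
observed = [(0.0, 0.0), (1.0, 0.5)]
future = constant_velocity_forecast(observed, horizon=3)
```

Such a baseline ignores interactions and scene context entirely, which is exactly the gap the model-driven and data-driven methods surveyed here try to close.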
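Table 1
Comparison of model-driven pedestrian trajectory prediction methods
Method | Input information | Network structure | Advantages and disadvantages |
Helbing et al.[ | Pedestrian's own state and surrounding environment information | Social force model | Advantages: simple and effective model, widely applicable. Disadvantages: parameter-sensitive, no modeling of pedestrian uncertainty |
Karamouzas et al.[ | Pedestrian position, velocity and acceleration, plus scene information | Hybrid of a kinematic model and a social force model | Advantages: long-term prediction. Disadvantages: high training cost, slow inference |
Trautman et al.[ | Historical trajectory information | Social force model combined with a Gaussian model | Advantages: strong generalization. Disadvantages: poor real-time performance, high model complexity |
Zhou et al.[ | Pedestrian historical trajectory data | Unsupervised model based on a Gaussian mixture model | Advantages: high accuracy, real-time performance. Disadvantages: depends on large amounts of data, poor interpretability |
Kooij et al.[ | Historical trajectory data and environmental context information | Bayesian filter and kinematic model | Advantages: global modeling, multimodal input, high accuracy. Disadvantages: heavy computation, needs large amounts of training data |
Yan et al.[ | Historical trajectory data | Kinematic model combined with a social force model | Advantages: good real-time performance, flexible, highly interpretable. Disadvantages: complex network structure, difficult to train |
Best et al.[ | Pedestrian historical trajectory information, scene context information | Kinematic model combined with a Bayesian model | Advantages: strong predictability, good interpretability. Disadvantages: computationally complex, depends on prior knowledge |
Xie et al.[ | Pedestrian position, pose and orientation, plus scene context information | Recurrent and convolutional neural networks combined with a kinematic model | Advantages: accurate modeling, multimodal fusion. Disadvantages: complex model, needs large amounts of training data and computing resources |
Rudenko et al.[ | Map, pedestrian velocity and position, plus environment information | Social force model combined with a recurrent neural network | Advantages: combines multiple models, long-term prediction, good scalability. Disadvantages: computationally complex |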
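As a rough illustration of the social-force idea that several entries in the table above build on, the following sketch advances one pedestrian by one time step. It assumes a driving term that relaxes the velocity toward a desired speed in the goal direction and an exponentially decaying repulsion from each neighbor, integrated with a simple Euler step. The function name `social_force_step` and all parameter values (`dt`, `desired_speed`, `tau`, `strength`, `decay`) are illustrative assumptions, not values taken from the cited papers.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def social_force_step(pos: Vec, vel: Vec, goal: Vec, others: List[Vec],
                      dt: float = 0.4, desired_speed: float = 1.3,
                      tau: float = 0.5, strength: float = 2.0,
                      decay: float = 0.3) -> Tuple[Vec, Vec]:
    """One Euler step of a simplified social force model (illustrative only)."""
    # Driving force: relax the current velocity toward the desired velocity.
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    dist_goal = math.hypot(gx, gy) or 1.0  # avoid division by zero at the goal
    fx = (desired_speed * gx / dist_goal - vel[0]) / tau
    fy = (desired_speed * gy / dist_goal - vel[1]) / tau

    # Repulsive forces: push away from each neighbor, decaying with distance.
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if d == 0.0:
            continue  # direction undefined for coincident positions
        magnitude = strength * math.exp(-d / decay)
        fx += magnitude * dx / d
        fy += magnitude * dy / d

    new_vel = (vel[0] + fx * dt, vel[1] + fy * dt)
    new_pos = (pos[0] + new_vel[0] * dt, pos[1] + new_vel[1] * dt)
    return new_pos, new_vel
```

Calling the function repeatedly with the returned position and velocity rolls a trajectory forward; the sensitivity of the result to `strength` and `decay` reflects the parameter sensitivity noted as a drawback in the table.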
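Table 2
LSTM-based pedestrian trajectory prediction methods
Method | Input information | Network structure | Datasets |
S-LSTM[ | Pedestrian historical trajectory information | LSTM network combined with a social pooling mechanism | ETH/UCY |
MX-LSTM[ | Pedestrian historical trajectory and pose information | LSTM network, attention mechanism and social pooling mechanism | ETH/UCY, KITTI |
Group LSTM[ | Historical trajectory information | LSTM model and social pooling mechanism | ETH/UCY |
Social-Grid LSTM[ | Social interaction information, temporal context information | Social pooling combined with LSTM | ETH/UCY |
SS-LSTM[ | Historical trajectory, scene context information | LSTM encoder-decoder structure | ETH/UCY |
Scene-LSTM[ | Scene context, pedestrian trajectory point information | LSTM network combined with a CNN | ETH/UCY |
Shi et al.[ | Historical trajectory information | Social pooling mechanism | GCDC/MOT17 |
StarNet[ | Pedestrian historical trajectory information | LSTM and social pooling mechanism | ETH/UCY |
SNS-LSTM[ | Scene context, social interaction information | LSTM network and pooling mechanism | ETH/UCY |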
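Most methods in the table above pair an LSTM with some form of social pooling. The sketch below shows one plausible reading of that idea under stated assumptions: neighbors' hidden-state vectors are summed into the cells of a square grid centered on the target pedestrian. The function name `social_pooling`, the grid size, the neighborhood width and the toy hidden vectors are all hypothetical, and the details differ between the cited models.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def social_pooling(center: Point,
                   neighbors: Sequence[Tuple[Point, Sequence[float]]],
                   hidden_dim: int,
                   grid_size: int = 4,
                   neighborhood: float = 2.0) -> List[List[List[float]]]:
    """Sum neighbors' hidden vectors into a grid_size x grid_size grid.

    The grid covers a square of side `neighborhood` centered on `center`;
    neighbors outside that square are ignored.
    """
    half = neighborhood / 2.0
    cell = neighborhood / grid_size
    grid = [[[0.0] * hidden_dim for _ in range(grid_size)]
            for _ in range(grid_size)]
    for (x, y), hidden in neighbors:
        dx, dy = x - center[0], y - center[1]
        if not (-half <= dx < half and -half <= dy < half):
            continue  # outside the pooling neighborhood
        # Clamp the index to guard against floating-point edge cases.
        col = min(int((dx + half) // cell), grid_size - 1)
        row = min(int((dy + half) // cell), grid_size - 1)
        for k in range(hidden_dim):
            grid[row][col][k] += hidden[k]
    return grid


# Hypothetical example: two nearby neighbors share a cell, one is too far away.
pooled = social_pooling(
    center=(0.0, 0.0),
    neighbors=[((0.6, 0.6), [1.0, 2.0]),
               ((0.7, 0.9), [0.5, 0.5]),
               ((5.0, 5.0), [9.0, 9.0])],
    hidden_dim=2,
)
```

In a full model the pooled grid would be flattened and fed, together with the pedestrian's own position embedding, into that pedestrian's LSTM at the next time step.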
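Table 3
Comparison of GAN-based pedestrian trajectory prediction methods
Method | Input information | Network structure | Datasets |
Social-GAN[ | Pedestrian historical trajectory information | LSTM-based encoder-decoder structure | ETH/UCY |
SoPhie[ | Historical trajectory and scene context information | GAN combined with an attention mechanism | ETH/UCY and SDD |
Social-BiGAT[ | Pedestrian dynamic features, scene context information | GCN combined with GAN | ETH/UCY |
Social Way[ | Historical trajectory information | InfoGAN combined with an attention mechanism | ETH/UCY |
AEE-GAN[ | Scene context, historical trajectory | InfoGAN combined with an LSTM network architecture | ETH/UCY and SDD |
STI-GAN[ | Historical trajectory information | GAN combined with a graph attention network | ETH/UCY |
Atten-GAN[ | Historical trajectory information, scene graph information | GAN combined with a bidirectional recurrent neural network | ETH/UCY |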
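Table 4
Comparison of GCN-based pedestrian trajectory prediction methods
Method | Input information | Network structure | Datasets |
STGAT[ | Historical trajectory information, scene image information | GCN, LSTM and GAT networks combined | ETH/UCY |
RSBG[ | Historical trajectory information | GCN, LSTM and CNN networks combined | ETH/UCY |
Social-STGCNN[ | Historical trajectory information | GCN combined with a CNN | ETH/UCY |
SGCN[ | Historical trajectory, velocity and acceleration information | Sparse directed graphs modeling spatio-temporal relations | ETH/UCY |
DMRGCN[ | Historical trajectory information, pedestrian velocity information | GCN combined with an attention mechanism | ETH/UCY |
SSAGCN[ | Historical trajectory information | GCN, TCN and attention mechanism combined | ETH/UCY, SDD |
AST-GNN[ | Historical trajectory information | GCN combined with an attention mechanism | ETH/UCY |
Pedestrian Graph +[ | Pedestrian pose, scene context information, vehicle speed information | GCN combined with a convolutional neural network | JAAD/PIE |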
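Table 5
Transformer-based pedestrian trajectory prediction methods
Method | Input information | Network structure | Datasets |
Saleh et al.[ | Historical trajectory, context information | Transformer-based encoder-decoder structure | ETH/UCY |
STAR[ | Historical trajectory information | Graph convolution (GCN) combined with a Transformer structure | ETH/UCY |
Giuliari et al.[ | Historical trajectory information | Transformer-based encoder-decoder structure | ETH/UCY, SDD |
Yin et al.[ | Historical trajectory, context information | Transformer-based encoder-decoder structure | JAAD/PIE |
Yao et al.[ | Historical trajectory information | End-to-end Transformer encoder-decoder structure | ETH/UCY |
Li et al.[ | Historical trajectory information, RGB image information, scene semantic information | Graph convolutional network combined with a Transformer network | ETH/UCY, VIRAT/ActEV |
Su et al.[ | Historical trajectory information, pedestrian velocity and acceleration information | Cross-modal Transformer network architecture | JAAD/PIE |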
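Table 6
Dataset overview
Dataset | Agents | Number of scenes | Sensors | Duration | Locations |
UCY[ | Pedestrians | 3 | Camera | 29.5 min | Campus, city streets |
ETH[ | Pedestrians | 2 | Camera | 25 min | University entrance, sidewalk, hotel entrance |
PETS 2009[ | Pedestrians | 10 | Camera | 20 h | Public places, outdoor scenes |
Caltech Pedestrian[ | Pedestrians | 7 | Vehicle-mounted camera | 10 h | Urban roads |
SDD[ | Vehicles, pedestrians | 20 | Camera | - | Campus |
JAAD[ | Pedestrians | 346 | Vehicle-mounted camera | 240 h | Urban areas, urban-rural areas |
ActEV/VIR[ | Pedestrians, vehicles, boats, etc. | 12 | High-definition camera | 280 h | Intersections, streets |
Crowd Human[ | Pedestrians | 15 000 | Camera | - | City intersections, shopping streets |
InD[ | Vehicles, pedestrians, bicycles | 4 | Camera | 10 h | Urban intersections |
PIE[ | Pedestrians, vehicles | - | Vehicle-mounted camera | 6 h | Urban areas, urban-rural areas |
STCrowd[ | Pedestrians | 9 | LiDAR, camera | - | Traffic intersections |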
[1] | SHARMA N , DHIMAN C , INDU S . Pedestrian intention prediction for autonomous vehicles:a comprehensive survey[J]. Neurocomputing, 2022,508: 120-152. |
[2] | CAESAR H , BANKITI V , LANG A H ,et al. nuScenes:a multimodal dataset for autonomous driving[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 11618-11628. |
[3] | RUDENKO A , PALMIERI L , HERMAN M ,et al. Human motion trajectory prediction:a survey[J]. The International Journal of Robotics Research, 2020,39(8): 895-935. |
[4] | RÖSMANN C , OELJEKLAUS M , HOFFMANN F ,et al. Online trajectory prediction and planning for social robot navigation[C]// Proceedings of 2017 IEEE International Conference on Advanced Intelligent Mechatronics (AIM). Piscataway:IEEE Press, 2017: 1255-1260. |
[5] | BALLAN L , CASTALDO F , ALAHI A ,et al. Knowledge transfer for scene-specific motion prediction[M]. Computer Vision - ECCV 2016. Cham: Springer International Publishing, 2016: 697-713. |
[6] | SIGHENCEA B I , STANCIU R I , CĂLEANU C D . A review of deep learning-based methods for pedestrian trajectory prediction[J]. Sensors, 2021,21(22): 7543. |
[7] | KONG W , LIU Y , LI H ,et al. Survey of pedestrian trajectory prediction methods based on deep learning[J]. Control and Decision, 2021,36(12): 2841-2850. |
[8] | KORBMACHER R , TORDEUX A . Review of pedestrian trajectory prediction methods:comparing deep learning and knowledge-based approaches[J]. IEEE Transactions on Intelligent Transportation Systems, 2022,23(12): 24126-24144. |
[9] | CHEN M , ZENG K , SHEN T ,et al. Pedestrian trajectory prediction based on attention mechanism and sparse graph convolution[J]. Laser and Optoelectronics Progress, 2023,60(10): 1010013. |
[10] | XU Y , WANG L C , WANG Y Z ,et al. Adaptive trajectory prediction via transferable GNN[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2022: 6510-6521. |
[11] | PEI Z , QIU W T , WANG M ,et al. Pedestrian trajectory prediction method using dynamic scene information based transformer generative adversarial network[J]. Acta Electronica Sinica, 2022,50(7): 1537-1547. |
[12] | HELBING D , MOLNÁR P . Social force model for pedestrian dynamics[J]. Physical Review E, 1995,51(5): 4282-4286. |
[13] | YAN X . Modeling local behavior for predicting social interactions towards human tracking[J]. Pattern Recognition, 2014,47(4): 1626-1641. |
[14] | HELBING D , FARKAS I , VICSEK T . Simulating dynamical features of escape panic[J]. Nature, 2000,407(6803): 487-490. |
[15] | SCHÖLLER C , ARAVANTINOS V , LAY F ,et al. What the constant velocity model can teach us about pedestrian motion prediction[J]. IEEE Robotics and Automation Letters, 2020,5(2): 1696-1703. |
[16] | FOX E , SUDDERTH E B , JORDAN M I ,et al. Bayesian nonparametric inference of switching dynamic linear models[J]. IEEE Transactions on Signal Processing, 2011,59(4): 1569-1585. |
[17] | KOOIJ J F P , SCHNEIDER N , FLOHR F ,et al. Context-based pedestrian path prediction[M]// Computer Vision - ECCV 2014. Cham: Springer International Publishing, 2014: 618-633. |
[18] | SCHNEIDER N , GAVRILA D M . Pedestrian path prediction with recursive Bayesian filters:a comparative study[M]// Lecture Notes in Computer Science. Berlin,Heidelberg: Springer Berlin Heidelberg, 2013: 174-183. |
[19] | DENDORFER P , ELFLEIN S , LEAL-TAIXÉ L . MG-GAN:a multigenerator model preventing out-of-distribution samples in pedestrian trajectory prediction[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway:IEEE Press, 2022: 13138-13147. |
[20] | GIULIARI F , HASAN I , CRISTANI M ,et al. Transformer networks for trajectory forecasting[C]// Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR). Piscataway:IEEE Press, 2021: 10335-10342. |
[21] | GUPTA A , JOHNSON J , LI F F ,et al. Social GAN:socially acceptable trajectories with generative adversarial networks[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 2255-2264. |
[22] | ALAHI A , GOEL K , RAMANATHAN V ,et al. Social LSTM:human trajectory prediction in crowded spaces[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2016: 961-971. |
[23] | LI L H , ZHOU B , REN W W ,et al. Review of pedestrian trajectory prediction methods[J]. Chinese Journal of Intelligent Science and Technology, 2021,3(4): 399-411. |
[24] | SHI X D , SHAO X W , GUO Z L ,et al. Pedestrian trajectory prediction in extremely crowded scenarios[J]. Sensors, 2019,19(5): 1223. |
[25] | SEYFRIED A , STEFFEN B , KLINGSCH W ,et al. The fundamental diagram of pedestrian movement revisited[J]. Journal of Statistical Mechanics:Theory and Experiment, 2005,2005(10): 10002. |
[26] | KENNEDY J , EBERHART R . Particle swarm optimization[C]// Proceedings of ICNN'95 - International Conference on Neural Networks. Piscataway:IEEE Press, 2002: 1942-1948. |
[27] | JIA H F , LIN Y , LUO Q Y ,et al. Multi-objective optimization of urban road intersection signal timing based on particle swarm optimization algorithm[J]. Advances in Mechanical Engineering, 2019,11(4): 168781401984249. |
[28] | MEHRAN R , OYAMA A , SHAH M . Abnormal crowd behavior detection using social force model[C]// Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009: 935-942. |
[29] | PELLEGRINI S , ESS A , SCHINDLER K ,et al. You'll never walk alone:modeling social behavior for multi-target tracking[C]// Proceedings of 2009 IEEE 12th International Conference on Computer Vision. Piscataway:IEEE Press, 2010: 261-268. |
[30] | CHOI W , SAVARESE S . A unified framework for multi-target tracking and collective activity recognition[M]// Computer Vision – ECCV 2012. Heidelberg: Springer Berlin Heidelberg, 2012: 215-230. |
[31] | RUDENKO A , PALMIERI L , ARRAS K O . Joint long-term prediction of human motion using a planning-based social force approach[C]// Proceedings of 2018 IEEE International Conference on Robotics and Automation (ICRA). Piscataway:IEEE Press, 2018: 4571-4577. |
[32] | TRAUTMAN P , KRAUSE A . Unfreezing the robot:navigation in dense,interacting crowds[C]// Proceedings of 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway:IEEE Press, 2010: 797-803. |
[33] | PELLEGRINI S , ESS A , VAN GOOL L . Improving data association by joint modeling of pedestrian trajectories and groupings[M]// Computer Vision - ECCV 2010. Heidelberg: Springer Berlin Heidelberg, 2010: 452-465. |
[34] | KARAMOUZAS I , HEIL P , VAN BEEK P ,et al. A predictive collision avoidance model for pedestrian simulation[M]// Motion in Games. Berlin,Heidelberg: Springer Berlin Heidelberg, 2009: 41-52. |
[35] | ZHOU B L , WANG X G , TANG X O . Understanding collective crowd behaviors:learning a Mixture model of Dynamic pedestrian-Agents[C]// Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2012: 2871-2878. |
[36] | FOX E , SUDDERTH E B , JORDAN M I ,et al. Bayesian nonparametric inference of switching dynamic linear models[J]. IEEE Transactions on Signal Processing, 2011,59(4): 1569-1585. |
[37] | BEST G , FITCH R . Bayesian intention inference for trajectory prediction with an unknown goal destination[C]// Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway:IEEE Press, 2015: 5817-5823. |
[38] | XIE D , SHU T M , TODOROVIC S ,et al. Learning and inferring "dark matter" and predicting human intents and trajectories in videos[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018,40(7): 1639-1652. |
[39] | XUE H , HUYNH D Q , REYNOLDS M . SS-LSTM:a hierarchical LSTM model for pedestrian trajectory prediction[C]// Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Piscataway:IEEE Press, 2018: 1186-1194. |
[40] | MANH H , ALAGHBAND G . Scene-lstm:a model for human trajectory prediction[J]. arXiv preprint, 2018,arXiv:1808.04018. |
[41] | HASAN I , SETTI F , TSESMELIS T ,et al. MX-LSTM:mixing tracklets and vislets to jointly forecast trajectories and head poses[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 6067-6076. |
[42] | BISAGNO N , ZHANG B , CONCI N . Group LSTM:group trajectory prediction in crowded scenarios[M]// Lecture Notes in Computer Science. Cham: Springer International Publishing, 2019: 213-225. |
[43] | CHENG B , XU X , ZENG Y J ,et al. Pedestrian trajectory prediction via the Social-Grid LSTM model[J]. The Journal of Engineering, 2018(16): 1468-1474. |
[44] | ZHU Y L , QIAN D H , REN D C ,et al. StarNet:pedestrian trajectory prediction using deep neural network in star topology[C]// Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway:IEEE Press, 2020: 8075-8080. |
[45] | LERNER A , CHRYSANTHOU Y , LISCHINSKI D . Crowds by example[J]. Computer Graphics Forum, 2007,26(3): 655-664. |
[46] | SADEGHIAN A , KOSARAJU V , SADEGHIAN A ,et al. SoPhie:an attentive GAN for predicting paths compliant to social and physical constraints[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 1349-1358. |
[47] | AMIRIAN J , HAYET J B , PETTRÉ J . Social ways:learning multimodal distributions of pedestrian trajectories with GANs[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Piscataway:IEEE Press, 2020: 2964-2972. |
[48] | HUANG L , ZHUANG J H , CHENG X M ,et al. STI-GAN:multimodal pedestrian trajectory prediction using spatiotemporal interactions and a generative adversarial network[J]. IEEE Access, 2021,9: 50846-50856. |
[49] | LAI W C , XIA Z X , LIN H S ,et al. Trajectory prediction in heterogeneous environment via attended ecology embedding[C]// Proceedings of the 28th ACM International Conference on Multimedia. New York:ACM, 2020: 202-210. |
[50] | KOSARAJU V , SADEGHIAN A , MARTÍN-MARTÍN R ,et al. Social-BiGAT:multimodal trajectory forecasting using Bicycle-GAN and graph attention networks[J]. Advances in Neural Information Processing Systems, 2019,32. |
[51] | VELIČKOVIĆ P , CUCURULL G , CASANOVA A ,et al. Graph attention networks[J]. arXiv preprint, 2017,arXiv:1710.10903. |
[52] | CHEN X , DUAN Y , HOUTHOOFT R ,et al. InfoGAN:interpretable representation learning by information maximizing generative adversarial nets[C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. New York:ACM, 2016: 2180-2188. |
[53] | FANG F , ZHANG P P , ZHOU B ,et al. Atten-GAN:pedestrian trajectory prediction with GAN based on attention mechanism[J]. Cognitive Computation, 2022,14(6): 2296-2305. |
[54] | MOHAMED A , QIAN K , ELHOSEINY M ,et al. Social-STGCNN:a social spatio-temporal graph convolutional neural network for human trajectory prediction[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 14412-14420. |
[55] | HUANG Y F , BI H K , LI Z X ,et al. STGAT:modeling spatialtemporal interactions for human trajectory prediction[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway:IEEE Press, 2020: 6271-6280. |
[56] | SHI L S , WANG L , LONG C J ,et al. SGCN:sparse graph convolution network for pedestrian trajectory prediction[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2021: 8990-8999. |
[57] | YAN S J , XIONG Y J , LIN D H . Spatial temporal graph convolutional networks for skeleton-based action recognition[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018,32(1): 12328. |
[58] | SUN J H , JIANG Q H , LU C W . Recursive social behavior graph for trajectory prediction[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 657-666. |
[59] | BAE I , JEON H G . Disentangled multi-relational graph convolutional network for pedestrian trajectory prediction[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021,35(2): 911-919. |
[60] | CADENA P R G , QIAN Y Q , WANG C X ,et al. Pedestrian graph:a fast pedestrian crossing prediction model based on graph convolutional networks[J]. IEEE Transactions on Intelligent Transportation Systems, 2022,23(11): 21050-21061. |
[61] | LYU P , WANG W , WANG Y ,et al. SSAGCN:social soft attention graph convolution network for pedestrian trajectory prediction[J]. arXiv preprint, 2021,arXiv:2112.02459. |
[62] | ZHOU H , REN D C , XIA H X ,et al. AST-GNN:an attention-based spatio-temporal graph neural network for Interaction-aware pedestrian trajectory prediction[J]. Neurocomputing, 2021,445: 298-308. |
[63] | TIAN Y L , WANG Y T , WANG J G ,et al. Key problems and progress of vision transformers:the state of the art and prospects[J]. Acta Automatica Sinica, 2022,48(4): 957-979. |
[64] | DEVLIN J , CHANG M W , LEE K ,et al. BERT:pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint, 2018,arXiv:1810.04805. |
[65] | RADFORD A , NARASIMHAN K . Improving language understanding by generative pre-training[Z]. 2018. |
[66] | WANG A , SINGH A , MICHAEL J ,et al. GLUE:a multi-task benchmark and analysis platform for natural language understanding[J]. arXiv preprint, 2018,arXiv:1804.07461. |
[67] | CARION N , MASSA F , SYNNAEVE G ,et al. End-to-end object detection with transformers[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 213-229. |
[68] | DOSOVITSKIY A , BEYER L , KOLESNIKOV A ,et al. An image is worth 16x16 words:transformers for image recognition at scale[J]. arXiv preprint, 2020,arXiv:2010.11929. |
[69] | FU J , LIU J , TIAN H J ,et al. Dual attention network for scene segmentation[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 3141-3149. |
[70] | YAO H Y , WAN W G , LI X . End-to-end pedestrian trajectory forecasting with transformer network[J]. ISPRS International Journal of GeoInformation, 2022,11(1): 44. |
[71] | YU C J , MA X , REN J W ,et al. Spatio-temporal graph transformer networks for pedestrian trajectory prediction[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 507-523. |
[72] | SALEH K . Pedestrian trajectory prediction using context-augmented transformer networks[J]. arXiv preprint, 2020,arXiv:2012.01757. |
[73] | YIN Z , LIU R , XIONG Z ,et al. Multimodal transformer network for pedestrian trajectory prediction[C]// IJCAI.[S.l.:s.n.], 2021: 1259-1265. |
[74] | LI L H , PAGNUCCO M , SONG Y . Graph-based spatial transformer with memory replay for multi-future pedestrian trajectory prediction[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2022: 2221-2231. |
[75] | SU Z X , HUANG G , ZHANG S Y ,et al. Crossmodal transformer based generative framework for pedestrian trajectory prediction[C]// Proceedings of 2022 International Conference on Robotics and Automation (ICRA). Piscataway:IEEE Press, 2022: 2337-2343. |
[76] | ROBICQUET A , SADEGHIAN A , ALAHI A ,et al. Learning social etiquette:human trajectory understanding in crowded scenes[M]// Computer Vision – ECCV 2016. Cham: Springer International Publishing, 2016: 549-565. |
[77] | OH S , HOOGS A , PERERA A ,et al. A large-scale benchmark dataset for event recognition in surveillance video[C]// Proceedings of CVPR. Piscataway:IEEE Press, 2011: 3153-3160. |
[78] | GRIFFIN G , HOLUB A , PERONA P . Caltech-256 object category dataset[Z]. 2007. |
[79] | ELLIS A , FERRYMAN J . PETS2010 and PETS2009 evaluation of results using individual ground truthed single views[C]// Proceedings of 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway:IEEE Press, 2010: 135-142. |
[80] | SHAO S , ZHAO Z , LI B ,et al. Crowdhuman:a benchmark for detecting human in a crowd[J]. arXiv preprint, 2018,arXiv:1805.00123. |
[81] | BOCK J , KRAJEWSKI R , MOERS T ,et al. The InD dataset:a drone dataset of naturalistic road user trajectories at German intersections[C]// Proceedings of 2020 IEEE Intelligent Vehicles Symposium (IV). Piscataway:IEEE Press, 2021: 1929-1934. |
[82] | KOTSERUBA I , RASOULI A , TSOTSOS J K . Joint attention in autonomous driving (JAAD)[J]. arXiv preprint, 2016,arXiv:1609.04741. |
[83] | RASOULI A , KOTSERUBA I , KUNIC T ,et al. PIE:a large-scale dataset and models for pedestrian intention estimation and trajectory prediction[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway:IEEE Press, 2020: 6261-6270. |
[84] | CONG P , ZHU X , QIAO F ,et al. Stcrowd:a multimodal dataset for pedestrian perception in crowded scenes[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 19608-19617. |