3D目标检测方法研究综述

doi:10.11959/j.issn.2096-6652.202312

Abstract

Abstract:

3D object detection is a fundamental problem in autonomous driving，virtual reality，robotics，and other applications.Its goal is to extract the most accurate 3D box characterizing interested targets from the disordered point clouds，such as the closest 3D box surrounding the pedestrians or vehicles.The target 3D box's location，size，and orientation are also output.Currently，there are two primary approaches for 3D object detection: (1) pure point cloud based 3D object detection，in which the point clouds are created by binocular vision，RGB-D camera，and lidar; (2) fusion-based 3D object detection based on the fusion of image and point cloud.The various representations of 3D point clouds were introduced.Then representative methods were introduced from three aspects: traditional machine learning techniques; non-fusion deep learning based algorithms; and multimodal fusion-based deep learning algorithms in progressive relation.The algorithms within and across each category were examined and compared，and the differences and connections between the various methods were analyzed thoroughly.Finally，remaining challenges of 3D object detection were discussed and explored.And the primary datasets and metrics used in 3D object detection studies were summarized.

Key words: deep learning, 3D object detection, multimodal fusion, point cloud, autonomous driving

CLC Number:

TP183

Zhe HUANG,Yongcai WANG,Deying LI. A survey of 3D object detection algorithms[J]. Chinese Journal of Intelligent Science and Technology, 2023, 5(1): 7-31.

Figures/Tables 13

方法		输入	速度/fps		车辆			行人			自行车手
方法		输入	速度/fps	E	M	H	E	M	H	E	M	H
基于立体视觉	Deep3DBox^[85]	mono.image	—	—	—	—	—	—	—	—	—	—
	MonoCon^[99]		—	22.50%	16.46%	13.95%	13.10%	8.41%	6.94%	2.80%	1.92%	1.55%
	Mono3D^[100]		—	2.53%	2.31%	2.31%	—	—	—	—	—	—
	Mono3D-PLiDAR^[87]		—	1.76%	7.5%	6.1%	—	—	—	—	—	—
	M3DSSD^[101]		—	17.51%	11.46%	8.98%	5.16%	3.87%	3.08%	2.10%	1.51%	1.58%
	MonoRUn^[102]			19.65%	12.30%	10.58%	10.88%	6.78%	5.83%	1.01%	0.61%	0.48%
	MonoDETR^[63]		—	25.00%	16.47%	13.58%	33.60%	22.11%	18.60%	28.84%	20.61%	16.38%
	MonoDTR^[71]			21.99%	15.39%	12.73%	15.33%	10.18%	8.61%	5.05%	3.27%	3.19%
	MonoJSG^[88]			24.69%	16.14%	13.64%	11.02%	7.49%	6.41%	5.45%	3.21%	2.57%
基于单目	3DOP^[89]	stereo image	—	—	—	—	—	—	—	—	—	—
	LIGA-Stereo^[50]		—	84.92%	67.06%	63.80%	81.39%	64.66%	57.22%	40.46%	30.00%	27.07%
	CG-Stereo^[103]		—	76.17%	57.82%	54.63%	74.39%	53.58%	46.50%	47.40%	30.89%	27.73%
	Stereo R-CNN^[90]		—	47.58%	30.23%	23.72%	—	—	—	—	—	—
基于多视图	VeloFCN^[76]	FV	1.0	—	—	—		—	—		—	—
	BirdNet^[69]	BEV	9.1	13.53%	9.47%	8.49%	12.25%	8.99%	8.06%	16.63%	10.46%	9.53%
	PIXOR^[92]	BEV	28.6	—	—	—	—	—	—	—	—	—
	HDNet^[93]	BEV	20.0	82.11%	71.70%	67.08%	50.32%	40.97%	37.87%	77.63%	63.78%	55.89%
	LaserNet^[94]	RV	26.3		—	—	—	—	—	—	—	—
基于体素	VoxelNet^[21]	voxel	2.0	77.82%	64.17%	57.51%	—	—	—	—	—	—
	HVNet^[49]	voxel	31	—	—	—	—	—	—	—	—	—
	SECOND^[12]	voxel	26.3	84.65%	75.96%	68.71%	—	—	—	—	—	—
	PointPillars^[24]	pillars	62.0	82.58%	74.31%	68.99%	51.45%	41.92%	38.89%	77.10%	58.65%	51.92%
	HVPR^[104]	voxel	36.1	86.38%	77.92%	73.04%	53.47%	43.96%	40.64%
	SA-SSD^[53]	voxel	25.0	88.75%	79.79%	74.16%	—	—	—	—	—
	Part-A∧2^[54]	point	12.5	87.81%	78.49%	73.51%	53.10%	43.35%	40.06%	79.17%	63.52%	56.93%
基于点云	PointRCNN^[52]	point	10	86.96%	75.64%	70.70%	47.98%	39.37%	36.01%	74.96%	58.82%	52.53%
	VoteNet^[105]	point	—	—	—	—	—	—	—	—	—	—
	PV RCNN^[96]	point ＆voxel	12.5	90.25%	81.43%	76.82%	52.17%	43.29%	40.29%	78.60%	63.71%	57.65%
	3DSSD^[106]	point	25.0	88.36%	79.57%	74.55%	54.64%	44.27%	40.23%	82.48%	64.10%	56.90%
	LiDAR RCNN^[65]	point	—	85.97%	74.21%	69.18%	—	—	—	—	—	—
	3DIoUMatch^[97]	point	—	76.0%	31.7%	36.4%	78.7%	48.2%	56.2%	84.8%	60.2%	74.9%
	ST3D^[51]	point	—	72.94%	—	—	—	—	—	—	—	—
基于Transformer	DisARM^[61]	point	—	—	—	—	—	—	—	—	—	—
	VoTr-TSD^[60]	voxel	—	89.04%	84.04%	78.68%	—	—	—	—	—	—
	PointFormer^[57]	point	—	87.13%	77.06%	69.25%	50.67%	42.43%	39.60%	75.01%	59.80%	53.99%
	CT3D^[58]	point	—	87.83%	81.77%	77.16%	—	—	—	—	—	—
	DETR3D^[59]	BEV	—	—	—	—	—	—	—	—	—	—

模型	融合方法	组合方式	速度/fps	车辆			行人			自行车手
模型	融合方法	组合方式	速度/fps	E	M	H	E	M	H	E	M	H
MV3D^[22]	Rg-F	(1)	2.8	74.97%	63.63%	54.00%	—	—	—	—	—	—
AVOD^[23]	Rg-F	(1)	12.5	76.39%	66.47%	60.23%	36.10%	27.86%	25.76%	57.19%	42.08%	38.29%
SCANet^[58]	Rg-F	(1)	11.1	79.22%	67.13%	60.65%	—	—	—	—	—	—
MVX-Ne^[115]	Rg-F/Pt-F	(3)	16.7	84.99%	71.95%	64.88%	—	—	—	—	—	—
MMF^[109]	Rg-F/Pt-F	(4)	12.5	88.40%	77.43%	70.22%	—	—	—	—	—	—
3D-CVF^[112]	Rg-F/Pt-F	(3)	—	89.20%	80.05%	73.11%	—	—	—	—	—	—
CLOCS^[125]	Rg-F	(2)	—	86.38%	78.45%	72.45%	—	—	—	—	—	—
ContFuse^[77]	Pt-F	(1)	16.7	83.68%	68.78%	61.67%	—	—	—	—	—	—
PointFusion^[14]	Pt-F	(2)	—	77.92%	63.00%	53.27%	33.36%	28.04%	23.38%	49.34%	29.42%	26.98%
PointPainting^[73]	Pt-F	(2)	2.5	82.11%	71.70%	67.08%	50.32%	40.97%	37.87%	77.63%	63.78%	55.89%
EPNet^[124]	Pt-F	(2)	—	89.81%	79.28%	74.59%	—	—	—	—	—	—
PI-RCNN^[16]	Pt-F	(2)	—	88.27%	78.53%	77.75%	—	—	—	—	—	—
MoCa^[116]	Pt-F	(3)	—	50.9%	43.7%	40.0%	76.1%	61.0%	53.4%	86.0%	75.9%	70.7%
CenterPoint^[126]	Pt-F	(1)	—	—	—	—	—	—	—	—	—	—
DeepFusiont^[110]	Pt-F	(2)	—	—	—	—	—	—	—	—	—	—
PointAugmen^[117]	Pt-F	(2)	—	—	—	—	—	—	—	—	—	—
RoarNet^[113]	Rs-F	(2)	10.0	83.71%	73.04%	59.16%	—	—	—	—	—	—
ImVoteNet^[114]	Rs-F/Pt-F	(2)	—	—	—	—	—	—	—	—	—	—
Pseudo-LiDAR^[48]	Rs-F	(4)	—	54.53%	34.05%	38.25%	—	—	—	—	—	—
F-ConvNet^[11]	Rs-F	(2)	2.1	87.36%	76.39%	66.69%	52.16%	43.38%	38.80%	81.98%	65.07%	56.54%
F-PointNet^[74]	Rs-F	(2)	5.9	82.19%	69.79%	60.59%	50.53%	42.15%	38.08%	72.27%	56.12%	49.01%
TransFusion^[119]	Tr-F	(2)	—	—	—	—	—	—	—	—	—	—
TokenFusion^[118]	Tr-F	(2)	—	—	—	—	—	—	—	—	—	—
TransFuser^[120]	Tr-F	(1)	—	—	—	—	—	—	—	—	—	—

数据集	传感器	时间	大小	类别	采集地点
KITTI^[4]	相机、LiDARGNSS、惯性传感器	2012年	7481帧80 256个对象	8类	卡尔斯鲁厄（德国）
SUN3D^[67]	3D相机	2013年	254 个不同场景	16类	北美、欧洲、亚洲
			捕获415序列
			10 335张室内场景
SUN RGB-D^[66]	3D相机(4)	2015年	146 617个2D边框	10类	普林斯顿大学（美国）
			58 657个3D框
Multi-SpectralObject^[128]	视觉和热相机	2017年	7 512帧，5 833个对象	3类	日本
MPO^[62]	相机、LiDAR、GNSS	2017年	1 569帧	6类	—
ScanNet^[64]	3D相机	2018年	1 513个室内场景	21类	—
	深度传感器
S3DIS^[56]	3D相机	2018年	超过 70 000 张RGB图像	13类	斯坦福大学（美国）
DBNet^[129]	相机、LiDAR、GNSS	2018年	超过10k帧	含7个数据集	中国
KAIST^[68]	相机、LiDAR	2018年	7512帧	3类	首尔（韩国）
	GNSS、惯性传感器		308 913个对象
A*3D^[130]	相机(2)	2019年	39 k帧	7类	新加坡
	LiDAR		230 k个对象
Argoverse^[131]	LiDAR(2)	2019年	113个场景	15类	匹兹堡（美国）
	相机(9)		300 k轨迹		宾西法尼亚州（美国）
					佛罗里达州（美国）
PandaSet^[132]	LiDAR(2)、相机(6)	2019年	125个场景	28类	旧金山（美国）
	GNSS、惯性传感器
ApolloScape^[133]	相机、LiDAR	2019年	143 906个图像帧	35类	中国
	GNSS、惯性传感器		89 430个物体
	相机(6)		1 000个场景;
nuScence^[134]	LiDAR	2019年	1.4万帧(照相机、雷达)	23类	波士顿（美国）、新加坡
	Radars(6)		390 k帧(3D激光雷达)
BLVD^[39]	相机(5)	2019年	120 k帧	3类	常熟（中国）
	LiDAR(5)		249 129个对象
Waymo^[135]	相机	2019年	200 k帧	4类	—
	LiDAR		12 M对象(3D激光雷达)
			1.2 M对象(2D照相机)
H3D^[136]	相机(3)	2019年	27 721帧	8类	旧金山（美国）
	LiDAR		1 071 302个对象
A2D2^[137]	相机(6)	2020年	40 k帧(语义信息)	37类	凯默斯海姆（德国）
	LiDAR(5)		12 k帧(3D激光雷达)		英戈尔施塔特（德国）
			390 k帧未标记		慕尼黑（德国）

References 145

[1]	GUO Y L , WANG H Y , HU Q Y ,et al. Deep learning for 3D point clouds:a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021,43(12): 4338-4364.
[2]	ARNOLD E , AL-JARRAH O Y , DIANATI M ,et al. A survey on 3D object detection methods for autonomous driving applications[J]. IEEE Transactions on Intelligent Transportation Systems, 2019,20(10): 3782-3795.
[3]	QIAN R , LAI X , LI X . 3D object detection for autonomous driving:a survey[J]. arXiv preprint, 2021,arXiv:2106.10823.
[4]	GEIGER A , LENZ P , URTASUN R . Are we ready for autonomous driving？ The KITTI vision benchmark suite[C]// Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2012: 3354-3361.
[5]	FRITZ M , SCHIELE B . Decomposition,discovery and detection of visual categories using topic models[C]// Proceedings of 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2008: 1-8.
[6]	CUI Y D , CHEN R , CHU W B ,et al. Deep learning for image and point cloud fusion in autonomous driving:a review[J]. IEEE Transactions on Intelligent Transportation Systems, 2022,23(2): 722-739.
[7]	QIN Z Y , WANG J L , LU Y . MonoGRNet:a geometric reasoning network for monocular 3D object localization[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019,33(1): 8851-8858.
[8]	FU H , GONG M M , WANG C H ,et al. Deep ordinal regression network for monocular depth estimation[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 2002-2011.
[9]	ALDOMA A , MARTON Z C , TOMBARI F ,et al. Tutorial:point cloud library:three-dimensional object recognition and 6 DOF pose estimation[J]. IEEE Robotics ＆ Automation Magazine, 2012,19(3): 80-91.
[10]	ARORA H , LOEFF N , FORSYTH D A ,et al. Unsupervised segmentation of objects using efficient learning[C]// Proceedings of 2007 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2007: 1-7.
[11]	WANG Z X , JIA K . Frustum ConvNet:sliding Frustums to aggregate local point-wise features for amodal 3D object detection[C]// Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway:IEEE Press, 2020: 1742-1749.
[12]	YAN Y , MAO Y X , LI B . SECOND:sparsely embedded convolutional detection[J]. Sensors (Basel,Switzerland), 2018,18(10): 3337.
[13]	ZHOU Y , SUN P , ZHANG Y ,et al. End-to-end multi-view fusion for 3D object detection in LiDAR point clouds[C]// Proceedings of Conference on Robot Learning.[S.l.:s.n.], 2020: 923-932.
[14]	XU D F , ANGUELOV D , JAIN A . PointFusion:deep sensor fusion for 3D bounding box estimation[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 244-253.
[15]	XIE S N , LIU S N , CHEN Z Y ,et al. Attentional ShapeContextNet for point cloud recognition[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 4606-4615.
[16]	XIE L , XIANG C , YU Z X ,et al. PI-RCNN:an efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020,34(7): 12460-12467.
[17]	FENG D , HAASE-SCHüTZ C , ROSENBAUM L ,et al. Deep multi-modal object detection and semantic segmentation for autonomous driving:datasets,methods,and challenges[J]. IEEE Transactions on Intelligent Transportation Systems, 2021,22(3): 1341-1360.
[18]	WANG Y , MAO Q , ZHU H ,et al. Multi-modal 3D object detection in autonomous driving:a survey[J]. arXiv preprint, 2021,arXiv:2106.12735.
[19]	CHARLES R Q , HAO S , MO K C ,et al. PointNet:deep learning on point sets for 3D classification and segmentation[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2017: 77-85.
[20]	QI C R , YI L , SU H ,et al. PointNet++:deep hierarchical feature learning on point sets in a metric space[J]. arXiv preprint, 2017,arXiv:1706.02413.
[21]	ZHOU Y , TUZEL O . VoxelNet:end-to-end learning for point cloud based 3D object detection[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 4490-4499.
[22]	CHEN X Z , MA H M , WAN J ,et al. Multi-view 3D object detection network for autonomous driving[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2017: 6526-6534.
[23]	KU J , MOZIFIAN M , LEE J ,et al. Joint 3D proposal generation and object detection from view aggregation[C]// Proceedings of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York:ACM Press, 2018: 1-8.
[24]	LANG A H , VORA S , CAESAR H ,et al. PointPillars:fast encoders for object detection from point clouds[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 12689-12697.
[25]	FAN L , XIONG X , WANG F ,et al. RangeDet:in defense of range view for LiDAR-based 3D object detection[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 2898-2907.
[26]	SUN P , WANG W Y , CHAI Y N ,et al. RSN:range sparse net for efficient,accurate LiDAR 3D object detection[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 5721-5730.
[27]	WU B C , WAN A , YUE X Y ,et al. SqueezeSeg:convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud[C]// Proceedings of 2018 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2018: 1887-1893.
[28]	REN M Y , POKROVSKY A , YANG B ,et al. SBNet:sparse blocks network for fast inference[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 8711-8720.
[29]	LI P X , ZHAO H C , LIU P F ,et al. RTM3D:real-time monocular 3D detection from object keypoints for autonomous driving[C]// Proceedings of 2020 16th European Conference on Computer Vision. Cham:Springer, 2020: 644-660.
[30]	WANG Y , CHAO W L , GARG D ,et al. Pseudo-LiDAR from visual depth estimation:bridging the gap in 3D object detection for autonomous driving[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8437-8445.
[31]	NG P C , HENIKOFF S . SIFT:predicting amino acid changes that affect protein function[J]. Nucleic Acids Research, 2003,31(13): 3812-3814.
[32]	JOHNSON A E , HEBERT M . Surface matching for object recognition in complex three-dimensional scenes[J]. Image and Vision Computing, 1998,16(9/10): 635-651.
[33]	CHEN H , BHANU B . 3D free-form object recognition in range images using local surface patches[J]. Pattern Recognition Letters, 2007,28(10): 1252-1262.
[34]	MIAN A , BENNAMOUN M , OWENS R . On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes[J]. International Journal of Computer Vision, 2010,89(2): 348-361.
[35]	STEIN F , MEDIONI G . Structural indexing:efficient 3D object recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992,14(2): 125-145.
[36]	CHUA C S , JARVIS R . Point signatures:a new representation for 3D object recognition[J]. International Journal of Computer Vision, 1997,25(1): 63-85.
[37]	FROME A , HUBER D , KOLLURI R ,et al. Recognizing objects in range data using regional point descriptors[C]// Proceedings of European Conference on Computer Vision. Heidelberg:Springer, 2004: 224-237.
[38]	ZHOU D F , FANG J , SONG X B ,et al. IoU loss for 2D/3D object detection[C]// Proceedings of 2019 International Conference on 3D Vision. Piscataway:IEEE Press, 2019: 85-94.
[39]	COLLET A , SRINIVASAY S S , HEBERT M . Structure discovery in multi-modal data:a region-based approach[C]// Proceedings of 2011 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2011: 5695-5702.
[40]	SHIN J , TRIEBEL R , SIEGWART R . Unsupervised discovery of repetitive objects[C]// Proceedings of 2010 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2010: 5041-5046.
[41]	HERBST E , HENRY P , REN X F ,et al. Toward object discovery and modeling via 3D scene comparison[C]// Proceedings of 2011 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2011: 2623-2629.
[42]	KARPATHY A , MILLER S , LI F F . Object discovery in 3D scenes via shape analysis[C]// Proceedings of 2013 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2013: 2088-2095.
[43]	FELZENSZWALB P F , HUTTENLOCHER D P . Efficient graph-based image segmentation[J]. International Journal of Computer Vision, 2004,59(2): 167-181.
[44]	SONG S R , XIAO J X . Sliding shapes for 3D object detection in depth images[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2014: 634-651.
[45]	MALISIEWICZ T , GUPTA A , EFROS A A . Ensemble of exemplar-SVMs for object detection and beyond[C]// Proceedings of 2011 International Conference on Computer Vision. Piscataway:IEEE Press, 2012: 89-96.
[46]	SONG S R , XIAO J X . Deep sliding shapes for amodal 3D object detection in RGB-D images[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 808-816.
[47]	罗会兰, 陈鸿坤 . 基于深度学习的目标检测研究综述[J]. 电子学报, 2020,48(6): 1230-1239.
	LUO H L , CHEN H K . Survey of object detection based on deep learning[J]. Acta Electronica Sinica, 2020,48(6): 1230-1239.
[48]	MINEMURA K , LIAU H , MONRROY A ,et al. LMNet:real-time multiclass object detection on CPU using 3D LiDAR[C]// Proceedings of 2018 3rd Asia-Pacific Conference on Intelligent Robot Systems. Piscataway:IEEE Press, 2018: 28-34.
[49]	YE M S , XU S J , CAO T Y . HVNet:hybrid voxel network for LiDAR based 3D object detection[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 1628-1637.
[50]	GUO X Y , SHI S S , WANG X G ,et al. LIGA-stereo:learning LiDAR geometry aware representations for stereo-based 3D detector[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 3133-3143.
[51]	YANG J H , SHI S S , WANG Z ,et al. ST3D:self-training for unsupervised domain adaptation on 3D object detection[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 10363-10373.
[52]	SHI S S , WANG X G , LI H S . PointRCNN:3D object proposal generation and detection from point cloud[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 770-779.
[53]	HE C H , ZENG H , HUANG J Q ,et al. Structure aware single-stage 3D object detection from point cloud[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 11870-11879.
[54]	SHI S S , WANG Z , WANG X G ,et al. Part-A∧2 Net:3D part-aware and aggregation neural network for object detection from point cloud[J]. arXiv preprint, 2019,arXiv:1907.03670.
[55]	PENG X D , ZHU X G , WANG T ,et al. SIDE:center-based stereo 3D detector with structure-aware instance depth estimation[C]// Proceedings of 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway:IEEE Press, 2022: 225-234.
[56]	ARMENI I , SAX S , ZAMIR A R ,et al. Joint 2D-3D-semantic data for indoor scene understanding[J]. arXiv preprint, 2017,arXiv:1702.01105.
[57]	PAN X R , XIA Z F , SONG S J ,et al. 3D object detection with pointformer[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 7459-7468.
[58]	SHENGA H L , CAI S J , LIU Y ,et al. Improving 3D object detection with channel-wise transformer[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 2723-2732.
[59]	WANG Y , GUIZILINI V , ZHANG T ,et al. DETR3D:3D object detection from multi-view images via 3D-to-2D queries[C]// Proceedings of Conference on Robot Learning.[S.l.:s.n.], 2022: 180-191.
[60]	MAO J G , XUE Y J , NIU M Z ,et al. Voxel transformer for 3D object detection[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2021: 3164-3173.
[61]	DUAN Y , ZHU C , LAN Y ,et al. DisARM:displacement aware relation module for 3D detection[J]. arXiv preprint, 2022,arXiv:2203.01152.
[62]	JUNG H , OTO Y , MOZOS O M ,et al. Multi-modal panoramic 3D outdoor datasets for place categorization[C]// Proceedings of 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York:ACM Press, 2016: 4545-4550.
[63]	ZHANG R , QIU H , WANG T ,et al. MonoDETR:depth-guided transformer for monocular 3D object detection[J]. arXiv preprint, 2022,arXiv:2203.13310.
[64]	DAI A , CHANG A X , SAVVA M ,et al. ScanNet:richly-annotated 3D reconstructions of indoor scenes[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2017: 2432-2443.
[65]	LI Z C , WANG F , WANG N Y . LiDAR R-CNN:an efficient and universal 3D object detector[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 7542-7551.
[66]	SONG S R , LICHTENBERG S P , XIAO J X . SUN RGB-D:a RGB-D scene understanding benchmark suite[C]// Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2015: 567-576.
[67]	XIAO J X , OWENS A , TORRALBA A . SUN3D:a database of big spaces reconstructed using SfM and object labels[C]// Proceedings of 2013 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2014: 1625-1632.
[68]	CHOI Y , KIM N , HWANG S ,et al. KAIST multi-spectral day/night data set for autonomous and assisted driving[J]. IEEE Transactions on Intelligent Transportation Systems, 2018,19(3): 934-948.
[69]	BELTRáN J , GUINDEL C , MORENO F M ,et al. BirdNet:a 3D object detection framework from LiDAR information[C]// Proceedings of 2018 21st International Conference on Intelligent Transportation Systems. New York:ACM Press, 2018: 3517-3523.
[70]	LU H H , CHEN X S , ZHANG G Y ,et al. Scanet:spatial-channel attention network for 3D object detection[C]// Proceedings of 2019 IEEE International Conference on Acoustics,Speech and Signal Processing. Piscataway:IEEE Press, 2019: 1992-1996.
[71]	HUANG K C , WU T H , SU H T ,et al. MonoDTR:monocular 3D object detection with depth-aware transformer[J]. arXiv preprint, 2022,arXiv:2203.10981.
[72]	ZHAO X , LIU Z , HU R L ,et al. 3D object detection using scale invariant and feature reweighting networks[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019,33(1): 9267-9274.
[73]	VORA S , LANG A H , HELOU B ,et al. Point Painting:sequential fusion for 3D object detection[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 4603-4611.
[74]	QI C R , LIU W , WU C X ,et al. Frustum PointNets for 3D object detection from RGB-D data[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 918-927.
[75]	JIANG M , WU Y , ZHAO T ,et al. PointSIFT:a SIFT-like network module for 3D point cloud semantic segmentation[J]. arXiv preprint, 2018,arXiv:1807.00652.
[76]	LI B , ZHANG T , XIA T . Vehicle detection from 3D lidar using fully convolutional network[J]. arXiv preprint, 2016,arXiv:1608.07916.
[77]	LIANG M , YANG B , WANG S L ,et al. Deep continuous fusion for multi-sensor 3D object detection[C]// Proceedings of 2018 15th European Conference on Computer Vision. New York:ACM Press, 2018: 663-678.
[78]	尹宏鹏, 陈波, 柴毅 ,等. 基于视觉的目标检测与跟踪综述[J]. 自动化学报, 2016,42(10): 1466-1489.
	YIN H P , CHEN B , CHAI Y ,et al. Vision-based object detection and tracking:a review[J]. Acta Automatica Sinica, 2016,42(10): 1466-1489.
[79]	王永森, 刘宏哲 . 3D 目标检测技术的研究进展[C]// 中国计算机用户协会网络应用分会2019年第二十三届网络新技术与应用年会论文集. 重庆:《计算机科学》编辑部, 2019: 177-182.
	WANG Y S , LIU H Z . Study progress of advances in 3D object detection technology[C]// Proceedings of 2019 23rd Annual Conference on New Network Technologies and Applications of China Computer Users Association. Chongqing:Editorial Board of Computer Science, 2019: 177-182.
[80]	GIRSHICK R , . Fast R-CNN[C]// Proceedings of 2015 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2016: 1440-1448.
[81]	REN S Q , HE K M , GIRSHICK R ,et al. Faster R-CNN:towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017,39(6): 1137-1149.
[82]	LIU W , ANGUELOV D , ERHAN D ,et al. SSD:single shot MultiBox detector[C]// Proceedings of 14th European Conference on Computer Vision. Cham:Springer International Publishing, 2016: 21-37.
[83]	REDMON J , DIVVALA S , GIRSHICK R ,et al. You only look once:unified,real-time object detection[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 779-788.
[84]	HE K M , ZHANG X Y , REN S Q ,et al. Deep residual learning for image recognition[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 770-778.
[85]	CHEN Y J , TAI L , SUN K ,et al. MonoPair:monocular 3D object detection using pairwise spatial relationships[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 12090-12099.
[86]	MOUSAVIAN A , ANGUELOV D , FLYNN J ,et al. 3D bounding box estimation using deep learning and geometry[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2017: 5632-5640.
[87]	WENG X S , KITANI K . Monocular 3D object detection with pseudo-LiDAR point cloud[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Piscataway:IEEE Press, 2020: 857-866.
[88]	LIAN Q , LI P , CHEN X . MonoJSG:joint semantic and geometric cost volume for monocular 3D object detection[J]. arXiv preprint, 2022,arXiv:2203.08563.
[89]	CHEN X Z , KUNDU K , ZHU Y K ,et al. 3D object proposals using stereo imagery for accurate object class detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018,40(5): 1259-1272.
[90]	LI P L , CHEN X Z , SHEN S J . Stereo R-CNN based 3D object detection for autonomous driving[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 7636-7644.
[91]	HE K M , GKIOXARI G , DOLLáR P ,et al. Mask R-CNN[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 2980-2988.
[92]	YANG B , LUO W J , URTASUN R . PIXOR:real-time 3D object detection from point clouds[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 7652-7660.
[93]	YANG B , LIANG M , URTASUN R . HDNET:exploiting HD maps for 3D object detection[C]// Proceedings of Conference on Robot Learning.[S.l.:s.n.], 2018: 146-155.
[94]	MEYER G P , LADDHA A , KEE E ,et al. LaserNet:an efficient probabilistic 3D object detector for autonomous driving[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 12669-12678.
[95]	LIU W , ANGUELOV D , ERHAN D ,et al. SSD:single shot multibox detector[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2016: 21-37.
[96]	SHI S S , GUO C X , JIANG L ,et al. PV-RCNN:point-voxel feature set abstraction for 3D object detection[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 10526-10535.
[97]	WANG H , CONG Y Z , LITANY O ,et al. 3DIoUMatch:leveraging IoU prediction for semi-supervised 3D object detection[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 14610-14619.
[98]	CARION N , MASSA F , SYNNAEVE G ,et al. End-to-end object detection with transformers[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 213-229.
[99]	LIU X , XUE N , WU T . Learning auxiliary monocular contexts helps monocular 3D object detection[J]. arXiv preprint, 2021,arXiv:2112.04628.
[100]	HE T , SOATTO S . Mono3D++:monocular 3D vehicle detection with two-scale 3D hypotheses and task priors[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019,33(1): 8409-8416.
[101]	LUO S J , DAI H , SHAO L ,et al. M3DSSD:monocular 3D single stage object detector[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 6141-6150.
[102]	CHEN H S , HUANG Y Y , TIAN W ,et al. MonoRUn:monocular 3D object detection by reconstruction and uncertainty propagation[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 10374-10383.
[103]	LI C Y , KU J , WASLANDER S L . Confidence guided stereo 3D object detection with split depth estimation[C]// Proceedings of 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway:IEEE Press, 2021: 5776-5783.
[104]	NOH J , LEE S , HAM B . HVPR:hybrid voxel-point representation for single-stage 3D object detection[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 14600-14609.
[105]	DING Z P , HAN X , NIETHAMMER M . VoteNet:a deep learning label fusion method for multi-atlas segmentation[C]// Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham:Springer, 2019: 202-210.
[106]	YANG Z T , SUN Y N , LIU S ,et al. 3DSSD:point-based 3D single stage object detector[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 11037-11045.
[107]	LIU Z C , WU Z Z , TóTH R , . SMOKE:single-stage monocular 3D object detection via keypoint estimation[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway:IEEE Press, 2020: 4289-4298.
[108]	HERBST E , REN X F , FOX D . RGB-D object discovery via multi-scene analysis[C]// Proceedings of 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway:IEEE Press, 2011: 4850-4856.
[109]	LIANG M , YANG B , CHEN Y ,et al. Multi-task multi-sensor fusion for 3D object detection[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 7337-7345.
[110]	LI Y , YU A W , MENG T ,et al. DeepFusion:lidar-camera deep fusion for multi-modal 3D object detection[J]. arXiv preprint, 2022,arXiv:2203.08195.
[111]	GUPTA S , GIRSHICK R , ARBELáEZ P ,et al. Learning rich features from RGB-D images for object detection and segmentation[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2014: 345-360.
[112]	YOO J H , KIM Y , KIM J ,et al. 3D-CVF:generating joint camera and LiDAR features using cross-view spatial feature fusion for 3D object detection[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 720-736.
[113]	SHIN K , KWON Y P , TOMIZUKA M . RoarNet:a robust 3D object detection based on RegiOn approximation refinement[C]// Proceedings of 2019 IEEE Intelligent Vehicles Symposium. Piscataway:IEEE Press, 2019: 2510-2515.
[114]	QI C R , CHEN X L , LITANY O ,et al. ImVoteNet:boosting 3D object detection in point clouds with image votes[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 4403-4412.
[115]	SINDAGI V A , ZHOU Y , TUZEL O . MVX-Net:multimodal VoxelNet for 3D object detection[C]// Proceedings of 2019 International Conference on Robotics and Automation. Piscataway:IEEE Press, 2019: 7276-7282.
[116]	ZHANG W W , WANG Z , LOY C C . Multi-modality cut and paste for 3D object detection[J]. arXiv preprint, 2020,arXiv:2012.12741.
[117]	WANG C W , MA C , ZHU M ,et al. PointAugmenting:cross-modal augmentation for 3D object detection[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 11789-11798.
[118]	WANG Y , CHEN X , CAO L ,et al. Multimodal token fusion for vision transformers[J]. arXiv preprint, 2022,arXiv:2204.08721.
[119]	BAI X , HU Z , ZHU X ,et al. TransFusion:robust LiDAR-camera fusion for 3D object detection with transformers[J]. arXiv preprint, 2022,arXiv:2203.11496.
[120]	PRAKASH A , CHITTA K , GEIGER A . Multi-modal fusion transformer for end-to-end autonomous driving[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 7073-7083.
[121]	LIANG T , XIE H , YU K ,et al. BEVFusion:a simple and robust LiDAR-camera fusion framework[J]. arXiv preprint, 2022,arXiv:2205.13790.
[122]	MANHARDT F , KEHL W , GAIDON A . ROI-10D:monocular lifting of 2D detection to 6D pose and metric shape[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 2064-2073.
[123]	GUPTA S , HOFFMAN J , MALIK J . Cross modal distillation for supervision transfer[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 2827-2836.
[124]	HUANG T T , LIU Z , CHEN X W ,et al. EPNet:enhancing point features with image semantics for 3D object detection[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 35-52.
[125]	PANG S , MORRIS D , RADHA H . CLOCs:camera-LiDAR object candidates fusion for 3D object detection[C]// Proceedings of 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York:ACM Press, 2020: 10386-10393.
[126]	YIN T W , ZHOU X Y , KR?HENBüHL P , . Center-based 3D object detection and tracking[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 11779-11788.
[127]	SIVIC J , RUSSELL B C , EFROS A A ,et al. Discovering objects and their location in images[C]// Proceedings of 10th IEEE International Conference on Computer Vision Volume 1. Piscataway:IEEE Press, 2005: 370-377.
[128]	SHI X P , CHEN Z X , KIM T K . Distance-normalized unified representation for monocular 3D object detection[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 91-107.
[129]	CHEN Y P , WANG J K , LI J ,et al. LiDAR-video driving dataset:learning driving policies effectively[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 5870-5878.
[130]	PHAM Q H , SEVESTRE P , PAHWA R S ,et al. A*3D dataset:towards autonomous driving in challenging environments[C]// Proceedings of 2020 IEEE International Conference on Robotics and Automation. Piscataway:IEEE Press, 2020: 2267-2273.
[131]	CHANG M F , LAMBERT J , SANGKLOY P ,et al. Argoverse:3D tracking and forecasting with rich maps[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8740-8749.
[132]	SCALE H . PandaSet:public large-scale dataset for autonomous driving[R]. 2019.
[133]	HUANG X Y , CHENG X J , GENG Q C ,et al. The ApolloScape dataset for autonomous driving[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway:IEEE Press, 2018: 1067-10676.
[134]	CAESAR H , BANKITI V , LANG A H ,et al. nuScenes:a multimodal dataset for autonomous driving[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 11618-11628.
[135]	SUN P , KRETZSCHMAR H , DOTIWALLA X ,et al. Scalability in perception for autonomous driving:waymo open dataset[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 2443-2451.
[136]	PATIL A , MALLA S , GANG H M ,et al. The H3D dataset for full-surround 3D multi-object detection and tracking in crowded urban scenes[C]// Proceedings of 2019 International Conference on Robotics and Automation. Piscataway:IEEE Press, 2019: 9552-9557.
[137]	GEYER J A2D2:AEV autonomous driving dataset[Z]. 2019.
[138]	RAHMAN M M , TAN Y H , XUE J ,et al. Notice of violation of IEEE publication principles:recent advances in 3D object detection in the era of deep neural networks:a survey[J]. IEEE Transactions on Image Processing, 2020,29: 2947-2962.
[139]	DENG J , DONG W , SOCHER R ,et al. ImageNet:a large-scale hierarchical image database[C]// Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009: 248-255.
[140]	LIU Y C , FAN B , XIANG S M ,et al. Relation-shape convolutional neural network for point cloud analysis[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8887-8896.
[141]	CASEY A D , SON S F , BILIONIS I ,et al. Prediction of energetic material properties from electronic structure using 3D convolutional neural networks[J]. Journal of Chemical Information and Modeling, 2020,60(10): 4457-4473.
[142]	CASAS S , LUO W , URTASUN R . IntentNet:learning to predict intention from raw sensor data[J]. arXiv preprint, 2021,arXiv:2101.07907.
[143]	SUN Y X , ZUO W X , LIU M . RTFNet:RGB-thermal fusion network for semantic segmentation of urban scenes[J]. IEEE Robotics and Automation Letters, 2019,4(3): 2576-2583.
[144]	CHEN X , SHI S , ZHU B ,et al. MPPNet:multi-frame feature intertwining with proxy points for 3D temporal object detection[J]. arXiv preprint, 2022,arXiv:2205.05979.
[145]	XU J Y , MIAO Z W , ZHANG D ,et al. INT:towards infiniteframes 3D detection with an efficient framework[J]. arXiv preprint, 2022,arXiv:2209.15215.

Metrics

Recommended 0

No Suggested Reading articles found!

A survey of 3D object detection algorithms

RichHTML

PDF下载

Knowledge

Abstract

Cite this article

share this article

Figures/Tables 13

References 145

Related Articles 15

Metrics

Recommended 0

[1]	Jingwei LU, Xiang CHENG, Fei-Yue WANG. Artificial intelligence and deep learning methods for solving differential equations: the state of the art and prospects [J]. Chinese Journal of Intelligent Science and Technology, 2022, 4(4): 461-476.
[2]	Jun ZHANG, Peidong XU, Siyuan CHEN, Tianlu GAO, Yuxin DAI, Ke ZHANG, Hang ZHAO, Jiemai GAO, Yuyang BAI, Jinxing LI, Haoran ZHANG, Xiang LI, Jiuxiang CHEN. A hybrid physics-data-knowledge driven approach for human-machine hybrid-augmented intelligence-based system management and control [J]. Chinese Journal of Intelligent Science and Technology, 2022, 4(4): 571-583.
[3]	Yan CHEN, Xueqin LUO, Wei LIANG, Yongfang XIE. Depression recognition based on emotional information fused with attentional mechanism [J]. Chinese Journal of Intelligent Science and Technology, 2022, 4(4): 600-609.
[4]	Renwu LI, Lingxiao ZHANG, Lin GAO, Chunpeng LI, Hao JIANG. Category-level object pose estimation from depth point cloud [J]. Chinese Journal of Intelligent Science and Technology, 2022, 4(2): 246-254.
[5]	Linhui LI, Bin ZHOU, Weiwei REN, Jing LIAN. Review of pedestrian trajectory prediction methods [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(4): 399-411.
[6]	Ying LI, Long CHEN, Zhaohong HUANG, Yang SUN, Guorong CAI. Plant leaf detection technology based on multi-scale CNN feature fusion [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(3): 304-311.
[7]	Qing TIAN, Rong HU, Zuoyong LI, Yuanzheng CAI, Zhaochai YU. Insulator detection based on SE-YOLOv5s [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(3): 312-321.
[8]	Wen LIU, Kunlin HU, Yan LI, Zhao LIU. A review of prediction methods for moving target trajectories [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(2): 149-160.
[9]	Yang ZHANG, Yue HU, Dongrong XIN. A deep learning short-term traffic flow prediction method considering spatial-temporal association [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(2): 172-178.
[10]	Dongjun LIU, Yuhan WANG, Wenfen LING, Yong PENG, Wanzeng KONG. Emotion recognition based on brain and machine collaborative intelligence [J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(1): 65-75.
[11]	Liang ZHAO, Zhifeng XIE, Kunpeng ZHANG, Yuqing ZHENG, Yuankun FU. Modeling signal propagation in wireless network:an interval type-2 fuzzy ensemble deep learning approach [J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(4): 401-411.
[12]	Sijia TIAN,Qiang GU,Rong HU,Ruige LI,Dingxin HE. A robot sorting method based on deep learning [J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(3): 268-274.
[13]	Fei-Yue WANG,Dongpu CAO,Qinglai WEI. Reinforcement learning:toward action-knowledge merged intelligent mechanisms and algorithms [J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(2): 101-106.
[14]	Xiaofeng YUAN,Yalin WANG,Chunhua YANG,Weihua GUI. The application of deep learning in data-driven modeling of process industries [J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(2): 107-115.
[15]	Kai LIU, Yidong LI, Weipeng LIN. A survey on vehicle re-identification [J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(1): 10-25.