Journal on Communications ›› 2023, Vol. 44 ›› Issue (10): 72-84. doi: 10.11959/j.issn.1000-436x.2023189

• Academic Paper •

3D object detection method based on a grid self-attention mechanism over raw point clouds

鲁斌1,2, 孙洋1,2, 杨振宇1,2   

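  1. Department of Computer Science, North China Electric Power University, Baoding, Hebei 071003
  2. Engineering Research Center of Intelligent Computing for Complex Energy Systems, Ministry of Education, Baoding, Hebei 071003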
  • Revised: 2023-09-05 Online: 2023-10-01 Published: 2023-10-01
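  • About the authors: Bin LU (1975− ), male, born in Yinchuan, Ningxia, Ph.D., is a professor at North China Electric Power University. His main research interests include intelligent computing and computer vision, integrated energy systems, and big data analysis.
    Yang SUN (1991− ), male, born in Baoding, Hebei, is a Ph.D. candidate at North China Electric Power University. His main research interests include machine learning and computer vision.
    Zhenyu YANG (1998− ), male, born in Hohhot, Inner Mongolia, is a Ph.D. candidate at North China Electric Power University. His main research interests include machine learning and computer vision.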
  • Supported by:
    The National Natural Science Foundation of China (62371188); Hebei Province Postgraduate Innovation Capability Training Project (CXZZBS2023153)

Grid self-attention mechanism 3D object detection method based on raw point cloud

Bin LU1,2, Yang SUN1,2, Zhenyu YANG1,2   

  1. School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China
  2. Engineering Research Center of Intelligent Computing for Complex Energy Systems, Ministry of Education, Baoding 071003, China
  • Revised: 2023-09-05 Online: 2023-10-01 Published: 2023-10-01
  • Supported by:
    The National Natural Science Foundation of China (62371188); Hebei Province Postgraduate Innovation Capability Training Project (CXZZBS2023153)

Abstract:

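To enhance the feature representation of regions of interest (RoIs), GT3D, a 3D object detection method based on a grid self-attention mechanism over raw point clouds, is proposed; it comprises a spatial grid feature encoding module and a soft regression loss. The grid feature encoding module uses self-attention to weight the local and spatial features of points and takes the geometric relationships among points fully into account, providing a more accurate feature representation. The soft regression loss mitigates the regression ambiguity caused by inaccurate labels produced during data annotation. The proposed method is evaluated on the public 3D object detection dataset KITTI. The results show that it clearly improves detection accuracy over other published point cloud-based 3D object detection methods. Results were also submitted to the official KITTI test server for public evaluation, where car detection accuracy reaches 91.45%, 82.76%, and 79.74% at the easy, moderate, and hard difficulty levels, respectively.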

Key words: 3D object detection, point cloud, self-attention mechanism, spatial coordinate encoding, soft regression loss

Abstract:

To enhance the feature representation of regions of interest (RoIs), a 3D object detection method based on a grid self-attention mechanism over raw point clouds, named GT3D, was proposed. It incorporates a spatial context encoding module and a soft regression loss. The spatial context encoding module uses the attention mechanism to weight the local and spatial features of points, accounting for the contribution of different point cloud features to yield a more accurate feature representation. The soft regression loss was introduced to address the label ambiguity that arises from inaccurate annotation during data labeling. Experiments on the public KITTI 3D object detection dataset show that the proposed method notably improves detection accuracy over other published point cloud-based 3D object detection methods. The test-set detection results were submitted to the official KITTI server for public evaluation, achieving car detection accuracies of 91.45%, 82.76%, and 79.74% at the easy, moderate, and hard difficulty levels, respectively.
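
The abstract does not give the module's equations, so the following is only a minimal sketch of the general idea of weighting RoI grid-point features with self-attention, not the GT3D implementation. It assumes single-head scaled dot-product attention, identity query/key/value projections, and a spatial encoding that simply concatenates raw coordinates to the features; the function name `grid_self_attention` and all parameters are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def grid_self_attention(features, coords):
    """Single-head scaled dot-product self-attention over RoI grid points.

    features: N local feature vectors, each of length D.
    coords:   N (x, y, z) grid-point coordinates.

    Assumptions (not from the paper): the spatial encoding is the raw
    coordinates concatenated to each feature vector, and queries, keys
    and values all use the identity projection. A trained model would
    learn these projections instead.
    """
    tokens = [list(f) + list(c) for f, c in zip(features, coords)]
    dim = len(tokens[0])
    scale = math.sqrt(dim)

    outputs, attention = [], []
    for query in tokens:
        # Similarity of this grid point to every grid point in the RoI.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Attention-weighted mix of all tokens, one output channel at a time.
        mixed = [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        outputs.append(mixed)
        attention.append(weights)
    return outputs, attention


# Three toy grid points with 2-D features and 3-D coordinates.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
refined, attention = grid_self_attention(features, coords)
```

Each row of `attention` sums to 1, so every refined grid feature is a convex combination of the RoI's grid tokens; points whose features and positions are similar contribute more to one another.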

Key words: 3D object detection, point cloud, self-attention mechanism, spatial coordinate encoding, soft regression loss

CLC number: 
