Chinese Journal on Internet of Things (物联网学报), 2023, Vol. 7, Issue (4): 142-152. doi: 10.11959/j.issn.2096-3750.2023.00366

• Theory and Technology •

Retail commodity detection method based on location learnable visual center mechanism

Xiaohua LYU (吕晓华), Mingchen WEI (魏铭辰), Libo LIU (刘立波)

  1. School of Information Engineering, Ningxia University, Yinchuan 750021, China
  • Revised: 2023-07-28 Online: 2023-12-01 Published: 2023-12-01
  • About the authors: Xiaohua LYU (born 2000), male, is a master's student at the School of Information Engineering, Ningxia University. His main research interests are deep-learning-based fine-grained commodity detection and incremental learning.
    Mingchen WEI (born 1993), male, is a master's student at the School of Information Engineering, Ningxia University. His main research interest is deep-learning-based fine-grained commodity detection.
    Libo LIU (born 1974), female, Ph.D., is a professor and doctoral supervisor at Ningxia University. Her main research interests are intelligent information processing and computer vision.
  • Supported by:
    The National Natural Science Foundation of China (62262053); The Ningxia Science and Technology Innovation Leading Talent Plan (2022GKLRLX03)

Abstract:

Packaging deformation and overlap make it difficult to capture salient and diverse feature information for retail products, which lowers detection accuracy. To address this problem, a location learnable visual center (LLVC) mechanism was designed and used to improve YOLOX-s, achieving higher detection accuracy. First, a lightweight multi-layer perceptron fuses information across different feature channels to fully capture global context information. Second, the designed LLVC enhances local feature representation and uses spatial information to assign learnable weights to local features, increasing the attention paid to discriminative local features. Finally, the intersection over union (IoU) loss function was replaced with centered intersection over union (CIoU), and a power parameter α was introduced on this basis, which effectively reduced the missed detection rate. Experimental results show that the proposed method achieves an accuracy of 91.3% on the retail product checkout (RPC) dataset, which is 2.2% higher than YOLOX-s and better than current mainstream lightweight object detection algorithms. The method also runs at 97 frames per second (FPS) with a model size of 9.48 MB, so it can detect retail products accurately and in real time in scenarios where computing resources are limited.

Key words: retail commodity detection, YOLOX-s, central learning mechanism, loss function, lightweight
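
The loss-function change described in the abstract (CIoU with a power parameter α) can be illustrated with a small sketch. It is not the authors' implementation: it assumes the widely used CIoU formulation (IoU plus a center-distance penalty and an aspect-ratio penalty) and applies the power α to each term in the style of α-IoU losses. The function name `alpha_ciou_loss`, the `(x1, y1, x2, y2)` box format, and the default `alpha=3.0` are illustrative assumptions; the abstract does not state the value of α that was used.

```python
import math


def alpha_ciou_loss(pred, target, alpha=3.0, eps=1e-9):
    """Illustrative alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    alpha=3.0 is a placeholder default, not a value taken from the paper.
    With alpha=1.0 the expression reduces to the plain CIoU loss.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared center distance, normalized by the squared diagonal
    # of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    beta = v / (1.0 - iou + v + eps)

    # Each term is raised to the power alpha.
    return 1.0 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

Under these assumptions, a perfectly matched box gives a loss near 0, and a prediction that overlaps the target more closely gives a lower loss than one that overlaps it less, for example `alpha_ciou_loss((0.5, 0.5, 2.5, 2.5), (1, 1, 3, 3))` is smaller than `alpha_ciou_loss((0, 0, 2, 2), (1, 1, 3, 3))`.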
