Journal on Communications ›› 2016, Vol. 37 ›› Issue (9): 10-19. doi: 10.11959/j.issn.1000-436x.2016173

• Academic paper

Hardware detection algorithm for data races in multi-core programs based on sliding windows

Su-xia ZHU (1, 2), De-yun CHEN (1, 2), Zhen-zhou JI (3), Guang-lu SUN (1)

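  1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080
  2. Postdoctoral Research Station, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080
  3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001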
  • Online: 2016-09-25 Published: 2016-09-28
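  • Funding:
    National Natural Science Foundation of China (Young Scientists Fund); Heilongjiang Province Youth Science Foundation; China Postdoctoral Science Foundation; National Natural Science Foundation of China; National Basic Research Program of China ("973" Program)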

Hardware data race detection algorithm based on sliding windows

Su-xia ZHU (1, 2), De-yun CHEN (1, 2), Zhen-zhou JI (3), Guang-lu SUN (1)

  1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  2. Postdoctoral Research Station, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • Online: 2016-09-25 Published: 2016-09-28
  • Supported by:
    The National Natural Science Foundation of China for Youths; Heilongjiang Province Science Foundation for Youths; The China Postdoctoral Science Foundation; The National Natural Science Foundation of China; The National Basic Research Program of China ("973" Program)

Abstract:

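Data races are the main cause of concurrency errors in multi-core programs. To address the high hardware overhead of existing hardware-based happens-before data race detection methods, a lightweight hardware algorithm for detecting memory races is proposed. The algorithm uses sliding windows to dynamically detect, during program execution, data races that are close together and therefore more likely to trigger concurrency errors. Taking the race distance into account, concurrent thread segments are subdivided into locked concurrent race regions and unlocked concurrent race regions, the latter containing each thread's recent execution sequence. A pair of alternately moving, rewritable sliding windows holds the memory instructions in the unlocked concurrent race region, and one variable-size rewritable sliding window holds the memory instructions in the locked concurrent race region. A data race is detected when a remote shared access conflicts with a memory access in the windows. In the hardware implementation, only three pairs of small hardware signature registers are added to each processor core to hold the data addresses in the concurrent race regions, and the existing cache coherence protocol is left unchanged. The bandwidth overhead is low, and dynamic data races that occur during the concurrent execution of multi-core programs are detected quickly, providing effective guidance for diagnosing concurrency errors during both the development and the production runs of multi-core programs.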
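The window mechanism described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class `CoreWindows`, the parameter `window_size`, the rule that the older window of the pair is overwritten when the active one fills, and the choice to empty the lock window on release are all hypothetical. Lock identity is ignored, so the model does not distinguish accesses protected by the same lock.

```python
class CoreWindows:
    """Software model of one core's sliding windows (illustrative only)."""

    def __init__(self, window_size=4):
        self.window_size = window_size   # assumed capacity of each paired window
        # Pair of alternating windows for the unlocked region.
        # Each window maps address -> True if the core wrote it, False if only read.
        self.pair = [{}, {}]
        self.active = 0                  # index of the window currently being filled
        self.count = 0                   # accesses recorded in the active window
        self.lock_window = {}            # variable-size window for the locked region
        self.in_lock = False

    def acquire(self):
        self.in_lock = True

    def release(self):
        # Assumption: the locked region ends at release and its window is emptied.
        self.in_lock = False
        self.lock_window.clear()

    def record(self, addr, is_write):
        """Record a local memory access in the appropriate window."""
        if self.in_lock:
            window = self.lock_window
        else:
            if self.count == self.window_size:
                # Active window is full: switch to the other one and overwrite it.
                self.active ^= 1
                self.pair[self.active].clear()
                self.count = 0
            window = self.pair[self.active]
            self.count += 1
        window[addr] = window.get(addr, False) or is_write

    def remote_access(self, addr, is_write):
        """Return True if a remote access conflicts with a windowed local access."""
        for window in (*self.pair, self.lock_window):
            # A conflict needs the same address and at least one write.
            if addr in window and (is_write or window[addr]):
                return True
        return False
```

Because the two unlocked windows alternate, the model always retains between `window_size` and `2 * window_size` of the most recent unlocked accesses, which is one way to bound the race distance that is checked.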

Keywords: data race, sliding window, hardware signature, concurrency error, multi-core program

Abstract:

Data races are a major cause of concurrency bugs in multi-core programs. To address the high hardware cost of existing happens-before detection proposals, a lightweight hardware data race detection approach based on sliding windows is proposed. It uses sliding windows to hold the memory instructions recently executed by each thread and dynamically detects data races with a small race distance, which are more likely to cause concurrency bugs. Taking the race distance into account, concurrent thread segments are subdivided into concurrent race regions with locks and concurrent race regions without locks. A pair of alternating rewritable sliding windows stores the memory instructions in the concurrent race region without locks, and a variable-size sliding window stores the memory instructions in the concurrent race region with locks. A data race is detected when a remote shared access conflicts with a memory access held in the sliding windows. In the hardware implementation, the addresses of the data in the sliding windows are automatically encoded into three pairs of small hardware signatures per processor core. Data races are detected quickly without modifying the L1 cache or the cache coherence protocol messages, and with low hardware and bandwidth overhead. The approach provides effective guidance for diagnosing concurrency bugs that occur during the development and production runs of multi-core programs.
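To make the idea of encoding addresses into a hardware signature concrete, the following sketch models a signature as a fixed-size bit vector in the style of a Bloom filter. The abstract does not specify the encoding, so everything here is an assumption for illustration: the class name `Signature`, the 256-bit width, the 64-byte line granularity, and the two hash functions are hypothetical choices.

```python
class Signature:
    """Fixed-size bit vector summarising a set of addresses (Bloom-filter style)."""

    def __init__(self, bits=256, line_bytes=64):
        self.bits = bits              # assumed signature width in bits
        self.line_bytes = line_bytes  # assumed tracking granularity (cache line)
        self.vector = 0               # Python int used as the bit vector

    def _positions(self, addr):
        line = addr // self.line_bytes
        # Two simple hash functions; hypothetical, chosen only for illustration.
        return (line % self.bits, ((line * 2654435761) >> 7) % self.bits)

    def insert(self, addr):
        """Set the bits selected by the address."""
        for pos in self._positions(addr):
            self.vector |= 1 << pos

    def may_contain(self, addr):
        """True if the address may have been inserted (false positives possible)."""
        return all((self.vector >> pos) & 1 for pos in self._positions(addr))

    def clear(self):
        """Reset the signature so the register can be rewritten."""
        self.vector = 0


# Usage: a remote write is checked against a core's local read signature.
local_reads = Signature()
local_reads.insert(0x1000)
remote_write_conflicts = local_reads.may_contain(0x1008)  # same 64-byte line
```

A signature of this kind never misses an inserted address but can report an address that was never inserted, so a detector built on it trades some false positives for a small, fixed storage cost.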

Key words: data race, sliding window, hardware signature, concurrency bug, multi-core program
