Big Data Research ›› 2023, Vol. 9 ›› Issue (4): 16-31. doi: 10.11959/j.issn.2096-0271.2023043

• Topic: Cross-Domain Data Management •

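Harp: optimization algorithm for cross-domain distributed transactions

Qiyu ZHUANG1,2, Tong LI1,2, Wei LU1,2, Xiaoyong DU1,2

  1. Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872, China
  2. School of Information, Renmin University of China, Beijing 100872, China
  • Online: 2023-07-15 Published: 2023-07-01
  • About the authors: Qiyu ZHUANG (2000- ), male, is a PhD candidate at the School of Information, Renmin University of China. His main research interests are distributed database systems and transaction processing.
    Tong LI (1989- ), male, PhD, is an associate professor at the School of Information, Renmin University of China. His main research interests are next-generation Internet architecture, cross-domain data management, and big data.
    Wei LU (1981- ), male, PhD, is a professor and doctoral supervisor at the School of Information, Renmin University of China, and a member of the Database Technical Committee of the China Computer Federation. His main research interests are the fundamental theory of databases, the development of big data systems, query processing in spatio-temporal settings, and cloud database systems and applications.
    Xiaoyong DU (1963- ), male, PhD, is a second-grade professor and doctoral supervisor at the School of Information, Renmin University of China. His main research interests are database systems, big data management and analysis, and intelligent information retrieval.
  • Supported by:
    The National Natural Science Foundation of China (61972403, 61732014, 62202473); the National Key Research and Development Program of China (2020YFB2104100)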

Abstract:

The near-data computing paradigm has driven banks and securities firms to build multiple data centers nationwide or worldwide. In the traditional business model, a transaction accessed data within a single data center. As business models change, distributed transactions that span data centers have become the norm, for example money transfers between bank accounts or equipment exchanges between game accounts, where the accounts' data is stored in data centers in different regions. Distributed transaction processing relies on the two-phase commit protocol to guarantee that the sub-transactions on all participating nodes commit atomically. In cross-domain scenarios, network latency between nodes is both longer and more varied, so traditional transaction processing techniques need to be extended for the system to sustain high throughput. After analyzing the problems of cross-domain distributed transactions and the room for optimization, this paper proposes a new distributed transaction processing algorithm called Harp. While guaranteeing the serializable isolation level, Harp delays the execution of some sub-transactions according to the differences in network latency, which shortens the time that transactions spend in lock contention and improves system concurrency and throughput. Experiments show that, under the YCSB workload, Harp improves performance by 1.39 times compared with the traditional algorithm.

Key words: cross-domain distributed transaction, network difference, transaction scheduling, lock contention
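
The idea of delaying sub-transactions according to latency differences can be illustrated with a small model. The sketch below is not Harp's published algorithm; it assumes a simplified two-phase commit in which a participant acquires its locks when its sub-transaction starts and releases them when the commit decision arrives, and it assumes that every sub-transaction takes the same execution time. The function names, node names, and latency values are hypothetical.

```python
def delays_by_rtt(rtt_ms):
    """Delay each sub-transaction by the gap between the slowest
    participant's round-trip time and its own (an assumed policy)."""
    slowest = max(rtt_ms.values())
    return {node: slowest - rtt for node, rtt in rtt_ms.items()}


def lock_hold_ms(rtt_ms, exec_ms, delays=None):
    """Per-node lock hold time under the simplified model described above.

    rtt_ms:  coordinator <-> participant round-trip time per node
    exec_ms: execution time of every sub-transaction (assumed equal)
    delays:  optional per-node delay applied before execution starts
    """
    delays = delays or {node: 0.0 for node in rtt_ms}
    # The coordinator can decide only after the last vote has arrived.
    decide_at = max(rtt_ms[n] + delays[n] + exec_ms for n in rtt_ms)
    hold = {}
    for node, rtt in rtt_ms.items():
        start = rtt / 2 + delays[node]   # request arrives, then waits
        release = decide_at + rtt / 2    # commit decision arrives
        hold[node] = release - start
    return hold


if __name__ == "__main__":
    # Hypothetical round-trip times to three data centers, in milliseconds.
    rtt = {"dc_near": 2.0, "dc_mid": 30.0, "dc_far": 50.0}
    print("no delay  :", lock_hold_ms(rtt, exec_ms=1.0))
    print("with delay:", lock_hold_ms(rtt, exec_ms=1.0, delays=delays_by_rtt(rtt)))
```

In this model, every participant holds its locks for as long as the farthest node needs when no delay is applied, while with the delay each participant holds them only for roughly its own round trip plus execution time, and the commit decision is made no later than before.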
