Big Data Research, 2023, Vol. 9, Issue (4): 16-31. doi: 10.11959/j.issn.2096-0271.2023043
• Special Topic: Cross-Domain Data Management •
Qiyu ZHUANG1,2, Tong LI1,2, Wei LU1,2, Xiaoyong DU1,2
Online:
2023-07-01
Published:
2023-07-01
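About the author:
Qiyu ZHUANG (2000- ), male, is a Ph.D. student at the School of Information, Renmin University of China. His main research interests include distributed database systems and transaction processing.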
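Abstract:
The near-data computing paradigm has driven banks and securities firms to build multiple data centers nationwide or worldwide. In the traditional business model, a transaction accesses data within a single data center. As business models change, distributed transactions that span data centers have become the norm, for example, transfers between bank accounts or equipment exchanges between game accounts whose data is stored in data centers in different regions. Distributed transaction processing relies on the two-phase commit (2PC) protocol to guarantee that the sub-transactions on all participant nodes commit atomically. In cross-spatial-domain scenarios, network latency between nodes is both higher and more variable, so traditional transaction processing techniques must be extended to keep system throughput high. After analyzing the problems of cross-domain transactions and the room for optimization, this paper proposes Harp, a new distributed transaction processing algorithm. While guaranteeing the serializable isolation level, Harp delays the execution of some sub-transactions according to the differences in network latency, which shortens the time transactions spend contending for locks and improves system concurrency and throughput. Experiments show that under the YCSB workload, Harp improves performance by 1.39 times over traditional algorithms.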
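To make the idea of latency-aware delay concrete, the following is a minimal sketch and not Harp's actual algorithm, which the abstract does not specify. It assumes a deliberately simplified 2PC model: a single coordinator, symmetric one-way latencies, zero execution time, and locks that are acquired when a sub-transaction arrives at a participant and released when the commit decision arrives. The function names, node names, and latency values are all hypothetical.

```python
def dispatch_delays(one_way_latency_ms):
    """Delay each sub-transaction so that all votes reach the coordinator together.

    Assumption (not from the paper): the coordinator cannot decide before the
    slowest participant's vote arrives, at time 2 * slowest. A nearer
    participant can therefore start later by 2 * (slowest - latency).
    """
    slowest = max(one_way_latency_ms.values())
    return {node: 2 * (slowest - lat) for node, lat in one_way_latency_ms.items()}


def lock_hold_ms(one_way_latency_ms, delays):
    """Lock-hold time per participant under the simplified model above."""
    # The coordinator has every vote once the last one arrives.
    decision_ready = max(
        delays[node] + 2 * lat for node, lat in one_way_latency_ms.items()
    )
    hold = {}
    for node, lat in one_way_latency_ms.items():
        acquired = delays[node] + lat    # sub-transaction arrives, locks taken
        released = decision_ready + lat  # commit decision arrives, locks freed
        hold[node] = released - acquired
    return hold


if __name__ == "__main__":
    # Hypothetical one-way latencies from the coordinator, in milliseconds.
    latency = {"local": 1.0, "regional": 15.0, "remote": 100.0}
    no_delay = {node: 0.0 for node in latency}
    print("without delay:", lock_hold_ms(latency, no_delay))
    print("with delay:   ", lock_hold_ms(latency, dispatch_delays(latency)))
```

In this model, every participant holds its locks for two times the slowest latency when all sub-transactions are dispatched at once, whereas with the delay each participant holds them for only two times its own latency. The total commit latency is unchanged because the slowest participant still determines when the decision can be made.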
Qiyu ZHUANG, Tong LI, Wei LU, Xiaoyong DU. Harp: optimization algorithm for cross-domain distributed transactions[J]. Big Data Research, 2023, 9(4): 16-31.