Big Data Research, 2021, Vol. 7, Issue (2): 172-181. doi: 10.11959/j.issn.2096-0271.2021020

• Special topic: Virtual data space for high-performance computing •

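An optimization of the MPI-IO interface for non-volatile memory

Zhenlong DENG, Zhiguang CHEN

  1. School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, Guangdong, China
  • Online: 2021-03-15 Published: 2021-03-01
  • About the authors: Zhenlong DENG (1995- ), male, is a master's student at the School of Computer Science and Engineering, Sun Yat-Sen University. His main research interest is distributed storage.
    Zhiguang CHEN (1984- ), male, Ph.D., is an associate professor at the School of Computer Science and Engineering, Sun Yat-Sen University. His main research interests are big data storage and processing, parallel and distributed computing, and high-performance computing and supercomputers.
  • Supported by:
    The National Key Research and Development Program of China (2018YFB0203904); The National Natural Science Foundation of China (61872392); The Natural Science Foundation of Guangdong Province (2018B030312002); Pearl River S&T Nova Program of Guangzhou (201906010008)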

An optimization of MPI-IO interface for non-volatile memory

Zhenlong DENG, Zhiguang CHEN   

  1. School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, China
  • Online: 2021-03-15 Published: 2021-03-01
  • Supported by:
    The National Key Research and Development Program of China (2018YFB0203904); The National Natural Science Foundation of China (61872392); The Natural Science Foundation of Guangdong Province (2018B030312002); Pearl River S&T Nova Program of Guangzhou (201906010008)

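Abstract:

In high-performance computing environments, when multiple compute nodes of an MPI application access files in the underlying storage system at the same time, the I/O overhead depends on the access pattern and on the performance of the external storage devices. Targeting the file access characteristics of MPI applications, and exploiting the high bandwidth, low latency, byte addressability, and data persistence of non-volatile memory, an optimization scheme for the MPI-IO interface oriented to non-volatile memory is proposed. The scheme builds a distributed cache for file data, maintains persistent metadata, and optimizes the data transfer strategy among processes, so that applications can manage and use non-volatile memory devices effectively while keeping cached data consistent and valid. Experimental results show that the proposed system improves the read/write performance of applications by tens of times. Future work will further optimize the parallelism of the scheme.

Key words: non-volatile memory, MPI-IO, distributed data cache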

Abstract:

In an HPC system where multiple compute nodes of an MPI application simultaneously access files in the underlying storage system, the I/O overhead is affected by the access pattern and the performance of the external storage devices. Based on the file access patterns of MPI applications, an optimization of the MPI-IO interface for non-volatile memory was proposed, exploiting the high bandwidth, low latency, byte addressability, and data persistence of such memory. By constructing a distributed data cache, maintaining persistent metadata, and optimizing data movement among processes, the scheme enables applications to manage and utilize non-volatile memory efficiently while keeping cached data consistent, resulting in an improvement of tens of times in read/write bandwidth. Further optimization of parallelism is left for future work.

Key words: non-volatile memory, MPI-IO, distributed data cache
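
The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea of a distributed block cache with persistent metadata. Everything in it is an assumption made for illustration and not the authors' design: the block size, the round-robin placement rule `owner_of`, the direct-mapped slot layout, and the names `split_request` and `MappedCache`. A memory-mapped regular file stands in for a non-volatile memory region, and no MPI library is used.

```python
import mmap
import os
import struct
import tempfile

BLOCK_SIZE = 4096             # assumed cache block size, not from the paper
META = struct.Struct("<qB")   # per-slot metadata: file block index, valid flag


def owner_of(block, nranks):
    """Hypothetical placement rule: round-robin file blocks across ranks."""
    return block % nranks


def split_request(offset, length, nranks, block_size=BLOCK_SIZE):
    """Split a byte range into (rank, block, offset_in_block, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        block = offset // block_size
        in_block = offset % block_size
        nbytes = min(block_size - in_block, end - offset)
        pieces.append((owner_of(block, nranks), block, in_block, nbytes))
        offset += nbytes
    return pieces


class MappedCache:
    """One rank's cache; a memory-mapped file stands in for an NVM region."""

    def __init__(self, path, slots):
        self.slots = slots
        self.data_start = slots * META.size      # metadata table comes first
        size = self.data_start + slots * BLOCK_SIZE
        is_new = not os.path.exists(path)
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        if is_new:
            os.ftruncate(self.fd, size)          # zero-filled: all slots invalid
        self.mem = mmap.mmap(self.fd, size)

    def put(self, block, payload):
        assert len(payload) == BLOCK_SIZE
        slot = block % self.slots                # direct-mapped, for brevity
        start = self.data_start + slot * BLOCK_SIZE
        # Invalidate, write the data, then mark the slot valid. Real persistent
        # memory would also need a flush/fence between these steps.
        META.pack_into(self.mem, slot * META.size, block, 0)
        self.mem[start:start + BLOCK_SIZE] = payload
        META.pack_into(self.mem, slot * META.size, block, 1)
        self.mem.flush()

    def get(self, block):
        slot = block % self.slots
        cached, valid = META.unpack_from(self.mem, slot * META.size)
        if not valid or cached != block:
            return None                          # cache miss
        start = self.data_start + slot * BLOCK_SIZE
        return bytes(self.mem[start:start + BLOCK_SIZE])

    def close(self):
        self.mem.close()
        os.close(self.fd)


if __name__ == "__main__":
    # A 200-byte request starting at offset 4000 spans two blocks and two ranks.
    print(split_request(4000, 200, nranks=4))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rank0.cache")
        cache = MappedCache(path, slots=4)
        cache.put(5, b"x" * BLOCK_SIZE)
        cache.close()
        cache = MappedCache(path, slots=4)       # metadata survives reopening
        print(cache.get(5) is not None, cache.get(1) is None)
        cache.close()
```

In this sketch the metadata table lives in the same mapped region as the data, so a reopened cache can tell which blocks are still valid. A real system would additionally have to move the pieces returned by `split_request` between processes and handle consistency across ranks, which the sketch leaves out.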
