Big Data Research ›› 2024, Vol. 10 ›› Issue (4): 3-20. doi: 10.11959/j.issn.2096-0271.2024046

• TOPIC: BIG DATA AND CLOUD STORAGE •    

Research on key technologies for efficient storage and access of turbulent big data

Wendi CHENG1, Xiao ZHANG1, Zhaohui PAN2, Youjun ZHAO2, Chenguang SUN1, Xueqiang SHAN1, Yuzhan JIN1, Xiaonan ZHAO1   

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
  2. School of Software, Northwestern Polytechnical University, Xi’an 710072, China
  • Online: 2024-07-01 Published: 2024-07-01
  • Supported by:
    The National Natural Science Foundation of China (92152301)

Abstract:

With advances in measurement techniques and numerical simulation, data-driven turbulence research has become a new approach in the field. In China, several wind tunnel laboratories and supercomputing centers have been established for turbulence simulations, producing a substantial collection of turbulence data. However, China currently has no centralized turbulence data management platform, which makes it difficult to exchange and share expensive experimental and simulation data. Turbulence data is characterized by large volume, high dimensionality, high precision, and heterogeneity, which pose challenges for storage, access, and management efficiency. To address these problems, a distributed storage system for turbulence big data, called TDFS, was designed, targeting typical flow problems in aviation, aerospace, and marine applications. Based on the access characteristics of turbulence big data, novel metadata management methods and data access interfaces were designed for TDFS. Experimental results show that TDFS improves interface response speed by 54.38% and 57.7% compared with HDFS and GlusterFS, respectively. In addition, to reduce the storage overhead of turbulence big data, a lazy replication compression mechanism based on HDF5 was designed, which reduces storage space by 34% compared with the original replication storage approach.

Key words: turbulence big data, distributed storage system, lazy replication compression, performance optimization

