Big Data Research, 2020, Vol. 6, Issue (3): 101-116. doi: 10.11959/j.issn.2096-0271.2020027

• TOPIC: DATAFLOW COMPUTING TECHNIQUES FOR BIG DATA PROCESSING •

Survey on data caching technology of distributed dataflow system

Xuchu YUAN, Guo FU, Jize BI, Yanfeng ZHANG, Tiezheng NIE, Yu GU, Yubin BAO, Ge YU

  1. College of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
  • Online: 2020-05-15 Published: 2020-05-15
  • Supported by:
    The National Key Research and Development Program of China (2018YFB1003404); The National Natural Science Foundation of China (61672141); Fundamental Research Funds for the Central Universities (N181605017); Fundamental Research Funds for the Central Universities (N181604016)

Abstract:

The dataflow model is adopted by several dataflow systems for its highly parallel computation, pipelined processing, and functional programming style. In distributed and heterogeneous dataflow systems, a mismatch between the rate at which data source operators produce data and the rate at which data sink operators consume it can delay data and leave operators idle. An efficient dataflow system therefore needs a dataflow cache system that ensures efficient caching and movement of dataflow. This survey analyzes several distributed dataflow systems and distributed message queuing systems, and summarizes how well current message queuing systems support dataflow caching. Finally, it introduces caching techniques and analyzes the requirements and research directions for future dataflow caching systems.

Key words: dataflow, cache, distributed system, message queue
