Big Data Research ›› 2019, Vol. 5 ›› Issue (6): 30-46. doi: 10.11959/j.issn.2096-0271.2019048

• Special Topic: Big Data Wrangling •

Data curation technologies and applications

Minghe YU1,2, Tiezheng NIE3, Guoliang LI4

  1 Software College, Northeastern University, Shenyang 110169, China
  2 Guangdong Province Key Laboratory of Popular High Performance Computers, Shenzhen 518060, China
  3 School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
  4 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Online: 2019-11-15  Published: 2020-01-10
  • About the authors: Minghe YU (于明鹤, 1989- ), female, Ph.D., is a lecturer at the Software College of Northeastern University; her main research interests include big data and information retrieval. | Tiezheng NIE (聂铁铮, 1980- ), male, Ph.D., is an associate professor at the School of Computer Science and Engineering, Northeastern University; his main research interests are data integration, big data processing, and blockchain. | Guoliang LI (李国良, 1980- ), male, Ph.D., is a professor at the Department of Computer Science and Technology, Tsinghua University; his main research interests include data cleaning, data integration, and crowdsourced data management.
  • Supported by:
    China Postdoctoral Science Foundation (2019M651134); Guangdong Province Key Laboratory of Popular High Performance Computers (2017B030314073); Fundamental Research Funds for the Central Universities (N181703006)

Abstract:

Data curation emerged to support the thorough and effective processing, storage, and application of massive data. It is the active and continuous management of data throughout its entire lifecycle, so that the data are used to the fullest extent and their useful life is extended as far as possible. Centering on the goals, solutions, and applications of data curation, this paper systematically describes the data curation process and its key techniques, and analyzes existing solutions for those techniques. It also introduces several applications of data curation in various domains and compares their technical characteristics. Finally, the development prospects and future challenges of data curation are discussed.

Key words: data curation, data cleaning, data integration, metadata management, provenance management
