Big Data Research

Statistical characteristics analysis of knowledge graphs for benchmarking graph database management systems

Weining QIAN, Chen SUN, Wenliang CHENG, Aoying ZHOU

2016, 2(5): 3-11. doi:10.11959/j.issn.2096-0271.2016049

Recently, graph data has been widely used in domains such as information security, scientific research, and internet services, which has stimulated the rapid development of graph data management systems. However, existing benchmarks for graph databases are all designed for applications that manage and analyze social networks. The statistical characteristics of knowledge graphs were analyzed and compared with those of two social networks. The analysis showed that knowledge graphs, an important and fast-growing kind of graph data, differ significantly from social networks. Existing social-network-based benchmarks are therefore not suitable for applications that deal with knowledge graphs. Furthermore, the requirements for a new benchmark were analyzed.
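The abstract does not list which statistics were compared, so the following is only a minimal sketch of one commonly used characteristic, the degree distribution, computed from an undirected edge list. The function name `degree_statistics` and the toy edge list are hypothetical and are not taken from the paper.

```python
from collections import Counter


def degree_statistics(edges):
    """Return basic degree statistics for an undirected edge list.

    Illustrative only: the paper's actual measures are not given in the abstract.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    values = list(degree.values())
    # Maps each degree value to the number of vertices that have it.
    histogram = Counter(values)
    return {
        "vertices": len(values),
        "edges": len(edges),
        "mean_degree": sum(values) / len(values) if values else 0.0,
        "max_degree": max(values, default=0),
        "histogram": dict(histogram),
    }


# Hypothetical toy graph: vertex "a" is a small hub.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
stats = degree_statistics(toy_edges)
print(stats)
```

Running the same function on a knowledge graph and on a social network and comparing the two histograms is one simple way to see whether their structures differ.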

Parallel graph layout algorithm for large-scale graph data

Zhiyuan CHENG, Yubin BAO, Fangling LENG

2016, 2(5): 12-21. doi:10.11959/j.issn.2096-0271.2016050

Graph models are widely used modeling tools, and data visualization techniques are widely used as intuitive data analysis tools. Graph layout is the most critical technique in graph visualization, yet there are no effective parallel graph layout algorithms, so visualizing massive graph data remains a challenging problem. To address this problem, a k-friend approximate layout algorithm was proposed; it builds on the force-directed layout algorithm and partially ignores the repulsion force computation between weakly associated vertices. An effective parallel layout algorithm was also designed for massive graph data. Experimental results on artificial and real datasets show that the proposed algorithms greatly improve layout speed.
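The idea of skipping repulsion between weakly associated vertices can be sketched as follows. This is not the authors' algorithm: it assumes that "k-friend" means vertices within k hops of each other, and it borrows Fruchterman-Reingold style forces, which the abstract does not specify. The names `k_hop_neighbors` and `approximate_layout` are hypothetical, and the sketch is sequential rather than parallel.

```python
import math
import random
from collections import deque


def k_hop_neighbors(adj, source, k):
    """Vertices within k hops of source (excluding source), found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    del dist[source]
    return set(dist)


def approximate_layout(adj, k=2, iterations=50, seed=0):
    """Force-directed layout where repulsion is limited to k-hop neighbours."""
    rng = random.Random(seed)
    pos = {v: [rng.random(), rng.random()] for v in adj}
    ideal = 1.0 / math.sqrt(len(adj))  # ideal edge length in a unit square
    near = {v: k_hop_neighbors(adj, v, k) for v in adj}
    step = 0.1  # maximum displacement per iteration, reduced over time
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in adj}
        for v in adj:
            for u in near[v]:  # repulsion: only k-hop neighbours, not all pairs
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                f = ideal * ideal / d
                disp[v][0] += dx / d * f
                disp[v][1] += dy / d * f
            for u in adj[v]:  # attraction: along edges only
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                f = d * d / ideal
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        for v in adj:
            length = math.hypot(*disp[v]) or 1e-9
            move = min(length, step)
            pos[v][0] += disp[v][0] / length * move
            pos[v][1] += disp[v][1] / length * move
        step *= 0.95
    return pos


# Hypothetical toy graph: a path 0-1-2-3-4 given as adjacency lists.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
friends_of_0 = k_hop_neighbors(path, 0, 2)
layout = approximate_layout(path, k=2)
```

Restricting repulsion to `near[v]` replaces the all-pairs repulsion loop with one whose cost depends on neighbourhood size, which is where the speed-up in this kind of approximation would come from.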

Application of large scale gene expression profiles in anticancer drug development

Yang LIU, Hui BAI, Xiaochen BO

2016, 2(5): 22-31. doi:10.11959/j.issn.2096-0271.2016051

As an important part of functional genomics, gene expression profiles play an important role in many fields, such as biology, medicine and drug discovery. With the advent of precision medicine, the integration of multi-omics data, including gene expression profile data, for personalized health care is becoming the trend of future medicine. Advances in anticancer drug development were introduced first, and then methods for perturbational gene expression profile analysis were illustrated, especially the connectivity map idea. Finally, applications of these data in anticancer drug development were summarized.

Cross-OSN user modeling in big data

Liancheng XIANG, Jitao SANG, Changsheng XU

2016, 2(5): 32-42. doi:10.11959/j.issn.2096-0271.2016052

Social media variety mainly concerns the content created and consumed in different online social networks (OSNs). Analyzing cross-OSN data from the perspective of “variety” helps realize the potential of big data by jointly analyzing and exploiting multi-sourced and multi-modal data. The problem of exploiting cross-OSN data for comprehensive user modeling, which is fundamental in the context of multi-sourced social media big data, was addressed. Inspired by the fact that cross-OSN data share a unique user space, using users as a bridge for mining associations between OSNs was proposed. The discovered association patterns were then used for cross-OSN user demographic attribute inference and cross-OSN interest modeling, respectively, which can be further applied to personalized social media services.

Intelligence analysis and application for satellite imagery of big data

Jinfang ZHANG, Xiaohui HU, Hui ZHANG, Rui WANG, Haichang LI

2016, 2(5): 43-53. doi:10.11959/j.issn.2096-0271.2016053

Imaging capability has greatly improved with the development of remote sensing technology, image information extraction based on deep learning has risen to a new level, and cloud computing has made it possible to process big satellite imagery data. These three technologies have spurred research into the expected commercial and military value of such data; many research institutions have joined the competition, and a large amount of venture capital has been attracted. The analysis of potential value and the applications based on big satellite imagery data were summarized, and the next possible technological breakthroughs and future directions of development were presented.

Airport and flight recognition on optical remote sensing data by deep learning

Xin NIU, Yong DOU, Peng ZHANG, Yushe CAO

2016, 2(5): 54-67. doi:10.11959/j.issn.2096-0271.2016054

Airport and flight recognition are typical remote sensing applications. Deep learning techniques for airport and flight recognition on big optical remote sensing data were studied. To this end, an airport and flight recognition system for optical remote sensing data that responds within seconds was built. To obtain an effective deep learning model with limited labeled samples, a transfer learning approach was employed. Prior knowledge was also explored for efficient object proposal. To achieve real-time performance for recognition with “large region and small targets”, a cascade framework of deep networks was proposed. Experimental results show that the proposed deep learning approaches significantly improve recognition accuracy with response times of seconds.

Applications of location-based big data for auto insurance risk control

Cheng ZHANG, Chen ZHAO

2016, 2(5): 79-87. doi:10.11959/j.issn.2096-0271.2016056

Since the authorities began a deep reform of commercial auto insurance, Chinese insurance companies have gained more autonomy in product pricing and are required to improve their capabilities in fine-grained risk management and risk analysis. From the perspective of location-based big data, applications of mobile positioning methods in auto insurance risk control were discussed. An implementation path and service recommendations for insurance underwriting and claim settlement, based on grid-based methods of risk assessment and calculation, were given.

Cultivating big data talent by combining various disciplines and utilizing multiple resources

Libo WU

2016, 2(5): 89-94. doi:10.11959/j.issn.2096-0271.2016057

The rapid development of big data science and technology poses great challenges to talent cultivation. Data scientists should be able to coax commercial value from tremendous amounts of data, which requires multi-disciplinary training. Based on the progress of and demand for cultivating big data talent, patterns for cultivating big data talent by combining various disciplines and utilizing multiple resources were discussed. In these patterns, a core knowledge system should be built and a solid basis for innovation capability should be laid. Talent cultivation also needs data resources. Therefore, breaking through the barriers to opening data held by government, private firms and universities, and providing real-world data, is essential for talent education. To strengthen interdisciplinary features, talent cultivation should remove the barriers between disciplines and build systematic standards across teaching, internship, graduation and evaluation.

Data science:the demand and development of talents

Keith C. C. CHAN, Tiantian HE

2016, 2(5): 95-106. doi:10.11959/j.issn.2096-0271.2016058

Information technology has entered the era of big data. As talents who can discover the knowledge in big data, data scientists are in tremendous demand. Through a careful survey of the current job markets in the US and China, the differences between data scientists and data analysts in job nature, entry requirements and even remuneration were presented. The gap between the kind of talent required for these jobs and the kind of graduates that universities are producing was then revealed. After a gap analysis, views on the kind of data science programs that may best develop talent for the current and future job market were presented.

On prerequisites for cultivating big data talents

Yangyong ZHU, Yun XIONG

2016, 2(5): 107-114. doi:10.11959/j.issn.2096-0271.2016059

The shortage of big data talent has become a global concern that restricts the development of big data. Cultivating big data talent has received wide attention, and a growing number of universities have launched big data talent training plans. It is important and necessary to discuss the prerequisites for cultivating big data talent, including qualified teachers, data resources and computing capabilities. Building a team of qualified teachers is the first prerequisite: it is impossible to discuss cultivating talent without qualified teachers. However, this is a contradiction, because a shortage of big data talent means a shortage of qualified teachers for big data training. The second is data resources, especially big data: without data, big data talent training makes no sense. Correspondingly, the third is computing capability for big data. These three main prerequisites were discussed, and two solutions were presented: one is to develop an innovative, transdisciplinary talent training pattern to address the shortage of qualified teachers; the other is to establish a big data arena for innovation and advancement to supply data resources and computing capability.

Ecological system operation theory

2016, 2(5): 115-119. doi:10.11959/j.issn.2096-0271.2016060
