Big Data Research ›› 2019, Vol. 5 ›› Issue (2): 89-103. doi: 10.11959/j.issn.2096-0271.2019016

• STUDY •

A scalable CPU-MIC coordinated drug-finding tool by frequent subgraph mining

Shaoliang PENG¹, Qi NIU¹, Kenli LI¹, Quan ZOU²

  1. College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
  2. Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  • Online: 2019-03-15   Published: 2019-04-04
  • Supported by:
    National Key Research and Development Program of China (2017YFB0202602, 2018YFC0910405, 2017YFC1311003, 2016YFC1302500, 2016YFB0200400, 2017YFB0202104); National Natural Science Foundation of China (61772543, U1435222, 61625202, 61272056)

Abstract:

Frequent subgraph mining is an important problem in many practical fields. Because the computation is intensive, the graph datasets are large, and the result sets are voluminous, existing solutions cannot meet time requirements, and efficiency remains the main challenge. This paper proposes cmFSM, a parallel-accelerated frequent subgraph mining tool. cmFSM applies parallel optimization at three levels: fine-grained OpenMP parallelization on a single node, multi-node multi-process parallelization, and CPU-MIC collaborative parallelization. On a single node, cmFSM is twice as fast as the best CPU-based algorithm, and it scales across multiple nodes. The results show that even with only a few parallel computing resources, cmFSM significantly outperforms the most advanced algorithms available, which demonstrates the effectiveness of the proposed tool in the field of bioinformatics. In future work, we will continue to improve the scalability of the multi-node solution.

Key words: frequent subgraph mining, bioinformatics, parallel algorithm, memory constraints, isomorphism, many integrated core

