Chinese Journal of Network and Information Security ›› 2016, Vol. 2 ›› Issue (8): 74-83. doi: 10.11959/j.issn.2096-109x.2016.00076

• Academic paper •

Study and optimization on system architectures of Larbin

Xuan WANG1,2, Yi-xia HUO1, Yun-fei CI1, Guo-zhen SHI1, Li LI1,2

  1. School of Information Security, Beijing Electronic Science and Technology Institute, Beijing 100070, China
  2. School of Computer, Xidian University, Xi'an 710000, China
  • Revised: 2016-08-02 Online: 2016-08-01 Published: 2017-06-04
  • Supported by:
    The National Key Research Program of China (2016YFB0800304); The Natural Science Foundation of Beijing (4152048); The Natural Science Foundation of Jiangsu Province (BK20150787); 2016 Spring Buds Project of Beijing Electronic Science & Technology Institute (2016CL04)

Abstract:

The Web crawler is an important part of a search engine, and its performance directly affects the engine's accuracy and timeliness. Larbin is an efficient, simple open-source crawler with a relatively complete set of functions. Several typical open-source crawlers were first introduced and compared along multiple dimensions. The system architecture and working mechanism of Larbin were then described in detail. Shortcomings in its program structure and processing flow were pointed out, and improvements were proposed. Experimental results show that the improved program is better in speed and performance.

Key words: search engine, Web crawler, Larbin, open source, optimization

