Telecommunications Science ›› 2011, Vol. 27 ›› Issue (11): 51-56. doi: 10.3969/j.issn.1000-0801.2011.11.015

• Cloud Computing Column •

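Research and Implementation of a Distributed Real-Time Search Model Based on Solr

Weiwei Fu1, Renfa Li1, Yufeng Liu1, Songli Huang2

  1. 1 Embedded Systems and Networking Laboratory, Hunan University, Changsha 410082
    2 Taobao (China) Co., Ltd., Hangzhou 315100
  • Publication date: 2011-11-15 Release date: 2011-11-15
  • Funding:
    National Natural Science Foundation of China; "He-Gao-Ji" (Core Electronic Devices, High-end Generic Chips and Basic Software) Fund Project of the Ministry of Industry and Information Technology of China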

Study and Implementation of Distributed Real-Time Search Engine Model Based on Solr

Weiwei Fu1, Renfa Li1, Yufeng Liu1, Songli Huang2

  1. 1 Embedded Systems & Networking Laboratory of Hunan University, Changsha 410082, China
    2 Taobao (China) Limited Liability Company, Hangzhou 315000, China
  • Online: 2011-11-15 Published: 2011-11-15

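Abstract:

Real-time search has become one of the hot issues in information retrieval. In a distributed environment, traditional search engines cannot guarantee real-time response and data disaster tolerance under large data volumes and high concurrency. This paper proposes a distributed real-time search model based on Solr and analyzes its implementation principles. The model combines an in-memory index with an on-disk index so that index information becomes visible in real time, introduces a CommitLog to provide disaster tolerance for the in-memory index data, and uses a Master/Slave model to ensure the availability of the search service. The model was ultimately deployed in a real production system, and the practical results demonstrate its feasibility.

Keywords: information retrieval, distributed real-time search model, Solr, data disaster tolerance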

Abstract:

Real-time search is a hot topic in information retrieval research. In a distributed environment with large data volumes and high concurrency, traditional search engines cannot guarantee real-time response or data disaster tolerance. This paper proposes a distributed real-time search engine model based on Solr and explains its principles and procedures in detail. The model combines an in-memory index with an on-disk index so that newly indexed information can be presented in real time. A CommitLog is introduced to provide disaster tolerance for the in-memory index data, and a Master/Slave model ensures high availability of the search service. Practice has proved the feasibility of the model.

Key words: information retrieval, distributed real-time search engine model, Solr, data disaster tolerance
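
The interplay of in-memory index, on-disk index, and CommitLog described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not use Solr's API: the class `RealtimeIndex`, the file names `commit.log` and `segment.json`, and the whitespace tokenizer are all hypothetical, chosen only to show one plausible way the three pieces fit together. The Master/Slave replication layer is not covered.

```python
import json
import os
import tempfile
from collections import defaultdict


class RealtimeIndex:
    """Toy inverted index (hypothetical): memory segment + disk segment + commit log."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "commit.log")       # assumed file name
        self.segment_path = os.path.join(directory, "segment.json")  # assumed file name
        self.memory = defaultdict(set)  # term -> doc ids not yet flushed to disk
        self.disk = defaultdict(set)    # term -> doc ids already persisted
        self._load_segment()
        self._replay_log()

    def _load_segment(self):
        # Load the persisted segment, if one exists.
        if os.path.exists(self.segment_path):
            with open(self.segment_path, encoding="utf-8") as f:
                for term, ids in json.load(f).items():
                    self.disk[term] = set(ids)

    def _replay_log(self):
        # Rebuild the in-memory index from entries logged before a restart.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    entry = json.loads(line)
                    self._index(entry["id"], entry["text"])

    def _index(self, doc_id, text):
        for term in text.lower().split():
            self.memory[term].add(doc_id)

    def add(self, doc_id, text):
        # Persist the document to the log first, then make it searchable.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"id": doc_id, "text": text}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._index(doc_id, text)

    def flush(self):
        # Merge the memory index into the disk segment, then truncate the log.
        for term, ids in self.memory.items():
            self.disk[term] |= ids
        tmp = self.segment_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump({t: sorted(ids) for t, ids in self.disk.items()}, f)
        os.replace(tmp, self.segment_path)
        open(self.log_path, "w").close()
        self.memory.clear()

    def search(self, term):
        # A query sees the union of both indexes, so new documents are visible at once.
        term = term.lower()
        return sorted(self.memory.get(term, set()) | self.disk.get(term, set()))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        idx = RealtimeIndex(d)
        idx.add(1, "solr real-time search")
        idx.flush()                       # document 1 now lives on disk
        idx.add(2, "distributed search")  # document 2 lives only in memory and the log
        print(idx.search("search"))
        recovered = RealtimeIndex(d)      # simulate a restart before the next flush
        print(recovered.search("search"))
```

In this sketch a document is searchable as soon as `add` returns, because `search` consults the memory index as well as the disk segment. If the process stops before `flush`, a new instance replays the log and recovers the unflushed documents. Posting sets make replay idempotent, so a crash between writing the segment and truncating the log does no harm here.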
