Journal on Communications ›› 2015, Vol. 36 ›› Issue (12): 178-189. doi: 10.11959/j.issn.1000-436x.2015311
Sean WANGX, Xiao-qing ZHENG, Yang-hua XIAO (王晓阳, 郑骁庆, 肖仰华)
Online:
2015-12-25
Published:
2017-07-17
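Abstract:
As the network search space expands from the Internet to a ubiquitous cyberspace that interconnects people, machines, and things, and with the arrival of the big-data era, traditional search engines can no longer meet current needs. A new generation of search engine technology, big search (also called smart search), has emerged in response. This paper therefore discusses entity and relation modeling and mining, one of the key technologies needed to realize big search, together with the related design ideas and implementation techniques.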
Cite this article: Sean WANGX, Xiao-qing ZHENG, Yang-hua XIAO. Entity-relation modeling and discovery for smart search[J]. Journal on Communications, 2015, 36(12): 178-189. (Original title: 智慧搜索中的实体与关联关系建模与挖掘; original journal name: 通信学报.)