网络与信息安全学报 ›› 2024, Vol. 10 ›› Issue (1): 79-90.doi: 10.11959/j.issn.2096-109x.2024013

• 学术论文 • 上一篇    

基于多查询的社交网络关键节点挖掘算法

辛国栋, 朱滕威, 黄俊恒, 魏家扬, 刘润萱, 王巍   

  1. 哈尔滨工业大学(威海)计算机科学与技术学院,山东 威海 264209
  • 修回日期:2023-11-21 出版日期:2024-02-01 发布日期:2024-02-01
  • 作者简介:辛国栋(1976− ),男,山东青岛人,哈尔滨工业大学(威海)讲师,主要研究方向为网络信息挖掘、网络安全、机器学习
    朱滕威(1998− ),男,山东枣庄人,哈尔滨工业大学(威海)工程师,主要研究方向为复杂网络分析
    黄俊恒(1966− ),男,河南新乡人,哈尔滨工业大学(威海)副教授,主要研究方向为社交网络、数据挖掘
    魏家扬(2000− ),男,江西南昌人,哈尔滨工业大学(威海)硕士生,主要研究方向为复杂网络分析、GNN
    刘润萱(2003− ),男,山东德州人,主要研究方向为社交网络和自然语言处理
    王巍(1975− ),女,黑龙江齐齐哈尔人,哈尔滨工业大学(威海)副教授,主要研究方向为社交网络,数据挖掘和自然语言处理
  • 基金资助:
    国家自然科学基金(62272129);国家重点研发计划(2021YFB2012400);中央高校基本科研业务费专项(NSRIF.2020098)

Multi-query based key node mining algorithm for social networks

Guodong XIN, Tengwei ZHU, Junheng HUANG, Jiayang Wei, Runxuan Liu, Wei WANG   

  1. School of Computer Science and Technology, Harbin Institute of Technology(Weihai), Weihai 264209, China
  • Revised:2023-11-21 Online:2024-02-01 Published:2024-02-01
  • Supported by:
    The National Natural Science Foundation of China(62272129);The National Key R&D Program of Chi-na(2021YFB2012400);The Fundamental Research Funds for the Central Universities(NSRIF.2020098)

摘要:

关键节点挖掘是复杂网络领域的研究重点和热点。针对社交网络中关键嫌疑人挖掘问题,提出基于多查询的社交网络关键节点挖掘算法。该算法将已知嫌疑人作为查询节点,提取其所在的局部拓扑结构,并计算局部拓扑结构中非查询节点的关键程度,从中选择关键程度较高的节点进行推荐。针对现有方法中关键节点计算复杂度高、已知查询节点信息难以有效利用的问题,提出一个两阶段的基于多查询的社交网络关键节点挖掘算法,整合多查询节点的局部拓扑信息和全局节点聚合特征信息,将计算范围从全局缩减到局部,进而对相关节点的关键程度进行量化。具体而言,利用带重启策略的随机游走算法获得多个查询节点的局部拓扑结构;为了得到节点的嵌入向量,基于 graphsage 模型构建一种无监督的图神经网络模型,该模型结合节点的自身特征和邻居聚合特征来生成嵌入向量,从而为算法框架的相似度计算提供信息输入。基于与查询节点特征的相似性,衡量局部拓扑中节点的关键程度。实验结果显示,所提算法在时间效率和结果有效性方面均优于传统关键节点挖掘算法。

关键词: 社交网络, 随机游走, 图神经网络, 节点嵌入向量, 关键节点

Abstract:

Mining key nodes in complex networks has been a hotly debated topic as it played an important role in solving real-world problems.However, the existing key node mining algorithms focused on finding key nodes from a global perspective.This approach became problematic for large-scale social networks due to the unacceptable storage and computing resource overhead and the inability to utilize known query node information.A key node mining algorithm based on multiple query nodes was proposed to address the issue of key suspect mining.In this method, the known suspects were treated as query nodes, and the local topology was extracted.By calculating the critical degree of non-query nodes in the local topology, nodes with higher critical degrees were selected for recommendation.Aiming to overcome the high computational complexity of key node mining and the difficulty of effectively utilizing known query node information in existing methods, a two-stage key node mining algorithm based on multi-query was proposed to integrate the local topology information and the global node aggregation feature information of multiple query nodes.It reduced the calculation range from global to local and quantified the criticality of related nodes.Specifically, the local topology of multiple query nodes was obtained using the random walk algorithm with restart strategy.An unsupervised graph neural network model was constructed based on the graphsage model to obtain the embedding vector of nodes.The model combined the unique characteristics of nodes with the aggregation characteristics of neighbors to generate the embedding vector, providing input for similarity calculations in the algorithm framework.Finally, the criticality of nodes in the local topology was measured based on their similarity to the features of the query nodes.Experimental results demonstrated that the proposed algorithm outperformed traditional key node mining algorithms in terms of time efficiency and result effectiveness.

Key words: social network, random walk, graph neural network, node embedding vector, key node

中图分类号: 

No Suggested Reading articles found!