Telecommunications Science ›› 2023, Vol. 39 ›› Issue (8): 58-68. doi: 10.11959/j.issn.1000-0801.2023154

• Research and Development •

A social media geolocation method based on comparative learning

Yongchang XU1, Shiduo HUANG2, Haojun AI1

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
  2. Internet Public Opinion Research Center of Wuhan, Wuhan 430014, China
  • Revised: 2023-08-07 Online: 2023-08-01 Published: 2023-08-01
  • About the authors: Yongchang XU (1998- ), male, master's student at the School of Cyber Science and Engineering, Wuhan University. His main research interest is pervasive computing.
    Shiduo HUANG (1965- ), male, associate research fellow at the Internet Public Opinion Research Center of Wuhan. His main research interests include online public opinion and social media analysis.
    Haojun AI (1972- ), male, PhD, associate professor at the School of Cyber Science and Engineering, Wuhan University. His main research interests are pervasive computing and indoor positioning.
  • Supported by:
    The National Natural Science Foundation of China (61971316)

Abstract:

Previous work on text-based social media geolocation has focused on mapping the semantic space of text to geographic space, ignoring both the semantic correlation between posts and the distance correlation between geographic locations. To exploit these correlations, mCLF, a new unsupervised multi-level contrastive learning framework, was proposed, with three contrastive learning modules: a semantic learning module, a location learning module, and a cross-level learning module. A Transformer encoder first produces the semantic representation of each post. Unsupervised contrastive learning then pulls together the semantic representations and the location representations of posts from nearby locations. Finally, the model is fine-tuned with supervision to output a geographic location by classification or regression. Experiments on four datasets against five baseline models show that the framework effectively improves the accuracy of social media geolocation.

Key words: social media, geolocation, contrastive learning, information mining, Transformer
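
The core idea in the abstract, pulling together the representations of posts from nearby locations, can be illustrated with a small contrastive loss. The following is a minimal sketch and not the mCLF loss itself, which the abstract does not specify. It assumes an InfoNCE-style objective in which two posts form a positive pair when their coordinates lie within a radius. The function names, the 50 km radius, the temperature of 0.1, and the toy embeddings are all hypothetical choices made for illustration.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def location_contrastive_loss(embeddings, coords, radius_km=50.0, temperature=0.1):
    """InfoNCE-style loss; positives are posts within radius_km of the anchor.

    Assumption: radius_km and temperature are illustrative, not values from the paper.
    """
    n = len(embeddings)
    losses = []
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and haversine_km(coords[i], coords[j]) <= radius_km]
        if not positives:
            continue  # an anchor without a nearby post contributes nothing
        # Temperature-scaled similarity of the anchor to every other post.
        logits = {k: cosine(embeddings[i], embeddings[k]) / temperature
                  for k in range(n) if k != i}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability assigned to the positives.
        losses.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: two posts in Wuhan and one in Beijing (approximate coordinates).
coords = [(30.59, 114.31), (30.52, 114.36), (39.90, 116.41)]
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]     # the two Wuhan posts embed close together
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # a Wuhan post embeds next to the Beijing post

loss_aligned = location_contrastive_loss(aligned, coords)
loss_misaligned = location_contrastive_loss(misaligned, coords)
```

In this toy setup the loss is lower when the embeddings of the two Wuhan posts are close and higher when one of them sits next to the Beijing post, which is the behavior such an objective is meant to reward. In a real system the embeddings would come from the Transformer encoder and the loss would be computed on batches with an autodiff library.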

