A social media geolocation prediction method based on multimodal fusion

Telecommunications Science, 2023, Vol. 39, Issue 9: 111-121. doi: 10.11959/j.issn.1000-0801.2023183

• Research and Development •

Shiduo HUANG¹, Yongchang XU², Haojun AI²

¹ Wuhan Internet Public Opinion Research Center, Wuhan 430014, China
² School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China

Revised: 2023-09-13  Online: 2023-08-01  Published: 2023-08-01
About the authors: Shiduo HUANG (1965- ), male, associate research fellow and director of the Wuhan Internet Public Opinion Research Center. His main research interests are online public opinion and social media analysis.
Yongchang XU (1998- ), male, master's student at the School of Cyber Science and Engineering, Wuhan University. His main research interest is pervasive computing.
Haojun AI (1972- ), male, Ph.D., associate professor at the School of Cyber Science and Engineering, Wuhan University. His main research interests are pervasive computing and indoor positioning.
Supported by:
The National Natural Science Foundation of China (61971316)

Abstract

Mining the geographical information in social media text reveals its underlying spatial relationships. A geolocation prediction method for social media text based on multimodal fusion is proposed. Images retrieved using the text serve as augmented data, and an image-text dataset is constructed to improve the accuracy of geolocation prediction. The multimodal fusion model uses separate image and text channels to extract geographical information from each modality. An image-text matching module is also introduced to denoise the image-text pairs and address the image-text mismatch problem. In geolocation prediction experiments on the Geotext dataset, the proposed method reduces the median error distance by 18.8% and the mean error distance by 4.5% compared with the baseline model.

Key words: social media, geolocation, multimodal fusion, information mining
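
The abstract reports results as median and mean error distance. As a rough illustration of how such metrics can be computed, the sketch below measures the distance between predicted and true coordinates and summarizes it. It assumes the haversine great-circle formula and a mean Earth radius of 6371 km; this page does not state which distance formula the authors used, and the function names `haversine_km` and `error_summary` are hypothetical.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def error_summary(predicted, actual):
    """Return (median_km, mean_km) over paired lists of (lat, lon) tuples."""
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predicted, actual)]
    return median(errors), mean(errors)
```

The median is less sensitive than the mean to a few posts predicted on the wrong side of the country, which is one reason both figures are usually reported together.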

CLC number: TP391.1

Shiduo HUANG, Yongchang XU, Haojun AI. A social media geolocation prediction method based on multimodal fusion[J]. Telecommunications Science, 2023, 39(9): 111-121.



Dataset	Training samples	Validation samples	Test samples	Tokens per post
Geotext dataset	302 064	37 758	37 758	12
Geotext image-text pair dataset	293 003	36 029	37 580	13

Method	Geotext dataset Median/km	Geotext dataset Mean/km	Geotext image-text pair dataset Median/km	Geotext image-text pair dataset Mean/km
MLP	389	844	—	—
MDN	412	865	—	—
LR	397	880	—	—
Sparse-Coding	425	581	—	—
geoBERT-cls	380.08	578.89	—	—
VIT+BERT	—	—	308.54	552.89
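
The percentage reductions quoted in the abstract can be checked against the comparison table above. The short calculation below takes geoBERT-cls as the baseline, which is an inference of this sketch: the page does not name the baseline explicitly, but these are the rows that reproduce the reported figures. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, proposed):
    """Fractional reduction of `proposed` relative to `baseline`."""
    return (baseline - proposed) / baseline


# Values copied from the comparison table (km).
baseline_median, baseline_mean = 380.08, 578.89  # geoBERT-cls, Geotext dataset
fused_median, fused_mean = 308.54, 552.89        # VIT+BERT, image-text pairs

median_drop = relative_reduction(baseline_median, fused_median)
mean_drop = relative_reduction(baseline_mean, fused_mean)
print(f"median: {median_drop:.1%}, mean: {mean_drop:.1%}")
# prints "median: 18.8%, mean: 4.5%"
```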

Method	Median/km	Mean/km
VIT+BERT	308.54	552.89
VIT+BERT-ITM	415.53	607.01

Image	Text	Match result	Image-text similarity
(image not reproduced)	The sky is blue here, waiting for you in Hollywood	1	0.81
(image not reproduced)	The night of New York is intoxicating, hoping to live here forever	0	0.22
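
The two example rows above suggest how an image-text matching step can denoise the dataset: pairs with a low similarity score are treated as mismatched and dropped. The sketch below illustrates that idea only. It assumes the similarity scores have already been computed by some matching model, and the 0.5 threshold, the `ImageTextPair` class, the image identifiers, and `filter_matched` are all hypothetical; the page does not state the decision rule the authors used.

```python
from dataclasses import dataclass


@dataclass
class ImageTextPair:
    image_id: str      # hypothetical identifier for the retrieved image
    text: str          # the social media post
    similarity: float  # precomputed image-text similarity score


def filter_matched(pairs, threshold=0.5):
    """Keep only pairs whose similarity reaches the (assumed) threshold."""
    return [p for p in pairs if p.similarity >= threshold]


pairs = [
    ImageTextPair("img_a", "The sky is blue here, waiting for you in Hollywood", 0.81),
    ImageTextPair("img_b", "The night of New York is intoxicating, "
                           "hoping to live here forever", 0.22),
]
kept = filter_matched(pairs)
```

With the assumed threshold, only the first pair survives, which agrees with the match results of 1 and 0 shown in the table.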

Method	Parameters/10^6	Inference speed/(posts·s^-1)	Median/km	Mean/km
RESNET18+BiLSTM	37.96	116.53	421.63	672.44
VIT+BiLSTM	112.78	114.44	413.49	649.07
RESNET18+BERT	121.11	86.04	344.25	567.10
VIT+BERT	195.53	69.50	308.54	552.89

Method	Median/km	Mean/km
VIT+BERT	307.38	552.91
Without cross-attention	310.37	553.41
Without image channel	381.51	580.17
Without image-text matching module	409.37	609.21