Deep image semantic communication model for 6G

Journal on Communications, 2023, Vol. 44, Issue (3): 198-208. doi: 10.11959/j.issn.1000-436x.2023050

Feibo JIANG¹, Yubo PENG¹, Li DONG²,³

¹ School of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
² Changsha Social Laboratory of Artificial Intelligence, Hunan University of Technology and Business, Changsha 410205, China
³ Xiangjiang Laboratory, Changsha 410205, China

Revised: 2023-02-09 Online: 2023-03-25 Published: 2023-03-01
About the authors:
Feibo JIANG (born 1982), male, from Zhuzhou, Hunan; Ph.D.; associate professor and master's supervisor at Hunan Normal University. His research interests include deep learning and the Internet of things.
Yubo PENG (born 1996), male, from Chongqing; master's student at Hunan Normal University. His research interests are semantic communication and federated learning.
Li DONG (born 1982), female, from Changsha, Hunan; Ph.D.; lecturer and master's supervisor at Hunan University of Technology and Business. Her research interests include deep learning and the Internet of things.
Supported by:
The National Natural Science Foundation of China (41904127); The National Natural Science Foundation of China (41604117); Open Project of Xiangjiang Laboratory (6109408DL001); Project of Outstanding Youth in Scientific Research of Hunan Provincial Department of Education (7103408DL001); Scientific Research Fund of Hunan Provincial Education Department (21A0372)

Abstract

Current semantic communication models still leave room for improvement in processing image data, specifically in effective image semantic encoding and decoding, efficient semantic model training, and accurate image semantic evaluation. To address this, a deep image semantic communication (DeepISC) model is proposed. First, a vision transformer-based autoencoder (ViTA) network is used to achieve high-quality image semantic encoding and decoding. Next, an autoencoder performs channel encoding and decoding to ensure that the semantics are transmitted over the channel. Then, a dual-network architecture consisting of a discriminator network (DSN) and ViTA is trained collaboratively to improve the semantic accuracy of the reconstructed image. Finally, different image semantic evaluation metrics are proposed for different downstream vision tasks. Simulation results show that, compared with other schemes, DeepISC restores the semantic features of the transmitted image more effectively, so that the reconstructed image yields the same or similar semantic results as the original image in each downstream task.

Key words: artificial intelligence, 6G, semantic communication, image recognition, feature extraction
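
The abstract does not spell out which evaluation metrics are used for each downstream task. As a rough illustration of what task-level semantic evaluation could look like, the sketch below compares downstream results on the original and reconstructed images: a label-agreement rate for classification and an intersection-over-union score for detection boxes. The function names `label_agreement` and `box_iou`, and the choice of these two metrics, are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence, Tuple

# Hypothetical box format (an assumption): (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def label_agreement(orig_labels: Sequence[int], recon_labels: Sequence[int]) -> float:
    """Fraction of images whose predicted class is unchanged after transmission."""
    if len(orig_labels) != len(recon_labels):
        raise ValueError("label sequences must have the same length")
    if not orig_labels:
        raise ValueError("label sequences must be non-empty")
    matches = sum(a == b for a, b in zip(orig_labels, recon_labels))
    return matches / len(orig_labels)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

In this sketch, a classifier or detector would be run once on the original image and once on the reconstructed image, and the two sets of outputs passed to these functions; values close to 1 would indicate that the reconstruction preserved the task-relevant semantics.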

CLC number: TN929.5

Feibo JIANG, Yubo PENG, Li DONG. Deep image semantic communication model for 6G[J]. Journal on Communications, 2023, 44(3): 198-208.



