Intelligent route planning method with jointing topology control of UAV swarm

doi:10.11959/j.issn.1000-436x.2024032

Journal on Communications, 2024, Vol. 45, Issue 2: 137-149.

• Academic paper •

Zhi YAN, Zhenglun YI, Bo OUYANG, Yaonan WANG

College of Electrical and Information Engineering, Hunan University, Changsha 410082, China

Revised: 2023-11-07 Online: 2024-02-01 Published: 2024-02-01
About the authors:
Zhi YAN (1986-), male, born in Hengdong, Hunan, Ph.D., is an associate professor and doctoral supervisor at Hunan University. His main research interests are robot communication and networking.
Zhenglun YI (1998-), male, born in Huizhou, Guangdong, is a master's student at Hunan University. His main research interests are UAV swarm communication and networking.
Bo OUYANG (1987-), male, born in Leiyang, Hunan, Ph.D., is an associate professor and doctoral supervisor at Hunan University. His main research interests are machine learning, multi-robot systems, complex system analysis and control, and wireless communication technology.
Yaonan WANG (1957-), male, born in Longling, Yunnan, Ph.D., is a professor and doctoral supervisor at Hunan University. His main research interests are robot perception and control technology.
Supported by:
The National Natural Science Foundation of China (62293511); Special Funding Support for the Construction of Innovative Provinces in Hunan Province (2021GK1010); The State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China (SKLNST-2021-2-03)

Abstract:

Existing routing protocols for UAV swarms adapt poorly to topology changes, which leads to packet retransmissions, energy holes, and high delay, and seriously degrades data routing performance. Considering the coupling between swarm topology and routing, an intelligent route planning with jointing topology control (IRPJTC) method was proposed. IRPJTC consists of two parts: virtual force-based adaptive topology control (VFATC) and a PPO-based geographic routing protocol (PPO-GRP). In VFATC, each UAV adaptively adjusts its distance to its neighbors according to their motion state information, which keeps the links in the swarm stably connected. PPO-GRP then takes the link stability metric from VFATC, combines it with end-to-end delay and energy consumption metrics to design a multi-objective reward function, and trains the routing policy with the proximal policy optimization (PPO) algorithm from deep reinforcement learning. Simulation results show that, compared with existing routing methods, IRPJTC reduces end-to-end delay by 12.11% and swarm energy consumption by 4.56% while maintaining the packet delivery success rate, and balances energy consumption better.

Key words: UAV swarm, routing protocol, topology control, proximal policy optimization, deep reinforcement learning
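
The abstract says that PPO-GRP combines link stability, end-to-end delay, and energy consumption into a multi-objective reward, but this page does not give the formula. The sketch below is only one plausible shape for such a reward: a weighted sum over metrics that are assumed to be normalised to [0, 1]. The class `HopMetrics`, the function `hop_reward`, and all three weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class HopMetrics:
    """Per-hop observations, each assumed to be normalised to [0, 1]."""
    link_stability: float  # 1.0 means the most stable link
    delay: float           # 1.0 means the worst acceptable one-hop delay
    energy: float          # 1.0 means the largest acceptable one-hop energy cost


def hop_reward(m: HopMetrics, w_stab: float = 0.4, w_delay: float = 0.3,
               w_energy: float = 0.3) -> float:
    """Weighted-sum reward: favour stable links, penalise delay and energy.

    The weights are hypothetical placeholders, not values from the paper.
    """
    for name, value in vars(m).items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return w_stab * m.link_stability - w_delay * m.delay - w_energy * m.energy


if __name__ == "__main__":
    stable_hop = HopMetrics(link_stability=0.9, delay=0.2, energy=0.3)
    fragile_hop = HopMetrics(link_stability=0.2, delay=0.2, energy=0.3)
    print(hop_reward(stable_hop), hop_reward(fragile_hop))
```

With equal delay and energy cost, the hop over the more stable link receives the larger reward, which is the behaviour the abstract attributes to the link stability term.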

CLC number: TN915.04

Zhi YAN, Zhenglun YI, Bo OUYANG, Yaonan WANG. Intelligent route planning method with jointing topology control of UAV swarm[J]. Journal on Communications, 2024, 45(2): 137-149.

Figures and tables (14)

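Table 1 System model parameters

Parameter	Meaning
$T = \{t_{1}, t_{2}, \dots, t_{m - 1}, t_{m}\}$	Total area-surveillance mission time and its division into time slots
$\tau$	Length of each time slot
$U = \{u_{1}, u_{2}, \dots, u_{k}\}$	Set of UAVs
$\vec{p}_{i}(t_{n})$	Position vector of UAV $u_{i}$ in time slot $t_{n}$
$\vec{p}_{BS}$	Position coordinates of the ground station
$\vec{v}_{i}(t_{n})$	Velocity vector of UAV $u_{i}$ in time slot $t_{n}$
$\vec{a}_{i}(t_{n})$	Acceleration vector of UAV $u_{i}$ in time slot $t_{n}$
$R_{r}$	Safe-distance constraint range (repulsion range)
$R_{a}$	Maximum UAV transmission distance (attraction range)
$R_{BS}$	Communication range between a UAV and the ground station
$N_{i}$	Set of one-hop neighbors of $u_{i}$
$N_{i}^{a}$	Set of one-hop neighbors of $u_{i}$ within the attraction range
$N_{i}^{r}$	Set of one-hop neighbors of $u_{i}$ within the repulsion range
$d_{ij}(t_{n})$	Euclidean distance between $u_{i}$ and $u_{j}$ in time slot $t_{n}$
$f_{ij}^{Time}$	One-hop packet transmission time
$f_{ij}^{Speed}$	One-hop packet transmission speed
$LD_{ij}$	Link lifetime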
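
Table 1 separates a repulsion range $R_{r}$ from an attraction range $R_{a}$ and lists a link lifetime $LD_{ij}$, but the force law and the lifetime formula are not reproduced on this page. The sketch below illustrates the general idea under stated assumptions: a linear spring-like force that pushes neighbors apart inside $R_{r}$ and pulls them together between $R_{r}$ and $R_{a}$, and a link lifetime defined as the time until two UAVs moving at constant velocity drift farther apart than $R_{a}$. The function names, the gains `k_r` and `k_a`, and both formulas are illustrative assumptions, not the VFATC definitions. The range values are the ones given in the simulation parameter table.

```python
import math
from typing import Tuple

Vec = Tuple[float, ...]

R_R = 100.0  # repulsion range R_r in metres (simulation parameter table)
R_A = 150.0  # attraction range R_a in metres (simulation parameter table)


def _sub(a: Vec, b: Vec) -> Vec:
    return tuple(x - y for x, y in zip(a, b))


def virtual_force(p_i: Vec, p_j: Vec, k_r: float = 1.0, k_a: float = 1.0) -> Vec:
    """Force on UAV i caused by neighbor j (assumed linear spring model)."""
    delta = _sub(p_j, p_i)
    d = math.hypot(*delta)
    if d == 0.0 or d > R_A:
        return tuple(0.0 for _ in delta)     # coincident or not a neighbor
    unit = tuple(x / d for x in delta)       # points from i toward j
    if d < R_R:
        magnitude = -k_r * (R_R - d)         # too close: push away from j
    else:
        magnitude = k_a * (d - R_R)          # drifting apart: pull toward j
    return tuple(magnitude * u for u in unit)


def link_lifetime(p_i: Vec, v_i: Vec, p_j: Vec, v_j: Vec) -> float:
    """Seconds until |p_j - p_i| exceeds R_A, assuming constant velocities."""
    dp, dv = _sub(p_j, p_i), _sub(v_j, v_i)
    a = sum(x * x for x in dv)
    b = 2.0 * sum(x * y for x, y in zip(dp, dv))
    c = sum(x * x for x in dp) - R_A ** 2
    if c > 0.0:
        return 0.0                           # already out of range
    if a == 0.0:
        return math.inf                      # no relative motion
    # Larger root of a*t^2 + b*t + c = 0; it is non-negative because c <= 0.
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


if __name__ == "__main__":
    print(virtual_force((0.0, 0.0), (50.0, 0.0)))   # repulsive, points away from j
    print(virtual_force((0.0, 0.0), (120.0, 0.0)))  # attractive, points toward j
    print(link_lifetime((0.0, 0.0), (0.0, 0.0), (100.0, 0.0), (10.0, 0.0)))
```

In the last call the neighbor starts 100 m away and recedes at 10 m/s, so it crosses the 150 m limit after 5 s. A routing policy could use such an estimate as a link stability feature.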


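Table 3 Network parameters of PPO-GRP

Parameter	Meaning	Value
$c_{1}$	Weight of the critic (evaluation) network objective	-0.5
$c_{2}$	Weight of the policy entropy	-0.01
$\mho_{\max}$	Maximum standard deviation of the action probability distribution	0.6
$\mho_{\min}$	Minimum standard deviation of the action probability distribution	0.1
$\alpha_{\mho}$	Decay factor of the standard deviation of the action probability distribution	0.9995
Episodes	Number of simulation episodes	500
Buffer-Size	Size of buffer D	2000
$n_{update}$	Number of consecutive network updates	16
$\varepsilon$	PPO clipping parameter	0.1
$\gamma$	Discount factor for computing the expected reward	0.95
Policy network learning rate		0.00003
Critic (evaluation) network learning rate		0.0001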
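
Several entries in Table 3 map onto standard PPO building blocks. The sketch below shows the usual clipped surrogate term for a single sample with the clipping parameter 0.1, a discounted return with discount factor 0.95, and one plausible reading of the standard deviation schedule (multiplicative decay from 0.6 with a floor at 0.1). It uses the textbook forms of these quantities. Whether the paper applies the decay per update or per episode is not stated here, so the `step` argument is an assumption, and the weights $c_{1}$ and $c_{2}$ are left out because their sign convention cannot be confirmed from this page.

```python
from typing import List, Sequence

EPSILON = 0.1       # PPO clipping parameter (Table 3)
GAMMA = 0.95        # discount factor (Table 3)
STD_MAX = 0.6       # maximum action standard deviation (Table 3)
STD_MIN = 0.1       # minimum action standard deviation (Table 3)
STD_DECAY = 0.9995  # decay factor of the standard deviation (Table 3)


def clipped_surrogate(ratio: float, advantage: float, eps: float = EPSILON) -> float:
    """Standard PPO clipped objective for one sample (to be maximised)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


def discounted_returns(rewards: Sequence[float], gamma: float = GAMMA) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory."""
    running, out = 0.0, []
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    out.reverse()
    return out


def action_std(step: int) -> float:
    """Assumed schedule: exponential decay from STD_MAX, floored at STD_MIN."""
    return max(STD_MIN, STD_MAX * STD_DECAY ** step)


if __name__ == "__main__":
    print(clipped_surrogate(1.5, 2.0))   # ratio is clipped to 1.1
    print(clipped_surrogate(0.5, -1.0))  # pessimistic branch is kept
    print(discounted_returns([1.0, 1.0, 1.0]))
    print(action_std(0), action_std(4000))
```

Under this assumed schedule the standard deviation reaches its floor of 0.1 after roughly 3,600 decay steps, so exploration noise would shrink gradually rather than abruptly.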

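Parameter	Meaning	Value
$M$	Number of UAVs	30 to 80
$\tau$ /s	Length of each time slot	0.1
$\sigma$ /s	Hello packet broadcast interval	0.5
$\psi$	Path loss exponent	3
$SINR_{th}$	SINR threshold	1
$R_{a}$ /m	Maximum distance (attraction distance)	150
$R_{r}$ /m	Minimum distance (repulsion distance)	100
$B$ /MHz	Bandwidth	20
$E_{total}$ /J	Initial energy of each UAV	20000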
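
The simulation parameters above can be related to each other with a few textbook formulas. The sketch below assumes a power-law path loss with exponent 3 and the Shannon capacity formula, and it treats the SINR threshold of 1 as a linear ratio rather than a value in dB. None of these modelling choices is confirmed by this page, and the function names are hypothetical.

```python
import math

BANDWIDTH_HZ = 20e6      # B = 20 MHz
SINR_THRESHOLD = 1.0     # assumed to be a linear ratio, not dB
PATH_LOSS_EXPONENT = 3.0
SLOT_LENGTH_S = 0.1      # tau
HELLO_INTERVAL_S = 0.5   # sigma


def shannon_capacity(bandwidth_hz: float, sinr: float) -> float:
    """Upper bound on the link rate in bit/s for a given linear SINR."""
    return bandwidth_hz * math.log2(1.0 + sinr)


def relative_rx_power(distance_m: float, reference_m: float,
                      exponent: float = PATH_LOSS_EXPONENT) -> float:
    """Received power at distance_m relative to reference_m (power-law loss)."""
    return (reference_m / distance_m) ** exponent


def slots_per_hello() -> int:
    """Number of time slots between two consecutive Hello broadcasts."""
    return round(HELLO_INTERVAL_S / SLOT_LENGTH_S)


if __name__ == "__main__":
    print(shannon_capacity(BANDWIDTH_HZ, SINR_THRESHOLD))  # rate at the threshold
    print(relative_rx_power(150.0, 100.0))                 # R_a versus R_r
    print(slots_per_hello())
```

Under these assumptions a link operating exactly at the SINR threshold supports at most 20 Mbit/s, a neighbor at the attraction limit of 150 m is received at about 30% of the power of one at the 100 m repulsion limit, and neighbor information is refreshed once every five time slots.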
