Journal on Communications ›› 2022, Vol. 43 ›› Issue (6): 189-199. doi: 10.11959/j.issn.1000-436x.2022111

• Papers

Security decision method for the edge of multi-layer satellite network based on reinforcement learning

Peiliang ZUO1, Shaolong HOU1,2, Chao GUO1, Hua JIANG1,2, Wenbo WANG3   

    1 Department of Electronics and Communication Engineering, Beijing Electronic Science and Technology Institute, Beijing 100070, China
    2 School of Communication Engineering, Xidian University, Xi’an 710068, China
    3 School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Revised: 2022-04-25 Online: 2022-06-01 Published: 2022-06-01
  • Supported by:
    The National Natural Science Foundation of China (62001251); The National Natural Science Foundation of China (62001252); “High-precision” Discipline Construction Project in Beijing Universities (202100130401); Xidian University Integrated Business Network Theory and Key Technology State Key Laboratory Project (ISN22-13)

Abstract:

Objectives: The multi-layer satellite network is an important component of space-ground integration technology. This paper aims to exploit the autonomous decision-making ability of satellite nodes so that they collaborate fully on the processing and backhaul of sensing data, including encryption, decryption, and compression, in network edge scenarios. Under the premise of ensuring data security and with the goal of low transmission delay, edge decision-making for mission satellites in the multi-layer satellite network architecture is realized.

Methods: This paper considers a multi-layer satellite network consisting of low-orbit satellites, medium-orbit satellites, and high-orbit geosynchronous satellites. The low-orbit satellite nodes are responsible for observation and reconnaissance services (such as meteorological observation, geographic detection, and intelligence reconnaissance). The medium-orbit satellites are regarded as fog nodes in the edge scenario, and one of them serves as the fog computing processing center, which plans the satellite node at which the observation data is compressed and encrypted and selects the network for data backhaul. The geosynchronous orbit satellite has the largest coverage and the strongest computing and processing capability. This paper uses a deep reinforcement learning algorithm to implement edge security decisions for the satellite network. Specifically, the edge center node obtains the environmental state of the satellite network through its perception system and, on this basis, uses the self-learning ability of the deep reinforcement learning algorithm to fit the optimal data offloading strategy and the optimal link plan for the scenario, so that onboard resources are fully utilized and the average backhaul delay over many observation tasks is minimized. First, the edge center node observes the environment and obtains state elements such as the data volume of the observation satellite's task, the channel conditions, and the processing capability of the edge nodes, and then performs action (strategy) selection. The strategy acts on the satellite network and changes the state of the environment, and the environment evaluates the strategy and feeds the evaluation back to the edge center node as a reward. The edge center node then computes the error and updates the Q value based on the new environment state and the reward, optimizing the action selection strategy so as to obtain higher rewards. This process is iterated until the optimal strategy is obtained.
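The observe, select, reward, and Q-update loop described in the Methods can be illustrated with a minimal sketch. It is not the paper's implementation: it swaps the deep network for a plain Q table (tabular Q-learning), and the offloading targets, rates, data volumes, delay model, and fixed task rotation are all hypothetical values chosen only to make the example run with the Python standard library.

```python
import random

# Hypothetical offloading targets: (link rate, processing rate, propagation delay).
# All numbers are illustrative assumptions, not values from the paper.
TARGETS = {
    "MEO-1": (2.0, 2.0, 0.1),
    "MEO-2": (3.0, 1.0, 0.1),
    "MEO-3": (1.0, 4.0, 0.1),
    "GEO": (2.5, 8.0, 1.0),
}
ACTIONS = list(TARGETS)
VOLUMES = [1.0, 4.0, 8.0]  # state index -> task data volume (assumed units)


def delay(state, action):
    """Toy backhaul delay: transmission + propagation + processing time."""
    link_rate, proc_rate, prop_delay = TARGETS[action]
    volume = VOLUMES[state]
    return volume / link_rate + prop_delay + volume / proc_rate


def train(steps=6000, alpha=0.1, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in VOLUMES]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection from the current Q estimates.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        # Environment feedback: the reward is the negative delay.
        reward = -delay(state, ACTIONS[a])
        # Assumption: tasks arrive in a fixed rotation of data volumes.
        next_state = (state + 1) % len(VOLUMES)
        # Temporal-difference error, then the Q-value update.
        td_error = reward + gamma * max(q[next_state]) - q[state][a]
        q[state][a] += alpha * td_error
        state = next_state
    return q


def greedy_policy(q):
    """Best-known offloading target for each state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]
```

Calling `greedy_policy(train())` returns one offloading target per data-volume level. In this toy setting the learned choice can be checked against a brute-force minimum over `delay`, which is the role the traversal baseline plays in the paper's evaluation.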

Results: Keras is used as the simulation platform, and the low-orbit satellite constellation is assumed to be the common Walker constellation. A certain area of the multi-layer satellite network is taken as the simulation object, with 8 low-orbit observation satellites, 3 medium-orbit satellites, and 1 high-orbit satellite in this area. The simulation results cover three aspects: 1) The convergence performance of each method is simulated on random snapshots with different numbers of satellites. The results show that the proposed method converges for all of the satellite counts tested. As the number of satellites increases, the number of training iterations the proposed method needs to converge increases significantly, because more satellites significantly enlarge the action space of the method. 2) The performance of the proposed method is compared under different network configurations. The results show that the proposed method has the best convergence performance across the 4 configurations. The low-high network configuration performs very well initially on some snapshots, but its convergence performance deteriorates as training progresses, because this configuration offers fewer link choices, which limits its performance. 3) The proposed method and the comparison methods are evaluated on the test set. The results show that, compared with the random edge security decision and the edge security decision guided by the signal-to-noise ratio, the proposed method has a clear advantage in delay performance, and it differs only slightly from the optimal edge security decision obtained by traversal.
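The three comparison decisions named in the Results (random, signal-to-noise-ratio guided, and optimal by traversal) can be sketched as simple selection rules evaluated on the same random snapshots. This is an illustrative sketch only: the snapshot generator, parameter ranges, and delay model are assumptions, and the link rate is derived from the SNR with the Shannon formula at unit bandwidth.

```python
import math
import random


def make_snapshot(rng, n_nodes=4):
    """Random snapshot of candidate nodes; all ranges are made-up assumptions."""
    return [
        {
            "snr_db": rng.uniform(0.0, 20.0),
            "proc_rate": rng.uniform(1.0, 8.0),
            "prop_delay": rng.uniform(0.05, 1.0),
        }
        for _ in range(n_nodes)
    ]


def delay(node, volume):
    """Toy delay: transmission (Shannon rate, unit bandwidth) + processing + propagation."""
    link_rate = math.log2(1.0 + 10.0 ** (node["snr_db"] / 10.0))
    return volume / link_rate + volume / node["proc_rate"] + node["prop_delay"]


def pick_random(snapshot, volume, rng):
    # Random decision: any candidate node with equal probability.
    return rng.randrange(len(snapshot))


def pick_best_snr(snapshot, volume, rng):
    # SNR-guided decision: the node with the highest signal-to-noise ratio.
    return max(range(len(snapshot)), key=lambda i: snapshot[i]["snr_db"])


def pick_traversal(snapshot, volume, rng):
    # Traversal: evaluate every candidate and keep the minimum-delay one.
    return min(range(len(snapshot)), key=lambda i: delay(snapshot[i], volume))


def average_delay(policy, n_snapshots=1000, seed=0):
    """Average delay of a decision rule over a fixed, seeded set of snapshots."""
    env_rng = random.Random(seed)         # drives the snapshots only
    policy_rng = random.Random(seed + 1)  # drives the random decision only
    total = 0.0
    for _ in range(n_snapshots):
        snapshot = make_snapshot(env_rng)
        volume = env_rng.uniform(1.0, 8.0)
        choice = policy(snapshot, volume, policy_rng)
        total += delay(snapshot[choice], volume)
    return total / n_snapshots
```

Because the snapshots come from their own seeded generator, every rule sees identical test cases, so `average_delay(pick_traversal)` is a lower bound on the other two. A learned decision method would be judged by how close its average delay comes to that bound.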

Conclusions: To address the problem of selecting links to multi-layer satellite nodes for low-orbit observation satellites in this scenario, this paper proposes a data compression and encryption backhaul decision method based on deep reinforcement learning. With the state, action, reward, and training-network parameters designed to fit the scenario, the proposed method makes intelligent and efficient edge decisions with the goal of low transmission delay.

Key words: multi-layer satellite network, LEO satellite, edge decision, reinforcement learning, data encryption
