Chinese Journal of Intelligent Science and Technology

Federated ecology: from federated data to federated intelligence

Fei-Yue WANG, Yanfen WANG, Yizhu CHEN, Yonglin TIAN, Hongwei QI, Xiao WANG, Weishan ZHANG, Jun ZHANG, Yong YUAN

2020, 2(4): 305-313. doi:10.11959/j.issn.2096-6652.202033

To address the widespread isolated data islands of the big-data era, a basic framework of federated ecology was proposed, and its construction and operation mechanisms were discussed. Federated ecology is based on the premise of controllable data privacy, supported by blockchain technology, and driven by federated intelligence. It realizes data federation by means of federated control, and federated services through federated management. Federated ecology provides a new way to address the problem of isolated data islands, gives full play to the potential of big data and artificial intelligence, and thereby realizes federated intelligence.

An overview on algorithms and applications of deep reinforcement learning

Zhaoyang LIU, Chaoxu MU, Changyin SUN

2020, 2(4): 314-326. doi:10.11959/j.issn.2096-6652.202034

Deep reinforcement learning (DRL) is mainly applied to perception-decision problems and has become an important research branch of artificial intelligence. Two kinds of DRL algorithms, based on value functions and on policy gradients, were summarized, including the deep Q network, the policy gradient, and related algorithms developed from them. In addition, the applications of DRL in video games, navigation, multi-agent cooperation, and recommendation were reviewed in depth. Finally, prospects for future DRL research were discussed, and some research suggestions were given.

An overview of optimal consensus for data driven multi-agent system based on reinforcement learning

Jinna LI, Weiran CHENG

2020, 2(4): 327-340. doi:10.11959/j.issn.2096-6652.202035

Multi-agent systems have attracted extensive attention over the past two decades because of their potential applications in engineering, social science, and natural science. To achieve consensus in a multi-agent system, it is usually necessary to solve the associated matrix equations to design the control protocol offline, which requires the system model to be known accurately. However, practical multi-agent systems are large-scale, nonlinearly coupled, and subject to dynamically changing environments, which makes accurate modeling very difficult. This poses challenges for the design of model-dependent multi-agent consensus protocols. Reinforcement learning is widely used to solve optimal control and decision-making problems of complex systems because it can learn the optimal solution of a control problem in real time from data measured along the system trajectory. Existing theories and methods for solving the optimal consensus problem of multi-agent systems online and in real time using reinforcement learning were summarized. The application of data-driven reinforcement learning to the optimal consensus of multi-agent systems was introduced from several perspectives, including continuous and discrete systems, homogeneous and heterogeneous systems, and robustness to disturbances. Finally, future research directions for the data-driven optimal consensus problem of multi-agent systems were discussed.

Reinforcement learning for green and reliable data center

Qing-Shan JIA, Jingxian TANG, Junjie WU, Xiao HU, Yiting LIN, Heng XIA

2020, 2(4): 341-347. doi:10.11959/j.issn.2096-6652.202036

Achieving green and reliable operation of data centers has significant social and economic impact. Optimization and control methods for green and reliable data centers were briefly reviewed. An event-based reinforcement learning approach for improving energy efficiency was developed, along with a method for improving the accuracy of battery lifetime forecasting.

Intelligent heating temperature control system based on deep reinforcement learning

Tao LI, Qinglai WEI

2020, 2(4): 348-353. doi:10.11959/j.issn.2096-6652.202037

It is of great significance to study how heating equipment can adaptively adjust room temperature to improve the comfort of the indoor environment. To this end, a double deep Q network method was developed to control the valve opening of heating equipment so as to adjust the indoor temperature in real time according to human expressions. Firstly, the preprocessing algorithm for the original input state was introduced. Secondly, a double deep Q network method was designed to learn the optimal control policy for the valve opening. Finally, simulation results were given to illustrate the effectiveness of the proposed method.
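
As a rough illustration of the double deep Q network idea named in this abstract, the sketch below computes a single learning target: the online network chooses the next action and the target network evaluates it. The abstract gives no implementation details, so the function name, the discount factor, and the interpretation of action indices as discrete valve-opening levels are all assumptions made for illustration, not the authors' implementation.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Return the double DQN learning target for one transition.

    next_q_online / next_q_target: lists of Q-values for the next state,
    one entry per discrete action (hypothetically, one per valve-opening level).
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three valve-opening levels.
target = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[0.7, 0.4, 1.0],
    gamma=0.9,
)
```

Here the online network prefers action 1, so the target is about 1.36, whereas a plain DQN target would take the maximum of the target network's values and give about 1.9. Decoupling selection from evaluation in this way is what reduces overestimation.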

Depth control of autonomous underwater vehicle using deep reinforcement learning

Rizhong WANG, Huiping LI, Di CUI, Demin XU

2020, 2(4): 354-360. doi:10.11959/j.issn.2096-6652.202038

The depth control problem of an autonomous underwater vehicle (AUV) was studied using deep reinforcement learning. Unlike traditional control algorithms, the deep reinforcement learning method allows the AUV to learn the control law on its own, avoiding the need to manually build an accurate model and design a control law. The deep deterministic policy gradient method was used to design two neural networks: an actor and a critic. The actor network enabled the agent to select control actions, and the critic network was used to estimate the action-value function. AUV depth control was achieved by training the actor and critic networks. The effectiveness of the algorithm was demonstrated by simulation on OpenAI Gym.
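
Two small pieces of the deep deterministic policy gradient method can be sketched without any neural network library: the bootstrapped target that the critic regresses toward, and the soft update of target-network parameters used in the standard algorithm. The abstract does not state hyperparameters or network structure, so the function names, `gamma`, `tau`, and the flat parameter lists below are illustrative assumptions rather than the paper's configuration.

```python
def critic_target(reward, next_q, gamma=0.99, done=False):
    """Target for the critic: r + gamma * Q_target(s', actor_target(s')).

    next_q is assumed to be the target critic's value for the next state
    and the target actor's action, computed elsewhere.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Move each target parameter a small step toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Hypothetical numbers: one transition and a two-parameter network.
y = critic_target(reward=1.0, next_q=2.0, gamma=0.5)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

The slow-moving targets keep the critic's regression target from shifting too quickly, which is the usual motivation for the soft update.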

Motion planning for hexapod robot using deep reinforcement learning

Huiqiao FU, Kaiqiang TANG, Guizhou DENG, Xinpeng WANG, Chunlin CHEN

2020, 2(4): 361-371. doi:10.11959/j.issn.2096-6652.202039

Hexapod robots have multiple redundant degrees of freedom and are suitable for complex unstructured environments. Discrete environments, a harsh special case of unstructured environments, require hexapod robots to have more efficient and reliable motion strategies. A planar random plum-blossom pile environment was taken as an example. A random starting point and a target area were set, and a deep reinforcement learning algorithm was applied to plan a motion strategy for a hexapod robot in this environment. To speed up training, a deep deterministic policy gradient algorithm with a prioritized experience replay mechanism was used. Finally, the policy was verified in a real environment. The results show that the planned motion strategy enables the hexapod robot to move efficiently and smoothly from a starting point to a target area in a planar plum-blossom pile environment. This work lays the foundation for precise motion planning of hexapod robots in real discrete environments.
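
The prioritized experience replay mechanism mentioned in this abstract can be sketched as a small buffer that samples transitions in proportion to their priority. This is a minimal proportional-prioritization sketch under assumed settings: the class name, `alpha`, `beta`, `eps`, and the list-based storage are illustrative choices, and importance weights are normalised by the largest weight in the sampled batch, which is a common simplification.

```python
import random


class PrioritizedReplay:
    """Minimal proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities, self.pos = [], [], 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed soon.
        priority = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer full: overwrite the oldest slot.
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        idx = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        return [self.data[i] for i in idx], idx, [w / max_w for w in weights]

    def update(self, indices, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, `update` would be called with the TD errors of each sampled batch, so that transitions the critic predicts poorly are revisited more often.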

A DQN-based approach for energy-efficient train driving control

Shuai SU, Qingyang ZHU, Qinglai WEI, Tao TANG, Jiateng YIN

2020, 2(4): 372-384. doi:10.11959/j.issn.2096-6652.202040

The energy consumption of railway systems is growing rapidly due to the expanding scale of railway networks and decreasing operational headways. Hence, it is of great significance to operate trains energy-efficiently in order to cut the energy cost of the railway system. A method for solving the energy-efficient train driving control problem based on a deep Q-network (DQN) was proposed. Firstly, the traditional energy-efficient train driving control problem was presented and its inverse problem was formulated, i.e., distributing the fewest energy consumption units needed to achieve the scheduled trip time. Then, the problem was reformulated as a Markov decision process (MDP), and a DQN was built to approximate the action-value function, which determines the optimal energy distribution policy and, in turn, the optimal driving strategy. Finally, a numerical experiment based on real-world operational data was conducted to verify the effectiveness and analyze the performance of the proposed method. The proposed method uses train driving data to improve the driving strategy, which reduces traction energy consumption. This is of significance for the future development of intelligent urban railway systems in China.

Dermoscopic image lesion segmentation method based on deep separable convolutional network

Wencheng CUI, Pengxia ZHANG, Hong SHAO

2020, 2(4): 385-393. doi:10.11959/j.issn.2096-6652.202041

To address the difficulty of locating lesions in dermoscopic images and segmenting them precisely, a lesion segmentation method for dermoscopic images based on a depthwise separable convolutional network was proposed. Firstly, black-frame removal and hair removal were performed on the dermoscopic image to remove the artificial and natural noise that hinders lesion localization. Then, the denoised images were deformed and rotated to expand the data set. Finally, an encoder-decoder segmentation model based on depthwise separable convolution and dilated (atrous) convolution was constructed. The encoder extracts image features, and the decoder fuses the feature maps and restores image details. Experimental results show that the method achieves better results on skin lesion segmentation: the lesion segmentation accuracy reaches 95.24%, an improvement of 6.17% over the U-Net segmentation model.
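
One reason depthwise separable convolution is attractive in an encoder-decoder model is its lower parameter count. The short sketch below compares the weight counts of a standard convolution and a depthwise separable one (a per-channel k×k depthwise filter followed by a 1×1 pointwise convolution), ignoring biases. The layer sizes are hypothetical and are not taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)   # 73728
sep = separable_conv_params(3, 64, 128)  # 8768
```

For this assumed layer the separable form needs roughly one eighth of the weights of the standard convolution.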

Output synchronization of heterogeneous multi-agent system: a reinforcement learning approach based on data

Yingying LIU, Zhanshan WANG

2020, 2(4): 394-400. doi:10.11959/j.issn.2096-6652.202042

The output synchronization of heterogeneous multi-agent systems was studied using reinforcement learning. According to the topology of the multi-agent system, a performance index and a value function that include the neighbors' control inputs were defined. To overcome the disadvantage of existing control methods that require a system model, a reinforcement learning algorithm based on system data was proposed. Hence, the output synchronization controller can be applied even when the system model is unknown. In addition, by adjusting the weight matrix in the value function, the control cost of each agent can be reduced. Finally, a simulation example was given to illustrate the effectiveness of the proposed method and the superiority of the defined value function.

Modeling signal propagation in wireless network: an interval type-2 fuzzy ensemble deep learning approach

Liang ZHAO, Zhifeng XIE, Kunpeng ZHANG, Yuqing ZHENG, Yuankun FU

2020, 2(4): 401-411. doi:10.11959/j.issn.2096-6652.202043

Commonly used signal propagation models suffer from problems such as being limited to a single usage scenario and poor prediction accuracy. A data-driven wireless signal propagation model suitable for multiple scenarios was proposed. Firstly, initial features were constructed from the preprocessed data according to prior knowledge, and the input feature set was then obtained by feature selection. Then, after analyzing the modeling requirements, the deep belief network (DBN), residual network (ResNet), and stacked auto-encoder (SAE) were selected as the consequents (individual learners) of the interval type-2 fuzzy rules, and interval type-2 fuzzy inference was used to ensemble them. Finally, actual measurement data of 5G signal propagation were used for experimental verification. The results demonstrate that all three individual learners perform better on the test set than the Cost231-Hata model and the back propagation neural network (BPNN), and that the accuracy of ResNet is higher than that of DBN and SAE. Moreover, the performance of the interval type-2 fuzzy ensemble deep learning model is positively correlated with the performance of its individual learners and with the number of fuzzy rules. Meanwhile, the heterogeneous ensemble is superior to its homogeneous counterpart on the test set.