Current Issue

    15 September 2021, Volume 3 Issue 3
    Review Intelligence
    Parallel philosophy and intelligent technology: dual equations and testing systems for parallel industries and smart societies
    Fei-Yue WANG
    2021, 3(3):  245-255.  doi:10.11959/j.issn.2096-6652.202126

    Starting from Karl Popper’s Three Worlds, a knowledge system spanning the three worlds and the corresponding philosophical problem were proposed. The scope of traditional philosophy was extended from being and becoming to believing, and a new parallel philosophy was proposed as the foundation for intelligent science and technology. The dual equation systems and parallel testing were introduced through ACP and cyber-physical-social systems (CPSS). This will enable a true DAO approach for intelligent parallel industries and smart societies: trustable, reliable, usable, and effective + efficient (true); distributed + decentralized, autonomous + automated, and organized + ordered (DAO); and the corresponding distributed autonomous organizations and operations (DAOs).

    Special Issue: Intelligent Object Detection and Recognition
    Retinal multi-disease screening and recognition method based on deep convolution ensemble network
    Heyang WANG, Qiming YANG, Qi ZHU
    2021, 3(3):  259-267.  doi:10.11959/j.issn.2096-6652.202127

    To address the varied characteristics of retinal diseases and the uncertain location of lesions, a retinal multi-disease screening and recognition method based on a deep convolutional ensemble network was proposed. Firstly, the black borders on both sides of the retinal fundus image were cropped and the image noise was removed, reducing interference and increasing image clarity. Next, data augmentation methods such as cropping and rotation were applied to the retinal fundus images to enlarge the dataset. Then, a model based on a deep convolutional neural network was built for feature extraction, and the network was fine-tuned to screen for and identify retinal diseases. Finally, the results of multiple models were ensembled. Experimental results show that the method performs well on both tasks: the accuracy of retinal disease screening is 96.05%, and the accuracy of retinal disease recognition is 72.55%.
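
    The border-cropping and ensembling steps can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the darkness cutoff `threshold`, the toy image, and the use of simple probability averaging for the ensemble are all assumptions, since the abstract does not say how the model outputs were combined.

```python
def crop_black_borders(image, threshold=10):
    """Drop leading/trailing columns whose pixels are all at or below `threshold`.

    `image` is a list of rows of grayscale values (0-255); `threshold` is an
    assumed cutoff for "black", not a value taken from the paper.
    """
    width = len(image[0])
    keep = [c for c in range(width) if any(row[c] > threshold for row in image)]
    if not keep:
        return image  # entirely dark image: leave unchanged
    left, right = keep[0], keep[-1]
    return [row[left:right + 1] for row in image]


def ensemble_average(model_probs):
    """Average per-class probabilities from several models and pick the argmax."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    mean = [sum(p[k] for p in model_probs) / n_models for k in range(n_classes)]
    best = max(range(n_classes), key=lambda k: mean[k])
    return best, mean


# Toy 2x5 "fundus image" with dark columns on the left and right.
fundus = [
    [0, 0, 120, 130, 0],
    [0, 0, 110, 140, 0],
]
cropped = crop_black_borders(fundus)

# Hypothetical softmax outputs of three fine-tuned CNNs for one image.
label, mean_probs = ensemble_average([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
])
```

    Averaging lets a model that is confidently wrong be outvoted by the others; majority voting or weighted averaging would be equally plausible readings of "ensembled".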

    3D convolution-based image sequence feature extraction and self-attention for license plate recognition method
    Ganxiong ZENG, Xiao KE
    2021, 3(3):  268-279.  doi:10.11959/j.issn.2096-6652.202128

    In recent years, neural networks based on self-attention mechanisms have been widely used in computer vision tasks. As intelligent transportation systems become widespread, license plate recognition is growing more difficult and the need for correct recognition more pressing in complex and changing traffic scenes. Therefore, a rectification-free license plate recognition method based on self-attention, T-LPR, was proposed. Firstly, the images were sliced and arranged into sequences, and 3D convolution was used to extract features from the sliced sequences, yielding a sequence of image embedding vectors. Secondly, the sequence of embedding vectors was fed into an encoder based on the Transformer encoder, which learned the relationships between the individual embedding vectors and output the final encoding result. Finally, the encoding result was classified by a classifier. Experimental results on several public datasets show that the proposed T-LPR is very effective at recognizing license plates in a wide range of difficult scenarios.
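
    The way a self-attention encoder relates the embedding vectors of different image slices can be illustrated with plain scaled dot-product attention. This is a sketch under stated assumptions, not T-LPR itself: the query, key and value projections are taken to be the identity, there is a single head, and the three 2-D embeddings are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A real Transformer encoder learns separate projection matrices; they are
    omitted here (an assumption made to keep the sketch short).
    """
    dim = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        # Similarity of this slice to every slice, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in embeddings]
        attn = softmax(scores)
        weights.append(attn)
        # Output is the attention-weighted mix of all slice embeddings.
        outputs.append([sum(a * value[d] for a, value in zip(attn, embeddings))
                        for d in range(dim)])
    return outputs, weights


# Hypothetical embedding vectors for three image slices.
slices = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded, attn_weights = self_attention(slices)
```

    Each row of `attn_weights` sums to 1, so every encoded vector is a convex combination of all slice embeddings; this is what allows one slice of a plate to draw on context from the others.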

    A method based on visual saliency for vehicle-mounted monocular camera ego-motion estimation and vehicle scale estimation
    Mingxin AI, Tie LIU, Jing WANG, Jiali DING, Zejian YUAN, Yuanyuan SHANG
    2021, 3(3):  280-293.  doi:10.11959/j.issn.2096-6652.202129

    A method based on visual saliency was proposed for ego-motion estimation and for scale estimation of the vehicle in front. Firstly, for ego-motion estimation of the vehicle-mounted camera, a visual saliency calculation method was used to detect and remove moving objects in the noisy monocular image sequence. Taking both textured and smooth image regions into account, a weighted saliency map was used to retain useful feature points and improve the robustness of ego-motion estimation. Secondly, estimating the distance to the vehicle in front was converted into a vehicle scale estimation problem, in which descriptor matching was integrated with the strength of a Lie algebra regularization match to minimize the loss function. The visual attention mechanism was used to obtain textured image blocks without shade, and the pixels in each image block were weighted to mitigate the effect of noise-corrupted pixels, so as to achieve robust and accurate scale estimation. Finally, several challenging datasets were used to analyze and verify the proposed method. The results show that the monocular camera ego-motion estimation method reaches the level of stereo camera-based methods, and that the vehicle scale estimation method maintains prediction accuracy while remaining strongly robust.

    Anchor free multispectral pedestrian detection algorithm based on differential feature attention mechanism
    Jifeng SHEN, Yue LIU, Hao WEI, Xin ZUO, Wankou YANG
    2021, 3(3):  294-303.  doi:10.11959/j.issn.2096-6652.202130

    Multispectral pedestrian detection systems suffer from low feature fusion quality, a large number of model hyper-parameters, and complex anchor matching algorithms. To deal with these problems, an anchor-free multispectral pedestrian detection algorithm based on a differential feature attention mechanism was proposed. Firstly, differential modality-aware fusion was used to obtain the complementary information between modalities and optimize the channel features. Secondly, the anchor-free CenterNet detection framework was adopted to greatly reduce the computational complexity of the model and thus improve detection speed. Finally, a differential-feature-guided attention mechanism was introduced to improve the quality of feature fusion and further enhance detection accuracy. Experimental results on three open datasets, KAIST, CVC14 and FLIR, show that the proposed algorithm improves both detection accuracy and speed compared with current advanced methods and has good prospects for practical application.

    Plant leaf detection technology based on multi-scale CNN feature fusion
    Ying LI, Long CHEN, Zhaohong HUANG, Yang SUN, Guorong CAI
    2021, 3(3):  304-311.  doi:10.11959/j.issn.2096-6652.202131

    Plant leaf detection is an essential part of scientific plant breeding and precision agriculture. Traditional plant leaf detection requires operators with professional knowledge, incurs high labor costs, and is time-consuming. A plant leaf detection technology based on multi-scale CNN feature fusion (MCFF) was proposed. Motivated by the need for deep-learning-assisted plant cultivation, MCFF was used to count leaves for three different types and resolutions of rosette model plants, Arabidopsis thaliana and tobacco. Compared with three other algorithms, MCFF has higher detection accuracy, with a mean average precision (mAP) of 0.662, and achieves highly competitive performance (AP = 0.946), with each indicator close to the practical level.

    Insulator detection based on SE-YOLOv5s
    Qing TIAN, Rong HU, Zuoyong LI, Yuanzheng CAI, Zhaochai YU
    2021, 3(3):  312-321.  doi:10.11959/j.issn.2096-6652.202132

    When a power system must be inspected over a large area, traditional manual inspection is inconvenient and carries safety risks, whereas object detection from unmanned aerial vehicles has great application prospects for insulator detection and recognition. SE-YOLOv5s, a lightweight insulator detection network that performs efficient detection for this task, was presented. Firstly, the backbone of YOLOv5s was strengthened by fusing in the SE attention module. Then, the position distribution of insulator objects was investigated, and predefined prior box templates were generated by K-means clustering on the prior coordinate vectors. Finally, the network was trained with a multitask loss function that combines the confidence and position regression tasks. Furthermore, Mosaic data augmentation was used to supply additional training samples. Experimental results demonstrate that the proposed SE-YOLOv5s significantly outperforms baseline methods on multiple criteria, including accuracy, recall rate, detection rate and mean average precision. In comparison with the baselines, the proposed network offers a flexible trade-off between robustness and memory overhead and is a potential approach to promoting power system development.
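
    Generating prior box templates by K-means can be sketched as clustering the (width, height) pairs of the labelled boxes. The code below is illustrative only: the 1 - IoU distance is the choice commonly used for YOLO-style anchors and is assumed here rather than taken from the abstract, the initial centres are simply the first k boxes, and the box sizes are invented.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height) and aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs using 1 - IoU as the distance.

    Initial centres are the first k boxes, which keeps the sketch
    deterministic; practical implementations usually initialise randomly.
    """
    centres = [tuple(b) for b in boxes[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            nearest = max(range(k), key=lambda i: iou_wh(box, centres[i]))
            clusters[nearest].append(box)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centres[i])  # keep an empty cluster's centre
        if updated == centres:
            break  # assignments have stabilised
        centres = updated
    return sorted(centres)


# Hypothetical insulator box sizes in pixels: elongated and near-square groups.
boxes = [(100, 20), (20, 22), (110, 24), (24, 20), (90, 22), (22, 24)]
anchors = kmeans_anchors(boxes, k=2)
```

    On this toy data the two centres settle on one elongated and one near-square template, which matches the intuition that insulators photographed from different viewpoints produce boxes with very different aspect ratios.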

    Light weight rotating object detector based on angle sensitive spatial attention mechanism
    Kaishi YIN, Meng YANG, Xi GU, Zhicheng WANG
    2021, 3(3):  322-333.  doi:10.11959/j.issn.2096-6652.202133

    With the rapid development of deep learning, more and more anchor-based target detection algorithms have been applied to remote sensing images in recent years. However, these algorithms improve accuracy at the cost of detection speed. Therefore, an anchor-free target detection framework was chosen, and a rotating frame detection algorithm was proposed according to the characteristics of remote sensing scenes. A simple and effective representation of the rotating frame was proposed based on the spatial relationship between the rotating frame and its external rectangular frame. In addition, an angle-sensitive attention mechanism was designed to assist the detection of rotating targets; introducing angle information improved the model's ability to detect them. The proposed algorithm was tested on the open remote sensing dataset DOTA. The mean average precision of the rotating frame detection network is 68.5% and the detection speed is 17.4 frames per second.
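
    The geometric link between a rotating frame and its external rectangular frame can be made concrete. The sketch below only computes the axis-aligned rectangle that circumscribes a rotated box; it does not reproduce the paper's representation, which the abstract does not spell out, and the parameterisation (centre, side lengths, angle in degrees) is an assumption.

```python
import math


def external_rectangle(cx, cy, w, h, theta_deg):
    """Axis-aligned rectangle that circumscribes a rotated box.

    The rotated box has centre (cx, cy), side lengths w and h, and is rotated
    by theta_deg degrees. Returns (x_min, y_min, x_max, y_max).
    """
    theta = math.radians(theta_deg)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    # Projections of the two sides onto the x and y axes.
    ext_w = w * cos_t + h * sin_t
    ext_h = w * sin_t + h * cos_t
    return (cx - ext_w / 2, cy - ext_h / 2, cx + ext_w / 2, cy + ext_h / 2)


# A 4 x 2 box centred at the origin, unrotated and rotated by a quarter turn.
flat = external_rectangle(0.0, 0.0, 4.0, 2.0, 0.0)
upright = external_rectangle(0.0, 0.0, 4.0, 2.0, 90.0)
```

    Because the external rectangle shares its centre with the rotated box and depends on the angle only through these two projections, a detector that already predicts axis-aligned rectangles needs little extra information to recover the rotated one.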

    Collaborative representation based classifier with maximum correntropy criterion and locality constraint
    Qinru YU, Guifu LU
    2021, 3(3):  334-341.  doi:10.11959/j.issn.2096-6652.202134

    A collaborative representation based classifier with maximum correntropy criterion and locality constraint (CRC/MCCLC), which exploits the maximum correntropy criterion and locality information, was proposed. On the one hand, because it uses the maximum correntropy criterion, CRC/MCCLC is more robust to outliers than the L1 norm and can be computed efficiently with the half-quadratic optimization technique. On the other hand, because it uses locality information, CRC/MCCLC obtains more discriminative information from the training samples and yields an approximately sparse representation. Extensive experimental results on several image databases demonstrate that CRC/MCCLC achieves state-of-the-art performance.
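
    Why the maximum correntropy criterion is robust to outliers can be seen from the weights that half-quadratic optimization assigns to residuals. In that scheme each residual receives, up to a constant factor, a Gaussian-kernel weight, and a weighted least-squares problem is then solved. The snippet shows only the weight computation; the kernel width `sigma` and the residual values are arbitrary assumptions.

```python
import math


def correntropy_weights(residuals, sigma=1.0):
    """Gaussian-kernel weights used in half-quadratic optimisation of correntropy.

    Each weight is exp(-e^2 / (2 * sigma^2)); `sigma` is the kernel width and
    its default here is an arbitrary choice for illustration.
    """
    return [math.exp(-(e * e) / (2.0 * sigma * sigma)) for e in residuals]


# Hypothetical per-pixel representation residuals; the last one is an outlier.
weights = correntropy_weights([0.0, 0.5, 10.0])
```

    The outlier's weight is effectively zero, so it barely influences the next weighted least-squares step, whereas an L1 penalty would still let its influence grow linearly with the residual.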

    Robust non-negative supervised low-rank discriminant embedding algorithm
    Yu YAO, Minghua WAN
    2021, 3(3):  342-350.  doi:10.11959/j.issn.2096-6652.202135

    Non-negative matrix factorization (NMF) has been widely used. However, NMF pays more attention to the local information of the data and ignores its global representation. In image classification, the global information of the data is often more robust to noise than the local information. To improve robustness, a non-negative supervised low-rank discriminant embedding algorithm was proposed that combines the local and global representations of the data and takes the characteristics of low-rank representation into account. The algorithm assumes that the data contain noise, decomposes the data into clean data and noise, and imposes a sparsity constraint on the noise matrix through the L1 norm, so as to enhance robustness to noise. In addition, the algorithm uses low-rank representation to learn a low-rank matrix and then applies non-negative decomposition, which further enhances robustness. Finally, a graph embedding method is incorporated so that the local and global information of the data is retained at the same time. The algorithm was applied to various noisy databases, and its recognition rate is about 5%~15% higher than that of the comparison algorithms.
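
    An L1-norm constraint on a noise matrix is typically handled with the soft-thresholding operator, the proximal operator of the L1 norm. Whether the authors' solver uses exactly this step is an assumption, since the abstract does not describe the optimization; the threshold `tau` and the sample values are illustrative.

```python
def soft_threshold(values, tau):
    """Proximal operator of tau * ||x||_1: shrink each entry toward zero by tau."""
    shrunk = []
    for v in values:
        if v > tau:
            shrunk.append(v - tau)
        elif v < -tau:
            shrunk.append(v + tau)
        else:
            shrunk.append(0.0)  # entries within [-tau, tau] are zeroed
    return shrunk


# Hypothetical residual row: small entries vanish, large ones survive as noise.
noise_row = soft_threshold([0.25, -0.5, 3.0, -4.0], tau=1.0)
```

    Zeroing the small entries is what makes the estimated noise matrix sparse: only large deviations are attributed to noise, and the rest of the signal is left to the clean, low-rank part.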

    End-to-end speech enhancement based on ultra-lightweight channel attention
    Yi HONG, Chengli SUN, Yan LENG
    2021, 3(3):  351-358.  doi:10.11959/j.issn.2096-6652.202136

    The fully convolutional time-domain audio separation network (Conv-TasNet) is a recently proposed state-of-the-art end-to-end speech separation model. Conv-TasNet uses dilated convolution to expand the receptive field and fuse more speech features in space, which greatly improves its speech separation performance but ignores the importance of information across different convolution channels. An end-to-end speech enhancement method based on ultra-lightweight channel attention was proposed, which effectively combines Conv-TasNet and channel attention. In addition, a group of filters was added to the Conv-TasNet codec to improve the network's speech feature extraction ability. The method enables the convolutional neural network to combine spatial and channel information more effectively and thus improve speech enhancement. Experiments show that the proposed model effectively improves speech enhancement performance while increasing model capacity by only about 0.02%.
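
    One way a channel attention module can stay this small is to compute the per-channel gates with a short 1-D convolution across channel descriptors instead of fully connected layers. The sketch below follows that general idea; it is not the paper's module, whose exact design the abstract does not give, and the kernel weights and feature map are hypothetical.

```python
import math


def channel_attention(features, kernel=(0.25, 0.5, 0.25)):
    """Reweight channels with a gate computed by a 1-D convolution across channels.

    `features` is a list of channels, each a list of samples over time. The
    kernel weights are fixed here for illustration; in a trained network they
    would be the only learned parameters of the attention module.
    """
    # Squeeze: one descriptor per channel (global average over time).
    descriptors = [sum(ch) / len(ch) for ch in features]
    half = len(kernel) // 2
    padded = [0.0] * half + descriptors + [0.0] * half  # zero padding
    gates = []
    for i in range(len(descriptors)):
        mixed = sum(w * padded[i + j] for j, w in enumerate(kernel))
        gates.append(1.0 / (1.0 + math.exp(-mixed)))  # sigmoid gate in (0, 1)
    # Scale every sample of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Hypothetical feature map: three channels, four time steps each.
feature_map = [[4.0, 4.0, 4.0, 4.0], [0.0, 0.0, 0.0, 0.0], [-4.0, -4.0, -4.0, -4.0]]
enhanced, gates = channel_attention(feature_map)
```

    With a kernel of size three, the module adds three parameters regardless of the number of channels, which is consistent in spirit with a capacity increase of only a small fraction of a percent.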

    Sparse representation for image recognition based on semi-genetic algorithm in feature space
    Linrui SHI, Yijing HUANG, Jinwu FU, Xinyue GUO, Zizhu FAN
    2021, 3(3):  359-369.  doi:10.11959/j.issn.2096-6652.202137

    Typical sparse representation for classification (SRC) is usually based on an L1 minimization problem. Conceptually, SRC is essentially an L0 norm minimization problem solved in the original input space, which cannot capture the nonlinear information within the data well. To address this problem, a nonlinear mapping was applied to map the original input data into a new high-dimensional feature space, and a new representation approach based on the L0 norm was proposed. In the proposed approach, the dictionary used to represent the test sample contains two parts. The first part is fixed to the neighbors of the test sample. The training samples of the second part are chosen by a variant of the genetic algorithm (GA), the semi-GA (SGA), which exploits the representation error to determine the second part of the dictionary. Specifically, if a set of training samples, combined with the determined neighbors of the test sample, yields the least representation error, SGA selects these training samples as the second part of the dictionary. Experiments on several popular face databases and one handwritten digit data set demonstrate that the proposed approach can achieve better classification performance.

    Research on anti-spoofing method of face recognition based on semi-supervised learning
    Li LI, Weiliang ZENG, Yonghui HUANG, Weijun SUN
    2021, 3(3):  370-380.  doi:10.11959/j.issn.2096-6652.202138

    Distinguishing real faces from fake ones in images is a long-standing challenge. When synthetic fake faces are very realistic, it is difficult for machines, and even for the naked eye, to tell real from fake. Supervised anti-spoofing methods often require a large number of labeled samples to perform well. A face recognition anti-spoofing method based on semi-supervised learning was proposed to reduce the dependence on massive labeled samples. The method adopts an image inpainting model to learn the data distribution of face images. During training, a few labeled samples periodically provide supervision signals to train the classifier to distinguish real faces from fake ones. The proposed method can be used for face anti-spoofing in different scenarios, such as faces captured by cameras or generated by generative adversarial networks. Accordingly, it was evaluated on the NUAA and RMFD datasets. Experimental results show that the proposed method preserves the quality of restored images and achieves desirable classification accuracy. With only a few labeled samples, the proposed method outperforms Improved-GAN and common semi-supervised methods, and surpasses supervised learning methods based on support vector machines and convolutional neural networks.