
Current Issue

    15 January 2021, Volume 7 Issue 1
    Data-Driven Intelligent Software Development
    Big data based intelligent software development methodology and environment
    Bing XIE, Xin PENG, Gang YIN, Xuandong LI, Jun WEI, Hailong SUN
    2021, 7(1):  3-21.  doi:10.11959/j.issn.2096-0271.2021001

    A series of studies were conducted on the collection and organization of software engineering big data, the representation and extraction of software development knowledge, and intelligent software development tools and service platforms. The purpose is to establish a big data based intelligent software development technique system, develop supporting tools for intelligent software development, and form a next-generation intelligent software development environment and cloud-based platforms that incorporate humans, tools, and data. The outcomes of the project include a public service platform supporting mass innovation and a series of intelligent software development environments for enterprises.

    Software knowledge graph construction and Q&A technology based on big data
    Yanzhen ZOU, Min WANG, Bing XIE, Zeqi LIN
    2021, 7(1):  22-36.  doi:10.11959/j.issn.2096-0271.2021002

    With the growing scale and continuous evolution of software, constructing software project knowledge graphs has become increasingly important for software maintenance and development. Automatically constructing a software knowledge graph with a complex structure and rich semantic relations from the multi-source, heterogeneous, massive data generated during software project development, such as source code, mailing lists, issue reports and Q&A documents, is an urgent challenge in software engineering. A code-centric software knowledge model was proposed, and a two-layer plugin framework for knowledge graph construction and software Q&A was provided, which improves the efficiency of software understanding and software reuse. At present, the software project knowledge graph has been successfully deployed in the Apache open source community and in well-known domestic enterprises.
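
    As a rough illustration of the code-centric idea, the sketch below builds a tiny knowledge graph in which code entities (classes, methods) form the core and non-code artifacts such as issue reports link to them. The entity and relation names are hypothetical and only hint at the richer knowledge model described in the paper.

```python
# A minimal sketch of a code-centric software knowledge graph, using networkx.
# Entity and relation names are illustrative assumptions, not the paper's schema.
import networkx as nx

g = nx.MultiDiGraph()

# Code entities form the core layer of the graph.
g.add_node("org.apache.lucene.index.IndexWriter", kind="class")
g.add_node("IndexWriter.addDocument", kind="method")
g.add_edge("org.apache.lucene.index.IndexWriter", "IndexWriter.addDocument",
           relation="declares")

# Non-code artifacts (issues, mailing-list threads, Q&A posts) are linked to code.
g.add_node("LUCENE-1234", kind="issue")
g.add_edge("LUCENE-1234", "IndexWriter.addDocument", relation="mentions")

def related_documents(code_entity):
    """Answer a simple 'which documents discuss this API?' question."""
    return [src for src, _, data in g.in_edges(code_entity, data=True)
            if data.get("relation") == "mentions"]

print(related_documents("IndexWriter.addDocument"))  # ['LUCENE-1234']
```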

    Context-based intelligent recommendation for code reuse
    Xin PENG, Chi CHEN, Yun LIN
    2021, 7(1):  37-47.  doi:10.11959/j.issn.2096-0271.2021003

    Intelligent code reuse recommendation based on the analysis, mining, and learning of code-related big data can significantly improve the efficiency and quality of software reuse. The targets of reuse include both domain-specific and domain-independent common code units. Context-based intelligent recommendation for code reuse was the focus of this work, and two approaches were described: template mining based code reuse recommendation and deep learning based code reuse recommendation. Based on these two lines of work, the future trend of context-based intelligent recommendation for code reuse was further discussed.
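
    The following sketch conveys the general flavor of context-based recommendation with a deliberately simple baseline: mined templates are ranked by lexical similarity to the developer's current editing context. The template names and the similarity measure are assumptions for illustration; the actual approach combines template mining with deep learning.

```python
# A minimal sketch of context-based template recommendation: rank mined code
# templates by lexical similarity to the developer's editing context.
# Illustrative only; templates and scoring are stand-ins for the real method.
from collections import Counter
import math

templates = {
    "read-file-lines": "with open(path) as f: lines = f.readlines()",
    "http-get-json": "resp = requests.get(url); data = resp.json()",
    "sort-dict-by-value": "sorted(d.items(), key=lambda kv: kv[1])",
}

def cosine(a, b):
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(context_tokens, k=2):
    scored = [(cosine(context_tokens, body.split()), name)
              for name, body in templates.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

# The context is taken from the code and comments around the cursor.
print(recommend("parse the json data returned by requests.get(url)".split()))
```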

    Big-data based intelligent bug triage techniques for open-source projects
    Shengqu XI, Feng XU, Xin CHEN, Xuandong LI
    2021, 7(1):  48-63.  doi:10.11959/j.issn.2096-0271.2021004

    Bug triage aims to determine the priority of bugs and the corresponding repair measures, and is critical to ensuring software trustworthiness. However, in increasingly popular open-source projects, the large number of defects and the lack of organization and management make it challenging to triage all bug reports by hand in time, which makes big-data based, automated and intelligent bug triage urgent. An intelligent bug triage technical framework grounded in the shared understanding of industry and academia was proposed, and three key tasks were identified comprehensively and systematically: bug priority classification, bug assignment, and bug reassignment. Techniques tailored to the characteristics of open-source projects were proposed for each task. Preliminary experimental results show the reasonableness and effectiveness of these techniques.
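
    As a hedged illustration of the bug priority classification task, the sketch below trains a TF-IDF plus logistic regression baseline on a handful of made-up bug report summaries. It is not the model used in the paper, only a minimal stand-in for the task setup.

```python
# A minimal sketch of bug priority classification from report text, using a
# TF-IDF + linear classifier baseline (not the paper's actual model).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set: (bug report summary, priority label).
reports = [
    "crash on startup when config file is missing",
    "typo in documentation for install guide",
    "data loss after concurrent writes to the index",
    "button color slightly off in dark theme",
]
priorities = ["high", "low", "high", "low"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(reports, priorities)

print(clf.predict(["segmentation fault when saving the project"]))  # likely 'high'
```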

    An approach to automatically building Docker images by using domain knowledge
    Wei CHEN, Hongjie YE, Jiahong ZHOU, Jun WEI
    2021, 7(1):  64-75.  doi:10.11959/j.issn.2096-0271.2021005

    A Dockerfile builds a Docker image by specifying how to construct a software system through downloading, installing and configuring software packages and their dependencies. However, manually writing a Dockerfile can be error-prone, because system dependency resolution requires a lot of domain knowledge. Therefore, an approach to automating Dockerfile generation based on domain knowledge was proposed. The approach automatically parses existing Dockerfiles, extracts knowledge about building Docker images, and stores the knowledge in a graph database. When generating a new Dockerfile, the system dependencies of the designated software and their installation operations are inferred from the knowledge base. Experiments indicate that it is viable to automate Dockerfile generation for diverse software by inferring system dependencies and software package installations from the domain knowledge.
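
    The sketch below illustrates the inference step with a toy, dictionary-based knowledge base standing in for the graph database: the target package's transitive system dependencies are resolved and the corresponding installation commands are emitted as a Dockerfile. The package names and install commands here are illustrative assumptions, not the paper's knowledge base.

```python
# A minimal sketch of knowledge-based Dockerfile generation: look up the system
# dependencies and install commands of a target package in a toy knowledge base
# (a stand-in for the graph database) and emit a Dockerfile.
KNOWLEDGE_BASE = {
    "numpy":       {"deps": ["python3", "python3-pip"], "install": "pip3 install numpy"},
    "python3":     {"deps": [], "install": "apt-get install -y python3"},
    "python3-pip": {"deps": ["python3"], "install": "apt-get install -y python3-pip"},
}

def resolve(pkg, seen=None):
    """Order pkg and its transitive dependencies so that dependencies come first."""
    seen = seen if seen is not None else []
    for dep in KNOWLEDGE_BASE[pkg]["deps"]:
        resolve(dep, seen)
    if pkg not in seen:
        seen.append(pkg)
    return seen

def generate_dockerfile(target, base_image="ubuntu:20.04"):
    lines = [f"FROM {base_image}", "RUN apt-get update"]
    lines += [f"RUN {KNOWLEDGE_BASE[p]['install']}" for p in resolve(target)]
    return "\n".join(lines)

print(generate_dockerfile("numpy"))
```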

    Data driven intelligent collaboration of software developers
    Jian ZHANG, Xiangxin MENG, Hailong SUN, Xu WANG, Xudong LIU
    2021, 7(1):  76-93.  doi:10.11959/j.issn.2096-0271.2021006

    Mining big software data and using the knowledge it contains to explore intelligent methods for software development is an active research topic. However, existing research on software developer and crowd collaboration has not yet formed systematic methods. Therefore, key technologies for intelligent collaboration were studied through in-depth analysis of developer behavior, and a corresponding support environment was developed on the basis of these technologies to improve the efficiency and quality of software development. Firstly, a large amount of developer-related data was collected and analyzed. Secondly, a systematic approach to analyzing developers and their collaboration, called the developer knowledge graph, was proposed. Thirdly, supported by the developer knowledge graph, a collaborative development method based on intelligent recommendation was introduced in detail. Building on these technologies, the corresponding supporting tools were developed and an intelligent collaborative development environment was provided. Finally, future work was discussed.
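
    As a minimal illustration of recommendation on top of a developer knowledge graph, the sketch below suggests reviewers for a change by counting how often each developer has previously touched the affected files. The history data and the scoring rule are hypothetical simplifications of the graph-based method described above.

```python
# A minimal sketch of collaborator recommendation from developer-file history:
# suggest reviewers who have touched the changed files most often before.
# Data and scoring are illustrative assumptions only.
from collections import Counter

# Edges of a tiny developer-file graph: (developer, file touched in past commits).
history = [
    ("alice", "core/scheduler.py"), ("alice", "core/scheduler.py"),
    ("bob",   "core/scheduler.py"), ("bob",   "ui/panel.py"),
    ("carol", "ui/panel.py"),
]

def recommend_reviewers(changed_files, k=2):
    scores = Counter(dev for dev, f in history if f in changed_files)
    return [dev for dev, _ in scores.most_common(k)]

print(recommend_reviewers({"core/scheduler.py"}))  # ['alice', 'bob']
```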

    Big data of open source ecosystem for intelligent software development
    Yang ZHANG, Tao WANG, Gang YIN, Yue YU, Jingquan HUANG
    2021, 7(1):  94-106.  doi:10.11959/j.issn.2096-0271.2021007

    The open source software development process produces a large amount of valuable data, which is huge in scale, fragmented, and rapidly expanding. Aiming at these characteristics, the structure of open source ecosystem big data for software engineering was studied, and a self-growing collection and processing framework together with a convergence and sharing environment was proposed. Related research on intelligent software development based on open source software engineering big data, and typical applications based on its analysis and mining, were expounded, providing guidance for the research and application of open source ecosystem big data for intelligent software development.

    STUDY
    Travel time estimation based on urban traffic surveillance data
    Wenming LI, Fang LIU, Peng LYU, Yanwei YU
    2021, 7(1):  107-123.  doi:10.11959/j.issn.2096-0271.2021008

    With the development of intelligent transportation, more and more surveillance cameras are deployed at the intersections of urban roads, which makes it possible to use urban traffic surveillance data to estimate vehicle travel time and query routes. Aiming at the problem of urban travel time estimation, a travel time estimation method based on urban traffic surveillance data, called UTSD, was proposed. Firstly, the traffic surveillance cameras were mapped into the urban road network, and a directed weighted road network graph was constructed from the traffic surveillance records. Secondly, a spatio-temporal index and a reverse index were built for travel time estimation: the former is used to quickly search the camera records of all vehicles, and the latter to quickly obtain the travel time and the passing-camera trajectory of each vehicle. These two indexes significantly improve the efficiency of data query and travel time estimation. Finally, based on the constructed index structures, an effective travel time estimation and path query method was given: according to the departure time, origin and destination, vehicles with the same origin and destination are matched on the spatio-temporal index, and the reverse index is then used to quickly obtain the travel time estimate and vehicle route. In an experimental evaluation on real traffic surveillance big data from a provincial capital city, the accuracy of the proposed UTSD method is improved by 65.02% and 40.94% compared with the directed-graph based Dijkstra shortest path algorithm and the Baidu algorithm, respectively. In addition, the average query time of UTSD is less than 0.3 s when 7 days of surveillance data are used as historical data, which verifies the effectiveness and efficiency of the proposed method.
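
    The sketch below mimics, in highly simplified form, the two index structures described above: a spatio-temporal index from (camera, time bucket) to vehicles, and a reverse index from vehicle to its time-ordered camera records, used together to estimate travel time between two cameras. The record format, time bucketing and median aggregation are assumptions for illustration, not the UTSD implementation.

```python
# A minimal sketch of the two indexes used for travel time estimation (simplified).
from collections import defaultdict
from statistics import median

records = [  # (vehicle_id, camera_id, timestamp in seconds); illustrative data
    ("V1", "C_origin", 0), ("V1", "C_dest", 600),
    ("V2", "C_origin", 30), ("V2", "C_dest", 690),
]

st_index = defaultdict(set)        # (camera, hour bucket) -> vehicles seen there
reverse_index = defaultdict(list)  # vehicle -> [(timestamp, camera), ...]
for vid, cam, ts in records:
    st_index[(cam, ts // 3600)].add(vid)
    reverse_index[vid].append((ts, cam))
for trace in reverse_index.values():
    trace.sort()

def estimate_travel_time(origin, dest, depart_hour=0):
    """Median travel time of vehicles seen at origin (in the hour) and later at dest."""
    durations = []
    for vid in st_index[(origin, depart_hour)]:
        trace = reverse_index[vid]
        t_o = next((t for t, c in trace if c == origin), None)
        t_d = next((t for t, c in trace if c == dest and t_o is not None and t > t_o), None)
        if t_o is not None and t_d is not None:
            durations.append(t_d - t_o)
    return median(durations) if durations else None

print(estimate_travel_time("C_origin", "C_dest"))  # 630 seconds on this toy data
```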

    APPLICATION
    Application of big data technology in precise prevention and control of epidemic situation
    Gang LI, Xiangchun ZHENG, Huashan YIN, Wenchao HUANG
    2021, 7(1):  124-134.  doi:10.11959/j.issn.2096-0271.2021009

    Taking City X as an example, and based on the actual situation of a mega-city and on big data processing and analysis methods, a large database for epidemic prevention and control built on “four standards and four realities” data was constructed. With big data technology assisting epidemic prevention and control, a system providing real-time awareness of the epidemic situation, precise personnel control, and precise enterprise assistance was built. The specific technical methods were analyzed in detail, including the status of data construction in the system, the association rule mining algorithm adopted, the infection warning mechanism based on expectation-maximization probability clustering, and the strategy for utilizing unstructured data based on text mining. The system saved more than 100 000 working hours for grassroots cadres and precisely located and traced tens of thousands of susceptible people of concern, playing an important role in blocking epidemic transmission, raising the rate of work resumption, and reducing economic losses; it therefore has reference significance for other regions of the country.
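
    As a rough, hypothetical illustration of the kind of co-presence rule that can feed an infection warning mechanism, the sketch below flags people who visited the same place as a confirmed case within a short time window. The deployed system relies on association rule mining and expectation-maximization based clustering over far richer data.

```python
# A minimal sketch of a co-presence rule behind infection warning: flag people who
# appear at the same place within a short time window of a confirmed case.
# Illustrative only; not the deployed system's algorithm or data.
from datetime import datetime, timedelta

visits = [  # (person, place, check-in time)
    ("P1", "market_A", datetime(2020, 2, 1, 9, 0)),
    ("P2", "market_A", datetime(2020, 2, 1, 9, 20)),
    ("P3", "market_A", datetime(2020, 2, 1, 15, 0)),
]

def susceptible_contacts(confirmed, window=timedelta(hours=1)):
    flagged = set()
    case_visits = [(pl, t) for p, pl, t in visits if p == confirmed]
    for person, place, t in visits:
        if person == confirmed:
            continue
        if any(place == pl and abs(t - t0) <= window for pl, t0 in case_visits):
            flagged.add(person)
    return flagged

print(susceptible_contacts("P1"))  # {'P2'}
```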

    FORUM
    Primary exploration of transborder data flow supervision
    Yangyong ZHU, Yun XIONG
    2021, 7(1):  135-144.  doi:10.11959/j.issn.2096-0271.2021010

    With increasing awareness of the value of data, transborder data flow has attracted more and more attention. On the one hand, transborder data flow is necessary for economic globalization and the development of the digital economy; on the other hand, without effective supervision it may damage national data security. Therefore, it is necessary to distinguish reasonable transborder data flow from malicious flow and to formulate appropriate regulations. Based on this analysis, two types of current transborder data flow and four channels of transborder data flow were identified, and a classification-based supervision method for transborder data flow was proposed. This work provides support for the supervision of, and legislation on, transborder data flow.
