Editors Recommend

    A survey of expressive speech synthesis
    Haobin TANG, Xulong ZHANG, Jianzong WANG, Ning CHENG, Jing XIAO
    Big Data Research    2023, 9 (6): 53-71.   DOI: 10.11959/j.issn.2096-0271.2022082

    Speech synthesis is an active research topic in the fields of speech, language, and machine learning. It aims to synthesize intelligible and natural speech for a given text and has a wide range of applications in industry. One goal of speech synthesis is to make the synthesized speech natural, and there is still considerable room for improvement in emotion, prosody, and other expressive aspects. A comprehensive survey of expressive speech synthesis was conducted to better understand the current state of research and future trends, with a summary, comparison, and analysis of recent emotion-based and prosody-based speech synthesis. First, the traditional approaches to speech synthesis and their bottlenecks were introduced. Then expressive speech synthesis was introduced, and its benefits in terms of emotion and prosody were described. Finally, a summary and outlook for expressive speech synthesis were presented.

    Research on the internal logic and solution of the “Channel Computing Resources from the East to the West” project
    Nannan TONG, Dong CHEN, Huiying LI, Honglin ZHU
    Big Data Research    2023, 9 (5): 9-19.   DOI: 10.11959/j.issn.2096-0271.2023055

    The "Channel Computing Resources from the East to the West" project is a major strategic project to balance computing resources and enable on-demand scheduling across China's territory. Since the full launch of the project, a series of problems have emerged on the supply side, demand side, energy side, technology side, and mechanism side. It is urgent to re-analyze and define the internal logic of the project at a theoretical level. The internal logic of "Channel Computing Resources from the East to the West", that is, China's computing infrastructure, was analyzed from the perspectives of economic form, technological trend, technological competition, and cost-benefit. A new type of infrastructure for the Chinese national computing network, called computing NET, was proposed, along with a construction path for the national computing NET covering policy layout, direct network connection, technical support, and mechanism innovation.

    A survey on digital content generation, detection, and forensics techniques
    Juan CAO, Yongchun ZHU, Peng QI, Ziyao HUANG, Tianyun YANG, Zhengjia WANG, Yuyan BU
    Big Data Research    2023, 9 (5): 150-173.   DOI: 10.11959/j.issn.2096-0271.2023066

    In recent years, digital content generation technology has developed rapidly, and digital content detection and forensics technologies face new challenges. This paper first introduced digital content generation technology from three aspects: large natural language models, visual generation technology, and multimodal generation technology. Second, it introduced digital content detection technology from three aspects: generated text detection, generated image detection, and generated audio and video detection. Third, it introduced digital content forensics technology from two aspects: utilizing factual information and utilizing forgery traces. Then it introduced the application scenarios of these techniques. Finally, it discussed future work in this research field and pointed out several directions that need attention.

    Harp: optimization algorithm for cross-domain distributed transactions
    Qiyu ZHUANG, Tong LI, Wei LU, Xiaoyong DU
    Big Data Research    2023, 9 (4): 16-31.   DOI: 10.11959/j.issn.2096-0271.2023043

    The paradigm of near-data computing has driven banks and securities firms to build multiple data centers globally or nationally. In the traditional business model, transactions mostly accessed data within a single data center. As business models change, distributed transactions across data centers have become common, such as transferring money between bank accounts or exchanging equipment between game accounts whose data is stored in data centers in different regions. Distributed transaction processing requires the two-phase commit protocol to ensure the atomicity of the sub-transactions submitted by each participating node. For cross-domain transactions, traditional transaction processing techniques need to be extended so that the system can sustain higher throughput despite the longer and more varied network latency between nodes. After analyzing the problems and the optimization space of cross-domain distributed transactions, this paper proposes a new distributed transaction processing algorithm called Harp. Harp delays the execution of some sub-transactions based on differences in network latency while guaranteeing the serializable isolation level, which reduces the duration of transaction lock contention and improves system concurrency and throughput. Experiments show that Harp improves performance by 1.39 times compared with the traditional algorithm under the YCSB workload.
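
    The idea of delaying sub-transactions by latency difference can be illustrated with a toy timing model. The sketch below is not the Harp algorithm itself: the `SubTxn` type, the node names, and the timing model (locks taken on arrival, released when the commit decision arrives) are assumptions made for illustration, and serializability checks are not modeled. It only shows why postponing dispatch to nearby participants can shorten their lock hold time without delaying the commit decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTxn:
    node: str                  # hypothetical participant name
    one_way_latency_ms: float  # coordinator -> participant network latency
    exec_ms: float             # local execution time while holding locks


def vote_ready_ms(sub: SubTxn) -> float:
    """Time from dispatch until this participant's vote reaches the coordinator."""
    return 2 * sub.one_way_latency_ms + sub.exec_ms


def dispatch_delays(subs: list[SubTxn]) -> dict[str, float]:
    """Delay each sub-transaction so every vote arrives when the slowest one does."""
    slowest = max(vote_ready_ms(s) for s in subs)
    return {s.node: slowest - vote_ready_ms(s) for s in subs}


def lock_hold_ms(subs: list[SubTxn], delays: dict[str, float]) -> dict[str, float]:
    """Lock hold time per node under the assumed model.

    Locks are taken when the sub-transaction arrives (delay + latency) and
    released when the commit message arrives (decision time + latency),
    so the hold time is decision time minus dispatch delay.
    """
    decision = max(delays[s.node] + vote_ready_ms(s) for s in subs)
    return {s.node: decision - delays[s.node] for s in subs}


# One nearby participant and one far-away participant (made-up latencies).
subs = [SubTxn("near", 1.0, 2.0), SubTxn("far", 50.0, 2.0)]
baseline = lock_hold_ms(subs, {s.node: 0.0 for s in subs})
delayed = lock_hold_ms(subs, dispatch_delays(subs))
print("no delay:", baseline)
print("delayed: ", delayed)
```

    In this model the nearby node no longer holds its locks for the full round trip of the distant node, while the commit decision time is unchanged.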

    Digital transformation service platform: enhancing enterprise competitiveness in a new competitive situation
    Yazhen YE, Yangyong ZHU
    Big Data Research    2023, 9 (3): 3-14.   DOI: 10.11959/j.issn.2096-0271.2023029

    With the improvement of data capabilities and the development of emerging technologies, profound changes are taking place in economic patterns and the competitive structure of industries. To better respond to future opportunities and challenges and to improve the competitiveness of enterprises in this new situation, it is necessary to understand and master digital transformation. The new competitive situation, in which traditional enterprises will gradually be replaced by digitally transformed ones, was discussed, and digital transformation was differentiated from digitalization. The main challenges that traditional enterprises face in digital transformation were identified: a lack of funds, talent, data, and awareness. A digital transformation service platform oriented to the new competitive situation was proposed, providing a feasible solution for enhancing enterprise competitiveness and carrying out digital transformation.

    Generative AI empowered metaverse organisms: prospects and challenges
    Hao WANG, Yushan PAN, Yi PAN
    Big Data Research    2023, 9 (3): 85-96.   DOI: 10.11959/j.issn.2096-0271.2023033

    The metaverse has been discussed in fields such as medicine, manufacturing, finance, education, and public services, but application scenarios based on virtual reality have not yet achieved a true "real-virtual-real" interaction loop. Nor has its interaction mode given the digital world the same consciousness and perception as the physical world. Taking medicine as a case study, the prospective applications and challenges of generative artificial intelligence models in metaverse organisms were explored, including digitizing biological cells and building connections between digitized cells and digital neurons, so that metaverse life forms have perception and biochemical reactions consistent with the physical world, thereby empowering the development of the medical field. In view of the current advantages and disadvantages of the metaverse and generative artificial intelligence models, the design of human-machine collaboration mechanisms was discussed to promote conscious interaction between humans and metaverse organisms in medicine.

    Human avatars synthesis technologies: a survey
    Yimin DENG, Xulong ZHANG, Shijing SI, Jianzong WANG, Jing XIAO
    Big Data Research    2023, 9 (3): 114-139.   DOI: 10.11959/j.issn.2096-0271.2022081

    The demand for efficient human avatar modeling is becoming increasingly urgent as the metaverse attracts more and more attention. Creating human avatars from human image datasets has long been a popular topic in computer vision. 3D human avatar synthesis can be regarded as a sub-module of 3D reconstruction that focuses on reproducing the complex articulated body and surface details of humans. A comprehensive survey of recent literature on human reconstruction was conducted, covering work on full-body avatars, talking heads, and clothing modeling. By analyzing and summarizing existing work, human avatar synthesis technologies were divided into five categories according to the features of their pipelines: mesh-based methods, image-based methods, voxel-based methods, implicit methods, and hybrid methods. First, the basic principles of each category were introduced. Second, implementations in related work were discussed, and the advantages and disadvantages of each method were pointed out. Third, the datasets and metrics for model quality evaluation were introduced. In addition, an overview of various applications was given. Finally, future directions for human avatar synthesis technology were discussed, with the goal of synthesizing high-quality, high-fidelity, and low-latency human avatars.

    Structural separation of data property rights based on data factor circulation value chain
    Lihua HUANG, Wanli DU, Biyu WU
    Big Data Research    2023, 9 (2): 5-15.   DOI: 10.11959/j.issn.2096-0271.2023022

    Data property rights are of great significance in the foundational data institution system. Data property rights are structurally separated into three rights: the right to possess data resources, the right to process data, and the right to manage data products. Existing theories are not practice-oriented, and the responsive approach to right-granting can hardly fix market failures. Based on the practice of data factor circulation, two typical logics of data factor circulation were proposed: architecture logic and market logic. Based on market logic, a value chain model of data factor circulation was constructed. The generative approach to right-granting could provide an explanation for the structural separation of data property rights.

    Investigation into authorized public data operation: its positioning and nature
    Feng GAO
    Big Data Research    2023, 9 (2): 16-32.   DOI: 10.11959/j.issn.2096-0271.2023017

    At present, the theory and practice of authorized data operation remain chaotic and controversial. Based on a clarification of what open data means in the Chinese context, the current view that authorized data operation is either complementary to open data or part of open data was challenged, and authorized data operation was repositioned as a way of implementing open data through a market-oriented delegation mechanism. The need to unpack the definitions of two key elements of authorized data operation, data product and data operation, was presented. A novel interpretation was proposed: authorized data operation is a socio-technical system that serves the purpose of open data and mainly produces primary data products for reuse, rather than merely for use, with support from a rich, tiered data operation service ecosystem.

    Research on the regularity of data factor formation and value release
    Zeyu WANG, Ailin LYU, Shu YAN
    Big Data Research    2023, 9 (2): 33-45.   DOI: 10.11959/j.issn.2096-0271.2023019

    Based on a conceptual and historical analysis of data and production factors, it was proposed that the data factor is computer data and its derivatives that are gathered, sorted, and processed according to specific production needs and that participate in social production and operation activities. Attention should focus on the value of the data factor as a new driving force for economic growth and production development. Three ways in which the data factor releases value were summarized: business operation by data, intelligent analysis by data, and external effects by data exchange. These should be given full attention in promoting the development of the data factor.

    Data trust: a trustworthy data transaction model
    Jinglei HUANG, Jinpu LI, Ke TANG
    Big Data Research    2023, 9 (2): 67-78.   DOI: 10.11959/j.issn.2096-0271.2023016

    Data trust is regarded as a new and credible data transaction model. It is not only an organizational structure that guarantees information security, but also an innovative institutional design that enhances the trustworthiness of the data factor market. A data trust operation mechanism was designed, and the organizational structure, characteristics, functions, and regulatory scheme of the data trust under this mechanism were discussed. It was pointed out that a data trust contains two independent legal relationships. Its most important feature is risk segregation among participants, which is the basis of data trust as a credible data transaction model. Data trusts could play an important role in the data value chain, specifically in data value addition, data escrow, and data commons. Finally, the unique advantages of data trust under the institutional guidelines of the “20 measures to build basic systems for data” were demonstrated, and it was pointed out that personal data trusts and public data trusts are feasible practical paths.

    Intelligent text generation: recent advances and challenges
    Xiaojun WAN
    Big Data Research    2023, 9 (2): 99-109.   DOI: 10.11959/j.issn.2096-0271.2023014

    Intelligent text generation is one of the advanced research directions in artificial intelligence and natural language processing. It is one of the key technologies behind AIGC and has attracted great attention from both academia and industry in recent years. The technology has been deployed in many application areas, including media publishing and e-commerce, and it can greatly improve the efficiency of text content production. A systematic overview of the applications of intelligent text generation and the mainstream approaches to text generation was given. Recent deep learning techniques for text generation were then introduced. Lastly, the challenges faced by neural text generation were summarized.

    Internet of data: a solution for dataspace infrastructure and its technical challenges
    Chaoran LUO, Yun MA, Xiang JING, Gang HUANG
    Big Data Research    2023, 9 (2): 110-121.   DOI: 10.11959/j.issn.2096-0271.2023024

    Dataspace is the transformation of cyberspace from "computing-centric" to "data-centric", which brings major technological problems and opportunities for innovation. Just as the internet is the main infrastructure of cyberspace, dataspace also needs a new "data-centric" infrastructure whose core function is to make data a first-class entity. From the perspective of dataspace, the support and shortcomings of mainstream technologies such as the internet, the World Wide Web, and the digital object architecture in treating data as a first-class entity were analyzed and summarized, and then the basic connotations and technical challenges of dataspace infrastructure were given. Finally, a method for making data a first-class entity based on data pragmatics was proposed. Based on this method, a solution called the internet of data, which integrates digital object architecture, distributed ledger, smart contract, and other technologies, was proposed to support the construction and operation of internet-scale dataspace infrastructure.

    Big data technologies forward-looking
    Hong MEI, Xiaoyong DU, Hai JIN, Xueqi CHENG, Yunpeng CHAI, Xuanhua SHI, Xiaolong JIN, Yasha WANG, Chi LIU
    Big Data Research    2023, 9 (1): 1-20.   DOI: 10.11959/j.issn.2096-0271.2023009

    Major countries around the world attach great importance to the development of big data technology. China has also made big data a national strategy, which is of great significance for long-term development. Big data technologies cover data collection, transmission, management, processing, analysis, and application, which form a data life cycle, as well as the data governance related to each stage. Four areas, namely big data management, processing, analysis, and governance, were selected to identify the gap between China and the rest of the world. Meanwhile, driven by diverse successful big data applications, the system architecture of computing technology is being restructured. In the shift from “computation-centric” to “data-centric”, fundamental computing theories and core technologies need to be redesigned, so a new type of big data system technology is becoming an important research direction. Against this background, four technical challenges and ten future development trends of big data technologies were identified.

    Cloud-edge-end collaborative big data management for metaverse
    Rui ZHU, Hongzhi WANG, Shuangshuang CUI, Kaixin ZHANG, Yu YAN
    Big Data Research    2023, 9 (1): 63-77.   DOI: 10.11959/j.issn.2096-0271.2023011

    As the number of users in the metaverse grows, the volume of data grows accordingly, which brings challenges to data management in the metaverse. Big data management techniques are essential to realizing the metaverse. Therefore, data management technology in the metaverse was discussed. The metaverse was decomposed into three levels: cloud, edge, and end. The massive data in the metaverse was analyzed. Four challenges of data management in the metaverse were discussed, and corresponding research routes were put forward from four aspects: data synchronization, data access, data model, and query optimization.

    Research on the legal conundrums and regulation ideas of metaverse
    Bo HE
    Big Data Research    2023, 9 (1): 87-102.   DOI: 10.11959/j.issn.2096-0271.2023007

    The rise and development of the metaverse have brought challenges to legal regulation. The metaverse is not a place outside the law; it must also abide by the law and operate properly within the legal framework. First, the development characteristics of the metaverse, such as its technological, commercial, social, and transnational nature, were summarized, and the risks and responses were analyzed. Second, the main legal conundrums of the metaverse were analyzed, including cybersecurity, personal information and privacy protection, data governance, virtual property, ecological governance, platform liability, and cybercrime. Finally, the idea of regulating metaverse development according to law was put forward. It was suggested to adhere to the principle of safe and controllable development, lay out metaverse legislation appropriately in advance, promote regulation in key areas through legislative means such as enactment, amendment, abolition, and interpretation, and achieve good governance through good laws.

    Threats and defenses of federated learning: a survey
    Jianhan WU, Shijing SI, Jianzong WANG, Jing XIAO
    Big Data Research    2022, 8 (5): 12-32.   DOI: 10.11959/j.issn.2096-0271.2022038

    With the widespread application of machine learning, data security problems occur from time to time, and people’s demand for privacy protection is growing. This reduces the possibility of data sharing between different entities, makes it difficult to make full use of data, and gives rise to data islands. Federated learning (FL), an effective method for solving the problem of data islands, is essentially distributed machine learning. Its key characteristic is that user data is kept locally, so that the joint training of models does not leak partners’ sensitive data. Nevertheless, federated learning still faces many security risks in practice, which need further study. The possible attacks on federated learning and the corresponding defenses were investigated comprehensively and systematically. First, the possible attacks and threats were classified according to the training stages of federated learning, common attack methods in each category were enumerated, and the principles behind these attacks were introduced. Then the specific defense measures against these attacks and threats were summarized along with an analysis of their principles, to provide a detailed reference for researchers who are new to this field. Finally, future work in this research area was highlighted, and several areas that need attention were pointed out to help improve the security of federated learning.
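
    The "data stays local" property described above can be sketched with federated averaging (FedAvg), a common baseline that the abstract does not name and that is assumed here purely for illustration. Each hypothetical client fits a one-parameter model y = w * x on its own data, and the server only ever sees and averages the resulting weights. The sketch shows the basic training loop only; it models none of the attacks or defenses the survey covers, and shared weights can themselves leak information in practice.

```python
def local_update(w: float, data: list[tuple[float, float]],
                 lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error of y = w * x with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fed_avg(global_w: float, client_datasets: list[list[tuple[float, float]]],
            rounds: int = 20) -> float:
    """Server loop: broadcast the weight, collect local updates, average them."""
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        # Only (weight, sample count) pairs leave each client, never raw data.
        updates = [(local_update(global_w, d), len(d)) for d in client_datasets]
        global_w = sum(w * n for w, n in updates) / total
    return global_w


# Two hypothetical clients whose private samples both follow y = 3x.
clients = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0)]]
w = fed_avg(0.0, clients)
print(f"learned weight: {w:.4f}")
```

    Weighting each update by the client's sample count keeps a client with little data from dominating the average.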

    Value chain model of data governance and its application on data governance regulation analysis
    Keman HUANG, Xiaoyong DU
    Big Data Research    2022, 8 (4): 3-16.   DOI: 10.11959/j.issn.2096-0271.2022062

    Cultivating the data marketplace is an important mechanism for realizing the value of big data. A prosperous data marketplace needs a sustainable and healthy data service ecosystem. A data governance value chain model was developed to identify the primary and support activities for data value release. A data service ecosystem model was then built accordingly to distinguish the different stakeholders and the core functions that a data marketplace should have. Using the data governance value chain model and the data service ecosystem model, data governance regulation was analyzed systematically, with the aim of providing suggestions to promote the growth of the data marketplace.

    Analysis on various patterns of data intermediary
    Zhenhua LI, Tongyi WANG
    Big Data Research    2022, 8 (4): 94-104.   DOI: 10.11959/j.issn.2096-0271.2022068

    Data intermediaries are expected to become the backbone of promoting data circulation through diversified innovative practices. Various patterns of data intermediaries were introduced. Different data intermediaries focus on solving different practical problems. For example, data transaction platforms focus on resolving the information asymmetry between the supply and demand sides, open banking service providers such as Plaid focus on unifying data standards and data interfaces, and data trusts can optimize the path of authorization and consent for personal information sharing. It was suggested to remain problem-oriented, actively explore and innovate diversified data intermediary patterns according to the needs of each scenario, build a healthy and reliable data ecosystem, and fully release the value of data.

    Features and transaction modes of data products in data markets
    Lihua HUANG, Yifan DOU, Mengke GUO, Qifeng TANG, Gen LI
    Big Data Research    2022, 8 (3): 3-14.   DOI: 10.11959/j.issn.2096-0271.2022045

    Developing markets for data as a factor of production is the key to the efficient allocation of the data factor. However, early data market practices in China have revealed a series of problems, which urgently call for a systematic review and analysis of the theoretical mechanisms of data markets. The circulation process of data products was analyzed from different perspectives, such as transaction cost theory, the electronic market framework, and electronic trading modes. It was further proposed that the effects of data computability are two-fold. On the one hand, computability enables data to be analyzed to fit the specific demands of certain industries. On the other hand, computability is also likely to move the data transaction process out of the market, which is known as platform disintermediation. Based on the classical theoretical framework of electronic markets, the offerings of data products were divided into four quadrants and analyzed accordingly. Finally, suggestions for data product suppliers and data transaction platform providers were put forward.

    Authenticating and licensing architecture of data rights in data trade
    Qifeng TANG, Zhiqing SHAO, Yazhen YE
    Big Data Research    2022, 8 (3): 40-53.   DOI: 10.11959/j.issn.2096-0271.2022029

    Data is a key factor of production in the digital economy, and establishing a data factor market is inevitable. The development of the data factor market involves efforts in the authentication of data rights, the object of transaction, pricing mechanisms, exchange platforms, trade regulation, and so on. The rights and authentication process necessary for a data product or data service to be traded in a data exchange were explored systematically. The form of the transaction object in data trade was designed as “data product/service + a right”. A variety of licenses for different forms of data products and data services were further designed, forming a licensing system that supports the exchange of data.

    Digital economics in metaverse: state-of-the-art, characteristics, and vision
    Chenhuizi WANG, Wei CAI
    Big Data Research    2022, 8 (3): 140-150.   DOI: 10.11959/j.issn.2096-0271.2022048

    Metaverse became a very popular technology buzzword at the end of 2021 after Facebook changed its name to Meta, indicating its long-term commitment to the metaverse. First, the course of technical development was reviewed to expound on the inevitability and necessity of the metaverse. Then the risks and challenges of the decentralized digital economy were revealed through an analysis of the overseas metaverse digital economy. Lastly, it was pointed out that the spiritual core of decentralization lies in the global anti-monopoly ideology, and the future of the domestic metaverse industry was envisioned.

    Opportunities and challenges of geo-spatial information science from the perspective of big data
    Deren LI, Guo ZHANG, Yonghua JIANG, Xin SHEN, Weiling LIU
    Big Data Research    2022, 8 (2): 3-14.   DOI: 10.11959/j.issn.2096-0271.2022012

    The era of big data has arrived and has penetrated every aspect of human life. For geo-spatial information science, which emerged from the intersection of earth sciences and information sciences, the advent of the big data era provides richer data support, but also brings new challenges in data storage, management, analysis, and mining, and has even caused a certain degree of “data explosion”. From the perspective of big data, the bottlenecks and challenges in four core areas, namely geographic information systems, smart cities, remote sensing big data, and spatial data mining, were sorted out. It was pointed out that geo-spatial information science can provide more accurate and real-time spatial information frameworks and more intelligent and efficient information processing methods for geoscience research, serving smart cities, smart earth construction, and sustainable development in the era of big data. Moreover, in the era of big data, the development of geo-spatial information science faces a dual test at the software and hardware levels.

    A survey on information extraction technology based on remote sensing big data
    Weiquan LIU, Cheng WANG, Yu ZANG, Qian HU, Shangshu YU, Baiqi LAI
    Big Data Research    2022, 8 (2): 28-57.   DOI: 10.11959/j.issn.2096-0271.2022014

    With the rapid development of remote sensing technology, China has established a relatively complete system for acquiring space remote sensing data and flexible, diverse aerial remote sensing data. Remote sensing big data is mainly based on massive remote sensing data, integrates other multi-source remote sensing data, and applies big data thinking and methods to discover knowledge, patterns, and high-value information in massive data. First, recent research on information extraction technology based on remote sensing big data was reviewed. Second, the development history of remote sensing information extraction technology was described from three aspects: remote sensing target detection, remote sensing surface object segmentation, and remote sensing change detection. Finally, information extraction technology based on remote sensing big data was summarized, and its prospects were discussed.

    Legislative background and system of China’s Personal Information Protection Law
    Xiaoyang YU, Bo HE
    Big Data Research    2022, 8 (2): 168-181.   DOI: 10.11959/j.issn.2096-0271.2022022

    The Personal Information Protection Law was deliberated in August 2021 and officially implemented in November of that year, promoting the establishment of China’s legal system for the protection of personal information. First, the background of the Law was systematically introduced. The Law went through three deliberations and adheres to a problem-oriented approach, with distinctive features and great significance. Second, the rules of the Law were analyzed, focusing on the scope of the concept of personal information, the basic principles of processing personal information, the legal bases for processing, the rights and obligations of relevant parties, the rules on cross-border provision, and legal liability. Finally, the development of personal information protection from indirect protection to direct protection and then to comprehensive protection was summarized.

    Table and Figures | Reference | Related Articles | Metrics
    Open access of scientific data in the context of open science: the practice of the National Tibetan Plateau Data Center
    Xiaoduo PAN, Xin LI, Youhua RAN, Xuejun GUO
    Big Data Research    2022, 8 (1): 113-120.   DOI: 10.11959/j.issn.2096-0271.2022010
    Abstract311)   HTML76)    PDF(pc) (3012KB)(603)       Save

    The concept, connotation, and importance of open science and open data practice for scientific research were introduced. The challenges currently faced by open data were described in detail, including data citation, data metrics, data interoperability, and big data analysis. Taking the National Tibetan Plateau Data Center as an example, its measures and results in data citation, data interoperability, and big data analysis were expounded. Finally, the future role of data centers in promoting open data was discussed.

    Table and Figures | Reference | Related Articles | Metrics
    A review and comparative analysis of domestic and foreign research on big data pricing methods
    Nan LIU, Xuejing HAO, Yuhong CHEN
    Big Data Research    2021, 7 (6): 89-102.   DOI: 10.11959/j.issn.2096-0271.2021063
    Abstract897)   HTML162)    PDF(pc) (1377KB)(986)       Save

    Because of the value characteristics of big data itself, the problem of data pricing is complicated. Although researchers have studied it extensively, most work takes a single perspective and lacks practical application. In view of this, big data pricing methods were reviewed, and five types of pricing were sorted out: cost-oriented, market-oriented, demand-oriented, profit-oriented, and life-cycle-based pricing. The advantages and disadvantages of six mainstream pricing methods were compared: the cost method, agreement pricing, the market method, the income method, quality-based pricing, and query-based pricing. Finally, through an analysis of the big data pricing process, the characteristics of the different pricing methods were further revealed, and the future direction of data pricing was forecast. The article aims to provide a reference for future related research.

    Table and Figures | Reference | Related Articles | Metrics
    Legal judgment prediction based on legal judgment documents
    Hu ZHANG, Bangze PAN, Hongye TAN, Ru LI
    Big Data Research    2021, 7 (5): 164-175.   DOI: 10.11959/j.issn.2096-0271.2021055
    Abstract556)   HTML92)    PDF(pc) (2304KB)(431)       Save

    To meet the practical needs of the “legal judgment prediction” task in the field of intelligent judicial services, the research ideas and implementation approaches were discussed, and the overall framework and specific process of the task were introduced. Based on the massive real cases obtained from China Judgments Online and the CAIL2018 evaluation dataset, the categories were sorted out, the format of the experimental dataset was standardized, and a dataset for legal judgment prediction based on legal judgment documents was built. For the judgment prediction model, high-quality sentences were extracted using a judgment element extraction method. Then, following the reasoning of judges, the whole task of legal judgment prediction was transformed into three subtasks, namely law article prediction, charge prediction, and penalty prediction, and a prediction model based on the judgment elements was constructed for each. The experimental results show that the proposed methods achieve excellent results on the criminal law judgment prediction dataset.

    Table and Figures | Reference | Related Articles | Metrics
    Issues faced by the determination of data ownership and solutions
    Bo HE
    Big Data Research    2021, 7 (4): 3-13.   DOI: 10.11959/issn.2096-0271.2021034
    Abstract657)   HTML115)    PDF(pc) (1009KB)(792)       Save

    As data becomes a key production factor, the determination of data ownership is becoming an important issue. Firstly, the urgent problems caused by unclear data ownership were analyzed from the perspectives of the government, enterprises, and individuals, including challenges to national data sovereignty and digital governance, data concentration and disorderly competition among enterprises, and personal data protection issues. Then, the theoretical and practical dilemmas in the determination of data ownership were pointed out. Finally, a solution to the data ownership dilemma was proposed, based on the principles of placing equal emphasis on development and regulation, strictly upholding the bottom line of personal information protection, and classification. The solution consists of improving the design of the legal system to establish a basic data management system and to explore classification-based rules for determining data ownership, strengthening administrative supervision to improve the transparency of data processing and the protection of personal information, and making full use of technical means.

    Reference | Related Articles | Metrics
    Assessment and pricing of data assets: research review and prospect
    Chuanru YIN, Tao JIN, Peng ZHANG, Jianmin WANG, Jiayi CHEN
    Big Data Research    2021, 7 (4): 14-27.   DOI: 10.11959/issn.2096-0271.2021035
    Abstract1106)   HTML202)    PDF(pc) (1814KB)(1458)       Save

    In the digital economy era, data has become a new key production factor. Since data is a new form of asset, how to manage the value of data assets has become a new research topic. Through a literature review, the research results of domestic and foreign scholars on data asset value management were analyzed systematically. On this basis, the concept of a data asset value index, which measures the relative value of data assets, was recommended. The process of calculating the data asset value index using the analytic hierarchy process and the fuzzy comprehensive evaluation method was summarized and decomposed into steps. The connections and differences between the value and the price of data assets, and between their value assessment and their pricing, were explained. Prospects for future research on data asset value management were proposed.

    Table and Figures | Reference | Related Articles | Metrics
    Key technologies and research progress of medical knowledge graph construction
    Ling TAN, Haihong E, Zemin KUANG, Meina SONG, Yu LIU, Zhengyu CHEN, Xiaoxuan XIE, Jundi LI, Jiawei FAN, Qingchuan WANG, Xiaoyang KANG
    Big Data Research    2021, 7 (4): 80-104.   DOI: 10.11959/issn.2096-0271.2021040
    Abstract1791)   HTML289)    PDF(pc) (1542KB)(2094)       Save

    With the continuous iteration of Internet technology, the semantic understanding of massive data is becoming more and more important. A knowledge graph is a kind of semantic network that reveals the relationships between entities. Medicine is one of the vertical fields in which knowledge graphs are most widely used, and the construction of medical knowledge graphs is a hot research topic in artificial intelligence at home and abroad. Starting from the ontology construction of medical knowledge graphs, the techniques of named entity recognition, entity relationship extraction, entity alignment, entity linking, knowledge graph storage, and knowledge graph application were reviewed. The difficulties, existing technologies, challenges, and future research directions in constructing medical knowledge graphs in recent years were introduced. Finally, the application of knowledge graphs and the future development direction of medical knowledge graphs were discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Recognition method of accounting fraud risk based on financial knowledge graph
    Qiang CHEN, Shiya DAI
    Big Data Research    2021, 7 (3): 116-129.   DOI: 10.11959/j.issn.2096-0271.2021029
    Abstract1007)   HTML205)    PDF(pc) (2017KB)(1303)       Save

    Since accounting risk events are increasingly complex and occur frequently, a method combining industry knowledge with a financial knowledge graph was proposed to recognize and prevent the accounting risk of commercial banks more precisely. Based on a financial knowledge graph of account transactions, deep graph-connected risk features were extracted using various graph analysis and mining technologies. By combining the graph features with industry knowledge, 249 single rules and 425 assembled rules were constructed to form a richer and flexibly configurable anti-fraud strategy system, which was then applied to check the current accounts of a commercial bank and select the highly suspicious ones. The experimental results show that the risk recognition accuracy of the intelligent strategy is much higher than that of the traditional one, reaching above 85%, which significantly improves the efficiency of accounting risk verification.

    Table and Figures | Reference | Related Articles | Metrics
    Application of big data visual analysis in the marine field
    Cui XIE, Mingkui LI, Ping CHEN, Xiaotian LI, Jian SONG, Junyu DONG, Jiameng ZHAO
    Big Data Research    2021, 7 (2): 3-14.   DOI: 10.11959/j.issn.2096-0271.2021011
    Abstract795)   HTML280)    PDF(pc) (4778KB)(924)       Save

    With the development of ocean observation technology and numerical simulation technology, ocean data of larger scale and higher resolution can be obtained, which brings opportunities for analyzing complex ocean environmental elements and structures but also poses great challenges to traditional analysis methods. For this reason, the method of big data visual analysis was introduced, and its application value was explored in the analysis of multivariate spatiotemporal ocean data and in the analysis of the spatiotemporal characteristics and evolution of important ocean structures. Several visual analysis systems were developed, and a basic framework for the visual analysis of ocean data was summarized through case studies of data from some sea areas around the world and in China, showing that visual analysis is a promising technology for analyzing complex ocean data in the era of big data.

    Table and Figures | Reference | Related Articles | Metrics
    An approach to buffering data efficiently in distributed storage systems
    Qinglin YANG, Guiyong WU, Guangyan ZHANG
    Big Data Research    2021, 7 (2): 147-157.   DOI: 10.11959/j.issn.2096-0271.2021018
    Abstract491)   HTML91)    PDF(pc) (1393KB)(520)       Save

    To address the problems of write amplification, long I/O paths, and high access latency in distributed storage systems, an efficient SSD-based caching approach for distributed storage systems was proposed. This approach adopts read/write bypassing and lazy caching to manage the cache system, considers both last access time and historical access frequency when performing cache replacement, and adjusts the flushing speed according to the foreground workload. It significantly improves the read and write performance of storage systems.

    Table and Figures | Reference | Related Articles | Metrics
    Application of big data technology in precise prevention and control of epidemic situation
    Gang LI, Xiangchun ZHENG, Huashan YIN, Wenchao HUANG
    Big Data Research    2021, 7 (1): 124-134.   DOI: 10.11959/j.issn.2096-0271.2021009
    Abstract946)   HTML287)    PDF(pc) (1746KB)(892)       Save

    Taking City X as an example, and based on the actual situation of a mega-city and on big data processing and analysis methods, a large database for epidemic prevention and control was built from “four standards and four realities” data. Using big data technology to assist epidemic prevention and control, a system for real-time awareness of the epidemic situation, precise personnel control, and precise enterprise assistance was built. The specific technical methods were analyzed in detail, including the status of data construction in the system, the association rule mining algorithm adopted, the infection warning mechanism based on expectation-maximization probabilistic clustering, and the strategy for using unstructured data based on text mining. The system saved more than 100 000 hours of work for country cadres and precisely located and traced tens of thousands of key susceptible people. It played a huge role in blocking the spread of infection, raising the rate of production resumption, and reducing economic losses, and it therefore serves as a reference for other parts of the country.

    Table and Figures | Reference | Related Articles | Metrics