Big Data Research ›› 2022, Vol. 8 ›› Issue (5): 106-123. doi: 10.11959/j.issn.2096-0271.2022035

• Research •

Extraction and visualization analysis of key elements of tax preferential policies

Haishan GUAN1,2, Yulong ZHENG1,2, Bifan WEI2,3, Zemin ZHANG1,2, Hao YUE1,2, Bin SHI2,4, Bo DONG2,4

  1. 1 School of Software Engineering, Xi’an Jiaotong University, Xi’an 710049, China
    2 Shaanxi Key Laboratory of Satellite & Terrestrial Network Technology R&D, Xi’an 710049, China
    3 School of Continuing Education, Xi’an Jiaotong University, Xi’an 710049, China
    4 School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
  • Online: 2022-09-01 Published: 2022-09-01
  • About the authors: Haishan GUAN (1996- ), male, is a master's student at the School of Software Engineering, Xi’an Jiaotong University. His main research interests are natural language processing and text question generation.
    Yulong ZHENG (1996- ), male, is a master's student at the School of Software Engineering, Xi’an Jiaotong University. His main research interests are natural language processing and text question generation.
    Bifan WEI (1977- ), male, Ph.D., is a research fellow at the School of Continuing Education, Xi’an Jiaotong University. His main research interests are Web information extraction and the construction and application of educational knowledge graphs.
    Zemin ZHANG (1999- ), male, is a master's student at the School of Software Engineering, Xi’an Jiaotong University. His main research interests include natural language processing, deep learning, and question generation.
    Hao YUE (1999- ), male, is a master's student at the School of Software Engineering, Xi’an Jiaotong University. His main research interests include big data and deep learning.
    Bin SHI (1992- ), male, Ph.D., is a lecturer at the School of Computer Science and Technology, Xi’an Jiaotong University. His main research interests are financial data mining, cloud computing, and virtualization technology.
    Bo DONG (1983- ), male, Ph.D., is a senior engineer at the School of Computer Science and Technology, Xi’an Jiaotong University. His main research interests are financial data mining and intelligent education.
  • Supported by:
    The National Key Research and Development Program of China (2020AAA0108800); The National Natural Science Foundation of China (62137002, 61937001, 62176209, 62176207, 62106190, 62050194); Innovative Research Group of the National Natural Science Foundation of China (61721002); Innovation Research Team of Ministry of Education (IRT_17R86); Consulting Research Project of Chinese Academy of Engineering “The Online and Offline Mixed Educational Service System for ‘The Belt and Road’ Training in MOOC China”; China Postdoctoral Science Foundation (2020M683493); Project of China Knowledge Centre for Engineering Science and Technology

Abstract:

As the number of preferential tax policies grows rapidly, taxpayers struggle to locate the provisions that apply to them among a large volume of policies, and as a result many do not receive the benefits they are entitled to. This paper combines the pre-trained language model BERT with rule-based processing to represent preferential tax policies and regulations, extract their key elements, and support visual querying of tax incentives, so that taxpayers can quickly and accurately locate the tax incentive information relevant to them and view the results in visual form. The experimental results show that the key element extraction achieves superior performance and that querying preferential tax policies is fast and intuitive, which can effectively alleviate the information overload caused by the large volume of preferential tax information.

Key words: preferential tax policy, pre-trained language model, information extraction, visualization

