Chinese Journal on Internet of Things ›› 2022, Vol. 6 ›› Issue (1): 113-122. doi: 10.11959/j.issn.2096-3750.2022.00255

• Theory and Technology •

Research on water quality data classification based on weighted Naive Bayes

Zhihao FANG1, Zhengquan LI1,2, Mingwei ZHANG1   

  1. 1 School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
    2 Jiangsu Future Networks Innovation Institute, Nanjing 211111, China
  • Revised: 2021-12-29 Online: 2022-03-30 Published: 2022-03-01
  • About the authors: Zhihao FANG (1996− ), male, master's student at the School of Internet of Things Engineering, Jiangnan University. His main research interest is the application and development of water quality monitoring systems.
    Zhengquan LI (1976− ), male, professor at the School of Internet of Things Engineering, Jiangnan University. His main research interests include massive MIMO technology, cooperative communication, and the Internet of Things.
    Mingwei ZHANG (1998− ), male, master's student at the School of Internet of Things Engineering, Jiangnan University. His main research interest is the application and development of water quality monitoring systems.
  • Supported by:
    The National Natural Science Foundation of China (61571108); The Wuxi Science and Technology Development Fund (H20191001); The Wuxi Science and Technology Development Fund (G20192010); The Future Network Scientific Research Fund Project (FNSRFP-2021-YB-11)

Abstract:

Water quality evaluation, that is, assigning a water body to a specific water quality category on the basis of multiple water quality parameters, is a basic step in implementing water environment management policies. To address this problem, an improved Naive Bayes classification method is proposed. The method assigns different weights to different attributes, which relaxes the conditional independence assumption of Naive Bayes and brings the classification results closer to the actual categories. First, 500 water quality records were selected as samples from the data released by the national surface water quality automatic monitoring stations (hereafter, national control water stations), and an evaluation system was built on four indicators: dissolved oxygen, permanganate index, ammonia nitrogen, and total phosphorus. Then, the improved Naive Bayes classification method was used to learn from and evaluate the samples, and its classification performance was verified by five-fold cross-validation. The results show that the accuracy, precision, recall, and F1 value of the improved method reach 96.0%, 95.9%, 93.8%, and 94.8%, respectively. These values are higher than those of the other Naive Bayes classification methods compared, so the method can serve as a reference for water quality data classification problems encountered in practical engineering.

Key words: water quality evaluation, Naive Bayes, five-fold cross-validation, performance index
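
The abstract does not state the weighting formula, so the following is only a minimal sketch of one common attribute-weighted Gaussian Naive Bayes formulation, in which each attribute's log-likelihood is multiplied by a weight before the terms are summed, evaluated with five-fold cross-validation and macro-averaged metrics. The weights, the class centres, the number of classes, and the synthetic data generator (`make_synthetic`) are hypothetical placeholders chosen for illustration; they are not the paper's values and not the monitoring-station data.

```python
import math
import random


def fit(X, y):
    """Estimate class priors and per-class Gaussian (mean, variance) for each attribute."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9  # avoid zero variance
            stats.append((mean, var))
        model[c] = (len(rows) / len(y), stats)
    return model


def predict(model, weights, x):
    """Pick the class maximising log P(c) + sum_j w_j * log p(x_j | c)."""
    best, best_score = None, -math.inf
    for c, (prior, stats) in model.items():
        score = math.log(prior)
        for w, v, (mean, var) in zip(weights, x, stats):
            log_pdf = -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            score += w * log_pdf
        if score > best_score:
            best, best_score = c, score
    return best


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision and recall, and the F1 derived from them."""
    pairs = list(zip(y_true, y_pred))
    acc = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls = [], []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    return acc, precision, recall, f1_from(precision, recall)


def five_fold_cv(X, y, weights, k=5, seed=0):
    """Shuffle once, split into k folds, train on k-1 folds and test on the held-out fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    y_true, y_pred = [], []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            y_true.append(y[i])
            y_pred.append(predict(model, weights, X[i]))
    return macro_metrics(y_true, y_pred)


def make_synthetic(n=500, seed=1):
    """Hypothetical data: columns stand in for DO, permanganate index, NH3-N, total P."""
    rng = random.Random(seed)
    centers = {0: [8.0, 2.0, 0.1, 0.02], 1: [6.0, 4.0, 0.5, 0.1], 2: [4.0, 8.0, 1.2, 0.25]}
    X, y = [], []
    for _ in range(n):
        c = rng.randrange(3)
        X.append([rng.gauss(m, 0.15 * m) for m in centers[c]])
        y.append(c)
    return X, y


X, y = make_synthetic()
weights = [1.0, 1.5, 1.5, 1.0]  # hypothetical attribute weights, not taken from the paper
acc, prec, rec, f1 = five_fold_cv(X, y, weights)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")
```

Setting every weight to 1 recovers ordinary Gaussian Naive Bayes, which gives a baseline to compare against. As a consistency check on the reported figures, `f1_from(0.959, 0.938)` is approximately 0.948, which agrees with the F1 value of 94.8% quoted in the abstract.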
