Telecommunications Science ›› 2020, Vol. 36 ›› Issue (10): 159-171. doi: 10.11959/j.issn.1000-0801.2020141

• Operation Technology Wide Angle •

Alarm information association scheme of cloud resource pool based on big data analysis

Qing LI (李青)

  1. Research Institute of China Telecom Co., Ltd., Shanghai 200135, China
  • Revised: 2020-04-25 Online: 2020-10-20 Published: 2020-11-07
  • About the author: Qing LI (1973- ), male, is a senior engineer at the Research Institute of China Telecom Co., Ltd., a senior expert in the core network and platform field, and a China Telecom Group B-level talent. His research mainly covers operation technologies, including telecom value-added service platform and product development, network evolution, and cloud computing/big data.

Abstract:

An end-to-end intelligent operation and maintenance management system for cloud resource pools was studied, and a technical architecture for an intelligent fault-determination module was proposed. The basic method for realizing end-to-end alarm association in a cloud resource pool was analyzed. The implementation principles of the single-KPI anomaly detection method and the multi-KPI fault propagation chain analysis method were explained. The association relationships between physical host alarms and virtual host alarms, between IP SAN storage alarms and virtual object storage alarms, and between host device alarms and network alarms were described in detail. The work provides a reference for operators seeking to improve the end-to-end intelligent operation and maintenance capability of cloud resource pools.

Key words: cloud resource pool, alert correlation, root cause analysis, intelligent operation and maintenance

CLC Number:
