Journal on Communications ›› 2022, Vol. 43 ›› Issue (6): 235-245. doi: 10.11959/j.issn.1000-436x.2022113

• Correspondences •

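Research on formant estimation algorithm for high order optimal LPC root value screening

Hua LONG (龙华), Shumeng SU (苏树盟)

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650031, China
  • Revised: 2022-04-30 Online: 2022-06-01 Published: 2022-06-01
  • About the authors: Hua LONG (1963- ), female, Hui ethnicity, born in Dali, Yunnan, Ph.D., professor at Kunming University of Science and Technology. Her main research interests are wireless networks and audio signal processing.
    Shumeng SU (1996- ), male, born in Baoshan, Yunnan, master's student at Kunming University of Science and Technology. His main research interests are audio signal processing and speech recognition.
  • Supported by:
    The National Natural Science Foundation of China (61761025)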

Research on formant estimation algorithm for high order optimal LPC root value screening

Hua LONG, Shumeng SU   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650031, China
  • Revised: 2022-04-30 Online: 2022-06-01 Published: 2022-06-01
  • Supported by:
    The National Natural Science Foundation of China (61761025)

Abstract:

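Objectives: Existing linear prediction (LP) formant estimation algorithms suffer from pseudo-root interference and pole interaction, which makes it very difficult to locate formants precisely. The low-order fit used in LP formant estimation fundamentally limits the accuracy of formant extraction. To reduce the large error of LP formant detection, and to address both the pseudo roots that are hard to remove and the spectral aliasing caused by pole interaction, a formant estimation algorithm based on root screening of high-order optimal LP coefficients is proposed. The study examines, at different orders, the root decision threshold constrained by the speech digital resonance model, the distribution of the optimal LP roots, the distribution of formant peaks in the spectral envelope, and the formant estimation error.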

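Methods: The LP order is increased so that the spectrum of the LP system fits the speech signal more closely. The accuracy of the computed formant frequencies is analyzed at different orders to obtain linear-system roots that fit the spectral peaks more accurately. A speech digital resonance model constrains the range of root magnitudes that can correspond to formants. Screening the linear-system roots against a root magnitude matched to the order reduces the number of pseudo roots, filters out the roots that do not correspond to formant frequencies, and eliminates spectral aliasing. Power weighting is then applied to emphasize the main spectral components of the signal and to correct the spectral amplitudes of the speech. This improves the energy match between the spectral peaks of the speech signal and those of the LPC spectrum, increases the distance between poles, reduces the prediction error caused by harmonic interference, and makes the spectral peak frequencies easier to distinguish.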

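Results: In the algorithm structure, the speech signal is first preprocessed. Pre-emphasis attenuates low-frequency content to reduce the interference of the fundamental frequency with formant detection, and boosts high frequencies so that the amplitude of the third formant stands out more clearly among the high-frequency spectral lines. Endpoint detection isolates non-speech segments, and high-order LP analysis under the digital resonance model constraint is performed on the speech frames. Three main techniques in the model improve performance. (1) Within the tolerance of the system, raising the LP order improves formant prediction accuracy. Formants are the peak frequencies of the spectral envelope and correspond to roots of the LP polynomial, that is, to poles of the LP model. A 9th-order LP preserves only the basic shape of the magnitude spectrum of the LP response of the speech signal. When the order is raised to 15, the LP fits the signal more closely, and its roots are more numerous and lie closer to the unit circle. The 15th-order LP recovers the formant fitting accuracy sacrificed by the 9th-order fit and improves formant extraction accuracy by 2.5%. (2) Roots are accepted or rejected with a threshold derived from the digital resonance root constraint, which effectively removes the low-frequency pseudo roots produced by harmonics of the fundamental frequency and the pseudo roots produced by formant harmonics. The formant peaks correspond to complex roots of the LP polynomial. The distribution of the detected roots shows that the high-order LP root threshold under the digital resonance root constraint effectively filters out the pseudo roots produced by vocal tract harmonics and accurately locates, within the unit circle, the roots that correspond to formant peaks. (3) Power weighting is applied to the speech in the frequency domain, and formants predicted from the corrected signal are more accurate. After power weighting, the energy of the spectral envelope is more concentrated. At order 18, the aliasing interference of the formant peak at 1363 Hz on the peak at 1359 Hz is eliminated. In terms of robustness and overall performance compared with other methods, the proposed algorithm extracts formants robustly from order 9 to order 22 and performs best at order 18.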

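Conclusions: This paper improves LPC-based formant detection. It studies how raising the linear prediction order affects formant extraction. To address the many pseudo roots and the interaction among many poles that a higher order brings, it minimizes the formant extraction error under the constraint of the speech digital resonance model. The relationship between the linear prediction order and the root magnitude screening threshold is analyzed. Root magnitude feedback under the digital resonance constraint yields a low-error screening threshold matched to the high order, which is used to remove pseudo roots. Combined with power weighting, which accentuates the amplitude of the spectral peaks, the method eliminates pole interaction during formant extraction and achieves accurate and effective formant extraction.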

Key words: linear prediction, digital resonance, power weighting, formant

Abstract:

Objectives: Existing linear prediction (LP) formant estimation algorithms have difficulty locating formants precisely because of pseudo-root interference and interaction between poles. The low-order fit used in LP formant prediction fundamentally limits the accuracy of formant extraction, while in high-order LP it is difficult to remove false roots and the spectral aliasing caused by pole interaction. To reduce the large error of LP formant detection, a formant estimation algorithm based on root value screening of high-order LP coefficients is proposed. The root determination threshold under the constraint of the speech digital resonance model, the distribution of the optimal LP roots, the distribution of formant peaks in the spectral envelope, and the formant estimation error are investigated at different orders.
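The terms "root" and "pole" used above can be made concrete with the standard all-pole LP model. The notation below (a_k, p, f_s, r_i, θ_i) and the sign convention follow common textbook usage and are assumptions here, since the abstract does not define symbols.

```latex
% All-pole LP model of order p (sign convention assumed)
H(z) = \frac{G}{A(z)}, \qquad A(z) = 1 + \sum_{k=1}^{p} a_k z^{-k}

% Each complex root of A(z) is a pole of H(z); r_i < 1 for a stable model
z_i = r_i e^{j\theta_i}

% Candidate formant frequency and 3 dB bandwidth at sampling rate f_s
F_i = \frac{\theta_i}{2\pi} f_s, \qquad B_i = -\frac{f_s}{\pi} \ln r_i
```

A root close to the unit circle (r_i near 1) gives a narrow bandwidth and a sharp spectral peak, which is why a threshold on the root magnitude can help separate formant candidates from pseudo roots. As a purely illustrative number, with an assumed f_s = 8000 Hz a root with r_i = 0.9 has B_i ≈ 268 Hz.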

Methods: The LP order is increased to improve how well the spectrum of the LP system fits the speech signal. The accuracy of the computed formant frequencies is analyzed at different orders, and linear-system roots with higher peak-fitting precision are obtained. A speech digital resonance model is used to constrain the root amplitude range of the formants, and the number of false roots is reduced by screening the roots of the linear system against the root amplitude matched to the order. Power weighting is then applied to emphasize the main spectral components of the signal. As a result, the spectral amplitude of the speech is corrected, the energy match between the spectral peaks of the speech signal and those of the LPC spectrum is improved, the distance between poles is increased, the prediction error caused by harmonic interference is reduced, and the spectral peak frequencies become easier to distinguish.
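The root-screening idea in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed `min_radius` and `min_freq` thresholds, and the synthetic test signal are all hypothetical, and the paper instead derives an order-matched threshold from its digital resonance model. Power weighting is not shown because the abstract does not define it.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Return LP coefficients [1, a_1, ..., a_p] via the autocorrelation method."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order (index n-1 of the full output is lag 0)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = -r[1:], solved directly for clarity
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def screen_roots(a, fs, min_radius=0.9, min_freq=90.0):
    """Keep roots of A(z) whose magnitude and frequency pass fixed thresholds.

    min_radius and min_freq are illustrative values, not the paper's threshold.
    Returns (frequencies_hz, bandwidths_hz), sorted by frequency.
    """
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]           # one root per conjugate pair
    radius = np.abs(roots)
    freqs = np.angle(roots) * fs / (2 * np.pi)  # root angle -> frequency in Hz
    bws = -np.log(radius) * fs / np.pi          # root magnitude -> bandwidth in Hz
    keep = (radius >= min_radius) & (freqs >= min_freq)
    idx = np.argsort(freqs[keep])
    return freqs[keep][idx], bws[keep][idx]


def synth_frame(formants_hz, fs, radius=0.97, n=1024):
    """Impulse response of an all-pole filter with known resonances (test signal)."""
    theta = 2 * np.pi * np.asarray(formants_hz, dtype=float) / fs
    poles = radius * np.exp(1j * theta)
    a = np.real(np.poly(np.concatenate((poles, np.conj(poles)))))
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(a) - 1) + 1):
            acc -= a[k] * h[i - k]
        h[i] = acc
    return h


# Order 12 on a signal with three resonances leaves extra roots to screen out
fs = 8000
frame = synth_frame([700, 1200, 2600], fs)
a = lpc_autocorr(frame, order=12)
freqs, bws = screen_roots(a, fs)
print(np.round(freqs), np.round(bws))
```

On real speech the frame would normally be pre-emphasized and windowed first, and a fixed magnitude threshold is usually too crude at high orders, which is the gap the order-matched threshold described above is meant to close.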

Results: In the algorithm structure, the speech signal is first preprocessed: pre-emphasis attenuates low-frequency content to reduce the interference of the fundamental frequency with formant detection, and boosts high-frequency content so that the amplitude of the third formant is more distinct in the high-frequency part of the spectrum. Endpoint detection then isolates non-speech segments so that high-order LP analysis under the digital resonance model constraint is applied to speech frames only. Three main techniques in the model improve performance. (1) Within the tolerance of the system, increasing the LP order improves formant prediction accuracy. A formant is a peak frequency of the spectral envelope and corresponds to a root of the LP polynomial, that is, a pole of the LP model. A 9th-order linear prediction preserves only the basic shape of the magnitude spectrum of the LP response of the speech signal. When the order is increased to 15, the LP fits the signal more closely, and its roots are more numerous and lie closer to the unit circle. The 15th-order LP compensates for the formant fitting accuracy sacrificed by the 9th-order fit and improves formant extraction accuracy by 2.5%. (2) Using a threshold under the digital resonance root constraint to accept or reject complex roots effectively filters out the low-frequency false roots generated by harmonics of the fundamental frequency and the false roots generated by formant harmonics. The formant peaks correspond to complex roots of the LP polynomial. The distribution of the detected roots shows that the high-order LP root threshold under the digital resonance root constraint effectively filters out the false roots generated by vocal tract harmonics and accurately locates, within the unit circle, the roots corresponding to formant peaks. (3) Power weighting is applied to the speech in the frequency domain, and formants predicted from the corrected signal are more accurate. The energy of the spectral envelope is more concentrated after power weighting. At order 18, the aliasing interference of the formant peak at 1363 Hz on the peak at 1359 Hz is eliminated. In terms of robustness and overall performance compared with other methods, the proposed algorithm extracts formants robustly from order 9 to order 22 and shows its best performance at order 18.
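The pre-emphasis step mentioned at the start of the Results is commonly implemented as a first-order high-pass filter. The sketch below assumes that common form, y[n] = x[n] - alpha * x[n-1], with alpha = 0.97 and an 8000 Hz sampling rate. The abstract does not state the coefficient or sampling rate used in the paper, so both values and the function names are hypothetical.

```python
import numpy as np


def pre_emphasis(x, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to a non-empty 1-D signal."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]                       # no previous sample for the first output
    y[1:] = x[1:] - alpha * x[:-1]
    return y


def pre_emphasis_gain(freq_hz, fs, alpha=0.97):
    """Magnitude response |1 - alpha * exp(-j*w)| of the filter at freq_hz."""
    w = 2 * np.pi * np.asarray(freq_hz, dtype=float) / fs
    return np.abs(1 - alpha * np.exp(-1j * w))


# Gain near the pitch range versus gain near a typical third-formant region
gains = pre_emphasis_gain([100.0, 3000.0], fs=8000)
print(gains)
```

The gain is below 1 at low frequencies and above 1 at high frequencies, which matches the stated purpose: less influence from the fundamental frequency and a more distinct third formant.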

Conclusions: The LPC-based method of formant detection is improved. The effect of increasing the linear prediction order on formant extraction is studied. To address the multiple pseudo roots and the multi-pole interaction caused by increasing the order, the formant extraction error under the constraint of the speech digital resonance model is minimized. The relationship between the linear prediction order and the root amplitude screening threshold is analyzed. To remove false roots, root amplitude feedback under the digital resonance constraint is used to obtain a low-error screening threshold matched to the high order. Combined with power weighting, which accentuates the amplitude of the spectral peaks, the method eliminates pole interaction in formant extraction and achieves accurate and effective formant extraction.

Key words: linear prediction, digital resonance, power weighting, formant

CLC Number: 
