Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement

DAT, Tran Huy; TAKEDA, Kazuya; ITAKURA, Fumitada

http://hdl.handle.net/2237/15052

Name / File	License	Action
431.pdf (591.9 kB)

Item type

Journal Article(1)

Publication date

2011-07-07

Title

Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement

Language

Authors

DAT, Tran Huy
TAKEDA, Kazuya
ITAKURA, Fumitada

Access rights

open access

Access rights URI

http://purl.org/coar/access_right/c_abf2

Keywords (subject scheme: Other)

speech enhancement
speech recognition
gamma modeling
fourth-order moment
MMSE
MAP
spectral magnitude
power
log-spectral magnitude

Abstract

Description

This study shows the effectiveness of using a gamma distribution in the speech power domain as a more general prior distribution for model-based speech enhancement approaches. This model is a superset of the conventional Gaussian model of the complex spectrum and provides more accurate prior modeling when the optimal parameters are estimated. We develop a method to adapt the modeled distribution parameters from each actual noisy speech signal in a frame-by-frame manner. Next, we derive and investigate the minimum mean square error (MMSE) and maximum a posteriori probability (MAP) estimations in different domains (speech spectral magnitude, generalized power, and its logarithm) using the proposed gamma modeling. Finally, a comparative evaluation of the MAP and MMSE filters is conducted. While the MMSE estimations tend to become more complicated under more general prior distributions, the MAP estimations are given in closed form and are therefore suitable for implementation. The adaptive estimation of the modeled distribution parameters provides more accurate prior modeling; this is the principal merit of the proposed method and the reason for its better performance. Based on the experiments, the MAP estimation is recommended for its high efficiency and low complexity. Among the MAP-based systems, estimation in the log-magnitude domain is shown to be the best for speech recognition, while estimation in the power domain is superior for noise reduction.

Language

Description type

Abstract
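
The abstract's claim that the gamma model in the power domain is a superset of the Gaussian model of the complex spectrum can be illustrated with a small numerical sketch. The code below is not the paper's on-line estimator (which works frame by frame on noisy speech); it is a plain moment-matching fit on clean synthetic data, written under the assumption that power P follows a gamma distribution with E[P] = k·θ and Var[P] = k·θ². The function name `gamma_moments_fit`, the sample size, and the shape value 0.5 are hypothetical choices made for illustration. Note that the second moment of the power is the fourth-order moment of the spectral magnitude, which is how the "fourth-order moment" keyword relates to this kind of fit.

```python
import numpy as np


def gamma_moments_fit(power):
    """Moment-matching fit of a gamma distribution to power samples.

    Illustrative helper (not from the paper). Uses E[P] = k * theta and
    Var[P] = k * theta**2 and returns (shape k, scale theta).
    """
    power = np.asarray(power, dtype=float)
    m1 = power.mean()           # second-order moment of the magnitude
    m2 = np.mean(power ** 2)    # fourth-order moment of the magnitude
    var = m2 - m1 ** 2
    shape = m1 ** 2 / var
    scale = var / m1
    return shape, scale


rng = np.random.default_rng(0)
n = 200_000

# Complex Gaussian coefficient with unit-variance real and imaginary parts:
# its power is exponentially distributed, i.e. gamma with shape 1, scale 2.
z = rng.normal(size=n) + 1j * rng.normal(size=n)
shape_gauss, scale_gauss = gamma_moments_fit(np.abs(z) ** 2)

# Heavier-tailed synthetic power with shape 0.5 (illustrative value only).
p = rng.gamma(shape=0.5, scale=2.0, size=n)
shape_heavy, scale_heavy = gamma_moments_fit(p)

print("Gaussian spectrum:  shape=%.3f scale=%.3f" % (shape_gauss, scale_gauss))
print("Heavy-tailed power: shape=%.3f scale=%.3f" % (shape_heavy, scale_heavy))
```

In this sketch the Gaussian case recovers a shape close to 1, which is the special case the gamma family contains, while other shape values describe power distributions that a Gaussian spectral model cannot represent.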

Publisher

Language

Publisher

Institute of Electronics, Information and Communication Engineers

Language

eng

Resource type

Resource type URI

http://purl.org/coar/resource_type/c_6501

Type

journal article

Publication type

VoR

Publication type URI

http://purl.org/coar/version/c_970fb48d4fbd8a85

Versions

Ver.1

2021-03-01 18:37:10.017274
