Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
http://hdl.handle.net/2237/14966
Name / File | License | Action |
---|---|---|
393.pdf (350.5 kB) | | |
Item type | Journal Article(1) | |||||
---|---|---|---|---|---|---|
Date released | 2011-06-28 | |||||
Title | ||||||
Title | Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN | |||||
Language | en | |||||
Authors | WANG, Longbiao; NAKAGAWA, Seiichi; KITAOKA, Norihide | |||||
Access rights | ||||||
Access rights | open access | |||||
Access rights URI | http://purl.org/coar/access_right/c_abf2 | |||||
Rights | ||||||
Language | en | |||||
Rights information | Copyright (C) 2008 IEICE | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | robust speech recognition | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | distant-talking environment | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | CMN | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | long-term spectrum | |||||
Abstract | ||||||
Description type | Abstract | |||||
Description | In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window, so conventional short-term spectrum based Cepstral Mean Normalization (CMN) is not effective. In this paper, we propose a robust speech recognition method that combines short-term spectrum based CMN with long-term spectrum based CMN. We assume that a static speech segment (a vowel, for example) affected by reverberation can be modeled by long-term cepstral analysis, so the effect of long reverberation on a static speech segment may be compensated by long-term spectrum based CMN. The cepstral distance between neighboring frames is used to discriminate static speech segments (long-term spectrum) from non-static speech segments (short-term spectrum). The cepstra of the static and non-static speech segments are then normalized by their corresponding cepstral means. In a previous study, we proposed an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN), which compensates for channel distortion depending on speaker position and is more efficient than conventional CMN. In this paper, the concept of combining short-term and long-term spectrum based CMN is extended to PDCMN; we call this Variable Term spectrum based PDCMN (VT-PDCMN). Because a position-dependent cepstral mean contains the average speaker characteristics over all speakers, PDCMN/VT-PDCMN cannot normalize speaker variations, so we also combine PDCMN/VT-PDCMN with conventional CMN in this study. We evaluated the proposed method on limited-vocabulary (100 words) distant-talking isolated word recognition in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN. | |||||
Language | en | |||||
Publisher | ||||||
Publisher | Institute of Electronics, Information and Communication Engineers | |||||
Language | en | |||||
Language | ||||||
Language | eng | |||||
Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_6501 | |||||
Resource type | journal article | |||||
Version type | ||||||
Version type | VoR | |||||
Version type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |||||
Relation | ||||||
Relation type | isVersionOf | |||||
Identifier type | URI | |||||
Related identifier | http://www.ieice.org/jpn/trans_online/index.html | |||||
ISSN | ||||||
Source identifier type | PISSN | |||||
Source identifier | 0916-8532 | |||||
Bibliographic information | en : IEICE transactions on information and systems, vol. E91-D, no. 3, pp. 457-466, issue date 2008-03-01 | |||||
Author version flag | ||||||
Value | publisher | |||||
URI | ||||||
Identifier | http://www.ieice.org/jpn/trans_online/index.html | |||||
Identifier type | URI | |||||
URI | ||||||
Identifier | http://hdl.handle.net/2237/14966 | |||||
Identifier type | HDL |
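
The abstract describes two normalization steps: conventional CMN, and a variant that splits frames into static and non-static segments by the cepstral distance of neighboring frames and normalizes each group by its own cepstral mean. The following is a minimal Python sketch of that idea only, not the authors' implementation. The function names, the Euclidean distance, the `threshold` parameter, and the rule for the first frame are all assumptions made for illustration. In particular, the paper analyzes static segments with a long-term spectrum, whereas this sketch reuses the same cepstral vectors for both groups.

```python
import math


def cepstral_mean(frames):
    """Per-dimension mean of a non-empty list of cepstral vectors."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def cmn(frames):
    """Conventional CMN: subtract the utterance-level cepstral mean."""
    mean = cepstral_mean(frames)
    return [[c - m for c, m in zip(f, mean)] for f in frames]


def label_static(frames, threshold):
    """Label a frame static when its Euclidean cepstral distance to the
    previous frame is below `threshold` (hypothetical rule and parameter).
    The first frame is compared with the next one instead."""
    labels = []
    for i, f in enumerate(frames):
        ref = frames[i - 1] if i > 0 else frames[min(1, len(frames) - 1)]
        labels.append(math.dist(f, ref) < threshold)
    return labels


def variable_term_cmn(frames, threshold):
    """Normalize static and non-static frames by their own cepstral means.
    Returns the normalized frames and the static/non-static labels."""
    labels = label_static(frames, threshold)
    out = [None] * len(frames)
    for flag in (True, False):
        idx = [i for i, s in enumerate(labels) if s == flag]
        if not idx:
            continue  # no frames in this group
        mean = cepstral_mean([frames[i] for i in idx])
        for i in idx:
            out[i] = [c - m for c, m in zip(frames[i], mean)]
    return out, labels


# Toy 2-dimensional "cepstra": three near-identical frames, then two jumps.
frames = [[1.0, 2.0], [1.1, 2.0], [1.0, 2.1], [5.0, 0.0], [9.0, -3.0]]
normalized, labels = variable_term_cmn(frames, threshold=0.5)
```

With this toy input the first three frames are labeled static and the last two non-static, and each group has zero mean after normalization. A position-dependent variant would replace the per-utterance means with means estimated in advance for each speaker position, which this sketch does not attempt.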