ショット内及びショット間の画像・音声特徴に着目したスピーチショット抽出

熊谷, 章吾; 道満, 恵介; 高橋, 友和; 出口, 大輔; 井手, 一郎; 村瀬, 洋; KUMAGAI, Shogo; DOMAN, Keisuke; TAKAHASHI, Tomokazu; DEGUCHI, Daisuke; IDE, Ichiro; MURASE, Hiroshi

http://hdl.handle.net/2237/23832

Name / File	License	Action
110009546402.pdf (923.2 kB)

Item type

Journal Article(1)

Publication date

2016-03-15

Title

ショット内及びショット間の画像・音声特徴に着目したスピーチショット抽出

Alternative title

Extraction of Speech Shots Focusing on Visual and Audio Features within and between Shots

Authors

熊谷, 章吾
道満, 恵介
高橋, 友和
出口, 大輔
井手, 一郎
村瀬, 洋
KUMAGAI, Shogo
DOMAN, Keisuke
TAKAHASHI, Tomokazu
DEGUCHI, Daisuke
IDE, Ichiro
MURASE, Hiroshi

Access rights

open access

Access rights URI

http://purl.org/coar/access_right/c_abf2

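Rights

Rights information

(c) The Institute of Electronics, Information and Communication Engineers (一般社団法人電子情報通信学会). The full-text data was reproduced from CiNii with the permission of the academic society.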

Keywords

Subject scheme

Other

Subject

スピーチショット抽出 (Speech shot extraction)
ニュース映像 (news video)
映像検索 (video retrieval)
画像・音声特徴 (audio-visual features)

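Abstract

Description

This report proposes a method for extracting speech shots from news video by judging whether the on-screen subject and the speaker are the same person, based on features within and between shots. Speech shots are rich in multimedia information and have high archival value. We have previously proposed a method that makes this judgment from the correlation between multiple audio features obtained from the speaker's voice and visual features obtained from the subject's lip motion. That method judges accurately for shots with little audio noise, but has difficulty with shots that contain heavy audio noise. This report therefore proposes a two-stage judgment of whether the subject and the speaker are the same. The first stage applies our previously proposed method. The second stage then makes the judgment based on characteristic visual and audio properties that appear within the shot and between the shot and its preceding and following shots. A speech shot extraction experiment confirmed the effectiveness of the proposed method.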

Description type

Abstract

Abstract

Description

We propose a method to extract speech shots from news videos by detecting the inconsistency between a subject and the speaker, focusing on features within and between shots. Speech shots in news videos contain a wealth of multimedia information and are valuable as archived material. To extract speech shots, we have previously proposed a method to detect the inconsistency between a subject and the speaker based on the co-occurrence between the subject's lip motion and the speaker's voice. This previous method could detect the inconsistency in a shot with little audio noise. However, it had difficulty detecting the inconsistency in a shot with a significant amount of audio noise. To deal with this problem, the proposed method detects the inconsistency between a subject and the speaker in two steps. The first step detects the inconsistency by our previous method, and the second step detects the inconsistency based on intra- and inter-shot features. Experimental results showed the effectiveness of the proposed method.
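The two-step detection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two steps are combined, which features are used, or what thresholds apply. Here the first step is assumed to be a Pearson correlation between a per-frame lip-motion series and an audio-energy series, the second step is assumed to run only when the first step is inconclusive, and the names `pearson`, `is_speech_shot`, `second_stage`, `low`, and `high` are all hypothetical.

```python
from math import sqrt
from typing import Callable, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def is_speech_shot(
    lip_motion: Sequence[float],
    audio_energy: Sequence[float],
    second_stage: Callable[[], bool],
    low: float = 0.2,   # assumed threshold, not from the source
    high: float = 0.6,  # assumed threshold, not from the source
) -> bool:
    """Return True if the subject is judged to be the speaker.

    Step 1 (assumed): correlate lip motion with audio energy.
    Step 2 (assumed): when step 1 is inconclusive, e.g. under heavy
    audio noise, defer to a caller-supplied judgment that stands in
    for the intra- and inter-shot features.
    """
    r = pearson(lip_motion, audio_energy)
    if r >= high:
        return True
    if r <= low:
        return False
    return second_stage()
```

In this sketch a strongly correlated shot is accepted by the first step alone, a clearly uncorrelated shot is rejected, and only the ambiguous middle band reaches `second_stage`, which a caller would replace with whatever within-shot and between-shot cues are available.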

Description type

Abstract

Description

IEICE Technical Report;IE2011-147,IEICE Technical Report;MVE2011-109

Description type

Other

Publisher

The Institute of Electronics, Information and Communication Engineers (一般社団法人電子情報通信学会)

Language

jpn

Resource type

Resource type URI

http://purl.org/coar/resource_type/c_6501

Type

journal article

Publication type

VoR

Publication type URI

http://purl.org/coar/version/c_970fb48d4fbd8a85

Versions

Ver.1

2021-03-01 15:15:50.321614
