ショット内及びショット間の画像・音声特徴に着目したスピーチショット抽出
http://hdl.handle.net/2237/23832
Name / File | License | Action
---|---|---
110009546402.pdf (923.2 kB) | license_free | https://nagoya.repo.nii.ac.jp/record/21684/files/110009546402.pdf

Item type | Journal Article(1)
---|---
Release date | 2016-03-15
Title | ショット内及びショット間の画像・音声特徴に着目したスピーチショット抽出
Alternative title | Extraction of Speech Shots Focusing on Visual and Audio Features within and between Shots
Authors | 熊谷, 章吾 (KUMAGAI, Shogo); 道満, 恵介 (DOMAN, Keisuke); 高橋, 友和 (TAKAHASHI, Tomokazu); 出口, 大輔 (DEGUCHI, Daisuke); 井手, 一郎 (IDE, Ichiro); 村瀬, 洋 (MURASE, Hiroshi)
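Rights | (c) The Institute of Electronics, Information and Communication Engineers (一般社団法人電子情報通信学会). The full-text data was reproduced from CiNii with the permission of the academic society.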
Keywords (subject scheme: Other) | Speech shot extraction (スピーチショット抽出); news video (ニュース映像); video retrieval (映像検索); audio-visual features (画像・音声特徴)
Abstract | We propose a method for extracting speech shots from news videos by detecting inconsistency between the subject and the speaker, focusing on features within and between shots. Speech shots in news videos contain a wealth of multimedia information and are valuable as archival material. To extract them, we previously proposed a method that detects inconsistency between the subject and the speaker based on the co-occurrence of the subject's lip motion and the speaker's voice, using correlations between multiple audio and visual features. That method detects inconsistency accurately in shots with little audio noise, but has difficulty with shots containing a significant amount of audio noise. To address this problem, the proposed method detects inconsistency between the subject and the speaker in two steps. The first step applies our previous method; the second step detects inconsistency based on characteristic visual and audio properties that appear within a shot and between it and the preceding and following shots (intra- and inter-shot features). Speech shot extraction experiments confirmed the effectiveness of the proposed method.
Description type | Abstract
Publisher | The Institute of Electronics, Information and Communication Engineers (一般社団法人電子情報通信学会)
Language | jpn
Resource type | journal article (http://purl.org/coar/resource_type/c_6501)
Source identifier (ISSN) | 0913-5685
Bibliographic information | 電子情報通信学会技術研究報告. MVE, マルチメディア・仮想環境基礎 (IEICE Technical Report. MVE, Multimedia and Virtual Environment), vol. 111, no. 479, pp. 81-86, issued 2012-03
Author version flag | publisher
Series | IEICE Technical Report;IE2011-147
Series | IEICE Technical Report;MVE2011-109
Identifier (URI) | http://ci.nii.ac.jp/naid/110009546402/
Identifier (HDL) | http://hdl.handle.net/2237/23832