WEKO3
Item
{
"_buckets": {"deposit": "58fbc28d-1188-459c-9bb4-ac1e51dfea2c"},
"_deposit": {"id": "8724", "owners": [], "pid": {"revision_id": 0, "type": "depid", "value": "8724"}, "status": "published"},
"_oai": {"id": "oai:nagoya.repo.nii.ac.jp:00008724", "sets": ["1038"]},
"author_link": ["24558", "24559", "24560", "24561"],
"item_18_biblio_info_6": {"attribute_name": "書誌情報", "attribute_value_mlt": [{"bibliographicIssueDates": {"bibliographicIssueDate": "2006-12", "bibliographicIssueDateType": "Issued"}, "bibliographicPageEnd": "114", "bibliographicPageStart": "113", "bibliographic_titles": [{"bibliographic_title": "4th Symposium on \"Intelligent Media Integration for Social Information Infrastructure\" December 7-8, 2006", "bibliographic_titleLang": "en"}]}]},
"item_18_description_4": {"attribute_name": "抄録", "attribute_value_mlt": [{"subitem_description": "This paper describes a music retrieval system that enables a user to retrieve a song by two different methods: by singing its melody or by saying its title. To allow the user to use those methods seamlessly without changing a voice input mode, a method of automatically discriminating between singing and speaking voices is indispensable. We therefore first investigated measures that characterize differences between singing and speaking voices. From subjective experiments, we found that even short term characteristics such as the spectral envelope represented as MFCC can be used as a discrimination cue, while the temporal structure is the most important cue when longer signals are given. According to these results, we developed the automatic method of discriminating between singing and speaking voices by combining two measures: MFCC and an F0 (voice pitch) contour. Based on this method, we built the music retrieval system that can accept both singing voices for the melody and speaking voices for the title.", "subitem_description_language": "en", "subitem_description_type": "Abstract"}]},
"item_18_identifier_60": {"attribute_name": "URI", "attribute_value_mlt": [{"subitem_identifier_type": "HDL", "subitem_identifier_uri": "http://hdl.handle.net/2237/10475"}]},
"item_18_publisher_32": {"attribute_name": "出版者", "attribute_value_mlt": [{"subitem_publisher": "INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE", "subitem_publisher_language": "en"}]},
"item_18_select_15": {"attribute_name": "著者版フラグ", "attribute_value_mlt": [{"subitem_select_item": "publisher"}]},
"item_18_text_14": {"attribute_name": "フォーマット", "attribute_value_mlt": [{"subitem_text_value": "application/pdf"}]},
"item_access_right": {"attribute_name": "アクセス権", "attribute_value_mlt": [{"subitem_access_right": "open access", "subitem_access_right_uri": "http://purl.org/coar/access_right/c_abf2"}]},
"item_creator": {"attribute_name": "著者", "attribute_type": "creator", "attribute_value_mlt": [{"creatorNames": [{"creatorName": "OHISHI, Yasunori", "creatorNameLang": "en"}], "nameIdentifiers": [{"nameIdentifier": "24558", "nameIdentifierScheme": "WEKO"}]}, {"creatorNames": [{"creatorName": "GOTO, Masataka", "creatorNameLang": "en"}], "nameIdentifiers": [{"nameIdentifier": "24559", "nameIdentifierScheme": "WEKO"}]}, {"creatorNames": [{"creatorName": "ITOU, Katunobu", "creatorNameLang": "en"}], "nameIdentifiers": [{"nameIdentifier": "24560", "nameIdentifierScheme": "WEKO"}]}, {"creatorNames": [{"creatorName": "TAKEDA, Kazuya", "creatorNameLang": "en"}], "nameIdentifiers": [{"nameIdentifier": "24561", "nameIdentifierScheme": "WEKO"}]}]},
"item_files": {"attribute_name": "ファイル情報", "attribute_type": "file", "attribute_value_mlt": [{"accessrole": "open_date", "date": [{"dateType": "Available", "dateValue": "2018-02-19"}], "displaytype": "detail", "download_preview_message": "", "file_order": 0, "filename": "p113-114_Automatic_Discrimination.pdf", "filesize": [{"value": "10.1 MB"}], "format": "application/pdf", "future_date_message": "", "is_thumbnail": false, "licensetype": "license_note", "mimetype": "application/pdf", "size": 10100000.0, "url": {"label": "p113-114_Automatic_Discrimination.pdf", "objectType": "fulltext", "url": "https://nagoya.repo.nii.ac.jp/record/8724/files/p113-114_Automatic_Discrimination.pdf"}, "version_id": "b3d250eb-5cb9-4aa3-81bd-a1235590bb7c"}]},
"item_language": {"attribute_name": "言語", "attribute_value_mlt": [{"subitem_language": "eng"}]},
"item_resource_type": {"attribute_name": "資源タイプ", "attribute_value_mlt": [{"resourcetype": "conference paper", "resourceuri": "http://purl.org/coar/resource_type/c_5794"}]},
"item_title": "AUTOMATIC DISCRIMINATION BETWEEN SINGING AND SPEAKING VOICES FOR A FLEXIBLE MUSIC RETRIEVAL SYSTEM",
"item_titles": {"attribute_name": "タイトル", "attribute_value_mlt": [{"subitem_title": "AUTOMATIC DISCRIMINATION BETWEEN SINGING AND SPEAKING VOICES FOR A FLEXIBLE MUSIC RETRIEVAL SYSTEM", "subitem_title_language": "en"}]},
"item_type_id": "18",
"owner": "1",
"path": ["1038"],
"permalink_uri": "http://hdl.handle.net/2237/10475",
"pubdate": {"attribute_name": "PubDate", "attribute_value": "2008-08-28"},
"publish_date": "2008-08-28",
"publish_status": "0",
"recid": "8724",
"relation": {},
"relation_version_is_last": true,
"title": ["AUTOMATIC DISCRIMINATION BETWEEN SINGING AND SPEAKING VOICES FOR A FLEXIBLE MUSIC RETRIEVAL SYSTEM"],
"weko_shared_id": -1
}
AUTOMATIC DISCRIMINATION BETWEEN SINGING AND SPEAKING VOICES FOR A FLEXIBLE MUSIC RETRIEVAL SYSTEM
http://hdl.handle.net/2237/10475
Name / File | License | Action
---|---|---
[p113-114_Automatic_Discrimination.pdf](https://nagoya.repo.nii.ac.jp/record/8724/files/p113-114_Automatic_Discrimination.pdf) (10.1 MB, application/pdf; available since 2018-02-19) | |
Item type | Conference Paper
---|---
Publication date | 2008-08-28
Title | AUTOMATIC DISCRIMINATION BETWEEN SINGING AND SPEAKING VOICES FOR A FLEXIBLE MUSIC RETRIEVAL SYSTEM (en)
Authors | OHISHI, Yasunori; GOTO, Masataka; ITOU, Katunobu; TAKEDA, Kazuya
Access rights | open access (http://purl.org/coar/access_right/c_abf2)
Abstract | This paper describes a music retrieval system that enables a user to retrieve a song by two different methods: by singing its melody or by saying its title. To allow the user to use those methods seamlessly without changing a voice input mode, a method of automatically discriminating between singing and speaking voices is indispensable. We therefore first investigated measures that characterize differences between singing and speaking voices. From subjective experiments, we found that even short term characteristics such as the spectral envelope represented as MFCC can be used as a discrimination cue, while the temporal structure is the most important cue when longer signals are given. According to these results, we developed the automatic method of discriminating between singing and speaking voices by combining two measures: MFCC and an F0 (voice pitch) contour. Based on this method, we built the music retrieval system that can accept both singing voices for the melody and speaking voices for the title. (en, Abstract)
Publisher | INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE (en)
Language | eng
Resource type | conference paper (http://purl.org/coar/resource_type/c_5794)
Bibliographic information | 4th Symposium on "Intelligent Media Integration for Social Information Infrastructure" December 7-8, 2006, pp. 113-114, issued 2006-12 (en)
Format | application/pdf
Author version flag | publisher
Identifier | http://hdl.handle.net/2237/10475 (HDL)
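The abstract above reports that temporal structure, captured by an F0 (voice pitch) contour, is the most important cue for telling singing from speech over longer signals. As a rough, self-contained illustration of that contour cue only (not the authors' published method, which also combines MFCC), the sketch below uses a textbook autocorrelation pitch estimator and a hand-picked frame-to-frame pitch-stability threshold; both are assumptions made for this demo, as is the synthetic test audio.

```python
import numpy as np

def frame_f0(frame, sr, fmin=80.0, fmax=400.0):
    # Textbook autocorrelation pitch estimator (an assumption for this
    # sketch, not the paper's estimator): pick the lag with the strongest
    # autocorrelation inside the plausible pitch-lag range.
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    return sr / (lo + np.argmax(ac[lo:hi]))

def f0_contour(signal, sr, frame_len=1024, hop=512):
    # One F0 estimate per 64 ms frame, hopped every 32 ms at 16 kHz.
    return np.array([frame_f0(signal[i:i + frame_len], sr)
                     for i in range(0, len(signal) - frame_len, hop)])

def looks_like_singing(signal, sr, thresh=8.0):
    # Heuristic version of the temporal-structure cue: singing holds
    # pitch roughly steady within a note, so the mean frame-to-frame
    # |dF0| stays small, while conversational speech drifts continuously.
    # The 8 Hz threshold is hand-picked for this demo.
    f0 = f0_contour(signal, sr)
    return np.mean(np.abs(np.diff(f0))) < thresh

sr = 16000
t = np.arange(2 * sr) / sr
sung = np.sin(2 * np.pi * 220.0 * t)                 # steady 220 Hz "note"
glide = 150.0 + 50.0 * np.sin(2 * np.pi * 3.0 * t)   # pitch wandering 100-200 Hz
spoken = np.sin(2 * np.pi * np.cumsum(glide) / sr)   # speech-like pitch drift

print(looks_like_singing(sung, sr))    # steady contour: singing-like
print(looks_like_singing(spoken, sr))  # drifting contour: speech-like
```

A real system would replace the synthetic signals with voiced frames of recorded audio and, per the abstract, fuse this contour score with an MFCC-based short-term score.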