A Study on Crosstalk Suppressing Algorithms for Blind Source Separation (ブラインド信号源分離に付加する漏話抑制処理に関する研究)
https://uec.repo.nii.ac.jp/records/1134
Name / File | License | Action |
---|---|---|
9000000334.pdf (7.8 MB) | | |
Item type | Thesis or Dissertation(1) |
---|---|
Publication date | 2009-03-24 |
Title | |
Language | ja |
Title | ブラインド信号源分離に付加する漏話抑制処理に関する研究 |
Title | |
Language | en |
Title | A Study on Crosstalk Suppressing Algorithms for Blind Source Separation |
Language | |
Language | jpn |
Resource type | |
Resource type identifier | http://purl.org/coar/resource_type/c_46ec |
Resource type | thesis |
Author | 中川, 和也 (Nakagawa, Kazuya) |
Abstract | |
Description type | Abstract |
Description | The “cocktail party” problem concerns how a listener focuses attention on a single talker amid a mixture of conversations and background noise. In recent decades, many technical studies have addressed this problem, and many Blind Source Separation (BSS) methods have been proposed that improve the signal-to-noise ratio (SNR). Post-processing methods that aim to improve BSS performance have also been actively proposed. The goal in solving the cocktail party problem has generally been taken to be extracting a specific source component from the mixture of conversations with as high an SNR as possible. In that sense, both the BSS methods and the post-processing methods appear to have succeeded. However, because their main objective is extraction or enhancement of a target signal, listeners can often still perceive residual speech components, i.e., crosstalk components, in their output. In contrast, we aim to extract the target speech in such a way that crosstalk components are not perceived. In other words, the main claim of this paper is that BSS performance must be evaluated not only on the quality of signal separation but also on the absolute level of the residual crosstalk components. The purpose of this study is to achieve this goal by adding a post-processing unit after BSS. If the goal is achieved, the method is useful in teleconferencing for effectively blocking the transmission of speech that must not be heard at the far end. It is also useful for improving the security of one-to-one conversation on a dedicated telephone line; for example, accidental speech from a person who shares a room with the talker would be blocked. Such applications open up further possibilities for BSS methods. This paper consists of six chapters. Chapter 1 describes the background and purpose of the study, together with the outline and organization of the paper. Chapter 2 presents the cocktail party problem and explains BSS as a technical approach to solving it. It describes Independent Component Analysis (ICA), which has attracted attention as a BSS technique, and explains frequency-domain ICA, which this study uses as preprocessing. Conventional post-processing methods are also presented to clarify the purpose of the study. Chapter 3 describes a crosstalk suppression method that uses a vocal tract model obtained by linear prediction analysis. The method is called Spectrum Envelope Inverse Filtering (SEIF). It suppresses crosstalk components efficiently on the basis of the spectral structure of speech by using a linear prediction filter. Because the study needs to evaluate how well crosstalk components are suppressed, listening experiments with human subjects are reported in addition to an SNR evaluation. The experimental results show that the SEIF method prevents crosstalk components from being perceived without sacrificing the audibility of the target speech. Chapter 4 describes an approach that makes the SEIF method more effective by using a lower-order filter. The purpose is to reduce the distortion of the target components, a problem that remains in the SEIF method. This distortion is assumed to be caused by the synthesis filtering in the SEIF method, so a lower-order filter is introduced in the synthesis filtering to improve the quality of the target components. The method is evaluated for target speech quality using the Perceptual Evaluation of Speech Quality (PESQ) criterion. The results show that the target speech quality of the output is improved. Chapter 5 describes a method for fully suppressing crosstalk components in time frames in which only crosstalk speech is uttered. To prevent crosstalk components from being perceived in such frames, their energy is spread smoothly across the whole frequency range. The properties of this method are analyzed using speech signals recorded in a real room. The experimental results suggest that the proposed method suppresses crosstalk components effectively even in frames where only crosstalk speech exists. Chapter 6 concludes the paper, summarizing the study and describing future work. |
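Degree name | |
Degree name | Doctor of Engineering (博士(工学)) |
Degree-granting institution | |
Degree-granting institution name | The University of Electro-Communications (電気通信大学) |
Academic year of degree conferral | |
Description type | Other |
Description | 2008 |
Date of degree conferral | |
Date of degree conferral | 2009-03-24 |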
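
Note on the abstract: Chapter 3 states that SEIF relies on a vocal tract model obtained by linear prediction analysis and on inverse filtering of the spectral envelope, but this record does not give the algorithm itself. The following Python sketch is therefore not the thesis's SEIF method. It only illustrates the generic building block the abstract names, namely estimating a linear prediction filter and applying its inverse to flatten a signal's spectral envelope. The autocorrelation method, the filter order, the synthetic AR(2) test signal, and all function names are assumptions made for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate linear prediction coefficients (autocorrelation method).

    Returns a with a[0] == 1, so that
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    is the inverse (whitening) filter of the estimated spectral envelope.
    Hypothetical helper, not taken from the thesis.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a_tail = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a_tail = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a_tail))


def inverse_filter(frame, a):
    """Apply the FIR inverse filter A(z); the output is the prediction residual."""
    return np.convolve(frame, a)[: len(frame)]


def synthesize_ar(a, excitation):
    """All-pole synthesis 1/A(z) driven by an excitation signal."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out[n] = acc
    return out


# Illustrative data only: a synthetic, stable AR(2) process stands in for a speech frame.
rng = np.random.default_rng(0)
true_a = np.array([1.0, -1.5, 0.7])
signal = synthesize_ar(true_a, rng.standard_normal(4000))

est_a = lpc_coefficients(signal, order=2)
residual = inverse_filter(signal, est_a)
```

Here `residual` is the spectrally flattened signal and `synthesize_ar(est_a, residual)` would restore the envelope. The abstract's Chapter 4 concerns the order of the filter used in that synthesis step, which this sketch does not attempt to reproduce.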