Context modeling using Dirichlet mixtures and its applications to language models

山本 幹雄 (Mikio Yamamoto); 貞光 九月 (Kugatsu Sadamitsu); 三品 拓也 (Takuya Mishina)

https://ipsj.ixsq.nii.ac.jp/records/57212

Name / File	License	Action
IPSJ-SLP03048005.pdf (609.2 kB)	Copyright (c) 2003 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Date published

2003-10-17

Title

混合ディレクレ分布を用いた文脈のモデル化と言語モデルへの応用

Title (English)

Context modeling using Dirichlet mixtures and its applications to language models

Language

jpn

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

Institute of Information Sciences and Electronics, University of Tsukuba (筑波大学電子・情報工学系)

Author affiliation

College of Information Sciences, University of Tsukuba (筑波大学情報学類)

Author affiliation

Master's Program in Science and Engineering, University of Tsukuba (筑波大学理工学研究科)

Authors

山本 幹雄 (Mikio Yamamoto); 貞光 九月 (Kugatsu Sadamitsu); 三品 拓也 (Takuya Mishina)

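Abstract (translated from the Japanese)

Description type

Other

Description

We study a probabilistic model of contexts and documents in which a Dirichlet mixture serves as the prior over the parameters of a multinomial distribution (the resulting compound distribution is a Polya mixture). In this paper we describe several methods for estimating the parameters of the Dirichlet mixture and the posterior expectation required at adaptation time, and we compare the model with Bayesian extensions of probabilistic LSA in experiments with dynamically adapting n-gram language models. Because Dirichlet mixtures and Polya mixtures are simpler than other Bayesian context models, the predictive distribution can be derived in closed form. This is a major advantage over other Bayesian models such as Latent Dirichlet Allocation (LDA), all of which require approximation to estimate the predictive distribution. The experiments show that the Dirichlet-mixture model achieves lower perplexity than the comparison models with a small number of mixture components.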

Abstract (English)

Description type

Other

Description

We investigate a generative context/text model that uses Dirichlet mixtures as the distribution over the parameters of a multinomial distribution; the resulting compound distribution is a Polya mixture. In this paper, we describe several estimation methods for the parameters of Dirichlet mixtures and for the posterior distribution (adaptation), and report experiments that compare the proposed model with other Bayesian variants of probabilistic LSA in terms of the perplexity of adaptive n-gram language models. Since Dirichlet and Polya mixtures are simpler than other Bayesian context models such as Latent Dirichlet Allocation (LDA), the posterior distribution can be derived in closed form, without the approximations needed by LDA. In the experiments, Dirichlet mixtures achieve lower perplexity than the other models.
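The closed-form claim in the abstract can be illustrated with a minimal sketch. It assumes the standard Dirichlet-mixture result: given word counts from the history, each component's posterior weight is proportional to its prior weight times its Polya likelihood, and the predictive word probability is the weighted average of the per-component Dirichlet posterior means. The function names, the toy three-word vocabulary, and the parameter values are hypothetical and are not taken from the report, which may estimate and apply the model differently.

```python
from math import exp, lgamma, log


def log_polya(counts, alpha):
    """Log Polya (Dirichlet-multinomial) likelihood of `counts` under `alpha`.

    The multinomial coefficient is omitted because it is identical for
    every mixture component and cancels when the weights are normalized.
    """
    a0, n = sum(alpha), sum(counts)
    total = lgamma(a0) - lgamma(a0 + n)
    for a, c in zip(alpha, counts):
        total += lgamma(a + c) - lgamma(a)
    return total


def predictive(counts, weights, alphas):
    """Closed-form predictive word distribution given history counts.

    weights: prior mixture weights (hypothetical values, summing to 1)
    alphas:  one Dirichlet parameter vector per mixture component
    """
    # Posterior component weights, normalized in log space for stability.
    logs = [log(w) + log_polya(counts, a) for w, a in zip(weights, alphas)]
    top = max(logs)
    post = [exp(value - top) for value in logs]
    norm = sum(post)
    post = [p / norm for p in post]

    # Weighted average of each component's Dirichlet posterior mean.
    n = sum(counts)
    probs = [0.0] * len(counts)
    for c_m, alpha in zip(post, alphas):
        a0 = sum(alpha)
        for v, (a, c) in enumerate(zip(alpha, counts)):
            probs[v] += c_m * (a + c) / (a0 + n)
    return probs


# Toy example: two components over a three-word vocabulary.
weights = [0.5, 0.5]
alphas = [[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
print(predictive([3, 0, 0], weights, alphas))
```

No sampling or variational step appears anywhere in the computation, which is the property the abstract contrasts with LDA.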

Bibliographic record ID

Source identifier type

NCID

Source identifier

AN10442647

Bibliographic information

IPSJ SIG Technical Reports, Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理)

Vol. 2003, No. 104 (2003-SLP-048), pp. 29-34, issued 2003-10-17

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan (情報処理学会)

Versions

Ver.1

2025-01-22 04:36:24.430927
