Neural Fictitious Self-Play in Multiplayer Imperfect Information Games (多人数不完全情報ゲームにおける仮想自己対戦を用いた強化学習)

Keigo Kawamura (河村 圭悟); Naoki Mizukami (水上 直紀); Yoshimasa Tsuruoka (鶴岡 慶雅)

https://ipsj.ixsq.nii.ac.jp/records/175360

Name / File	License	Action
IPSJ-GPWS2016031.pdf (835.3 kB)	Copyright (c) 2016 by the Information Processing Society of Japan
Open access

Item type

Symposium(1)

Publication date

2016-10-28

Title

多人数不完全情報ゲームにおける仮想自己対戦を用いた強化学習

Title (English)

Neural Fictitious Self-Play in Multiplayer Imperfect Information Games

Language

jpn

Resource type identifier

http://purl.org/coar/resource_type/c_5794

Resource type

conference paper

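Author affiliation

Faculty of Engineering, The University of Tokyo

Author affiliation

Graduate School of Engineering, The University of Tokyo

Author affiliation

Graduate School of Engineering, The University of Tokyo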

Author affiliation (English)

Department of Information and Communication Engineering, The University of Tokyo

Author affiliation (English)

Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo

Author affiliation (English)

Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo

Author names

河村, 圭悟
水上, 直紀
鶴岡, 慶雅

Author names (English)

Keigo Kawamura
Naoki Mizukami
Yoshimasa Tsuruoka

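Abstract (translated from Japanese)

Description type

Other

Description

Nash equilibrium strategies are a central topic in imperfect information games. Multiplayer imperfect information games attract particular interest because no general method for computing their Nash equilibria has yet been established. Two-player limit Texas Hold'em has been (essentially weakly) solved by CFR+ (Tammelin, 2014), but the space complexity of CFR+ makes it problematic to apply to Texas Hold'em with three or more players. In this work we use a method called NFSP (Heinrich and Silver, 2016), aiming to find Nash equilibria of multiplayer imperfect information games that are difficult to solve with CFR+. Using Fictitious Self-Play (FSP) with softmax regression as the learning component, we show that FSP finds an approximate Nash equilibrium in two-player Kuhn poker, a toy version of Texas Hold'em. We also show that FSP finds an approximate Nash equilibrium in three-player Kuhn poker, a multiplayer game, and that the average amount by which the FSP strategy is exploited by the CFR+ strategy decreases.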
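As a rough illustration of the idea that FSP builds on, the sketch below runs classical fictitious play on a small two-player zero-sum matrix game: each player repeatedly best-responds to the opponent's average strategy so far, and the average strategies approach a Nash equilibrium. This is not the paper's method (which uses softmax regression and extensive-form games); the function names, the rock-paper-scissors payoff matrix, and the iteration count are assumptions chosen for illustration only.

```python
import numpy as np

# Row player's payoffs for rock-paper-scissors (illustrative choice of game).
RPS_PAYOFF = np.array([
    [0.0, -1.0, 1.0],   # rock     vs rock, paper, scissors
    [1.0, 0.0, -1.0],   # paper
    [-1.0, 1.0, 0.0],   # scissors
])


def fictitious_play(payoff, iterations=20000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the empirical (average) strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary initial actions (an assumption; the start does not matter in the limit).
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        row_avg = row_counts / row_counts.sum()
        col_avg = col_counts / col_counts.sum()
        # Each player best-responds to the opponent's average strategy so far.
        best_row = np.argmax(payoff @ col_avg)
        best_col = np.argmin(row_avg @ payoff)
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


def matrix_game_exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response payoffs; zero exactly at a Nash equilibrium."""
    best_row_value = np.max(payoff @ col_strategy)
    best_col_value = np.max(-(row_strategy @ payoff))
    return best_row_value + best_col_value


if __name__ == "__main__":
    row, col = fictitious_play(RPS_PAYOFF)
    print("average strategies:", row, col)
    print("exploitability:", matrix_game_exploitability(RPS_PAYOFF, row, col))
```

Convergence of the average strategies is guaranteed for two-player zero-sum games; for games with three or more players, such as three-player Kuhn poker, no comparable general guarantee exists, which is why the abstract reports the multiplayer result empirically.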

Abstract (English)

Description type

Other

Description

Computing Nash equilibrium solutions is an important problem in the domain of imperfect information games. Attempts to solve this problem draw considerable attention, especially for multiplayer games, because there is currently no method that can calculate approximate Nash equilibrium solutions in a general setting. CFR+ (Tammelin, 2014) can be used to (essentially weakly) solve two-player limit Texas Hold'em, but its space complexity prevents it from being applied to large multiplayer games. In this paper, we use Neural Fictitious Self-Play (Heinrich and Silver, 2016) to calculate approximate Nash equilibrium solutions for multiplayer imperfect information games that CFR+ can hardly solve. We show that Fictitious Self-Play (FSP) with softmax regression allows us to calculate approximate Nash equilibrium solutions in two-player and three-player Kuhn poker. We also show that the average exploitability of the FSP strategy by the CFR+ strategy decreases.
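To make "approximate Nash equilibrium" concrete for two-player Kuhn poker, the following minimal sketch measures how far a strategy profile is from equilibrium. It assumes the standard rules (three cards, one-chip ante, a single one-chip bet) and defines exploitability as the sum of both players' best-response payoffs, found by brute-force enumeration of pure strategies; some authors divide this sum by the number of players. Note that this differs from the measure in the abstract, which evaluates the FSP strategy against a CFR+ strategy. All names here are hypothetical and none of this is taken from the paper's implementation.

```python
from itertools import permutations, product

CARDS = (0, 1, 2)  # Jack, Queen, King
# Histories at which each player acts: "p" = pass (check/fold), "b" = bet (bet/call).
INFOSETS = {0: ["", "pb"], 1: ["p", "b"]}


def terminal_payoff(cards, history):
    """Payoff to player 0 at a terminal history, or None if play continues."""
    if history in ("pp", "bb", "pbb"):          # showdown
        stake = 1 if history == "pp" else 2
        return stake if cards[0] > cards[1] else -stake
    if history == "bp":                          # player 1 folds
        return 1
    if history == "pbp":                         # player 0 folds
        return -1
    return None


def expected_value(strategies):
    """Expected payoff to player 0; strategies[player][(card, history)] = P(bet)."""
    def walk(cards, history):
        payoff = terminal_payoff(cards, history)
        if payoff is not None:
            return payoff
        player = len(history) % 2
        p_bet = strategies[player][(cards[player], history)]
        return ((1 - p_bet) * walk(cards, history + "p")
                + p_bet * walk(cards, history + "b"))
    deals = list(permutations(CARDS, 2))
    return sum(walk(deal, "") for deal in deals) / len(deals)


def best_response_value(strategies, player):
    """Best payoff `player` can get against the other player's fixed strategy.

    Enumerates all 64 pure strategies; feasible only for toy games like this one.
    """
    keys = [(card, hist) for card in CARDS for hist in INFOSETS[player]]
    best = float("-inf")
    for choice in product((0.0, 1.0), repeat=len(keys)):
        trial = list(strategies)
        trial[player] = dict(zip(keys, choice))
        value = expected_value(trial)
        best = max(best, value if player == 0 else -value)
    return best


def exploitability(strategies):
    """Sum of both best-response values; zero at a Nash equilibrium (zero-sum game)."""
    return best_response_value(strategies, 0) + best_response_value(strategies, 1)


def kuhn_equilibrium(alpha=0.0):
    """One member of the known equilibrium family, with 0 <= alpha <= 1/3."""
    first = {(0, ""): alpha, (1, ""): 0.0, (2, ""): 3 * alpha,
             (0, "pb"): 0.0, (1, "pb"): alpha + 1 / 3, (2, "pb"): 1.0}
    second = {(0, "p"): 1 / 3, (0, "b"): 0.0, (1, "p"): 0.0,
              (1, "b"): 1 / 3, (2, "p"): 1.0, (2, "b"): 1.0}
    return [first, second]
```

A learned strategy profile can be passed to `exploitability` in the same dictionary format; values close to zero indicate an approximate equilibrium. Extending the measure to three-player Kuhn poker would require a different game tree and a per-player best response against both opponents, which this sketch does not attempt.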

Bibliographic information

Proceedings of the Game Programming Workshop 2016 (ゲームプログラミングワークショップ2016論文集)

Vol. 2016, pp. 188-195, issue date 2016-10-28

Publisher

Information Processing Society of Japan (情報処理学会)


