Reward setting on reinforcement learning for an evaluation function of games

但馬, 康宏; Yasuhiro, Tajima

https://ipsj.ixsq.nii.ac.jp/records/69716

Name / File	License	Action
IPSJ-GI10024008.pdf (188.8 kB)	Copyright (c) 2010 by the Information Processing Society of Japan
Open access

Item type

SIG Technical Reports(1)

Publication date

2010-06-18

Title

強化学習による評価関数の獲得における報酬設定について

Title (English)

Reward setting on reinforcement learning for an evaluation function of games

Language

jpn

Keywords

Subject scheme

Other

Subject

Other

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_18gh

Resource type

technical report

Author affiliation

岡山県立大学情報工学部

Author affiliation (English)

Department of Systems Engineering, Okayama Prefectural University

Author name

但馬, 康宏

Author name (English)

Yasuhiro, Tajima

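Abstract

Description type

Other

Description

When an evaluation function for a game is acquired through reinforcement learning, a commonly known approach uses the win/loss outcome at the terminal position as the reward and sets the reward for intermediate positions to 0. In this study, the reward for an intermediate position is set to the win rate of random simulations from that position, and we examine how the results differ when the magnitude of the win/loss reward at the terminal position is varied. In addition, we use this method to learn the evaluation weights of the board patterns used in the Othello program Zebra, which serves as an experimental evaluation.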

Abstract (English)

Description type

Other

Description

Reinforcement learning for an evaluation function of games is generally applied with zero reward for intermediate positions and a win/loss reward for the terminal position. In this paper, we propose several reward-setting methods for intermediate positions and compare them with each other. We then evaluate our methods through experiments on the pattern evaluation parameters of the Othello program Zebra.
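
The reward scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the toy take-away game (remove 1 to 3 stones, taking the last stone wins), the names `shaped_reward`, `terminal_scale` and `n_sims`, and the tabular TD(0) update under a random policy are all assumptions chosen for brevity, whereas the report learns pattern weights for the Othello program Zebra. The sketch only shows the idea of using a random-simulation win rate as the reward for intermediate positions while scaling the win/loss reward at the terminal position.

```python
import random

# Toy game (an assumption, not the report's Othello setup): players alternately
# remove 1-3 stones from a pile, and whoever takes the last stone wins.
MOVES = (1, 2, 3)

def legal_moves(stones):
    return [m for m in MOVES if m <= stones]

def random_playout(stones, rng):
    """Play uniformly random moves; return True if the player to move wins."""
    root_to_move = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return root_to_move  # whoever just moved took the last stone
        root_to_move = not root_to_move
    return False  # empty pile: the player to move has already lost

def shaped_reward(stones, agent_to_move, terminal_scale, n_sims, rng):
    """Reward for the learning agent on reaching the position `stones`.

    Terminal position: +/- terminal_scale for a win/loss (hypothetical knob).
    Intermediate position: the agent's win rate over n_sims random playouts.
    """
    if stones == 0:
        # If the agent is to move on an empty pile, the opponent took the last stone.
        return -terminal_scale if agent_to_move else terminal_scale
    wins = sum(random_playout(stones, rng) for _ in range(n_sims))
    p_to_move = wins / n_sims
    return p_to_move if agent_to_move else 1.0 - p_to_move

def td0_episode(V, alpha, gamma, terminal_scale, n_sims, rng, start=10):
    """One episode of tabular TD(0) evaluation under a uniformly random policy."""
    stones, agent_to_move = start, True
    while stones > 0:
        state = (stones, agent_to_move)
        stones -= rng.choice(legal_moves(stones))
        agent_to_move = not agent_to_move
        r = shaped_reward(stones, agent_to_move, terminal_scale, n_sims, rng)
        next_v = 0.0 if stones == 0 else V.get((stones, agent_to_move), 0.0)
        v = V.get(state, 0.0)
        V[state] = v + alpha * (r + gamma * next_v - v)

if __name__ == "__main__":
    rng = random.Random(0)
    for scale in (1.0, 10.0):  # vary the magnitude of the terminal reward
        V = {}
        for _ in range(200):
            td0_episode(V, alpha=0.1, gamma=1.0, terminal_scale=scale,
                        n_sims=20, rng=rng)
        print(scale, V.get((10, True)))
```

Running the loop with different `terminal_scale` values while keeping `n_sims` fixed mirrors, in spirit only, the comparison described in the abstract.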

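Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11362144

Bibliographic information

SIG Technical Reports: Game Informatics (GI) (研究報告ゲーム情報学)

Vol. 2010-GI-24, No. 8, pp. 1-7, issued 2010-06-18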

Notice

SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.

Publisher

Information Processing Society of Japan


Cite as

但馬, 康宏 (Yasuhiro Tajima), 2010: Information Processing Society of Japan, pp. 1–7.
