Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation.
In: Neural Computation, Vol. 19 (2007-11-01), Issue 11, pp. 3051-3087
Games constitute a challenging domain for reinforcement learning (RL) because many of them involve multiple players and many unobservable variables in a large state space. The difficulty of solving such realistic multi-agent problems with partial observability arises mainly from the prohibitive computational cost of estimation and prediction over the whole state space, including the unobservable variables. To overcome this intractability and enable an agent to learn in an unknown environment, an effective approximation method with explicit learning of the environmental model is required. We present a model-based RL scheme for large-scale multi-agent problems with partial observability and apply it to the card game hearts. This game is a well-defined example of an imperfect-information game and can be approximately formulated as a partially observable Markov decision process (POMDP) for a single learning agent. To reduce the computational cost, we use a sampling technique in which the heavy integration required for estimation and prediction is approximated with a feasible number of samples. Computer simulation results show that our method is effective in solving such a difficult, partially observable multi-agent problem. [ABSTRACT FROM AUTHOR]
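The sampling idea the abstract describes — replacing an intractable integral over all hidden states with an average over a modest number of sampled state hypotheses — can be illustrated with a minimal particle-filter-style belief update. This is a generic sketch, not the paper's actual algorithm; the two-state toy environment and all function names are illustrative assumptions.

```python
import random

def sample_belief_update(particles, action, observation,
                         transition, obs_likelihood, rng):
    """One sampling-based belief update for a POMDP: propagate each
    particle (a hypothesis about the hidden state) through a transition
    model, then resample particles in proportion to how well each one
    explains the new observation.  The expensive integral over all
    hidden states is replaced by this finite weighted sum."""
    propagated = [transition(s, action, rng) for s in particles]
    weights = [obs_likelihood(observation, s) for s in propagated]
    if sum(weights) == 0.0:
        # No particle explains the observation; fall back to the prior.
        return propagated
    return rng.choices(propagated, weights=weights, k=len(particles))

# Toy usage: a static two-state hidden variable seen through a noisy sensor.
rng = random.Random(0)
transition = lambda s, a, r: s                        # hidden state never changes
obs_likelihood = lambda o, s: 0.9 if o == s else 0.1  # 90%-accurate observations

belief = [0] * 50 + [1] * 50                          # uniform initial belief
for _ in range(3):                                    # observe "1" three times
    belief = sample_belief_update(belief, None, 1,
                                  transition, obs_likelihood, rng)
print(sum(belief) / len(belief))                      # estimated P(state = 1)
```

After a few consistent observations, the particle set concentrates on the hidden state that best explains them; in a card game, each particle would instead be a sampled assignment of the unseen cards.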
Copyright of Neural Computation is the property of MIT Press and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
| Title | Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation |
|---|---|
| Author(s) | Fujita, Hajime; Ishii, Shin |
| Journal | Neural Computation, Vol. 19 (2007-11-01), Issue 11, pp. 3051-3087 |
| Published | 2007 |
| Media type | academicJournal |
| ISSN | 0899-7667 (print) |
| DOI | 10.1162/neco.2007.19.11.3051 |