Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation.
In: Neural Computation, Vol. 19 (2007-11-01), Issue 11, pp. 3051-3087
Games constitute a challenging domain for reinforcement learning (RL) because many of them involve multiple players and many unobservable variables in a large state space. The difficulty of solving such realistic multi-agent problems with partial observability arises mainly from the prohibitive computational cost of estimation and prediction over the whole state space, including the unobservable variables. To overcome this intractability and enable an agent to learn in an unknown environment, an effective approximation method with explicit learning of the environmental model is required. We present a model-based RL scheme for large-scale multi-agent problems with partial observability and apply it to the card game hearts. This game is a well-defined example of an imperfect-information game and can be approximately formulated as a partially observable Markov decision process (POMDP) for a single learning agent. To reduce the computational cost, we use a sampling technique in which the heavy integration required for estimation and prediction is approximated with a feasible number of samples. Computer simulation results show that our method is effective in solving such a difficult, partially observable multi-agent problem. [ABSTRACT FROM AUTHOR]
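The sampling idea the abstract describes — replacing an intractable integral over all hidden states with an average over a modest number of sampled state hypotheses — can be illustrated with a minimal particle-filter-style belief update. This is a generic sketch, not the paper's actual algorithm; the two-state toy environment and all function names are illustrative assumptions.

```python
import random

def sample_belief_update(particles, action, observation,
                         transition, obs_likelihood, rng):
    """One sampling-based belief update for a POMDP: propagate each
    particle (a hypothesis about the hidden state) through a transition
    model, then resample particles in proportion to how well each one
    explains the new observation.  The expensive integral over all
    hidden states is replaced by this finite weighted sum."""
    propagated = [transition(s, action, rng) for s in particles]
    weights = [obs_likelihood(observation, s) for s in propagated]
    if sum(weights) == 0.0:
        # No particle explains the observation; fall back to the prior.
        return propagated
    return rng.choices(propagated, weights=weights, k=len(particles))

# Toy usage: a static two-state hidden variable seen through a noisy sensor.
rng = random.Random(0)
transition = lambda s, a, r: s                        # hidden state never changes
obs_likelihood = lambda o, s: 0.9 if o == s else 0.1  # 90%-accurate observations

belief = [0] * 50 + [1] * 50                          # uniform initial belief
for _ in range(3):                                    # observe "1" three times
    belief = sample_belief_update(belief, None, 1,
                                  transition, obs_likelihood, rng)
print(sum(belief) / len(belief))                      # estimated P(state = 1)
```

After a few consistent observations, the particle set concentrates on the hidden state that best explains them; in a card game, each particle would instead be a sampled assignment of the unseen cards.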
Copyright of Neural Computation is the property of MIT Press and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
| Title | Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation |
|---|---|
| Author(s) | Fujita, Hajime; Ishii, Shin |
| Journal | Neural Computation, Vol. 19 (2007-11-01), Issue 11, pp. 3051-3087 |
| Published | 2007 |
| Media type | academicJournal |
| ISSN | 0899-7667 (print) |
| DOI | 10.1162/neco.2007.19.11.3051 |