Abstract:
Learning in a partially observable and nonstationary environment remains one of the challenging problems in the area of multiagent (MA) learning. Reinforcement learning is a generic method that suits the needs of MA learning in many respects. This paper presents two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning, with which multiple agents learn coordinated behavior without explicit communication among themselves. The first is the perceptual coordination mechanism, in which other agents are included in state descriptions and coordination information is learned from state transitions. The second is the observing coordination mechanism, which also includes other agents in state descriptions and additionally observes the rewards of nearby agents from the environment. The observed rewards and the agent's own reward are used together to construct an optimal policy; in this way, the latter mechanism tends to increase region-wide joint rewards. The experimental domain is the adversarial food-collecting world (AFCW), which can be configured as either a single-agent or a multiagent environment. Because of the huge state space, function approximation and generalization techniques are used. Experimental results show the effectiveness of these mechanisms.
Published in: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), Volume 30, Issue 4, November 2000
DOI: 10.1109/5326.897075
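
As a rough illustration of the observing coordination mechanism summarized in the abstract, the sketch below shows a tabular Q-learning agent that folds the observed rewards of nearby agents into its own reward signal before updating its value estimates. This is a minimal sketch under assumed details: the class name ObservingQAgent, the neighbor_weight parameter, and the simple weighted-sum reward combination are illustrative assumptions, not the paper's actual update rule (the paper additionally uses function approximation over the large AFCW state space, which is omitted here).

    import random
    from collections import defaultdict

    class ObservingQAgent:
        """Hypothetical sketch of the observing coordination mechanism:
        observed rewards of nearby agents are combined with the agent's
        own reward before a standard Q-learning update. Names and
        parameters are assumptions, not the authors' implementation."""

        def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1,
                     neighbor_weight=0.5):
            self.q = defaultdict(float)             # Q-values keyed by (state, action)
            self.actions = actions
            self.alpha = alpha                      # learning rate
            self.gamma = gamma                      # discount factor
            self.epsilon = epsilon                  # exploration rate
            self.neighbor_weight = neighbor_weight  # weight on observed rewards

        def act(self, state):
            # Epsilon-greedy selection; per the abstract, the state
            # description already includes the nearby agents.
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q[(state, a)])

        def update(self, state, action, own_reward, observed_rewards, next_state):
            # Fold the observed rewards of nearby agents into the agent's
            # own reward, so the update tends to raise region-wide joint
            # reward rather than the individual reward alone.
            joint = own_reward + self.neighbor_weight * sum(observed_rewards)
            best_next = max(self.q[(next_state, a)] for a in self.actions)
            self.q[(state, action)] += self.alpha * (
                joint + self.gamma * best_next - self.q[(state, action)])

Under these assumptions, setting neighbor_weight to zero leaves only the agents' presence in the state description as the coordination signal, which roughly corresponds to the perceptual coordination mechanism.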