Abstract:
Learning in a partially observable and nonstationary environment remains one of the challenging problems in the area of multiagent (MA) learning. Reinforcement learning is a generic method that suits the needs of MA learning in many respects. This paper presents two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning; multiple agents do not require explicit communication among themselves to learn coordinated behavior. The first is the perceptual coordination mechanism, in which other agents are included in state descriptions and coordination information is learned from state transitions. The second is the observing coordination mechanism, which also includes other agents in state descriptions and, in addition, observes the rewards of nearby agents from the environment. The observed rewards and the agent's own reward are used to construct an optimal policy, so the latter mechanism tends to increase region-wide joint rewards. The selected experimental domain is the adversarial food-collecting world (AFCW), which can be configured as either a single-agent or a multiagent environment. Function approximation and generalization techniques are used because of the huge state space. Experimental results show the effectiveness of these mechanisms.
Published in: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) ( Volume: 30, Issue: 4, November 2000)
DOI: 10.1109/5326.897075
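The observing coordination mechanism described in the abstract can be sketched as a tabular Q-learning update in which an agent folds the observed rewards of nearby agents into its own reward signal. This is an illustrative sketch only, not the paper's exact algorithm: the class name, the `share` weighting on neighbor rewards, and the learning parameters are all assumptions for the example.

```python
import random
from collections import defaultdict

class ObservingAgent:
    """Hedged sketch of an "observing"-style learner: a standard tabular
    Q-learner whose update uses its own reward plus a weighted sum of
    rewards observed from nearby agents (aiming at region-wide reward)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, share=0.5):
        self.q = defaultdict(float)      # Q[(state, action)] -> value
        self.actions = actions
        self.alpha = alpha               # learning rate (assumed value)
        self.gamma = gamma               # discount factor (assumed value)
        self.epsilon = epsilon           # exploration rate (assumed value)
        self.share = share               # weight on neighbors' observed rewards

    def act(self, state):
        # Epsilon-greedy action selection over the learned Q-values.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, own_reward, neighbor_rewards, next_state):
        # Joint reward: the agent's own reward plus a share of the rewards
        # observed from nearby agents; this biases learning toward
        # region-wide joint reward rather than purely selfish reward.
        joint = own_reward + self.share * sum(neighbor_rewards)
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = joint + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```

In the perceptual mechanism, by contrast, only the state description would change (nearby agents appear in `state`) and `neighbor_rewards` would be empty; note that the paper itself uses function approximation rather than a lookup table for the large AFCW state space.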
- IEEE Keywords
- Index Terms
- Function Approximation
- Multi-agent Reinforcement Learning
- Learning Algorithms
- General Method
- Transition State
- Dynamic Environment
- Lookup Table
- Description Of Conditions
- Coordination Mechanisms
- Autonomous Agents
- Coordinate Information
- Partial Observation
- Learning Agent
- Piece Of Food
- Time Step
- Monte Carlo Simulation
- Temporal Differences
- Types Of Sensors
- Number Of Agents
- Multi-agent Systems
- State-action Pair
- Rational Policy
- Rational Agents
- Part Of Environment
- Policy Network
- Markov Decision Process
- Dynamic Programming Method
- Policy Cooperation
- Behavior Of Agents
- Performance Of Agents