Journals & Magazines >IEEE Transactions on Neural N... >Volume: 35 Issue: 12

When to Switch: Planning and Learning for Partially Observable Multi-Agent Pathfinding

Download PDF
Download References
Request Permissions
Save to
Alerts

The general pipeline of the switching approach is as follows. The observation space of the environment consists of two matrices that encode obstacles and agents, as well ...

Abstract:

Multi-agent pathfinding (MAPF) is a problem that involves finding a set of non-conflicting paths for a set of agents confined to a graph. In this work, we study a MAPF se...Show More

Metadata

Abstract:

Multi-agent pathfinding (MAPF) is a problem that involves finding a set of non-conflicting paths for a set of agents confined to a graph. In this work, we study a MAPF setting, where the environment is only partially observable for each agent, i.e., an agent observes the obstacles and other agents only within a limited field-of-view. Moreover, we assume that the agents do not communicate and do not share knowledge on their goals, intended actions, etc. The task is to construct a policy that maps the agent’s observations to actions. Our contribution is multifold. First, we propose two novel policies for solving partially observable MAPF (PO-MAPF): one based on heuristic search and another one based on reinforcement learning (RL). Next, we introduce a mixed policy that is based on switching between the two. We suggest three different switch scenarios: the heuristic, the deterministic, and the learnable one. A thorough empirical evaluation of all the proposed policies in a variety of setups shows that the mixing policy demonstrates the best performance is able to generalize well to the unseen maps and problem instances, and, additionally, outperforms the state-of-the-art counterparts (PRIMAL2 and PICO). The source-code is available at https://github.com/AIRI-Institute/when-to-switch.

The general pipeline of the switching approach is as follows. The observation space of the environment consists of two matrices that encode obstacles and agents, as well ...

Published in: IEEE Transactions on Neural Networks and Learning Systems ( Volume: 35, Issue: 12, December 2024)

Page(s): 17411 - 17424

Date of Publication: 31 August 2023

ISSN Information:

PubMed ID: 37651484

DOI: 10.1109/TNNLS.2023.3303502

Contents

I. Introduction

Multi-agent pathfinding (MAPF) is a challenging problem with topical applications in robotics, video games, logistics, etc. Typically, in MAPF, agents are confined to a graph, and at each timestep, an agent can either move to an adjacent vertex or stay put [1]. The task of each agent is to reach a predefined goal vertex. If the graph is undirected, the solution can be found in polynomial time [2] while finding the optimal solution with respect to a range of the objective functions is NP-hard [3]. Moreover, if the graph is directed, even the decision variant of MAPF is intractable [4].

References is not available for this document.

When to Switch: Planning and Learning for Partially Observable Multi-Agent Pathfinding

Abstract:

Metadata

Abstract:

ISSN Information:

I. Introduction

References

IEEE Account

Purchase Details

Profile Information

Need Help?

When to Switch: Planning and Learning for Partially Observable Multi-Agent Pathfinding

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

I. Introduction

References