
When to Switch: Planning and Learning for Partially Observable Multi-Agent Pathfinding



Abstract:

Multi-agent pathfinding (MAPF) is a problem that involves finding a set of non-conflicting paths for a set of agents confined to a graph. In this work, we study a MAPF setting, where the environment is only partially observable for each agent, i.e., an agent observes the obstacles and other agents only within a limited field-of-view. Moreover, we assume that the agents do not communicate and do not share knowledge of their goals, intended actions, etc. The task is to construct a policy that maps the agent’s observations to actions. Our contribution is multifold. First, we propose two novel policies for solving partially observable MAPF (PO-MAPF): one based on heuristic search and another one based on reinforcement learning (RL). Next, we introduce a mixed policy that is based on switching between the two. We suggest three different switch scenarios: the heuristic, the deterministic, and the learnable one. A thorough empirical evaluation of all the proposed policies in a variety of setups shows that the mixed policy demonstrates the best performance, is able to generalize well to unseen maps and problem instances, and, additionally, outperforms the state-of-the-art counterparts (PRIMAL2 and PICO). The source code is available at https://github.com/AIRI-Institute/when-to-switch.
The general pipeline of the switching approach is as follows. The observation space of the environment consists of two matrices that encode obstacles and agents, as well ...
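
To illustrate the switching idea, here is a minimal sketch (not the authors' implementation; the SwitchingPolicy class, the act interface, and the density-based switch criterion are all hypothetical) of a per-agent policy that picks between a search-based planner and a learned policy at every timestep, using the two observation matrices described above.

import numpy as np

class SwitchingPolicy:
    """Hypothetical per-agent policy that switches between a search-based
    planner and a learned (RL) policy.

    Assumptions (not from the paper): the observation is a dict with two
    binary matrices, 'obstacles' and 'agents', cropped to the agent's
    field-of-view, and both sub-policies expose an act(obs) method that
    returns a discrete move.
    """

    def __init__(self, planner_policy, rl_policy, density_threshold=0.1):
        self.planner = planner_policy        # e.g., heuristic search on the local map
        self.rl = rl_policy                  # e.g., a trained actor network
        self.density_threshold = density_threshold

    def act(self, obs):
        # One possible *heuristic* switch criterion: if few other agents are
        # visible, trust the planner; in congested areas, defer to the
        # learned policy.
        agent_density = float(np.mean(obs["agents"]))
        if agent_density < self.density_threshold:
            return self.planner.act(obs)
        return self.rl.act(obs)

The paper's deterministic and learnable switch scenarios would replace the density check with, respectively, a fixed rule and a trained classifier over the same observation.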
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 35, Issue: 12, December 2024)
Page(s): 17411 - 17424
Date of Publication: 31 August 2023


PubMed ID: 37651484

I. Introduction

Multi-agent pathfinding (MAPF) is a challenging problem with topical applications in robotics, video games, logistics, etc. Typically, in MAPF, agents are confined to a graph, and at each timestep, an agent can either move to an adjacent vertex or stay put [1]. The task of each agent is to reach a predefined goal vertex. If the graph is undirected, a solution can be found in polynomial time [2], while finding an optimal solution with respect to a range of objective functions is NP-hard [3]. Moreover, if the graph is directed, even the decision variant of MAPF is intractable [4].
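
To make the notion of non-conflicting paths concrete, the short illustrative sketch below (not taken from the paper; the function name and path encoding are assumptions) checks a set of candidate paths for the two standard MAPF conflicts: two agents occupying the same vertex at the same timestep, and two agents swapping vertices along the same edge.

from itertools import combinations

def has_conflicts(paths):
    """Return True if any pair of paths has a vertex or edge (swap) conflict.

    Each path is a list of vertices, one per timestep (a waiting agent
    repeats its current vertex); shorter paths are padded with their goal.
    """
    horizon = max(len(p) for p in paths)
    padded = [p + [p[-1]] * (horizon - len(p)) for p in paths]

    for p, q in combinations(padded, 2):
        for t in range(horizon):
            if p[t] == q[t]:                                   # vertex conflict
                return True
            if t + 1 < horizon and p[t] == q[t + 1] and p[t + 1] == q[t]:
                return True                                    # edge (swap) conflict
    return False

# Two agents trying to swap along the same edge conflict:
print(has_conflicts([[(0, 0), (0, 1)], [(0, 1), (0, 0)]]))    # True
# Agents moving on parallel corridors do not:
print(has_conflicts([[(0, 0), (0, 1)], [(1, 0), (1, 1)]]))    # False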

