Neural Combinatorial Optimization for Robot Fleet Management

Providing robotic services often involves the coordination and management of a fleet of robots to optimize how the tasks are assigned, the robots scheduled and the routes planned. These are challenging combinatorial optimization problems due to their large scale, many operational constraints and real-world uncertainties. The same optimization problems also need to be solved several times (e.g. every day, every hour) for a specific service with different inputs. Deep learning-based heuristics exploit the distribution of these inputs to provide tailored heuristics for each use-case. This is really not practical as not only do you need to collect enough data but if the service is even slightly updated (e.g. resulting in some additional constraints) then you need to re-learn a new heuristic from scratch. To circumvent this limitation of task-specific learning, we’re exploring foundation models for combinatorial optimization where the idea is to learn a model on a variety of optimization problems so that it can easily be adapted to new problems.

Towards a Foundation Model for Combinatorial Optimization

Many Combinatorial Optimization (CO) problems have been extensively studied in the Operations Research community resulting in some powerful solutions. Based on these problem-specific solutions we’re working on teaching a neural model to solve a wide range of CO problems. GOAL (Generalist combinatorial Optimization Agent Learning) consists of a single transformer-based backbone and light-weight problem-specific adapters, mostly for input and output processing. It’s the first neural model that is able to solve a variety of CO problems spanning routing, scheduling, packing and graph optimization problems.

GOAL is based on our Bisimulation Quotienting approach for efficient Neural Combinatorial Optimization which is a generic framework to design efficient Markov Decision Processes (MDP) for combinatorial optimization. The particularity of these MDPs is that they exploit the widespread recursive property of CO problems. The intuition is that, for many problems (e.g. all those solvable by dynamic programming), when we start constructing a partial solution, we’re faced with a remaining subproblem of the same nature as the original one. Here’s what the construction process looks like for some classic CO problems.

For the Traveling Salesman Problem (left): given a set of nodes, at each step the model returns a probability distribution over the next node to add to the current path until a complete tour is formed. For the Vehicle Routing Problem (middle): given customer nodes with demands and a vehicle with a maximal capacity, the model computes the trajectory of the vehicle step by step For the Knapsack Problem (right): given a set of items with values (x-dimension) and weights (y-dimension), the model selects a subset of items of maximal value while respecting a capacity constraint.

Multi-robot orchestration

We’re working on the orchestration of robots in NAVER’s latest datacenter, GAK Sejong, the largest in Asia with a capacity of 600,000 server units and one designed to use robots to do a lot of heavy lifting and for general efficiency.

In the warehouse, SeRo robots are responsible for fetching servers from the shelves and loading them onto GaRo robots which transport them to the server rooms. Task requests arrive as tickets, each specifying a set of assets to pick up and load onto a parked GaRo robot. These tickets are generated dynamically throughout the day, creating a stream of interdependent pick-up and delivery tasks for the SeRo robots. We were interested in how the fleet of SeRo robots could navigate collision-free paths to perform their tasks as efficiently as possible in the warehouse. While this problem is a variant of Multi-Agent Path Finding (MAPF) which has been extensively studied, our use-case presents unique challenges. The warehouse is a very constrained space and the SeRo robots are large, making it impossible for two robots to pass each other in most areas. Additionally, due to the heavy weight of the robots and the servers they transport, their precise dynamics must also be taken into account. Planning with the simple constant speed approximation commonly found in the MAPF literature would have resulted in real robot collisions.

To solve this MAPF problem, we’ve proposed an extension to the standard Prioritized Planning algorithm to deal with the inter-dependent tasks, Interleaved Prioritized Planning and a novel Via-Point Star (VP*) algorithm to compute an optimal dynamics-compliant robot trajectory to visit a sequence of goal locations while avoiding moving obstacles. Below is an illustration of our approach in simulation with 10 robots.

Result of the IPP-VP* algorithm for planning the trajectories of 10 robots in the warehouse. The current pick-up and drop-off locations for a robot are represented by dots of the same color as the robot.

After testing our algorithms in the data center, we realized that even in such a controlled setting, with high-fidelity robots, unexpected variations often lead to significant discrepancies between plans and real robot behavior. We’re now focusing on refining our methods to explicitly account for the uncertainty. This will not only strengthen our plans in the data center but also enable us to apply our multi-robot coordination algorithms to a broader range of real-world environments.

Related publications

Multi-agent path finding with real robot dynamics and interdependent tasks for automated warehouses, ECAI 2024

GOAL: a Generalist Combinatorial Optimization Agent Learner, ICLR 2025

Bisimulation Quotienting approach for efficient Neural Combinatorial Optimization, NeurIPS 2023

Neural Combinatorial Optimization for Robot Fleet Management

Towards a Foundation Model for Combinatorial Optimization

Multi-robot orchestration

Related publications

NAVER FRANCE Gender Equality 2024

All

Publications

Blog

News

Code & Data

Careers

People

ACTION

Providing embodied agents with sequential decision-making capabilities to safely execute complex tasks in dynamic environments.

INTERACTION

Equip robots to interact safely with humans, other robots and systems.

VISION

Perception to help robots understand and interact with the environment.

NAVER FRANCE Gender Equality 2023

Action

Neural Combinatorial Optimization for Robot Fleet Management

Towards a Foundation Model for Combinatorial Optimization

Multi-robot orchestration

Related publications

All

Publications

Blog

News

Code & Data

Careers

People

Cookie settings