SED2AM: Solving Multi-Trip Time-Dependent Vehicle Routing Problem Using Deep Reinforcement Learning

CoRR (2025)

University of Calgary | National Research Council Canada


Abstract

Deep reinforcement learning (DRL)-based frameworks, featuring Transformer-style policy networks, have demonstrated their efficacy across various vehicle routing problem (VRP) variants. However, the application of these methods to the multi-trip time-dependent vehicle routing problem (MTTDVRP) with maximum working hours constraints – a pivotal element of urban logistics – remains largely unexplored. This paper introduces a DRL-based method called the Simultaneous Encoder and Dual Decoder Attention Model (SED2AM), tailored for the MTTDVRP with maximum working hours constraints. The proposed method introduces a temporal locality inductive bias to the encoding module of the policy networks, enabling it to effectively account for the time-dependency in travel distance or time. The decoding module of SED2AM includes a vehicle selection decoder that selects a vehicle from the fleet, effectively associating trips with vehicles for functional multi-trip routing. Additionally, this decoding module is equipped with a trip construction decoder leveraged for constructing trips for the vehicles. This policy model is equipped with two classes of state representations, fleet state and routing state, providing the information needed for effective route construction in the presence of maximum working hours constraints. Experimental results using real-world datasets from two major Canadian cities not only show that SED2AM outperforms the current state-of-the-art DRL-based and metaheuristic-based baselines but also demonstrate its generalizability to solve larger-scale problems.
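
To make the problem setting concrete, the following is a minimal sketch of how time-dependent travel times and a maximum working hours constraint interact when one vehicle performs several trips. It is not taken from the paper: the piecewise-constant travel-time table `TT`, the interval length, the depot index `0`, and all function names are assumptions chosen for illustration.

```python
from typing import Sequence

DEPOT = 0  # assumed depot index (hypothetical convention)

def travel_time(tt, interval_len: float, depart: float, i: int, j: int) -> float:
    """Travel time i -> j, read from the table for the interval containing `depart`.

    tt[k][i][j] is the (assumed piecewise-constant) travel time in interval k;
    departures after the last interval reuse the last table.
    """
    k = min(int(depart // interval_len), len(tt) - 1)
    return tt[k][i][j]

def working_time(trips: Sequence[Sequence[int]], tt, interval_len: float,
                 start: float = 0.0) -> float:
    """Clock time after one vehicle drives all its trips back to back.

    Each trip starts and ends at the depot; the departure time of every leg
    depends on everything driven before it, which is what makes the problem
    time-dependent.
    """
    clock = start
    for trip in trips:
        prev = DEPOT
        for node in list(trip) + [DEPOT]:
            clock += travel_time(tt, interval_len, clock, prev, node)
            prev = node
    return clock

def within_working_hours(trips, tt, interval_len: float, max_hours: float) -> bool:
    """Check the maximum working hours constraint for one vehicle."""
    return working_time(trips, tt, interval_len) <= max_hours

# Toy data: 3 nodes, two one-hour intervals; traffic is slower in the second.
TT = [
    [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]],  # interval 0
    [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]],  # interval 1
]
TOTAL = working_time([[1], [2]], TT, interval_len=1.0)
```

In this toy instance the second trip departs in the slower interval, so the same pair of trips costs more than it would at the first interval's speeds, and a vehicle that looks feasible under static travel times can violate the working hours limit.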


Key words

Multi-Trip Time Dependent Vehicle Routing Problem, Combinatorial Optimization, Deep Reinforcement Learning, Attention Model

Chat Paper

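[Key points]: This paper proposes SED2AM, a deep reinforcement learning-based model for the multi-trip time-dependent vehicle routing problem (MTTDVRP) with maximum working hours constraints. It achieves more effective route planning by introducing a temporal locality inductive bias and a dual-decoder structure.

[Method]: SED2AM uses Transformer-style policy networks and introduces a temporal locality inductive bias into the encoding module to account for the time-dependency of travel time and distance. The decoding module contains a vehicle selection decoder and a trip construction decoder, which select a vehicle and construct its trips, respectively.

[Experiments]: Experiments use real-world datasets from two major Canadian cities. The results show that SED2AM outperforms the current state-of-the-art DRL-based and metaheuristic-based baselines and generalizes to larger-scale problems.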
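
The two-stage decoding described above can be sketched as a single greedy step: first choose a vehicle, then choose the next node for that vehicle. This is only an illustrative sketch under assumptions. The scores are plain input numbers standing in for whatever the attention-based decoders would produce, the feasibility masks are supplied by the caller, and the names `masked_softmax` and `decode_step` are hypothetical rather than from the paper.

```python
import math
from typing import List, Sequence, Tuple

def masked_softmax(scores: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over entries whose mask is True; masked entries get probability 0."""
    valid = [s for s, m in zip(scores, mask) if m]
    if not valid:
        raise ValueError("no feasible action")
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

def decode_step(vehicle_scores: Sequence[float],
                vehicle_mask: Sequence[bool],
                node_scores: Sequence[Sequence[float]],
                node_mask: Sequence[Sequence[bool]]) -> Tuple[int, int]:
    """One greedy decoding step: vehicle first, then the next node for it.

    vehicle_mask[v] is False when vehicle v cannot be used (for example, its
    working hours are exhausted); node_mask[v][n] is False when node n is
    infeasible for vehicle v. Both masks are assumed inputs here.
    """
    p_vehicle = masked_softmax(vehicle_scores, vehicle_mask)
    v = max(range(len(p_vehicle)), key=p_vehicle.__getitem__)
    p_node = masked_softmax(node_scores[v], node_mask[v])
    n = max(range(len(p_node)), key=p_node.__getitem__)
    return v, n

# Toy step: vehicle 1 has the higher score but is masked out, so vehicle 0 is
# chosen; for vehicle 0, node 1 is masked, so the best remaining node is 2.
CHOICE = decode_step(
    vehicle_scores=[0.2, 1.5],
    vehicle_mask=[True, False],
    node_scores=[[0.1, 2.0, 0.3], [0.0, 0.0, 0.0]],
    node_mask=[[True, False, True], [True, True, True]],
)
```

Selecting the vehicle before the node is what ties each constructed trip to a specific vehicle, so that per-vehicle limits such as working hours can be enforced through the masks. A trained policy would typically sample from these distributions during training rather than always taking the argmax.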
