
State-Aware Value Function Approximation with Attention Mechanism for Restless Multi-armed Bandits.

IJCAI 2021

Huawei Noah’s Ark Lab | Huawei Noah’s Ark Lab The University of Hong Kong | Huawei Noah’s Ark Lab University College London

Abstract
The restless multi-armed bandit (RMAB) problem is a generalization of the multi-armed bandit with non-stationary rewards. Its optimal solution is intractable because the state and action spaces grow exponentially with the number of arms. Existing approximation approaches, e.g., Whittle's index policy, have difficulty capturing either temporal or spatial factors such as impacts from other arms. We propose considering both factors using the attention mechanism, which has achieved great success in deep learning. Our state-aware value function approximation solution comprises an attention-based value function approximator and a Bellman equation solver. The attention-based coordination module captures both spatial and temporal factors for arm coordination. The Bellman equation solver utilizes the decoupling structure of RMABs to acquire solutions with significantly reduced computational overhead. In particular, the time complexity of our approximation is linear in the number of arms. Finally, we illustrate the effectiveness and investigate the properties of our proposed method with numerical experiments.
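The scaling contrast in the abstract can be made concrete with a small counting sketch. It is only an illustration, not the paper's method: it assumes every arm has the same number of local states and that exactly `budget` arms are activated at each step, neither of which is specified in the abstract, and the function names are hypothetical.

```python
from math import comb


def joint_sizes(n_arms: int, states_per_arm: int, budget: int) -> tuple[int, int]:
    """Count joint states and joint actions of the coupled problem.

    Assumption: identical arms with `states_per_arm` local states each,
    and exactly `budget` arms activated per step.
    """
    joint_states = states_per_arm ** n_arms   # one local state per arm
    joint_actions = comb(n_arms, budget)      # which arms to activate
    return joint_states, joint_actions


def decoupled_size(n_arms: int, states_per_arm: int) -> int:
    """Count local states when each arm is treated separately (linear in n_arms)."""
    return n_arms * states_per_arm


if __name__ == "__main__":
    for n in (5, 10, 20):
        states, actions = joint_sizes(n, states_per_arm=2, budget=3)
        print(n, states, actions, decoupled_size(n, 2))
```

With 10 two-state arms and a budget of 3, the coupled problem has 1024 joint states and 120 joint actions, while the per-arm view involves only 20 local states. This is the kind of gap that a decoupled solver would exploit.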
Key words
Agent-based and Multi-agent Systems: Multi-agent Planning, Agent-based and Multi-agent Systems: Resource Allocation, Planning and Scheduling: Planning and Scheduling, Planning and Scheduling: Markov Decision Processes