
Kernel-based reinforcement learning for traffic signal control with adaptive feature selection

Abstract:

Reinforcement learning in a large-scale system is computationally challenging due to the curse of dimensionality. One approach is to approximate the Q-function as a function of a feature vector of the state-action pair, and then to learn the parameters of that function instead. Although assumptions drawn from prior knowledge can suggest an appropriate feature vector, selecting a biased one that insufficiently represents the system usually leads to poor learning performance. To avoid this disadvantage, this paper introduces kernel methods that implicitly define a learnable feature vector rather than a pre-selected one. More specifically, the feature vector is estimated from a reference set that contains all critical state-action pairs observed so far, and it can be updated either by adding a new pair to the reference set or by replacing an existing one. Thus the approximate Q-function keeps adjusting itself as knowledge about the system accumulates through observations. Our algorithm is designed in both batch mode and online mode in the context of traffic signal control. In addition, the convergence of this algorithm is supported experimentally. Furthermore, some regularization methods are proposed to avoid overfitting of the Q-function to noisy observations. Finally, a simulation of traffic signal control at a single intersection is provided, and the performance of this algorithm is compared with Q-learning, in which the Q-function is numerically estimated for each state-action pair without approximation.

Published in: 53rd IEEE Conference on Decision and Control

Date of Conference: 15-17 December 2014

Date Added to IEEE Xplore: 12 February 2015

ISBN Information:

Print ISSN: 0191-2216

DOI: 10.1109/CDC.2014.7039557

Conference Location: Los Angeles, CA, USA


I. Introduction

Reinforcement learning (RL) aims to learn the optimal policy from interactions with the environment. RL is formalized in the framework of the Markov decision process (MDP), where the learner gains decision-supporting knowledge about the underlying structure of the environment from a sequence of observations [1]. In online mode, the learner updates its knowledge after each observation using the temporal difference (TD), while in batch mode it learns in a single step once enough observations have been collected. Q-learning is a typical RL algorithm in which the learner accumulates knowledge about the underlying Q-function (or action-value function) to determine the optimal policy [2].
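To make the online mode concrete, the following is a minimal sketch of one standard tabular Q-learning step driven by the temporal difference. It is not the kernel-based algorithm of this paper. The state and action names, the learning rate alpha, and the discount factor gamma are hypothetical values chosen only for illustration.

```python
from collections import defaultdict


def td_update(q, state, action, reward, next_state, actions,
              alpha=0.1, gamma=0.9):
    """Apply one online Q-learning step and return the TD error.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal difference: TD target minus the current estimate.
    td_error = reward + gamma * best_next - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error


def greedy_action(q, state, actions):
    """Return the action with the highest estimated Q-value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical single-intersection example: states and actions are made up.
actions = ["keep_phase", "switch_phase"]
q = defaultdict(float)  # every Q-value starts at 0.0

# One observation: keeping the phase cost 1 unit of delay (reward = -1).
err = td_update(q, "short_queue", "keep_phase", reward=-1.0,
                next_state="long_queue", actions=actions)
best = greedy_action(q, "short_queue", actions)
```

The table stores one value per state-action pair, which is the representation that becomes impractical in large systems. A batch-mode learner would instead collect many such observations before updating.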

