Charging station recommendation

From RL for transport research

Contributors: Zheng Li and Qi Luo.

This page is mainly based on the work by Xu [1] and Lin [2].

Problem Statement

A more efficient and cleaner transportation system is necessary to reduce fossil fuel consumption and carbon emissions. Electric vehicles (EVs) use electricity directly for mobility and can reduce emissions through high energy-conversion efficiency and the use of renewable energy. Replacing internal combustion engine vehicles with EVs is therefore a feasible way to contribute to a sustainable, low-carbon society. Based on GPS trajectory data of urban mobility, previous related work provides insights into the transportation electrification process and lays a theoretical foundation for further deployment.

However, an EV charging network belongs to both the power network and the transportation network. Large-scale EV adoption couples power grids and transportation networks deeply, which brings a series of new challenges. Simultaneous charging of a large number of EVs has a negative impact on power distribution networks and reduces their stability. In addition, the long charging period of EVs is likely to induce queuing at charging stations, which in turn causes traffic jams. Compared with conventional charging demand, demand for fast charging usually arises during driving and is characterized by high power demand and a highly random spatial–temporal distribution. A large number of fast charging loads can severely impact the electrified transportation system. For example, the level 3 chargers used in fast charging stations may inject both low-order and high-order harmonics into power grids. Therefore, designing an effective charging guidance strategy for EVs with fast charging demand is an important issue that directly affects the stability of the whole coupled power-transportation system and the satisfaction of EV consumers.

Charging station recommendation usually aims to minimize the overall charging time, which consists of driving time, queuing time, and charging time. From the users’ perspective, the quality of the charging experience is the main concern. It is therefore critical to recommend an optimal charging station to each EV, improving user quality of experience by minimizing the overall charging time. Moreover, charging station recommendation also influences related problems, for instance EV routing optimization and charging station placement.
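As a minimal sketch of this objective, the snippet below sums the three time components for each candidate station and picks the smallest total. The `StationEstimate` class, the `recommend` function, and all numbers are hypothetical illustrations; they do not come from Xu [1] or Lin [2], and a real system would have to estimate each component.

```python
from dataclasses import dataclass


@dataclass
class StationEstimate:
    """Hypothetical per-station time estimates, all in minutes."""
    name: str
    driving_min: float   # estimated travel time to the station
    queuing_min: float   # estimated wait for a free charger
    charging_min: float  # estimated time spent charging

    @property
    def overall_min(self) -> float:
        # Overall charging time = driving + queuing + charging.
        return self.driving_min + self.queuing_min + self.charging_min


def recommend(stations):
    """Return the candidate with the smallest overall charging time."""
    if not stations:
        raise ValueError("no candidate stations")
    return min(stations, key=lambda s: s.overall_min)


# Made-up example values for three candidate stations.
candidates = [
    StationEstimate("A", driving_min=5.0, queuing_min=20.0, charging_min=30.0),
    StationEstimate("B", driving_min=12.0, queuing_min=0.0, charging_min=30.0),
    StationEstimate("C", driving_min=8.0, queuing_min=10.0, charging_min=25.0),
]
best = recommend(candidates)
print(best.name, best.overall_min)  # prints "B 42.0"
```

Station A is the closest, but its queue makes it the slowest option overall (55 minutes), which is why the recommendation has to consider all three components together. This greedy rule ignores how one recommendation changes the queues seen by later EVs; that sequential effect is what motivates the RL formulation.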

Connection to RL method

Reinforcement learning (RL) based studies of charging station recommendation typically formulate the problem as a Markov Decision Process (MDP). In a charging station recommendation system, upon receiving a charging request from an EV, the system must decide which charging station the EV should be guided to. This decision is based on the system’s current state, which includes, for example, queuing information at the charging stations and distance information. An optimal decision minimizes the overall charging time, which consists of driving time, queuing time, and charging time. For each decision, the system receives a reward that is inversely proportional to the overall charging time, and the system then evolves to the next state. The whole procedure can therefore be modeled as an MDP, whose key ingredients are a set of decision periods, system states, available actions, rewards, and state- and action-dependent transition probabilities. Based on this analysis, the MDP model captures the characteristics of charging station recommendation well.

Figure 1. Recommendation and charging process [2].
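The MDP ingredients described above can be sketched as a toy environment. Everything in this example is an assumption made for illustration: the `ChargingEnv` class, the single-charger queuing model, the 50% service probability, and the driving-time range are not taken from Xu [1] or Lin [2]. The state holds queue lengths and driving times, the action is a station index, and the reward is the reciprocal of the overall charging time.

```python
import random


class ChargingEnv:
    """Toy MDP with one decision per charging request (assumed dynamics)."""

    def __init__(self, n_stations=3, charge_min=30.0, seed=0):
        self.n = n_stations
        self.charge_min = charge_min       # assumed fixed charging time
        self.rng = random.Random(seed)
        self.queues = [0] * self.n         # EVs waiting at each station
        self.drive = self._new_request()

    def _new_request(self):
        # Driving time (minutes) from the requesting EV to each station.
        return [self.rng.uniform(2.0, 15.0) for _ in range(self.n)]

    def state(self):
        return tuple(self.queues), tuple(self.drive)

    def step(self, action):
        # Overall time = driving + queuing + charging (one charger assumed).
        queuing = self.queues[action] * self.charge_min
        overall = self.drive[action] + queuing + self.charge_min
        reward = 1.0 / overall             # inversely proportional to time
        # Transition: the chosen queue grows, then each non-empty queue
        # serves one EV with probability 0.5, and a new request arrives.
        self.queues[action] += 1
        for k in range(self.n):
            if self.queues[k] > 0 and self.rng.random() < 0.5:
                self.queues[k] -= 1
        self.drive = self._new_request()
        return self.state(), reward


def greedy_policy(state, charge_min=30.0):
    """Baseline policy: minimize driving plus queuing time for this EV."""
    queues, drive = state
    return min(range(len(queues)), key=lambda k: drive[k] + charge_min * queues[k])


def run_episode(env, policy, steps=50):
    """Apply a policy for a number of requests and return the summed reward."""
    total, state = 0.0, env.state()
    for _ in range(steps):
        state, reward = env.step(policy(state))
        total += reward
    return total


print(run_episode(ChargingEnv(seed=0), greedy_policy))
```

An RL agent would replace `greedy_policy` with a learned policy that accounts for the effect of each recommendation on future states.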

Main results

Graph reinforcement learning

To solve the charging station recommendation problem, Xu [1] develops an attention-based deep graph reinforcement learning method, whose architecture is illustrated in Figure 2. A physical connection-based graph formulation with type-specific feature projection integrates the information of the transportation network, fast charging stations, and power grid. Graph attention networks are adopted to learn the representation of the coupled system state. A deep Q-network (DQN) is introduced to deal with delayed action execution, and prioritized replay is adopted to facilitate training. In addition, an attention-prioritized cache construction method is developed on top of the original DQN to improve the evaluation of recommendation actions by preferentially selecting experience sequences that contain more recommendations. A boundary action regulation strategy is also established to prevent the agent from taking invalid actions. A dueling deep Q-network is used as the basic deep reinforcement learning (DRL) framework.

Figure 2. The overall architecture of the proposed method [1].
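Three of the building blocks named above have standard textbook forms, sketched below in plain Python. This is not the implementation of Xu [1]. The dueling aggregation and the proportional prioritization follow their commonly used definitions. Masking invalid actions before the argmax is only an assumed stand-in for the boundary action regulation strategy, whose details are not given on this page. All function names and numbers are hypothetical.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def masked_greedy_action(q_values, valid_mask):
    """Greedy action restricted to valid actions (assumed masking scheme)."""
    valid = [i for i, ok in enumerate(valid_mask) if ok]
    if not valid:
        raise ValueError("no valid action available")
    return max(valid, key=lambda i: q_values[i])


def priority_probs(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Made-up numbers: one state value and three station advantages.
q = dueling_q(value=2.0, advantages=[1.0, 3.0, -1.0])
print(q)  # prints [2.0, 4.0, 0.0]

# Station 1 has the highest Q-value but is marked invalid, so 0 is chosen.
print(masked_greedy_action(q, valid_mask=[True, False, True]))  # prints 0

# With alpha=1 and eps=0, sampling is directly proportional to |TD error|.
print(priority_probs([1.0, 3.0], alpha=1.0, eps=0.0))  # prints [0.25, 0.75]
```

In a full agent, the value and advantage streams would be outputs of the network that follows the graph attention layers, and the sampling probabilities would drive which stored transitions are replayed.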

Multiple-phase MDP model

The MDP model captures the characteristics of charging station (CS) recommendation well. However, MDPs suffer from the ‘curse of dimensionality’, which becomes even worse in the charging station recommendation scenario because the state and action spaces are large. Moreover, the MDP model requires a priori knowledge of all probability distributions of the model, which is generally unavailable in practice. To address these issues, Lin [2] adopt the post-decision state (PDS) and introduce a new state, called the intermediate decision state, into the model; the result is called the Multiple-phase MDP (MMDP) model. With these two additional states, the state transition of the MDP is decomposed into several phases, which significantly reduces the complexity of the state space and of the state transition. Lin [2] then propose an online learning based algorithm to solve the formulated MMDP model.
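The generic post-decision state idea can be illustrated with a small tabular learner. The sketch covers only the textbook PDS technique; it is not the MMDP algorithm of Lin [2] and does not model the intermediate decision state. The two-station setting, the queue cap, the reward, and the 0.6 departure probability are all assumptions. The transition is split into a known deterministic part (the EV joins a queue) and an unknown random part (EVs finish charging), and the value function is learned on post-decision states from observed samples, so the transition probabilities are never used by the learner.

```python
import random
from collections import defaultdict

N_STATIONS, MAX_QUEUE, GAMMA, ALPHA, EPSILON = 2, 3, 0.9, 0.1, 0.1


def post_decision(state, action):
    """Known, deterministic phase: the EV joins the chosen queue (capped)."""
    queues = list(state)
    queues[action] = min(queues[action] + 1, MAX_QUEUE)
    return tuple(queues)


def reward(state, action):
    """Known immediate reward: fewer EVs ahead means less waiting."""
    return -float(state[action])


def exogenous(pds, rng):
    """Random phase, hidden from the learner: EVs may finish charging."""
    return tuple(q - 1 if q > 0 and rng.random() < 0.6 else q for q in pds)


def best_action(state, V):
    # No expectation is needed: the PDS value already averages over
    # the random phase that follows the decision.
    return max(range(N_STATIONS),
               key=lambda a: reward(state, a) + GAMMA * V[post_decision(state, a)])


def train(steps=5000, seed=0):
    rng = random.Random(seed)
    V = defaultdict(float)                 # value of each post-decision state
    state = (0,) * N_STATIONS
    for _ in range(steps):
        if rng.random() < EPSILON:         # occasional exploration
            action = rng.randrange(N_STATIONS)
        else:
            action = best_action(state, V)
        pds = post_decision(state, action)
        next_state = exogenous(pds, rng)   # observed sample, not a model
        a_next = best_action(next_state, V)
        target = reward(next_state, a_next) + GAMMA * V[post_decision(next_state, a_next)]
        V[pds] += ALPHA * (target - V[pds])  # online update at the PDS
        state = next_state
    return V


V = train()
print(len(V), "post-decision states visited")
```

Because the value table is indexed by post-decision states only, choosing an action requires a maximization over known quantities, which is the property that allows online learning without a priori knowledge of the transition probabilities.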


  1. P. Xu et al., “Real-time fast charging station recommendation for electric vehicles in coupled power-transportation networks: A graph reinforcement learning method,” Int. J. Electr. Power Energy Syst., vol. 141, no. April, p. 108030, 2022.
  2. H. Lin, X. Lin, H. Labiod, and L. Chen, “Toward Multiple-Phase MDP Model for Charging Station Recommendation,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 10583–10595, 2021.