Travel demand modeling

Contributors: Zhicheng Jin and Qi Luo.

This page is mainly based on the work of Pang et al.[1]

Problem statement

Over the last few decades, researchers have made significant progress in modeling and estimating travel demand. Generally, travel demand models are developed to forecast how transportation demand responds to changes in the attributes of the people using the transportation system. These models are used to predict travel characteristics and the utilization of transport services under alternative socioeconomic scenarios, alternative transport services, and alternative land-use configurations [2]. In detail, a procedure involving four separate steps is used: first, estimating the total inflow and outflow of each zone in the target area (trip generation); second, assigning trips to each zone pair (trip distribution); third, determining the transport mode of each trip (mode choice); and finally, assigning trips to the road network (trip assignment) [3].
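
As a concrete illustration of the middle steps of this pipeline, the following toy sketch runs a gravity-model trip distribution and a binary-logit mode split over an invented three-zone network. All zone totals, costs, and coefficients are made up for illustration and are not drawn from the cited works.

```python
# A minimal sketch of the four-step procedure on a toy 3-zone network;
# all numbers below are hypothetical.
import numpy as np

productions = np.array([100.0, 200.0, 150.0])   # trip generation: outflow per zone
attractions = np.array([180.0, 120.0, 150.0])   # trip generation: inflow per zone
cost = np.array([[1.0, 3.0, 5.0],
                 [3.0, 1.0, 2.0],
                 [5.0, 2.0, 1.0]])              # zone-to-zone travel cost

# Trip distribution: gravity model with exponential deterrence,
# row-balanced so each zone's outflow matches its production.
deterrence = np.exp(-0.3 * cost)
weights = attractions * deterrence
trips = productions[:, None] * weights / weights.sum(axis=1, keepdims=True)

# Mode choice: binary logit between car and transit per OD pair.
v_car = -0.2 * cost                   # systematic utility of car
v_transit = -0.1 - 0.25 * cost        # systematic utility of transit
p_car = np.exp(v_car) / (np.exp(v_car) + np.exp(v_transit))
car_trips, transit_trips = trips * p_car, trips * (1 - p_car)

# Trip assignment is trivial here: each OD pair has a single direct link.
print("OD trips:\n", trips.round(1))
print("Car share:\n", p_car.round(2))
```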

However, trip-based travel demand approaches fail to model individual daily schedules with a trip-chain structure, because the trips generated by these models are separate and no behavioral rule combines them in a rational order. To fill this gap, the activity-based travel demand approach was developed [4]. In this view, travel demand is a result of participating in activities at different places and times. The most contemporary travel demand models developed and employed by regional transportation agencies are activity-based; they employ self-reported surveys to collect information on individuals' socio-demographics and their daily activity purposes, times, and locations. Travel behaviors are regarded as derivatives of activities at different places such as home, work, shopping, and others. Discrete choice models are widely used for modeling activity choice, departure time choice, transport mode choice, and other behavioral factors; a minimal example is sketched below. However, developing activity-based models requires detailed travel behavior surveys, and the data collection is usually expensive and involves significant delay. Because of these limitations, only a typical day's travel demand can be modeled and estimated, which falls short of current requirements from the demand side. Furthermore, most activity-based models consist of multiple modules, such as population synthesis, daily travel pattern, workplace choice, tour generation, trip mode choice, and trip timing, which increases data requirements, model complexity, and computational burden [1].
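
The discrete-choice building block mentioned above can be made concrete with a minimal multinomial-logit sketch for a single mode-choice decision; the coefficients, alternatives, and attribute values here are hypothetical.

```python
# A minimal multinomial-logit (MNL) sketch for a mode-choice step,
# with invented coefficients and three alternatives (car, transit, walk).
import numpy as np

def mnl_probabilities(utilities):
    """Softmax over the systematic utilities of the alternatives."""
    exp_u = np.exp(utilities - utilities.max())  # stabilize the exponent
    return exp_u / exp_u.sum()

# Hypothetical utility spec: V = asc + beta_time * time + beta_cost * cost
beta_time, beta_cost = -0.05, -0.3
asc = np.array([0.0, -0.4, -1.0])     # alternative-specific constants
time = np.array([20.0, 35.0, 60.0])   # in-vehicle/walk time in minutes
cost = np.array([4.0, 2.0, 0.0])      # monetary cost

v = asc + beta_time * time + beta_cost * cost
print(dict(zip(["car", "transit", "walk"], mnl_probabilities(v).round(3))))
```

In a full activity-based model, similar logit layers would sit inside the activity choice, departure time, and destination choice modules as well.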

Connection to RL methods

With the development of technologies such as the IoT and ICT, individual travel footprints can be sensed and recorded by numerous services and devices. Owing to the popularization of smartphones, GPS traces and call detail records (CDRs) are also widely collected, the latter by mobile phone carriers. By leveraging these powerful data sources, researchers can apply machine learning techniques to human mobility modeling and analysis. Recently, several works have adapted the RL approach to model and synthesize daily activity schedules [5][6][7]; a minimal sketch of this idea follows. However, for an extended period, these applications were limited to domains in which agents behave in low-dimensional state spaces with a well-defined reward function.
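
To convey the flavor of these RL formulations, the following tabular Q-learning sketch schedules daily activities over hourly time slots. The states, actions, and hand-crafted reward are invented for illustration and do not reproduce any cited model.

```python
# A hedged tabular Q-learning sketch for daily activity scheduling;
# all states, actions, and rewards are illustrative assumptions.
import numpy as np

ACTIVITIES = ["home", "work", "shop"]
T = 24                                   # hourly time slots
rng = np.random.default_rng(0)

def reward(t, activity):
    # Hand-crafted reward: work pays off mid-day, home in the evening.
    if activity == "work":
        return 1.0 if 9 <= t < 17 else -0.5
    if activity == "shop":
        return 0.5 if 17 <= t < 20 else -0.2
    return 0.8 if (t < 8 or t >= 20) else 0.0

Q = np.zeros((T, len(ACTIVITIES), len(ACTIVITIES)))  # Q[t, current, next]
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(2000):
    a = 0                                # start the day at home
    for t in range(T - 1):
        # Epsilon-greedy choice of the next activity.
        nxt = rng.integers(len(ACTIVITIES)) if rng.random() < eps \
              else int(Q[t, a].argmax())
        target = reward(t + 1, ACTIVITIES[nxt]) + gamma * Q[t + 1, nxt].max()
        Q[t, a, nxt] += alpha * (target - Q[t, a, nxt])
        a = nxt

# Greedy rollout: one synthetic daily activity schedule.
a, schedule = 0, ["home"]
for t in range(T - 1):
    a = int(Q[t, a].argmax())
    schedule.append(ACTIVITIES[a])
print(schedule)
```

The low-dimensional state space (hour of day and current activity) is exactly what makes this tractable, and exactly the limitation noted above.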

Another promising method is IRL. The objective of IRL is to infer the underlying reward structure guiding an agent's behavior from observations of that behavior together with a model of the environment [8]. This may be done either to learn the reward structure for modeling purposes or to enable an agent to imitate a demonstrator's specific behavior. Several prior works in this domain rely on reward functions parameterized with hand-crafted features. Recently, a few studies have attempted to bridge traditional discrete choice models [9], human preferences, and RL. Notably, researchers in the transportation domain have extracted activity sequences from CDR data and integrated an activity-based approach with IRL to infer a structural model of travel behaviors [10]. However, that formulation considers only a limited set of activity types (home, work, travel by car, and travel by bus) as states, with transitions between these states as actions. Features such as location choice and spatial mobility patterns are not incorporated into the model to reconstruct individual daily trajectories directly.
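
The sketch below illustrates maximum-entropy IRL with a linear reward on a tiny activity MDP, fitting reward weights so the soft-optimal policy matches the feature expectations of the demonstrations. The MDP, features, and demonstrations are assumptions for illustration, not the formulation of [10] or [1].

```python
# A minimal maximum-entropy IRL sketch: fit r(s) = w . phi(s) so that the
# soft-optimal policy reproduces the demonstrations' feature expectations.
import numpy as np

S, A = 3, 3                                 # states: home, work, shop
P = np.zeros((S, A, S))
P[:, np.arange(A), np.arange(A)] = 1.0      # action a = "switch to activity a"
phi = np.eye(S)                             # one-hot state features
demos = [[0, 1, 1, 1, 2, 0], [0, 1, 1, 2, 2, 0]]  # invented activity sequences
T, gamma, lr = len(demos[0]), 0.9, 0.1

# Empirical feature expectations from the demonstrated trajectories.
emp = np.mean([phi[traj].mean(axis=0) for traj in demos], axis=0)

w = np.zeros(S)
for _ in range(300):
    r = phi @ w
    # Soft (max-ent) value iteration -> stochastic policy pi(a|s).
    V = np.zeros(S)
    for _ in range(60):
        Q = r[:, None] + gamma * (P @ V)    # shape (S, A)
        V = np.logaddexp.reduce(Q, axis=1)
    pi = np.exp(Q - V[:, None])
    # Forward pass: expected state-visitation frequencies over T steps.
    d = np.zeros(S); d[0] = 1.0             # all days start at home
    visits = np.zeros(S)
    for _ in range(T):
        visits += d
        d = np.einsum("s,sa,sat->t", d, pi, P)
    model = (visits / T) @ phi
    w += lr * (emp - model)                 # max-ent gradient step

print("learned reward weights per activity:", w.round(2))
```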

Main results

In this study [1], an RL-based human mobility model was developed to replicate people's mass movement at the scale of a metropolitan area. The authors introduced IRL to recover human travel behavior preferences that capture the spatio-temporal patterns and context features of human mobility from real GPS trajectories, and to produce synthetic trajectories that report locations and transport mode choices over time. The generated synthetic trajectory dataset can be used to capture and predict within-day people mass movement in cities where demonstration trajectories are available. As shown in Figure 1, the end-to-end framework comprises three parts: data processing, agent modeling with parameter training, and agent-based travel micro-simulation; a schematic rollout sketch follows the figure.

Figure 1. Diagram of the modeling framework.
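
To make the micro-simulation stage concrete, the sketch below rolls out a stochastic policy to generate synthetic daily trajectories of location and mode for a population of agents. In the actual framework this policy would come from the IRL training stage; here it is hand-set, and all locations, modes, and probabilities are placeholders.

```python
# A minimal micro-simulation sketch: many independent policy rollouts
# produce a synthetic population of daily (hour, location, mode) records.
import numpy as np

LOCATIONS = ["home", "office", "mall"]
MODES = ["walk", "car", "transit"]
rng = np.random.default_rng(42)

def policy(hour, loc):
    """Illustrative stand-in for a learned policy: next-location probs."""
    if 7 <= hour < 9 and loc == 0:
        return [0.2, 0.7, 0.1]          # morning commute pull toward office
    if 17 <= hour < 20:
        return [0.6, 0.1, 0.3]          # evening return home or shopping
    probs = [0.1, 0.1, 0.1]
    probs[loc] = 0.8                    # otherwise, mostly stay put
    return probs

def simulate_agent():
    loc, traj = 0, []
    for hour in range(24):
        nxt = rng.choice(3, p=policy(hour, loc))
        # A mode is chosen only when the agent actually relocates.
        mode = MODES[rng.integers(3)] if nxt != loc else None
        traj.append((hour, LOCATIONS[nxt], mode))
        loc = nxt
    return traj

# The synthetic population is simply many independent rollouts.
population = [simulate_agent() for _ in range(1000)]
print(population[0][:10])
```

Aggregating such rollouts over space and time is what allows the framework to reproduce within-day mass movement at metropolitan scale.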

References

  1. Pang, Y., Kashiyama, T., Yabe, T., et al. (2020). Development of people mass movement simulation framework based on reinforcement learning. Transportation Research Part C: Emerging Technologies, 117, 102706.
  2. Adler, T., & Ben-Akiva, M. (1979). A theoretical and empirical model of trip chaining behavior. Transportation Research Part B: Methodological, 13(3), 243-257.
  3. Kitamura, R. (1984). Incorporating trip chaining into analysis of destination choice. Transportation Research Part B: Methodological, 18(1), 67-81.
  4. Kitamura, R., & Fujii, S. (1998). Two computational process models of activity-travel behavior. In Theoretical Foundations of Travel Choice Modeling (pp. 251-279).
  5. Janssens, D., Lan, Y., Wets, G., & Chen, G. (2007). Allocating time and location information to activity-travel patterns through reinforcement learning. Knowledge-Based Systems, 20(5), 466-477.
  6. Xiong, Y. (2014). Modelling individual and household activity-travel scheduling behaviours in stochastic transportation networks.
  7. Yang, M., Yang, Y., Wang, W., Ding, H., & Chen, J. (2014). Multiagent-based simulation of temporal-spatial characteristics of activity-travel patterns using interactive reinforcement learning. Mathematical Problems in Engineering, 2014.
  8. Arora, S., & Doshi, P. (2021). A survey of inverse reinforcement learning: Challenges, methods and progress. Artificial Intelligence, 297, 103500.
  9. Ermon, S., Xue, Y., Toth, R., Dilkina, B., Bernstein, R., Damoulas, T., ... & Barrett, C. (2015). Learning large-scale dynamic discrete choice models of spatio-temporal preferences with application to migratory pastoralism in East Africa. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
  10. Feygin, S. (2018). Inferring structural models of travel behavior: An inverse reinforcement learning approach (Doctoral dissertation, University of California, Berkeley).