Demand forecasting

From RL for transport research

This page is mainly based on the study by Hassan et al. [1].


Freight forecasting is essential for managing, planning, operating, and optimizing the use of resources. Multiple market factors contribute to the highly variable nature of freight flows, which calls for adaptive and responsive forecasting models. Accurate demand forecasts support effective decisions for matching supply with demand, but reliable estimates of the freight market are difficult to obtain in a complex real-world environment shaped by randomness and by the interacting decisions of multiple parties.

Connection to RL

Hassan et al. [1] present a demand forecasting methodology that supports freight operation planning over short- to long-term horizons. The method combines time series models and machine learning algorithms in a Reinforcement Learning framework applied over a rolling horizon. The objective is to reduce prediction error by exploiting the strengths of both traditional time series models and machine learning models.

Assume that we have K oracles or forecasters who provide a forecast for each time period in a given market—in this case, the number of containers to be moved in a given week in that market. No forecast is perfect, and sometimes a forecast will hit the mark or get very close, whereas at other times it may miss the mark by varying degrees. Not knowing the past performance of the oracles, the operator (using the forecasts as a basis for planning decisions) may take a simple average of the forecasts as the “consensus” forecast. However, recognizing that some forecasters outperform others in certain instances or for different periods of time, the operator will place different weights on the respective forecasts in forming the “final” forecast.

Reinforcement Learning (RL) provides a mechanism for (1) dynamically learning about the respective performance of the different forecasters based on the quality of their past forecasts, placing greater emphasis on more recent instances, and (2) weighting the respective forecasts accordingly at each instance. In this context, the forecasters, or agents, consist of different statistical or Machine Learning models developed using the same training (estimation) data set and implemented in a rolling horizon framework to provide forecasts at different time scales. The models are run in parallel and compared against actual realizations once those have materialized, providing an automated basis for scoring the forecasters’ respective performance for use in the RL weight updating mechanism. The challenge lies in specifying the reward function and formulating the updating mechanism. The overall approach is summarized in Fig. 1.
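As a rough illustration of the weighting idea, the sketch below keeps one error score per forecaster, smooths it exponentially so that recent weeks count more, and converts the scores into weights for the final forecast. The scoring rule (absolute percentage error), the exponential weighting, the parameters `alpha` and `eta`, all function names, and the data are assumptions made for illustration. The source does not specify the reward function or update rule used in [1].

```python
import math

def combine(forecasts, weights):
    """Final forecast: weighted average of the K component forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))

def update_scores(scores, forecasts, actual, alpha=0.3):
    """Exponentially smoothed absolute percentage error per forecaster.

    A larger alpha (assumed value 0.3) puts more emphasis on recent weeks.
    """
    denom = max(abs(actual), 1e-9)  # avoid division by zero
    errors = [abs(f - actual) / denom for f in forecasts]
    return [(1 - alpha) * s + alpha * e for s, e in zip(scores, errors)]

def scores_to_weights(scores, eta=10.0):
    """Turn error scores into weights that sum to 1 (lower error, higher weight)."""
    raw = [math.exp(-eta * s) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]

# Made-up data: three forecasters, three weeks of container moves.
weekly_forecasts = [[100, 120, 90], [105, 118, 95], [110, 125, 100]]
actuals = [102, 106, 111]

scores = [0.0, 0.0, 0.0]
weights = scores_to_weights(scores)  # equal weights before any feedback
final_forecasts = []
for forecasts, actual in zip(weekly_forecasts, actuals):
    final_forecasts.append(combine(forecasts, weights))  # forecast first
    scores = update_scores(scores, forecasts, actual)    # then observe and score
    weights = scores_to_weights(scores)
```

In this toy run the first forecaster is closest to the realized value every week, so it ends up with the largest weight. With no feedback yet, the first final forecast is the simple average of the three component forecasts.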

FIGURE 1 Stages in the presented methodology.


Market lane clustering

In this study, a market is broken down into clusters. For each cluster, a set of forecasting models is selected as the components used in the RL approach to generate the final forecast. The predicted demand for a cluster is therefore the weighted average of the predictions of the individual component models, with the weights updated continuously using RL.

Forecasting over different time scales

For the short-term forecast, a suitable number of time series models are chosen for each cluster based on the trend characteristics found during the pre-processing stage. However, time series models have low accuracy over long horizons (Nguyen and Chan, 2004), so an alternative to direct long-term weekly forecasting is devised. Instead of directly forecasting weeks far in advance, the number of moves is forecast on a monthly basis, i.e., separately for each of the next two months. The number of moves in each week is then calculated as a weighted average of (a) the No Information/Equal Allocation (NIA) model and (b) the Monthly to Weekly Mapping (MWM) model, with the weights (Ω) updated using an RL mechanism similar to the one previously described.
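The monthly-to-weekly step can be sketched as follows. The source names the NIA and MWM models but does not define them in detail, so the interpretation here is an assumption: NIA splits the monthly forecast equally across the weeks, MWM splits it using historical within-month shares, and a single weight `omega` (standing in for Ω) blends the two. The function names, the shares, and the numbers are hypothetical.

```python
def nia_weekly(monthly_total, n_weeks):
    """No Information / Equal Allocation: every week gets the same share."""
    return [monthly_total / n_weeks] * n_weeks

def mwm_weekly(monthly_total, historical_shares):
    """Monthly to Weekly Mapping (assumed form): allocate the monthly total
    in proportion to historical within-month shares."""
    total = sum(historical_shares)
    return [monthly_total * s / total for s in historical_shares]

def blend(nia, mwm, omega):
    """Weighted average of the two weekly allocations.

    omega is the weight on NIA and (1 - omega) the weight on MWM; in the
    source these weights are updated by RL, here omega is simply fixed.
    """
    return [omega * a + (1 - omega) * b for a, b in zip(nia, mwm)]

# Hypothetical monthly forecast of 400 moves over a four-week month.
monthly_forecast = 400.0
shares = [0.20, 0.25, 0.25, 0.30]  # made-up historical weekly shares

nia = nia_weekly(monthly_forecast, 4)
mwm = mwm_weekly(monthly_forecast, shares)
weekly = blend(nia, mwm, omega=0.5)
```

Both allocations preserve the monthly total, so any blend of them does as well. Only the distribution across weeks changes with `omega`.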

Main result

The proposed approach is tested on a dataset provided by an intermodal company operating in the USA. The dataset includes information for each move by origin and destination. Moreover, for each movement, the dataset has information regarding the commodity type and order placement, pickup and delivery times. The data used in the model is the weekly number of moves from 2013 to 2017. The information presented below is for a market referred to here as X (for data confidentiality purposes).

Short-term model

The forecasts are generated at the lane-commodity cluster level, with the positive deviation penalty set equal to the negative deviation penalty. The results are shown in Fig. 2.

FIGURE 2 Short-term weekly forecasts for market X - forecasts for 2017.
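To make the deviation penalties concrete, the sketch below scores a forecast with separate unit costs for over- and under-forecasting. The linear form, the convention that a positive deviation means the forecast exceeds the actual value, and the function name are assumptions, since the source does not give the penalty function. Equal penalties reproduce the symmetric setting used for the short-term forecasts.

```python
def deviation_penalty(forecast, actual, over_penalty=1.0, under_penalty=1.0):
    """Penalty for a forecast error with separate unit costs.

    Assumed convention: a positive deviation means the forecast is above
    the actual value (over-forecasting), a negative one means it is below.
    """
    deviation = forecast - actual
    if deviation >= 0:
        return over_penalty * deviation
    return under_penalty * (-deviation)

# Symmetric case: over- and under-forecasting by 10 moves cost the same.
symmetric_over = deviation_penalty(110, 100)    # 10.0
symmetric_under = deviation_penalty(90, 100)    # 10.0

# Asymmetric case: under-forecasting is penalized twice as heavily.
asymmetric_under = deviation_penalty(90, 100, under_penalty=2.0)  # 20.0
```

An operator for whom unmet demand is costlier than idle capacity could raise `under_penalty`, which would push the learned weights toward forecasters that under-forecast less often.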


Long-term model

Fig. 3 presents the actual values and the long-term weekly forecasts generated through the proposed method. Both forecasts tend to stabilize the overall trend and present a more stable basis upon which medium-range operational decisions can be made.

FIGURE 3 Long-term weekly forecasts for market X - forecasts for 2017.



The overall framework is tested using market data for a US intermodal company. The margin of error is around 5% and 15% for the short-term T + 1 and T + 2 forecasts, respectively. For the long-term weekly forecasts, the margin of error is around 14% across all forecasting periods in 2017, for both the T + 2 and T + 3 forecasts. Furthermore, the forecasted trends show that the predictions of the proposed framework capture and adjust to recent fluctuations in the market. This indicates that RL coupled with a rolling horizon approach has the potential to improve forecast quality.


Table 1 gives an overview of the use of reinforcement learning for forecasting demand for transportation services and of the models used to forecast freight flows.

Table 1

| Paper | Method(s) | Application | Insights |
|---|---|---|---|
| Gosavi et al. [2] (2002) | Formulated the problem as a semi-Markov decision problem aiming to maximize the average reward. Objective values defined for each state-action pair were updated within a neural network scheme. | Revenue management problem for a single flight leg | The proposed method outperforms the Expected Marginal Seat Revenue (EMSR), a heuristic that is widely used in the industry. |
| Weigang et al. [3] (2008) | Simulate future airspace demand to identify capacity requirements in each sector for different periods of time. The decision support process is designed as a Markov Decision Chain (MDC); the state information from the MDC is transferred to a reinforcement learning model in which an action is selected and executed, and its outcomes are used to update the learning process. | Air traffic flow management decisions | Using traffic flow information for Brazil, the model-suggested course of action was close to reality and even led to improvement in certain instances. |
| Tumer and Agogino [4] (2009) | Multi-agent model that responded quickly to weather and airport conditions to limit local delays. The agents learn continuously through reinforcement learning and provide air traffic controllers with recommendations and decisions. | Air traffic management (ATM) systems | A simulation based on US airspace showed that the proposed model could improve ATM while retaining current flow management procedures, without significant policy shifts. Comparison of simulation results to outputs of a Monte Carlo estimation procedure revealed that the adaptive agents outperformed the reference case. |
| Garrido and Mahmassani [5] (2000) | Multinomial probit (MNP) model for freight demand analysis and flow distribution prediction that captures general spatial and temporal correlation patterns. Given order patterns and information regarding socioeconomic activity, the model forecasts freight flow over space and time for operational and tactical level planning. | Motor carrier company dataset of all shipments picked up in the state of Texas between June 1994 and July 1995 | While predicted probabilities differed from those in the forecasting sample, the modified probit model succeeded in ranking and identifying sites with a higher probability of generating shipments at a given time. |
| Moscoso-López et al. [6] (2016) | Compared the performance of Artificial Neural Network (ANN) and Support Vector Machine (SVM) models in predicting freight volume. Given order patterns and information regarding socioeconomic activity, the model forecasts freight flow over space and time for operational and tactical level planning. | Fresh vegetable transportation through RO-RO operations in the Port of Algeciras Bay | The SVM models performed slightly better than the ANN models in forecasting the volume of fresh vegetables moved each day. |


  1. Hassan L A H, Mahmassani H S, Chen Y. Reinforcement learning framework for freight demand forecasting to support operational planning decisions[J]. Transportation Research Part E: Logistics and Transportation Review, 2020, 137: 101926.
  2. Gosavi A, Bandla N, Das T K. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking[J]. IIE Transactions, 2002, 34(9): 729-742.
  3. Weigang L, de Souza B B, Crespo A M F, et al. Decision support system in tactical air traffic flow management for air traffic flow controllers[J]. Journal of Air Transport Management, 2008, 14(6): 329-336.
  4. Tumer K, Agogino A. Improving air traffic management with a learning multiagent system[J]. IEEE Intelligent Systems, 2009, 24(1): 18-21.
  5. Garrido R A, Mahmassani H S. Forecasting freight transportation demand with the space–time multinomial probit model[J]. Transportation Research Part B: Methodological, 2000, 34(5): 403-418.
  6. Moscoso-López J A, Turias I J T, Come M J, et al. Short-term forecasting of intermodal freight using ANNs and SVR: case of the Port of Algeciras Bay[J]. Transportation Research Procedia, 2016, 18: 108-114.