Pricing and bidding in freight transportation

From RL for transport research

This page is mainly based on the survey by Yan et al.[1].

Problem statement

With the increasing use of the internet, online freight transportation procurement platforms have become effective marketplaces that bring together many carriers and shippers. In the spot market, shippers have delivery requirements and pay carriers to perform the delivery services; carriers bid to receive orders from shippers and aim to earn profits; administrators are obliged to ensure that the pollutants generated during transport stay within a certain level and to avoid congestion as much as possible[2].

Connection to RL methods

The bidding market of freight logistics is a multi-party game. Therefore, RL is usually used to model an agent for one or more of the parties. Different modeling choices lead to different research directions. When shippers and carriers are modeled as agents, bidding strategies and market equilibrium are the focus of attention. When intermediate administrators are modeled as agents, researchers pay more attention to the social welfare brought about by pricing strategies.

Shippers and carriers as RL agents

This kind of research typically models the RL environment as a multi-agent framework containing a carrier, a shipper, and a broker. For each individual transport job (e.g., a smart container), the shipper and carrier pose bid and ask prices respectively. The neutral broker agent, with a business model inspired by transport matching platforms and highly decentralized financial markets, matches these bids and asks at the batch level. The shipper and carrier compete against each other, actively learning strategies to maximize their own reward given the deployed strategy of the opponent[3].

FIGURE 1 Visual representation of the bid-ask system. For each job, the shipper and carrier pose a bid and ask price respectively. The broker assigns jobs based on the bid-ask spread. For the carrier and shipper, the process is essentially a black box[3].
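The broker's batch-level matching can be sketched as follows. This is a minimal illustration, not the exact mechanism of [3]: the job/price layout and the midpoint settlement rule are assumptions made here, while the actual broker design assigns jobs based on the bid-ask spread.

```python
def match_batch(bids, asks):
    """Match shipper bids against carrier asks for one batch of jobs.

    bids: {job_id: price the shipper is willing to pay}
    asks: {job_id: minimum price the carrier demands}
    A job is assigned when bid >= ask; the settlement price is taken as the
    midpoint, so the bid-ask spread is split evenly between the two parties.
    """
    matches = {}
    for job_id, bid in bids.items():
        ask = asks.get(job_id)
        if ask is not None and bid >= ask:
            matches[job_id] = (bid + ask) / 2  # midpoint settlement price
    return matches

# Example batch of three transport jobs.
bids = {"job1": 120.0, "job2": 80.0, "job3": 95.0}
asks = {"job1": 100.0, "job2": 90.0, "job3": 95.0}
print(match_batch(bids, asks))  # -> {'job1': 110.0, 'job3': 95.0}
```

Note that job2 stays unmatched because the shipper's bid (80) is below the carrier's ask (90); in the batch setting, both agents only observe which jobs cleared, which is why the process appears as a black box to them.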


Administrators as RL agents

Teo's team[4][5] studies the effect of city logistics measures consisting of joint delivery systems, an urban consolidation center, parking space restrictions, etc. To study the behavior of urban freight stakeholders and their interactions, which are affected by the policy measures, the Multi-Agent Systems (MAS) modelling approach is used to represent their multi-objective environment. They discuss the MAS in the context of city logistics measures that are aimed at changing the stakeholders' behavior. The interactions among the stakeholders can be described using a MAS interaction model, as shown in Fig. 2.

FIGURE 2 Stakeholders’ interaction within the MAS model[4].


FIGURE 3 MAS framework for evaluating freight vehicle road pricing [5].


Benefits of UCCs

The effects of city logistics solutions are uncertain due to fluctuating demand, parking issues, and the multiple agents within the system. Moreover, city logistics involves many stakeholders, such as shippers, freight carriers, customers, and administrators. Their day-to-day interaction adds further uncertainty to the city logistics environment, especially when their decisions are obscured from each other, even though those decisions and their effects are interlinked. To balance the economic, social, and environmental benefits amongst these stakeholders, numerous city logistics solutions have been proposed and implemented in several cities, including Joint Delivery Systems (JDS) with Urban Consolidation Centers (UCCs)[6].

Therefore, Firdausiyah et al.[6] contribute by developing a learning model framework for freight carriers and a UCC operator, based on RL, which is capable of adapting within an uncertain environment shaped by fluctuating demand, parking costs, and multi-agent behavior in city logistics systems.

FIGURE 4 Interaction and behavior of agents in the implementation of JDS[6].
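The adaptive behavior described above can be illustrated with a minimal tabular sketch. This is a toy one-step Q-learning stand-in, not the ADP-based multi-agent simulation of [6]; the states, actions, and all cost and revenue numbers are invented purely for illustration.

```python
import random

# Toy carrier learning problem: states are (demand, parking-cost) levels and
# actions are "self" (deliver directly) or "ucc" (consolidate via the UCC).
STATES = [(d, p) for d in ("low", "high") for p in ("cheap", "dear")]
ACTIONS = ["self", "ucc"]

def reward(state, action):
    demand, parking = state
    revenue = 10 if demand == "high" else 4
    if action == "self":
        cost = 6 if parking == "dear" else 2  # parking dominates direct delivery cost
    else:
        cost = 3  # flat consolidation fee charged by the UCC
    return revenue - cost

def train(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)  # demand and parking costs fluctuate day to day
        if rng.random() < eps:  # epsilon-greedy exploration
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])  # one-step value update
    return q

q = train()
```

With these numbers the learned policy is state-dependent: under high demand and expensive parking the carrier routes via the UCC, while under low demand and cheap parking direct delivery remains preferable, which mirrors the kind of adaptation the framework is designed to capture.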


Main result

Research on bidding and pricing strategies in freight transportation is rich. These studies usually use multi-agent methods to model the game environment. Their conclusions fall into two categories: insights at the algorithm level and insights at the management level.

Insights in algorithms

These studies prove the effectiveness of the designed algorithms in their respective models and offer inspiration for future RL algorithm design in this field[7][8][9][10].

Table 3
| Paper | Environment agent(s) | Method(s) | Insights |
| Wong et al. (2010) | Infrastructure and train service providers | DQN | The proposed algorithm improves profit |
| van Heeswijk et al. (2019) | UCCs | ADP | ADP outperforms heuristics |
| Firdausiyah et al. (2020) | Carriers and UCC operators | ADP + DQN | ADP outperforms DQN |
| Guo et al. (2021) | Carriers | DQN + tabular Q-learning | Learning outperforms no learning |

Insights in management

These studies simulate the real world through algorithmic simulation and draw management conclusions[4][11][12].

Table 4
| Paper | Environment agent(s) | Method(s) | Insights |
| Wangapisit et al. (2014) | UCC operators, carriers, and administrator | DQN | Subsidising UCCs can reduce emissions |
| Teo et al. (2014) | Administrator | DQN | Cordon-based charging escalates delivery costs |
| van Heeswijk (2020) | Smart containers | Policy gradient | Smart containers are efficient |
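The policy-gradient approach used for smart-container bidding[12] can be sketched in miniature. Here the gradient of the expected reward is computed exactly for a softmax policy over discrete bid levels; sampling this gradient from interactions instead would give a REINFORCE-style algorithm. The bid levels and the market's acceptance model below are invented for illustration only.

```python
import math

# Candidate bid levels a smart container may post (illustrative values).
BID_LEVELS = [2.0, 4.0, 6.0, 8.0]

def accept_prob(bid):
    # Toy market response: lower bids are accepted more often.
    return max(0.0, 1.0 - bid / 12.0)

# Expected profit per bid level: bid earned only if the bid is accepted.
REWARDS = [b * accept_prob(b) for b in BID_LEVELS]

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def train(steps=2000, lr=0.1):
    prefs = [0.0] * len(BID_LEVELS)  # softmax preferences over bid levels
    for _ in range(steps):
        probs = softmax(prefs)
        j_val = sum(p * r for p, r in zip(probs, REWARDS))  # expected reward J
        for j in range(len(prefs)):
            # Exact policy gradient for a softmax policy: dJ/dpref_j = p_j (r_j - J)
            prefs[j] += lr * probs[j] * (REWARDS[j] - j_val)
    return prefs

prefs = train()
best = BID_LEVELS[max(range(len(prefs)), key=lambda j: prefs[j])]
print(best)  # -> 6.0
```

Gradient ascent shifts probability mass toward bid levels whose expected profit exceeds the current policy's average, so the policy concentrates on the bid that best trades off price against acceptance probability.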


References

  1. Yan Y, Chow A H F, Ho C P, et al. Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities[J]. Transportation Research Part E: Logistics and Transportation Review, 2022, 162: 102712.
  2. Olcaytu E, Kuyzu G. Synergy-based bidding method for simultaneous freight transportation auctions[J]. Transportation Research Procedia, 2018, 30: 295-303.
  3. van Heeswijk W J A. Strategic bidding in freight transport using deep reinforcement learning[J]. Annals of Operations Research, 2022: 1-38.
  4. Wangapisit O, Taniguchi E, Teo J S E, et al. Multi-agent systems modelling for evaluating joint delivery systems[J]. Procedia-Social and Behavioral Sciences, 2014, 125: 472-483.
  5. Teo J S E, Taniguchi E, Qureshi A G. Evaluating city logistics measure in e-commerce with multiagent systems[J]. Procedia-Social and Behavioral Sciences, 2012, 39: 349-359.
  6. Firdausiyah N, Taniguchi E, Qureshi A G. Modeling city logistics using adaptive dynamic programming based multi-agent simulation[J]. Transportation Research Part E: Logistics and Transportation Review, 2019, 125: 74-96.
  7. Wong S K, Ho T K. Intelligent negotiation behaviour model for an open railway access market[J]. Expert Systems with Applications, 2010, 37(12): 8109-8118.
  8. van Heeswijk W J A, Mes M R K, Schutten J M J. The delivery dispatching problem with time windows for urban consolidation centers[J]. Transportation Science, 2019, 53(1): 203-221.
  9. Firdausiyah N, Taniguchi E, Qureshi A G. Multi-agent simulation-adaptive dynamic programming based reinforcement learning for evaluating joint delivery systems in relation to the different locations of urban consolidation centres[J]. Transportation Research Procedia, 2020, 46: 125-132.
  10. Guo C, Thompson R G, Foliente G, et al. Reinforcement learning enabled dynamic bidding strategy for instant delivery trading[J]. Computers & Industrial Engineering, 2021, 160: 107596.
  11. Teo J S E, Taniguchi E, Qureshi A G. Evaluation of load factor control and urban freight road pricing joint schemes with multi-agent systems learning models[J]. Procedia-Social and Behavioral Sciences, 2014, 125: 62-74.
  12. van Heeswijk W. Smart containers with bidding capacity: A policy gradient algorithm for semi-cooperative learning[C]//International Conference on Computational Logistics. Springer, Cham, 2020: 52-67.