A reinforcement learning approach for supply chain management
The paper presents a decentralized supply chain management approach based on reinforcement learning. Our supply chain scenario consists of loosely coupled, yield-optimizing scheduling agents that try to learn an optimal acceptance strategy for the jobs offered to them. The optimizer constructs a mandatory schedule by gradually inserting the requested jobs, which arrive stochastically from the customers, into a production queue whenever a job yields a sufficient return. To reduce complexity, each agent is divided into three components: a supply chain interface, which classifies job offers; a reinforcement learning algorithm component, which makes the acceptance decision; and a deterministic scheduling component, which processes the jobs and generates a preliminary state space compression. The reinforcement learning algorithm accepts offers according to their delivery due date, the job price, the timeout penalty cost, and the information provided by the scheduling component. The tasks are finally executed on the supplier's machine following the queue's schedule. In a performance comparison of our yield-optimizing agent, the reinforcement learning solution outperforms the simple acceptance heuristic for all training states.

[...] production and transportation costs using currency exchange rates, tariffs, and production, inventory, late delivery, and transportation costs, the RLA chooses between three possible suppliers and one of two transportation modes. SMART is tested by applying various demand patterns to the supply chain, based on an Erlang probability distribution modified by diverse mean and deviation parameters. Compared with two heuristics, one called LH (local heuristic), which prefers an inner-country production and distribution policy, and another denoted BH (balanced heuristic), which issues demand mainly to warehouses with low capacity load, the SMART allocation mechanism provides the highest reward.

[Figure labels: Tier x+1, Agent x+1]
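To make the acceptance decision from the abstract concrete, here is a minimal Python sketch of a tabular, epsilon-greedy Q-learning agent that accepts or rejects a job offer. It is an illustration under stated assumptions, not the paper's implementation: the names `JobOffer`, `encode_state`, `reward`, and `AcceptanceAgent` are hypothetical, the reward model (price minus the timeout penalty when the job finishes late) is assumed, and the bucketing in `encode_state` merely stands in for the state space compression that the scheduling component would provide.

```python
import random
from dataclasses import dataclass

ACCEPT, REJECT = 1, 0


@dataclass(frozen=True)
class JobOffer:
    """Hypothetical job offer; the fields mirror the abstract's decision inputs."""
    due_date: int    # time steps until delivery is due
    price: float     # payment received for the job
    penalty: float   # timeout penalty if the due date is missed
    duration: int    # processing time on the supplier's machine


def encode_state(offer: JobOffer, queue_load: int) -> tuple:
    """Assumed state compression: bucketed slack plus a profitability flag."""
    slack = offer.due_date - (queue_load + offer.duration)
    slack_bucket = max(-1, min(slack, 3))   # -1 means the job would finish late
    profitable = int(offer.price > offer.penalty)
    return (slack_bucket, profitable)


def reward(offer: JobOffer, queue_load: int, action: int) -> float:
    """Assumed reward: price, minus the penalty if the job finishes late."""
    if action == REJECT:
        return 0.0
    finish = queue_load + offer.duration
    late = finish > offer.due_date
    return offer.price - (offer.penalty if late else 0.0)


class AcceptanceAgent:
    """Tabular Q-learning over (state, action) pairs with epsilon-greedy choice."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = {}   # maps (state, action) -> estimated return

    def choose(self, state, greedy=False):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.choice((ACCEPT, REJECT))
        return max((ACCEPT, REJECT), key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, r, next_state):
        # Standard Q-learning update toward r + gamma * max_a' Q(next_state, a').
        best_next = max(self.q.get((next_state, a), 0.0) for a in (ACCEPT, REJECT))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (r + self.gamma * best_next - old)
```

In a training loop, one would draw offers stochastically, call `choose`, compute `reward`, and call `update` with the state of the next offer. A fixed-threshold rule on the same inputs could serve as the kind of simple acceptance heuristic that the abstract uses as a baseline.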