An improved deep Q-learning algorithm for a trade-off between energy consumption and productivity in batch scheduling (2024)


Authors: Xu Zheng and Zhen Chen

Published: 17 April 2024



    The single batch-processing machine, common in industrial manufacturing, processes a group of jobs concurrently in variable-speed batches, so both energy consumption and processing time vary with the chosen speed. Balancing total energy consumption (TEC) against makespan (C_max) is difficult because the two objectives interact in complex ways in real-world settings. This study introduces a mixed-integer programming model that determines optimal machine operating states for diverse workloads, meeting customer requirements while saving energy for sustainability. An energy-aware batch scheduling deep Q-learning network (EBSDQN) framework is developed, encompassing sequencing policies, batching rules, and speed adjustment policies. The framework is designed for this NP-hard problem, for which commercial solvers such as Gurobi commonly fail to find optimal solutions within a 360-second time limit on small-scale instances. The EBSDQN design is strengthened with efficient action decoding policies, refined reward assessments, exploitative neighborhood rules, and a streamlined buffer training process. This approach reduces training time and adapts dynamically in accordance with the principles of the Markov Decision Process (MDP) and deep neural networks. The developed algorithm is robust and converges after 600 training episodes. In separate comparisons with a commercial solver, two single-objective algorithms, and two multi-objective algorithms, the algorithm consistently shows superior overall performance on the same instances. A worst-case analysis further examines the influence of job features.
    In summary, this study shows that both TEC and C_max are highly sensitive to the distribution of job processing times, and it suggests prioritizing TEC over C_max in optimization for sustainable planning and operations. This research may also accelerate the integration of artificial intelligence into the manufacturing sector.
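
The trade-off described in the abstract can be illustrated with a small Python sketch that evaluates TEC and C_max for a fixed batch sequence. It is not the authors' model. It assumes, for illustration only, that a batch takes as long as its longest job divided by a speed factor, and that power grows with the square of the speed. The names `Batch`, `evaluate_schedule`, and `base_power` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Batch:
    job_times: Tuple[float, ...]  # nominal processing times of the jobs in the batch
    speed: float                  # speed factor for this batch (1.0 = nominal speed)


def evaluate_schedule(batches: List[Batch], base_power: float = 10.0) -> Tuple[float, float]:
    """Return (TEC, C_max) for batches processed one after another.

    Assumptions (not taken from the paper):
      - batch duration = longest job in the batch / speed
      - power draw     = base_power * speed ** 2
    """
    tec = 0.0
    cmax = 0.0
    for batch in batches:
        duration = max(batch.job_times) / batch.speed
        power = base_power * batch.speed ** 2
        tec += power * duration   # energy used by this batch
        cmax += duration          # batches run back to back on one machine
    return tec, cmax


# Same batching, two speed choices for the second batch.
slow = [Batch((4.0, 2.0), 1.0), Batch((6.0,), 1.0)]
fast = [Batch((4.0, 2.0), 1.0), Batch((6.0,), 2.0)]

print(evaluate_schedule(slow))  # (100.0, 10.0)
print(evaluate_schedule(fast))  # (160.0, 7.0)
```

Under these assumed formulas, doubling the speed of the second batch shortens the makespan from 10 to 7 time units and raises energy use from 100 to 160 units. A scheduler has to weigh this kind of tension when it picks speeds.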


    An energy-efficient batch scheduling problem is investigated.

    A mixed-integer linear programming model is proposed.

    The excellent performance of the energy-aware deep Q network is confirmed.

    The developed framework can be readily transferred to other scheduling problems.
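
For readers unfamiliar with the learning rule behind a deep Q network, the following is a minimal tabular sketch of the standard Q-learning update. It is not the EBSDQN framework: a deep Q network replaces the table with a neural network and adds a replay buffer, and the paper's state, action, and reward definitions are not reproduced here. The function name `q_update` and the default values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    next_actions: Iterable[Hashable],
    alpha: float = 0.1,   # learning rate (assumed value)
    gamma: float = 0.9,   # discount factor (assumed value)
) -> float:
    """Apply one standard Q-learning step and return the updated Q-value."""
    # Greedy value of the next state; 0.0 if no actions remain (terminal state).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q: QTable = defaultdict(float)
q[("s1", "speed_up")] = 2.0
new_value = q_update(q, "s0", "speed_up", 1.0, "s1", ["speed_up", "slow_down"],
                     alpha=0.5, gamma=0.5)
print(new_value)  # 1.0
```

In a scheduling setting the reward would typically combine the energy and makespan objectives, but how the paper does this is not stated in the abstract.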







    Published In

    Computers and Industrial Engineering Volume 188, Issue C

    Feb 2024

    1029 pages


    Elsevier Ltd.


    Pergamon Press, Inc.

    United States

    Author Tags

    1. Sustainability
    2. Energy consumption
    3. Energy-efficient scheduling
    4. Batch processing machine
    5. Deep reinforcement learning


    • Research-article

