Presenting at APCM Europe 2023

Jasper van Heugten, Director of Advanced Technology, will be presenting at the Advanced Process Control and Manufacturing (APCM) Conference on Thursday, March 30.

Jasper’s talk, titled “Maestro: Optimized fab scheduling using reinforcement learning”, highlights how our flagship semiconductor product leverages the latest developments in machine learning to improve scheduling strategies in semiconductor manufacturing.

The APCM Conference is aimed at manufacturers, suppliers, and the scientific community in the semiconductor, photovoltaic, LED, flat-panel, MEMS, and related industries. The topics focus on current challenges and future needs in Advanced Process Control and Manufacturing Effectiveness.

For more information and registration, see the conference’s official website.

Both Jasper and Subham Rath will be attending. If you want to meet them during the conference, drop us a note or reach out during the Q&A session after the talk.

Title: Maestro: Optimized fab scheduling using reinforcement learning

The multi-KPI optimization of scheduling and dispatching in a semiconductor fab is a notoriously hard and time-consuming problem. Our Maestro solution focuses on a tractable method for optimising scheduling in a dynamic fab. Because priorities in the fab change on a day-to-day basis, flexible optimization methods are needed to respond to those changes.

We focus on the optimization of scheduling: with our solution, fab operators can augment their current workflow to quickly and efficiently create optimised schedules. These schedules are based on a user-defined set of KPIs, e.g. related to cycle times and on-time delivery. In contrast to classical optimization techniques, our solution relies on deep reinforcement learning to efficiently interpret and scale to the complex dynamics in the fab in order to arrive at highly optimised schedules.
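As an illustrative sketch of what a user-defined KPI set might look like (the KPI names, weights, and scoring function below are hypothetical assumptions, not Maestro’s actual formulation), several KPIs can be combined into a single scalar score for a candidate schedule:

```python
# Hypothetical sketch: combining fab KPIs into one scalar schedule score.
# KPI names, units, and weights are illustrative assumptions.

def schedule_score(kpis: dict, weights: dict) -> float:
    """Weighted combination of KPIs; higher is better."""
    score = 0.0
    # On-time delivery is a benefit (fraction in [0, 1]), so it adds to the score.
    score += weights["on_time_delivery"] * kpis["on_time_delivery"]
    # Cycle time is a cost (days, lower is better), so it subtracts from the score.
    score -= weights["cycle_time"] * kpis["mean_cycle_time_days"]
    return score

weights = {"on_time_delivery": 10.0, "cycle_time": 0.5}
kpis = {"on_time_delivery": 0.95, "mean_cycle_time_days": 12.0}
print(schedule_score(kpis, weights))  # 10*0.95 - 0.5*12 = 3.5
```

Shifting a fab’s priorities then amounts to changing the weights, which is one way a “user-defined set of KPIs” can be made concrete.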

Reinforcement Learning, a branch of machine learning, trains an agent to make decisions based on previously seen information or scenarios. This information is typically generated on the fly using simulator software or extracted from historical data. The agent gets feedback on the decisions it has made (typically called actions) by inspecting their outcomes and turning those into a mathematical formulation called the ‘reward’. The reward represents the KPIs that the fab operator is interested in. This dynamic feedback process is presented in the diagram below. After the training process is complete, the agent can be queried for suggestions.
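The feedback loop described above can be sketched in a few lines. The example below uses tabular Q-learning on a toy stand-in “simulator” (the environment, states, and reward here are illustrative assumptions, not a real fab simulator or Maestro’s algorithm): the agent picks an action, the environment returns the next state and a reward, and the agent updates its value estimates from that feedback.

```python
import random

random.seed(0)

N_MACHINES = 3          # actions: which machine to dispatch the next lot to
Q = {}                  # state -> list of estimated action values

def choose(state, eps=0.1):
    """Epsilon-greedy action selection over the learned Q-values."""
    q = Q.setdefault(state, [0.0] * N_MACHINES)
    if random.random() < eps:
        return random.randrange(N_MACHINES)
    return max(range(N_MACHINES), key=lambda a: q[a])

def toy_fab_step(state, action):
    """Stand-in simulator: dispatching to machine 2 is best in every state."""
    reward = 1.0 if action == 2 else 0.0
    next_state = (state + 1) % 4
    return next_state, reward

# Training loop: act, observe the reward, update the value estimates.
alpha, gamma = 0.5, 0.9
state = 0
for _ in range(2000):
    action = choose(state)
    next_state, reward = toy_fab_step(state, action)
    q_next = max(Q.setdefault(next_state, [0.0] * N_MACHINES))
    Q[state][action] += alpha * (reward + gamma * q_next - Q[state][action])
    state = next_state

# After training, the agent can be queried for greedy suggestions.
print([choose(s, eps=0.0) for s in range(4)])  # [2, 2, 2, 2]
```

Production systems replace the table with a deep neural network and the toy step function with a detailed fab simulator, but the act–observe–reward–update structure is the same.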

The agent that the solution produces can generate near-real-time scheduling suggestions to support the operators in their decision-making process and quickly adapt to changing demands and unexpected events. In this way, the large computational power of deep reinforcement learning can be utilised to present the fab operators with multiple solutions on the basis of different priorities, which the operator can then apply given their particular knowledge of the situation.



