DIFFERENCES IN SOLUTION STRATEGY BETWEEN SIMULATION AND MATHEMATICAL PROGRAMMING
Beyond simplistic averaging methods that can be performed in a spreadsheet, there are only two accurate approaches to biologics
manufacturing analysis—discrete event simulation and mathematical programming. Here, we compare and contrast these approaches.
The mathematical programming paradigm presented here represents a marked departure from traditional discrete event simulation.
Discrete Event Simulation
Discrete event simulation has been around for many years and a number of commercial simulator systems are available. These
simulators work by initializing the conditions of the simulation, e.g., starting with all vessels idle and empty, and then
defining events that alter this simulation state. The state is defined at a particular time, the simulation or current time tc.
The simulation maintains the state only at the current value of tc. This state includes all of the material inventories at tc, and a list of events that are posted to be processed at times t > tc. When the next (earliest) event is handled, the state of the simulation may change and tc is advanced. The handling of an event will generate new events for future processing. For example, an event that initiates
a batch task will create the event marking the end of that task as well as events that consume the task ingredients and produce
the products at their respective times. Events must be processed in a chronological sequence, from earliest to latest. Because
the state of the simulation exists only at the current value of tc, it is necessary for the simulation to produce a record of the time-dependent data as it progresses, so that a Gantt chart
and material inventory traces will be available. Thus, the simulation can only see the current time; it cannot, for example,
foresee the inventory levels of materials that will exist at future times. Events and data lying to the left of the current
time, t < tc, are part of the history and cannot be changed. In addition, new events may only be created for t > tc. As we discuss below, this is a serious limitation.
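The event-handling loop described above can be illustrated with a minimal sketch. It is not taken from any commercial simulator; the class `Simulation`, the handlers `start_batch` and `end_batch`, and the material names and quantities are all hypothetical. For simplicity the sketch lets a handler change the state directly at tc and rejects only events posted at times earlier than tc.

```python
import heapq


class Simulation:
    """Toy discrete event simulator: the state exists only at the current time tc."""

    def __init__(self):
        self.tc = 0.0          # current simulation time
        self.inventory = {}    # material -> amount, valid only at tc
        self.pending = []      # heap of (time, sequence, handler) for future events
        self.history = []      # recorded trace, needed for Gantt and inventory plots
        self._seq = 0          # tie-breaker so handlers are never compared

    def post(self, time, handler):
        # The past is frozen: events cannot be created to the left of tc.
        if time < self.tc:
            raise ValueError("cannot post an event earlier than the current time")
        heapq.heappush(self.pending, (time, self._seq, handler))
        self._seq += 1

    def run(self):
        # Events are processed strictly from earliest to latest.
        while self.pending:
            time, _, handler = heapq.heappop(self.pending)
            self.tc = time
            handler(self)      # may change the state and post new future events
            self.history.append((self.tc, dict(self.inventory)))


def start_batch(sim):
    # Hypothetical batch task: consume media now, post the end-of-task event.
    sim.inventory["media"] -= 100
    sim.post(sim.tc + 5.0, end_batch)


def end_batch(sim):
    sim.inventory["product"] = sim.inventory.get("product", 0) + 80


sim = Simulation()
sim.inventory = {"media": 250}
sim.post(0.0, start_batch)
sim.post(2.0, start_batch)
sim.run()
```

After `run()` returns, the only full record of what happened is `sim.history`; the simulator itself holds nothing but the final state, and any attempt to post an event before the final tc raises an error.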
Mathematical Programming
The mathematical programming approach used here operates in a completely different fashion from discrete event simulation.
The state of the model refers to and represents the model data at all times between the model start time and the horizon.
The solver data structures maintain the information that describes all activities present in the solution, on all equipment
over the entire timeline. These data, of course, represent the familiar Gantt chart. However, the solver data structures also
maintain the material inventories and location for all materials over the entire timeline. These data represent a complete
time history of inventory for each material. They also imply blocks on the Gantt chart that represent storage activities
for vessels that have non-zero inventories over the timeline. This contrasts with simulation-based methods, and it affords the
ability to handle more complex problems with improved computational speed and flexibility.
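As a rough illustration of a state that spans the whole timeline, the sketch below stores every inventory change between the start time and the horizon, so an activity can be inserted at any time and checked against everything already placed. It is not the VirtECS data structure; the class `Timeline`, its methods, and the example materials are hypothetical, and equipment assignment is omitted.

```python
from bisect import insort


class Timeline:
    """Toy whole-horizon state: all tasks and inventory changes are kept at once."""

    def __init__(self, horizon, initial):
        self.horizon = horizon
        self.initial = dict(initial)   # material -> amount at the start time
        self.changes = []              # sorted (time, material, delta)
        self.tasks = []                # (start, end, name): the Gantt chart

    def level(self, material, t):
        # Inventory of a material at any time t, past or future alike.
        lvl = self.initial.get(material, 0)
        for time, mat, delta in self.changes:
            if time > t:
                break
            if mat == material:
                lvl += delta
        return lvl

    def min_level(self, material):
        # Lowest inventory reached anywhere on the timeline.
        lvl = low = self.initial.get(material, 0)
        for _, mat, delta in self.changes:
            if mat == material:
                lvl += delta
                low = min(low, lvl)
        return low

    def try_insert(self, name, start, duration, consumes, produces):
        # Place a task anywhere on the timeline; undo it if any inventory,
        # including inventory needed by later tasks, would go negative.
        end = start + duration
        if start < 0 or end > self.horizon:
            return False
        saved = list(self.changes)
        for mat, qty in consumes.items():
            insort(self.changes, (start, mat, -qty))
        for mat, qty in produces.items():
            insort(self.changes, (end, mat, qty))
        if any(self.min_level(mat) < 0 for mat in consumes):
            self.changes = saved
            return False
        self.tasks.append((start, end, name))
        return True


tl = Timeline(horizon=30.0, initial={"seed": 1})
ok_b = tl.try_insert("batch B", 10.0, 5.0, {"seed": 1}, {"product": 1})
# An earlier batch would starve batch B of seed, so it is rejected.
rejected = tl.try_insert("batch A", 2.0, 5.0, {"seed": 1}, {"product": 1})
# Adding seed preparation at the far left makes the earlier batch feasible.
ok_prep = tl.try_insert("seed prep", 0.0, 1.0, {}, {"seed": 1})
ok_a = tl.try_insert("batch A", 2.0, 5.0, {"seed": 1}, {"product": 1})
```

Here "batch A" is placed to the left of the already scheduled "batch B", something the chronological event loop of a simulator does not permit.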
Simulation versus Mathematical Methods
Now that we understand the differences in the way data are stored and processed in these two different approaches, we can
contrast how they behave in practice. Two major problems arise when applying simulation methods to a process such as the one
discussed in this paper. The first is the sheer level of complexity of the process. Writing rules for dealing with hundreds
of pieces of equipment and tasks is both tedious and error prone.
The second problem, however, is more severe. Because events must be processed in chronological order, if an event at time
t has been processed, then no events at any earlier time may be considered. Consider what happens when a demand is scheduled
in this process. The culture is initialized in the inoculation area, then moves sequentially through the plant to the bioreactors
and purification systems, emerging as a final product some days later. It is quite natural to view the progress of an individual
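seed batch as it moves through the plant, much as a graduating class might ideally progress en bloc through a university.
A simulation cannot follow a batch in this way, because it must process the events of all batches in strict chronological
order. Our mathematical programming method addresses the entire timeline at all points in the solution algorithm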
execution. This means that in considering the scheduling of a batch, we have placed tasks on the timeline that are days or
even weeks later than when the inoculation took place for that batch. When we consider an adjacent batch on the timeline,
the solver is able to explore, going back on the timeline and placing new inoculation activities at times much earlier than
the final tasks of the leftward batch. We can do this because our approach has no concept of present time. We are free to
range over the entire timeline and insert activities anywhere they are feasible. This is possible because our solution represents
not just the state of the plant at a particular time, but at all times. We represent and can account for all material inventories
and locations over the entire horizon of the schedule. Of course, if events that happen on day one change inventory levels
and alter the value of inventory available on subsequent days, the system must account for this causality to avoid generating
infeasible solutions. This functionality is provided by the core VirtECS solver, which obeys the UDM formulation given above,
much as a commercial LP solver ensures that linear constraints are satisfied. Because we can range over the entire timeline,
care must be taken to honor the time-phased discovery of stochastic information. If the intent is to model actual plant behavior,
the algorithm cannot make decisions based on data that would not yet be known at a given time point. For example, the titer
of a bioreactor batch is not known until it finishes. If a column repack fails, this information only becomes available at
the point of failure. The algorithm described here takes care not to cheat the physics by using data to alter events that
occurred on the timeline before the knowledge of that data was available.
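One way to picture this restriction is to tag each stochastic value with the time at which it becomes known and to filter on that time before any decision is made. The sketch below is only an illustration of the idea, not the algorithm described in this paper; `Observation`, `visible`, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    name: str        # e.g. the titer of a particular bioreactor batch
    value: float
    known_at: float  # time at which the plant would actually learn the value


def visible(observations, decision_time):
    """Return only the data that would be known at decision_time."""
    return {o.name: o.value for o in observations if o.known_at <= decision_time}


observations = [
    Observation("titer_batch_1", 2.4, known_at=12.0),        # known when the batch finishes
    Observation("repack_col_A_failed", 1.0, known_at=20.0),  # known at the point of failure
]

# A decision affecting day 8 may not use the titer; one on day 12 may.
before = visible(observations, 8.0)
after = visible(observations, 12.0)
```

A scheduler that ranges freely over the timeline would consult `visible` with the time of the activity it is about to place or alter, rather than with the latest time it has already scheduled.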
Considering a demand within a mathematical programming framework is a simpler operation than doing so within a simulation.
The outer algorithm contains logic that examines the availability times on the equipment and selects an appropriate equipment
set for processing and storage. Then, a single call to the core solver can cause the demand to be scheduled, honoring all
of the constraints in the UDM formulation. Such a demand-by-demand solution strategy ranges over the timeline from left to
right and then back again until a complete solution is generated. For a single-product facility, such an approach works well.
After completion of the UDM subproblem solution method in the core solver, control returns to the outer algorithm where the
results can be evaluated and additional specialized tasks inserted if need be, before the next demand is considered. This
is the point at which the history of the timeline is examined and process-specific constraints are enforced. For example,
the cumulative loading on each column is examined by computing the weighted total of all tasks since the last column repack.
If this value exceeds the specified limit, a repack task is scheduled on that column before subsequent demands are considered.
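The column-loading check can be sketched as follows. The task representation, the weights, and the limit are hypothetical values chosen for illustration; the source does not state how the weighting is defined.

```python
def loading_since_last_repack(tasks, weights):
    """Weighted total of the tasks on one column since its most recent repack.

    tasks   -- iterable of (start_time, kind) tuples for a single column
    weights -- kind -> loading contribution (assumed values, not from the source)
    """
    total = 0.0
    for _, kind in sorted(tasks):
        if kind == "repack":
            total = 0.0            # a repack resets the cumulative loading
        else:
            total += weights.get(kind, 0.0)
    return total


def needs_repack(tasks, weights, limit):
    # A repack is due when the cumulative loading exceeds the specified limit.
    return loading_since_last_repack(tasks, weights) > limit


column_a = [(0.0, "purify"), (10.0, "purify"), (20.0, "repack"),
            (30.0, "purify"), (40.0, "clean")]
weights = {"purify": 1.0, "clean": 0.5}

load = loading_since_last_repack(column_a, weights)
due_tight = needs_repack(column_a, weights, limit=1.0)
due_loose = needs_repack(column_a, weights, limit=2.0)
```

In an outer loop of the kind described above, a check like `needs_repack` would run after each demand is scheduled, and a repack task would be placed on the column before the next demand is considered.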