Understanding Hybrid Models in Biochemical Process Systems
It’s important to know what we know, and what we don’t
The fundamental workings of biochemical process systems have been studied for decades, even centuries, across science and engineering. Much of this fundamental understanding is universally applicable and can be expressed as material, population, energy, or other balances.
The specifics of individual processes are much less well understood, so experimentation and data-driven modeling are required, e.g. to find the temperature, pH, and other conditions that maximize production. Despite these areas of uncertainty, one should never lose sight of what is known about the fundamentals. For instance, if we have 1000 cells in 1 liter and add 1 more liter of culture medium, we can calculate the new concentration directly, with no need for a data-driven approach. So why disregard such knowledge when running design-of-experiments studies?
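As a minimal sketch of the dilution example, the calculation is a one-line material balance. The function name is hypothetical, and the sketch assumes the added medium contains no cells and that no growth or death occurs during the addition.

```python
def concentration_after_dilution(cell_count, initial_volume_l, added_volume_l):
    """Cells per liter after adding cell-free medium (a plain material balance)."""
    # The number of cells is conserved; only the volume changes.
    total_volume_l = initial_volume_l + added_volume_l
    return cell_count / total_volume_l


# 1000 cells in 1 L, plus 1 L of fresh medium
print(concentration_after_dilution(1000, 1.0, 1.0))  # 500.0 cells per liter
```

No experiment and no fitted model is involved: the answer follows from conservation alone.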
Why use hybrid models at all?
While it may be clear that purely mechanistic models leave a degree of uncertainty, much has been written about the power of machine learning models and AI, so why not simply use these to solve for X? The reality is that machine learning models need data, and lots of it, before they reach their sweet spot. More data means more experiments, which can be excessively costly in time and resources. However, when machine learning is combined with mechanistic models, which capture what is already known, the unknowns are significantly reduced, and the data requirements (and thus the experimental effort) become significantly lower.
When and how to use a hybrid model within process development?
Gaining a better understanding of how a process will evolve under varying conditions is a clear use case for hybrid models. Take, for instance, an upstream bioprocess in which cells produce a product. Applying the material balances to this process directly provides a valid framework onto which any additional knowledge can be mapped. The only unknowns in these balances relate to the biology, namely the specific rates and their dependence on process parameters such as temperature or pH. Therefore, the only experiments needed, when set up in an optimal fashion, are those that deliver this insight. Read our latest publication on the comparison of strategies for iterative model-based upstream bioprocess development for further insight. Once set up, the hybrid model can be used to simulate how the process would evolve when changing, for example, the temperature.
Figure 1: Material Balances

Figure 2: Simulation of process evolution with varying temperature
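The structure described above, mechanistic balances wrapped around data-driven specific rates, can be sketched in a few lines. This is an illustration under stated assumptions, not the model behind the figures: `specific_rates` is a hypothetical stand-in for a model fitted to experimental data, every numerical value is made up, and the balances are written for a simple batch at constant volume.

```python
import math

# All numbers below are illustrative assumptions, not fitted values.
X0_G_L = 0.2      # initial biomass concentration, g/L
S0_G_L = 10.0     # initial substrate concentration, g/L
YIELD_XS = 0.5    # g biomass formed per g substrate consumed


def specific_rates(temperature_c, substrate_g_l):
    """Hypothetical stand-in for the data-driven part of the hybrid model.

    In a real hybrid model this mapping would be learned from experiments;
    here it is a made-up closed-form expression.
    """
    # Assumed temperature optimum at 37 C, falling off on either side
    temp_factor = math.exp(-((temperature_c - 37.0) / 4.0) ** 2)
    mu = 0.04 * temp_factor * substrate_g_l / (0.5 + substrate_g_l)  # growth, 1/h
    q_s = mu / YIELD_XS  # substrate uptake, g/(g*h)
    q_p = 0.3 * mu       # product formation, g/(g*h)
    return mu, q_s, q_p


def simulate_batch(temperature_c, hours=120.0, dt=0.1):
    """Mechanistic part: batch material balances, integrated with explicit Euler."""
    x, s, p = X0_G_L, S0_G_L, 0.0
    for _ in range(int(round(hours / dt))):
        mu, q_s, q_p = specific_rates(temperature_c, s)
        dx = mu * x    # biomass balance
        ds = -q_s * x  # substrate balance
        dp = q_p * x   # product balance
        x, s, p = x + dt * dx, s + dt * ds, p + dt * dp
    return x, s, p


for temperature in (33.0, 37.0):
    x, s, p = simulate_batch(temperature)
    print(f"T={temperature} C: biomass={x:.2f}, substrate={s:.2f}, product={p:.2f} g/L")
```

Only `specific_rates` would need to be learned from data. The balances in `simulate_batch` hold regardless of the organism or the operating conditions, which is why the experimental effort can be focused on the rates alone.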

Alternatively, the mode of operation could be changed. Because the feeds are explicitly incorporated in the material balances, the impact of changing the feed rate or feed composition can be simulated directly, without running additional experiments. This is another potential use case.
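To illustrate why feed changes need no new experiments, the sketch below writes fed-batch balances in terms of total amounts, so the feed appears as an explicit input term. The function name, the Monod-type rate expression, and all parameter values are assumptions chosen for illustration. In a hybrid model the rate expression would come from the data-driven part.

```python
MU_MAX = 0.1    # 1/h, assumed maximum specific growth rate
K_S = 0.5       # g/L, assumed half-saturation constant
YIELD_XS = 0.5  # g biomass per g substrate, assumed
ALPHA = 0.3     # g product per g biomass formed, assumed


def simulate_fed_batch(feed_rate_l_h, feed_substrate_g_l, hours=48.0, dt=0.01):
    """Fed-batch balances on total amounts (g), integrated with explicit Euler."""
    volume, m_x, m_s, m_p = 1.0, 0.5, 2.0, 0.0  # L, g biomass, g substrate, g product
    for _ in range(int(round(hours / dt))):
        s = m_s / volume                 # substrate concentration, g/L
        mu = MU_MAX * s / (K_S + s)      # specific growth rate, 1/h
        growth = mu * m_x                # biomass formed, g/h
        # Substrate balance: feed input minus uptake by the cells
        d_m_s = feed_rate_l_h * feed_substrate_g_l - growth / YIELD_XS
        m_x += dt * growth
        m_s += dt * d_m_s
        m_p += dt * ALPHA * growth
        volume += dt * feed_rate_l_h     # volume balance
    return {"volume_l": volume, "biomass_g": m_x, "substrate_g": m_s, "product_g": m_p}


# Same biology, three operating scenarios: no feed, a feed, and a richer feed
for rate, conc in ((0.0, 100.0), (0.01, 100.0), (0.01, 200.0)):
    result = simulate_fed_batch(rate, conc)
    print(f"F={rate} L/h, S_feed={conc} g/L -> product={result['product_g']:.2f} g")
```

The feed rate and feed composition enter only through the balance terms, so switching scenarios changes two function arguments and leaves the rate expression untouched.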
Broadly applicable to a diverse range of process challenges
These examples are just a few of the applications in which hybrid models can drive significant efficiencies and deeper understanding. Their flexibility lets them address many process challenges across scale-up, process characterization, process validation, and more. Crucially for industry, access to high-performing hybrid models is becoming easier than ever as service providers, such as DataHow, begin to integrate them into software solutions.