Possible insights for strain engineering from precision cell simulators
Strain development takes approximately 5 years and $50 million to bring a product to market. Can precision cell simulators offer new and actionable insights?

by Laurence Yang

Challenge for Strain Development
Average time and investment to bring a product to market through strain development: about 5 years and $50M.
Failed scale-up runs remain a major contributor (Abbate et al., 2023), for two reasons:
  • strains are typically selected using small-scale experimental data
  • strains may genetically escape the fitness cost of diverting carbon and energy to the product
Both factors reduce key performance indicators (KPIs) such as titer and yield at scale.
To scale up without losing performance, strain engineering cycles need to consider fitness and scale-up potential in addition to titer and yield (Abbate et al., 2023).
ME-Models: Precision Cell Simulators
Given these ongoing challenges, here are some ideas for strain engineering. I will draw specifically on precision cell simulators called ME-models (metabolism and macromolecular expression).

For full disclosure, these insights haven't been tested in industrial settings yet. The models' complexity makes them slow to adopt even in academia, and only a select few companies have attempted to apply them consistently in industry. I'm trying to change this by making both the insights and the simulators themselves more accessible.
What is a ME-model?
ME (Metabolism and macromolecular Expression)-models are multi-scale genome-scale models that integrate cellular metabolism with macromolecular expression. They expand metabolic network models with first-principles reconstruction of transcription, translation, and post-translational modification pathways. Proteome allocation constraints are built in, as is variable biomass composition in different environments.
Some notable extensions for E. coli:
  • DynamicME: Enables dynamic simulation and refinement of integrated metabolism and protein expression models
  • StressME (iZY1689-StressME): recent comprehensive ME model of E. coli that combines three stress response models: FoldME (thermal stress), OxidizeME (oxidative stress), AcidifyME (acid stress). It includes 1,689 genes, 1,578 proteins, 1,673 metabolites, 1,692 complexes, and 36,735 reactions.
ME-models account for fitness burdens well beyond the redirection of carbon and energy. They cover many aspects of cellular metabolism and protein expression, including:
  • The detailed cost of expressing a protein: the nutrients required, synthesis of the protein itself, synthesis of additional ribosomes and RNA polymerase (RNAP), and allocation of finite ribosome and RNAP capacity to specific proteins
  • Occupation of a finite proteome budget (cell volume and mass)
  • Protein folding and homeostasis requirements and chaperone load
  • Cost of detoxifying reactive oxygen species or mitigating acid stress
This holistic accounting of fitness burdens leads to insight #1:
Insight #1. The most limiting burden to host fitness changes with current strain performance
Strain engineering needs to address distinct fitness limitations as strain performance improves:
1. Initial Phase (Low Production): Carbon & Energy
  • Primary limitation is basic growth and maintenance
  • Cell maintains normal homeostasis
  • Minimal stress response activation
2. Mid-Phase (Moderate Production): Resource Competition
  • Transcription/translation machinery becomes limiting
  • Metabolic precursor pools deplete
  • Energy constraints emerge from product synthesis
3. Advanced Phase (High Production): Cellular Infrastructure
  • Protein folding capacity saturates
  • Membrane transport systems overload
  • Chaperone networks become overwhelmed
4. Final Phase (Maximum Production): Stress Management
  • Oxidative stress accumulates
  • pH homeostasis destabilizes
  • Toxic intermediate buildup
This progression suggests that optimization strategies should match the current limiting factor rather than trying to solve all bottlenecks simultaneously.
It's important to note that this progression also depends on the product.
Protein production likely faces immediate translational bottlenecks, while small molecule production initially struggles with carbon partitioning before protein expression becomes limiting.
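To make the idea of a shifting limitation concrete, here is a minimal toy sketch in Python. It is not an ME-model: each burden is reduced to a straight line (a baseline load plus a load per unit of product flux), and every name and number in it is a hypothetical placeholder chosen only so that the binding constraint changes as production rises.

```python
# Toy illustration only. All constraint names and numbers are hypothetical
# placeholders, not outputs of an ME-model.
# name: (baseline load with no product, extra load per unit product flux, capacity)
CONSTRAINTS = {
    "carbon_energy": (0.60, 0.02, 1.0),
    "expression_machinery": (0.50, 0.05, 1.0),
    "folding_capacity": (0.30, 0.09, 1.0),
    "stress_management": (0.05, 0.13, 1.0),
}


def utilization(product_flux):
    """Return the fraction of each capacity in use at a given product flux."""
    return {
        name: (baseline + per_unit * product_flux) / capacity
        for name, (baseline, per_unit, capacity) in CONSTRAINTS.items()
    }


def most_limiting(product_flux):
    """Return the name of the constraint closest to saturation."""
    used = utilization(product_flux)
    return max(used, key=used.get)


if __name__ == "__main__":
    # Product flux is in arbitrary units for this sketch.
    for flux in (2.0, 4.0, 5.5, 7.0):
        print(flux, most_limiting(flux))
```

With these placeholder values the most limiting burden moves from carbon and energy, to expression machinery, to folding capacity, to stress management as flux increases. In practice the loads would come from simulation or measurement rather than fixed linear coefficients.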
Next, we focus on insights gleaned from dynamic metabolism-proteome simulations via DynamicME.
Insight #2. Cells need time to shift their proteome
Reallocating an already-expressed proteome takes hours or longer, depending on the host and on its protease activity (DynamicME).
The theoretically optimal way for a cell to reallocate its proteome, by expressing new proteins and degrading old ones, is to express the most limiting proteins sequentially, one at a time (Pavlov and Ehrenberg, 2013).
If proteases are in short supply, the cell still has to free up proteome space, because it cannot pack an unlimited amount of protein into its volume. The only remaining option is to dilute existing proteins through growth, a slow process whose rate is proportional to the cell division (growth) rate.
One practical implication of this: pre-culture conditions can impact the first hours or days of bioreactor performance.
Using DynamicME, we can inspect this phenomenon at single protein and per-minute resolution through simulation.
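The dilution argument above can be checked with a back-of-the-envelope calculation. The sketch below is not DynamicME. It assumes synthesis of an old protein stops completely and that its concentration then decays first-order at a rate equal to the growth rate plus an optional degradation rate. The function names and the doubling times are hypothetical, chosen for illustration.

```python
import math


def remaining_fraction(t_h, growth_rate_per_h, degradation_rate_per_h=0.0):
    """Fraction of an old protein's concentration left t_h hours after its
    synthesis stops, assuming first-order loss by dilution plus degradation."""
    return math.exp(-(growth_rate_per_h + degradation_rate_per_h) * t_h)


def time_to_reach(target_fraction, growth_rate_per_h, degradation_rate_per_h=0.0):
    """Hours needed for the old protein to fall to target_fraction of its start."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    total_rate = growth_rate_per_h + degradation_rate_per_h
    if total_rate <= 0.0:
        raise ValueError("growth plus degradation rate must be positive")
    return math.log(1.0 / target_fraction) / total_rate


if __name__ == "__main__":
    # Hypothetical doubling times; growth rate mu = ln(2) / doubling time.
    mu_fast = math.log(2) / 1.0  # 1 h doubling time
    mu_slow = math.log(2) / 4.0  # 4 h doubling time
    # Time for dilution alone to bring an old protein down to 10% of its level.
    print(round(time_to_reach(0.10, mu_fast), 2))  # about 3.3 h
    print(round(time_to_reach(0.10, mu_slow), 2))  # about 13.3 h
```

Under these assumptions, dilution alone needs several doublings to clear an unwanted protein, so a slow-growing culture can carry its pre-culture proteome for many hours.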
Insight #3. Many proteins expressed by microbial hosts are useless – or are they?
Many proteins expressed by microbial hosts appear to be useless at first glance. In fact, up to nearly half the proteome mass is potentially unused for E. coli in any given growth condition (O'Brien et al., 2016). Reducing this unused proteome is one of the first traits that lab-evolved strains of E. coli exhibit to improve growth rate.
However, this proteome pre-allocation turns out to be a fitness strategy for generalists evolved to survive in multiple conditions. The regulatory network has evolved to maintain elevated levels of:
  • Proteins for consuming energy-rich carbon sources (even when absent)
  • Stress response systems (even in low-stress conditions)
  • Excess proteostasis machinery
This is achieved by the general stress response sigma factor RpoS.
Fitness benefits of proteome pre-allocation
Pre-allocated proteins serve as a "reserve capacity" for alternative substrate metabolism, environmental stress responses, and protein quality control systems, much like emergency reserves that enable rapid adaptation when needed.
This strategy reflects an evolved trade-off between growth efficiency and environmental responsiveness in generalist organisms.
Understanding this trade-off may inform strategies for optimizing protein expression while maintaining the strain's robustness and adaptability.
Ultimately, this trade-off exists because of Insights #1 and #2: cells have finite resources, space, and capacity, and they can't instantaneously shift their proteome.
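The trade-off can be illustrated with a toy comparison of two hypothetical strains after an environmental shift: a "lean" strain that grows faster but needs a lag to rebuild its proteome, and a "reserve" strain that grows more slowly but resumes growth immediately. This is a simplified sketch assuming plain exponential growth after a fixed lag. The growth rates and lag below are invented for illustration and are not measured values.

```python
def log_biomass_gain(t_h, growth_rate_per_h, lag_h=0.0):
    """ln(X(t)/X0) for exponential growth that only starts after lag_h hours."""
    return growth_rate_per_h * max(0.0, t_h - lag_h)


def break_even_time(mu_lean, lag_lean_h, mu_reserve, lag_reserve_h=0.0):
    """Hours after a shift at which the lean strain catches up with the
    reserve strain, from mu_lean*(t - lag_lean) = mu_reserve*(t - lag_reserve)."""
    if mu_lean <= mu_reserve:
        raise ValueError("lean strain must grow faster, or it never catches up")
    if lag_lean_h < lag_reserve_h:
        raise ValueError("lean strain is assumed to have the longer lag")
    return (mu_lean * lag_lean_h - mu_reserve * lag_reserve_h) / (mu_lean - mu_reserve)


if __name__ == "__main__":
    # Hypothetical values: growth rates in 1/h, lag in hours.
    mu_lean, lag_lean = 0.70, 2.0   # less pre-allocation, slower to adapt
    mu_reserve = 0.60               # pays a growth cost, adapts immediately
    t_star = break_even_time(mu_lean, lag_lean, mu_reserve)
    print(round(t_star, 1))  # about 14 h with these placeholder numbers
```

With these placeholder numbers the lean strain stays behind for roughly the first 14 hours after every shift, so which strategy wins depends on how often conditions change relative to that break-even time.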
Takeaways
  • Detect evolving fitness limitations as strain performance improves, and tailor optimization strategies to them
  • Consider how optimizing pre-culture can improve a bioreactor run's performance
  • Deliberately balance pre-allocated proteins for strain robustness versus peak performance