Biologically informed data generation

Better Biological Data → Better Protein Design

We help you collect large, context-rich datasets (binding, solubility, stability, expression) and feed them into AI models to predict and design novel proteins more accurately.

  • High-throughput screening and readouts
  • Context-aware data (salt, temperature, folding)
  • AI-ready, standardized datasets

The problem

Models trained on sparse or out-of-context biological data miss what matters: folding, degradation, and functional performance under real conditions. Relying on predictions alone leaves value unrealized in underexplored regions of sequence space.

  • Limited and biased training data → unreliable zero-shot performance.
  • Context sensitivity (salt, temperature, cofactors) is rarely captured.
  • Most pipelines validate too late, after costly scale-up.

Our approach

Pair design with massive, standardized data collection: generate sequence diversity, run high-throughput functional screens, validate in cell-free systems, and continuously feed results back into AI models.

Design seeds → Gene-specific hypermutation → HTP screens → Cell-free validation → Model update

What we build

Outcome-driven solutions

Gene-specific hypermutation

Generate vast, targeted sequence diversity efficiently to explore neighborhoods around promising scaffolds without prohibitive synthesis costs.

High-throughput screening

Display-based and biochemical assays produce rich functional readouts at scale, suitable for supervised learning.

Cell-free validation

Rapid, small-scale expression in bacterial, yeast, and mammalian cell-free systems to measure folding, solubility, and activity before fermentation.

Data standardization

Clean schemas, QC, and metadata (buffers, temperatures, salts) make datasets plug-and-play for model training.
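
As a rough illustration of what a standardized, metadata-carrying record with basic QC might look like, here is a minimal Python sketch. The field names, units, and QC thresholds are all hypothetical and chosen for illustration only; they are not our production schema.

```python
from dataclasses import dataclass
from typing import List

# The 20 standard amino acids, used for a simple sequence sanity check.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class AssayRecord:
    """One measurement plus the experimental context it was taken in.

    All field names are illustrative assumptions, not a published schema.
    """
    sequence: str         # protein sequence, one-letter codes
    assay: str            # e.g. "binding", "solubility", "stability"
    value: float          # measured readout
    unit: str             # unit of the readout, e.g. "nM" or "fraction"
    buffer: str           # buffer name
    temperature_c: float  # assay temperature in degrees Celsius
    salt_mm: float        # salt concentration in millimolar
    replicate: int = 1    # replicate index, starting at 1


def qc_errors(rec: AssayRecord) -> List[str]:
    """Return a list of QC problems; an empty list means the record passes."""
    errors = []
    if not rec.sequence or set(rec.sequence) - VALID_AA:
        errors.append("sequence is empty or contains non-standard residues")
    if rec.salt_mm < 0:
        errors.append("salt concentration must be non-negative")
    if not -10 <= rec.temperature_c <= 110:  # illustrative plausibility bounds
        errors.append("temperature outside plausible range")
    if rec.replicate < 1:
        errors.append("replicate index must be 1 or greater")
    return errors


good = AssayRecord("MKTAYIAK", "solubility", 0.82, "fraction", "PBS", 30.0, 150.0)
bad = AssayRecord("MKTX", "binding", 1.0, "nM", "PBS", 25.0, -5.0)
print(qc_errors(good))
print(qc_errors(bad))
```

Keeping the context fields (buffer, temperature, salt) on every record, rather than in a separate notes file, is what lets a model condition on them during training.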

Model integration

Iterative retraining with active learning prioritizes experiments that maximize information gain.
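
One common way to approximate "information gain" when choosing the next experiments is to test the candidates on which an ensemble of models disagrees most. The Python sketch below shows that idea under stated assumptions: the function name, the candidate labels, and the use of ensemble variance as the acquisition score are illustrative choices, not a description of our actual selection method.

```python
from statistics import pvariance
from typing import Dict, List


def rank_by_disagreement(
    predictions: Dict[str, List[float]], batch_size: int
) -> List[str]:
    """Pick the candidates whose ensemble predictions vary the most.

    `predictions` maps a candidate ID to one predicted score per ensemble
    member. Variance across members is used here as a simple stand-in for
    expected information gain (an assumption of this sketch).
    """
    ranked = sorted(
        predictions.items(),
        key=lambda item: pvariance(item[1]),
        reverse=True,  # highest disagreement first
    )
    return [candidate for candidate, _ in ranked[:batch_size]]


# Hypothetical ensemble outputs for three candidate variants.
ensemble_predictions = {
    "variant_A": [0.5, 0.5, 0.5],  # models agree: little to learn
    "variant_B": [0.1, 0.9, 0.5],  # models disagree strongly
    "variant_C": [0.4, 0.6, 0.5],  # mild disagreement
}

next_batch = rank_by_disagreement(ensemble_predictions, batch_size=2)
print(next_batch)
```

After the selected variants are screened, their measurements are added to the training set, the models are retrained, and the ranking is repeated.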

Seamless handoff

From bench-scale validation to fermentation with minimal re-engineering.

Selected outcomes

  • Improved expression, folding, and solubility rates after data-guided redesign.
  • Agreement between cell-free and cell-based (mammalian-expressed) binding measurements in pilot sets.
  • Faster design-to-validation cycles via active learning.

Full datasets and methods available upon request.

[Figure: Liberum's performance metrics]

Who this helps

Discovery teams

Rapidly explore sequence neighborhoods and prioritize designs likely to express and function.

Protein engineers

Close the loop between design and experiment with standardized, AI-ready datasets.

Platform leaders

De-risk scale-up by validating in cell-free systems before fermentation and downstream work.

Work with us

Have a target or need rapid exploration around a scaffold? Let's talk.

Talk to a technical advisor

Message us