Services

Five domains — from quick answer to working capability.

Five overlapping practice areas, offered along a spectrum: from a quick answer to a specific question, through a defined build that becomes yours, to working as the embedded computational core of your team. In most engagements they connect: a protein design project also touches data integration; a vaccine project usually needs AI scoring; an AWS setup exists to run those workloads reproducibly.

01 / Protein

Protein, peptide & de-novo design

Most biotech development needs some form of protein engineering — whether that's generating mutants with altered solubility or glycosylation, designing fusion constructs, choosing optimal peptide fragments for epitope mapping, or, increasingly, designing entirely new binders from scratch.

I work across both ends: classical structure-guided engineering, and the modern AI stack (AlphaFold for structure, RFdiffusion and related diffusion models for de-novo backbones, ProteinMPNN for sequence design, ESM-family models for scoring and likelihood). The interesting work is usually in combining them honestly — knowing where each method is reliable and where it isn't.

Rational mutant & variant design
Fusion & multi-domain constructs
De novo binder design
Epitope-focused fragment selection
Stability & solubility engineering
Glycoengineering & PTM analysis
Developability & liability scans
Freedom-to-operate aware design

02 / Vaccines & mRNA

Vaccine & mRNA platform engineering

Computational vaccinology has been my core practice for over a decade — from a published end-to-end workflow (Söllner et al., Immunome Research, 2010) to ongoing client projects across infectious disease and oncology. I help teams move from pathogen sequence to a defensible antigen short-list: reverse vaccinology, T-cell and B-cell epitope prediction, HLA and population coverage, strain consensus design.

For mRNA platforms I add tissue-specific bicodon optimization and UTR selection, and can help convert existing protein-format designs into properly engineered mRNA constructs — or build out the design layer of an entire proprietary platform.

Reverse vaccinology pipelines
T-cell & B-cell epitope prediction
HLA & strain coverage analysis
Population centroid & consensus antigens
Bicodon & codon optimization
UTR selection & engineering
Protein → mRNA conversion
mRNA platform design layer

03 / AI & ML

Custom AI & ML for biology

Off-the-shelf models are extraordinary, but they aren't always the right answer for a specific question. When a project needs a custom predictor, a calibrated scoring function, a fine-tuned protein language model, or an evaluation harness that's actually trustworthy on your data — that's what this service is for.

This includes building agentic / LLM-based R&D workflows: tools that triage literature, draft hypothesis lists, summarise experimental results, or sit between scientists and a complex analysis pipeline. These workflows currently make up a growing share of new engagements.

Custom predictors & classifiers
Protein language model fine-tuning
Calibrated scoring & ranking
Benchmarking & validation
LLM & agentic workflows
RAG over scientific corpora
Active learning loops
Model-result interpretation

04 / Data & Cloud

Biomedical data integration & AWS infrastructure

A recurring theme across nearly all projects: the data you need is real, but it lives in fifteen places with five different schemas. I've repeatedly integrated sources such as PubMed abstracts, MeSH controlled vocabularies, drug and small-molecule target databases, the US patent full-text corpus and bespoke client data — either by pre-existing identifiers or by automated full-text mining — into queryable structures and knowledge graphs that downstream analysis can actually use.

For compute-heavy work (especially AI inference and training), I can also set up reproducible AWS environments — from a single GPU instance for an experiment to a properly versioned, containerised pipeline you and your team can run independently afterwards.

Bio-data warehousing & ETL
Knowledge graphs (genes / proteins / drugs / disease)
PubMed & literature mining
Patent full-text mining
Reproducible pipeline engineering
AWS GPU & batch compute setup
Containerised, versioned workflows
Handover & team training

05 / Strategy & IP

Strategic consulting, literature reviews & IP

Sometimes the most useful deliverable isn't code or a model — it's a clear, written assessment of what the literature actually says, what the IP landscape looks like, or whether a particular computational result should change your decision. I offer this both as standalone engagements and embedded within larger projects.

This also covers drug repositioning analysis using integrated protein/drug/disease networks — identifying compounds with mechanistic rationale for new indications — which has been a productive line of work across several past projects.

Custom biomedical literature reviews
IP background & freedom-to-operate searches
Bioinformatic result interpretation
Method scope & utility advice
Drug repositioning analysis
Network-based hypothesis generation
Pre-clinical R&D decision support
Innovation & grant-narrative support

Decision guide

Build it, or just buy the answer?

A simple test for what to bring in-house: build the things that (1) you'll do repeatedly, (2) sit close to your USP, and (3) you'd hate to depend on an outside party for. Everything else (one-offs, commodity steps, things that will never recur) is usually cheaper to outsource so you can move on.

It also depends on where you're going: a team optimizing for a fast acquisition may rationally buy more and build less; a team building a durable platform should own its core. The right answer follows your vision and what's efficient for it, rather than any fixed rule. Helping you make that call honestly is part of the work, even when the answer is "don't build this, just buy it."

Engagement models

Three ways to work together.

Most engagements fall into one of these three shapes. The through-line is how much capability you walk away with, from a one-off answer to a working capability built into your team. In the first call, we'll figure out which one fits.

Everything I build for you is yours, documented and transferable. The only thing I retain is the general tooling I bring with me, which you benefit from without having to reinvent it.

Embedded collaboration (ongoing). I act as the computational core of your team while deliberately building the capability into it, with the goal of working myself out of the day-to-day.
Defined build (weeks – months; most common starting point). We build a working capability (a pipeline, a model, a workflow) that becomes yours.
Scoped project (days). A specific question answered — worked transparently and fully documented, so that if you have (or want to build) the capacity in-house, the process is yours to take up. If you'd rather I simply deliver the result, that's fine too.