Cosmo Santoni
Imperial College London · PhD Researcher · Research Software Engineer

Machine Learning
Research

I am a PhD researcher and Research Software Engineer at Imperial College London, specialising in foundational machine learning and applied mathematics for scientific computing. My work covers sequential neural surrogates, state-space models, and simulation-based inference. I also work as a full-stack Machine Learning Research Software Engineer supporting WHO national malaria control programmes. Founder & CEO of HiddenState. Previously at Cambridge and DFKI. Publications span NeurIPS, Nature, and The Lancet.

Partners & Collaborators
  • Imperial College London
  • Cambridge University
  • Google JAX
  • World Health Organization
  • DFKI
  • Imperial College London I-X
  • SAGE (Scientific Advisory Group for Emergencies)
  • Data.org
  • Public Health Agency of Canada
  • NHS
  • UK Health Security Agency (UKHSA)
  • Parkinson's UK
  • Toulouse INP

Software Engineering

OPEN SOURCE
MERGED INTO GOOGLE

mamba2-jax

A JAX/Flax implementation of the Mamba-2 state-space model (SSM) for causal language modelling and time-series forecasting. Contributed to Google's jax-ml/bonsai model zoo.

OPEN SOURCE
62 stars

Claude Code CMV

Version control for AI conversations. Snapshot, branch, and trim context like virtual memory.

Publications & Preprints

Preprint 2026
Compiler-First State Space Duality and Portable O(1) Autoregressive Caching for Inference
C. Santoni
Technical Report 2026
Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents
C. Santoni
NeurIPS 2025
Tokenised Flow Matching for Hierarchical Simulation-Based Inference
G. Charles, C. Santoni, S. Flaxman, E. Semenova

Research Focus

I make expensive computations fast and tractable. I build neural surrogates that replace costly simulations with learned emulators, design state-space architectures for efficient language and time-series modelling, and optimise LLM training through multi-fidelity Bayesian methods.

Research Area

Neural Surrogates

Learning to emulate computationally expensive simulations with deep temporal models, achieving a 125,000x speedup at 99.8% accuracy on epidemiological agent-based models used in WHO pandemic preparedness.

Research Area

State-Space Models

Efficient state-space architectures (Mamba-2) in JAX for causal language modelling and time-series forecasting. Contributions merged into Google's jax-ml/bonsai model zoo with up to 28x inference speedup through state-space caching.

Research Area

Simulation-Based Inference

Developing methods for likelihood-free inference in complex stochastic systems. Publications in Nature and The Lancet, plus a NeurIPS 2025 workshop paper on tokenised flow matching for hierarchical simulation-based inference (SBI).

Research Area

ML Systems Optimisation

Making ML models run faster through compiler-level optimisation. Replacing custom CUDA kernels with portable XLA compilation, achieving hardware-agnostic deployment across CPUs, GPUs, and TPUs from a single codebase.

Research Area

Bayesian Optimisation

Sample-efficient multi-fidelity optimisation for expensive black-box functions. Applied to LLM data mixture discovery, jointly optimising over model scale and training duration via learning curve extrapolation.

Methods & Tools

Technical Stack

Frameworks: JAX, PyTorch, Flax, CUDA.
Focus: State-space models, temporal emulation, Bayesian inference, DuckDB analytics.
Recognition: SAGE Award from UK Chief Scientific Officers.

Research goal:
make simulations learnable, make inference tractable, make science faster.

Let's Connect

Email

Always open to interesting problems and people.