Cosmo Santoni
Imperial College London · PhD Researcher · Research Software Engineer

Machine Learning Research

I am a PhD researcher and software engineer at Imperial College London, specialising in foundational machine learning and applied mathematics for scientific computing. My research covers sequential neural surrogates, state-space models, and simulation-based inference. I also work as a full-stack machine learning research software engineer supporting WHO national malaria control programmes. Previously I worked at Cambridge and DFKI, and my work has appeared at NeurIPS and in Nature and The Lancet.

Partners & Collaborators
  • Imperial College London
  • Cambridge University
  • Google JAX
  • World Health Organization
  • DFKI
  • Imperial College London I-X
  • SAGE (Scientific Advisory Group for Emergencies)
  • Data.org
  • Public Health Agency of Canada
  • NHS
  • UK Health Security Agency (UKHSA)
  • Parkinson's UK
  • Toulouse INP

Selected Research Outputs

Key contributions in neural surrogates and state-space architectures.

OPEN SOURCE
MERGED INTO GOOGLE

mamba2-jax

JAX/Flax implementation of the Mamba-2 state-space architecture for causal language modelling and time-series forecasting, contributed to Google's official jax-ml/bonsai model zoo. State-space caching delivers up to a 28x inference speedup. Ongoing contributor; co-authoring a blog post with the JAX Bonsai team.

28x inference speedup
2 PRs merged to Google
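The caching idea behind that speedup can be sketched in a few lines of plain Python. The recurrence below is a toy scalar SSM with illustrative parameters (A, B, C), not the mamba2-jax API: without caching, each new token re-runs the whole prefix; with a cached state, generation does constant work per token.

```python
# Minimal sketch of state-space caching for autoregressive decoding.
# A linear SSM carries a hidden state h_t = A*h_{t-1} + B*x_t and
# emits y_t = C*h_t. The scalar parameters here are illustrative.

def ssm_step(h, x, A=0.9, B=0.5, C=1.0):
    """One recurrence step: update the state, emit an output."""
    h = A * h + B * x
    return h, C * h

def generate_uncached(xs):
    """Naive decoding: re-run the whole prefix for every token (O(T^2) total)."""
    outputs = []
    for t in range(1, len(xs) + 1):
        h = 0.0
        for x in xs[:t]:          # recompute from scratch each step
            h, y = ssm_step(h, x)
        outputs.append(y)
    return outputs

def generate_cached(xs):
    """Cached decoding: keep the state between steps (O(T) total)."""
    h, outputs = 0.0, []
    for x in xs:                  # single pass, constant work per token
        h, y = ssm_step(h, x)
        outputs.append(y)
    return outputs
```

Both functions produce identical outputs; the cached version simply avoids redundant prefix recomputation, which is where the reported inference speedup comes from.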

Research Focus

I make expensive computations fast and tractable. I build neural surrogates that replace costly simulations with learned emulators, design state-space architectures for efficient language and time-series modelling, and optimise LLM training through multi-fidelity Bayesian methods.

Research Area

Neural Surrogates

Learning to emulate computationally expensive simulations using deep temporal models. Achieving 125,000x speedup at 99.8% accuracy for epidemiological agent-based models used in WHO pandemic preparedness.
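The surrogate idea in miniature: run the expensive simulator a handful of times, fit a cheap learned model to those runs, then answer new queries from the fit. The "simulator" and polynomial emulator below are toy stand-ins (the real targets are agent-based epidemic models emulated with deep temporal networks):

```python
import numpy as np

# Hypothetical stand-in for an expensive simulator; in practice this
# would be a stochastic agent-based model taking minutes per run.
def expensive_simulation(theta):
    return np.sin(3 * theta) + 0.5 * theta

# Fit a cheap surrogate on a small budget of simulator runs.
train_theta = np.linspace(0.0, 2.0, 50)
train_out = expensive_simulation(train_theta)
surrogate = np.poly1d(np.polyfit(train_theta, train_out, deg=7))

# The surrogate now answers queries without re-running the simulation.
test_theta = np.linspace(0.1, 1.9, 200)
error = float(np.max(np.abs(surrogate(test_theta)
                            - expensive_simulation(test_theta))))
```

Once trained, every surrogate evaluation costs a polynomial (or network) forward pass instead of a full simulation, which is the source of the large wall-clock speedups.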

Research Area

State-Space Models

Efficient state-space architectures (Mamba-2) in JAX for causal language modelling and time-series forecasting. Contributions merged into Google's jax-ml/bonsai model zoo with up to 28x inference speedup through state-space caching.

Research Area

Simulation-Based Inference

Developing methods for likelihood-free inference in complex stochastic systems. Publications in Nature and The Lancet. NeurIPS 2025 workshop on tokenised flow matching for hierarchical SBI.
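The likelihood-free setting can be illustrated with the simplest SBI method, rejection ABC: draw parameters from the prior, simulate, and keep draws whose simulated summaries land near the observed data. The Gaussian simulator and tolerance below are toy assumptions, far simpler than the methods above:

```python
import numpy as np

rng = np.random.default_rng(0)

# A stochastic simulator whose likelihood we never write down
# (toy Gaussian stand-in for a complex stochastic system).
def simulate(theta, n=100):
    return rng.normal(theta, 1.0, size=n)

observed = simulate(2.0)            # "real" data, true theta = 2
summary = observed.mean()           # summary statistic of the data

# Rejection ABC: accept prior draws whose simulated summary is close
# to the observed one -- no likelihood evaluation required.
prior_draws = rng.uniform(-5, 5, size=20000)
accepted = [th for th in prior_draws
            if abs(simulate(th).mean() - summary) < 0.1]
posterior_mean = float(np.mean(accepted))
```

The accepted draws approximate the posterior over theta. Modern SBI (including flow matching) replaces this wasteful accept/reject loop with learned neural density estimators, but the likelihood-free principle is the same.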

Research Area

Continual Learning

Methods for learning sequentially without catastrophic forgetting. Enabling models to adapt over time as new data arrives while retaining previously acquired knowledge.
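One standard guard against catastrophic forgetting is experience replay: keep a bounded sample of past data and rehearse it alongside each new task. The sketch below uses reservoir sampling to keep that sample uniform over everything seen; the class and names are illustrative, not a specific method from the research above:

```python
import random

class ReplayBuffer:
    """Bounded memory of past examples for rehearsal during training."""

    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling: keep a uniform sample of all examples seen."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        """Draw old examples to mix into each new-task training batch."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```

Interleaving these replayed examples with new data keeps gradients from drifting entirely toward the current task, so earlier knowledge is retained.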

Research Area

Bayesian Optimisation

Sample-efficient multi-fidelity optimisation for expensive black-box functions. Applied to LLM data mixture discovery, jointly optimising over model scale and training duration via learning curve extrapolation.
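The learning-curve-extrapolation part of this can be sketched cheaply: observe only the first few training steps of each candidate, fit a power law to the partial curve in log space, and extrapolate to the full budget. The power-law form and the two synthetic curves below are illustrative assumptions, not the actual method:

```python
import numpy as np

def extrapolate_final_loss(steps, losses, final_step):
    """Fit loss(t) ~ a * t**(-b) in log space and extrapolate."""
    slope, intercept = np.polyfit(np.log(steps), np.log(losses), deg=1)
    return float(np.exp(intercept + slope * np.log(final_step)))

steps = np.arange(1, 11)              # low-fidelity signal: 10 steps only
curve_a = 3.0 * steps ** -0.4         # starts worse, decays faster
curve_b = 1.5 * steps ** -0.1         # starts better, flattens out

pred_a = extrapolate_final_loss(steps, curve_a, final_step=10_000)
pred_b = extrapolate_final_loss(steps, curve_b, final_step=10_000)
# Candidate A starts with higher loss yet wins at the full budget,
# so a short cheap run is enough to rank the two candidates.
```

This is the multi-fidelity trick: candidates are compared at a fraction of the full training cost, and the optimiser spends its expensive full-budget runs only on the most promising ones.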

Methods & Tools

Technical Stack

Frameworks: JAX, PyTorch, Flax, CUDA.
Focus: State-space models, temporal emulation, Bayesian inference, DuckDB analytics.
Recognition: SAGE Award from UK Chief Scientific Officers.

Research goal:
make simulations learnable, make inference tractable, make science faster.

Let's Connect

Email

Always open to interesting problems and people.