The most advanced temporal neural simulation platform ever engineered.
Speed. 12 milliseconds. Per emulation.
125,000× faster than agent-based systems.
Precision. 99.8% R². Neural emulation.
DuckDB-powered data pipeline.
Deployment. 100 simulations per second.
Perfect GPU utilization. Pure parallel execution.
One model. Any scale. Infinite possibility.
Every bit. Every run. Deterministic to the last electron.
Your laptop. Their cloud. Same performance.
85% faster than a blink.*
24 minutes → 12 milliseconds
Engineered at Imperial. Deployed globally.
Faster than GPT-4 inference
On consumer hardware
UK Chief Scientific Officers
Cosmo Santoni — Applied Mathematics & Machine Learning PhD candidate at Imperial College London.
I transform 24-minute computations into 12 ms inference. From pure mathematics to production deployment. Full-stack ML engineering with a single obsession: speed without compromise.
PyTorch. CUDA. DuckDB. React. Whatever it takes to build systems that save lives at scale.
I don't just publish papers. I build production systems that policy-makers & governments use to make critical decisions. From mathematical theory to 125,000× speedups through custom neural architectures. End-to-end: models, APIs, deployment, and yes—even this site.
Real benchmarks. Real hardware. Real breakthrough.
Median wall time
Median wall time for 16 simulations
Best measured configuration
Method. Same scenarios & time horizon for both systems. ABM reports wall time for 8 stochastic replicates per scenario (the full 8-rep bundle). The emulator (LSTM) reports p50 over 9 runs with the fastest dropped; GPU is a single device with 3 warm-ups. CPU-single uses torch.set_num_threads(1); CPU-parallel uses future::multisession.
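A minimal sketch of this timing protocol in Python/PyTorch; the function name, model, and input batch are illustrative placeholders, not the production harness:

```python
import statistics
import time

import torch


def time_emulator(model, batch, n_runs=9, n_warmup=3):
    """Protocol from the method notes: warm-ups, then p50 over
    n_runs timed passes with the fastest run dropped."""
    model.eval()
    times = []
    with torch.no_grad():
        for _ in range(n_warmup):              # warm-up passes (not timed)
            model(batch)
        if batch.is_cuda:
            torch.cuda.synchronize()           # flush queued kernels first
        for _ in range(n_runs):
            t0 = time.perf_counter()
            model(batch)
            if batch.is_cuda:
                torch.cuda.synchronize()       # wait for the GPU to finish
            times.append(time.perf_counter() - t0)
    times.remove(min(times))                   # drop the fastest run
    return statistics.median(times)            # p50 wall time in seconds
```

Synchronising before and after each timed pass keeps asynchronously queued CUDA kernels out of the measurement, so the wall time reflects completed work.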
Determinism. torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False. A non-deterministic “turbo” profile exists but is not used for the reported numbers.
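Set in Python as follows; the fixed seed is an added illustration, not part of the quoted configuration:

```python
import torch

torch.backends.cudnn.deterministic = True   # force deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False      # no autotuned (variable) kernel choices
torch.manual_seed(0)                        # illustrative fixed seed (assumption)
```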
Environment. Intel Core Ultra 9 185H (16C/22T), 62 GiB LPDDR5-7467, NVIDIA RTX 3500 Ada (12 GiB), Fedora Linux 41, NVIDIA driver 575.64.03. ABM CPU workers capped at ≤4 to avoid thermal throttling; GPU pinned via CUDA_VISIBLE_DEVICES=0.
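A sketch of how this pinning can be enforced from Python, assuming the environment variable is set before CUDA is initialised:

```python
import os

# Must be set before torch initialises CUDA, hence before `import torch`
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

torch.set_num_threads(1)  # CPU-single profile from the method notes
```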
Data pipeline (DuckDB). RDS outputs are flattened into a single simulation_results table. Queries run locally (no external DB), with PRAGMA threads and PRAGMA memory_limit set per machine. Derived targets (prevalence, cases/1000) are computed at load. Empirical timings (full corpus ≈574,095,360 rows; DuckDB v1.1.3-dev165; threads=16; memory_limit='24GB'): p50 ≈ 0.50 s for COUNT(*), 1.46 s for 30-day cases (GROUP BY), and 11.4 s for a 7-step rolling prevalence window.
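The three timed queries could be reproduced against the flattened table along these lines; the database filename and the column names (scenario_id, day, new_cases, prevalence) are assumptions for illustration, not the actual schema:

```python
import duckdb

con = duckdb.connect("simulations.duckdb")   # local file, no external DB
con.execute("PRAGMA threads=16")             # per-machine settings from above
con.execute("PRAGMA memory_limit='24GB'")

# 1. Full-corpus row count (p50 ≈ 0.50 s on the corpus above)
n_rows = con.execute("SELECT COUNT(*) FROM simulation_results").fetchone()[0]

# 2. Cases aggregated over 30-day windows per scenario (GROUP BY, ≈ 1.46 s)
cases = con.execute("""
    SELECT scenario_id, day // 30 AS window_30d, SUM(new_cases) AS cases
    FROM simulation_results
    GROUP BY scenario_id, window_30d
""").df()

# 3. 7-step rolling prevalence window (window function, ≈ 11.4 s)
rolling = con.execute("""
    SELECT scenario_id, day,
           AVG(prevalence) OVER (
               PARTITION BY scenario_id ORDER BY day
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS prevalence_7d
    FROM simulation_results
""").df()
```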