Projects

Stochastic Methods for Data Science: An in-progress book that provides an introduction to the interplay between stochastic process theory and algorithms in data science, with a focus on (large-scale) stochastic optimization and Markov chain Monte Carlo. It is designed to be accessible to advanced undergraduates, graduate students, and researchers working in machine learning, statistics, and related fields.

VIABEL: A Python package that provides two core features:

  1. Easy-to-use, lightweight, flexible variational inference algorithms that are agnostic to how the model is constructed (just provide a log density and its gradient).

  2. Post hoc diagnostics for the accuracy of continuous approximations to (unnormalized) distributions. A canonical application is to diagnose the accuracy of variational approximations.
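The model-agnostic interface can be illustrated with a minimal black-box variational inference sketch in plain NumPy. This is not VIABEL's actual API, just an illustration of the idea: the user supplies only the gradient of an unnormalized log density, and a mean-field Gaussian approximation is fit by reparameterization-gradient ascent on the ELBO.

```python
import numpy as np

def fit_meanfield_gaussian(log_density_grad, dim, n_iters=3000, lr=0.05,
                           mc_samples=10, seed=0):
    """Fit N(mu, diag(sigma^2)) to an unnormalized target density.

    The target is supplied only through `log_density_grad`, a black-box
    function mapping an (S, dim) array of points to the (S, dim) array of
    gradients of the log density, so the routine is agnostic to how the
    model itself was constructed.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    log_sigma = np.zeros(dim)
    for _ in range(n_iters):
        eps = rng.standard_normal((mc_samples, dim))
        sigma = np.exp(log_sigma)
        z = mu + sigma * eps                       # reparameterized samples
        g = log_density_grad(z)                    # black-box gradient calls
        grad_mu = g.mean(axis=0)
        # Pathwise term plus the Gaussian entropy gradient (1 per coordinate
        # in log_sigma, since H(q) = sum(log_sigma) + const).
        grad_log_sigma = (g * eps * sigma).mean(axis=0) + 1.0
        mu += lr * grad_mu
        log_sigma += lr * grad_log_sigma
    return mu, np.exp(log_sigma)

# Example: the target is N(3, I) in 2 dimensions, specified only via the
# gradient of its log density.
mu, sigma = fit_meanfield_gaussian(lambda z: -(z - 3.0), dim=2)
```

Here the fit recovers a mean near 3 and a standard deviation near 1 in each coordinate; for real usage, VIABEL's own documentation describes its interface and its post hoc accuracy diagnostics.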

ShorTeX: A LaTeX package that aims to streamline LaTeX writing, particularly math. It automatically includes and configures common packages, and provides functionality to, among other things, (1) make LaTeX math code shorter and more readable, (2) avoid the verbose commands and boilerplate common in LaTeX, and (3) avoid multi-key presses (curly braces, capital letters, etc.) where reasonable. I am developing it jointly with Trevor Campbell and Jeffrey Negrea.
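The flavor of shortening can be illustrated with hand-rolled macros in the same spirit. These are defined locally for illustration and are not ShorTeX's actual command set:

```latex
% Illustrative only: self-defined macros in the spirit of ShorTeX's goals,
% not the package's actual commands.
\documentclass{article}
\usepackage{amsmath,amssymb}
\newcommand{\E}{\mathbb{E}}        % expectation without typing \mathbb{E}
\newcommand{\reals}{\mathbb{R}}    % the real numbers
\newcommand{\norm}[1]{\lVert #1 \rVert}
\begin{document}
Instead of $\mathbb{E}[\lVert X \rVert^{2}] < \infty$,
one writes $\E[\norm{X}^{2}] < \infty$.
\end{document}
```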

Preprints & Working Papers

Robust discovery of mutational signatures using power posteriors

bioRxiv 2026.03.05.707509, 2026.

Preprint

Mapping the North American Terrestrial Carbon Cycle: A Process-based Reanalysis Using State Data Assimilation

bioRxiv 2026.02.25.708030, 2026.

Preprint

Robust Model Selection for Discovery of Latent Mechanistic Processes

arXiv:2602.22062 [stat.ME], 2026.

Preprint PDF

Propagating Surrogate Uncertainty in Bayesian Inverse Problems

arXiv:2601.03532 [stat.ME], 2026.

Preprint PDF

Quantitative Error Bounds for Scaling Limits of Stochastic Iterative Algorithms

arXiv:2501.12212 [stat.ML], 2025.

Preprint PDF

Robust discovery of mutational signatures using power posteriors

bioRxiv 2024.10.23.619958, 2024.

Preprint

Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics

arXiv:2207.12395 [stat.CO], 2022.

Preprint PDF

Publications

More Publications

Calibrated Model Criticism Using Split Predictive Checks

Journal of the American Statistical Association, 2026+.

Preprint PDF

Tuning-Free Coreset Markov Chain Monte Carlo via Hot DoG

Proc. of the 41st Conference on Uncertainty in Artificial Intelligence, 2025.

PDF

Independent finite approximations for Bayesian nonparametric inference

Bayesian Analysis 19(4): 1187–1224, 2024.

PDF

A Framework for Improving the Reliability of Black-box Variational Inference

Journal of Machine Learning Research 25(219): 1–71, 2024.

PDF

Reproducible Parameter Inference Using Bagged Posteriors

Electronic Journal of Statistics 18(1): 1549–1585, 2024.

PDF

Structurally Aware Robust Model Selection for Mixtures

arXiv:2403.00687 [stat.ME], 2024.

Preprint PDF

A Targeted Accuracy Diagnostic for Variational Approximations

In Proc. of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), Valencia, Spain. PMLR: Volume 206, 2023.

Preprint PDF

Reproducible Model Selection Using Bagged Posteriors

Bayesian Analysis 18(1): 79–104, 2023.

PDF

The Mutational Signature Comprehensive Analysis Toolkit (musicatk) for the Discovery, Prediction, and Exploration of Mutational Signatures

Cancer Research 81(23), 2021.

PDF

Challenges and Opportunities in High-dimensional Variational Inference

In Proc. of the 35th Annual Conference on Neural Information Processing Systems (NeurIPS), 2021.

Preprint PDF

Thesis

Scaling Bayesian inference: theoretical foundations and practical methods

Ph.D. thesis, Massachusetts Institute of Technology, 2018.

PDF

Miscellanea

The feasibility of targeted test-trace-isolate for the control of SARS-CoV-2 variants

F1000Research 10(291), 2021.

Preprint

Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data

arXiv:1811.11790 [q-bio.QM], 2018.

Preprint PDF

Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach

arXiv:1809.09505 [stat.TH], 2018.

Preprint PDF

Detailed Derivations of Small-variance Asymptotics for some Hierarchical Bayesian Nonparametric Models

arXiv:1501.00052 [stat.ML], 2014.

Preprint PDF

Infinite Structured Hidden Semi-Markov Models

arXiv:1407.0044 [stat.ME], 2014.

Preprint PDF

Recent & Upcoming Talks

More Talks

Reproducible Statistical Inference
Dec 15, 2024
Gaussian Process Surrogates for Bayesian Inverse Problems
Oct 9, 2024
Reproducible Statistical Inference
Mar 13, 2024
Robust, structurally-aware inference for mixture models
May 18, 2023
Trustworthy variational inference
Oct 21, 2022
Algorithmically robust, general-purpose variational inference
Apr 13, 2022

Short Bio

Dr. Jonathan Huggins is an Assistant Professor of Mathematics & Statistics and of Computing & Data Sciences at Boston University. His group's research lies at the intersection of statistics and machine learning, with a focus on developing methods that are mathematically principled, scalable, and useful in practice. A central theme is that uncertainty quantification should remain trustworthy even when models are imperfect and inference is approximate, challenges that are especially acute in scientific applications involving heterogeneous data, latent structure, and substantial computational constraints. His group's work spans four interconnected areas: (1) scalable generalized Bayesian learning, including theory and methods for robust, reproducible inference under model misspecification; (2) automation and validation of posterior approximation algorithms; (3) discovery of interpretable latent structure in complex scientific data; and (4) large-scale data assimilation and forecasting. His current applied work focuses on developing computational methods and software tools for large-scale ecological and Earth science forecasting and for scientific discovery from high-throughput genomic data. Increasingly, these efforts are also motivating a systems-oriented direction aimed at making end-to-end probabilistic workflows more scalable, transparent, and reproducible.

Jonathan is also a Data Science Faculty Fellow and an affiliated faculty member of the Department of Computer Science, the BU URBAN Program, and the BU Program in Bioinformatics. He is a recipient of an NSF CAREER award (2024) and a Blackwell–Rosenbluth Award (2023), which recognizes outstanding junior Bayesian researchers based on their overall contribution to the field and to the community. His research has been supported by the National Institutes of Health, the National Science Foundation, and the Department of Defense.

Prior to joining BU, Jonathan was a Postdoctoral Research Fellow in the Department of Biostatistics at Harvard. He completed his Ph.D. in Computer Science at the Massachusetts Institute of Technology in 2018. Previously, he received a B.A. in Mathematics from Columbia University (summa cum laude) and an S.M. in Computer Science from the Massachusetts Institute of Technology.

Contact