Random feature Stein discrepancies


Computable Stein discrepancies have been deployed for a variety of applications, including sampler selection in posterior inference, approximate Bayesian inference, and goodness-of-fit testing. Existing convergence-determining Stein discrepancies admit strong theoretical guarantees but suffer from a computational cost that grows quadratically in the sample size. While linear-time Stein discrepancies have been proposed for goodness-of-fit testing, they exhibit avoidable degradations in testing power, even when power is explicitly optimized. To address these shortcomings, we introduce feature Stein discrepancies (${\Phi}$SDs), a new family of quality measures that can be cheaply approximated using importance sampling. We show how to construct ${\Phi}$SDs that provably determine the convergence of a sample to its target and develop high-accuracy approximations, random ${\Phi}$SDs (R${\Phi}$SDs), which are computable in near-linear time. In our experiments with sampler selection for approximate posterior inference and goodness-of-fit testing, R${\Phi}$SDs typically perform as well as or better than quadratic-time kernel Stein discrepancies (KSDs) while being orders of magnitude faster to compute.
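To make the quadratic cost mentioned above concrete, the following is a minimal sketch of the quadratic-time baseline only, not of the ${\Phi}$SD or R${\Phi}$SD constructions. It assumes a standard normal target, the Langevin Stein operator, and an inverse multiquadric base kernel $(c^2 + \|x - y\|^2)^{\beta}$ with $c = 1$ and $\beta = -1/2$. The function names and the toy data are illustrative choices, not taken from the paper.

```python
import math
import random


def imq_stein_kernel(x, y, score_x, score_y, c=1.0, beta=-0.5):
    """Langevin Stein kernel built from the IMQ base kernel (c^2 + ||x - y||^2)^beta.

    Assumed setup for illustration; score_x and score_y are the gradients of
    the log target density at x and y.
    """
    d = len(x)
    r = [a - b for a, b in zip(x, y)]
    sq = sum(v * v for v in r)
    u = c * c + sq
    k = u ** beta
    # Gradient of the base kernel in x is g * r; the gradient in y is -g * r.
    g = 2.0 * beta * u ** (beta - 1.0)
    sx_sy = sum(a * b for a, b in zip(score_x, score_y))
    sx_r = sum(a * b for a, b in zip(score_x, r))
    sy_r = sum(a * b for a, b in zip(score_y, r))
    # Trace of the mixed second derivative of the base kernel.
    trace = (-4.0 * beta * (beta - 1.0) * u ** (beta - 2.0) * sq
             - 2.0 * beta * d * u ** (beta - 1.0))
    return sx_sy * k - g * sx_r + g * sy_r + trace


def ksd_squared(points, score):
    """V-statistic estimate of the squared KSD: averages over all n^2 pairs."""
    scores = [score(x) for x in points]
    n = len(points)
    total = 0.0
    for i in range(n):          # the double loop is the source of the
        for j in range(n):      # quadratic cost in the sample size
            total += imq_stein_kernel(points[i], points[j], scores[i], scores[j])
    return total / (n * n)


def standard_normal_score(x):
    """Score (gradient of the log density) of a standard normal target."""
    return [-v for v in x]


# Toy data (hypothetical): one sample drawn from the target, one shifted away from it.
rng = random.Random(0)
on_target = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
shifted = [[a + 1.0, b + 1.0] for a, b in on_target]

ksd_on = ksd_squared(on_target, standard_normal_score)
ksd_off = ksd_squared(shifted, standard_normal_score)
print("squared KSD, sample from target:", ksd_on)
print("squared KSD, shifted sample:    ", ksd_off)
```

In this sketch the shifted sample should receive the larger discrepancy. Doubling the sample size roughly quadruples the number of kernel evaluations, which is the cost the near-linear-time approximations described in the abstract aim to avoid.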

In Proc. of the 32nd Annual Conference on Neural Information Processing Systems (NIPS)