CmdStanModel-method-variational.Rd
The variational method of a CmdStanModel object runs Stan's variational Bayes (ADVI) algorithms.
CmdStan can fit a variational approximation to the posterior. The approximation is a Gaussian in the unconstrained variable space. Stan implements two variational algorithms. The algorithm="meanfield" option uses a fully factorized Gaussian for the approximation. The algorithm="fullrank" option uses a Gaussian with a full-rank covariance matrix for the approximation.
-- CmdStan Interface User's Guide
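The difference between the two Gaussian families in the quote can be illustrated with a few lines of base R. This is only a sketch of the two covariance structures, not Stan's ADVI implementation; the mean vector, standard deviations, and covariance matrix are made-up values chosen for the illustration.

```r
# Illustration only: base-R draws from the two Gaussian families, not Stan's ADVI code.
set.seed(1)
n_draws <- 5000
mu <- c(0, 1)  # hypothetical variational mean (unconstrained space)

# Mean-field: a fully factorized Gaussian, so one standard deviation per component.
sigma <- c(1, 0.5)
z <- matrix(rnorm(n_draws * 2), ncol = 2)
draws_meanfield <- sweep(sweep(z, 2, sigma, `*`), 2, mu, `+`)

# Full-rank: a full covariance matrix, used here through its Cholesky factor.
Sigma <- matrix(c(1, 0.4, 0.4, 0.25), nrow = 2)  # implies a correlation of 0.8
L <- t(chol(Sigma))                              # lower triangular, Sigma = L %*% t(L)
draws_fullrank <- sweep(z %*% t(L), 2, mu, `+`)

cor(draws_meanfield)[1, 2]  # near 0: components are independent
cor(draws_fullrank)[1, 2]   # near 0.8: correlation is captured
```

The mean-field family cannot represent correlation between parameters, which is the practical reason to consider algorithm="fullrank" when posterior correlations matter.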
$variational(
  data = NULL,
  seed = NULL,
  refresh = NULL,
  init = NULL,
  algorithm = NULL,
  iter = NULL,
  grad_samples = NULL,
  elbo_samples = NULL,
  eta = NULL,
  adapt_engaged = NULL,
  adapt_iter = NULL,
  tol_rel_obj = NULL,
  eval_elbo = NULL,
  output_samples = NULL
)
The following arguments can be specified for any of the fitting methods (sample, optimize, variational). Arguments left at NULL default to the default used by the installed version of CmdStan.
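One way to picture the NULL convention is a helper that forwards only the arguments the user actually set, so that anything left at NULL never reaches CmdStan and its own default applies. The function compose_args() below is hypothetical and written only for this illustration; it is not part of CmdStanR and does not claim to mirror how the package builds its command line.

```r
# Hypothetical helper (not a CmdStanR function): keep only user-specified arguments.
compose_args <- function(...) {
  args <- list(...)
  args <- Filter(Negate(is.null), args)  # NULL means "let CmdStan use its default"
  if (length(args) == 0) return(character(0))
  paste0(names(args), "=", vapply(args, as.character, character(1)))
}

# eta is left at NULL, so it is not forwarded at all.
compose_args(algorithm = "fullrank", iter = 5000, eta = NULL, seed = 123)
```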
data: (multiple options) The data to use:
  A named list of R objects, as with RStan;
  A path to a data file compatible with CmdStan (R dump or JSON). See the appendices in the CmdStan manual for details on using these formats.
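Both data options can be prepared with base R. The sketch below builds the named list used in the examples on this page and then writes the same objects to a file with dump(), which produces R dump syntax. Whether every R object dumps into the subset of that syntax CmdStan accepts is an assumption to check against the CmdStan manual; the file name is arbitrary.

```r
# Option 1: a named list, passed as data = standata.
standata <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# Option 2: a file path, passed as data = data_file.
# dump() writes each list element as an R assignment (R dump format).
data_file <- tempfile(fileext = ".data.R")
dump(names(standata), file = data_file, envir = list2env(standata))

# Read the file back into a fresh environment to confirm what was written.
check <- new.env()
source(data_file, local = check)
identical(check$y, standata$y)
```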
seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.
refresh: (non-negative integer) The number of iterations between screen updates.
init: (multiple options) The initialization method:
  A real number x>0 initializes randomly between [-x,x] (on the unconstrained parameter space);
  0 initializes to 0;
  A character vector of data file paths (one per chain) to initialization files.
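The first two init options can be sketched in base R. This is an illustration of what "randomly between [-x,x] on the unconstrained space" means, not CmdStan's own initialization code; n_params is a made-up count, and the inverse-logit mapping shown applies to a parameter bounded on (0, 1), such as theta in the bernoulli example.

```r
set.seed(123)
x <- 2         # same value as the "init = 2 (Default)" line in the example output
n_params <- 3  # hypothetical number of unconstrained parameters

inits_random <- runif(n_params, min = -x, max = x)  # init = x > 0
inits_zero <- rep(0, n_params)                      # init = 0

# For a parameter constrained to (0, 1), the unconstrained value maps back
# through the inverse logit, so an unconstrained 0 corresponds to 0.5.
plogis(inits_zero)
range(plogis(c(-x, x)))  # constrained range reachable by the random inits
```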
variational method
In addition to the arguments above, the variational method also has its own set of arguments. These arguments are described briefly here and in greater detail in the CmdStan manual. Arguments left at NULL default to the default used by the installed version of CmdStan.
algorithm: (string) The algorithm. Either "meanfield" or "fullrank".
iter: (positive integer) The maximum number of iterations.
grad_samples: (positive integer) The number of samples for Monte Carlo estimate of gradients.
elbo_samples: (positive integer) The number of samples for Monte Carlo estimate of ELBO (objective function).
eta: (positive real) The stepsize weighting parameter for adaptive stepsize sequence.
adapt_engaged: (logical) Do warmup adaptation?
adapt_iter: (positive integer) The maximum number of adaptation iterations.
tol_rel_obj: (positive real) Convergence tolerance on the relative norm of the objective.
eval_elbo: (positive integer) Evaluate ELBO every Nth iteration.
output_samples: (positive integer) Number of posterior samples to draw and save.
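The interplay of eval_elbo and tol_rel_obj can be made concrete with the ELBO values printed in the example output on this page (one value every 100 iterations, the default eval_elbo). The sketch below computes the relative change between consecutive ELBO evaluations and its running mean; for these rows it reproduces the delta_ELBO_mean column of the log. The exact stopping rule CmdStan applies (window length, mean versus median) is not described here, so treat this as an approximation of the idea rather than the algorithm.

```r
# ELBO values copied from the first four evaluations in the example log.
elbo <- c(-6.258, -6.475, -6.228, -6.220)
tol_rel_obj <- 0.01  # the default shown in the example output

# Relative change between consecutive evaluations, scaled by the newer value.
rel_change <- abs(diff(elbo) / elbo[-1])

# The log prints 1.000 for the first evaluation, so that value is taken as given.
deltas <- c(1, rel_change)
running_mean <- cumsum(deltas) / seq_along(deltas)

round(running_mean, 3)           # compare with the delta_ELBO_mean column
any(running_mean < tol_rel_obj)  # no convergence yet within these rows
```

A smaller tol_rel_obj demands a flatter ELBO trace before stopping, and a smaller eval_elbo checks more often at the cost of more ELBO evaluations.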
The variational method returns a CmdStanVB object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan doc (html or pdf): mc-stan.org/users/documentation/
CmdStan doc (pdf): github.com/stan-dev/cmdstan/releases/
Other CmdStanModel methods: CmdStanModel-method-compile, CmdStanModel-method-optimize, CmdStanModel-method-sample
# \dontrun{
# Set path to cmdstan
# Note: if you installed CmdStan via install_cmdstan() with default settings
# then default below should work. Otherwise use the `path` argument to
# specify the location of your CmdStan installation.
set_cmdstan_path(path = NULL)
#>

# Create a CmdStan model object from a Stan program,
# here using the example model that comes with CmdStan
stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(stan_program)
mod$print()
#> data {
#>   int<lower=0> N;
#>   int<lower=0,upper=1> y[N];
#> }
#> parameters {
#>   real<lower=0,upper=1> theta;
#> }
#> model {
#>   theta ~ beta(1,1);
#>   for (n in 1:N)
#>     y[n] ~ bernoulli(theta);
#> }

# Compile to create executable
mod$compile()
#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli
#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.

# Run sample method (MCMC via Stan's dynamic HMC/NUTS),
# specifying data as a named list (like RStan)
standata <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)
#> method = sample (Default)
#>   sample
#>     num_samples = 1000 (Default)
#>     num_warmup = 1000 (Default)
#>     save_warmup = 0 (Default)
#>     thin = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       gamma = 0.050000000000000003 (Default)
#>       delta = 0.80000000000000004 (Default)
#>       kappa = 0.75 (Default)
#>       t0 = 10 (Default)
#>       init_buffer = 75 (Default)
#>       term_buffer = 50 (Default)
#>       window = 25 (Default)
#>     algorithm = hmc (Default)
#>       hmc
#>         engine = nuts (Default)
#>           nuts
#>             max_depth = 10 (Default)
#>         metric = diag_e (Default)
#>         metric_file = (Default)
#>         stepsize = 1 (Default)
#>         stepsize_jitter = 0 (Default)
#> id = 1
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#>
#> Gradient evaluation took 2e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Iteration:    1 / 2000 [  0%]  (Warmup)
#> Iteration:  100 / 2000 [  5%]  (Warmup)
#> Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Iteration: 2000 / 2000 [100%]  (Sampling)
#>
#>  Elapsed Time: 0.014348 seconds (Warm-up)
#>                0.021322 seconds (Sampling)
#>                0.03567 seconds (Total)
#>
#> method = sample (Default)
#>   sample
#>     num_samples = 1000 (Default)
#>     num_warmup = 1000 (Default)
#>     save_warmup = 0 (Default)
#>     thin = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       gamma = 0.050000000000000003 (Default)
#>       delta = 0.80000000000000004 (Default)
#>       kappa = 0.75 (Default)
#>       t0 = 10 (Default)
#>       init_buffer = 75 (Default)
#>       term_buffer = 50 (Default)
#>       window = 25 (Default)
#>     algorithm = hmc (Default)
#>       hmc
#>         engine = nuts (Default)
#>           nuts
#>             max_depth = 10 (Default)
#>         metric = diag_e (Default)
#>         metric_file = (Default)
#>         stepsize = 1 (Default)
#>         stepsize_jitter = 0 (Default)
#> id = 2
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022add5243.data.R
#> init = 2 (Default)
#> random
#>   seed = 124
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#>
#> Gradient evaluation took 1.9e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Iteration:    1 / 2000 [  0%]  (Warmup)
#> Iteration:  100 / 2000 [  5%]  (Warmup)
#> Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Iteration: 2000 / 2000 [100%]  (Sampling)
#>
#>  Elapsed Time: 0.01225 seconds (Warm-up)
#>                0.019663 seconds (Sampling)
#>                0.031913 seconds (Total)
#>
#> Running bin/stansummary \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
#> Inference for Stan model: bernoulli_model
#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved.
#>
#> Warmup took (0.014, 0.012) seconds, 0.027 seconds total
#> Sampling took (0.021, 0.020) seconds, 0.041 seconds total
#>
#>                  Mean     MCSE   StdDev     5%    50%    95%  N_Eff  N_Eff/s    R_hat
#> lp__             -7.3  2.7e-02  7.3e-01   -8.9   -7.0   -6.8    738    18018  1.0e+00
#> accept_stat__    0.92  3.1e-03  1.3e-01   0.64   0.97    1.0   1670    40735  1.0e+00
#> stepsize__       0.92  1.7e-03  1.7e-03   0.92   0.92   0.92    1.0       24  1.4e+12
#> treedepth__       1.3  1.1e-02  4.7e-01    1.0    1.0    2.0   1968    48012  1.0e+00
#> n_leapfrog__      2.4  2.5e-02  1.0e+00    1.0    3.0    3.0   1640    40018  1.0e+00
#> divergent__      0.00  0.0e+00  0.0e+00   0.00   0.00   0.00   1000    24399      nan
#> energy__          7.8  4.0e-02  1.0e+00    6.8    7.5    9.8    647    15780  1.0e+00
#> theta            0.24  4.6e-03  1.2e-01  0.077   0.22   0.47    720    17570  1.0e+00
#>
#> Samples were drawn using hmc with nuts.
#> For each parameter, N_Eff is a crude measure of effective sample size,
#> and R_hat is the potential scale reduction factor on split chains (at
#> convergence, R_hat=1).
#>

# Run optimization method (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file (readable by CmdStan)
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
#> Warning: Optimization method is experimental and the structure of returned object may change.
#> method = optimize
#>   optimize
#>     algorithm = lbfgs (Default)
#>       lbfgs
#>         init_alpha = 0.001 (Default)
#>         tol_obj = 9.9999999999999998e-13 (Default)
#>         tol_rel_obj = 10000 (Default)
#>         tol_grad = 1e-08 (Default)
#>         tol_rel_grad = 10000000 (Default)
#>         tol_param = 1e-08 (Default)
#>         history_size = 5 (Default)
#>     iter = 2000 (Default)
#>     save_iterations = 0 (Default)
#> id = 1
#> data
#>   file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#> Initial log joint probability = -9.51104
#>     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes
#>        6      -5.00402   0.000103557   2.55661e-07           1           1        9
#> Optimization terminated normally:
#>   Convergence detected: relative gradient magnitude is below tolerance
#> Estimates from optimization:
#>   theta     lp__
#> 0.20000 -5.00402

# Run variational Bayes method (default is meanfield ADVI)
fit_vb <- mod$variational(data = standata, seed = 123)
#> Warning: Variational inference method is experimental and the structure of returned object may change.
#> method = variational
#>   variational
#>     algorithm = meanfield (Default)
#>       meanfield
#>     iter = 10000 (Default)
#>     grad_samples = 1 (Default)
#>     elbo_samples = 100 (Default)
#>     eta = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       iter = 50 (Default)
#>     tol_rel_obj = 0.01 (Default)
#>     eval_elbo = 100 (Default)
#>     output_samples = 1000 (Default)
#> id = 1
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022843c2b1.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#> ------------------------------------------------------------
#> EXPERIMENTAL ALGORITHM:
#>   This procedure has not been thoroughly tested and may be unstable
#>   or buggy. The interface is subject to change.
#> ------------------------------------------------------------
#>
#>
#>
#> Gradient evaluation took 2.1e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Begin eta adaptation.
#> Iteration:   1 / 250 [  0%]  (Adaptation)
#> Iteration:  50 / 250 [ 20%]  (Adaptation)
#> Iteration: 100 / 250 [ 40%]  (Adaptation)
#> Iteration: 150 / 250 [ 60%]  (Adaptation)
#> Iteration: 200 / 250 [ 80%]  (Adaptation)
#> Success! Found best value [eta = 1] earlier than expected.
#>
#> Begin stochastic gradient ascent.
#>   iter      ELBO   delta_ELBO_mean   delta_ELBO_med   notes
#>    100    -6.258             1.000            1.000
#>    200    -6.475             0.517            1.000
#>    300    -6.228             0.358            0.040
#>    400    -6.220             0.269            0.040
#>    500    -6.379             0.220            0.034
#>    600    -6.195             0.188            0.034
#>    700    -6.262             0.163            0.030
#>    800    -6.345             0.144            0.030
#>    900    -6.201             0.131            0.025
#>   1000    -6.307             0.119            0.025
#>   1100    -6.290             0.020            0.023
#>   1200    -6.238             0.017            0.017
#>   1300    -6.182             0.014            0.013
#>   1400    -6.167             0.014            0.013
#>   1500    -6.219             0.012            0.011
#>   1600    -6.164             0.010            0.009   MEDIAN ELBO CONVERGED
#>
#> Drawing a sample of size 1000 from the approximate posterior...
#> COMPLETED.
#> Running bin/stansummary \
#>   /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
#> Warning: non-fatal error reading adapation data
#> Inference for Stan model: bernoulli_model
#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved.
#>
#> Warmup took (0.00) seconds, 0.00 seconds total
#> Sampling took (0.00) seconds, 0.00 seconds total
#>
#>            Mean     MCSE  StdDev     5%    50%       95%  N_Eff  N_Eff/s    R_hat
#> lp__       0.00  0.0e+00    0.00   0.00   0.00   0.0e+00    500      inf      nan
#> log_p__    -7.2  2.5e-02    0.72   -8.6   -7.0  -6.8e+00    789      inf  1.0e+00
#> log_g__   -0.54  2.9e-02    0.76   -2.1  -0.27  -1.5e-03    679      inf  1.0e+00
#> theta      0.26  4.2e-03    0.12  0.091   0.23   4.9e-01    823      inf  1.0e+00
#>
#> Samples were drawn using meanfield with .
#> For each parameter, N_Eff is a crude measure of effective sample size,
#> and R_hat is the potential scale reduction factor on split chains (at
#> convergence, R_hat=1).
#>

# For models fit using MCMC, if you like working with RStan's stanfit objects
# then you can create one with rstan::read_stan_csv()
if (require(rstan, quietly = TRUE)) {
  stanfit <- rstan::read_stan_csv(fit_mcmc$output_files())
  print(stanfit)
}
#> Inference for Stan model: bernoulli-stan-sample-1.
#> 2 chains, each with iter=2000; warmup=1000; thin=1;
#> post-warmup draws per chain=1000, total post-warmup draws=2000.
#>
#>        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
#> theta  0.24    0.00 0.12  0.06  0.14  0.22  0.32  0.52   720    1
#> lp__  -7.32    0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75   737    1
#>
#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:32 2019.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split chains (at
#> convergence, Rhat=1).
# }