simulate_data() takes a Stan model and simulates data from it using specified parameter values. The user chooses which variables to save and how many simulations to run. Each simulated dataset is saved as an individual .rds file in the directory given by path.
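Because each dataset is written as a standard .rds file, it can be inspected with base R. The sketch below is illustrative only: it writes a stand-in dataset to a temporary directory and reads it back. The file name and the list contents are hypothetical and do not reflect the naming scheme the function actually uses.

```r
# Stand-in for a directory that simulate_data() would write to (hypothetical).
sim_dir <- file.path(tempdir(), "sim_data_example")
dir.create(sim_dir, showWarnings = FALSE)

# Stand-in for one simulated dataset; real contents depend on the Stan model.
fake_dataset <- list(N = 3L, y = c(0.2, -1.1, 0.7))
saveRDS(fake_dataset, file.path(sim_dir, "example_1.rds"))

# List every saved .rds file and read the first one back into R.
rds_files <- list.files(sim_dir, pattern = "\\.rds$", full.names = TRUE)
first_dataset <- readRDS(rds_files[1])
str(first_dataset)
```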

By default, an object of class stansim_data is returned. It provides an index of the saved data that can be passed directly to a stansim() call.

The sim_drop argument allows simulated data to be fed directly back into the Stan model that simulated them as input data. If sim_drop is TRUE, any Stan data object with a name beginning with "sim_" will have this prefix removed from its name. For example, the simulated data "sim_x" would be returned simply as "x". This avoids overlapping names between the model's input data and its simulated output.
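The renaming rule described above can be sketched in base R. This is only an illustration of the rule, not the function's internal code, and the list below is a hypothetical stand-in for simulated output.

```r
# Hypothetical simulated output: two names carry the "sim_" prefix.
simulated <- list(sim_x = c(1.2, 0.4), sim_y = c(0L, 1L), N = 2L)

# Strip a leading "sim_" only; names without the prefix are left unchanged.
names(simulated) <- sub("^sim_", "", names(simulated))

names(simulated)
# "x" "y" "N"
```

Anchoring the pattern with `^` means a name such as "my_sim_x" would be left alone, which matches the "beginning with" wording above.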

simulate_data(file, data_name = paste0("Simdata_", Sys.time()),
  input_data = NULL, vars = "all", param_values = NULL, nsim = 1,
  path = NULL, seed = floor(stats::runif(1, 1, 1e+05)),
  return_object = TRUE, use_cores = 1, sim_drop = TRUE,
  recursive = TRUE)



file
One of: the file location of the model code (ending in ".stan"), a character string containing the model specification, or the name of a character string object in the workspace.


data_name
A name attached to the stansim_data object to help identify it. It is strongly recommended that an informative name is assigned. This will also be the name stem for the saved .rds files.


input_data
Values for the data block of the provided Stan model. Values must be provided for all entries, even those not used in the "generated quantities" block that produces the simulated data.


vars
The names of the Stan variables to return. Defaults to "all"; otherwise a character vector of variable names should be provided.


param_values
A named list of values for the Stan model parameters used to simulate data. Any parameter whose value is not specified here will be initialised randomly, so specifying all parameter values is recommended.


nsim
The number of simulated datasets to produce.


path
The name of the directory to save the simulated data to. If it doesn't exist it will be created. Defaults to NULL, in which case the datasets are saved to the working directory.


seed
The random seed used by the function. Defaults to a randomly drawn whole number, floor(stats::runif(1, 1, 1e+05)).
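Since the default seed is itself drawn at random, calls that omit seed will generally use different seeds. The base R sketch below evaluates the default expression from the usage above and shows why passing a fixed value helps reproducibility. It does not show how simulate_data() uses the seed internally, and the value 1234 is arbitrary.

```r
# The default from the usage above: a random whole number from 1 to 99999.
default_seed <- floor(stats::runif(1, 1, 1e+05))

# Fixing a seed makes random draws repeatable (base R illustration).
my_seed <- 1234  # arbitrary value chosen for this example
set.seed(my_seed)
draw_a <- stats::rnorm(5)
set.seed(my_seed)
draw_b <- stats::rnorm(5)

identical(draw_a, draw_b)
# TRUE
```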


return_object
If FALSE, no stansim_data object is returned.


use_cores
Number of cores to use when running in parallel.


sim_drop
If TRUE, any simulated data object whose name begins with "sim_" will have this prefix removed, so "sim_x" becomes "x".


recursive
Logical. Should elements of the path other than the last be created? If TRUE, this behaves like the Unix command mkdir -p.
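The wording of this argument matches the recursive argument of base R's dir.create(). Assuming it behaves the same way here, the effect can be seen in the sketch below, which uses a hypothetical nested directory under tempdir().

```r
# Hypothetical nested output directory; neither level exists yet.
base_dir <- file.path(tempdir(), "recursive_example")
unlink(base_dir, recursive = TRUE)  # start from a clean state
nested <- file.path(base_dir, "run_01")

# Without recursive = TRUE the call fails because the parent is missing.
ok_flat <- dir.create(nested, recursive = FALSE, showWarnings = FALSE)

# With recursive = TRUE the missing parent is created as well.
ok_nested <- dir.create(nested, recursive = TRUE)

c(ok_flat, ok_nested)
# FALSE TRUE
```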


Returns an object of S3 class stansim_data, or NULL if return_object is FALSE.
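A call might look like the sketch below, which uses only the arguments listed in the usage above. Several things here are assumptions rather than documented facts: the package name (rstansim), the Stan program, the data values, and the output directory are all chosen for illustration. The call is skipped if the package is not installed.

```r
# Stan program used purely to simulate data (model text is an assumption).
sim_model <- "
data {
  int<lower=1> N;
  vector[N] x;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
generated quantities {
  vector[N] sim_y;
  for (i in 1:N) {
    sim_y[i] = normal_rng(alpha + beta * x[i], sigma);
  }
}
"

# Values for the data block and for every model parameter.
input <- list(N = 50L, x = seq(-1, 1, length.out = 50))
params <- list(alpha = 1, beta = 0.5, sigma = 2)

# Only run if the package is installed; the package name is an assumption.
if (requireNamespace("rstansim", quietly = TRUE)) {
  sim_index <- rstansim::simulate_data(
    file = sim_model,
    data_name = "linear_regression_sims",
    input_data = input,
    param_values = params,
    nsim = 10,
    path = file.path(tempdir(), "linear_regression_sims"),
    seed = 1234
  )
}
```

With the default sim_drop = TRUE, the simulated "sim_y" would be saved as "y". The returned index could then be supplied to a stansim() call.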