Class mcmc_para_table (o2scl)

template<class func_t, class fill_t, class data_t, class vec_t = ubvector, class stepper_t = mcmc_stepper_rw<func_t, data_t, vec_t>>
class mcmc_para_table : public o2scl::mcmc_para_base<func_t, std::function<int(const ubvector&, double, size_t, int, bool, data_t&)>, data_t, ubvector, mcmc_stepper_rw<func_t, data_t, ubvector>>

A generic MCMC simulation class writing data to an o2scl::table_units object.

This class performs an MCMC simulation and stores the results in an o2scl::table_units object. The user must specify the column names and units with set_names_units() before mcmc() is called.

The function add_line is the measurement function of type measure_t in the parent. The overloaded function mcmc() in this class works a bit differently in that it takes a function object (type fill_t) of the form

int fill_func(const vec_t &pars, double log_weight, 
std::vector<double> &line, data_t &dat);
which should copy any auxiliary values held in the data object to line, so that they are added to the table.
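A fill function of the form above could look like the following minimal sketch. The struct point_data and its members are hypothetical stand-ins for data_t, std::vector<double> is used for vec_t, and two things are assumptions rather than documented behavior: that values are appended to line with push_back, and that a return value of 0 signals success.

```cpp
#include <vector>

// Hypothetical per-point data object standing in for data_t.
struct point_data {
  double radius;  // auxiliary value computed alongside the log weight
  double energy;  // a second auxiliary value
};

// Fill function with the form described above, using
// std::vector<double> for vec_t. Appending to `line` and returning 0
// for success are assumptions of this sketch.
int fill_func(const std::vector<double> &pars, double log_weight,
              std::vector<double> &line, point_data &dat) {
  (void)pars;        // unused here; available if outputs depend on them
  (void)log_weight;  // likewise unused in this sketch
  line.push_back(dat.radius);
  line.push_back(dat.energy);
  return 0;
}
```

With this sketch, the names passed to set_names_units() would need to include one entry for each value appended here, in the same order.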

The output table will contain the parameters, the logarithm of the function (called “log_wgt”) and a multiplying factor called “mult”. This “fill” function is called only when a step is accepted and the multiplier for that row is set to 1. If a future step is rejected, then the multiplier is increased by one, rather than adding the same row to the table again.
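The multiplicity bookkeeping described above can be illustrated with a small standalone sketch. It is not the class's implementation: the names chain_row and record_step are hypothetical, and only a single walker is tracked, whereas the class keeps separate bookkeeping for each walker and thread.

```cpp
#include <vector>

// One stored row: parameters, log weight, and multiplicity.
struct chain_row {
  std::vector<double> pars;
  double log_wgt;
  double mult;
};

// Record one MCMC step for a single walker (hypothetical helper).
// An accepted step adds a new row with mult = 1. A rejected step
// increases the multiplicity of the most recent row instead of
// storing the same point again.
void record_step(std::vector<chain_row> &rows, bool accepted,
                 const std::vector<double> &pars, double log_wgt) {
  if (accepted) {
    rows.push_back({pars, log_wgt, 1.0});
  } else if (!rows.empty()) {
    rows.back().mult += 1.0;
  }
}
```

Under this scheme the sum of the mult column equals the total number of steps taken since the first accepted point, which is why averages over the table should be weighted by mult.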

There is some output which occurs in addition to the output from o2scl::mcmc_para_base depending on the value of o2scl::mcmc_para_base::verbose . If there is a misalignment between the number of columns in the table and the number of data points in any line, then some debugging information is sent to cout. When verbose is 2 or larger, … (FIXME)

Idea for Future:

Verbose output may need improvement

Use reorder_table() and possibly reblock() to create a full post-processing function.

Note

This class is experimental.

Subclassed by o2scl::mcmc_para_cli< func_t, fill_t, data_t, vec_t, stepper_t >

Settings

bool table_sequence

If true, ensure that walkers and OpenMP threads are written to the table with equal spacing between rows (default true)

size_t file_update_iters

Iterations between file updates (default 0 for no file updates)

double file_update_time

Time between file updates (default 0.0 for no file updates)
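One way to read file_update_iters and file_update_time together with the protected attributes last_write_iters and last_write_time is the check sketched below. The struct and function names are hypothetical, and the sketch assumes that either trigger alone is enough to cause a write and that now is measured in the same units as file_update_time. The class may combine the two settings differently.

```cpp
#include <cstddef>

// Hypothetical containers mirroring the documented settings and state.
struct update_settings {
  std::size_t file_update_iters;  // 0 disables the iteration trigger
  double file_update_time;        // 0.0 disables the time trigger
};

struct update_state {
  std::size_t last_write_iters;  // total acceptances at the last write
  double last_write_time;        // time of the last write
};

// Return true if a file update is due (assumed logic: either trigger).
bool should_write(const update_settings &s, const update_state &st,
                  std::size_t total_accepts, double now) {
  bool by_iters = s.file_update_iters > 0 &&
                  total_accepts >= st.last_write_iters &&
                  total_accepts - st.last_write_iters >= s.file_update_iters;
  bool by_time = s.file_update_time > 0.0 &&
                 now - st.last_write_time >= s.file_update_time;
  return by_iters || by_time;
}
```

With both settings at their defaults, neither trigger fires, which matches the documented meaning of "no file updates".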

size_t table_prealloc

Number of rows to allocate for the table before the MCMC run.

int table_io_chunk

The number of tables to combine before I/O (default 1)

bool store_rejects

If true, store MCMC rejections in the table (default false)

bool check_rows

If true, check rows (default true)

inline virtual void write_files(bool sync_write = false)

Write MCMC tables to files.

inline mcmc_para_table()

Basic usage

inline virtual void set_names_units(std::vector<std::string> names, std::vector<std::string> units)

Set the table names and units.

inline virtual void initial_points_file_last(std::string fname, size_t n_param_loc, size_t offset = 5)

Read initial points from the last points recorded in file named fname.

The values of o2scl::mcmc_para_base::n_walk and o2scl::mcmc_para_base::n_threads must be set to their correct values before calling this function. This function requires that a table is present in fname which stores parameters in a block of columns and has columns named mult, thread, walker, and log_wgt. This function does not double check that the columns in the file associated with the parameters have the correct names.
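The idea of taking the last recorded point for each walker and thread can be sketched without the library as follows. The names table_row and last_points are hypothetical, rows are assumed to appear in the order in which they were recorded, and the slot index thread * n_walk + walker is an assumed layout rather than one documented here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical in-memory view of one table row.
struct table_row {
  std::size_t thread;
  std::size_t walker;
  std::vector<double> pars;
};

// Return one parameter vector per (thread, walker) pair, taken from
// the last row recorded for that pair. Later rows overwrite earlier
// ones, so each slot ends up holding the most recent point. Slots for
// pairs that never appear in `rows` are left empty.
std::vector<std::vector<double>> last_points(
    const std::vector<table_row> &rows, std::size_t n_walk,
    std::size_t n_threads) {
  std::vector<std::vector<double>> init(n_walk * n_threads);
  for (const table_row &r : rows) {
    if (r.thread < n_threads && r.walker < n_walk) {
      init[r.thread * n_walk + r.walker] = r.pars;
    }
  }
  return init;
}
```

The sketch also shows why n_walk and n_threads have to be set first: they determine how many initial points are needed and which rows are relevant.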

inline virtual void initial_points_file_dist(std::string fname, size_t n_param_loc, size_t offset = 5)

Read initial points from file named fname, distributing across the chain if necessary.

The values of o2scl::mcmc_para_base::n_walk and o2scl::mcmc_para_base::n_threads must be set to their correct values before calling this function. This function requires that a table is present in fname which stores parameters in a block of columns. This function does not double check that the columns in the file associated with the parameters have the correct names.

inline virtual void initial_points_file_best(std::string fname, size_t n_param_loc, double thresh = 1.0e-6, size_t offset = 5)

Read initial points from the best points recorded in file named fname.

Before calling this function, the values of o2scl::mcmc_para_base::n_walk and o2scl::mcmc_para_base::n_threads should have been set by the user to the correct values for the future call to mcmc_para_base::mcmc().

In order for this function to succeed, a table must be present in the HDF5 file named fname (the function just reads the first o2scl::table_units object it can find). This table should store parameters in a block of columns beginning with column offset. It should also contain a separate column named log_wgt for the log likelihood. The table must have at least as many unique rows as the number of input points required by the mcmc_para_base::mcmc() function, i.e. the product of o2scl::mcmc_para_base::n_walk and o2scl::mcmc_para_base::n_threads. Rows are presumed to be identical if all of their values differ by less than thresh, which defaults to \( 10^{-6} \).
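The uniqueness requirement can be made concrete with a short sketch. The helper names are hypothetical, and the comparison uses the absolute difference between entries, which is an assumption since the text does not say whether the difference is absolute or relative.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using row_t = std::vector<double>;

// True if every pair of entries differs by less than thresh
// (absolute difference; an assumption of this sketch).
bool rows_match(const row_t &a, const row_t &b, double thresh) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); i++) {
    if (std::fabs(a[i] - b[i]) >= thresh) return false;
  }
  return true;
}

// Keep the first copy of each distinct row.
std::vector<row_t> unique_rows(const std::vector<row_t> &rows,
                               double thresh = 1.0e-6) {
  std::vector<row_t> kept;
  for (const row_t &r : rows) {
    bool seen = false;
    for (const row_t &k : kept) {
      if (rows_match(r, k, thresh)) {
        seen = true;
        break;
      }
    }
    if (!seen) kept.push_back(r);
  }
  return kept;
}

// Are there enough distinct rows for n_walk * n_threads initial points?
bool enough_points(const std::vector<row_t> &rows, std::size_t n_walk,
                   std::size_t n_threads, double thresh = 1.0e-6) {
  return unique_rows(rows, thresh).size() >= n_walk * n_threads;
}
```

For example, a table whose rows are all near-duplicates of two points would be sufficient for two walkers on one thread, but not for two walkers on each of two threads.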

This function does not double check that the columns in the file associated with the parameters have the correct names. The values in the “walker”, “thread”, and “rank” columns in the table, if present, are ignored by this function.

inline virtual int mcmc_fill(size_t n_params_local, vec_t &low, vec_t &high, std::vector<func_t> &func, std::vector<fill_t> &fill, std::vector<data_t> &data)

Perform an MCMC simulation.

Perform an MCMC simulation over n_params_local parameters, limiting the parameters to be between low and high, using func as the objective function and calling the fill function fill to supply the additional table columns at each accepted point.

inline std::shared_ptr<o2scl::table_units<>> get_table()

Get the output table.

inline void set_table(std::shared_ptr<o2scl::table_units<>> &t)

Set the output table.

inline void get_chain_sizes(std::vector<size_t> &chain_sizes)

Determine the chain sizes.

Idea for Future:

This algorithm could be improved by starting from the end of the table and going backwards instead of starting from the front of the table and going forwards.

inline virtual void read_prev_results(o2scl_hdf::hdf_file &hf, size_t n_param_loc, std::string name = "")

Read previous results (number of threads and walkers must be set first)

Note

By default, this tries to obtain the initial points for the next call to mcmc() from the previously accepted points in the table.

Note

This function requires a table correctly stored with the right column order

inline virtual void critical_extra(size_t i_thread)

Additional code to execute inside the OpenMP critical section.

inline virtual int add_line(const vec_t &pars, double log_weight, size_t walker_ix, int func_ret, bool mcmc_accept, data_t &dat, size_t i_thread, fill_t &fill)

A measurement function which adds the point to the table.

inline virtual void mcmc_cleanup()

Perform cleanup after an MCMC simulation.

inline virtual void ac_coeffs(size_t icol, std::vector<double> &ac_coeff_avg, int loc_verbose = 0)

Compute the autocorrelation coefficients for the column with index icol, averaging over all walkers and all threads.
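For readers unfamiliar with the quantity, the sketch below computes autocorrelation coefficients with the textbook estimator \( r_k = \sum_i (x_i - \bar{x})(x_{i+k} - \bar{x}) / \sum_i (x_i - \bar{x})^2 \) and then averages over chains. It illustrates the idea only: the function names are hypothetical, the row multiplicities are ignored, every chain gets equal weight, and ac_coeffs() may use a different estimator or normalization.

```cpp
#include <cstddef>
#include <vector>

// Autocorrelation coefficients r[0..max_lag] of a single chain.
// Lags at or beyond the chain length, and all lags of a constant
// chain, are reported as zero.
std::vector<double> autocorr(const std::vector<double> &x,
                             std::size_t max_lag) {
  std::vector<double> r(max_lag + 1, 0.0);
  if (x.empty()) return r;
  double mean = 0.0;
  for (double v : x) mean += v;
  mean /= static_cast<double>(x.size());
  double denom = 0.0;
  for (double v : x) denom += (v - mean) * (v - mean);
  if (denom == 0.0) return r;
  for (std::size_t k = 0; k <= max_lag && k < x.size(); k++) {
    double num = 0.0;
    for (std::size_t i = 0; i + k < x.size(); i++) {
      num += (x[i] - mean) * (x[i + k] - mean);
    }
    r[k] = num / denom;
  }
  return r;
}

// Average the coefficients over several chains, for example one
// chain per walker and thread.
std::vector<double> autocorr_avg(
    const std::vector<std::vector<double>> &chains, std::size_t max_lag) {
  std::vector<double> avg(max_lag + 1, 0.0);
  if (chains.empty()) return avg;
  for (const std::vector<double> &c : chains) {
    std::vector<double> r = autocorr(c, max_lag);
    for (std::size_t k = 0; k <= max_lag; k++) avg[k] += r[k];
  }
  for (double &a : avg) a /= static_cast<double>(chains.size());
  return avg;
}
```

The lag at which the averaged coefficients fall close to zero gives a rough estimate of the autocorrelation length, which is the scale relevant for choosing a block size in reblock().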

inline virtual void reorder_table()

Reorder the table by thread and walker index.
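A reordering by thread and walker index can be sketched with a stable sort, so that the rows of each chain stay in the order in which they were recorded. The struct chain_entry and the function name are hypothetical, and sorting by thread first and walker second is an assumption about the intended order.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical in-memory view of one table row.
struct chain_entry {
  std::size_t thread;
  std::size_t walker;
  double value;
};

// Group rows by thread, then by walker. std::stable_sort preserves
// the original relative order of rows that share both indices.
void reorder_by_chain(std::vector<chain_entry> &rows) {
  std::stable_sort(rows.begin(), rows.end(),
                   [](const chain_entry &a, const chain_entry &b) {
                     if (a.thread != b.thread) return a.thread < b.thread;
                     return a.walker < b.walker;
                   });
}
```

After such a reordering, each chain occupies a contiguous range of rows, which is convenient for per-chain post-processing.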

inline void reblock(size_t n_blocks)

Reaverage the data into blocks of a fixed size in order to avoid autocorrelations.

This function is useful for removing autocorrelations from the table so long as the autocorrelation length is shorter than the block size. This function does not compute the autocorrelation length to check that this is the case.

Note

The number of blocks n_blocks must be smaller than the current table size. This function expects to find a column named “mult” which contains the multiplicity of each row, as is the case after a call to mcmc_para_base::mcmc().
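Block averaging with multiplicities can be sketched as follows for a single column. This is an illustration under stated assumptions, not the library's algorithm: the names are hypothetical, the sketch requires that n_blocks not exceed the number of rows, blocks contain an equal number of rows, trailing rows that do not fill a block are dropped, and every mult is positive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One column entry together with its multiplicity.
struct weighted_value {
  double value;
  double mult;
};

// Split the rows into n_blocks contiguous blocks of equal row count
// and return the mult-weighted mean of each block.
std::vector<double> block_averages(const std::vector<weighted_value> &rows,
                                   std::size_t n_blocks) {
  assert(n_blocks > 0 && n_blocks <= rows.size());
  const std::size_t block_size = rows.size() / n_blocks;
  std::vector<double> avgs;
  avgs.reserve(n_blocks);
  for (std::size_t b = 0; b < n_blocks; b++) {
    double sum = 0.0;
    double weight = 0.0;
    for (std::size_t i = b * block_size; i < (b + 1) * block_size; i++) {
      sum += rows[i].value * rows[i].mult;
      weight += rows[i].mult;
    }
    avgs.push_back(sum / weight);
  }
  return avgs;
}
```

If the block size is longer than the autocorrelation length, the resulting block averages are approximately independent, which is the purpose described above.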

Protected Types

typedef std::function<int(const vec_t &, double, size_t, int, bool, data_t &)> internal_measure_t

Measurement functor type for the parent.

typedef mcmc_para_base<func_t, internal_measure_t, data_t, vec_t, stepper_t> parent_t

Type of parent class.

Protected Functions

inline virtual int mcmc_init()

MCMC initialization function.

This function sets the column names and units.

inline virtual int fill_line(const vec_t &pars, double log_weight, std::vector<double> &line, data_t &dat, size_t i_walker, fill_t &fill)

Fill line with data for insertion into the table.

inline virtual void file_header(o2scl_hdf::hdf_file &hf)

Initial write to HDF5 file.

Protected Attributes

std::vector<std::string> col_names

Column names.

std::vector<std::string> col_units

Column units.

size_t n_params

Number of parameters.

std::shared_ptr<o2scl::table_units<>> table

Main data table for Markov chain.

bool first_write

If true, the HDF5 I/O initial info has been written to the file (set by mcmc() )

std::vector<int> walker_accept_rows

For each walker and thread, record the last row in the table which corresponds to an accept.

std::vector<int> walker_reject_rows

For each walker and thread, record the last row in the table which corresponds to a reject.

vec_t low_copy

A copy of the lower limits for HDF5 output.

vec_t high_copy

A copy of the upper limits for HDF5 output.

size_t last_write_iters

Total number of MCMC acceptances over all threads at last file write() (default 0)

double last_write_time

Time at last file write() (default 0.0)

bool prev_read

If true, previous results have been read.

This is set to true by read_prev_results() and set back to false after mcmc_init() is called.