High-Level Interface

This page details the most important methods and classes provided by the nautilus module.

class nautilus.Sampler(prior, likelihood, n_dim=None, n_live=2000, n_update=None, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, n_networks=4, neural_network_kwargs={}, prior_args=[], prior_kwargs={}, likelihood_args=[], likelihood_kwargs={}, n_batch=None, n_like_new_bound=None, vectorized=False, pass_dict=None, pool=None, seed=None, blobs_dtype=None, filepath=None, resume=True)

Initialize the sampler.

Parameters:
prior: function or nautilus.Prior

Prior describing the mapping of the unit hypercube to the parameters.

likelihood: function

Function returning the natural logarithm of the likelihood.

n_dim: int, optional

Number of dimensions of the likelihood function. If not specified, it is inferred from the prior argument, which requires prior to be an instance of nautilus.Prior.

n_live: int, optional

Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.

n_update: None or int, optional

The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.

enlarge_per_dim: float, optional

Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.

n_points_min: int or None, optional

The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.

split_threshold: float, optional

Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.

n_networks: int, optional

Number of networks used in the estimator. Default is 4.

neural_network_kwargs: dict, optional

Non-default keyword arguments passed to the constructor of MLPRegressor.

prior_args: list, optional

List of extra positional arguments for prior. Only used if prior is a function.

prior_kwargs: dict, optional

Dictionary of extra keyword arguments for prior. Only used if prior is a function.

likelihood_args: list, optional

List of extra positional arguments for likelihood.

likelihood_kwargs: dict, optional

Dictionary of extra keyword arguments for likelihood.

n_batch: int or None, optional

Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, this should be a multiple of the number of parallel processes. If None, it is the smallest multiple of the pool size used for likelihood calls that is at least 100. Default is None.

n_like_new_bound: None or int, optional

The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.

vectorized: bool, optional

If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.

pass_dict: bool or None, optional

If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise.

pool: None, object, int or tuple, optional

Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if a tuple is given, the first element specifies the pool used for likelihood calls and the second element the pool for sampler calculations. Supported pools include instances of multiprocessing.Pool and dask.distributed.client.Client. Default is None.

seed: None or int, optional

Seed for random number generation, used to obtain reproducible results across different runs. If None, results are not reproducible. Default is None.

blobs_dtype: object or None, optional

Object that can be converted to a data type object describing the blobs. If None, this will be inferred from the first blob. Default is None.

filepath: string, pathlib.Path or None, optional

Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.

resume: bool, optional

If True, resume from previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
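The shape contract described for vectorized and pass_dict can be sketched as follows. This is a minimal illustration rather than part of nautilus: the function names, the standard-normal target, and the parameter keys 'a' and 'b' are all assumptions chosen for the example.

```python
import numpy as np


def log_likelihood_array(x):
    """Vectorized log-likelihood for array input.

    Accepts shape (n_points, n_dim) and returns shape (n_points,).
    The target is an illustrative standard normal in every dimension.
    """
    x = np.atleast_2d(x)
    n_dim = x.shape[1]
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * n_dim * np.log(2.0 * np.pi)


def log_likelihood_dict(params):
    """Vectorized log-likelihood for dictionary input.

    Each value is an array with shape (n_points,). The keys 'a' and 'b'
    are hypothetical parameter names.
    """
    a = np.asarray(params["a"])
    b = np.asarray(params["b"])
    return -0.5 * (a**2 + b**2) - np.log(2.0 * np.pi)


# Three points in two dimensions, evaluated in a single call.
points = np.array([[0.0, 0.0], [1.0, -1.0], [2.0, 0.5]])
log_l_array = log_likelihood_array(points)
log_l_dict = log_likelihood_dict({"a": points[:, 0], "b": points[:, 1]})
```

Functions of this kind would be passed with vectorized=True; pass_dict then determines whether the array or the dictionary form is expected.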

Raises:
ValueError

If prior is a function and n_dim is not given or pass_dict is set to True. If the dimensionality of the problem is less than 2.
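When prior is a plain function, n_dim has to be passed explicitly. A minimal sketch of that case is shown below. The names prior_transform, log_likelihood and make_sampler, the uniform range, and the Gaussian target are assumptions for illustration. The nautilus import is deferred into make_sampler so the two helper functions can be tried on their own.

```python
import numpy as np


def prior_transform(u):
    """Map unit-hypercube coordinates to parameters uniform on [-5, 5]."""
    return 10.0 * np.asarray(u) - 5.0


def log_likelihood(x):
    """Natural log of an illustrative unit-variance Gaussian likelihood."""
    x = np.asarray(x)
    return -0.5 * np.sum(x**2) - 0.5 * x.size * np.log(2.0 * np.pi)


def make_sampler(n_dim=2, seed=0):
    """Build a sampler from a function prior; n_dim is required here."""
    from nautilus import Sampler  # imported lazily; requires nautilus

    return Sampler(prior_transform, log_likelihood, n_dim=n_dim, seed=seed)
```

The prior uses only element-wise operations, so it behaves the same whether it receives one point or a batch of points.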

run(f_live=0.01, n_shell=1, n_eff=10000, n_like_max=inf, discard_exploration=False, timeout=inf, verbose=False)

Run the sampler until convergence.

Parameters:
f_live: float, optional

Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.

n_shell: int, optional

Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.

n_eff: float, optional

Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.

n_like_max: int, optional

Maximum total (across multiple runs) number of likelihood evaluations. Regardless of progress, the sampler will not start new likelihood computations if this value is reached. Note that this value includes likelihood calls from previous runs, if applicable. Default is infinity.

discard_exploration: bool, optional

Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.

timeout: float, optional

Timeout interval in seconds. The sampler will not start new likelihood computations if this limit is reached. Unlike for n_like_max, this maximum only refers to the current function call. Default is infinity.

verbose: bool, optional

If True, print information about sampler progress. Default is False.

Returns:
success: bool

Whether the run finished successfully without stopping prematurely. False if the run finished because the n_like_max or timeout limits were reached and True otherwise.
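Because timeout applies to a single call while n_like_max counts across calls, a long run can be split into bounded chunks that stop once run reports success. The sketch below illustrates that pattern. Both run_in_chunks and StubSampler are hypothetical: the stub stands in for a real nautilus.Sampler so the example can be executed without the library.

```python
import math


def run_in_chunks(sampler, chunk_seconds, max_chunks, n_like_max=math.inf):
    """Call sampler.run repeatedly until it reports success.

    Returns the number of calls made and the final success flag.
    """
    success = False
    n_calls = 0
    for _ in range(max_chunks):
        n_calls += 1
        # timeout limits this call only; n_like_max is a total across calls.
        success = sampler.run(timeout=chunk_seconds, n_like_max=n_like_max)
        if success:
            break
    return n_calls, success


class StubSampler:
    """Hypothetical stand-in that reports success on its third run call."""

    def __init__(self):
        self.calls = 0

    def run(self, timeout=math.inf, n_like_max=math.inf):
        self.calls += 1
        return self.calls >= 3


n_calls, success = run_in_chunks(StubSampler(), chunk_seconds=600.0, max_chunks=5)
```

Note that once n_like_max itself has been reached, further calls cannot make progress, which is why the loop is bounded by max_chunks.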

posterior(return_as_dict=None, equal_weight=False, equal_weight_boost=1.0, return_blobs=False)

Return the posterior sample estimate.

Parameters:
return_as_dict: bool or None, optional

If True, return points as a dictionary. If None, will default to False unless a custom prior that only returns dictionaries is used. Default is None.

equal_weight: bool, optional

If True, return an equal-weighted posterior. This is done by randomly sampling each point from the unequal-weighted posterior proportional to its weight. Note that this effectively downgrades the posterior. For high-precision estimates of the posterior, use the unequal-weighted posterior. Default is False.

equal_weight_boost: float, optional

For the equal-weighted posterior, each point is sampled n times, where n is drawn from a nearest-integer distribution with mean value w / max(w) * equal_weight_boost. Here, max(w) is the maximum weight across all points in the posterior. For equal_weight_boost = 1, this means that each point is sampled at most once, i.e., the posterior estimate contains no duplicates. For equal_weight_boost > 1, duplicates are possible but the equal-weighted posterior is a better approximation to the unequal-weighted posterior. Note that the number of points returned is, on average, proportional to equal_weight_boost. Default is 1.0.

return_blobs: bool, optional

If True, return the blobs. Default is False.

Returns:
points: numpy.ndarray or dict

Points of the posterior.

log_w: numpy.ndarray

Logarithm of the weight of each point of the posterior.

log_l: numpy.ndarray

Logarithm of the likelihood at each point of the posterior.

blobs: numpy.ndarray, optional

Blobs for each point of the posterior. Only returned if return_blobs is True.

Raises:
ValueError

If return_as_dict or return_blobs is True but the sampler has not been run in a way that makes this possible.
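The nearest-integer resampling described for equal_weight_boost can be sketched in a few lines. This illustrates the stated rule and is not the library's implementation. The function name equal_weight_counts and the example weights are assumptions.

```python
import numpy as np


def equal_weight_counts(log_w, equal_weight_boost=1.0, rng=None):
    """Draw how often each point appears in an equal-weighted sample.

    Each count is floor(m) or floor(m) + 1 with mean m, where
    m = w / max(w) * equal_weight_boost.
    """
    rng = np.random.default_rng() if rng is None else rng
    log_w = np.asarray(log_w, dtype=float)
    mean = np.exp(log_w - np.max(log_w)) * equal_weight_boost
    base = np.floor(mean)
    # Round up with probability equal to the fractional part of the mean.
    extra = rng.random(mean.shape) < (mean - base)
    return (base + extra).astype(int)


rng = np.random.default_rng(42)
log_w = np.log(np.array([0.5, 0.25, 0.125, 0.125]))
counts = equal_weight_counts(log_w, equal_weight_boost=1.0, rng=rng)
# Repeating each index by its count gives the indices of the resampled points.
indices = np.repeat(np.arange(len(log_w)), counts)
```

With equal_weight_boost=1.0 no count exceeds one, and the point with the maximum weight is always kept exactly once.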

property n_eff

Estimate the total effective sample size \(N_{\rm eff}\).

Returns:
n_eff: float

Estimate of the total effective sample size \(N_{\rm eff}\).
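For intuition, an effective sample size can be computed from log-weights with the widely used Kish estimator, \(N_{\rm eff} = (\sum_i w_i)^2 / \sum_i w_i^2\). This page does not state which estimator the property uses, so the sketch below is an assumption for illustration and may not reproduce the value of n_eff exactly. The function name is hypothetical.

```python
import numpy as np


def kish_effective_sample_size(log_w):
    """Kish effective sample size computed from log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    # Shift by the maximum for numerical stability; the ratio is unchanged.
    w = np.exp(log_w - np.max(log_w))
    return np.sum(w) ** 2 / np.sum(w**2)


# Equal weights: every point counts fully.
n_equal = kish_effective_sample_size(np.zeros(1000))
# One dominant weight: the effective sample size collapses towards one.
n_skewed = kish_effective_sample_size(np.array([0.0, -50.0, -50.0]))
```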

property log_z

Estimate the global evidence \(\log \mathcal{Z}\).

Returns:
log_z: float or None

Estimate of the global evidence \(\log \mathcal{Z}\).

property eta

Estimate the asymptotic sampling efficiency \(\eta\).

The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.

Returns:
eta: float

Estimate of the asymptotic sampling efficiency.
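Since \(\eta\) relates effective sample size to likelihood calls, it can be used for a rough budget estimate. The sketch below linearly extrapolates the number of additional likelihood calls needed to reach a target effective sample size, assuming the efficiency stays roughly constant. The function name and the example numbers are hypothetical.

```python
import math


def extra_likelihood_calls(n_eff_target, n_eff_now, eta):
    """Estimate further likelihood calls needed to reach n_eff_target."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    # Each additional likelihood call adds about eta effective samples.
    return max(0, math.ceil((n_eff_target - n_eff_now) / eta))


n_extra = extra_likelihood_calls(n_eff_target=10000, n_eff_now=4000, eta=0.25)
```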

shell_bound_occupation(fractional=True)

Determine how many points of each shell are also part of each bound.

Parameters:
fractional: bool, optional

Whether to return the absolute or fractional dependence. Default is True.

Returns:
m: numpy.ndarray

Two-dimensional array with occupation numbers. The element at index \((i, j)\) corresponds to the occupation of points in shell \(i\) that also belong to bound \(j\). If fractional is True, this is the fraction of all points in shell \(i\) and otherwise it is the absolute number.
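The meaning of the returned matrix can be illustrated with a small stand-alone calculation. The inputs shell_index and in_bound are hypothetical: the sampler derives the equivalent information from its internal state, and this sketch only mirrors the definition given above.

```python
import numpy as np


def occupation_matrix(shell_index, in_bound, fractional=True):
    """Count, for each shell, the points that fall inside each bound.

    shell_index: integer array, shape (n_points,), shell of each point.
    in_bound: boolean array, shape (n_points, n_bounds).
    Assumes every shell contains at least one point.
    """
    shell_index = np.asarray(shell_index)
    in_bound = np.asarray(in_bound, dtype=bool)
    n_shells = shell_index.max() + 1
    m = np.zeros((n_shells, in_bound.shape[1]))
    # Accumulate each point's bound memberships into its shell's row.
    np.add.at(m, shell_index, in_bound.astype(float))
    if fractional:
        n_per_shell = np.bincount(shell_index, minlength=n_shells)
        m = m / n_per_shell[:, np.newaxis]
    return m


shell_index = [0, 0, 1, 1, 1]
in_bound = [[True, False], [True, False], [True, True], [True, True], [True, False]]
m_abs = occupation_matrix(shell_index, in_bound, fractional=False)
m_frac = occupation_matrix(shell_index, in_bound, fractional=True)
```

A point can lie inside several bounds at once, so the rows of the fractional matrix need not sum to one.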

class nautilus.Prior

Initialize a prior without any parameters.

add_parameter(key=None, dist=(0, 1))

Add a model parameter to the prior.

Parameters:
key: str or None

Name of the model parameter. If None, the key name will be x_i, where i is a number.

dist: float, tuple, str or object

Distribution the parameter should follow. If a float, the parameter is fixed to this value and will not be fitted in any analysis. If a tuple, it gives the lower and upper bound of a uniform distribution. If a string, the parameter will always be equal to the named model parameter. Finally, if an object, it must have an isf attribute, i.e., the inverse survival function.

Raises:
TypeError

If key or dist is not the correct type.

ValueError

If a new key already exists in the key list or if dist is a string but does not refer to a previously defined key.
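The four kinds of dist can be sketched together as follows. GaussianDist is a hypothetical class that provides only an isf method. Frozen scipy.stats distributions also expose isf and are the more usual choice. Whether a hand-written class like this one is accepted is an assumption based on the description above. The parameter names and build_prior are likewise illustrative, and the nautilus import is deferred so the class can be tried on its own.

```python
from statistics import NormalDist

import numpy as np


class GaussianDist:
    """Hypothetical distribution object exposing only an isf method."""

    def __init__(self, loc=0.0, scale=1.0):
        self._dist = NormalDist(mu=loc, sigma=scale)

    def isf(self, q):
        """Inverse survival function: the x with P(X > x) = q, for 0 < q < 1."""
        q = np.asarray(q, dtype=float)
        # inv_cdf works on scalars, so vectorize it over the input array.
        return np.vectorize(lambda v: self._dist.inv_cdf(1.0 - v))(q)


def build_prior():
    """Assemble a prior that uses each documented kind of dist."""
    from nautilus import Prior  # imported lazily; requires nautilus

    prior = Prior()
    prior.add_parameter("a", dist=(-5.0, 5.0))  # uniform between -5 and 5
    prior.add_parameter("b", dist=GaussianDist(0.0, 2.0))  # object with isf
    prior.add_parameter("c", dist="a")  # always equal to parameter a
    prior.add_parameter("d", dist=1.5)  # fixed value, not fitted
    return prior
```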