High-Level Interface

This page details the most important methods and classes provided by the nautilus module.

class nautilus.Sampler(prior, likelihood, n_dim=None, n_live=2000, n_update=None, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, periodic=None, n_networks=4, neural_network_kwargs={}, prior_args=[], prior_kwargs={}, likelihood_args=[], likelihood_kwargs={}, n_batch=None, n_like_new_bound=None, vectorized=False, pass_dict=None, pool=None, seed=None, blobs_dtype=None, filepath=None, resume=True)

Initialize the sampler.

Parameters:

prior (function or nautilus.Prior): Prior describing the mapping of the unit hypercube to the parameters.
likelihood (function): Function returning the natural logarithm of the likelihood.
n_dim (int, optional): Number of dimensions of the likelihood function. If not specified, it is inferred from the prior argument, which requires prior to be an instance of nautilus.Prior.
n_live (int, optional): Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.
n_update (None or int, optional): The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.
enlarge_per_dim (float, optional): Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.
n_points_min (int or None, optional): The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.
split_threshold (float, optional): Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
periodic (numpy.ndarray or None, optional): Indices of the parameters that are periodic. Default is None.
n_networks (int, optional): Number of networks used in the estimator. Default is 4.
neural_network_kwargs (dict, optional): Non-default keyword arguments passed to the constructor of MLPRegressor.
prior_args (list, optional): List of extra positional arguments for prior. Only used if prior is a function.
prior_kwargs (dict, optional): Dictionary of extra keyword arguments for prior. Only used if prior is a function.
likelihood_args (list, optional): List of extra positional arguments for likelihood.
likelihood_kwargs (dict, optional): Dictionary of extra keyword arguments for likelihood.
n_batch (int or None, optional): Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, this should be a multiple of the number of parallel processes. If None, it is the smallest multiple of the pool size used for likelihood calls that is at least 100. Default is None.
n_like_new_bound (None or int, optional): The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.
vectorized (bool, optional): If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.
pass_dict (bool or None, optional): If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. By default, it is True if prior is a nautilus.Prior instance and False otherwise.
pool (None, object, int or tuple, optional): Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if a tuple is given, the first element specifies the pool used for likelihood calls and the second element the pool used for sampler calculations. Supported pools include instances of multiprocessing.Pool and dask.distributed.client.Client. Default is None.
seed (None or int, optional): Seed for random number generation, used to obtain reproducible results across different runs. If None, results are not reproducible. Default is None.
blobs_dtype (object or None, optional): Object that can be converted to a data type object describing the blobs. If None, this will be inferred from the first blob. Default is None.
filepath (string, pathlib.Path or None, optional): Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.
resume (bool, optional): If True, resume from a previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
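
The default rule for n_batch described above ("the smallest multiple of the pool size that is at least 100") can be sketched in plain Python. The helper name default_n_batch and its minimum argument are hypothetical and only illustrate the arithmetic; they are not part of the nautilus API.

```python
def default_n_batch(pool_size, minimum=100):
    """Return the smallest multiple of pool_size that is at least minimum.

    Hypothetical helper mirroring the documented default for n_batch.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be a positive integer")
    # Integer ceiling division: ceil(minimum / pool_size) without floats.
    n_groups = -(-minimum // pool_size)
    return n_groups * pool_size


# With 8 parallel processes, 13 groups of 8 are needed to reach at least 100.
print(default_n_batch(8))
```

Choosing a multiple of the pool size means every worker receives the same number of likelihood evaluations per step, so no process sits idle at the end of a batch.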

Raises:

ValueError: If prior is a function and n_dim is not given or pass_dict is set to True, or if the dimensionality of the problem is less than 2.
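
As a minimal sketch of how the constructor might be used, the snippet below builds a two-parameter prior and passes it to the sampler together with a toy likelihood. The parameter names "a" and "b", the Gaussian likelihood and the helper build_sampler are illustrative assumptions. The nautilus import is placed inside the helper so that the likelihood can be checked on its own.

```python
def log_likelihood(params):
    """Toy Gaussian log-likelihood centred on (0.5, 0.5) with width 0.1.

    Takes a dictionary because the prior below is a nautilus.Prior,
    in which case pass_dict defaults to True.
    """
    return -0.5 * sum(((params[key] - 0.5) / 0.1) ** 2 for key in ("a", "b"))


def build_sampler(filepath=None, seed=0):
    """Create a sampler for the toy problem (requires nautilus)."""
    from nautilus import Prior, Sampler  # imported lazily on purpose

    prior = Prior()
    prior.add_parameter("a", dist=(0.0, 1.0))  # uniform between 0 and 1
    prior.add_parameter("b", dist=(0.0, 1.0))
    # n_dim is omitted: it is inferred because prior is a nautilus.Prior.
    return Sampler(prior, log_likelihood, n_live=2000, seed=seed,
                   filepath=filepath)
```

Passing a filepath ending in ‘.h5’ or ‘.hdf5’ would additionally write results to disk and, with resume=True, continue from an existing file.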

run(f_live=0.01, n_shell=1, n_eff=10000, n_like_max=inf, discard_exploration=False, timeout=inf, verbose=False)

Run the sampler until convergence.

Parameters:

f_live (float, optional): Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.
n_shell (int, optional): Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.
n_eff (float, optional): Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.
n_like_max (int, optional): Maximum total (across multiple runs) number of likelihood evaluations. Regardless of progress, the sampler will not start new likelihood computations if this value is reached. Note that this value includes likelihood calls from previous runs, if applicable. Default is infinity.
discard_exploration (bool, optional): Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.
timeout (float, optional): Timeout interval in seconds. The sampler will not start new likelihood computations if this limit is reached. Unlike n_like_max, this limit only refers to the current function call. Default is infinity.
verbose (bool, optional): If True, print information about sampler progress. Default is False.

Returns:

success (bool): Whether the run finished successfully without stopping prematurely. False if the run finished because the n_like_max or timeout limits were reached and True otherwise.
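
Because n_like_max counts likelihood calls across all runs while timeout applies only to the current call, a staged run has to raise n_like_max each time. The sketch below illustrates that pattern. The helper run_in_stages and its arguments are hypothetical, and it only relies on the run signature and boolean return value documented above.

```python
def run_in_stages(sampler, calls_per_stage=100_000, timeout=3600.0,
                  max_stages=5):
    """Call sampler.run repeatedly until it reports success.

    n_like_max is a running total, so the budget grows with each stage.
    timeout is per call, so the same value is reused.
    """
    n_like_max = calls_per_stage
    for _ in range(max_stages):
        success = sampler.run(n_like_max=n_like_max, timeout=timeout,
                              verbose=False)
        if success:
            return True
        # Stopped early: allow another calls_per_stage evaluations in total.
        n_like_max += calls_per_stage
    return False
```

Between stages, one could inspect intermediate results or save state; if the sampler was created with a filepath, progress is expected to persist across interruptions.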

posterior(return_as_dict=None, equal_weight=False, equal_weight_boost=1.0, return_blobs=False)

Return the posterior sample estimate.

Parameters:

return_as_dict (bool or None, optional): If True, return points as a dictionary. If None, will default to False unless one uses a custom prior that only returns dictionaries. Default is None.
equal_weight (bool, optional): If True, return an equal-weighted posterior. This is done by randomly sampling each point from the unequal-weighted posterior proportional to its weight. Note that this effectively degrades the posterior estimate. For high-precision estimates of the posterior, use the unequal-weighted posterior. Default is False.
equal_weight_boost (float, optional): For the equal-weighted posterior, each point is sampled n times, where n is drawn from a nearest-integer distribution with mean value w / max(w) * equal_weight_boost. Here, max(w) is the maximum weight across all points in the posterior. For equal_weight_boost = 1, each point is sampled at most once, i.e., the posterior estimate contains no duplicates. For equal_weight_boost > 1, duplicates are possible but the equal-weighted posterior is a better approximation to the unequal-weighted posterior. Note that the number of points returned is, on average, proportional to equal_weight_boost. Default is 1.0.
return_blobs (bool, optional): If True, return the blobs. Default is False.
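
The role of equal_weight_boost can be illustrated with a small stand-alone sketch. It assumes that "nearest-integer distribution" means rounding the mean w / max(w) * equal_weight_boost down or up to a neighbouring integer, with the probability of rounding up equal to the fractional part. That interpretation, and the helper equal_weight_counts, are assumptions rather than the library's implementation.

```python
import math
import random


def equal_weight_counts(log_w, equal_weight_boost=1.0, rng=None):
    """Draw how often each point is repeated in an equal-weighted sample."""
    if rng is None:
        rng = random.Random()
    log_w_max = max(log_w)
    counts = []
    for lw in log_w:
        # Mean number of copies: w / max(w) * equal_weight_boost.
        mean = math.exp(lw - log_w_max) * equal_weight_boost
        base = math.floor(mean)
        # Round up with probability equal to the fractional part of the mean.
        extra = 1 if rng.random() < mean - base else 0
        counts.append(base + extra)
    return counts
```

With equal_weight_boost=1.0 the highest-weight point has a mean of exactly 1 and every other point a mean below 1, so no point is drawn more than once; larger values allow duplicates and return proportionally more points on average.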

Returns:

points (numpy.ndarray or dict): Points of the posterior.
log_w (numpy.ndarray): Logarithm of the weight of each point of the posterior.
log_l (numpy.ndarray): Logarithm of the likelihood at each point of the posterior.
blobs (numpy.ndarray, optional): Blobs for each point of the posterior. Only returned if return_blobs is True.
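
When working with the unequal-weighted posterior, summary statistics need to account for the weights. The sketch below computes a weighted mean of one parameter, assuming that log_w holds the natural logarithm of the weights, as its name suggests. The helper posterior_mean is hypothetical and uses plain Python sequences for clarity.

```python
import math


def posterior_mean(values, log_w):
    """Weighted mean of values given log-weights.

    Subtracting the largest log-weight before exponentiating keeps the
    calculation stable when the log-weights are very large or very small.
    """
    log_w_max = max(log_w)
    weights = [math.exp(lw - log_w_max) for lw in log_w]
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Two equally weighted points average to their midpoint.
print(posterior_mean([1.0, 3.0], [0.0, 0.0]))
```

The normalization of the weights cancels in the ratio, so the result does not depend on whether the returned weights are already normalized.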

Raises:

ValueError: If return_as_dict or return_blobs is True but the sampler has not been run in a way that makes that possible.

property n_eff

Estimate the total effective sample size \(N_{\rm eff}\).

Returns:

n_eff (float): Estimate of the total effective sample size \(N_{\rm eff}\).
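
For intuition about what an effective sample size measures, the sketch below applies the generic Kish estimator, \((\sum_i w_i)^2 / \sum_i w_i^2\), to a set of log-weights. This is a standard textbook formula and an assumption here: the value reported by the n_eff property may be computed differently internally. The helper kish_n_eff is hypothetical.

```python
import math


def kish_n_eff(log_w):
    """Kish effective sample size computed from log-weights."""
    log_w_max = max(log_w)
    # Rescaling by the largest weight avoids overflow and leaves the
    # ratio unchanged.
    weights = [math.exp(lw - log_w_max) for lw in log_w]
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Four equally weighted points count as four effective samples.
print(kish_n_eff([0.0, 0.0, 0.0, 0.0]))
```

When a single point dominates the weights, the estimate approaches 1, which is why a weighted sample can carry far less information than its raw size suggests.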

property log_z

Estimate the global evidence \(\log \mathcal{Z}\).

Returns:

log_z (float or None): Estimate of the global evidence \(\log \mathcal{Z}\).

property eta

Estimate the asymptotic sampling efficiency \(\eta\).

The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.

Returns:

eta (float): Estimate of the asymptotic sampling efficiency.
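
Since \(\eta\) relates effective sample size to likelihood calls, it can serve as a rough budgeting aid. The sketch below estimates how many further likelihood evaluations would be needed to reach a target effective sample size, under the simplifying assumption that each additional call contributes \(\eta\) effective samples. The helper extra_likelihood_calls is hypothetical and the result is only an order-of-magnitude guide.

```python
import math


def extra_likelihood_calls(n_eff_now, n_eff_target, eta):
    """Rough number of additional likelihood calls to reach n_eff_target."""
    if not 0.0 < eta <= 1.0:
        raise ValueError("eta must lie in the interval (0, 1]")
    missing = n_eff_target - n_eff_now
    # Never report a negative number of calls if the target is already met.
    return max(0, math.ceil(missing / eta))


# Going from 2000 to 10000 effective samples at 25% efficiency.
print(extra_likelihood_calls(2000, 10000, 0.25))
```

Because the estimate of \(\eta\) is itself updated during the sampling phase, such a projection is best recomputed as the run progresses.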

shell_bound_occupation(fractional=True)

Determine how many points of each shell are also part of each bound.

Parameters:

fractional (bool, optional): If True, return the fractional occupation. If False, return the absolute number of points. Default is True.

Returns:

m (numpy.ndarray): Two-dimensional array with occupation numbers. The element at index \((i, j)\) corresponds to the occupation of points in shell \(i\) that also belong to bound \(j\). If fractional is True, this is the fraction of all points in shell \(i\) and otherwise it is the absolute number.
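
The relation between the absolute and fractional forms of the occupation matrix can be sketched as follows. The helper fractional_occupation and its shell_sizes argument (the number of points in each shell) are hypothetical; in practice the method returns the fractional form directly when fractional=True.

```python
def fractional_occupation(counts, shell_sizes):
    """Convert absolute occupation numbers to fractions per shell.

    counts[i][j] is the number of points in shell i that also lie in
    bound j, and shell_sizes[i] is the total number of points in shell i.
    """
    return [
        [c / n if n > 0 else 0.0 for c in row]
        for row, n in zip(counts, shell_sizes)
    ]


# Shell 0 has 10 points, all in bound 0 and half of them also in bound 1.
print(fractional_occupation([[10, 5], [0, 8]], [10, 8]))
```

Rows are divided by the shell size rather than by the row sum because a point can belong to several bounds at once, so a row of fractions need not sum to 1.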

class nautilus.Prior

Initialize a prior without any parameters.

add_parameter(key=None, dist=(0, 1))

Add a model parameter to the prior.

Parameters:

key (str or None): Name of the model parameter. If None, the key name will be x_i, where i is a number.
dist (float, tuple, str or object): Distribution the parameter should follow. If a float, the parameter is fixed to this value and will not be fitted in any analysis. If a tuple, it gives the lower and upper bound of a uniform distribution. If a string, the parameter will always be equal to the named model parameter. Finally, if an object, it must have an isf attribute, i.e., the inverse survival function.

Raises:

TypeError: If key or dist is not the correct type.
ValueError: If a new key already exists in the key list or if dist is a string but does not refer to a previously defined key.
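
The four kinds of dist described above can be combined in one prior, as in the following sketch. The parameter names and values are illustrative assumptions, and scipy.stats.norm is used as one example of an object that provides an isf method. The imports are placed inside the helper because they require nautilus and SciPy to be installed.

```python
def build_prior():
    """Build a prior that uses each supported kind of dist once."""
    from nautilus import Prior      # imported lazily on purpose
    from scipy.stats import norm

    prior = Prior()
    # Tuple: uniform distribution between the lower and upper bound.
    prior.add_parameter("amplitude", dist=(-5.0, 5.0))
    # Object with an isf method: here a normal distribution.
    prior.add_parameter("offset", dist=norm(loc=0.0, scale=2.0))
    # Float: fixed value that is not fitted.
    prior.add_parameter("width", dist=1.5)
    # String: always equal to the previously defined parameter.
    prior.add_parameter("amplitude_copy", dist="amplitude")
    return prior
```

The tied parameter is added after the one it refers to, since a string that does not name a previously defined key raises a ValueError.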