High-Level Interface
This page details the most important methods and classes provided by the nautilus module.
- class nautilus.Sampler(prior, likelihood, n_dim=None, n_live=2000, n_update=None, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, periodic=None, n_networks=4, neural_network_kwargs={}, prior_args=[], prior_kwargs={}, likelihood_args=[], likelihood_kwargs={}, n_batch=None, n_like_new_bound=None, vectorized=False, pass_dict=None, pool=None, seed=None, blobs_dtype=None, filepath=None, resume=True)
Initialize the sampler.
- Parameters:
- prior: function or nautilus.Prior
Prior describing the mapping of the unit hypercube to the parameters.
- likelihood: function
Function returning the natural logarithm of the likelihood.
- n_dim: int, optional
Number of dimensions of the likelihood function. If not specified, it is inferred from the prior argument, which requires prior to be an instance of nautilus.Prior.
- n_live: int, optional
Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.
- n_update: None or int, optional
The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.
- enlarge_per_dim: float, optional
Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.
- n_points_min: int or None, optional
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.
- split_threshold: float, optional
Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
- periodic: numpy.ndarray or None, optional
Indices of the parameters that are periodic. Default is None.
- n_networks: int, optional
Number of networks used in the estimator. Default is 4.
- neural_network_kwargs: dict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- prior_args: list, optional
List of extra positional arguments for prior. Only used if prior is a function.
- prior_kwargs: dict, optional
Dictionary of extra keyword arguments for prior. Only used if prior is a function.
- likelihood_args: list, optional
List of extra positional arguments for likelihood.
- likelihood_kwargs: dict, optional
Dictionary of extra keyword arguments for likelihood.
- n_batch: int or None, optional
Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, this should be a multiple of the number of parallel processes. If None, it will be the smallest multiple of the pool size used for likelihood calls that is at least 100. Default is None.
- n_like_new_bound: None or int, optional
The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.
- vectorized: bool, optional
If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.
- pass_dict: bool or None, optional
If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise.
- pool: None, object, int or tuple, optional
Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler uses a multiprocessing.Pool object with the specified number of processes. If a tuple, the first element specifies the pool used for likelihood calls and the second the pool used for sampler calculations. Supported pools include instances of multiprocessing.Pool and dask.distributed.client.Client. Default is None.
- seed: None or int, optional
Seed for random number generation, used to make results reproducible across different runs. If None, results are not reproducible. Default is None.
- blobs_dtype: object or None, optional
Object that can be converted to a data type object describing the blobs. If None, this will be inferred from the first blob. Default is None.
- filepath: string, pathlib.Path or None, optional
Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.
- resume: bool, optional
If True, resume from previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
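Several parameters above fall back to a derived value when left at None. The following is a minimal sketch of those rules as stated in the parameter descriptions; the helper `resolve_defaults` and its `pool_size` argument are hypothetical names used only for illustration and are not part of nautilus.

```python
def resolve_defaults(n_dim, n_live=2000, pool_size=1):
    """Return the values the None defaults resolve to, per the descriptions above."""
    return {
        # n_update: if None, use n_live.
        "n_update": n_live,
        # n_points_min: if None, use n_dim + 50.
        "n_points_min": n_dim + 50,
        # n_like_new_bound: if None, use 10 times n_live.
        "n_like_new_bound": 10 * n_live,
        # n_batch: smallest multiple of the pool size that is at least 100
        # (integer ceiling division avoids floating-point rounding).
        "n_batch": -(-100 // pool_size) * pool_size,
    }


defaults = resolve_defaults(n_dim=5, n_live=2000, pool_size=8)
print(defaults)
```

With a pool of 8 processes, for instance, the batch size becomes 104 rather than 100, so every process receives the same number of likelihood evaluations per step.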
- Raises:
- ValueError
If prior is a function and n_dim is not given or pass_dict is set to True. If the dimensionality of the problem is less than 2.
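To make the expected shapes of the prior and likelihood arguments concrete, here is a minimal sketch of a function-based prior and a vectorized log-likelihood. The uniform range of [-5, 5], the Gaussian likelihood, and the constant `N_DIM` are assumptions chosen for illustration; only NumPy is used, and the sampler itself is not called.

```python
import numpy as np

N_DIM = 3  # hypothetical dimensionality, chosen for illustration


def prior(u):
    """Map points from the unit hypercube to parameters uniform on [-5, 5]."""
    return 10.0 * np.asarray(u) - 5.0


def likelihood(x):
    """Natural logarithm of a standard-normal likelihood, up to a constant.

    Accepts a single point with shape (n_dim,) or a batch with shape
    (n_points, n_dim); a batch returns an array with shape (n_points,).
    """
    x = np.asarray(x)
    return -0.5 * np.sum(x**2, axis=-1)


u = np.random.default_rng(0).random((4, N_DIM))  # 4 points in the unit cube
x = prior(u)
print(likelihood(x[0]))     # one value for a single point
print(likelihood(x).shape)  # (4,) for a batch of 4 points
```

Under these assumptions, the two functions could be passed as `nautilus.Sampler(prior, likelihood, n_dim=N_DIM, vectorized=True)`. Because `prior` is a plain function here, `n_dim` has to be given explicitly.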
- run(f_live=0.01, n_shell=1, n_eff=10000, n_like_max=inf, discard_exploration=False, timeout=inf, verbose=False)
Run the sampler until convergence.
- Parameters:
- f_live: float, optional
Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.
- n_shell: int, optional
Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.
- n_eff: float, optional
Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.
- n_like_max: int, optional
Maximum total number of likelihood evaluations, counted across multiple runs. Regardless of progress, the sampler will not start new likelihood computations once this value is reached. Note that this value includes likelihood calls from previous runs, if applicable. Default is infinity.
- discard_exploration: bool, optional
Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.
- timeout: float, optional
Timeout interval in seconds. The sampler will not start new likelihood computations if this limit is reached. Unlike for n_like_max, this maximum only refers to the current function call. Default is infinity.
- verbose: bool, optional
If True, print information about sampler progress. Default is False.
- Returns:
- success: bool
Whether the run finished successfully without stopping prematurely. False if the run finished because the n_like_max or timeout limits were reached and True otherwise.
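Since the timeout applies only to the current call and the return value reports whether the run stopped prematurely, one possible pattern is to call run in bounded chunks until it reports success. The sketch below assumes that a later call to run continues from the state the previous call left behind; `run_in_chunks` and `FakeSampler` are hypothetical helpers, and the latter merely stands in for a real sampler so the example runs on its own.

```python
def run_in_chunks(sampler, chunk_seconds, max_chunks, **run_kwargs):
    """Call sampler.run repeatedly with a per-call timeout.

    Returns True as soon as one call reports success and False if all
    max_chunks calls stopped prematurely.
    """
    for _ in range(max_chunks):
        if sampler.run(timeout=chunk_seconds, **run_kwargs):
            return True
    return False


class FakeSampler:
    """Hypothetical stand-in whose run() succeeds on the third call."""

    def __init__(self):
        self.calls = 0

    def run(self, timeout=float("inf"), **kwargs):
        self.calls += 1
        return self.calls >= 3


fake = FakeSampler()
print(run_in_chunks(fake, chunk_seconds=600.0, max_chunks=5))  # True
print(fake.calls)  # 3
```

If n_like_max is passed through `run_kwargs` and has already been reached, every call returns False, so the loop ends after `max_chunks` attempts instead of running indefinitely.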
- posterior(return_as_dict=None, equal_weight=False, equal_weight_boost=1.0, return_blobs=False)
Return the posterior sample estimate.
- Parameters:
- return_as_dict: bool or None, optional
If True, return points as a dictionary. If None, defaults to False unless a custom prior that only returns dictionaries is used. Default is None.
- equal_weight: bool, optional
If True, return an equal-weighted posterior. This is done by randomly sampling each point from the unequal-weighted posterior proportional to its weight. Note that this resampling discards information and therefore degrades the posterior estimate. For high-precision estimates of the posterior, use the unequal-weighted posterior. Default is False.
- equal_weight_boost: float, optional
For the equal-weighted posterior, each point is sampled n times, where n is drawn from a nearest-integer distribution with mean value w / max(w) * equal_weight_boost. Here, max(w) is the maximum weight across all points in the posterior. For equal_weight_boost = 1, each point is sampled at most once, i.e., the posterior estimate contains no duplicates. For equal_weight_boost > 1, duplicates are possible, but the equal-weighted posterior is a better approximation to the unequal-weighted posterior. Note that the number of points returned is, on average, proportional to equal_weight_boost. Default is 1.0.
- return_blobs: bool, optional
If True, return the blobs. Default is False.
- Returns:
- points: numpy.ndarray or dict
Points of the posterior.
- log_w: numpy.ndarray
Logarithm of the weight of each point of the posterior.
- log_l: numpy.ndarray
Logarithm of the likelihood at each point of the posterior.
- blobs: numpy.ndarray, optional
Blobs for each point of the posterior. Only returned if return_blobs is True.
- Raises:
- ValueError
If return_as_dict or return_blobs is True but the sampler has not been run in a way that makes this possible.
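The equal-weight resampling described for equal_weight_boost can be sketched as follows. This is an illustration, not the library's implementation: it interprets the "nearest-integer distribution" as stochastic rounding (round the mean down or up, with the probability of rounding up equal to its fractional part), and the function name `equal_weight_counts` is hypothetical.

```python
import numpy as np


def equal_weight_counts(log_w, equal_weight_boost=1.0, seed=None):
    """Draw how often each point is repeated in an equal-weighted sample."""
    rng = np.random.default_rng(seed)
    log_w = np.asarray(log_w, dtype=float)
    # Mean repetition count w / max(w) * boost, computed via log-weights.
    mean = np.exp(log_w - np.max(log_w)) * equal_weight_boost
    # Stochastic rounding: floor(mean), plus one with probability equal
    # to the fractional part of the mean.
    base = np.floor(mean)
    counts = base + (rng.random(mean.shape) < mean - base)
    return counts.astype(int)


log_w = np.log([0.5, 0.25, 0.125, 0.125])
counts = equal_weight_counts(log_w, equal_weight_boost=1.0, seed=0)
points = np.arange(4)  # placeholder for the posterior points
equal_weighted = np.repeat(points, counts)
print(counts.max() <= 1)  # True: no duplicates for a boost of 1
```

With a boost of 1 the highest-weight point is always kept exactly once and every other point at most once, which matches the "no duplicates" statement above.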
- property n_eff
Estimate the total effective sample size \(N_{\rm eff}\).
- Returns:
- n_eff: float
Estimate of the total effective sample size \(N_{\rm eff}\).
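For intuition about effective sample size, the widely used Kish estimate can be computed from a set of log-weights such as the log_w returned by posterior. Whether the n_eff property uses exactly this formula internally is an assumption; the sketch below only illustrates the concept, and `kish_n_eff` is a hypothetical helper name.

```python
import numpy as np


def kish_n_eff(log_w):
    """Kish effective sample size, (sum w)^2 / sum(w^2), from log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    # Rescaling by the maximum weight leaves the ratio unchanged and
    # avoids overflow when exponentiating.
    w = np.exp(log_w - np.max(log_w))
    return np.sum(w) ** 2 / np.sum(w**2)


print(kish_n_eff(np.zeros(100)))          # equal weights
print(kish_n_eff(np.log([1.0, 1e-12])))   # one dominant point
```

Equal weights give an effective sample size equal to the number of points, while a single dominant weight drives it towards 1.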
- property log_z
Estimate the global evidence \(\log \mathcal{Z}\).
- Returns:
- log_z: float or None
Estimate of the global evidence \(\log \mathcal{Z}\).
- property eta
Estimate the asymptotic sampling efficiency \(\eta\).
The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.
- Returns:
- eta: float
Estimate of the asymptotic sampling efficiency.
- shell_bound_occupation(fractional=True)
Determine how many points of each shell are also part of each bound.
- Parameters:
- fractional: bool, optional
If True, return fractional occupation numbers; if False, return absolute ones. Default is True.
- Returns:
- m: numpy.ndarray
Two-dimensional array with occupation numbers. The element at index \((i, j)\) corresponds to the occupation of points in shell \(i\) that also belong to bound \(j\). If fractional is True, this is the fraction of all points in shell \(i\) and otherwise it is the absolute number.
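The relation between the absolute and fractional forms of the occupation matrix can be illustrated with made-up numbers. The arrays `n_points` and `m_abs` below are hypothetical values, not output of the sampler; the sketch only shows that the fractional form divides each row by the number of points in the corresponding shell.

```python
import numpy as np

# Hypothetical number of points in each of three shells.
n_points = np.array([10, 8, 5])

# Hypothetical absolute occupation: m_abs[i, j] is the number of points
# in shell i that also belong to bound j.
m_abs = np.array([[10, 2, 0],
                  [8, 8, 1],
                  [5, 5, 5]])

# Fractional occupation: divide row i by the number of points in shell i.
m_frac = m_abs / n_points[:, np.newaxis]
print(m_frac)
```

Because a point can belong to several bounds at once, the rows of the fractional matrix need not sum to 1.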
- class nautilus.Prior
Initialize a prior without any parameters.
- add_parameter(key=None, dist=(0, 1))
Add a model parameter to the prior.
- Parameters:
- key: str or None
Name of the model parameter. If None, the key name will be x_i, where i is a number.
- dist: float, tuple, str or object
Distribution the parameter should follow. If a float, the parameter is fixed to this value and will not be fitted in any analysis. If a tuple, it gives the lower and upper bounds of a uniform distribution. If a string, the parameter will always be equal to the named model parameter. Finally, if an object, it must have an isf attribute, i.e., the inverse survival function.
- Raises:
- TypeError
If key or dist is not the correct type.
- ValueError
If a new key already exists in the key list or if dist is a string but does not refer to a previously defined key.
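The tuple and object forms of dist can be illustrated by mapping a unit-hypercube coordinate to a parameter value. This is a sketch under stated assumptions, not the library's code: `Exponential` and `unit_to_parameter` are hypothetical names, and evaluating the inverse survival function at 1 - u (treating u as a cumulative probability) is an assumption about how an isf-style object would be used.

```python
import math


class Exponential:
    """Hypothetical distribution exposing an inverse survival function."""

    def __init__(self, rate):
        self.rate = rate

    def isf(self, q):
        # Survival function S(x) = exp(-rate * x), so its inverse is
        # x = -ln(q) / rate.
        return -math.log(q) / self.rate


def unit_to_parameter(u, dist):
    """Map u in (0, 1) to a parameter value for a tuple or isf-style dist."""
    if isinstance(dist, tuple):
        low, high = dist
        return low + (high - low) * u  # uniform between the two bounds
    # Assumption: u is treated as a cumulative probability, so the
    # inverse survival function is evaluated at 1 - u.
    return dist.isf(1.0 - u)


print(unit_to_parameter(0.5, (-5, 5)))           # 0.0
print(unit_to_parameter(0.5, Exponential(1.0)))  # median of the distribution
```

Frozen distributions from scipy.stats also provide an isf method, so they are natural candidates for the object form of dist.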