This page details the methods and classes provided by the nautilus
module.
Top-Level Interface
- class nautilus.Sampler(prior, likelihood, n_dim=None, n_live=2000, n_update=None, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, n_networks=4, neural_network_kwargs={}, prior_args=[], prior_kwargs={}, likelihood_args=[], likelihood_kwargs={}, n_batch=None, n_like_new_bound=None, vectorized=False, pass_dict=None, pool=None, seed=None, blobs_dtype=None, filepath=None, resume=True)
Initialize the sampler.
- Parameters:
- prior: function or nautilus.Prior
Prior describing the mapping of the unit hypercube to the parameters.
- likelihood: function
Function returning the natural logarithm of the likelihood.
- n_dim: int, optional
Number of dimensions of the likelihood function. If not specified, it will be inferred from the prior argument, but this requires prior to be an instance of nautilus.Prior.
- n_live: int, optional
Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.
- n_update: None or int, optional
The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.
- enlarge_per_dim: float, optional
Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.
- n_points_min: int or None, optional
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.
- split_threshold: float, optional
Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
- n_networks: int, optional
Number of networks used in the estimator. Default is 4.
- neural_network_kwargs: dict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- prior_args: list, optional
List of extra positional arguments for prior. Only used if prior is a function.
- prior_kwargs: dict, optional
Dictionary of extra keyword arguments for prior. Only used if prior is a function.
- likelihood_args: list, optional
List of extra positional arguments for likelihood.
- likelihood_kwargs: dict, optional
Dictionary of extra keyword arguments for likelihood.
- n_batch: int or None, optional
Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, this should be a multiple of the number of parallel processes. If None, it will be the smallest multiple of the pool size used for likelihood calls that is at least 100. Default is None.
- n_like_new_bound: None or int, optional
The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.
- vectorized: bool, optional
If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.
- pass_dict: bool or None, optional
If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise.
- pool: None, object, int or tuple, optional
Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if specifying a tuple, the first entry specifies the pool used for likelihood calls and the second one the pool for sampler calculations. Default is None.
- seed: None or int, optional
Seed for random number generation used for reproducible results across different runs. If None, results are not reproducible. Default is None.
- blobs_dtype: object or None, optional
Object that can be converted to a data type object describing the blobs. If None, this will be inferred from the first blob. Default is None.
- filepath: string, pathlib.Path or None, optional
Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.
- resume: bool, optional
If True, resume from a previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
- Raises:
- ValueError
If prior is a function and n_dim is not given or pass_dict is set to True. If the dimensionality of the problem is less than 2.
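As an illustration of the vectorized option above, a minimal sketch of a vectorized log-likelihood operating on arrays; the Gaussian form is a hypothetical example, not part of nautilus:

```python
import numpy as np

def log_likelihood(points):
    """Vectorized log-likelihood: `points` has shape (n_points, n_dim),
    and the return value has shape (n_points,)."""
    # Hypothetical standard-normal log-likelihood, up to a constant.
    return -0.5 * np.sum(points**2, axis=-1)

# A batch of 5 points in 3 dimensions yields 5 log-likelihood values.
batch = np.zeros((5, 3))
print(log_likelihood(batch).shape)  # (5,)
```

With vectorized=False, the same function would instead receive a single point of shape (n_dim,) per call.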
- run(f_live=0.01, n_shell=1, n_eff=10000, n_like_max=inf, discard_exploration=False, timeout=inf, verbose=False)
Run the sampler until convergence.
- Parameters:
- f_live: float, optional
Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.
- n_shell: int, optional
Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.
- n_eff: float, optional
Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.
- n_like_max: int, optional
Maximum total (across multiple runs) number of likelihood evaluations. Regardless of progress, the sampler will not start new likelihood computations if this value is reached. Note that this value includes likelihood calls from previous runs, if applicable. Default is infinity.
- discard_exploration: bool, optional
Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.
- timeout: float, optional
Timeout interval in seconds. The sampler will not start new likelihood computations if this limit is reached. Unlike n_like_max, this maximum only refers to the current function call. Default is infinity.
- verbose: bool, optional
If True, print information about sampler progress. Default is False.
- Returns:
- success: bool
Whether the run finished successfully without stopping prematurely. False if the run finished because the n_like_max or timeout limits were reached, and True otherwise.
- posterior(return_as_dict=None, equal_weight=False, return_blobs=False)
Return the posterior sample estimate.
- Parameters:
- return_as_dict: bool or None, optional
If True, return points as a dictionary. If None, will default to False unless one uses a custom prior that only returns dictionaries. Default is None.
- equal_weight: bool, optional
If True, return an equal-weighted posterior. Default is False.
- return_blobs: bool, optional
If True, return the blobs. Default is False.
- Returns:
- points: numpy.ndarray or dict
Points of the posterior.
- log_w: numpy.ndarray
Logarithm of the weight of each point of the posterior.
- log_l: numpy.ndarray
Logarithm of the likelihood at each point of the posterior.
- blobs: numpy.ndarray, optional
Blobs for each point of the posterior. Only returned if return_blobs is True.
- Raises:
- ValueError
If return_as_dict or return_blobs are True but the sampler has not been run in a way that makes that possible.
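Since posterior() returns logarithmic weights, a common follow-up is normalizing them before computing summary statistics. A sketch with synthetic arrays standing in for sampler output:

```python
import numpy as np

# Synthetic stand-ins for the (points, log_w) arrays returned by
# posterior(); a real run would produce these.
points = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
log_w = np.array([-1.0, -2.0, -3.0])

# Convert log-weights to normalized weights in a numerically stable way.
w = np.exp(log_w - np.max(log_w))
w /= np.sum(w)

# Weighted posterior mean of each parameter.
mean = np.average(points, weights=w, axis=0)
print(mean)
```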
- property n_eff
Estimate the total effective sample size \(N_{\rm eff}\).
- Returns:
- n_eff: float
Estimate of the total effective sample size \(N_{\rm eff}\).
- property log_z
Estimate the global evidence \(\log \mathcal{Z}\).
- Returns:
- log_z: float or None
Estimate of the global evidence \(\log \mathcal{Z}\).
- property eta
Estimate the asymptotic sampling efficiency \(\eta\).
The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.
- Returns:
- eta: float
Estimate of the asymptotic sampling efficiency.
- shell_bound_occupation(fractional=True)
Determine how many points of each shell are also part of each bound.
- Parameters:
- fractional: bool, optional
Whether to return the absolute or fractional dependence. Default is True.
- Returns:
- m: numpy.ndarray
Two-dimensional array with occupation numbers. The element at index \((i, j)\) corresponds to the occupation of points in shell \(i\) that also belong to bound \(j\). If fractional is True, this is the fraction of all points in shell \(i\); otherwise, it is the absolute number.
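The occupation matrix described above can be illustrated with synthetic counts; the arrays below are hypothetical, not sampler output:

```python
import numpy as np

# m[i, j]: number of points of shell i that lie inside bound j. A point
# can fall inside several bounds, so rows need not sum to the shell size;
# n_points[i] is the total number of points in shell i.
m = np.array([[4, 4, 1],
              [0, 5, 3],
              [0, 0, 6]], dtype=float)
n_points = np.array([4.0, 5.0, 6.0])

# fractional=True corresponds to dividing each row by the shell size.
fractional = m / n_points[:, None]
print(fractional[0])
```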
- class nautilus.Prior
Initialize a prior without any parameters.
- add_parameter(key=None, dist=(0, 1))
Add a model parameter to the prior.
- Parameters:
- key: str or None
Name of the model parameter. If None, the key name will be x_i, where i is a number.
- dist: float, tuple, str or object
Distribution the parameter should follow. If a float, the parameter is fixed to this value and will not be fitted in any analysis. If a tuple, it gives the lower and upper bound of a uniform distribution. If a string, the parameter will always be equal to the named model parameter. Finally, if an object, it must have an isf attribute, i.e. the inverse survival function.
- Raises:
- TypeError
If key or dist is not the correct type.
- ValueError
If a new key already exists in the key list or if dist is a string but does not refer to a previously defined key.
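Any object exposing an isf method works as dist. The minimal uniform distribution below is a hypothetical stand-in for, e.g., a frozen scipy.stats distribution:

```python
class Uniform:
    """Minimal distribution object exposing `isf`, the inverse survival
    function, as required by add_parameter."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def isf(self, q):
        # isf(q) = ppf(1 - q) for a uniform distribution on [low, high].
        return self.low + (1.0 - q) * (self.high - self.low)

dist = Uniform(-5.0, 5.0)
print(dist.isf(0.5))  # median: 0.0
```

Such an object could then be passed as, e.g., prior.add_parameter('a', dist=dist).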
Full Documentation
nautilus.sampler
Module implementing the Nautilus sampler.
- class nautilus.sampler.Sampler(prior, likelihood, n_dim=None, n_live=2000, n_update=None, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, n_networks=4, neural_network_kwargs={}, prior_args=[], prior_kwargs={}, likelihood_args=[], likelihood_kwargs={}, n_batch=None, n_like_new_bound=None, vectorized=False, pass_dict=None, pool=None, seed=None, blobs_dtype=None, filepath=None, resume=True)
Bases:
object
Initialize the sampler.
- Parameters:
- prior: function or nautilus.Prior
Prior describing the mapping of the unit hypercube to the parameters.
- likelihood: function
Function returning the natural logarithm of the likelihood.
- n_dim: int, optional
Number of dimensions of the likelihood function. If not specified, it will be inferred from the prior argument, but this requires prior to be an instance of nautilus.Prior.
- n_live: int, optional
Number of so-called live points. New bounds are constructed so that they encompass the live points. Default is 2000.
- n_update: None or int, optional
The maximum number of additions to the live set before a new bound is created. If None, use n_live. Default is None.
- enlarge_per_dim: float, optional
Along each dimension, outer ellipsoidal bounds are enlarged by this factor. Default is 1.1.
- n_points_min: int or None, optional
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 50. Default is None.
- split_threshold: float, optional
Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound prior to enlarging is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
- n_networks: int, optional
Number of networks used in the estimator. Default is 4.
- neural_network_kwargs: dict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- prior_args: list, optional
List of extra positional arguments for prior. Only used if prior is a function.
- prior_kwargs: dict, optional
Dictionary of extra keyword arguments for prior. Only used if prior is a function.
- likelihood_args: list, optional
List of extra positional arguments for likelihood.
- likelihood_kwargs: dict, optional
Dictionary of extra keyword arguments for likelihood.
- n_batch: int or None, optional
Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, this should be a multiple of the number of parallel processes. If None, it will be the smallest multiple of the pool size used for likelihood calls that is at least 100. Default is None.
- n_like_new_bound: None or int, optional
The maximum number of likelihood calls before a new bound is created. If None, use 10 times n_live. Default is None.
- vectorized: bool, optional
If True, the likelihood function can receive multiple input sets at once. For example, if the likelihood function receives arrays, it should be able to take an array with shape (n_points, n_dim) and return an array with shape (n_points). Similarly, if the likelihood function accepts dictionaries, it should be able to process dictionaries where each value is an array with shape (n_points). Default is False.
- pass_dict: bool or None, optional
If True, the likelihood function expects model parameters as dictionaries. If False, it expects regular numpy arrays. Default is to set it to True if prior was a nautilus.Prior instance and False otherwise.
- pool: None, object, int or tuple, optional
Pool used for parallelization of likelihood calls and sampler calculations. If None, no parallelization is performed. If an integer, the sampler will use a multiprocessing.Pool object with the specified number of processes. Finally, if specifying a tuple, the first entry specifies the pool used for likelihood calls and the second one the pool for sampler calculations. Default is None.
- seed: None or int, optional
Seed for random number generation used for reproducible results across different runs. If None, results are not reproducible. Default is None.
- blobs_dtype: object or None, optional
Object that can be converted to a data type object describing the blobs. If None, this will be inferred from the first blob. Default is None.
- filepath: string, pathlib.Path or None, optional
Path to the file where results are saved. Must have a ‘.h5’ or ‘.hdf5’ extension. If None, no results are written. Default is None.
- resume: bool, optional
If True, resume from a previous run if filepath exists. If False, start from scratch and overwrite any previous file. Default is True.
- Raises:
- ValueError
If prior is a function and n_dim is not given or pass_dict is set to True. If the dimensionality of the problem is less than 2.
- add_bound(verbose=False)
Try building a new bound from existing points.
If the new bound would be larger than the previous bound, reject the new bound.
- Parameters:
- verbose: bool, optional
If True, print additional information. Default is False.
- Returns:
- success: bool
Whether a new bound has been added.
- add_samples(shell, verbose=False)
Add samples to a shell.
The number of new points added is always equal to the batch size.
- Parameters:
- shell: int
The index of the shell for which to add points.
- verbose: bool, optional
If True, print additional information. Default is False.
- Returns:
- n_update: int
Number of new samples with likelihood equal to or higher than the likelihood threshold of the bound.
- asymptotic_sampling_efficiency()
Estimate the asymptotic sampling efficiency \(\eta\).
The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.
- Returns:
- eta: float
Estimate of the asymptotic sampling efficiency.
- property discard_exploration
Return whether the exploration phase is discarded.
- Returns:
- discard_exploration: bool
Whether the exploration phase is discarded.
- effective_sample_size()
Estimate the total effective sample size \(N_{\rm eff}\).
- Returns:
- n_eff: float
Estimate of the total effective sample size \(N_{\rm eff}\).
- property eta
Estimate the asymptotic sampling efficiency \(\eta\).
The asymptotic sampling efficiency is defined as \(\eta = \lim_{N_{\rm like} \to \infty} N_{\rm eff} / N_{\rm like}\). This is set after the exploration phase. However, the estimate will be updated based on what is found in the sampling phase.
- Returns:
- eta: float
Estimate of the asymptotic sampling efficiency.
- evaluate_likelihood(points)
Evaluate the likelihood for a given set of points.
- Parameters:
- points: numpy.ndarray
Points at which to evaluate the likelihood.
- Returns:
- log_l: numpy.ndarray
Natural log of the likelihood of each point.
- blobs: list, dict or None
Blobs associated with the points, if any.
- Raises:
- ValueError
If self.blobs_dtype is not None but the likelihood function does not return blobs.
- evidence()
Estimate the global evidence \(\log \mathcal{Z}\).
- Returns:
- log_z: float
Estimate of the global evidence \(\log \mathcal{Z}\).
- property f_live
Estimate the fraction of the evidence contained in the live set.
This estimate can be used as a stopping criterion.
- Returns:
- f_live: float
Estimate of the fraction of the evidence in the live set.
- property log_v_live
Estimate the volume that is currently contained in the live set.
- Returns:
- log_v_live: float
Estimate of the volume in the live set.
- property log_z
Estimate the global evidence \(\log \mathcal{Z}\).
- Returns:
- log_z: float or None
Estimate of the global evidence \(\log \mathcal{Z}\).
- property n_eff
Estimate the total effective sample size \(N_{\rm eff}\).
- Returns:
- n_eff: float
Estimate of the total effective sample size \(N_{\rm eff}\).
- posterior(return_as_dict=None, equal_weight=False, return_blobs=False)
Return the posterior sample estimate.
- Parameters:
- return_as_dict: bool or None, optional
If True, return points as a dictionary. If None, will default to False unless one uses a custom prior that only returns dictionaries. Default is None.
- equal_weight: bool, optional
If True, return an equal-weighted posterior. Default is False.
- return_blobs: bool, optional
If True, return the blobs. Default is False.
- Returns:
- points: numpy.ndarray or dict
Points of the posterior.
- log_w: numpy.ndarray
Logarithm of the weight of each point of the posterior.
- log_l: numpy.ndarray
Logarithm of the likelihood at each point of the posterior.
- blobs: numpy.ndarray, optional
Blobs for each point of the posterior. Only returned if return_blobs is True.
- Raises:
- ValueError
If return_as_dict or return_blobs are True but the sampler has not been run in a way that makes that possible.
- print_status(status='', header=False, end='\n')
Print current summary statistics.
- Parameters:
- status: string, optional
Status of the sampler to be printed. Default is ‘’.
- header: bool, optional
If True, print a static header. Default is False.
- end: str, optional
String printed at the end. Default is newline.
- run(f_live=0.01, n_shell=1, n_eff=10000, n_like_max=inf, discard_exploration=False, timeout=inf, verbose=False)
Run the sampler until convergence.
- Parameters:
- f_live: float, optional
Maximum fraction of the evidence contained in the live set before building the initial shells terminates. Default is 0.01.
- n_shell: int, optional
Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.
- n_eff: float, optional
Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.
- n_like_max: int, optional
Maximum total (across multiple runs) number of likelihood evaluations. Regardless of progress, the sampler will not start new likelihood computations if this value is reached. Note that this value includes likelihood calls from previous runs, if applicable. Default is infinity.
- discard_exploration: bool, optional
Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate. Default is False.
- timeout: float, optional
Timeout interval in seconds. The sampler will not start new likelihood computations if this limit is reached. Unlike n_like_max, this maximum only refers to the current function call. Default is infinity.
- verbose: bool, optional
If True, print information about sampler progress. Default is False.
- Returns:
- success: bool
Whether the run finished successfully without stopping prematurely. False if the run finished because the n_like_max or timeout limits were reached, and True otherwise.
- sample_shell(index, shell_t=None)
Sample a batch of points uniformly from a shell.
The shell at index \(i\) is defined as the volume enclosed by the bound of index \(i\) and not enclosed by any other bound of index \(k\) with \(k > i\).
- Parameters:
- index: int
Index of the shell.
- shell_t: np.ndarray or None, optional
If not None, an array of shell associations of possible transfer points.
- Returns:
- points: numpy.ndarray
Array of shape (n_shell, n_dim) containing points sampled uniformly from the shell.
- n_bound: int
Number of points drawn within the bound at index \(i\). Will be different from n_shell if there are bounds with index \(k\) with \(k > i\).
- idx_t: np.ndarray, optional
Indices of the transfer candidates that should be transferred. Only returned if shell_t is not None.
- shell_association(points, n_max=None)
Determine the shells each point belongs to.
- Parameters:
- points: numpy.ndarray
Points for which to determine shell association.
- n_max: int, optional
The maximum number of shells to consider. Effectively, this determines the shell association at step n_max in the exploration phase. Default is to consider all shells.
- Returns:
- shell: int
Shell association for each point.
- shell_bound_occupation(fractional=True)
Determine how many points of each shell are also part of each bound.
- Parameters:
- fractional: bool, optional
Whether to return the absolute or fractional dependence. Default is True.
- Returns:
- m: numpy.ndarray
Two-dimensional array with occupation numbers. The element at index \((i, j)\) corresponds to the occupation of points in shell \(i\) that also belong to bound \(j\). If fractional is True, this is the fraction of all points in shell \(i\); otherwise, it is the absolute number.
- update_shell_info(index)
Update the shell information for calculation of summary statistics.
- Parameters:
- index: int
Index of the shell.
- write(filepath, overwrite=False)
Write the sampler to disk.
- Parameters:
- filepath: string or pathlib.Path
Path to the file. Must have a ‘.h5’ or ‘.hdf5’ extension.
- overwrite: bool, optional
Whether to overwrite an existing file. Default is False.
- Raises:
- ValueError
If file extension is not ‘.h5’ or ‘.hdf5’.
- RuntimeError
If file exists and overwrite is False.
- write_shell_update(filepath, shell)
Update the sampler data for a single shell.
- Parameters:
- filepath: string or pathlib.Path
Path to the file. Must have a ‘.h5’ or ‘.hdf5’ extension.
- shell: int
Shell index for which to write the update.
nautilus.prior
Module implementing the prior bounds and convenience functions.
- class nautilus.prior.Prior
Bases:
object
Initialize a prior without any parameters.
- add_parameter(key=None, dist=(0, 1))
Add a model parameter to the prior.
- Parameters:
- key: str or None
Name of the model parameter. If None, the key name will be x_i, where i is a number.
- dist: float, tuple, str or object
Distribution the parameter should follow. If a float, the parameter is fixed to this value and will not be fitted in any analysis. If a tuple, it gives the lower and upper bound of a uniform distribution. If a string, the parameter will always be equal to the named model parameter. Finally, if an object, it must have an isf attribute, i.e. the inverse survival function.
- Raises:
- TypeError
If key or dist is not the correct type.
- ValueError
If a new key already exists in the key list or if dist is a string but does not refer to a previously defined key.
- dimensionality()
Determine the number of free model parameters.
- Returns:
- n_dimint
The number of free model parameters.
- physical_to_dictionary(phys_points)
Convert points in the physical parameter space to a dictionary.
- Parameters:
- phys_points: numpy.ndarray
Points in the physical parameter space. If more than one-dimensional, each row represents a point.
- Returns:
- param_dict: dict
Points as a dictionary. Each model parameter, including fixed ones, can be accessed via their key.
- Raises:
- ValueError
If dimensionality of phys_points does not match the prior.
- unit_to_dictionary(points)
Convert points from the unit hypercube to a dictionary.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points in the unit hypercube. If more than one-dimensional, each row represents a point.
- Returns:
- param_dict: dict
Points as a dictionary. Each model parameter, including fixed ones, can be accessed via their key.
- unit_to_physical(points)
Convert points from the unit hypercube to physical parameters.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points in the unit hypercube. If more than one-dimensional, each row represents a point.
- Returns:
- phys_points: numpy.ndarray
Points transformed into the prior volume. Has the same shape as points.
- Raises:
- ValueError
If dimensionality of points does not match the prior.
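For uniform priors specified as (low, high) tuples, the mapping above reduces to a linear rescaling; a conceptual sketch, not nautilus's implementation:

```python
import numpy as np

# Hypothetical uniform priors for two parameters, as (low, high) rows.
bounds = np.array([[-5.0, 5.0],
                   [0.0, 10.0]])

def unit_to_physical(points, bounds):
    """Linearly rescale unit-hypercube samples into the prior volume."""
    low, high = bounds[:, 0], bounds[:, 1]
    return low + points * (high - low)

u = np.array([[0.5, 0.1],
              [0.0, 1.0]])
phys = unit_to_physical(u, bounds)
print(phys)
```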
nautilus.bounds
Modules implementing various multi-dimensional bounds.
- class nautilus.bounds.Ellipsoid
Bases:
object
Ellipsoid bound.
- Attributes:
- n_dim: int
Number of dimensions.
- c: numpy.ndarray
The position of the ellipsoid.
- A: numpy.ndarray
The bounds of the ellipsoid in matrix form, i.e. \((x - c)^T A (x - c) \leq 1\).
- B: numpy.ndarray
Cholesky decomposition of the inverse of A.
- B_inv: numpy.ndarray
Inverse of B.
- rng: numpy.random.Generator
Determines random number generation.
- classmethod compute(points, enlarge_per_dim=1.1, rng=None)
Compute the bound.
- Parameters:
- points: numpy.ndarray with shape (n_points, n_dim)
A 2-D array where each row represents a point.
- enlarge_per_dim: float, optional
Along each dimension, the ellipsoid is enlarged by this factor. Default is 1.1.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: Ellipsoid
The bound.
- Raises:
- ValueError
If enlarge_per_dim is smaller than unity or the number of points does not exceed the number of dimensions.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_bound: bool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
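The membership test encoded by the A matrix, \((x - c)^T A (x - c) \leq 1\), can be sketched directly; the 2-D ellipsoid below is a hypothetical example:

```python
import numpy as np

# Hypothetical ellipsoid centered at c with semi-axes 2 and 1.
c = np.array([1.0, 1.0])
A = np.diag([1.0 / 2.0**2, 1.0 / 1.0**2])

def contains(points, c, A):
    """Vectorized check of (x - c)^T A (x - c) <= 1."""
    d = np.atleast_2d(points) - c
    return np.einsum('ni,ij,nj->n', d, A, d) <= 1.0

result = contains(np.array([[1.0, 1.0], [3.5, 1.0]]), c, A)
print(result)  # the center is inside; (3.5, 1.0) is outside
```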
- property log_v
Return the natural log of the volume of the bound.
- Returns:
- log_v: float
The natural log of the volume.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- group: h5py.Group
HDF5 group to read from.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: Ellipsoid
The bound.
- reset(rng=None)
Reset random number generation and any progress, if applicable.
- Parameters:
- rngNone or numpy.random.Generator, optional
Determines random number generation. If None, random number generation is not reset. Default is None.
- sample(n_points=100)
Sample points from the bound.
- Parameters:
- n_points: int, optional
How many points to draw. Default is 100.
- Returns:
- points: numpy.ndarray
Points as two-dimensional array of shape (n_points, n_dim).
- transform(points, inverse=False)
Transform points into the coordinate frame of the ellipsoid.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- inverse: bool, optional
By default, the coordinates are transformed from the regular coordinates to the coordinates in the ellipsoid. If inverse is set to True, this function does the inverse operation, i.e. transforms from the ellipsoidal to the regular coordinates. Default is False.
- Returns:
- points_t: numpy.ndarray
Transformed points.
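Using the attributes defined above (B is the Cholesky factor of the inverse of A), the transform amounts to multiplication by B_inv, which maps interior points into the unit ball. A sketch with a hypothetical diagonal ellipsoid:

```python
import numpy as np

# Hypothetical ellipsoid with A = diag(1/4, 1), centered at the origin.
A = np.diag([0.25, 1.0])
B = np.linalg.cholesky(np.linalg.inv(A))  # A^{-1} = B B^T
B_inv = np.linalg.inv(B)

def transform(points, inverse=False):
    """Map points into the ellipsoid frame, or back if inverse=True."""
    return points @ (B.T if inverse else B_inv.T)

# A boundary point of the ellipsoid lands on the unit sphere.
x = np.array([[2.0, 0.0]])
print(np.linalg.norm(transform(x), axis=1))  # [1.]
```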
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- group: h5py.Group
HDF5 group to write to.
- class nautilus.bounds.NautilusBound
Bases:
object
Union of multiple non-overlapping neural network-based bounds.
The bound is the overlap of the union of multiple neural network-based bounds and the unit hypercube.
- Attributes:
- n_dim: int
Number of dimensions.
- neural_bounds: list
List of the individual neural network-based bounds.
- outer_bound: Union
Outer bound used for sampling.
- rng: None or numpy.random.Generator
Determines random number generation.
- points: numpy.ndarray
Points that a call to sample will return next.
- n_sample: int
Number of points sampled from the outer bound.
- n_reject: int
Number of points rejected due to not falling into the neural network-based bounds.
- classmethod compute(points, log_l, log_l_min, log_v_target, enlarge_per_dim=1.1, n_points_min=None, split_threshold=100, n_networks=4, neural_network_kwargs={}, pool=None, rng=None)
Compute a union of multiple neural network-based bounds.
- Parameters:
- points: numpy.ndarray with shape (m, n)
A 2-D array where each row represents a point.
- log_l: numpy.ndarray of length m
Likelihood of each point.
- log_l_min: float
Target likelihood threshold of the bound.
- log_v_target: float
Expected volume of the bound. Used for multi-ellipsoidal decomposition.
- enlarge_per_dim: float, optional
Along each dimension, the ellipsoid of the outer bound is enlarged by this factor. Default is 1.1.
- n_points_min: int or None, optional
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with fewer than twice that number will not be split further. If None, uses n_points_min = n_dim + 1. Default is None.
- split_threshold: float, optional
Threshold used for splitting the multi-ellipsoidal bound used for sampling. If the volume of the bound is larger than split_threshold times the target volume, the multi-ellipsoidal bound is split further, if possible. Default is 100.
- n_networks: int, optional
Number of networks used in the emulator. Default is 4.
- neural_network_kwargs: dict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- pool: multiprocessing.Pool, optional
Pool used for parallel processing.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: NautilusBound
The bound.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_bound: bool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
- property log_v
Return the natural log of the volume.
- Returns:
- log_v: float
An estimate of the natural log of the volume. Will become more accurate as more points are sampled.
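The volume estimate above comes from rejection sampling: the acceptance fraction times the outer volume estimates the bound's volume, which is why it tightens as more points are sampled. A conceptual sketch with a hypothetical 2-D bound:

```python
import numpy as np

rng = np.random.default_rng(42)

# Outer bound: the unit square (volume 1). Inner bound: the disk of
# radius 0.5 around (0.5, 0.5), with true volume pi / 4, about 0.785.
points = rng.random((100000, 2))
inside = np.sum((points - 0.5)**2, axis=1) <= 0.25

# Volume estimate: outer volume times the acceptance fraction.
log_v = np.log(1.0 * np.mean(inside))
print(log_v)  # close to log(pi / 4), about -0.2416
```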
- property n_ell
Return the number of ellipsoids in the bound.
- Returns:
- n_ell: int
The number of ellipsoids.
- property n_net
Return the number of neural networks in the bound.
- Returns:
- n_net: int
The number of neural networks.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- group: h5py.Group
HDF5 group to read from.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: NautilusBound
The bound.
- reset(rng=None)
Reset random number generation and any progress, if applicable.
- Parameters:
- rng: None or numpy.random.Generator, optional
Determines random number generation. If None, random number generation is not reset. Default is None.
- sample(n_points=100, return_points=True, pool=None)
Sample points from the bound.
- Parameters:
- n_points: int, optional
How many points to draw. Default is 100.
- return_points: bool, optional
If True, return sampled points. Otherwise, sample internally until at least n_points are saved.
- pool: multiprocessing.Pool, optional
Pool used for parallel processing.
- Returns:
- points: numpy.ndarray
Points as two-dimensional array of shape (n_points, n_dim).
- update(group)
Update bound information previously written to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
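Conceptually, a NautilusBound-style sampler draws candidates from an outer geometric bound and keeps only those that a learned score function places above the likelihood threshold. The sketch below illustrates that rejection loop with plain numpy; `sample_outer` and `score` are hypothetical stand-ins for the multi-ellipsoidal union and the network prediction, not nautilus's internal API.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_outer(n):
    # Stand-in for sampling from the outer multi-ellipsoidal bound.
    return rng.random((n, 2))

def score(x):
    # Stand-in for the neural network score prediction.
    return -np.sum((x - 0.5) ** 2, axis=-1)

score_min = -0.05  # points scoring above this are considered inside

def sample(n_points):
    # Draw candidates in batches and keep those above the score threshold.
    points = np.empty((0, 2))
    while len(points) < n_points:
        cand = sample_outer(256)
        points = np.vstack([points, cand[score(cand) > score_min]])
    return points[:n_points]

pts = sample(100)
```

Every returned point satisfies the score cut, mirroring how `sample` on the real bound only yields points inside the neural bound.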
- class nautilus.bounds.NeuralBound
Bases:
object
Neural network-based bound.
- Attributes:
- n_dimint
Number of dimensions.
- outer_boundUnitCubeEllipsoidMixture
Outer bound around the points above the likelihood threshold.
- emulatorobject
Emulator based on sklearn.neural_network.MLPRegressor used to fit and predict likelihood scores.
- score_predict_minfloat
Minimum score predicted by the emulator to be considered part of the bound.
- classmethod compute(points, log_l, log_l_min, enlarge_per_dim=1.1, n_networks=4, neural_network_kwargs={}, pool=None, rng=None)
Compute a neural network-based bound.
- Parameters:
- pointsnumpy.ndarray with shape (m, n)
A 2-D array where each row represents a point.
- log_lnumpy.ndarray of length m
Likelihood of each point.
- log_l_minfloat
Target likelihood threshold of the bound.
- enlarge_per_dimfloat, optional
Along each dimension, the ellipsoid of the outer bound is enlarged by this factor. Default is 1.1.
- n_networksint, optional
Number of networks used in the emulator. Default is 4.
- neural_network_kwargsdict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- poolmultiprocessing.Pool, optional
Pool used for parallel processing.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundNeuralBound
The bound.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- pointsnumpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_boundbool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to read from.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundNeuralBound
The bound.
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
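The idea behind a neural bound can be sketched with scikit-learn directly: fit a regressor to likelihood scores and declare points inside the bound when their predicted score exceeds the threshold. This is a minimal illustration, not nautilus's implementation; the toy `log_l` function and the single un-normalized network are assumptions for brevity (the real class uses an ensemble and normalized scores).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def log_l(x):
    # Toy likelihood: a Gaussian centered in the unit square.
    return -0.5 * np.sum(((x - 0.5) / 0.1) ** 2, axis=-1)

points = rng.random((500, 2))
scores = log_l(points)
log_l_min = np.median(scores)  # target likelihood threshold

# Regress the likelihood score on the coordinates.
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                   random_state=0).fit(points, scores)

def contains(x):
    # A point is inside the bound if its predicted score beats the threshold.
    return net.predict(np.atleast_2d(x)) > log_l_min

inside = contains(points)
```

This mirrors the role of `score_predict_min`: the bound is the region where the emulator's prediction clears a minimum score.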
- class nautilus.bounds.Union
Bases:
object
Union of multiple ellipsoids or unit cube-ellipsoid mixtures.
- Attributes:
- n_dimint
Number of dimensions.
- enlarge_per_dimfloat
Along each dimension, ellipsoids are enlarged by this factor.
- n_points_minint or None
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with less than twice that number will not be split further.
- cubeUnitCube or None
If not None, the bound is defined as the overlap with the unit cube.
- points_boundslist
The points used to create the individual bounds. Used to split bounds further.
- boundslist
List of individual bounds.
- log_v_allnumpy.ndarray
Natural log of the volume of each individual bound.
- blocknumpy.ndarray
Array indicating for each bound whether it should not be split further, either because the volume would not decrease or because the number of points is too low.
- pointsnumpy.ndarray
Points that a call to sample will return next.
- n_sampleint
Number of points sampled from all bounds.
- n_rejectint
Number of points rejected due to overlap.
- rngnumpy.random.Generator
Determines random number generation.
- classmethod compute(points, enlarge_per_dim=1.1, n_points_min=None, unit=True, bound_class=<class 'nautilus.bounds.basic.Ellipsoid'>, rng=None)
Compute the bound.
Upon creation, the bound consists of a single individual bound.
- Parameters:
- pointsnumpy.ndarray with shape (n_points, n_dim)
A 2-D array where each row represents a point.
- enlarge_per_dimfloat, optional
Along each dimension, the ellipsoid is enlarged by this factor. Default is 1.1.
- n_points_minint or None, optional
The minimum number of points each ellipsoid should have. Effectively, ellipsoids with less than twice that number will not be split further. If None, uses n_points_min = n_dim + 1. Default is None.
- unitbool, optional
Whether the bound is restricted to the overlap with the unit cube. Default is True.
- bound_classclass, optional
Type of the individual bounds, i.e. ellipsoids or unit cube-ellipsoid mixtures.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundUnion
The bound.
- Raises:
- ValueError
If n_points_min is smaller than the number of dimensions plus one.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- pointsnumpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_boundbool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
- property log_v
Return the natural log of the volume of the bound.
- Returns:
- log_vfloat
The natural log of the volume.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to read from.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundUnion
The bound.
- reset(rng=None)
Reset random number generation and any progress, if applicable.
- Parameters:
- rngNone or numpy.random.Generator, optional
Determines random number generation. If None, random number generation is not reset. Default is None.
- sample(n_points=100)
Sample points from the bound.
- Parameters:
- n_pointsint, optional
How many points to draw. Default is 100.
- Returns:
- pointsnumpy.ndarray
Points as two-dimensional array of shape (n_points, n_dim).
- split(allow_overlap=True)
Split the largest bound in the union.
- Parameters:
- allow_overlapbool, optional
Whether to allow splitting the largest bound if doing so creates overlaps between bounds. Cannot be False if the individual bounds are cube-ellipsoid mixtures. Default is True.
- Returns:
- successbool
Whether it was possible to split any bound.
- Raises:
- ValueError
If allow_overlap is False and the individual bounds are cube-ellipsoid mixtures.
- trim(threshold=1000.0)
Drop the lowest-density bound, if possible.
Density is defined as the ratio of each bound’s number of points to its volume.
- Parameters:
- thresholdfloat, optional
Only drop the lowest-density bound if its density is lower than the median density of all other bounds by at least a factor of threshold. Default is 1000.
- Returns:
- successbool
Whether it was possible to drop a bound. Will always return False if there is only one bound in the union.
- update(group)
Update bound information previously written to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
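Sampling uniformly from a union of overlapping bounds (the role of n_sample and n_reject above) requires rejecting points in proportion to how many individual bounds contain them. A minimal one-dimensional sketch, using intervals as stand-ins for ellipsoids, illustrates the standard volume-weighted rejection scheme; the interval endpoints are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two overlapping 1-D "bounds": intervals on the unit line.
bounds = [(0.0, 0.6), (0.4, 1.0)]
volumes = np.array([hi - lo for lo, hi in bounds])

def n_contain(x):
    # Number of individual bounds containing the point.
    return sum((lo <= x) & (x < hi) for lo, hi in bounds)

def sample(n_points):
    out = []
    while len(out) < n_points:
        # Pick a bound proportional to its volume, then draw from it.
        i = rng.choice(len(bounds), p=volumes / volumes.sum())
        lo, hi = bounds[i]
        x = rng.uniform(lo, hi)
        # Accept with probability 1 / n_contain(x) so the union is
        # sampled uniformly despite the overlap region being drawable
        # from either bound.
        if rng.random() < 1.0 / n_contain(x):
            out.append(x)
    return np.array(out)

pts = sample(1000)
```

Every rejection here would increment n_reject in the real class, which is also what makes the volume estimate log_v improve as more points are drawn.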
- class nautilus.bounds.UnitCube
Bases:
object
Unit (hyper)cube bound.
The \(n_{\rm dim}\)-dimensional unit hypercube has parameters \(x_i\) with \(0 \leq x_i < 1\) for all \(i\).
- Attributes:
- n_dimint
Number of dimensions.
- rngnumpy.random.Generator
Determines random number generation.
- classmethod compute(n_dim, rng=None)
Compute the bound.
- Parameters:
- n_dimint
Number of dimensions.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundUnitCube
The bound.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- pointsnumpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_boundbool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
- property log_v
Return the natural log of the volume of the bound.
- Returns:
- log_vfloat
The natural log of the volume of the bound.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to read from.
- rngNone or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- boundUnitCube
The bound.
- reset(rng=None)
Reset random number generation and any progress, if applicable.
- Parameters:
- rngNone or numpy.random.Generator, optional
Determines random number generation. If None, random number generation is not reset. Default is None.
- sample(n_points=100, pool=None)
Sample points from the bound.
- Parameters:
- n_pointsint, optional
How many points to draw. Default is 100.
- poolignored
Not used. Present for API consistency.
- Returns:
- pointsnumpy.ndarray
Points as two-dimensional array of shape (n_points, n_dim).
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
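The unit cube's behavior is simple enough to sketch in a few lines of numpy: sampling is a uniform draw, log_v is zero because the volume is one, and contains accepts a single point or a collection. This is an illustrative stand-in, not the class itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim = 3

def sample(n_points=100):
    # Uniform draw from the unit cube; its volume is 1, so log_v = 0.
    return rng.random((n_points, n_dim))

def contains(points):
    # Accepts a single point (1-D) or a collection of points (2-D);
    # reduces over the last axis, so the return is a bool or bool array.
    points = np.asarray(points)
    return np.all((points >= 0) & (points < 1), axis=-1)

pts = sample(5)
in_bound = contains(pts)
```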
- class nautilus.bounds.UnitCubeEllipsoidMixture
Bases:
object
Mixture of a unit cube and an ellipsoid.
Along dimensions where an ellipsoid has a smaller volume than the unit cube, the boundary is defined by the ellipsoid; along all other dimensions, it is defined by the unit cube.
- Attributes:
- n_dim: int
Number of dimensions.
- dim_cube: numpy.ndarray
Whether the boundary in each dimension is defined via the unit cube.
- cube: UnitCube or None
Unit cube defining the boundary along certain dimensions.
- ellipsoid: Ellipsoid or None
Ellipsoid defining the boundary along certain dimensions.
- classmethod compute(points, enlarge_per_dim=1.1, rng=None)
Compute the bound.
- Parameters:
- points: numpy.ndarray with shape (n_points, n_dim)
A 2-D array where each row represents a point.
- enlarge_per_dim: float, optional
Along each dimension, the ellipsoid is enlarged by this factor. Default is 1.1.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: UnitCubeEllipsoidMixture
The bound.
- contains(points)
Check whether points are contained in the bound.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- in_bound: bool or numpy.ndarray
Bool or array of bools describing for each point whether it is contained in the bound.
- property log_v
Return the natural log of the volume of the bound.
- Returns:
- log_v: float
The natural log of the volume.
- classmethod read(group, rng=None)
Read the bound from an HDF5 group.
- Parameters:
- group: h5py.Group
HDF5 group to read from.
- rng: None or numpy.random.Generator, optional
Determines random number generation. Default is None.
- Returns:
- bound: UnitCubeEllipsoidMixture
The bound.
- reset(rng=None)
Reset random number generation and any progress, if applicable.
- Parameters:
- rngNone or numpy.random.Generator, optional
Determines random number generation. If None, random number generation is not reset. Default is None.
- sample(n_points=100)
Sample points from the bound.
- Parameters:
- n_points: int, optional
How many points to draw. Default is 100.
- Returns:
- points: numpy.ndarray
Points as two-dimensional array of shape (n_points, n_dim).
- transform(points)
Transform points into the frame of the cube-ellipsoid mixture.
Along dimensions where the boundary is not defined via the unit range, the coordinates are transformed into the range [-1, +1]. Along all other dimensions, they are transformed into the coordinate system of the bounding ellipsoid.
- Parameters:
- points: numpy.ndarray
A 1-D or 2-D array containing a single point or a collection of points. If more than one-dimensional, each row represents a point.
- Returns:
- points_t: numpy.ndarray
Transformed points.
- write(group)
Write the bound to an HDF5 group.
- Parameters:
- group: h5py.Group
HDF5 group to write to.
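The per-dimension choice and the transform can be sketched as follows. This is a loose illustration under stated assumptions: the "ellipsoid" per dimension is just an interval fitted to the points, and the spread cut-off standing in for the volume comparison is invented for the example.

```python
import numpy as np

points = np.array([[0.05, 0.40],
                   [0.95, 0.45],
                   [0.50, 0.60]])

lo, hi = points.min(axis=0), points.max(axis=0)
# Crude stand-in for the volume comparison: keep the unit-cube boundary
# where the points already fill most of [0, 1), else fit an interval.
dim_cube = (hi - lo) > 0.5

def transform(p):
    # Dimensions bounded by an interval are mapped onto [-1, +1];
    # unit-cube dimensions are left unchanged.
    p_t = np.array(p, dtype=float)
    mask = ~dim_cube
    p_t[..., mask] = 2 * (p_t[..., mask] - lo[mask]) / (hi[mask] - lo[mask]) - 1
    return p_t

t = transform(points)
```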
nautilus.neural
Module implementing neural network emulators.
- class nautilus.neural.NeuralNetworkEmulator
Bases:
object
Likelihood neural network emulator.
- Attributes:
- meannumpy.ndarray
Mean of input coordinates used for normalizing coordinates.
- scalenumpy.ndarray
Standard deviation of input coordinates used for normalizing coordinates.
- networksklearn.neural_network.MLPRegressor
Artificial neural network used for emulation.
- predict(x)
Calculate the emulator likelihood prediction for a group of points.
- Parameters:
- xnumpy.ndarray
Normalized coordinates of the training points.
- Returns:
- y_emunumpy.ndarray
Emulated normalized likelihood value of the training points.
- classmethod read(group)
Read the emulator from an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to read from.
- Returns:
- emulatorNeuralNetworkEmulator
The likelihood neural network emulator.
- classmethod train(x, y, n_networks=4, neural_network_kwargs={}, pool=None)
Initialize and train the likelihood neural network emulator.
- Parameters:
- xnumpy.ndarray
Input coordinates.
- ynumpy.ndarray
Target values.
- n_networksint, optional
Number of networks used in the emulator. Default is 4.
- neural_network_kwargsdict, optional
Non-default keyword arguments passed to the constructor of MLPRegressor.
- poolmultiprocessing.Pool, optional
Pool used for parallel processing.
- Returns:
- emulatorNeuralNetworkEmulator
The likelihood neural network emulator.
- write(group)
Write the emulator to an HDF5 group.
- Parameters:
- grouph5py.Group
HDF5 group to write to.
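The attributes above (mean, scale, and an ensemble of MLPRegressor networks) suggest the following sketch of training and prediction: normalize the inputs, fit several networks with different seeds, and average their predictions. This is an assumed illustration built on scikit-learn, not the emulator's actual code; the toy target function is made up.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

x = rng.random((300, 2))
y = np.sin(4 * x[:, 0]) + x[:, 1]  # toy target values

# Normalize input coordinates, as the emulator stores mean and scale.
mean, scale = x.mean(axis=0), x.std(axis=0)
x_norm = (x - mean) / scale

# Train several networks with different seeds (n_networks = 4).
networks = [
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                 random_state=i).fit(x_norm, y)
    for i in range(4)
]

def predict(x_new):
    # Average the ensemble's predictions on normalized coordinates.
    x_n = (np.atleast_2d(x_new) - mean) / scale
    return np.mean([net.predict(x_n) for net in networks], axis=0)

y_emu = predict(x)
```

Averaging several independently initialized networks reduces the variance of any single fit, which is presumably why n_networks defaults to more than one.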
- nautilus.neural.train_network(x, y, neural_network_kwargs, random_state)
Train a network.
- Parameters:
- xnumpy.ndarray
Input coordinates.
- ynumpy.ndarray
Target values.
- neural_network_kwargsdict
Keyword arguments passed to the constructor of MLPRegressor.
- random_stateint
Determines random number generation.
- Returns:
- networkMLPRegressor
The trained network.