Warm starting
In this tutorial you will learn:
How to play with model variations
Warm start feature: How UltraNest can resume and reuse an existing run, even if you modified the data/likelihood
As a simple example, let's say we want to fit a black body.
[1]:
import numpy as np
from numpy import pi, log
import scipy.stats
import matplotlib.pyplot as plt
Black body model
[2]:
parameters = ['Temperature', 'Amplitude']
[3]:
def black_body_model(wavelength, ampl, T):
    with np.errstate(over='ignore'):
        return ampl / wavelength**5 / (np.exp(1/(wavelength*T)) - 1)
Generate some data
[4]:
Ndata = 10
wavelength = np.logspace(1, 2, Ndata)
[5]:
np.random.seed(1)
ampl_true = 42.0
T_true = 0.01 # in um^-1
background_true = 1e-9
y_true = black_body_model(wavelength, ampl_true, T_true)
sigma_true = y_true * 0.1
y_obs = np.random.normal(y_true + background_true, sigma_true, size=Ndata)
sigma = y_true * 0.1
Visualise the data
Let's plot the data first to see what is going on:
[6]:
plt.figure(figsize=(10, 5))
plt.errorbar(x=wavelength, y=y_obs, yerr=sigma, marker='x', ls=' ')
plt.plot(wavelength, y_true, ':', color='gray')
plt.ylabel('Spectral flux density [Jy]');
plt.xlabel(r'Wavelength [$\mu$m]');
Prior
Here we intentionally set very wide priors:
a uniform prior on temperature, and
a very wide log-uniform prior on the normalisation.
[7]:
def prior_transform(x):
    z = x.copy()
    # uniform prior on the temperature
    z[0] = x[0]
    # log-uniform prior on the amplitude, between 1e-10 and 1e10
    z[1] = 10**(x[1] * 20 - 10)
    return z
[8]:
plt.figure(figsize=(10, 5))
plt.title("Prior predictive checks")
plt.errorbar(x=wavelength, y=y_obs, yerr=sigma, marker='x', ls=' ')
plt.ylim(0, y_obs.max() * 10)
for i in range(20):
    T, ampl = prior_transform(np.random.uniform(size=len(parameters)))
    y_predicted = black_body_model(wavelength, ampl, T)
    plt.plot(wavelength, y_predicted, '-', color='gray')
plt.ylabel('Spectral flux density [Jy]');
plt.xlabel('Wavelength [$\\mu$m]');
First simple model
Here is a typical Gaussian likelihood with our black body function:
[9]:
def log_likelihood(params):
    T, ampl = params
    y_predicted = black_body_model(wavelength, ampl, T)
    return scipy.stats.norm(y_predicted, sigma).logpdf(y_obs).sum()
[10]:
from ultranest import ReactiveNestedSampler
reference_run_folder = 'blackbody-alldata'
sampler_ref = ReactiveNestedSampler(parameters, log_likelihood, prior_transform, log_dir=reference_run_folder, resume='overwrite')
results_ref = sampler_ref.run(frac_remain=0.5)
sampler_ref.print_results()
[ultranest] Sampling 400 live points from prior ...
[ultranest] Explored until L=2e+02
[ultranest] Likelihood function evaluations: 46953
[ultranest] Writing samples and results to disk ...
[ultranest] Writing samples and results to disk ... done
[ultranest] logZ = 159.8 +- 0.1527
[ultranest] Effective samples strategy satisfied (ESS = 981.1, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.46+-0.06 nat, need <0.50 nat)
[ultranest] Evidency uncertainty strategy is satisfied (dlogz=0.43, need <0.5)
[ultranest] logZ error budget: single: 0.19 bs:0.15 tail:0.41 total:0.43 required:<0.50
[ultranest] done iterating.
logZ = 159.705 +- 0.538
single instance: logZ = 159.705 +- 0.186
bootstrapped : logZ = 159.775 +- 0.353
tail : logZ = +- 0.406
insert order U test : converged: True correlation: inf iterations
Temperature : 0.00945│ ▁▁▁▁▁▁▁▁▂▂▂▂▄▅▄▅▅▇▇▅▆▅▅▃▃▂▂▂▁▁▁▁▁▁▁ ▁ │0.01035 0.00989 +- 0.00012
Amplitude : 37.2 │ ▁▁▁▁▁▁▁▂▂▃▃▅▄▅▆▆▇▇▆▆▄▄▃▃▂▂▁▁▁▁▁▁▁▁▁▁▁ │59.7 47.4 +- 2.7
Plot the fit
[11]:
plt.figure(figsize=(10, 5))
plt.errorbar(x=wavelength, y=y_obs, yerr=sigma, marker='x', ls=' ')
from ultranest.plot import PredictionBand
band = PredictionBand(wavelength)
for T, ampl in results_ref['samples']:
    band.add(black_body_model(wavelength, ampl, T))
band.line(color='k')
band.shade(color='k', alpha=0.5)
plt.ylabel('Spectral flux density [Jy]');
plt.xlabel('Wavelength [$\\mu$m]');
Warm starting a modified model
Let's say we alter our model slightly, by including a small constant background:
[12]:
def log_likelihood_with_background(params):
    T, ampl = params
    y_predicted = black_body_model(wavelength, ampl, T) + 1e-9
    return scipy.stats.norm(y_predicted, sigma).logpdf(y_obs).sum()
We have the same parameters, and expect the results to be only mildly different. So let's use warm starting.
Using the output file from the previous reference run …
[13]:
posterior_upoints_file = reference_run_folder + '/chains/weighted_post_untransformed.txt'
We define our accelerated likelihood and prior transform:
[14]:
from ultranest.integrator import warmstart_from_similar_file
aux_paramnames, aux_log_likelihood, aux_prior_transform, vectorized = warmstart_from_similar_file(
    posterior_upoints_file, parameters, log_likelihood_with_background, prior_transform)
Now make the accelerated run:
[15]:
sampler = ReactiveNestedSampler(aux_paramnames, aux_log_likelihood, aux_prior_transform, vectorized=vectorized)
res = sampler.run(frac_remain=0.5)
[ultranest] Sampling 400 live points from prior ...
[ultranest] Explored until L=2e+02
[ultranest] Likelihood function evaluations: 5393
[ultranest] logZ = 166.7 +- 0.0473
[ultranest] Effective samples strategy satisfied (ESS = 763.2, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.45+-0.08 nat, need <0.50 nat)
[ultranest] Evidency uncertainty strategy is satisfied (dlogz=0.41, need <0.5)
[ultranest] logZ error budget: single: 0.08 bs:0.05 tail:0.41 total:0.41 required:<0.50
[ultranest] done iterating.
[16]:
plt.figure(figsize=(10, 5))
plt.errorbar(x=wavelength, y=y_obs, yerr=sigma, marker='x', ls=' ')
from ultranest.plot import PredictionBand
band = PredictionBand(wavelength)
for T, ampl in results_ref['samples']:
    band.add(black_body_model(wavelength, ampl, T))
band.line(color='k')
band.shade(color='k', alpha=0.5)
band = PredictionBand(wavelength)
# the warm-started run carries one extra auxiliary parameter, ignored here
for T, ampl, _ in res['samples']:
    band.add(black_body_model(wavelength, ampl, T))
band.line(color='orange')
band.shade(color='orange', alpha=0.5)
plt.plot(wavelength, y_true, ':', color='gray')
plt.ylabel('Spectral flux density [Jy]');
plt.xlabel('Wavelength [$\\mu$m]');
Speed-up
[17]:
print("Speed-up of warm-start: %d%%" % ((results_ref['ncall'] / res['ncall'] - 1)*100))
Speed-up of warm-start: 770%
The cost savings grow the more similar the posterior of the modified run is to that of the original run. The speed-up increases drastically for highly informative posteriors, and it is independent of the problem dimension.
How it works & Limitations
Warm-starting works by deforming the parameter space. The prior transform function is adjusted, and the adjustment is removed by reweighting the likelihood function, to produce the same posterior. To make this work, posterior samples from the unit cube space are required. The deformation uses a factorized auxiliary distribution, based on marginal posterior quantiles.
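To illustrate the idea, here is a minimal, hypothetical one-parameter sketch. It is not UltraNest's actual implementation: log_likelihood_1d and prior_transform_1d are stand-in names, and a fixed truncated normal replaces the quantile-based auxiliary distribution.
import numpy as np
import scipy.stats

# auxiliary distribution in u-space, concentrated near the posterior guess;
# the truncation keeps all samples inside the unit interval
guess = scipy.stats.truncnorm(-5, 5, loc=0.5, scale=0.1)

def aux_prior_transform(v):
    # deform: a uniform v lands at a u concentrated near the guess
    u = guess.ppf(v[0])
    theta = prior_transform_1d(np.array([u]))
    # append the log density of the deformation as an auxiliary quantity,
    # so the likelihood can undo it
    return np.append(theta, guess.logpdf(u))

def aux_log_likelihood(params):
    theta, logq = params[:-1], params[-1]
    # subtracting log q cancels the deformation, so the posterior is
    # unchanged while sampling starts out concentrated near the guess
    return log_likelihood_1d(theta) - logq
This also explains the extra column in res['samples'] above: the warm-started run carries an auxiliary quantity alongside the physical parameters.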
The weighted_post_untransformed.txt file from a warm-started run cannot be used, because it already has a deformation applied.
Read the full documentation of warmstart_from_similar_file and the underlying get_auxiliary_contbox_parameterization function.
Warm starting from posterior samples
If you already have posterior samples, then you can create an appropriate weighted_post_untransformed.txt file. However, the inverse of the prior transformation has to be applied.
In some cases, this is easy to do analytically, e.g., for uniform priors it is just a scaling.
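For instance, the priors of this notebook can be inverted analytically (p0 and p1 denote posterior sample values of the temperature and the amplitude):
# uniform prior on [0, 1]: the inverse is the identity
u0 = p0
# log-uniform prior 10**(u * 20 - 10): invert with the base-10 logarithm
u1 = (np.log10(p1) + 10) / 20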
When the transform cannot be inverted easily
The following code works for arbitrary, factorized priors (as in the blackbody example in this notebook), for an arbitrary number of parameters.
Let's start with our posterior samples. These could be posterior samples obtained from MCMC, or generated from the parameter errors quoted in a paper. Here we take them from the reference run:
[18]:
posterior_samples = results_ref['samples']
plt.scatter(posterior_samples[:,0], posterior_samples[:,1]);
plt.xlabel('%s (p-space)' % parameters[0])
plt.ylabel('%s (p-space)' % parameters[1]);
Let's have a look at how our unit-cube prior transform works:
The first parameter has a uniform prior, the other a log-uniform prior.
[19]:
uguess = np.linspace(1e-6, 1-1e-6, 40000)
pguess = np.array([prior_transform(ui * np.ones(len(parameters))) for ui in uguess])
[20]:
plt.subplot(2, 1, 1)
plt.plot(uguess, pguess[:,0])
plt.xlabel('u-space (%s)' % parameters[0])
plt.ylabel('p-space (%s)' % parameters[0]);
plt.subplot(2, 1, 2)
plt.plot(uguess, pguess[:,1])
plt.xlabel('u-space (%s)' % parameters[1])
plt.ylabel('p-space (%s)' % parameters[1]);
Here we convert the posterior samples to u-space, finding the unit-cube value of each sample by optimization.
[21]:
import scipy.optimize
[22]:
import tqdm
[23]:
nparams = len(parameters)

u = np.ones(nparams) * 0.5
stdevs = posterior_samples.std(axis=0)

def minfunc(ui, i, u, pi):
    if not 0 < ui < 1:
        return 1e100
    u[i] = ui
    p = prior_transform(u)
    return (p[i] - pi)**2

usamples = np.empty((len(posterior_samples), nparams))
for j, sample in enumerate(tqdm.tqdm(posterior_samples)):
    for i, param in enumerate(parameters):
        # initial guess from the interpolated transform curve, refined by optimization
        ui0 = np.interp(sample[i], pguess[:,i], uguess)
        result = scipy.optimize.minimize_scalar(
            minfunc,
            args=(i, u, sample[i]),
            method='brent',
            bracket=(ui0 - 1e-4, ui0, ui0 + 1e-4),
            tol=0.001 * stdevs[i],
        )
        usamples[j,i] = result.x
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 6505/6505 [00:00<00:00, 11398.74it/s]
Let's see whether our untransformed (u-space) posterior samples are correct:
[24]:
weights = results_ref['weighted_samples']['weights']
i = np.random.choice(len(weights), p=weights, size=1000)
plt.scatter(results_ref['weighted_samples']['upoints'][i,0], results_ref['weighted_samples']['upoints'][i,1],
            color='gray', label='reference run');
plt.scatter(usamples[:,0], usamples[:,1], label='modified run, usamples reconstructed', marker='x', alpha=0.5)
plt.xlabel('u-space (%s)' % parameters[0])
plt.ylabel('u-space (%s)' % parameters[1])
plt.legend();
This looks like great agreement! We successfully untransformed the posterior samples to u-space.
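As an additional check, we could transform the reconstructed u-space samples forward again; they should closely match the original posterior samples (the achievable agreement depends on the optimizer tolerance chosen above):
# round-trip check: transforming back should recover the posterior samples
p_roundtrip = np.array([prior_transform(u) for u in usamples])
print(np.allclose(p_roundtrip, posterior_samples, rtol=0.01))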
Writing a run file for warm start
We write a weighted_post_untransformed.txt file based on our untransformed posterior samples. Since these are equally weighted, the first two columns are constants.
[25]:
weights = np.ones((len(usamples), 1)) / len(usamples)
logl = np.zeros(len(usamples)).reshape((-1, 1))

np.savetxt(
    'custom-weighted_post_untransformed.txt',
    np.hstack((weights, logl, usamples)),
    header=' '.join(['weight', 'logl'] + parameters),
    fmt='%f'
)
[26]:
!head custom-weighted_post_untransformed.txt
# weight logl Temperature Amplitude
0.000154 0.000000 0.009828 0.584218
0.000154 0.000000 0.009933 0.582950
0.000154 0.000000 0.009911 0.583998
0.000154 0.000000 0.010134 0.580971
0.000154 0.000000 0.009717 0.585696
0.000154 0.000000 0.009823 0.584407
0.000154 0.000000 0.009729 0.585907
0.000154 0.000000 0.009938 0.582713
0.000154 0.000000 0.009849 0.582950
We can now point the warmstart_from_similar_file function at this file.
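For example, we can repeat the accelerated run from above, this time starting from our hand-made file:
aux_paramnames, aux_log_likelihood, aux_prior_transform, vectorized = warmstart_from_similar_file(
    'custom-weighted_post_untransformed.txt', parameters,
    log_likelihood_with_background, prior_transform)
sampler = ReactiveNestedSampler(aux_paramnames, aux_log_likelihood, aux_prior_transform, vectorized=vectorized)
res = sampler.run(frac_remain=0.5)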
Conclusion
Warm start accelerates the computation when the approximate location of the posterior peak is already known. This allows you to:
vary the data (change the analysis pipeline)
vary model assumptions
without needing to start the computation from scratch (potentially costly).
These features are experimental and feedback is appreciated. It is recommended to do a full, clean run to obtain final, reliable results before publication.
References: “SuperNest” by Aleksandr Petrosyan and Will Handley, https://arxiv.org/abs/2212.01760