Tutorial¶

Quickstart¶

Here is how to fit a simple likelihood function:

[1]:

paramnames = ['Hinz', 'Kunz']

def loglike(z):
    return -0.5 * (((z - 0.5) / 0.01)**2).sum()

def transform(x):
    return 10. * x - 5.

from autoemcee import ReactiveAffineInvariantSampler

sampler = ReactiveAffineInvariantSampler(paramnames, loglike, transform)

results = sampler.run()

sampler.print_results()

[autoemcee] finding starting points and running initial 100 MCMC steps

100%|██████████| 100/100 [00:00<00:00, 807.30it/s]
100%|██████████| 100/100 [00:00<00:00, 832.95it/s]
100%|██████████| 100/100 [00:00<00:00, 836.21it/s]
100%|██████████| 100/100 [00:00<00:00, 840.75it/s]

[autoemcee] rhat chain diagnostic: [1.08197523 1.09111689] (<1.010 is good)
[autoemcee] not converged yet at iteration 1 after 80400 evals
[autoemcee] Running 1000 MCMC steps ...
[autoemcee] Starting points chosen: {83}, L=-32.8
[autoemcee] Starting at [0.54827713 0.55143729] +- [4.31449488e-06 4.59671758e-06]

100%|██████████| 100/100 [00:00<00:00, 833.73it/s]
100%|██████████| 1000/1000 [00:01<00:00, 830.73it/s]

[autoemcee] Starting points chosen: {92}, L=-32.8
[autoemcee] Starting at [0.5503052  0.54975206] +- [4.78590847e-06 4.87496661e-06]


100%|██████████| 100/100 [00:00<00:00, 841.34it/s]
100%|██████████| 1000/1000 [00:01<00:00, 848.18it/s]

[autoemcee] Starting points chosen: {87}, L=-32.8
[autoemcee] Starting at [0.5517947  0.54986288] +- [5.18789023e-06 5.62474579e-06]


100%|██████████| 100/100 [00:00<00:00, 846.89it/s]
100%|██████████| 1000/1000 [00:01<00:00, 847.88it/s]

[autoemcee] Starting points chosen: {95}, L=-32.8
[autoemcee] Starting at [0.55019943 0.5495126 ] +- [4.52229076e-06 5.08853735e-06]


100%|██████████| 100/100 [00:00<00:00, 838.69it/s]
100%|██████████| 1000/1000 [00:01<00:00, 770.95it/s]

[autoemcee] Used 440800 calls in last MCMC run

[autoemcee] rhat chain diagnostic: [1.00026866 1.00017957] (<1.010 is good)
[autoemcee] converged!!!

    Hinz                0.500 +- 0.010
    Kunz                0.500 +- 0.010

The chains converged and we got parameter and uncertainty estimates.

[2]:

print("Loglikelihood was called %d times." % sampler.results['ncall'])

Loglikelihood was called 521200 times.

Visualisation¶

[3]:

import corner
corner.corner(sampler.results['samples'], labels=paramnames, show_titles=True);

Advanced usage¶

Lets try a function that cannot be described by a simple gaussian.

[4]:

paramnames = ['Hinz', 'Kunz'] #, 'Fuchs', 'Gans', 'Hofer']

def loglike_rosen(theta):
    a = theta[:-1]
    b = theta[1:]
    return -2 * (100 * (b - a**2)**2 + (1 - a)**2).sum()

def transform_rosen(u):
    return u * 20 - 10

import logging
logging.getLogger('autoemcee').setLevel(logging.DEBUG)

sampler = ReactiveAffineInvariantSampler(paramnames, loglike_rosen, transform=transform_rosen)
result = sampler.run(max_ncalls=1000000)

[autoemcee] finding starting points and running initial 100 MCMC steps

100%|██████████| 100/100 [00:00<00:00, 634.17it/s]
100%|██████████| 100/100 [00:00<00:00, 635.09it/s]
100%|██████████| 100/100 [00:00<00:00, 611.98it/s]
100%|██████████| 100/100 [00:00<00:00, 623.38it/s]

[autoemcee] rhat chain diagnostic: [1.02016236 1.01082186] (<1.010 is good)
[autoemcee] not converged yet at iteration 1 after 80400 evals
[autoemcee] Running 1000 MCMC steps ...
[autoemcee] Starting points chosen: {84}, L=-0.0
[autoemcee] Starting at [0.49732951 0.49864447] +- [0.0003298 0.0005775]


100%|██████████| 100/100 [00:00<00:00, 600.89it/s]
100%|██████████| 1000/1000 [00:01<00:00, 627.67it/s]

[autoemcee] Starting points chosen: {13}, L=-0.0
[autoemcee] Starting at [0.59658084 0.69056221] +- [0.00034056 0.00054237]


100%|██████████| 100/100 [00:00<00:00, 636.22it/s]
100%|██████████| 1000/1000 [00:01<00:00, 641.76it/s]

[autoemcee] Starting points chosen: {81}, L=-0.0
[autoemcee] Starting at [0.45616206 0.54018849] +- [0.00033495 0.00057437]


100%|██████████| 100/100 [00:00<00:00, 636.27it/s]
100%|██████████| 1000/1000 [00:01<00:00, 633.62it/s]

[autoemcee] Starting points chosen: {70}, L=-0.0
[autoemcee] Starting at [0.50447467 0.49863139] +- [0.00026029 0.00050001]


100%|██████████| 100/100 [00:00<00:00, 631.02it/s]
100%|██████████| 1000/1000 [00:01<00:00, 635.85it/s]

[autoemcee] Used 440800 calls in last MCMC run

[autoemcee] rhat chain diagnostic: [1.05343804 1.04422692] (<1.010 is good)
[autoemcee] not converged yet at iteration 2 after 521200 evals
[autoemcee] would need more likelihood calls (4408000) than maximum (1000000) for next step

This already took quite a bit more effort.

[5]:

print("Loglikelihood was called %d times." % result['ncall'])

Loglikelihood was called 521200 times.

Lets see how well it did:

[6]:

from getdist import MCSamples, plots
import matplotlib.pyplot as plt

samples_g = MCSamples(samples=result['samples'],
                       names=result['paramnames'],
                       label='Gaussian',
                       settings=dict(smooth_scale_2D=3))

mcsamples = [samples_g]

Removed no burn in

[7]:

import numpy as np
x = np.linspace(-0.5, 4, 100)
a, b = np.meshgrid(x, x)
z = -2 * (100 * (b - a**2)**2 + (1 - a)**2)

g = plots.get_single_plotter()
g.plot_2d(mcsamples, paramnames)
plt.contour(a, b, z, [-5, -1, 0], colors='red')
plt.xlim(-0.5, 2)
plt.ylim(-0.5, 4);

As you can see, the MCMC sampler (black) is approximating the rosenbrock curvature (red).

[8]:

from getdist import MCSamples, plots
import matplotlib.pyplot as plt

samples_g = MCSamples(samples=result['samples'],
                       names=result['paramnames'],
                       label='Gaussian')

mcsamples = [samples_g]

g = plots.get_subplot_plotter(width_inch=8)
g.settings.num_plot_contours = 3
g.triangle_plot(mcsamples, filled=False, contour_colors=plt.cm.Set1.colors)

#corner.corner(sampler.results['samples'], labels=sampler.results['paramnames'], show_titles=True);

Removed no burn in

WARNING:root:auto bandwidth for Hinz very small or failed (h=0.0005911494768376028,N_eff=400000.0). Using fallback (h=0.010347105781485627)
WARNING:root:auto bandwidth for Kunz very small or failed (h=0.0006298615145645845,N_eff=400000.0). Using fallback (h=0.007399624413831077)