BXA/Xspec¶

Load your data, then define your background model and source model as usual, using calls to PyXSpec functions. See the file examples/example_simplest.py for an instructive example of the simplest analysis.

Defining priors¶

Create a list of prior transformations as in the example above, one entry per variable. The following functions will help you with that.

bxa.xspec.create_uniform_prior_for(model, par)

Use for location variables (positions). The uniform prior gives equal weight on a linear (non-logarithmic) scale.

bxa.xspec.create_jeffreys_prior_for(model, par)

Use for scale variables (orders of magnitude). The Jeffreys prior gives equal weight to each order of magnitude between the minimum and maximum value, i.e. it is flat in logarithmic scale.

bxa.xspec.create_custom_prior_for(model, par, transform, aftertransform=lambda x: x)

Pass your own prior weighting transformation.

See examples/example_simplest.py for a simple example. examples/example_advanced_priors.py introduces more complex and custom priors.
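As a sketch of how a custom prior might look: the transform receives a unit-cube value u in [0, 1] and must return the parameter value; mapping through the inverse CDF of the desired distribution achieves this. The Gaussian prior and the photon-index use case below are hypothetical illustrations, not from the examples.

```python
from scipy.stats import norm

def gaussian_transform(u):
    # Map a unit-cube value u in [0, 1] to a parameter value drawn
    # from a hypothetical Gaussian prior (mean 1.9, sigma 0.15),
    # e.g. for a power-law photon index, via the inverse CDF.
    return norm.ppf(u, loc=1.9, scale=0.15)

# It would then be passed as (requires a loaded PyXSpec model):
# transformations = [
#     bxa.xspec.create_custom_prior_for(model, model.powerlaw.PhoIndex,
#                                       gaussian_transform),
# ]
```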

Running the analysis¶

A convenience method, standard_analysis, is provided for you, which does everything. You need to specify a prefix, outputfiles_basename, under which the output files are stored.

bxa.xspec.standard_analysis(transformations, outputfiles_basename, skipsteps=[], **kwargs)[source]

Default analysis which produces nice plots:

• runs nested sampling analysis, creates MCMC chain file
• marginal probabilities (1d and 2d)
• model posterior predictions + data overplotted, convolved
• model posterior predictions unconvolved
• quantile-quantile plot with statistics
• prints out summary of parameters
• prints out model evidence

Look at the source of this function to figure out how to do the individual parts. Copy them to your scripts and adapt them to your needs.

bxa.xspec.nested_run(transformations, prior_function=None, sampling_efficiency=0.3, n_live_points=400, evidence_tolerance=0.5, outputfiles_basename='chains/', verbose=True, **kwargs)[source]

Run the Bayesian analysis with specified parameters+transformations.

If prior_function is None, uniform priors are used for the passed parameters. If no parameters are passed either, all thawed parameters are used.

Parameters:

• transformations – parameter transformation definitions
• prior_function – set only if you want to specify a custom, non-separable prior
• outputfiles_basename – prefix for output filenames

The remaining keyword arguments are passed on to MultiNest (see the PyMultiNest and MultiNest documentation).

n_live_points = 400 is often enough.

For quick results, use sampling_efficiency = 0.8, n_live_points = 50, evidence_tolerance = 5. Final results must be estimated with sampling_efficiency = 0.3 and without const_efficiency_mode; otherwise the evidence estimate is not reliable.
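The two modes above could be kept as keyword dictionaries, as in this sketch; the nested_run call itself is shown commented out, since it requires an XSPEC installation and loaded data.

```python
# Hypothetical settings dictionaries for the two modes described above.
quick = dict(sampling_efficiency=0.8, n_live_points=50,
             evidence_tolerance=5)     # fast, exploratory only
final = dict(sampling_efficiency=0.3, n_live_points=400,
             evidence_tolerance=0.5)   # reliable evidence estimates

# analyzer = bxa.xspec.nested_run(transformations,
#     outputfiles_basename='chains/', **final)
```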

Both methods return a pymultinest.Analyzer object, which provides access to the results.

The example examples/example_custom_run.py shows how to customize the analysis (e.g. other plots).

This will allow you to create marginal plots, qq plots, plots of the spectra, etc.

Marginal plots¶

For histograms (1d and 2d) of the marginal parameter distributions, use plot.marginal_plots.

Plotting of posterior parameter marginal distributions

bxa.xspec.plot.marginal_plots(analyzer, d=None)[source]

Create marginal plots

• analyzer: an instance of pymultinest.Analyzer

• d: if there are more than 20 dimensions, by default only 1d marginal distributions are plotted; set d=2 to force a 2d matrix plot

For plotting the model parameters found against the data, use these functions.

bxa.xspec.posterior_predictions_unconvolved(analyzer, transformations, component_names=None, plot_args=None, nsamples=400)[source]

Plot unconvolved model posterior predictions.

• component_names: labels to use; set to ‘ignore’ to skip plotting a component
• plot_args: matplotlib.pyplot.plot arguments for each component

bxa.xspec.posterior_predictions_convolved(analyzer, transformations, component_names=None, plot_args=None, nsamples=400)[source]

Plot convolved model posterior predictions. Also returns data points for plotting.

• component_names: labels to use; set to ‘ignore’ to skip plotting a component
• plot_args: matplotlib.pyplot.plot arguments for each component

bxa.xspec.set_best_fit(analyzer, transformations)[source]

Set the model parameters to the best fit.

Binning routines for plotting

bxa.xspec.sinning.binning(outputfiles_basename, bins, widths, data, models)[source]

Bins the data for plotting. Using the gof module, computes a Poisson goodness-of-fit range, i.e. ranges where the model must lie. This is done for multiple binning sizes simultaneously.

Returns:

• marked_binned: data points binned to contain 10 counts each, a sequence ready to be passed to matplotlib.pyplot.errorbar
• modelrange: the range allowed by the data, ready to be passed to matplotlib.pyplot.fill_between
• statistics (a goodness-of-fit measure)

Refer to the standard_analysis function as an example of how to use them.

Error propagation¶

pymultinest.Analyzer.get_equal_weighted_posterior() provides access to the posterior samples (similar to a Markov Chain). Use these to propagate errors:

• For every row in the chain, compute the quantity of interest
• Then, make a histogram of the results, or compute the mean and standard deviation.

This preserves the structure of the uncertainty (multiple modes, degeneracies, etc.)
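The two steps above can be sketched as follows. The posterior array here is faked for illustration; in practice it would come from analyzer.get_equal_weighted_posterior(), whose last column is the log-likelihood.

```python
import numpy as np

# stand-in posterior with two parameters, for illustration only
rng = np.random.default_rng(42)
posterior = np.column_stack([
    rng.normal(-2.0, 0.1, size=1000),    # column 0: log10(norm)
    rng.normal(1.9, 0.15, size=1000),    # column 1: photon index
])

# step 1: compute the quantity of interest for every row
norms = 10 ** posterior[:, 0]

# step 2: summarize, e.g. median and 1-sigma-equivalent interval
lo, mid, hi = np.percentile(norms, [15.87, 50, 84.13])
print("norm = %.4g +%.2g -%.2g" % (mid, hi - mid, mid - lo))
```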

Continuing in Xspec: a chain file compatible with Xspec’s chain commands is written for you to <outputfiles_basename>chain.fits. In Xspec, load it using “chain load”. This sets the parameters and lets you compute flux estimates.

Model comparison¶

examples/model_compare.py shows an example of model selection. Keep in mind what model prior you would like to use.

• Case 1: Multiple models; you want to find the best one to use from there on:
• follow examples/model_compare.py, and pick the model with the highest evidence
• Case 2: Simpler and more complex models, want to find out which complexity is justified:
• follow examples/model_compare.py, and keep the models above a certain threshold
• Case 3: Multiple models which could be correct, only interested in a parameter
• Marginalize over the models: Use the posterior samples from each model, and weigh them by the relative probability of the models (weight = exp(lnZ))
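The weighting in Case 3 can be sketched with hypothetical log-evidences, assuming equal model priors:

```python
import numpy as np

# hypothetical log-evidences (lnZ) of two competing models
lnZ = np.array([-5.0, 0.0])

# relative model probabilities; subtracting the maximum before
# exponentiating avoids numerical overflow
weights = np.exp(lnZ - lnZ.max())
weights /= weights.sum()

# posterior samples from each model would then be drawn (or
# resampled) in proportion to these weights
print(weights)
```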

Example output:

jbuchner@ds42 $ python model_compare.py absorbed- line- simplest-

Model comparison
****************
model simplest- : log10(Z) = -1632.7  XXX ruled out
model absorbed- : log10(Z) =    -7.5  XXX ruled out
model line-     : log10(Z) =     0.0    <-- GOOD

The last, most likely model was used as normalization.
Uniform model priors are assumed, with a cut of log10(30) to rule out models.

jbuchner@ds42 $


Here, the second-best model, “absorbed”, is 10^7.5 times less likely than the model “line”. As this exceeds our threshold (by a lot!), we can claim the detection of an iron line!

Experiment design¶

We want to evaluate whether a planned experiment can detect features or constrain parameters, i.e. to determine the discriminatory power of future configurations/surveys/missions.

For this, simulate a few spectra using the appropriate response.

• Case 1: Can the experiment constrain the parameters?
• Analyse and check what fraction of the posterior samples lie inside/outside the region of interest.
• Case 2: Can the experiment distinguish between two models?
• Model selection as above.
• Case 3: Which sources (redshift range, luminosity, etc) can be distinguished?
• Compute a grid of spectra. Do model selection at each point in the grid.
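Case 1 boils down to a simple fraction over the posterior samples, as in this sketch (the samples are simulated stand-ins for a real posterior column):

```python
import numpy as np

# stand-in posterior for a parameter of interest, e.g. a photon index
rng = np.random.default_rng(1)
samples = rng.normal(1.9, 0.3, size=2000)

# fraction of posterior mass inside a hypothetical region of interest
inside = np.mean((samples > 1.7) & (samples < 2.1))
print("fraction inside region of interest: %.2f" % inside)
```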

Model discovery¶

Is the model the right one? Is there more in the data? These questions cannot be answered statistically, but what we can do is

1. generate ideas on what models could fit better
2. test those models for significance with model selection

For the first point, quantile-quantile (QQ) plots provide an unbinned, less noisy alternative to residual plots.

QQ plot example (left), with the corresponding spectrum for comparison (right).

In these plots, for each energy, the number of counts observed below that energy is plotted on one axis, and the number predicted by the model on the other. If model and data agreed perfectly, the result would be a straight line. Deviations indicate possible mis-fits.

This example is almost a perfect fit! You can see an offset growing at 6-7 keV, which remains at higher energies. This indicates that the data contain more counts than the model predicts there.

As the growth is in a S-shape, it is probably a Gaussian (see its cumulative density function).
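The construction described above can be sketched with plain cumulative sums (toy counts, not real data):

```python
import numpy as np

# toy observed counts and model-predicted counts per channel
observed = np.array([5, 3, 8, 2, 4, 6])
predicted = np.array([4.5, 3.5, 7.0, 2.5, 4.0, 6.5])

# cumulative counts below each energy, one per axis of the QQ plot
qq_predicted = np.cumsum(predicted)
qq_observed = np.cumsum(observed)

# a perfect fit would put every point on the diagonal
# qq_observed == qq_predicted; a persistent offset hints at a
# missing model component (e.g. an unmodelled line)
```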

Refer to the appendix of the accompanying paper for more examples.

For Xspec, the qq function in the qq module allows you to create such plots easily.

bxa.xspec.qq.qq(analyzer, markers=5, annotate=True)[source]

Create a quantile-quantile plot for model discovery (deviations in data from model).

The current data and model are used, so call set_best_fit(analyzer, transformations) beforehand to get the QQ plot at the best fit.

• markers: list of energies/channels (in whichever unit the current plotting x-axis uses), or a number of equally spaced markers between minimum and maximum

• annotate: add information to the plot

Refer to the accompanying paper, which gives an introduction and a detailed discussion of the methodology.

BXA/Sherpa