Example ========================================= SysCorr comes with a simple demonstration example in the example/ folder. We will go through the code. Data ------ First, gen.py generates example data for analysis: .. code-block:: bash $ python gen.py For instance, it creates the sysline/ dataset, which contains 40 datapoints: .. code-block:: bash $ ls sysline/chain chain00 chain04 chain08 chain12 chain16 chain20 chain24 chain28 chain32 chain36 chain01 chain05 chain09 chain13 chain17 chain21 chain25 chain29 chain33 chain37 chain02 chain06 chain10 chain14 chain18 chain22 chain26 chain30 chain34 chain38 chain03 chain07 chain11 chain15 chain19 chain23 chain27 chain31 chain35 chain39 You can look at each data point, which is made up of a chain of values to represent the uncertainty (like a markov chain): .. code-block:: bash $ gnuplot > plot [0:1] [-1:1] "sysline/chain01" When you prepare your own data, make sure they are scaled so that they are in the units where you expect a relationship. I.e. if you are looking for a power law, take the logarithm of the values. If your x/y values lie outside the -20:20 range, you have to adapt the transform functions in the code (or scale the data). Run ------- Launch the example application now: .. code-block:: bash $ RESUME=1 python analyse.py out/sysline_ sysline/chain?? This will take some time, so lets find out what the analysis.py file does. You will write a similar file for your application: .. literalinclude:: ../example/analyse.py :start-after: # handle command line arguments :end-before: # choose models 1. The output will be written to 'out/sysline\_' (first argument) 2. Files are the data files (second argument and onwards). 3. If you set RESUME=1, then the analysis will continue. Otherwise, it will start from scratch. Models -------- Next, the models are defined: .. literalinclude:: ../example/analyse.py :start-after: # choose models :end-before: # allow choosing model via environment variable If you want to compare a linear correlation to a non-correlation, you would have 'independent_uniform' and 'line'. The next code chunk just allows you to specify models from the command line without editing code: .. literalinclude:: ../example/analyse.py :start-after: # allow choosing model via environment variable :end-before: # load data Calculation --------------- All that is left is to load the data and send it off! .. literalinclude:: ../example/analyse.py :start-after: # load data Output -------------- As an example, the text output can look like this:: (...) Total Samples: 8011 Nested Sampling ln(Z): -38.456223 ln(ev)= -38.109756655272683 +/- 0.13633196813580595 Total Likelihood Evaluations: 8011 Sampling finished. Exiting MultiNest analysing data from out/sysline_independent_normal.txt Analysing multinest output... analysing data from out/sysline_independent_normal.txt posterior plot written to "out/sysline_independent_normalpost_model_data.pdf" calculation results: ======================================== model name | log Z independent_uniform : -21.8 independent_normal : -16.6 tribinned : -11.3 square : -7.9 line : -6.7 <- BEST MODEL Parameters for model "line": syserror: 0.18+/-0.04 a: 1.8+/-0.2 b: 0.17+/-0.08 This gives you the best model (line), and the parameters of that line. (Input was y=2*x + 0.1 with error 0.1) .. figure:: line.png :alt: posterior prediction for line model The model with several permitted parameters overplotted over the data. For each model prediction, red is the median line, while grey shows the 10%/90% quantiles. The data points are shown in blue using the x/y median and the permitted x/y extrema for the rectangles. .. figure:: independent.png :alt: posterior prediction for independent_uniform model Same as above, but for the model with independent uniform distribution for y, i.e. y is uncorrelated of x. This model is ruled out by the model selection (see text output above).