QtiSAS|QtiKWS

SANS: reduction, analysis, global instrumental fit


!!! New Option: >= v.0.15.2

Bayesian [Global, Instrumental] Fitting




Introduction

Nonlinear least-squares fitting determines model parameters by minimizing the discrepancy between experimental data and theoretical predictions. For scattering and similar experiments, multiple datasets are often fitted simultaneously using shared (global) and dataset-specific (local) parameters.

For a total of $N$ experimental points the classical chi-square is

$$ \chi^2_{\mathrm{data}} = \sum_{i=1}^{N} \left( \frac{I_{\mathrm{calc}}(q_i,\mathbf{p}) - I_{\mathrm{exp}}(q_i)} {\sigma_i} \right)^2 , $$

where

  • $I_{\mathrm{exp}}$ is the measured intensity,
  • $I_{\mathrm{calc}}$ is the model prediction,
  • $\sigma_i$ represents experimental uncertainty,
  • $\mathbf{p}$ denotes the parameter vector.

The minimization corresponds to a maximum likelihood estimate assuming Gaussian experimental errors.
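The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `chi2_data` and the three-point dataset are hypothetical and not part of QtiSAS.

```python
def chi2_data(i_calc, i_exp, sigma):
    """Sum of squared residuals, each weighted by its experimental uncertainty."""
    return sum(((c - e) / s) ** 2 for c, e, s in zip(i_calc, i_exp, sigma))

# Hypothetical three-point dataset (values chosen for easy arithmetic)
i_exp = [10.0, 5.0, 2.0]    # measured intensities
sigma = [1.0, 0.5, 0.5]     # experimental uncertainties
i_calc = [11.0, 5.0, 1.0]   # model prediction for some parameter vector p

chi2 = chi2_data(i_calc, i_exp, sigma)   # (1/1)^2 + 0^2 + (-1/0.5)^2
print(chi2)
```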

Motivation for Bayesian Fitting

Many physical models contain parameters which are:

  • weakly sensitive to data,
  • strongly correlated,
  • known approximately from previous experiments,
  • restricted by physical reasoning.

Examples include:

  • particle size ranges,
  • background levels,
  • scattering contrast,
  • instrument resolution parameters.

Pure least-squares fitting may then produce unstable or nonphysical parameter values.

Bayesian fitting introduces additional information called prior knowledge.


Bayesian Idea

Bayesian statistics combines:

  • information from measurements (data likelihood),
  • previous knowledge about parameters (prior).

Instead of asking

  • “Which parameters best fit the data?”

Bayesian fitting asks

  • “Which parameters best fit the data while remaining physically plausible?”

Mathematically this corresponds to maximizing the posterior probability

$$ P(\mathbf{p}|D) \propto P(D|\mathbf{p}) P(\mathbf{p}), $$

where

  • $P(D|\mathbf{p})$ is the likelihood from experimental data,
  • $P(\mathbf{p})$ is the prior probability.

Gaussian Priors

The most common prior assumes that a parameter is approximately known:

$$ p_j \approx \mu_j \pm \sigma_{p_j}. $$

This produces a Gaussian prior

$$ P(p_j) \propto \exp \left( -\frac{(p_j-\mu_j)^2}{2\sigma_{p_j}^2} \right). $$

In least-squares form this becomes a Bayesian penalty

$$ \chi^2_{\mathrm{Bayes}} = \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$

Total minimized quantity:

$$ \chi^2_{\mathrm{total}} = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}. $$
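A minimal sketch of the penalty and the total objective, assuming the priors are stored as a mapping from parameter name to (mean, width). The function name, the parameter names and all numbers are hypothetical.

```python
def chi2_bayes(params, priors):
    """Gaussian-prior penalty; priors maps a parameter name to (mean, width)."""
    return sum(((params[name] - mean) / width) ** 2
               for name, (mean, width) in priors.items())

# Hypothetical state of a fit: only two of the three parameters carry a prior
params = {"scale": 1.3, "radius": 52.5, "background": 0.02}
priors = {"radius": (50.0, 5.0), "background": (0.0, 0.01)}

chi2_data_value = 12.0                  # assumed value of the data term
penalty = chi2_bayes(params, priors)    # 0.5**2 + 2.0**2
chi2_total = chi2_data_value + penalty
print(penalty, chi2_total)
```

Parameters without an entry in `priors` contribute nothing, which matches a prior of infinite width.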




Example 1: Weakly Determined Background


Consider fitting scattering intensity

$$ I(q)=A\,P(q)+B $$

where $B$ is background intensity.

If the high-$q$ range, where $P(q)$ has decayed and $B$ dominates the signal, is missing, many combinations of $A$ and $B$ produce similar fits.

Without Bayesian constraint:

  • background may drift,
  • amplitude compensates incorrectly,
  • parameter errors diverge.

Introducing prior knowledge

$$ B = 0 \pm 0.01 $$

stabilizes the solution while still allowing variation.
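The effect on the parameter errors can be estimated with a small sketch. Because $I(q)$ is linear in $A$ and $B$, the covariance follows directly from the $2\times2$ normal matrix. The Guinier-type form factor $P(q)=\exp(-q^2R^2/3)$, the restricted $q$ range over which $P(q)$ changes only slightly, and all numerical values are assumptions made for illustration.

```python
import math

def param_errors(q, sigma, radius, prior_width_b=None):
    """Standard errors (err_A, err_B) for the linear model I = A*P(q) + B."""
    # Normal matrix J^T J with Jacobian columns [P(q)/sigma, 1/sigma]
    saa = sab = sbb = 0.0
    for qi in q:
        p = math.exp(-(qi * radius) ** 2 / 3.0)   # assumed Guinier-type P(q)
        saa += (p / sigma) ** 2
        sab += p / sigma ** 2
        sbb += 1.0 / sigma ** 2
    if prior_width_b is not None:
        sbb += 1.0 / prior_width_b ** 2           # Gaussian prior adds curvature for B
    det = saa * sbb - sab ** 2
    # Diagonal of the inverse of [[saa, sab], [sab, sbb]]
    return math.sqrt(sbb / det), math.sqrt(saa / det)

# Hypothetical measurement: 20 points with q*R <= 1, so P(q) stays between about 0.72 and 1
q = [0.001 * k for k in range(1, 21)]
free = param_errors(q, sigma=0.01, radius=50.0)
constrained = param_errors(q, sigma=0.01, radius=50.0, prior_width_b=0.01)
print("err_B without prior:", free[1])
print("err_B with prior   :", constrained[1])
```

With the prior, the error of $B$ can never exceed the prior width, and the error of the correlated amplitude $A$ shrinks as well.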


Example 2: Known Particle Size Range



A particle radius obtained from microscopy is

$$ R = 50 \pm 5\ \mathrm{nm}. $$

During fitting the data alone might allow unrealistically large radii.

Bayesian fitting adds the penalty

$$ \frac{(R-50)^2}{5^2} $$

which gently pulls the solution toward physically reasonable values.

This is a soft constraint, not a fixed limit.
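As a worked illustration with hypothetical trial values of $R$, the penalty grows quadratically with the deviation measured in units of the prior width:

$$ R=50:\ \frac{(50-50)^2}{5^2}=0, \qquad R=55:\ \frac{(55-50)^2}{5^2}=1, \qquad R=65:\ \frac{(65-50)^2}{5^2}=9. $$

A radius one prior width away from the microscopy value therefore costs one unit of $\chi^2$, the same as a single data point lying one standard deviation off the model curve.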


Interpretation as Regularization


Bayesian fitting with Gaussian priors is mathematically equivalent to regularized least squares:

$$ \chi^2 = ||\text{data residuals}||^2 + ||\text{parameter deviation}||^2. $$

This improves numerical stability and reduces parameter correlations.

Large prior uncertainty:

$$ \sigma_{p_j}\rightarrow\infty $$

recovers classical least-squares fitting.
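Both limits can be checked on the simplest possible model, a constant level $c$ fitted to data $y_i \pm \sigma_i$ with prior $c = \mu \pm \sigma_p$, for which the minimum of the total $\chi^2$ has a closed form. The function name and the data are hypothetical.

```python
def map_constant(y, sigma, mu, width):
    """Minimum of sum((c - y_i)^2 / sigma_i^2) + (c - mu)^2 / width^2 over c."""
    w_data = sum(1.0 / s ** 2 for s in sigma)
    wy = sum(yi / s ** 2 for yi, s in zip(y, sigma))
    w_prior = 1.0 / width ** 2
    return (wy + mu * w_prior) / (w_data + w_prior)

y = [1.0, 1.2, 0.8, 1.4]       # hypothetical data, weighted mean 1.1
sigma = [0.2, 0.2, 0.2, 0.2]

loose = map_constant(y, sigma, mu=0.0, width=1e6)    # prior is effectively absent
tight = map_constant(y, sigma, mu=0.0, width=0.01)   # prior dominates the data
print(loose, tight)
```

A very wide prior returns the ordinary weighted mean, while a very narrow one pins the result near the prior mean.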


Bayesian Fit in Practice



Bayesian terms are useful when:

  • parameters are weakly determined,
  • global fits contain many correlations,
  • physical ranges are known,
  • convergence becomes unstable.

Typical workflow:

  • perform unconstrained fit,
  • identify unstable parameters,
  • introduce Bayesian priors,
  • refine using nonlinear optimization.

Bayesian fitting therefore combines experimental information with physical knowledge to obtain stable and realistic parameter estimates.

Bayesian Extension

In many practical situations parameters are only weakly constrained by data. Prior physical knowledge can therefore be incorporated using Bayesian regularization.

Each selected parameter $p_j$ is assumed to follow a Gaussian prior

$$ p_j \sim \mathcal{N}(\mu_j,\sigma_{p_j}^2), $$

with prior mean $\mu_j$ and prior width (standard deviation) $\sigma_{p_j}$. This introduces a Bayesian penalty term

$$ \chi^2_{\mathrm{Bayes}} = \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$

The minimized objective becomes

$$ \chi^2_{\mathrm{total}} = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}, $$

which corresponds to Maximum A Posteriori (MAP) estimation.

Bayesian terms act as soft parameter constraints rather than fixed limits.


Global Fit Formulation

For $K$ datasets with $N_k$ points:

$$ \chi^2_{\mathrm{data}} = \sum_{k=1}^{K} \sum_{i=1}^{N_k} r_{k,i}^2, $$

with residuals

$$ r_{k,i} = \frac{ I_{\mathrm{calc},k}(q_{k,i}) - I_{\mathrm{exp},k}(q_{k,i}) }{\sigma_{k,i}}. $$

Total number of residuals:

$$ N=\sum_k N_k. $$

Global parameters influence all datasets simultaneously, while local parameters affect only individual datasets.
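A sketch of how such a residual vector could be assembled. The Guinier-type model, the choice of the radius as the global parameter and of scale and background as local parameters, and the data layout are all assumptions made for illustration.

```python
import math

def model(q, scale, radius, background):
    """Guinier-type curve used only for illustration."""
    return scale * math.exp(-(q * radius) ** 2 / 3.0) + background

def global_residuals(datasets, radius, local_params):
    """Concatenate r_{k,i} over all datasets.

    datasets     : list of (q, i_exp, sigma) tuples, one per dataset k
    radius       : global parameter shared by all datasets
    local_params : list of (scale, background) pairs, one per dataset k
    """
    res = []
    for (q, i_exp, sigma), (scale, background) in zip(datasets, local_params):
        for qi, ii, si in zip(q, i_exp, sigma):
            res.append((model(qi, scale, radius, background) - ii) / si)
    return res

# Two hypothetical noise-free datasets with N_1 = 3 and N_2 = 2 points
true_radius = 40.0
local_true = [(2.0, 0.1), (5.0, 0.3)]
q_sets = [[0.01, 0.02, 0.03], [0.015, 0.025]]
datasets = [(q, [model(qi, s, true_radius, b) for qi in q], [0.05] * len(q))
            for q, (s, b) in zip(q_sets, local_true)]

r_true = global_residuals(datasets, true_radius, local_true)
r_off = global_residuals(datasets, 45.0, local_true)
print(len(r_true), sum(r * r for r in r_off))
```

Changing the global radius alters the residuals of both datasets, whereas changing one `(scale, background)` pair affects only its own block of the vector.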


Optimization Algorithms

Two complementary minimization strategies are used.

Simplex Method

The Nelder–Mead simplex algorithm minimizes directly

$$ \chi^2_{\mathrm{total}}(\mathbf{p}). $$

Advantages:

  • derivative-free,
  • robust far from minimum,
  • suitable for initialization.

However, convergence close to the optimum is relatively slow.
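For orientation, a compact Nelder–Mead variant is sketched below. It is a simplified textbook version (reflection, expansion, inside contraction, shrink) written for this page, not the routine used in QtiSAS, and the straight-line test problem with a prior on the intercept is hypothetical.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-10, max_iter=2000):
    """Minimize f without derivatives, starting from the point x0."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):                     # initial simplex: x0 plus n shifted points
        v = list(x0)
        v[i] += step
        simplex.append(v)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of all vertices except the worst one
        c = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            return [c[i] + t * (worst[i] - c[i]) for i in range(n)]

        xr = along(-1.0)                   # reflection
        if f(xr) < f(best):
            xe = along(-2.0)               # expansion
            simplex[-1] = xe if f(xe) < f(xr) else xr
        elif f(xr) < f(simplex[-2]):
            simplex[-1] = xr
        else:
            xc = along(0.5)                # inside contraction
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:                          # shrink all vertices toward the best one
                simplex = [best] + [[best[i] + 0.5 * (v[i] - best[i]) for i in range(n)]
                                    for v in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0]

X = [0.0, 1.0, 2.0, 3.0]
Y = [1.0, 3.0, 5.0, 7.0]                   # exact line y = 2x + 1

def chi2_total(p):
    data = sum((p[0] * x + p[1] - y) ** 2 for x, y in zip(X, Y))   # sigma_i = 1
    prior = ((p[1] - 1.0) / 0.1) ** 2      # prior: intercept = 1 +/- 0.1
    return data + prior

best = nelder_mead(chi2_total, [0.0, 0.0])
print(best)
```

Only values of $\chi^2_{\mathrm{total}}$ are needed, so the Bayesian penalty is simply added to the objective.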


Levenberg–Marquardt Method

The Levenberg–Marquardt (LM) algorithm minimizes

$$ \chi^2=\sum_i f_i^2 $$

using a residual vector $f_i$ and Jacobian matrix

$$ J_{ij}=\frac{\partial f_i}{\partial p_j}. $$

LM combines gradient descent and Gauss–Newton methods:

$$ (J^T J + \lambda I)\Delta p = -J^T f. $$

Near the solution it converges quickly (quadratically when the residuals at the minimum are small) and enables reliable error estimation.
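A single damped step can be written out explicitly for two parameters. The sketch below solves the $2\times2$ system by Cramer's rule; the function name `lm_step` and the straight-line test problem are hypothetical, and a real LM routine would additionally adapt $\lambda$ between iterations.

```python
def lm_step(jac, res, lam):
    """Solve (J^T J + lam*I) dp = -J^T f for a two-parameter problem."""
    a11 = sum(row[0] * row[0] for row in jac) + lam
    a12 = sum(row[0] * row[1] for row in jac)
    a22 = sum(row[1] * row[1] for row in jac) + lam
    g1 = sum(row[0] * f for row, f in zip(jac, res))   # (J^T f)_1
    g2 = sum(row[1] * f for row, f in zip(jac, res))   # (J^T f)_2
    det = a11 * a22 - a12 * a12
    return [(-g1 * a22 + g2 * a12) / det, (g1 * a12 - g2 * a11) / det]

# Residuals f_i = a*x_i + b - y_i for the exact line y = 2x + 1, evaluated at (a, b) = (0, 0)
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
res = [0.0 * xi + 0.0 - yi for xi, yi in zip(x, y)]
jac = [[xi, 1.0] for xi in x]

gauss_newton = lm_step(jac, res, lam=0.0)   # lam -> 0: Gauss-Newton step
damped = lm_step(jac, res, lam=1e6)         # large lam: short step along -J^T f
print(gauss_newton, damped)
```

For this linear problem the undamped step lands on the minimum at once, while a large $\lambda$ gives a short step of approximately $-J^T f/\lambda$.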


Bayesian Residual Formulation for LM

LM solvers such as the one in the GNU Scientific Library operate on a residual vector of fixed length, which is kept equal here to the number of experimental data points $N$.

Instead of adding extra Bayesian residuals, the Bayesian contribution is distributed over all data residuals.

First, the normalized Bayesian term is defined as

$$ B = \frac{1}{N} \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$

Each residual is then modified as

$$ f_i = \mathrm{sign}(r_i) \sqrt{r_i^2 + B}. $$

This formulation guarantees

$$ \sum_i f_i^2 = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}, $$

while preserving the required residual dimension.

The Bayesian information therefore influences every data point equally during minimization.
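The identity follows from $\sum_i f_i^2 = \sum_i r_i^2 + N B$ and can be verified numerically. In the sketch below the function name and all values are hypothetical; `math.copysign` transfers the sign of $r_i$ to the modified residual.

```python
import math

def bayes_residuals(r, p, mu, width):
    """Modified residuals f_i = sign(r_i) * sqrt(r_i^2 + B)."""
    n = len(r)
    b = sum(((pj - mj) / wj) ** 2 for pj, mj, wj in zip(p, mu, width)) / n
    return [math.copysign(math.sqrt(ri * ri + b), ri) for ri in r]

r = [1.0, -2.0, 0.5]                       # hypothetical data residuals, chi2_data = 5.25
f = bayes_residuals(r, p=[55.0], mu=[50.0], width=[5.0])   # chi2_Bayes = 1

total = sum(fi * fi for fi in f)
print(f, total)
```

The length of `f` equals the length of `r`, so the solver sees the same problem size with and without priors.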


Jacobian Matrix

The LM algorithm requires derivatives of the modified residuals:

$$ J_{ij} = \frac{\partial f_i}{\partial p_j}. $$

Using the residual definition above and keeping the factor $\mathrm{sign}(r_i)$,

$$ \frac{\partial f_i}{\partial p_j} = \mathrm{sign}(r_i)\, \frac{ r_i \frac{\partial r_i}{\partial p_j} + \frac12 \frac{\partial B}{\partial p_j} }{ \sqrt{r_i^2+B} }. $$

Model contribution:

$$ \frac{\partial r_i}{\partial p_j} = \frac{1}{\sigma_i} \frac{\partial I_{\mathrm{calc}}}{\partial p_j}. $$

Bayesian contribution:

$$ \frac{\partial B}{\partial p_j} = \frac{2}{N} \frac{p_j-\mu_j}{\sigma_{p_j}^2}. $$

Thus Bayesian priors modify all Jacobian rows without increasing matrix size.
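The analytic rows can be checked against central finite differences. The sketch assumes a straight-line model $I_{\mathrm{calc}} = p_0 x + p_1$ with a prior on both parameters and keeps the factor $\mathrm{sign}(r_i)$ of $f_i$ explicitly in the derivative; all names and numbers are hypothetical.

```python
import math

def parts(p, x, y, sigma, mu, width):
    """Data residuals r_i and the normalized Bayesian term B."""
    r = [(p[0] * xi + p[1] - yi) / sigma for xi, yi in zip(x, y)]
    b = sum(((pj - mj) / wj) ** 2 for pj, mj, wj in zip(p, mu, width)) / len(r)
    return r, b

def modified(p, x, y, sigma, mu, width):
    r, b = parts(p, x, y, sigma, mu, width)
    return [math.copysign(math.sqrt(ri * ri + b), ri) for ri in r]

def jacobian(p, x, y, sigma, mu, width):
    r, b = parts(p, x, y, sigma, mu, width)
    db = [2.0 / len(r) * (pj - mj) / wj ** 2 for pj, mj, wj in zip(p, mu, width)]
    rows = []
    for xi, ri in zip(x, r):
        dr = [xi / sigma, 1.0 / sigma]          # dr_i/dp_j for the linear model
        root = math.sqrt(ri * ri + b)
        s = math.copysign(1.0, ri)              # sign(r_i)
        rows.append([s * (ri * dr[j] + 0.5 * db[j]) / root for j in range(2)])
    return rows

args = ([0.0, 1.0, 2.0], [1.0, 3.5, 4.0], 0.5, [2.0, 1.0], [1.0, 0.5])
p = [1.5, 0.8]
jac = jacobian(p, *args)

h = 1e-6
max_err = 0.0
for j in range(2):
    up, dn = list(p), list(p)
    up[j] += h
    dn[j] -= h
    f_up, f_dn = modified(up, *args), modified(dn, *args)
    for i in range(3):
        max_err = max(max_err, abs((f_up[i] - f_dn[i]) / (2 * h) - jac[i][j]))
print(max_err)
```

All residuals in this example are negative, so the comparison would fail if the sign factor were dropped.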


Relation to Regularization

The Bayesian penalty is mathematically equivalent to Tikhonov regularization:

$$ \chi^2 = ||r||^2 + ||L(\mathbf{p}-\boldsymbol{\mu})||^2, \qquad L=\mathrm{diag}(1/\sigma_{p_j}). $$

Bayesian fitting therefore stabilizes ill-posed problems, reduces parameter correlations, and suppresses nonphysical solutions.

Infinite prior width reproduces classical least squares.


Parameter Errors

After convergence the covariance matrix is obtained from

$$ C = (J^T J)^{-1}. $$

Parameter uncertainties, written $\Delta p_j$ to distinguish them from the prior widths $\sigma_{p_j}$, are

$$ \Delta p_j=\sqrt{C_{jj}}. $$

Bayesian priors effectively add statistical information, leading to finite and stable error estimates even for weakly determined parameters.


Practical Workflow

Typical fitting procedure:

  1. Simplex global search
  2. Levenberg–Marquardt refinement
  3. Covariance evaluation
  4. Uncertainty estimation

This combined Bayesian global fitting approach provides robust convergence and statistically consistent parameter errors for complex nonlinear models.

  • Global/Local Limits: select the Bayesian option instead of a range, then enter the prior mean and prior width in place of the min/max values.
fit-curve/bayesian-fit.txt · Last modified: 2026/05/08 10:02 by Vitaliy Pipich