!!! New Option: >= v.0.15.2
Nonlinear least-squares fitting determines model parameters by minimizing the discrepancy between experimental data and theoretical predictions. For scattering and similar experiments, multiple datasets are often fitted simultaneously using shared (global) and dataset-specific (local) parameters.
For a total of $N$ experimental points, the classical chi-square is
$$ \chi^2_{\mathrm{data}} = \sum_{i=1}^{N} \left( \frac{I_{\mathrm{calc}}(q_i,\mathbf{p}) - I_{\mathrm{exp}}(q_i)} {\sigma_i} \right)^2 , $$
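where $I_{\mathrm{calc}}(q_i,\mathbf{p})$ is the model intensity at $q_i$ for the parameter vector $\mathbf{p}$, $I_{\mathrm{exp}}(q_i)$ is the measured intensity, and $\sigma_i$ is its experimental uncertainty.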
The minimization corresponds to a maximum likelihood estimate assuming Gaussian experimental errors.
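As a minimal sketch, the sum above can be evaluated directly from three equal-length lists. The function name `chi2_data` and the numbers are hypothetical and serve only to illustrate the formula.

```python
def chi2_data(i_calc, i_exp, sigma):
    """Classical chi-square: sum of squared, error-weighted residuals."""
    return sum(((c - e) / s) ** 2 for c, e, s in zip(i_calc, i_exp, sigma))


# Hypothetical values, for illustration only.
i_exp = [10.0, 8.0, 5.0]
i_calc = [10.5, 7.5, 5.0]
sigma = [0.5, 0.5, 0.25]

print(chi2_data(i_calc, i_exp, sigma))  # 1 + 1 + 0 = 2.0
```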
Many physical models contain parameters which are:
Examples include:
Pure least-squares fitting may then produce unstable or nonphysical parameter values.
Bayesian fitting introduces additional information called prior knowledge.
Bayesian statistics combines:
Instead of asking
Bayesian fitting asks
Mathematically this corresponds to maximizing the posterior probability
$$ P(\mathbf{p}|D) \propto P(D|\mathbf{p}) P(\mathbf{p}), $$
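where $P(D|\mathbf{p})$ is the likelihood of the data $D$ given the parameters $\mathbf{p}$, $P(\mathbf{p})$ is the prior, and $P(\mathbf{p}|D)$ is the posterior.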
The most common prior assumes that a parameter is approximately known:
$$ p_j \approx \mu_j \pm \sigma_{p_j}. $$
This produces a Gaussian prior
$$ P(p_j) \propto \exp \left( -\frac{(p_j-\mu_j)^2}{2\sigma_{p_j}^2} \right). $$
Taking $-2\ln P(p_j)$ and summing over all parameters that carry a prior gives, in least-squares form, the Bayesian penalty
$$ \chi^2_{\mathrm{Bayes}} = \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$
Total minimized quantity:
$$ \chi^2_{\mathrm{total}} = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}. $$
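The penalty and the total objective can be sketched in a few lines. The names `chi2_bayes`, `chi2_total`, `p1`, `p2` and all numbers are hypothetical; parameters without an entry in `priors` are simply left unconstrained.

```python
def chi2_bayes(params, priors):
    """Gaussian-prior penalty. priors maps a parameter name to (mu, sigma_p)."""
    return sum(((params[name] - mu) / width) ** 2
               for name, (mu, width) in priors.items())


def chi2_total(chi2_data_value, params, priors):
    """Quantity to minimize: data term plus Bayesian penalty."""
    return chi2_data_value + chi2_bayes(params, priors)


# Hypothetical parameter set: only p1 carries a prior (1.0 +/- 0.25).
params = {"p1": 1.5, "p2": 7.0}
priors = {"p1": (1.0, 0.25)}

print(chi2_bayes(params, priors))        # (0.5 / 0.25)^2 = 4.0
print(chi2_total(12.0, params, priors))  # 12.0 + 4.0 = 16.0
```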
Consider fitting scattering intensity
$$ I(q)=A\,P(q)+B $$
where $B$ is background intensity.
If low-$q$ data are missing, many values of $B$ produce similar fits.
Without Bayesian constraint:
Introducing prior knowledge
$$ B = 0 \pm 0.01 $$
stabilizes the solution while still allowing variation.
A particle radius obtained from microscopy is
$$ R = 50 \pm 5\ \mathrm{nm}. $$
During fitting the data alone might allow unrealistically large radii.
Bayesian fitting adds the penalty
$$ \frac{(R-50)^2}{5^2} $$
which gently pulls the solution toward physically reasonable values.
This is a soft constraint, not a fixed limit.
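To see why the constraint is soft, it helps to tabulate the penalty for the microscopy prior $R = 50 \pm 5$ nm. The function name `radius_penalty` is hypothetical; the prior values are the ones quoted above.

```python
def radius_penalty(radius_nm, mu_nm=50.0, width_nm=5.0):
    """Contribution of the radius prior to chi^2_Bayes."""
    return ((radius_nm - mu_nm) / width_nm) ** 2


for radius in (50.0, 55.0, 60.0, 75.0):
    print(radius, radius_penalty(radius))
# 50.0 0.0
# 55.0 1.0
# 60.0 4.0
# 75.0 25.0
```

A radius one prior width away from the mean costs only one unit of $\chi^2$, so the data can still override the prior if they favour a different value strongly enough.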
Bayesian fitting is mathematically equivalent to regularized least squares:
$$ \chi^2 = ||\text{data residuals}||^2 + ||\text{parameter deviation}||^2. $$
This improves numerical stability and reduces parameter correlations.
In the limit of large prior uncertainty,
$$ \sigma_{p_j}\rightarrow\infty, $$
the penalty vanishes and classical least-squares fitting is recovered.
Bayesian terms are useful when:
Typical workflow:
Bayesian fitting therefore combines experimental information with physical knowledge to obtain stable and realistic parameter estimates.
In many practical situations parameters are only weakly constrained by data. Prior physical knowledge can therefore be incorporated using Bayesian regularization.
Each selected parameter $p_j$ is assumed to follow a Gaussian prior
$$ p_j \sim \mathcal{N}(\mu_j,\sigma_{p_j}^2), $$
with prior mean $\mu_j$ and prior width (standard deviation) $\sigma_{p_j}$, which introduces the Bayesian penalty term
$$ \chi^2_{\mathrm{Bayes}} = \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$
The minimized objective becomes
$$ \chi^2_{\mathrm{total}} = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}, $$
which corresponds to Maximum A Posteriori (MAP) estimation.
Bayesian terms act as soft parameter constraints rather than fixed limits.
For $K$ datasets with $N_k$ points:
$$ \chi^2_{\mathrm{data}} = \sum_{k=1}^{K} \sum_{i=1}^{N_k} r_{k,i}^2, $$
with residuals
$$ r_{k,i} = \frac{ I_{\mathrm{calc},k}(q_{k,i}) - I_{\mathrm{exp},k}(q_{k,i}) }{\sigma_{k,i}}. $$
Total number of residuals:
$$ N=\sum_k N_k. $$
Global parameters influence all datasets simultaneously, while local parameters affect only individual datasets.
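The stacking of residuals over datasets can be sketched as follows. The Guinier-like `model`, the choice of a global `radius` with one local `scale` per dataset, and the synthetic data are all assumptions made for illustration, not the models used by the program.

```python
import math


def model(q, radius, scale):
    # Hypothetical Guinier-like intensity, used only for illustration.
    return scale * math.exp(-(q * radius) ** 2 / 3.0)


def global_residuals(datasets, radius, scales):
    """Stack r_{k,i} over all datasets; radius is global, scales[k] is local."""
    residuals = []
    for (q_values, i_exp, sigma), scale in zip(datasets, scales):
        for q, i_e, s in zip(q_values, i_exp, sigma):
            residuals.append((model(q, radius, scale) - i_e) / s)
    return residuals


# Two synthetic, noise-free datasets generated from known parameters.
TRUE_RADIUS = 3.0
TRUE_SCALES = [10.0, 4.0]
Q_SETS = [[0.1, 0.2, 0.3], [0.15, 0.25]]
datasets = []
for q_values, scale in zip(Q_SETS, TRUE_SCALES):
    i_exp = [model(q, TRUE_RADIUS, scale) for q in q_values]
    sigma = [0.05 * i for i in i_exp]  # assumed 5 % relative error
    datasets.append((q_values, i_exp, sigma))

r_true = global_residuals(datasets, TRUE_RADIUS, TRUE_SCALES)
r_off = global_residuals(datasets, 3.5, TRUE_SCALES)
print(len(r_true))                 # N = N_1 + N_2 = 5
print(sum(r * r for r in r_true))  # 0.0 at the generating parameters
print(sum(r * r for r in r_off) > 0.0)
```

Changing `radius` shifts the residuals of both datasets, whereas changing `scales[0]` affects only the first three entries of the stacked vector.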
Two complementary minimization strategies are used.
The Nelder–Mead simplex algorithm minimizes directly
$$ \chi^2_{\mathrm{total}}(\mathbf{p}). $$
Advantages:
However, convergence close to the optimum is relatively slow.
The Levenberg–Marquardt (LM) algorithm minimizes
$$ \chi^2=\sum_i f_i^2 $$
using a residual vector $f_i$ and Jacobian matrix
$$ J_{ij}=\frac{\partial f_i}{\partial p_j}. $$
LM interpolates between gradient descent (large damping parameter $\lambda$) and the Gauss–Newton method (small $\lambda$):
$$ (J^T J + \lambda I)\Delta p = -J^T f. $$
Near the solution it behaves like Gauss–Newton and converges rapidly (quadratically when the residuals at the optimum are small), and its Jacobian enables reliable error estimation.
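A single LM step can be written out by hand for a two-parameter problem. This is a sketch under stated assumptions, not the solver used by the program: `lm_step` is a hypothetical helper, the 2x2 system is solved by Cramer's rule, and the test problem is a straight line $I = a\,q + b$ with unit errors.

```python
def lm_step(jac, f, lam):
    """Solve (J^T J + lam * I) dp = -J^T f for a two-parameter problem."""
    a11 = sum(row[0] * row[0] for row in jac) + lam
    a12 = sum(row[0] * row[1] for row in jac)
    a22 = sum(row[1] * row[1] for row in jac) + lam
    g1 = sum(row[0] * fi for row, fi in zip(jac, f))  # (J^T f)_1
    g2 = sum(row[1] * fi for row, fi in zip(jac, f))  # (J^T f)_2
    det = a11 * a22 - a12 * a12
    dp1 = (-g1 * a22 + g2 * a12) / det
    dp2 = (-g2 * a11 + g1 * a12) / det
    return dp1, dp2


# Hypothetical data generated from a = 2, b = 1, with sigma_i = 1.
q = [0.0, 1.0, 2.0]
y = [1.0, 3.0, 5.0]
a, b = 0.0, 0.0                                   # starting guess
f = [a * qi + b - yi for qi, yi in zip(q, y)]     # residual vector
jac = [[qi, 1.0] for qi in q]                     # df_i/da, df_i/db

print(lm_step(jac, f, 0.0))    # Gauss-Newton limit: (2.0, 1.0)
print(lm_step(jac, f, 100.0))  # strongly damped: a much shorter step
```

With $\lambda = 0$ the step reaches the exact solution because the model is linear; a large $\lambda$ shortens the step and turns it toward the steepest-descent direction.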
In this implementation the residual vector passed to the LM solver (e.g. the one from the GNU Scientific Library) keeps a fixed length equal to the number of experimental data points.
Instead of adding extra Bayesian residuals, the Bayesian contribution is distributed over all data residuals.
First, the normalized Bayesian term is defined as
$$ B = \frac{1}{N} \sum_j \frac{(p_j-\mu_j)^2}{\sigma_{p_j}^2}. $$
Each residual is then modified as
$$ f_i = \mathrm{sign}(r_i) \sqrt{r_i^2 + B}. $$
This formulation guarantees
$$ \sum_i f_i^2 = \chi^2_{\mathrm{data}} + \chi^2_{\mathrm{Bayes}}, $$
while preserving the required residual dimension.
The Bayesian information is therefore spread uniformly over the fit: the same amount $B$ is added to every squared data residual.
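The identity is easy to verify numerically. The helper name `bayes_modified_residuals` and the numbers are hypothetical. One implementation detail is an assumption of this sketch: `math.copysign` treats a residual of exactly zero as positive, which keeps the identity intact even when some $r_i = 0$.

```python
import math


def bayes_modified_residuals(r, chi2_bayes):
    """Return f_i = sign(r_i) * sqrt(r_i^2 + B) with B = chi2_bayes / N."""
    b = chi2_bayes / len(r)
    # copysign keeps the sign of r_i; a zero residual counts as positive.
    return [math.copysign(math.sqrt(ri * ri + b), ri) for ri in r]


# Hypothetical data residuals and Bayesian penalty.
r = [1.0, -2.0, 0.5]
chi2_b = 3.0

f = bayes_modified_residuals(r, chi2_b)
chi2_d = sum(ri * ri for ri in r)

print(len(f) == len(r))           # residual dimension is unchanged
print(sum(fi * fi for fi in f))   # approximately chi2_d + chi2_b = 8.25
```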
The LM algorithm requires derivatives of the modified residuals:
$$ J_{ij} = \frac{\partial f_i}{\partial p_j}. $$
Using the residual definition above and the chain rule (for $r_i \neq 0$),
$$ \frac{\partial f_i}{\partial p_j} = \mathrm{sign}(r_i)\, \frac{ r_i \frac{\partial r_i}{\partial p_j} + \frac12 \frac{\partial B}{\partial p_j} }{ \sqrt{r_i^2+B} }. $$
Model contribution:
$$ \frac{\partial r_i}{\partial p_j} = \frac{1}{\sigma_i} \frac{\partial I_{\mathrm{calc}}}{\partial p_j}. $$
Bayesian contribution (for a parameter $p_j$ that carries a prior; zero otherwise):
$$ \frac{\partial B}{\partial p_j} = \frac{2}{N} \frac{p_j-\mu_j}{\sigma_{p_j}^2}. $$
Thus Bayesian priors modify all Jacobian rows without increasing matrix size.
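The analytic derivative can be checked against a central finite difference. Differentiating $f_i = \mathrm{sign}(r_i)\sqrt{r_i^2+B}$ carries the factor $\mathrm{sign}(r_i)$ through the chain rule, and the sketch includes it. The straight-line model $I = a\,q + b$, the prior on the slope $a$, and all numbers are assumptions made for illustration.

```python
import math

# Hypothetical data and a prior a = 2.0 +/- 0.5 on the slope only.
Q = [0.0, 1.0, 2.0]
I_EXP = [1.2, 2.9, 5.3]
SIGMA = [0.1, 0.1, 0.2]
MU_A, WIDTH_A = 2.0, 0.5


def residuals(a, b):
    return [(a * q + b - y) / s for q, y, s in zip(Q, I_EXP, SIGMA)]


def big_b(a):
    """Normalized Bayesian term B = chi2_Bayes / N."""
    return ((a - MU_A) / WIDTH_A) ** 2 / len(Q)


def modified(a, b):
    bb = big_b(a)
    return [math.copysign(math.sqrt(r * r + bb), r) for r in residuals(a, b)]


def jacobian_column_a(a, b):
    """Analytic df_i/da, including the sign(r_i) factor."""
    bb = big_b(a)
    db_da = 2.0 / len(Q) * (a - MU_A) / WIDTH_A ** 2
    column = []
    for q, s, r in zip(Q, SIGMA, residuals(a, b)):
        dr_da = q / s  # model contribution
        sign = math.copysign(1.0, r)
        column.append(sign * (r * dr_da + 0.5 * db_da) / math.sqrt(r * r + bb))
    return column


a0, b0, h = 1.8, 1.0, 1e-6
analytic = jacobian_column_a(a0, b0)
numeric = [(fp - fm) / (2.0 * h)
           for fp, fm in zip(modified(a0 + h, b0), modified(a0 - h, b0))]
print(analytic)
print(numeric)
```

The first data point has $\partial r_i/\partial a = 0$ (since $q = 0$), yet its Jacobian entry is nonzero: that entry comes entirely from the prior.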
The Bayesian penalty is mathematically equivalent to Tikhonov regularization:
$$ \chi^2 = ||r||^2 + ||L(\mathbf{p}-\boldsymbol{\mu})||^2, \qquad L=\mathrm{diag}(1/\sigma_{p_j}). $$
Bayesian fitting therefore stabilizes ill-posed problems, reduces parameter correlations, and suppresses nonphysical solutions.
Infinite prior width reproduces classical least squares.
After convergence the covariance matrix is obtained from
$$ C = (J^T J)^{-1}. $$
Parameter uncertainties (written $\Delta p_j$ here to distinguish them from the prior widths $\sigma_{p_j}$) are
$$ \Delta p_j=\sqrt{C_{jj}}. $$
Bayesian priors effectively add statistical information, leading to finite and stable error estimates even for weakly determined parameters.
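For a two-parameter problem the covariance step can be written out explicitly. This is a minimal sketch: `covariance_2x2` is a hypothetical helper, and the Jacobian is that of a straight line $I = a\,q + b$ with unit errors, chosen only for illustration.

```python
import math


def covariance_2x2(jac):
    """Return C = (J^T J)^{-1} for a Jacobian with two columns."""
    a11 = sum(row[0] ** 2 for row in jac)
    a12 = sum(row[0] * row[1] for row in jac)
    a22 = sum(row[1] ** 2 for row in jac)
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("J^T J is singular; a parameter is undetermined")
    return [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]


# Hypothetical Jacobian rows (df_i/da, df_i/db) at q = 0, 1, 2.
jac = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]

cov = covariance_2x2(jac)
errors = [math.sqrt(cov[j][j]) for j in range(2)]
print(cov)     # diagonal: 0.5 and 5/6
print(errors)  # approximately [0.707, 0.913]
```

A singular $J^T J$, caught here by the determinant check, is the signature of a parameter that the residuals do not determine at all.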
Typical fitting procedure:
This combined Bayesian global fitting approach provides robust convergence and statistically consistent parameter errors for complex nonlinear models.