Analysis - Unfolding¶
The Concept of Unfolding¶
The quantity of interest, the propagation length of the particles, is smeared out in the detection process due to the limited accuracy and sensitivity of the detector. Hence, it is impossible to determine the physical truth from the measurement directly. In fact, the measurement yields proxy variables, which stem from the detector response to the underlying physics. The inverse process, namely the determination of the most probable physical truth on the basis of the measurements, is referred to as unfolding.
The distribution of a measured quantity \(g(y)\) is connected to the true physical distribution \(f(x)\) via the detector response \(A(x,y)\) according to the convolution integral

\[ g(y) = \int A(x, y) \, f(x) \, \mathrm{d}x + b(y) + \epsilon(y) \, . \]
This is referred to as the Fredholm integral equation of the first kind. The additional term \(b(y)\) represents some bias, which originates from contributions to the measurement that are considered as background. Since the selected sample is assumed to have a sufficiently high purity, the bias will be neglected in the following. The quantity \(\epsilon(y)\) denotes systematic uncertainties, which are not yet taken into account. The detector response function \(A(x,y)\) describes the whole detection procedure. It depends on the physical truth and has to be determined from Monte Carlo simulations.
In practice, it is necessary to discretize the phase space of the variables \(x\) and \(y\). The integral has to be transformed into a sum

\[ g_i = \sum_{j} A_{ij} f_j \quad \Longleftrightarrow \quad \textbf{g} = \textbf{A} \, \textbf{f} \, , \]
where the equation on the right represents a matrix multiplication with the detector response matrix \(\textbf{A}\) and the vectors \(\textbf{g}\) and \(\textbf{f}\). Each vector component denotes a bin entry in the distribution of the corresponding quantity.
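As a minimal numerical sketch (all numbers are made up), the discrete folding is a single matrix-vector product:

```python
import numpy as np

# Toy response matrix: rows = proxy bins, columns = truth bins.
# Each column gives the probabilities for an event from that truth bin
# to end up in the individual proxy bins (columns sum to one here).
A = np.array([[0.8, 0.1, 0.0],
              [0.2, 0.7, 0.2],
              [0.0, 0.2, 0.8]])

f = np.array([1000.0, 500.0, 100.0])  # true bin contents

g = A @ f                             # expected measured bin contents
print(g)                              # [850. 570. 180.]
```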
The main task is the reconstruction of the true distribution \(\textbf{f}\) from the measured distribution \(\textbf{g}\). In principle, this could be done by simply inverting the discrete Fredholm equation. However, the response matrix is not necessarily a square matrix, and even if the inverse \(\textbf{A}^{-1}\) exists, the solution often shows an oscillating behavior. This would be an unphysical result and can be traced back to the properties of the matrix itself: the inversion is an ill-posed problem, in which small eigenvalues of the matrix heavily amplify statistical noise in the measurement. Consequently, more sophisticated methods are needed to unfold the true spectrum.
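The noise amplification can be demonstrated with a toy study (again with invented numbers): for a badly conditioned response matrix, even ordinary Poisson fluctuations in \(\textbf{g}\) are blown up by a naive inversion.

```python
import numpy as np

rng = np.random.default_rng(42)

# Strongly smearing toy response: the columns are nearly linearly
# dependent, so the matrix has small eigenvalues.
A = np.array([[0.5, 0.4, 0.3],
              [0.3, 0.4, 0.4],
              [0.2, 0.2, 0.3]])

f_true = np.array([1000.0, 500.0, 100.0])
g_expected = A @ f_true

print(np.linalg.cond(A))                  # large condition number -> ill-posed

g_observed = rng.poisson(g_expected)      # Poisson-fluctuated measurement

f_naive = np.linalg.solve(A, g_observed)  # naive inversion
print(f_true, f_naive)                    # f_naive deviates strongly from f_true
```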
Maximum Likelihood Estimation¶
One such approach is a maximum likelihood estimation. A likelihood function is constructed by assuming a Poisson distribution for each bin of the proxy variable

\[ \mathcal{L}(\textbf{g} \mid \textbf{f}\,) = \prod_{i} \frac{\lambda_i(\textbf{f}\,)^{g_i}}{g_i!} \, \mathrm{e}^{-\lambda_i(\textbf{f}\,)} \, , \]
where the used model \(\lambda_{i}(\textbf{f}\,)\) corresponds to the discrete Fredholm equation

\[ \lambda_i(\textbf{f}\,) = \sum_{j} A_{ij} f_j \, . \]
The most probable distribution \(\textbf{f}\) on the basis of the measurements \(\textbf{g}\) maximizes the likelihood. For practical reasons, it is often useful to minimize the negative logarithm of this function instead. The result remains unchanged, but the likelihood reads

\[ -\ln \mathcal{L}(\textbf{g} \mid \textbf{f}\,) = \sum_{i} \left( \lambda_i(\textbf{f}\,) - g_i \ln \lambda_i(\textbf{f}\,) \right) + \mathrm{const.} \]
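In code, this objective is a direct transcription of the expression above (a sketch; the constant \(\ln g_i!\) terms are dropped since they do not depend on \(\textbf{f}\)):

```python
import numpy as np

def neg_log_likelihood(f, A, g):
    """Poisson negative log-likelihood of a candidate spectrum f, up to constants."""
    lam = A @ f                          # model prediction lambda_i(f)
    return np.sum(lam - g * np.log(lam))
```

Minimizing this function, e.g. with \(\texttt{scipy.optimize.minimize}\) under non-negativity bounds, yields the unregularized maximum likelihood estimate.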
The complication of oscillating solutions still remains. To overcome this problem, it is necessary to introduce additional assumptions about the smoothness of the distribution. This method is referred to as regularization. A commonly used form of regularization is the Tikhonov regularization. It introduces a penalty term which suppresses high second derivatives of the solution

\[ \mathrm{Reg}(\textbf{f}\,) = \frac{\tau}{2} \left\lVert \textbf{C} \, \textbf{f} \right\rVert^{2} \, , \]
with the regularization matrix

\[ \textbf{C} = \begin{pmatrix} -1 & 1 & & & \\ 1 & -2 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -2 & 1 \\ & & & 1 & -1 \end{pmatrix} \, . \]
It is chosen such that \(\textbf{C} \textbf{f}\) is the second derivative according to the finite difference method. The parameter \(\tau\) controls the strength of the regularization. Introducing a regularization serves as a form of bias: it is important to optimize its strength such that the oscillations are suppressed, but the spectrum is not flattened to an extent that makes the result physically unreliable. The total likelihood then reads

\[ -\ln \mathcal{L}_{\mathrm{total}}(\textbf{g} \mid \textbf{f}\,) = -\ln \mathcal{L}(\textbf{g} \mid \textbf{f}\,) + \frac{\tau}{2} \left\lVert \textbf{C} \, \textbf{f} \right\rVert^{2} \, . \]
Since bins of the target distribution \(\textbf{f}\) can be empty and the logarithm of zero is not defined, a small offset \(d\) is added. The target distribution is further subject to the detector response, which is why not \(\textbf{f}\) itself, but the logarithm of the acceptance-corrected spectrum \(\textbf{f}_{\mathrm{acc}}\) is expected to be flat. Hence, the regularization term is modified:

\[ \mathrm{Reg}(\textbf{f}\,) = \frac{\tau}{2} \left\lVert \textbf{C} \, \log_{10}\!\left( \textbf{f}_{\mathrm{acc}} + d \right) \right\rVert^{2} \, . \]
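The following sketch puts both pieces together, assuming the second-difference form of \(\textbf{C}\) from above and a penalty on \(\log_{10}\) of the offset spectrum; the acceptance correction is omitted for brevity:

```python
import numpy as np

def second_difference_matrix(n):
    """Regularization matrix C: (C f)_i approximates the second derivative of f."""
    C = np.zeros((n, n))
    for i in range(1, n - 1):
        C[i, i - 1], C[i, i], C[i, i + 1] = 1.0, -2.0, 1.0
    C[0, 0], C[0, 1] = -1.0, 1.0           # boundary rows (one common choice)
    C[-1, -2], C[-1, -1] = 1.0, -1.0
    return C

def neg_log_likelihood_regularized(f, A, g, C, tau, d=1e-8):
    """Poisson term plus Tikhonov penalty on the log spectrum."""
    lam = A @ f
    poisson = np.sum(lam - g * np.log(lam))
    penalty = 0.5 * tau * np.sum((C @ np.log10(f + d)) ** 2)
    return poisson + penalty
```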
funfolding¶
The concepts of ML unfolding have already been implemented in a Python package by Mathis Börner, a former member of the TU Dortmund group. The package is available via \(\texttt{pip install funfolding}\). In addition to the above mentioned unfolding approach, the package features tools such as MCMC sampling or the possibility of fitting nuisance parameters to the distributions, which improves the estimation of systematic uncertainties in the analysis. Funfolding has also been used in an earlier IceCube analysis (3yr NuMu analysis), where additional information can be found.
MCMC Sampling¶
Funfolding evaluates the likelihood in a Bayesian approach. A Markov Chain Monte Carlo (MCMC) sampling is used to construct an a-posteriori distribution of the form

\[ p(\textbf{f} \mid \textbf{g}) \propto \mathcal{L}(\textbf{g} \mid \textbf{f}\,) \, p(\textbf{f}\,) \, , \]
where the a-priori distribution of \(\textbf{f}\) is set to be uniform,

\[ p(\textbf{f}\,) = \mathrm{const.} \]
Each drawn sample depends only on the one sampled before it and is independent of all others. Starting from a random guess, the chain moves to another point by proposing a step drawn from a predetermined distribution. The jump is accepted with a certain probability, following the concepts of a modified Metropolis-Hastings algorithm as implemented in the \(\texttt{EnsembleSampler}\) of the \(\texttt{emcee}\) Python package. The new point is then included in the sample.
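The sampling can be illustrated with \(\texttt{emcee}\) directly; this is a self-contained toy sketch, not funfolding's actual implementation, and the response matrix and measurement are placeholders:

```python
import numpy as np
import emcee

rng = np.random.default_rng(0)

# Placeholder response matrix and measured spectrum.
A = np.array([[0.8, 0.1, 0.0],
              [0.2, 0.7, 0.2],
              [0.0, 0.2, 0.8]])
g = np.array([850.0, 570.0, 180.0])

def log_posterior(f):
    """Uniform prior over non-negative spectra times the Poisson likelihood."""
    if np.any(f <= 0):
        return -np.inf                    # outside the prior support
    lam = A @ f
    return np.sum(g * np.log(lam) - lam)  # Poisson log-likelihood, constants dropped

ndim, nwalkers = len(g), 32
p0 = g[None, :] * rng.uniform(0.8, 1.2, size=(nwalkers, ndim))  # start near g

sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior)
sampler.run_mcmc(p0, 3000)
samples = sampler.get_chain(discard=500, flat=True)  # drop burn-in
f_est = np.median(samples, axis=0)                   # point estimate per bin
```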
Parameter Optimization¶
The regularization strength \(\tau\) and the log-offset \(d\) have to be optimized. This is done by means of a simple grid search which aims to minimize the chi-squared distance between the unfolded result (on MC data) and the MC truth. This metric is calculated for each individual unfolding.
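A sketch of such a grid search follows; the Pearson-like form of the chi-squared distance is an assumption, and \(\texttt{unfold}\) is a hypothetical callable that wraps one complete unfolding of the MC test sample:

```python
import itertools
import numpy as np

def chi2_distance(f_unfolded, f_truth):
    """Chi-squared distance between the unfolded MC result and the MC truth."""
    return np.sum((f_unfolded - f_truth) ** 2 / f_truth)

def grid_search(unfold, f_mc_truth, taus, offsets):
    """Return the (tau, d) pair minimizing the chi-squared distance.

    unfold(tau, d) -> unfolded spectrum must be supplied by the caller.
    """
    return min(itertools.product(taus, offsets),
               key=lambda p: chi2_distance(unfold(tau=p[0], d=p[1]), f_mc_truth))

# Example grids (made-up ranges), logarithmic in both parameters:
taus = np.logspace(-3, 1, 9)
offsets = np.logspace(-8, -2, 7)
```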
Systematic Uncertainties¶
Certain parameters in the simulation chain have uncertainties, which is why it is necessary to estimate their impact on the analysis as so-called systematic uncertainties. In terms of unfolding, the systematics are described by systematic parameters \(\vec{\xi}\), which are fitted to the data as additional nuisance parameters. The detector response matrix then depends on these additional parameters, \(\textbf{A} = \textbf{A}(\vec{\xi}\,)\).
For each systematic, new simulation sets are used. These simulation sets are created with the systematic parameters varying within their defined ranges. This makes it possible to construct weighting functions \(\omega_{i}(\xi_{j})\) (\(i\): bin, \(j\): systematic) that represent the relative change of the bin content compared to the baseline simulation. To consider all parameters in a single bin, the functions are multiplied:

\[ \omega_{i}(\vec{\xi}\,) = \prod_{j} \omega_{i}(\xi_{j}) \, . \]
The response matrix is then calculated by weighting the baseline matrix bin-wise with \(\omega_{i}(\vec{\xi}\,)\).
The likelihood \(\mathcal{L}\) now also depends on the nuisance parameters \(\vec{\xi}\), which makes it possible to fit these parameters in the unfolding.
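Schematically, the reweighting could look like the following sketch; the interface is hypothetical and funfolding's internal one differs:

```python
import numpy as np

def weighted_response(A_baseline, omega, xi):
    """Scale the baseline response with the multiplied per-systematic bin weights.

    omega[i][j] is a callable returning the relative change of bin i for the
    value xi[j] of systematic j (hypothetical interface).
    """
    w = np.array([np.prod([omega_ij(x) for omega_ij, x in zip(row, xi)])
                  for row in omega])
    return A_baseline * w[:, np.newaxis]   # weight the bin rows
```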
Unfolding¶
In this analysis, we would like to unfold the energy of the leading muon at the surface. This variable is called our target variable. For the unfolding, we need proxies that correlate with our target.
Proxy Variable¶
First, we need a proxy that correlates with the true physical quantity. In Fig. 132, the energy of the leading muon at entry is used as a proxy variable. The target is the energy of the leading muon at the surface.
Unfold Event Rate - one proxy¶
In Fig. 133, the unfolded event rate of the leading muon energy at the surface is shown. In Fig. 134, the same unfolding is performed with more steps and walkers.
Unfold Event Rate - multiple proxies / tree binning¶
Instead of using just one proxy, multiple proxies can be used. Due to the high dimensionality, a decision tree is used to bin the data: the tree is trained on our actual problem, estimating the energy of the leading muon at the surface, but instead of its energy estimate, only its leaves are used for the binning (see the sketch below). Of course, this approach can also be used with only one proxy, which is presented first to compare the results with the classical binning by hand.
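A minimal sketch with \(\texttt{scikit-learn}\) (placeholder data; in the analysis the tree is trained on simulation): the tree is fitted as a regressor, but only \(\texttt{tree.apply}\), which returns the leaf index of each event, is used to define the bins.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(10_000, 2))             # proxies, e.g. entry energy and zenith
y = X[:, 0] + 0.1 * rng.normal(size=10_000)  # target, e.g. surface energy

# Fit the regression problem, but keep only the leaves as bins.
tree = DecisionTreeRegressor(max_leaf_nodes=50, min_samples_leaf=100)
tree.fit(X, y)

leaf_of_event = tree.apply(X)                # leaf index = bin index per event
bins, counts = np.unique(leaf_of_event, return_counts=True)
```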
In Figs. 135 and 136, the decision tree regression for the leading muon energy at the surface is shown. The tree is able to reconstruct the target variable.
Using this binning and the same number of steps and walkers as in Fig. 133, the unfolding of the event rate yields the result shown in Fig. 137. The tree binning improves the unfolding result.
Next, the leading muon energy at entry and the zenith angle are used as proxies. The tree binning is shown in Fig. 138. The result of the unfolded event rate is shown in Fig. 139.
Unfold Muon Flux - one proxy¶
For the unfolding of the muon flux at the surface, an effective area is needed. It encodes how many muons at the surface correspond to an event measured by the detector.
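Schematically (a common convention, assumed here rather than taken from the analysis code), the unfolded counts per bin are converted to a differential flux by dividing by effective area, livetime, solid angle, and energy bin width:

```python
def flux_from_counts(n_events, a_eff, livetime, solid_angle, delta_e):
    """Differential flux per bin: Phi_i = N_i / (A_eff_i * T * dOmega * dE_i)."""
    return n_events / (a_eff * livetime * solid_angle * delta_e)
```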
In Fig. 140, the muon flux at the surface is unfolded using the leading muon energy at entry as a proxy with classical binning.