Unblinding Proposal For 2001-2003 Rolling Cascade Search
Unblinding Results
No signal was found. The results of
the 3-year unblinding are quite consistent with background
expectations. The maximum number of events in any bin in the 1
second time window search was 2, a result that was 70.1% probable
assuming only background. The maximum number of events in any bin
in the 100 second search was 3, a result that was 75.3% probable
assuming only background. The total number of bins with 2 or 3
events is also quite consistent with the background rate assuming
Poissonian statistics.
Short Window Search: Distribution of the total number of windows with
2 events in the full sample, determined by 10000 cycles of toy Monte
Carlo. Actual results are marked in red.
Long Window Search: Distribution of the total number of windows with
2 events or 3 events in the full sample, determined by 10000 cycles of
toy Monte Carlo. Actual results are marked in red.
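The toy Monte Carlo behind distributions of this kind can be sketched as follows. This is only a minimal illustration, not the code used to make the plots: it assumes the background is a stationary Poisson process at the average rate quoted in this proposal and ignores deadtime, run gaps and rate variations. The function names, the reduced livetime and the reduced number of cycles are all choices made for the example.

```python
import random
from bisect import bisect_right
from collections import Counter

def simulate_event_times(rate_hz, livetime_s, rng):
    """Draw background event times from a homogeneous Poisson process."""
    times, t = [], rng.expovariate(rate_hz)
    while t < livetime_s:
        times.append(t)
        t += rng.expovariate(rate_hz)
    return times

def count_windows_by_multiplicity(times, window_s):
    """Open a window at every event and tally how many events each contains."""
    tally = Counter()
    for i, t0 in enumerate(times):
        j = bisect_right(times, t0 + window_s)  # first index past the window
        tally[j - i] += 1                       # count includes the opening event
    return tally

def toy_mc(rate_hz, livetime_s, window_s, multiplicity, cycles, seed=0):
    """Number of windows with exactly `multiplicity` events, one entry per cycle."""
    rng = random.Random(seed)
    results = []
    for _ in range(cycles):
        times = simulate_event_times(rate_hz, livetime_s, rng)
        tally = count_windows_by_multiplicity(times, window_s)
        results.append(tally[multiplicity])
    return results

if __name__ == "__main__":
    # Short search rate and window, but only 10 days and 200 cycles to keep it quick.
    counts = toy_mc(1 / 427.5, 10 * 86400.0, 1.0, multiplicity=2, cycles=200)
    print("mean number of 2-event windows per cycle:", sum(counts) / len(counts))
```

Histogramming the returned list over many cycles gives the kind of distribution against which the observed number of windows can be compared.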
My original unblinding
proposal for the 2001 data set is accessible here. Longer
discussions of several issues such as sensitivity and optimization for
discovery are found there, and these principles have not changed for
this analysis. The current proposal, however, is intended to be
self-contained, and is probably long enough by itself. Links to
plots and more details are available throughout. Internal links
are in blue, external links are in red.
Changes made since previous iteration of this proposal:
Time windows have been returned to 100 and
1 seconds and the support vector machine cut variables have been
returned to the same values used in 2001. This was done to avoid
blindness issues arising from re-analyzing the 2001 data with different
cuts. The 2001 data now uses the same cuts as the previous
unblinding. The additional years, 2002 and 2003, use the same
cuts except for a change in the definition of the direct hits cut
(which is not a part of the support vector machine). All plots
and numbers now reflect these changes.
Proposal
I am proposing to conduct the search described below on the
2001, 2002 and 2003 data sets.
Overview
The concept of the rolling search is a straightforward one: simply
scan through all the data in a given year and search for a
statistically significant signal within a fixed duration. The
proposed rolling search is to be done with the 2001, 2002 and 2003
data sets. This is during the period after BATSE and before Swift,
during which a large percentage of GRBs were undetected. It uses the
cascade channel and optimizes on a Waxman-Bahcall broken power law
energy spectrum with break energies at 10^5 GeV and 10^7 GeV,
consistent with expectations for a GRB neutrino signal from prompt
emission. The signal Monte Carlo uses neutral current interactions
for all three neutrino flavors and charged current interactions for
electron and tau neutrinos.
The specific method employed is to start at each event that survives
cuts and check in a 100 second or 1 second time window following this
event for other surviving events. Since each surviving event starts
a new window, it is guaranteed that one will not miss a significant
cluster. The significance of the largest cluster of events occurring
during these years will be evaluated for both time windows. We will
also check for two or three independent upward fluctuations which
achieve a considerable level of significance when taken together, and
for coincidence of a cluster of events with IPN3 (third
interplanetary network) satellite detections.
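The window-per-event scan described above can be sketched in a few lines. This is an illustrative sketch only, not the analysis code: the function name, the convention that an event exactly at the window edge is included, and the example event times are all assumptions.

```python
def rolling_search(event_times, window_s):
    """Return (largest cluster size, start time of that cluster).

    A window of length window_s is opened at every surviving event; the
    cluster size is the number of events inside it, opening event included.
    """
    times = sorted(event_times)
    best_size, best_start = 0, None
    j = 0  # index of the first event beyond the current window
    for i, t0 in enumerate(times):
        j = max(j, i)
        while j < len(times) and times[j] <= t0 + window_s:
            j += 1
        if j - i > best_size:
            best_size, best_start = j - i, t0
    return best_size, best_start

# Hypothetical event times in seconds, for illustration only.
times = [10.0, 500.0, 530.0, 590.0, 4000.0, 4000.4]
print(rolling_search(times, 100.0))  # (3, 500.0)
print(rolling_search(times, 1.0))    # (2, 4000.0)
```

Because the end index only ever moves forward, the scan is linear in the number of events once they are sorted.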
This analysis was performed on the 2001 data set in September 2005.
No signal was observed, and results were consistent with the expected
background. This unblinding proposal concerns the extension of
this analysis to the 2002 and 2003 data sets.
Motivation
Although the IPN3 satellite network was
detecting GRBs during the period 2001-2003, the BATSE detector aboard
CGRO ceased operations in early 2000 and the Swift satellite did not
launch until late 2004. It is clear that the majority of GRBs
went undetected by gamma-ray detectors during this period, since the
IPN3 network detects GRBs at a lesser rate than BATSE did (nominally
about a GRB per week for the IPN3 network compared to about a GRB per
day for BATSE), and BATSE itself had only ~2/3 sky coverage. This
search is therefore designed as a complement to satellite-coincident
searches: it looks for a transient neutrino signal without relying on
a satellite trigger. In addition to conventional GRBs, there is
the potential to identify other transient phenomena which may be
visible by neutrino detection but not via photons. An example of
such a phenomenon is the so-called "choked" GRB, which would emit
neutrinos in a fashion similar to precursor neutrinos from a normal
GRB, but would fail to actually become a gamma ray burst because
the jet was unable to push through the stellar envelope.
Summary of Changes Between Previous Analysis and This One
The analysis remains conceptually unchanged from the previous
unblinding, but there have been a few modifications:
1. Most obviously, there is 3 times as much data, which requires
tighter cuts to keep the chance of upward background
fluctuations sufficiently low.
2. Flarechecking cuts have changed.
3. Data reduction has been slightly improved by replacing the cut
Ndird(muon fit)/Nhits with (Ndird(muon fit) - Ndird(cascade
fit))/Nhits, which shows improved separation between signal and
background. This change was made only for 2002 and 2003 in order to
keep the 2001 cuts the same as in the original unblinding.
Run/File Selection
Runs were required to be at least 4000 seconds long and were taken only
from the February to October period, when the station was closed, to
avoid data spikes from human interference. Bad files identified in the
Zeuthen point source analysis's filtering page http://www-zeuthen.desy.de/%7Ebernardi/point/combined00-03/Processing.html
were removed. Runs 7219 and 7249 in the 2003 data set were
removed entirely because of multiple gaps resulting from bad
files. Run 3399 was removed from the 2001 analysis due to
abnormal behavior in the flarechecking variables. The livetime
used for 2001 is 183.4 days with 21.3%
deadtime, the livetime for 2002 is 193.8 days with an average
15.0% deadtime and the livetime for 2003 is 185.2 days with an average
15.3% deadtime.
Reconstruction and Low Level filtering
The 2001 data uses the Madison filtering high energy stream.
The 2002 and 2003 data sets use Henrike Wissing's filtering at
Zeuthen. The same fits, including upandel muon and
cascade fits, were applied to all 3 years.
Monte Carlo
dCorsika was used as background Monte Carlo and ANIS and Tea were both
used as signal MC. Filtering matches that used on the real data
as closely as possible. Thus, the Monte Carlo for the 2002 and
2003 samples was filtered in Sieglinde, while the 2001 Monte Carlo was
filtered in Siegmund, just like the real data sets, even though this
makes essentially no difference in the end result.
Flarechecking cuts used in this analysis are as follows: For
2002 and 2003 the cuts are Induc_b10 < 16, Induc_11 < 8, and
Missing < 14. For 2001 the cuts are Induc_b10 < 16, Induc_11 < 8,
and short_m < 14. Plots showing cuts for all the flarechecking
variables are available for 2001, 2002 and 2003.
In addition, the top 1% of values were removed from the five
distributions which did not show any selection effects at higher cut
levels, as per Arvid Pohl's flarechecking proposal. Distributions for
extended flarechecking cuts are available for 2001, 2002 and 2003.
Final Cut Selection
The analysis was optimized for discovery rather than sensitivity.
This was defined as determining the lowest possible neutrino flux such
that there was a 90% chance of observing an event cluster with at
least 5 sigma significance (the Model Discovery Potential method).
As in the previous iteration of this analysis, the distribution of
neutrino events per burst is modelled according to the predictions in
Guetta et al.
The support vector machines were trained independently for each year,
since each year is slightly different. A large number of support
vector machines with varying cut tightness were trained for all 3
years, then matched to each other such that each year has the same
average rate of surviving events. With this matching as the standard,
the percentage of signal retained differs by no more than a few
percent between years.
For the 100 second search, the optimal cuts result in an average
background rate of 1 event per 2404 seconds and the percentage of
signal retained by the support vector machine cut is 67% for 2001, 63%
for 2002 and 65% for 2003 (weighted average of all 3
flavors). For the 1 second search, the average background
rate is 1 event every 427.5 seconds and the signal retention rates for
the support vector machine cut are 92% for 2001, 90% for 2002 and 91%
for 2003. (Note that these signal retentions are just for the support
vector machine cut stage, and must be multiplied by an additional
factor of ~0.64 to get signal retention relative to trigger level.)
A 5 sigma detection would require a cluster of 7 events in the 100
second search or 5 events in the 1 second search.
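As a rough cross-check of these cluster sizes, the chance probability of a cluster can be estimated from Poisson statistics. The sketch below is a simplified estimate, not the calculation used in this proposal, and it rests on several assumptions. The background is taken as stationary at the quoted average rates. The total livetime is taken as the sum of the three yearly livetimes from Run/File Selection. One window is opened per surviving event. The trials factor is approximated by multiplying the single-window probability by the number of windows. It is therefore not expected to reproduce the quoted probabilities exactly.

```python
import math

def poisson_tail(k_min, mu):
    """P(N >= k_min) for a Poisson variable with mean mu."""
    if k_min <= 0:
        return 1.0
    # Sum the tail directly to avoid cancellation in 1 - cdf for tiny values.
    term = math.exp(-mu) * mu**k_min / math.factorial(k_min)
    total, k = 0.0, k_min
    while term > 0.0 and k < k_min + 50:
        total += term
        k += 1
        term *= mu / k
    return total

def chance_probability(cluster_size, window_s, mean_spacing_s, livetime_s):
    """Approximate chance of at least one cluster of this size anywhere."""
    mu = window_s / mean_spacing_s            # expected extra events per window
    n_windows = livetime_s / mean_spacing_s   # one window per surviving event
    # The opening event is given, so cluster_size - 1 further events are needed.
    return n_windows * poisson_tail(cluster_size - 1, mu)

# Sum of the yearly livetimes quoted in Run/File Selection, in seconds.
LIVETIME_S = (183.4 + 193.8 + 185.2) * 86400.0

print("7 events, 100 s window:", chance_probability(7, 100.0, 2404.0, LIVETIME_S))
print("5 events, 1 s window:  ", chance_probability(5, 1.0, 427.5, LIVETIME_S))
```

Under these assumptions both estimates come out at the 10^-7 level, while requiring one event fewer in either search gives a chance probability orders of magnitude larger.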
Happily, both of these choices are very close to the optimal
sensitivity. The sensitivity for this analysis is 1.62 x 10^-6
GeV/(cm^2 s sr) for all flavors, assuming a rate of 667
GRBs per year based on the BATSE rate of detection and a 1:1:1 flavor
ratio. This number is consistent with the expectation of a
1/sqrt(3) improvement over the previous analysis resulting from a data
sample roughly 3 times as large. MDF and MRF plots for both the
short and long searches are available here.
Since GRB signals tend to be dominated by a few spectacular bursts, the
analysis is set up to look for a single significant event.
However, we will also check for 2 or 3 separate clusters which are not
themselves significant, but have a combined significance greater than 5
sigma when taken together.
In the absence of a discovery, we will also check for clusters with
more marginal significance (3 or 4 sigma).
If there is a larger-than-expected cluster of events, we will also
check against the occurrences of GRBs detected by the IPN3
network. Any part of the GRB overlapping with the time window is
counted as a coincidence in our calculations. The set of bursts
to be checked against includes roughly 80 to 90 bursts per year, many
of which do not have well-determined durations. For the majority
of these bursts, the duration was estimated (by me) from the light
curve obtained by the Konus-Wind satellite and is not guaranteed.
Where possible, the durations used in Kyler's IPN analysis are also
applied here. This is not a full-fledged
satellite-coincident analysis, just an additional check made after
obtaining the rolling search results.
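The overlap criterion (any part of the burst falling inside the time window counts as a coincidence) amounts to a simple interval test. The sketch below is illustrative only. The function names, the tuple layout of the burst list and the example bursts are assumptions, not part of the analysis.

```python
def overlaps(window_start, window_length, burst_start, burst_duration):
    """True if any part of the burst falls inside the search window."""
    window_end = window_start + window_length
    burst_end = burst_start + burst_duration
    return burst_start <= window_end and burst_end >= window_start

def coincident_bursts(window_start, window_length, bursts):
    """Return the names of bursts that overlap the window.

    bursts is a list of (name, start_time_s, duration_s) tuples.
    """
    return [name for name, start, duration in bursts
            if overlaps(window_start, window_length, start, duration)]

# Hypothetical burst list; names, times and durations are made up.
bursts = [("burst A", 950.0, 60.0), ("burst B", 1200.0, 20.0)]
print(coincident_bursts(1000.0, 100.0, bursts))  # ['burst A']
```

Since many burst durations are only estimated, the result of such a test is only as reliable as the durations fed into it.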
The following checks for coincidence with a gamma ray detection will be
made:
- A 6 event cluster from the 100 second search in coincidence with a
  GRB would have a significance greater than 5 sigma.
- A 4 event cluster from the 100 second search in coincidence with a
  GRB would have a significance greater than 4 sigma.
- A 4 event cluster from the 1 second search in coincidence with a
  GRB would have a significance greater than 5 sigma.
- A 3 event cluster from the 1 second search in coincidence with a
  GRB would have a significance greater than 4 sigma.
Total Probability of a False Detection
Summing the chance probabilities for all of the checks for
discovery results in a probability below 6.2 x 10^-7 (5 sigma
significance):

Scenario                         Probability
7 events in long time window     2.0 x 10^-7
5 events in short time window    2.0 x 10^-7
2 or 3 event combinations        1.2 x 10^-7
IPN3 coincidence                 0.2 x 10^-7
Total                            5.4 x 10^-7
The odds of any of these scenarios overlapping given just
background events, for example 7 events in a 100 second window which
include 5 events in a 1 second time window, are sufficiently small that
it is approximately correct to simply add the probabilities to obtain
the total.
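The arithmetic can be checked with a short script. The sketch below sums the four quoted probabilities. For comparison it also combines them as if the scenarios were statistically independent, which is an assumption made only for this illustration. The variable names are likewise illustrative.

```python
import math

# Chance probabilities quoted in the table above.
scenarios = {
    "7 events in long time window": 2.0e-7,
    "5 events in short time window": 2.0e-7,
    "2 or 3 event combinations": 1.2e-7,
    "IPN3 coincidence": 0.2e-7,
}
THRESHOLD = 6.2e-7  # 5 sigma level quoted in this section

simple_sum = sum(scenarios.values())
# If the scenarios were treated as independent, the chance of at least one is:
independent = 1.0 - math.prod(1.0 - p for p in scenarios.values())

print(f"simple sum:  {simple_sum:.3e}")
print(f"independent: {independent:.3e}")
print("below threshold:", simple_sum < THRESHOLD)
```

With probabilities this small the two combinations differ only by terms of order 10^-13, so the simple sum is an adequate approximation.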
Similarly, the chance probability of any of the scenarios specified
for 4 sigma significance totals 2.9 x 10^-4 and the probability for
any 3 sigma significance totals 8.7 x 10^-4.