Optimized Corsika - tests and run time estimates

I Standard corsika (100k files), set 1541 (weighted Hoerandel spectrum, dslope=-1): log(Energy) after Level6 and Level7 cuts can be found here


II Optimized Corsika - tests

Two of Oksana's Corsika settings (AMANDA cascade analysis) have been tested with an extra cut on the most energetic cascades (Sebastian's IcePick filter).
Tested: Oxana setting 1 with LogLeadingCscd_Energy ge 2.0 (set1a), Oxana setting 2 with LogLeadingCscd_Energy ge 2.0 (set2a), and Setting3 with LogLeadingCscd_Energy ge 2.0 (set3a); see the table below for the explicit cut values.
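
The extra cut above can be sketched as follows. This is only an illustration, not the IcePick filter code: the function name and its input are hypothetical, and energies are assumed to be in GeV, so that a threshold of 2.0 corresponds to 100 GeV.

```python
import math

def passes_lead_cascade_cut(cascade_energies, log_threshold=2.0):
    """True if log10 of the most energetic cascade energy is >= log_threshold.

    cascade_energies: energies of all cascades in one shower (assumed GeV).
    Showers without any cascade of positive energy fail the cut.
    """
    positive = [e for e in cascade_energies if e > 0]
    if not positive:
        return False
    return math.log10(max(positive)) >= log_threshold

# Example: only the first shower has a cascade above the assumed 100 GeV.
showers = [[50.0, 120.0], [10.0, 99.0], []]
kept = [s for s in showers if passes_lead_cascade_cut(s)]
```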


Other possibilities to test:
- smaller generation volume? (level4 cuts reduce the number of corsika events at online cascade filter level by a factor of 200 - 400, depending on the corsika settings)

1. WeightedCorsika (columns): Old MC standard | New MC standard (test) | New MC optimized set1a | New MC optimized set2a | New MC optimized set3a | New MC optimized set4a | New MC optimized set5a
2a. Energy_Primary Cut 600 GeV 600 GeV 3 TeV 20 TeV 40 TeV 40 TeV 80 TeV
2b. Energy_Muons Cut (ecuts2) 273 GeV 273 GeV 1.2 TeV 3.0 TeV 3.0 TeV 5.0 TeV 5.0 TeV
2c. Energy_electrons Cut (ecuts3) 0.003 GeV 0.003 GeV 500 GeV 800 GeV 800 GeV 800 GeV 800 GeV
2d. Log10(LeadCscdEnergy Cut) - - 2.0 2.0 2.0 2.0 2.0
3. DatasetId 1625 ( 1541 ) 2301 2471 2479 2487 ( 2480 2600 ) 2493 2519 2512
4. Number of files 100 000 files 90 files 100 files 200 files (934, 9841 files) 100 files 991 files 200 files
4.a Number of gen corsika showers per file 400 000 400 000 400 000 400 000 400 000 400 000 400 000
4.b dslope -1 -1 -1 -1 -1 -1 -1
5. NumberOfEvents per file (IC22 triggered) 2800 2900 4000 (xxx) 7400 10900 6500 3000
6. NumberOfEvents per file (Level2) 900 935 1600 (xxx) 3000 4500 2600 1100
7. NumberOfEvents per file (Cscd filter only at level2) xx 216 450 912 (44222/50=884) 1370 416
8. NumberOfEvents per file (level2 CscdFilter + log(RecoEn) gt 4) xx 21 49 79 (3959/50=79.18) 126 50
9. NumberOfEvents per file (level4) 55000/100000=0.55 48/90 = 0.53 (8) 101/100 = 1.0 578/200 = 2.89 (2617/934=2.80) 412/100 = 4.12 2084/872 = 2.4
10. NumberOfEvents per file (level6) 3240/100000=0.03 4/90 = 0.04 9/100 = 0.09 88/200 = 0.44 (415/934=0.44, 4181/9841=0.42) 53/100 = 0.53 350/872 = 0.4 14/150= 0.09
11. NumberOfEvents per file (level6 log(RecoEn) gt 3.0) 1153/100000=0.01 xx 4/100 = 0.04 43/200 = 0.21 (164/934=0.18, 1807/9841=0.18) 26/100 = 0.26 167/872 = 0.19 8/150 = 0.05
12. NumberOfEvents per file (level4 + log(RecoEn) gt 2.0) xxx 39/90 = 0.43 (7) 93/100 = 0.93 549/200 = 2.75 (2461/934=2.63) 391/100 = 3.91
13. NumberOfEvents per file (level4 + log(RecoEn) gt 2.6) xxx 11/90 = 0.12 (4) 38/100 = 0.38 279/200 = 1.40 (1247/934=1.34) 194/100 = 1.94
PDSF (e.g. 1 job only) RunTime (wallclock) xx pc1016 26700 s (7.4h) pc1016 a) job=2479.26 50573 s (14 h)
b) job=2479.97 27167 s (7.5h)
PDSF (e.g. 1 job only) RunTime (user CPU) (*) xx pc1016 16000 s (4.4h) pc1016 a) job=2479.26 47554 s (13.2 h)
b) job=2479.97 26380 s (7.3h)
Running time on PDSF machines: 1 node = 2x4 = 8 cores, CPU speed = 2 GHz, total memory 16 GB
1 core = 1 job slot
(*) User CPU at PDSF: possibly not correct. To be checked with the PDSF experts what "user CPU" actually reports (it does not look like CPU time).
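
As a quick cross-check of the run times quoted above, the sketch below converts the wallclock and user-CPU seconds to hours and forms their ratio. The numbers are copied from the table; the labels and the function name are only for illustration.

```python
# (wallclock seconds, user-CPU seconds) as quoted in the table above.
runs = {
    "pc1016": (26700, 16000),
    "job 2479.26": (50573, 47554),
    "job 2479.97": (27167, 26380),
}

def summarize(wall_s, user_s):
    """Return (wallclock hours, user-CPU hours, user/wall ratio)."""
    return wall_s / 3600.0, user_s / 3600.0, user_s / wall_s

for name, (wall_s, user_s) in runs.items():
    wall_h, user_h, ratio = summarize(wall_s, user_s)
    print(f"{name}: wall {wall_h:.1f} h, user {user_h:.1f} h, user/wall {ratio:.2f}")
```

The user/wall ratio is about 0.60 for the first entry and about 0.94 to 0.97 for the two 2479 jobs, which may help when asking the PDSF experts about the user-CPU value.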




7) At level2 (cascade filter only)

Rate versus log(Energy):


Test of standard corsika: (a) large-statistics set 1541 produced with the older software (black histogram) compared with (b) set 2301 produced with the newer software used for the optimized corsika sets (green histogram). Shown: Rate [Hz] vs log(MCPrimary_Energy) and ratio vs log(MCPrimary_Energy).
Conclusion: the standard corsika sets (old and new software, same settings) are consistent.

10) After level6 (low statistics) (no energy cut)
10a) Rate versus log(Energy): black=standard corsika


10b) Effective livetime (icecube/200902001-v2) vs: MCLeadCascade log(Energy), log(RecoEnergy), MCPrimary log(Energy)
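
For weighted MC, an effective livetime per histogram bin is commonly computed as sum(w) / sum(w^2), with w the per-event weight in Hz. The sketch below assumes this definition; the referenced note icecube/200902001-v2 may define it differently, and the function names are hypothetical.

```python
from bisect import bisect_right

def effective_livetime(weights):
    """Effective livetime in seconds for per-event weights given in Hz.

    Assumed definition: T_eff = sum(w) / sum(w^2). Returns 0.0 for no events.
    """
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return sum_w / sum_w2 if sum_w2 > 0 else 0.0

def effective_livetime_per_bin(values, weights, bin_edges):
    """Effective livetime per bin; bin i covers [bin_edges[i], bin_edges[i+1])."""
    n_bins = len(bin_edges) - 1
    sum_w = [0.0] * n_bins
    sum_w2 = [0.0] * n_bins
    for x, w in zip(values, weights):
        i = bisect_right(bin_edges, x) - 1
        if 0 <= i < n_bins:  # events outside the binning are ignored
            sum_w[i] += w
            sum_w2[i] += w * w
    return [sw / sw2 if sw2 > 0 else 0.0 for sw, sw2 in zip(sum_w, sum_w2)]
```

With this definition, N events of equal weight 1/T give an effective livetime of T, so a bin dominated by a few heavily weighted events gets a short effective livetime.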

10c) Rate versus log(Energy): (same as 10a but only 3 high statistics histograms)

(left) Ratio=Rate(set 2a)/Rate(standard) vs log10(Primary_Energy) and (right) Ratio=Rate(set 4a)/Rate(standard) vs log10(Primary_Energy)



11) After level6 (low statistics) (logRecoEn gt 3.0 )
Note: In the analysis log(RecoEnergy) gt 4 cut is used at the final cut level

11a) Rate versus log(Energy):


11b) Effective livetime (icecube/200902001-v2) vs: MCLeadCascade log(Energy), log(RecoEnergy), MCPrimary log(Energy)

11c) Rate versus log(Energy): (same as 11a but only 3 high statistics histograms)


(left) Ratio=Rate(set 2a)/Rate(standard) vs log10(Primary_Energy) and (right) Ratio=Rate(set 4a)/Rate(standard) vs log10(Primary_Energy)



Run time (rough!) estimate:
The optimized corsika test samples have limited statistics, but the rates for optimized and standard corsika are roughly consistent, and similar energy ranges are covered by both. The MC statistics enhancement factor is about 20 or more.
To get the same statistics as the standard MC (100 000 jobs), we would need to run about 4400 jobs (400 000 corsika showers per job).
At PDSF this would take a month or longer. We need more statistics than we have in the standard IC22 MC.
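
The arithmetic behind such an estimate can be sketched as below, using the level6 log(RecoEn) gt 3.0 counts from row 11 of the table. The text does not say which set or cut level the 4400 figure is based on, so matching it to set3a is an assumption, as is the number of concurrent job slots; with so few surviving events the factors carry large statistical uncertainties.

```python
import math

# (events passing, files) at level6 with log(RecoEn) gt 3.0, from row 11.
STANDARD = (1153, 100000)  # old MC, standard corsika
OPTIMIZED = {
    "set1a": (4, 100),
    "set2a": (43, 200),
    "set3a": (26, 100),
    "set4a": (167, 872),
    "set5a": (8, 150),
}

def enhancement(opt, std=STANDARD):
    """Ratio of surviving events per file, optimized over standard."""
    return (opt[0] / opt[1]) / (std[0] / std[1])

def jobs_needed(opt, std=STANDARD):
    """Optimized jobs (one file each) matching the full standard sample."""
    return math.ceil(std[1] / enhancement(opt, std))

def wall_days(n_jobs, hours_per_job, n_slots):
    """Wallclock days if n_slots jobs run concurrently (n_slots is assumed)."""
    return n_jobs * hours_per_job / n_slots / 24.0

for name, counts in OPTIMIZED.items():
    print(f"{name}: enhancement {enhancement(counts):.1f}, jobs {jobs_needed(counts)}")
```

For set3a this gives an enhancement of about 22.5 and about 4400 jobs. With the measured 7.5 to 14 h per job and, purely as an example, 64 job slots in continuous use, `wall_days(4400, 7.5, 64)` and `wall_days(4400, 14.0, 64)` give roughly 21 and 40 days.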


12) After level7 (no energy cut)
Note: In the analysis log(RecoEnergy) gt 4 cut is used at the final cut level

12a) Rate versus log(Energy):


12b) Effective livetime (icecube/200902001-v2) vs: MCLeadCascade log(Energy), log(RecoEnergy), MCPrimary log(Energy)