NumPy¶

This notebook was written by Kayla Leonard (June 2019) to supplement the currently existing IceCube Bootcamp Tutorials.

NumPy is an external Python package that provides very useful mathematical tools. Its documentation can be found at https://docs.scipy.org/doc/numpy/reference/.

Realistically, the best way to find a new function is to Google "numpy" along with whatever you are trying to do (for example, "numpy convert radians to degrees").

It is standard to name the package "np" when you import it.

Feel free to add additional cells and change the numbers to try your own examples.

Topics include:

Special numbers (pi, infinity, etc.)
Trigonometry (sin, cos, etc.)
Random Numbers
Arrays (NumPy's version of lists)
Dataset calculations (mean, median, etc.)
Masks

import numpy as np

Special Numbers¶

np.pi

3.141592653589793

np.e

2.718281828459045

np.inf # infinity

inf
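As a small illustration of how these constants behave in calculations, here is a minimal sketch. The variable names (radius, area, and so on) are made up for this example, and np.finfo is used only to look up the largest finite float.

```python
import numpy as np

# Area of a circle of radius 2, using np.pi
radius = 2.0
area = np.pi * radius**2

# np.e is the base of the natural logarithm, so its log should be 1
log_of_e = np.log(np.e)

# np.inf compares greater than any finite float, even the largest one
largest_float = np.finfo(float).max
inf_is_bigger = np.inf > largest_float

print(area)
print(log_of_e)
print(inf_is_bigger)
```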

Trigonometry¶

If you are looking for standard trigonometric functions in Python, this package is probably what you want to use.

np.cos(np.pi)

-1.0

np.sin(np.pi/2)

1.0

The trig functions assume the input is in radians, so if you'd like to use degrees, there are useful functions to convert.

# These are equivalent functions
print(np.rad2deg(np.pi/2))
print(np.degrees(np.pi/2))

90.0
90.0

# These are equivalent functions
print(np.deg2rad(45))
print(np.radians(45))

0.7853981633974483
0.7853981633974483

np.sin(np.deg2rad(45)) # converts 45 degrees to radians, then takes the sine of that

0.7071067811865475
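One thing to keep in mind with the trig functions is that np.pi is a rounded floating-point value, so results that should be exactly zero may come out as very small numbers instead. The sketch below shows this and one possible way to compare against zero using np.isclose; the variable names are only for illustration.

```python
import numpy as np

# Mathematically sin(pi) is 0, but np.pi is only an approximation of pi
value = np.sin(np.pi)
print(value)  # a very small number rather than exactly 0

# np.isclose compares within a tolerance instead of requiring exact equality
is_zero = np.isclose(value, 0.0)
print(is_zero)
```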

Random Numbers¶

NumPy has an extensive list of functions for random numbers, probability, and statistics.

np.random.random() # picks a random number from 0 (inclusive) up to 1 (exclusive)

0.3883925064755209

# rerun this cell several times and you will see the value change each time
np.random.random()

0.9854419945815169

If you want your results to be reproducible, call np.random.seed(13) (or use your favorite number) at the top of the cell; every time you re-run the cell you will then get the same number.
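A minimal sketch of seeding in practice: resetting the seed to the same value before each draw gives the same number. The names first and second are just for this example.

```python
import numpy as np

# Seed the generator, then draw a number
np.random.seed(13)
first = np.random.random()

# Re-seed with the same value and draw again
np.random.seed(13)
second = np.random.random()

# Both draws start from the same generator state, so they match
print(first == second)
```

Note that the seed has to be set again before the draw; calling np.random.random() twice in a row after a single seed call still gives two different numbers.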

np.random.normal(0,10) # draws a random number from a Gaussian distribution centered at 0 with standard deviation 10

-6.290842493187123

np.random.normal(0,10,10) # we can give it an additional argument that is the length of the array we want

array([  3.18359367,  10.65710673,  15.70076931,   6.11426628,
        -5.56016407,   2.38773831,  -9.34114013, -12.87750992,
         3.15110509,  -1.97483647])

There are many other distributions available, such as binomial, chi-squared, Poisson, and gamma: https://docs.scipy.org/doc/numpy/reference/routines.random.html
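As a rough sketch of what drawing from a few of these other distributions could look like, the example below uses np.random.poisson, np.random.binomial, and np.random.uniform. The parameter values and variable names are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(13)  # only so the example is repeatable

# Five draws from a Poisson distribution with mean 3
counts = np.random.poisson(lam=3.0, size=5)

# Five draws of "number of successes in 10 trials, each with probability 0.5"
successes = np.random.binomial(n=10, p=0.5, size=5)

# Five draws from a uniform distribution between -1 and 1
uniform = np.random.uniform(low=-1.0, high=1.0, size=5)

print(counts)
print(successes)
print(uniform)
```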

Arrays¶

There are several quick ways to initialize numpy arrays.

np.linspace(0,20,num=21) # returns an array of evenly spaced values from 0 to 20.
# Note: Both the start and end points are included, so covering 0 to n in steps of 1 requires num=n+1

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20.])

np.linspace(0,100,num=26) # it will calculate the step size, it doesn't need to be 1

array([  0.,   4.,   8.,  12.,  16.,  20.,  24.,  28.,  32.,  36.,  40.,
        44.,  48.,  52.,  56.,  60.,  64.,  68.,  72.,  76.,  80.,  84.,
        88.,  92.,  96., 100.])

np.linspace(0,1,num=17) # it can also handle decimal numbers

array([0.    , 0.0625, 0.125 , 0.1875, 0.25  , 0.3125, 0.375 , 0.4375,
       0.5   , 0.5625, 0.625 , 0.6875, 0.75  , 0.8125, 0.875 , 0.9375,
       1.    ])

np.logspace(2,4,21) # similar to linspace but returns values that are evenly spaced in log space from 10**2 to 10**4

array([  100.        ,   125.89254118,   158.48931925,   199.5262315 ,
         251.18864315,   316.22776602,   398.10717055,   501.18723363,
         630.95734448,   794.32823472,  1000.        ,  1258.92541179,
        1584.89319246,  1995.26231497,  2511.88643151,  3162.27766017,
        3981.07170553,  5011.87233627,  6309.5734448 ,  7943.28234724,
       10000.        ])

np.arange(0,100,4) # If you know the step size you want but not the number of items, you can use arange
# Note: arange does not include the end point in the array

array([ 0,  4,  8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64,
       68, 72, 76, 80, 84, 88, 92, 96])
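To make the end-point difference between linspace and arange concrete, here is a short side-by-side sketch. The variable names are made up for this example, and pushing the stop value one step past the end is just one possible way to get arange to include it.

```python
import numpy as np

# linspace includes the end point: 26 values from 0 to 100
with_endpoint = np.linspace(0, 100, num=26)
print(len(with_endpoint), with_endpoint[-1])

# arange stops before the end point: 25 values from 0 to 96
without_endpoint = np.arange(0, 100, 4)
print(len(without_endpoint), without_endpoint[-1])

# Moving the stop one step further makes arange reach 100
arange_inclusive = np.arange(0, 100 + 4, 4)
print(len(arange_inclusive), arange_inclusive[-1])
```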

np.zeros(10) # creates an array of 10 items, where all entries are zero

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

np.ones(10) # creates an array of 10 items, where all entries are one

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

Arrays are useful because they allow elementwise manipulation of numbers, which is not possible with Python lists.

[1,2,3,4]**2 # This will give us an error because we can't square a list

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-e570a8d4207d> in <module>
----> 1 [1,2,3,4]**2 # This will give us an error because we can't square a list

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

np.array([1,2,3,4])**2 # this will square each element in the array

array([ 1,  4,  9, 16])

You can also use elementwise calculations of two arrays of the same length:

np.array([1,2,3,4])+np.array([5,6,7,8])

array([ 6,  8, 10, 12])

np.array([1,2,3,4])*np.array([5,6,7,8])

array([ 5, 12, 21, 32])
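The sketch below shows two related cases: combining an array with a single number, which is applied to every element, and trying to add two arrays of different lengths, which raises a ValueError. The variable names are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# A single number is combined with every element of the array
doubled = a * 2
print(doubled)

# Arrays of different lengths (here 4 and 3) cannot be added elementwise
try:
    a + np.array([1, 2, 3])
    mismatch_failed = False
except ValueError as err:
    mismatch_failed = True
    print("ValueError:", err)
```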

You can also pass an array (or a list) to NumPy functions, and the function will be applied to each element.

np.sqrt([1,4,9,10])

array([1.        , 2.        , 3.        , 3.16227766])

np.sin([0,np.pi/4,np.pi/2])

array([0.        , 0.70710678, 1.        ])

Basic data set calculations¶

Given an array of values, we can calculate standard summary statistics such as the mean, median, and standard deviation.

scores = np.array([95,41,72,100,80,97,95])
print(scores)
print(type(scores))

[ 95  41  72 100  80  97  95]
<class 'numpy.ndarray'>

# Mean
print(np.mean(scores))
print(np.average(scores))

82.85714285714286
82.85714285714286

# Median
print(np.median(scores))

95.0

# calculate the value of the 90th percentile
print(np.percentile(scores,90))

98.2

# Standard Deviation
print(np.std(scores))

19.51869851800408
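By default, np.std divides by the number of values N (the population standard deviation). If you want the sample standard deviation, which divides by N-1, one option is the ddof argument, as in this small sketch that reuses the scores from above.

```python
import numpy as np

scores = np.array([95, 41, 72, 100, 80, 97, 95])

# Default: divide by N (population standard deviation)
population_std = np.std(scores)

# ddof=1: divide by N-1 (sample standard deviation)
sample_std = np.std(scores, ddof=1)

print(population_std)
print(sample_std)
```

The sample value is slightly larger because it divides by a smaller number.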

Masks¶

Masks are arrays of True or False that can be used to identify certain elements in an array.

# Let's figure out which elements in this array are not zero:
my_array = np.array([1,2,0,3,0,4,0])
my_mask = my_array!=0
print(my_mask)

[ True  True False  True False  True False]

# We can also see which elements are greater than 3
my_mask2 = my_array>3
print(my_mask2)

[False False False False False  True False]

Masks allow you to extract only certain elements using the following notation:

my_array[my_mask]

array([1, 2, 3, 4])
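Masks can also be counted, combined, and used in calculations. The following is a minimal sketch using the test scores from the earlier section; the variable names and the cutoff values are arbitrary choices for illustration.

```python
import numpy as np

scores = np.array([95, 41, 72, 100, 80, 97, 95])

# Mask of scores above 90
high = scores > 90

# True counts as 1, so summing a mask counts the matching elements
n_high = high.sum()
print(n_high)

# Combine two conditions with & (each condition needs its own parentheses)
mid_range = scores[(scores >= 70) & (scores < 96)]
print(mid_range)

# Use the mask to compute the mean of only the high scores
mean_high = scores[high].mean()
print(mean_high)
```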