NumPy
NumPy is short for Numerical Python.
- NumPy provides a computational foundation for general numeric data processing.
- Its arrays are the lingua franca for exchanging numeric data between libraries.
- Functions for performing element-wise computations with arrays or mathematical operations between arrays
- A fast and efficient multidimensional array object, ndarray
- Tools for reading and writing array-based datasets to disk and working with memory-mapped files
- Linear algebra operations, Fourier transform, and random number generation
- A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.
Install
pip install numpy
Import
import numpy as np
Operations
- Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and any other kind of computation
- Common array algorithms
  - sorting
  - unique
  - set operations, etc.
- Efficient descriptive statistics and aggregating/summarizing data
- Data alignment and relational data manipulations for merging and joining together heterogeneous datasets
- Expressing conditional logic as array expressions instead of loops with if-elif-else branches
- Group-wise data manipulations
  - aggregation
  - transformation
  - function application
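A few of the operations listed above, as a minimal sketch. The sample values are made up for illustration.

```python
import numpy as np

values = np.array([3, -1, 4, -1, 5, -9, 2, 6])  # illustrative data

# Conditional logic as an array expression: replace negatives with 0.
clipped = np.where(values < 0, 0, values)

# Common array algorithms.
ordered = np.sort(values)                    # sorted copy, ascending
distinct = np.unique(values)                 # sorted unique values
common = np.intersect1d(values, [2, 3, 7])   # set intersection

# Descriptive statistics over the whole array.
total = values.sum()
mean = values.mean()
```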
Efficiency
- NumPy internally stores data in a contiguous block of memory.
- It performs complex computations on entire arrays without the need for Python for loops.
- NumPy arrays are more efficient for storing and manipulating data than the built-in Python data structures.
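A small sketch of the difference between an explicit loop and a vectorized expression. It only checks that both give the same result; the array size is an arbitrary choice, and actual speedups depend on the machine.

```python
import numpy as np

n = 100_000  # arbitrary size for illustration
py_list = list(range(n))
np_arr = np.arange(n)

# Pure-Python version: an explicit loop over every element.
doubled_list = [x * 2 for x in py_list]

# NumPy version: one vectorized expression, the loop runs in compiled code.
doubled_arr = np_arr * 2

same_result = doubled_arr.tolist() == doubled_list
# A freshly created array is stored in one contiguous block of memory.
is_contiguous = np_arr.flags['C_CONTIGUOUS']
```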
Hello world
import numpy as np
# numpy arrays := vectors (1D) and matrices (2D)
my_list = [1, 2, 3]
np.array(my_list)
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
np.array(my_matrix)
np.arange(0, 10) # ([0, 1, ..., 9])
np.arange(0, 11, 2) # ([0, 2, 4, 6, 8, 10])
# np.ones() works like np.zeros(), but fills the array with ones
np.zeros(3) # ([0., 0., 0.])
np.zeros((5, 5)) # ([[0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]])
np.linspace(0, 10, 3) # ([0., 5., 10.]) # 0 and 10 are included.
np.eye(4) # [4x4] identity matrix
# rand creates an array of the given shape and fills it with random samples from a uniform distribution over [0, 1).
np.random.rand(2) # e.g. ([0.23565463, 0.46336045])
np.random.rand(5, 5) # same as above, but the shape is [5x5].
np.random.randn(2) # rand samples a uniform distribution; randn samples the standard normal distribution
np.random.randint(1, 100) # (low, high), high is exclusive # e.g. 62
np.random.randint(1, 100, 10) # (low, high, #items) # e.g. ([80, 38, 51, 16, 90, 85, 53, 60, 18, 57])
arr = np.arange(25) # ([ 0, 1, 2, ..., 24])
mat = arr.reshape(5, 5) # returns a reshaped 5x5 array; arr itself stays 1D # ([0, 1, 2, 3, 4], [...], ..., [..., 24])
arr.max() # 24
arr.argmax() # 24, the index of the maximum
arr.min() # 0
arr.argmin() # 0
mat.shape # (#rows, #cols) = (5, 5)
arr.dtype
arr[0] # 0
arr[1:5] # ([1, 2, 3, 4])
arr[0:5] = 100 # the first five elements are 100
slice_arr = arr[0:6] # slice_arr = ([100, 100, 100, 100, 100, 5])
arr_copy = arr.copy() # to create a copy
mat[0] # the first row
mat[0][1] # mat[row][col]
mat[0, 1] # mat[row, col]
arr = np.arange(1, 11)
arr[arr > 2] # array([ 3, 4, 5, 6, 7, 8, 9, 10])
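One point worth spelling out about slicing and copy(): a slice is a view on the same data, while copy() makes an independent array. A minimal sketch (variable names are illustrative):

```python
import numpy as np

arr = np.arange(10)
view = arr[0:5]            # a view: shares memory with arr
independent = arr.copy()   # a separate copy of the data

view[:] = 100              # writes through to arr

shares = np.shares_memory(arr, view)
```

After this runs, the first five elements of arr are 100, while independent still holds 0 through 9.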
Arithmetic operations
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr * arr # multiplies element-wise
arr - arr # zero matrix
1 / arr
arr ** 0.5
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])
arr > arr2 # compares each element one-by-one and returns a boolean matrix
Array creation
np.arange()
- Return evenly spaced values within a given interval.
# np.arange(<stop>)
np.arange(3) # array([0, 1, 2])
np.arange(3.) # array([0., 1., 2.])
# np.arange(start=<start>, stop=<stop>, step=<step>) # stop is excluded
np.arange(3, 7) # array([3, 4, 5, 6])
np.arange(3, 7, 2) # array([3, 5])
array
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr = np.array(data)
arr.ndim # 2
arr.shape # (2, 4)
arr.dtype # int64 (the default integer type is platform-dependent)
asarray
empty_like
empty
- Creates an array without initializing its values to any particular value.
- It is not safe to assume that np.empty will return an array of all zeros.
# creates a 3D array of shape (2, 3, 2): two 3x2 matrices in a single array
np.empty((2, 3, 2))
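Because np.empty leaves the contents arbitrary, a typical pattern is to allocate first and write every element before reading. A minimal sketch, with an arbitrary fill value:

```python
import numpy as np

buf = np.empty((2, 3))      # contents are arbitrary until written
buf[:] = 7.0                # fill every element before reading it

like = np.empty_like(buf)   # same shape and dtype as buf, also uninitialized
```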
eye
full_like
full
identity
ones_like
ones
zeros_like
zeros
- shape := int or tuple of ints
- dtype := data type
- order := {'C', 'F'} row-major or column-major order in memory
- like := array_like
- Return: out := ndarray
# np.zeros(<m>)
np.zeros(5) # [0., 0., 0., 0., 0.]
np.zeros((5,), dtype=int) # [0, 0, 0, 0, 0]
# np.zeros((<n>, <m>)) # nxm matrix
np.zeros((2,1))
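The other creation functions listed above follow the same pattern as np.zeros. A short sketch, with shapes and fill values chosen only for illustration:

```python
import numpy as np

base = np.zeros((2, 3), dtype=int)

ones = np.ones_like(base)            # same shape and dtype as base, filled with 1
sevens = np.full((2, 2), 7)          # new array filled with the given value
sevens_like = np.full_like(base, 7)  # shape and dtype taken from base
eye = np.eye(3)                      # 3x3 identity matrix (float)
ident = np.identity(3)               # same result for a square matrix
col_major = np.zeros((2, 3), order='F')  # column-major layout in memory
as_arr = np.asarray([1, 2, 3])       # converts array_like input to an ndarray
```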
Assignment
- The “bare” slice [:] will assign to all values in an array.
arr[:] = <value> # assigns <value> to every element of the array
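A concrete version of slice assignment, as a minimal sketch with arbitrary values:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

arr[:] = 9       # bare slice: every element becomes 9
arr[0, :] = 1    # only the first row becomes 1
```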
np.column_stack()
- It stacks 1D arrays as columns into a 2D array.
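A small sketch of np.column_stack, with np.vstack shown for contrast (it stacks the same arrays as rows). The input arrays are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

cols = np.column_stack((a, b))   # each 1D array becomes a column, shape (3, 2)
rows = np.vstack((a, b))         # each 1D array becomes a row, shape (2, 3)
```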
np.diag()
np.diag([1, 2, 3])
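np.diag works in both directions: given a 1D sequence it builds a diagonal matrix, and given a 2D array it extracts the diagonal. A minimal sketch:

```python
import numpy as np

d = np.diag([1, 2, 3])    # 3x3 matrix with 1, 2, 3 on the diagonal, zeros elsewhere
extracted = np.diag(d)    # back to the 1D diagonal
```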
File operations
import numpy as np
# load floating-point numbers from a text file
floats = np.loadtxt('<file_name>')
# save an array to a binary .npy file
np.save('<file_name>', <variable>)
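A round-trip sketch for both the text and the binary format. The file names are hypothetical and are created inside a temporary directory; np.savetxt and np.load are the counterparts of the two calls above.

```python
import os
import tempfile

import numpy as np

data = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "floats.txt")   # hypothetical file name
    npy_path = os.path.join(tmp, "floats.npy")   # hypothetical file name

    # Text format: human-readable, one row per line.
    np.savetxt(txt_path, data)
    from_txt = np.loadtxt(txt_path)

    # Binary .npy format: keeps shape and dtype.
    np.save(npy_path, data)
    from_npy = np.load(npy_path)
```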
floor_divide
numpy.floor_divide(<array_like_numerator>, <array_like_denominator>)
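A minimal sketch of floor_divide with illustrative inputs. It divides element-wise and rounds toward negative infinity, which matches the // operator on arrays.

```python
import numpy as np

num = np.array([7, -7, 9])
den = np.array([2, 2, 4])

quotient = np.floor_divide(num, den)   # element-wise floor division
same_as_operator = num // den          # the // operator gives the same result
```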
Indexing and slicing
- arr[i:j]
- arr[i][j] <=> arr[i, j]
- arr[i]
  arr[i, j, k]
- The ellipsis ... expands to as many full slices (:) as are needed to cover the remaining dimensions. For a 4D array x: x[i, ...] <==> x[i, :, :, :]
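The indexing forms above, applied to a small 3D array. A sketch with an arbitrary shape:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

single = x[1, 2, 3]          # one element, same as x[1][2][3]
chained = x[1][2][3]
block = x[1]                 # first axis fixed, shape (3, 4)
sliced = x[0:1]              # slicing keeps the axis, shape (1, 3, 4)

# The ellipsis stands in for the remaining full slices.
with_ellipsis = x[1, ...]    # same as x[1, :, :]
last_axis = x[..., 0]        # same as x[:, :, 0], shape (2, 3)
```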
Boolean indexing
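Boolean indexing selects the elements where a boolean mask is True. A minimal sketch; the thresholds are arbitrary.

```python
import numpy as np

arr = np.arange(1, 11)

mask = arr > 2                  # boolean array, same shape as arr
selected = arr[mask]            # keeps elements where mask is True

# Combine conditions with & (and), | (or), ~ (not); parentheses are required.
even_and_big = arr[(arr > 2) & (arr % 2 == 0)]

# A mask can also be used for assignment.
capped = arr.copy()
capped[capped > 8] = 0
```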
Matrix
- data := array_like or string
- dtype := data type
- copy := bool
arr = np.matrix('1 2; 3 4')
# equals to
arr = np.matrix([[1, 2], [3, 4]])
ndarray
- N-dimensional array similar to list, but it has a fixed size and a common data type for all elements.
- It is a fast and flexible container for large datasets in Python.
- Items in ndarray can be fetched using a[i, j] or a[m:n, k:l] (a 2D slice).
data.shape # (n, m)
data.dtype # e.g. dtype('float64')
np.random.randint()
np.random.randint(low=<min_value>, high=<max_value>, size=(<n>, <m>)) # high is exclusive
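A concrete call, as a sketch that simulates die rolls. The bounds and shape are illustrative; note that high itself is never returned.

```python
import numpy as np

# 2x3 array of integers drawn from 1..6 (high=7 is excluded).
draws = np.random.randint(low=1, high=7, size=(2, 3))
```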
np.random.randn()
- Returns one or more samples from the “standard normal distribution”
from numpy.random import randn
data = {i : randn() for i in range(7)} # {0: <#>, 1: <#>, ..., 6: <#>}
data = randn(2, 3) # array([[<#>, <#>, <#>], [<#>, <#>, <#>]])
numpy.random.seed()
- We set seeds to ensure we see the same random numbers each time we run the program.
- numpy.random.seed() initializes the pseudo-random number generator, so the sequence of numbers that follows is produced by a predictable algorithm.
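A minimal sketch of the reproducibility that seeding gives. The seed value 42 is an arbitrary choice.

```python
import numpy as np

np.random.seed(42)            # arbitrary seed value
first = np.random.rand(3)

np.random.seed(42)            # same seed again
second = np.random.rand(3)    # same three numbers as first
```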
reshape()
${array_variable}.reshape(${number_of_rows}, ${number_of_columns})
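A short reshape sketch with illustrative sizes. The total number of elements must stay the same, and one dimension may be given as -1 to let NumPy infer it.

```python
import numpy as np

arr = np.arange(12)

grid = arr.reshape(3, 4)     # 3 rows, 4 columns
auto = arr.reshape(2, -1)    # -1 is inferred: 12 / 2 = 6 columns
```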
shape
${array_variable}.shape
size
np.size(${array_variable})
T (transpose)
<np_array>.T
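shape, size, and T together on one small array, as a minimal sketch:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

shape = arr.shape        # (rows, cols)
count = np.size(arr)     # total number of elements
transposed = arr.T       # rows and columns swapped
```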