NumPy in a Nutshell

Hello and welcome back. I have started a new category on my blog about Python. The purpose of this post is to go through the NumPy library. I will be using Jupyter for the demo, but I will also provide the py file in case you prefer to run it in PyCharm, for example. NumPy is a core Python library for numerical computing and linear algebra in Data Science. Its arrays are processed faster than native Python lists and come with a bunch of handy methods. Let’s make a start!

 

You can cast a normal list to a one-dimensional array using the array function.

Or take a list of lists and cast it to a two-dimensional array. The matrix in this example has 2 rows and 4 columns. The shape attribute gives the dimensions, and the size attribute gives the total number of elements in the array.
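Here is a minimal sketch of the two casts described above. The variable names are illustrative rather than taken from the notebook, and `ndim` is an extra standard attribute shown alongside `shape` and `size`.

```python
import numpy as np

# Illustrative names; the values mirror the lists used in this post
even_list = [20, 40, 60]
even_array = np.array(even_list)          # one-dimensional array

matrix = np.array([[10, 20, 30, 40],
                   [50, 60, 70, 80]])     # two-dimensional array

print(even_array.ndim, even_array.shape)  # 1 (3,)
print(matrix.ndim, matrix.shape)          # 2 (2, 4)
print(matrix.size)                        # 8
```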

The next section shows different ways to create NumPy arrays.

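The functions ones and zeros are a handy way to create arrays of 1s and 0s. linspace is similar to arange, but you specify how many evenly spaced values you want rather than the step size. Also check out reshape, which changes the shape of an array, and ravel(), which flattens it.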
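To make the difference between arange and linspace concrete, here is a small sketch with illustrative variable names. It also shows reshape and ravel() undoing each other on the same 15 values.

```python
import numpy as np

# arange takes a step size; the stop value is excluded
by_step = np.arange(5, 10, 2.5)      # [5.  7.5]

# linspace takes the number of points; the stop value is included by default
by_count = np.linspace(5, 10, 3)     # [ 5.   7.5 10. ]

# reshape gives the same 15 values a 3 x 5 shape; ravel flattens them again
grid = np.arange(15).reshape(3, 5)
flat = grid.ravel()

print(by_step)
print(by_count)
print(grid.shape, flat.shape)        # (3, 5) (15,)
```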

See examples of other useful methods below.
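The py file uses random integers for this part, so its output changes on every run. As a repeatable sketch, the snippet below uses a small fixed array (the values and names are made up for illustration) to show argmin, argmax and selection by condition.

```python
import numpy as np

# Fixed values (an assumption for illustration) so the output is repeatable
scores = np.array([42, 17, 88, 30, 65])

print(scores.argmin())        # 1, the position of 17
print(scores.argmax())        # 2, the position of 88

# The comparison yields a boolean array; indexing with it keeps the True positions
mask = scores > 30
print(mask)                   # [ True False  True False  True]
print(scores[mask])           # [42 88 65]
```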

Next, let’s have a look at selections and indexing.
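A short sketch of one-dimensional slicing, using a smaller array than the one in the py file (the names are illustrative). It also shows that a slice shares data with the original array, which is why copy() matters.

```python
import numpy as np

values = np.arange(10)            # [0 1 2 3 4 5 6 7 8 9]

print(values[2:5])                # [2 3 4]  (index 5 is excluded)
print(values[:3])                 # [0 1 2]
print(values[7:])                 # [7 8 9]

# A slice is a view: writing through it changes the original array
window = values[2:5]
window[:] = -1
print(values)                     # [ 0  1 -1 -1 -1  5  6  7  8  9]

# copy() gives independent data, so the original is left alone
safe = values.copy()
safe[0] = 99
print(values[0])                  # 0
```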

Great stuff. To illustrate the indexing, let’s create a new two-dimensional array.
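As a rough sketch of two-dimensional indexing, the snippet below uses the same 2 x 4 values as the py file under an illustrative name. The column selection with `:` is an extra case not shown in the file.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

print(grid[0])          # first row: [1 2 3 4]
print(grid[1, 3])       # row 1, column 3: 8
print(grid[:, 1])       # second column: [2 6]
print(grid[0:, 2:])     # all rows, columns 2 onwards
```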

Let’s see what other operations you can do apart from copy().

Let’s have a look at some basic operations such as += and *=, which change an existing array instead of creating a new one. Check out how to calculate the sum of all elements of an array, or find its min or max value, below.
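The sketch below walks through these operations on a small array of zeros. The `same_object` name is only there to illustrate that += and *= keep working on the same array, and the row sums with axis=1 are an extra case next to the column sums.

```python
import numpy as np

m = np.zeros((2, 4), dtype=int)
same_object = m          # second name for the same array

m += 5                   # in place: every element becomes 5
m *= 4                   # in place: every element becomes 20
print(same_object is m)  # True, no new array was created

print(m.sum(), m.min(), m.max())   # 160 20 20
print(m.sum(axis=0))               # column sums: [40 40 40 40]
print(m.sum(axis=1))               # row sums: [80 80]

b = np.arange(6).reshape(2, 3)
print(b.cumsum(axis=1))            # running total along each row
```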

As promised, see below the py file with all the examples.

import numpy as np

# normal list
v_even_list = [20, 40, 60]

print(v_even_list)

# cast to 1-dimensional array

print(np.array(v_even_list))

# 2-dimensional array

v_matrix = np.array([[10,20,30,40],
                     [50, 60, 70, 80]])

print(v_matrix)

print(v_matrix.shape)

print(v_matrix.size)




# Create Array:

# Use arange

v_array = np.arange(20)


print(v_array)


# Use array to create one-dimensional array


v_array1 = np.array([1,3,5])


print(v_array1)


# Use array to create n-dimensional array


v_array2 = np.array([[1,2,3],
                     [4,5,6],
                     [7,8,9],
                     [10,11,12]])



print(v_array2)


v_array3 = np.array([(2.25,3.25,4.25), (5,6,7)])



print(v_array3)


# Create array of 1s. Note default type is float64

v_one = np.ones((2,3))

print(v_one)


# Create array of 0s, specify the type needed


v_zero = np.zeros((2,4), dtype=np.int16)



print(v_zero)


# Array of 3 numbers between 5 and 10 in equal steps


v_eq_steps = np.linspace(5,10, 3)



print(v_eq_steps)


# reshape turns the 15 values into a 2-dimensional array with 3 rows and 5 columns.
# ravel() does the opposite and flattens the array


r = np.arange(15).reshape(3,5)

print(r)

# Array of random values in [0, 1), in this case a matrix with 2 rows and 3 columns

v_array = np.random.rand(2,3)


print(v_array)

# 20 random integers from 10 (inclusive) to 100 (exclusive)

v_arr_int = np.random.randint(10, 100, 20)


print(v_arr_int)


# The index of the min value in the array

print(v_arr_int.argmin())


# The index of the max value in the array

print(v_arr_int.argmax())


# Return the elements of the array where the value is > 30


print(v_arr_int[v_arr_int>30])




# Create a sample array


v_array = np.arange(20)


print(v_array)


# Slice from index 5 up to, but not including, index 10


print(v_array[5:10])


# Everything before index 10

print(v_array[:10])


# All elements from index 10 onwards


print(v_array[10:])


# We can assign a single value to a slice (the value is broadcast across it) and then slice again


v_array[15:20]=-5


print(v_array)

v_slice_array = v_array[15:20]


print(v_slice_array)


# Broadcasting actually changes the original array in place.
# A slice is a view of the same data, so use v_array.copy() to keep the original values

# Create a sample matrix



v_matrix = np.array([[1,2,3,4],
                    [5,6,7,8]])


print(v_matrix)


# Get the row specified by the index

print(v_matrix[0])


# Get just one value - the element from the last row and last column

print(v_matrix[1,3])



# Return a submatrix, e.g. a slice with all rows from row 0 and all columns from column 2 onwards


print(v_matrix[0:,2:])


# Nice one!


# Create a sample with 0s


m = np.zeros((2,4), dtype = int)


print(m)


# Modify the existing array m in place to add 5

m += 5

print(m)


# Modify m in place to multiply by 4

m *= 4


print(m)

print(m.sum())


print(m.min())


print(m.max())


# Sum of each column

print(m.sum(axis = 0))


# Cumulative sum of each row

b = np.arange(6).reshape(2,3)

print(b)

print(b.cumsum(axis = 1))


 

That’s all for now. Stay tuned.

 

Cheers,

Maria
