Code a Neural Network in plain NumPy: Part 1 (with no Hidden Layer)

You will learn to:

Build the general architecture of a learning algorithm, including:
Initializing parameters
Calculating the cost function and its gradient
Using an optimization algorithm (gradient descent)

Gather all three functions above into a main model function, in the right order.

Overview of the Problem set

Problem Statement: You are given a dataset containing:

- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).

You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Show me the code

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset # for this lib git link is provided
%matplotlib inline

Let’s get more familiar with the dataset.In [183]:

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

In [28]:

train_set_x_orig.shape

Out[28]:

(209, 64, 64, 3)

In [29]:

train_set_y.shape

Out[29]:

(1, 209)

In [21]:

classes

Out[21]:

array([b'non-cat', b'cat'], dtype='|S7')

In [184]:

# Example of a picture
index = 200
plt.imshow(train_set_x_orig[index])

Out[184]:

<matplotlib.image.AxesImage at 0x22c6a3ba240>

Exercise: 1

Find the values for:

m_train (number of training examples)
m_test (number of test examples)
num_px (= height = width of a training image) Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

In [186]:

m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)

Exercise:2

Standardize and Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).In [66]:

train_set_x_flatten =train_set_x_orig.reshape(train_set_x_orig.shape[1]*train_set_x_orig.shape[1]*3,train_set_x_orig.shape[0])

In [67]:

train_set_x_flatten .shape

Out[67]:

(12288, 209)

In [54]:

test_set_x_flatten =test_set_x_orig.reshape(test_set_x_orig.shape[1]*test_set_x_orig.shape[1]*3,test_set_x_orig.shape[0])

In [56]:

test_set_x_flatten.shape

Out[56]:

(12288, 50)

To represent color images, the red, green and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255. Let’s standardize our dataset.In [69]:

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

General Architecture of the learning algorithm

Initialize the parameters of the model
Learn the parameters for the model by minimizing the cost
Use the learned parameters to make predictions (on the test set)
Analyse the results and conclude

Building the parts of our algorithm

The main steps for building a Neural Network are: Define the model structure (such as number of input features) Initialize the model’s parameters Loop: Calculate current loss (forward propagation) Calculate current gradient (backward propagation) Update parameters (gradient descent) You often build 1-3 separately and integrate them into one function we call model().

Exercise:3

Implement sigmoid() function .In [188]:

def sigmoid(z):
    return 1/(1+np.exp(-z))

In [189]:

print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))

sigmoid([0, 2]) = [0.5        0.88079708]

Initializing parameters

Exercise:4

Implement parameter initialization. You have to initialize w as a vector of zeros. If you don’t know what numpy function to use, look up np.zeros() in the Numpy library’s documentation.In [192]:

def initialize_with_zeros(dim):
    w=np.zeros((dim,1))
    b=0
    return w,b

In [193]:

dim = 5
w, b = initialize_with_zeros(dim)
print ("w = " + str(w))
print ("b = " + str(b))

w = [[0.]
 [0.]
 [0.]
 [0.]
 [0.]]
b = 0

Forward and Backward propagation

Exercise:5

Implement a function propagate() that computes the cost function and its gradient.In [194]:

def propagate(w, b, X, Y):
    m=X.shape[1]
    A=sigmoid((np.dot(w.T,X) + b))
    yloga=np.multiply(Y,np.log(A))
    ylogaa=np.multiply((1-Y),np.log(1-A))
    cost=(-1/m)*(np.sum(yloga+ylogaa))
    dw=(1/m)*(np.dot(X,(A-Y).T))
    db=(1/m)*(np.sum(A-Y))
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    grads = {"dw": dw,
             "db": db}
    return grads, cost

In [195]:

w, b, X, Y = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]]), np.array([[1,0,1]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))

dw = [[0.99845601]
 [2.39507239]]
db = 0.001455578136784208
cost = 5.801545319394553

Optimization

You have initialized your parameters. You are also able to compute a cost function and its gradient. Now, you want to update the parameters using gradient descent.

Exercise: 6

Write down the optimization function. The goal is to learn w and b by minimizing the cost function J . For a parameter θ ,the update rule is θ=θ−α*dθ ,where α is the learning rate.In [196]:

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    costs=[]
    for i in range(num_iterations):
        grads, cost=propagate(w, b, X, Y)
        
        dw=grads['dw']
        db=grads['db']
        
        w=w-np.multiply(learning_rate,dw)
        b=b-np.multiply(learning_rate,db)
        
        costs.append(cost)
        
        # Record the costs
        if i % 100 == 0:
            costs.append(cost)
            
        # Print the cost every 100 training iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
    
    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db} 
    return params, grads, costs

In [197]:

params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)

print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))

w = [[0.19033591]
 [0.12259159]]
b = 1.9253598300845747
dw = [[0.67752042]
 [1.41625495]]
db = 0.21919450454067652

Exercise: 7

We are able to learned w and b for a dataset X. Now Implement the predict() function.In [147]:

def predict(w, b, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1,m))
    A=sigmoid(np.dot(w.T,X)+b)
    for i in range(A.shape[1]):
            if A[0,i] > 0.5:
                Y_prediction[0][i] = 1
            else:
                Y_prediction[0][i] = 0
    assert(Y_prediction.shape == (1, m))
    return Y_prediction

In [148]:

w = np.array([[0.1124579],[0.23106775]])
b = -0.3
X = np.array([[1.,-1.1,-3.2],[1.2,2.,0.1]])
print ("predictions = " + str(predict(w, b, X)))

predictions = [[1. 1. 0.]]

Merge all functions into a model

You will now see how the overall model is structured by putting together all the building blocks (functions implemented in the previous parts) together, in the right order.

Exercise: Implement the model function. Use the following notation:

Y_prediction_test for your predictions on the test set
Y_prediction_train for your predictions on the train set
w, costs, grads for the outputs of optimize()

In [198]:

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    dim=X_train.shape[0]
    w,b=initialize_with_zeros(dim)
    params, grads, costs = optimize(w, b, X_train, Y_train, num_iterations= num_iterations, learning_rate= learning_rate, print_cost= True)
    
    Y_prediction_train=predict(params["w"], params["b"], X_train)
    Y_prediction_test=predict(params["w"],params["b"], X_test)
    
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : params["w"], 
         "b" : params["b"],
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}
    
    return d

In [203]:

d=model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

Cost after iteration 0: 0.693147
Cost after iteration 100: 0.709726
Cost after iteration 200: 0.657712
Cost after iteration 300: 0.614611
Cost after iteration 400: 0.578001
Cost after iteration 500: 0.546372
Cost after iteration 6: 0.518331
Cost after iteration 700: 0.492852
Cost after iteration 800: 0.469259
Cost after iteration 900: 0.447139
Cost after iteration 1000: 0.426262
Cost after iteration 1100: 0.406617
Cost after iteration 1200: 0.388723
Cost after iteration 1300: 0.374678
Cost after iteration 1400: 0.365826
Cost after iteration 1500: 0.358532
Cost after iteration 1600: 0.351612
Cost after iteration 1700: 0.345012
Cost after iteration 1800: 0.338704
Cost after iteration 1900: 0.332664
train accuracy: 91.38755980861244 %
test accuracy: 34.0 %

In [162]:

# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

In [165]:

classes

Out[165]:

array([b'non-cat', b'cat'], dtype='|S7')

In [175]:

# Example of a picture that was wrongly classified.
index = 2
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
# print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[d["Y_prediction_test"][0,index]].decode("utf-8") +  "\" picture.")

Out[175]:

<matplotlib.image.AxesImage at 0x22c6a2d2f60>

In [ ]:

learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

learning rate is: 0.01
Cost after iteration 0: 0.693147
Cost after iteration 100: 2.321788
Cost after iteration 200: 3.011239
Cost after iteration 300: 0.483519
Cost after iteration 400: 1.297533
Cost after iteration 500: 1.215430
Cost after iteration 600: 1.135770
Cost after iteration 700: 0.901737
Cost after iteration 800: 0.821976
Cost after iteration 900: 0.791033
Cost after iteration 1000: 0.762400
Cost after iteration 1100: 0.736228
Cost after iteration 1200: 0.711983
Cost after iteration 1300: 0.689076
Cost after iteration 1400: 0.667013
train accuracy: 71.29186602870814 %
test accuracy: 64.0 %

-------------------------------------------------------

learning rate is: 0.001
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.605784
Cost after iteration 200: 0.589938
Cost after iteration 300: 0.577890
Cost after iteration 400: 0.567791
Cost after iteration 500: 0.559013
Cost after iteration 600: 0.551207
Cost after iteration 700: 0.544146
Cost after iteration 800: 0.537671
Cost after iteration 900: 0.531668
Cost after iteration 1000: 0.526054
Cost after iteration 1100: 0.520764
Cost after iteration 1200: 0.515752
Cost after iteration 1300: 0.510979
Cost after iteration 1400: 0.506416
train accuracy: 74.16267942583733 %
test accuracy: 34.0 %

-------------------------------------------------------

learning rate is: 0.0001
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.636292
Cost after iteration 200: 0.630322
Cost after iteration 300: 0.625487
Cost after iteration 400: 0.621470
Cost after iteration 500: 0.618051
Cost after iteration 600: 0.615075
Cost after iteration 700: 0.612432
Cost after iteration 800: 0.610042
Cost after iteration 900: 0.607850
Cost after iteration 1000: 0.605814
Cost after iteration 1100: 0.603904
Cost after iteration 1200: 0.602098
Cost after iteration 1300: 0.600377
Cost after iteration 1400: 0.598731

lr_utils.py code

import numpy as np
import h5py

def load_dataset():
 train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
 train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
 train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
 test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
 test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
 test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

 classes = np.array(test_dataset["list_classes"][:]) # the list of classes

 train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
 test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

 return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

REFERENCES:

coursera.org
deep learning specialization

Code a Neural Network in plain NumPy: Part 1 (with no Hidden Layer)

You will learn to:

Overview of the Problem set

Show me the code

Exercise: 1

Exercise:2

General Architecture of the learning algorithm

Building the parts of our algorithm

Exercise:3

Initializing parameters

Exercise:4

Forward and Backward propagation

Exercise:5

Optimization

Exercise: 6

Exercise: 7

Merge all functions into a model

REFERENCES:

Published by llamasearch

Leave a comment Cancel reply

You will learn to:

Overview of the Problem set

Show me the code

Exercise: 1

Exercise:2

General Architecture of the learning algorithm

Building the parts of our algorithm

Exercise:3

Initializing parameters

Exercise:4

Forward and Backward propagation

Exercise:5

Optimization

Exercise: 6

Exercise: 7

Merge all functions into a model

REFERENCES:

Share this:

Related

Published by llamasearch

Leave a comment Cancel reply