Building a simple Neural Network

I put here the things I am learning related to tech

In this article, I walk through how you can build your own simple neural network. It is meant for beginners who like to dig deeper into what they are currently learning. I took a Linear Algebra course by Andrew Ng, and building a simple neural network was part of it.


I am not covering backpropagation in any depth in this article, since I don’t know enough about it at this point.


Flow of the article

This article covers the following topics at a beginner level: initializing the parameters, forward propagation, building a cost function from scratch, and a brief, theory-only mention of backpropagation.

THE PREREQUISITES

So, before starting, we need a few things for the project: the NumPy library, and two values that set the dimensions of the matrices we will use.

Neural networks are built on matrices, so you need to be a little familiar with linear algebra to follow this article.

import numpy as np

a = 1   # number of rows
b = 30  # number of columns

INITIALISING THE PARAMETERS

The idea behind a model is simple: to make predictions, it needs some initial parameters, which are its weights and biases.

A simple linear regression model with one input has two parameters: one is the weight that multiplies the input, and the other is the bias that is added to the result. By changing these two values, you can vary your model’s output. So, your model has two degrees of freedom.

In this project, we use NumPy's random module to initialize the weights, so the model's first predictions come from random values.
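To make the "two degrees of freedom" concrete, here is a minimal sketch of a one-input model, assuming the usual form y = w * x + b. The names `predict`, `w`, and `b` are illustrative and are not part of the project code.

```python
# Hypothetical one-input linear model; predict, w and b are illustrative names.
def predict(x, w, b):
    """Return the model output w * x + b for a single input x."""
    return w * x + b

# The same input gives a different output when either parameter changes.
y1 = predict(2.0, w=3.0, b=1.0)  # 3 * 2 + 1
y2 = predict(2.0, w=3.0, b=0.0)  # only the bias changed
y3 = predict(2.0, w=0.5, b=1.0)  # only the weight changed
print(y1, y2, y3)  # 7.0 6.0 2.0
```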

def ini_perams(a, b):
    # Small random weights of shape (a, b), drawn from [0, 0.001)
    weights = np.random.rand(a, b) * 0.001
    # One bias per output row; NumPy broadcasts it across every example
    bias = np.zeros((a, 1))

    perameters = {"W": weights, "B": bias}
    return perameters

This function initializes the parameters discussed above. The weights are random, but we multiply them by 0.001 to keep the values small; the bias starts as a matrix of 0s. The weight matrix has dimension (1×30). The bias has dimension (1×1), because a (1×30) bias would only line up with the model's output when there are exactly 30 input examples.

In the end, we are returning a dictionary containing weight and bias as values.
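As a quick sanity check on the scaling, the sketch below draws weights the same way and inspects their shape and range. The fixed seed is an assumption added here only to make the run repeatable.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so the run is repeatable

a, b = 1, 30  # rows and columns, as in the article

# np.random.rand draws from [0, 1), so scaling keeps every weight tiny.
weights = np.random.rand(a, b) * 0.001

print(weights.shape)  # (1, 30)
print(weights.min(), weights.max())
```

Every value stays between 0 and 0.001, so the first predictions are close to zero regardless of the input.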

FORWARD PROPAGATION

If you are familiar with a little bit of machine learning and know how a model calculates its outputs, you understand forward propagation.

Any model starts with assumptions about the data; without making any assumptions about the data, there is no reason to choose one model over another (the No-Free-Lunch theorem).

Once your model has its parameters, which encode those assumptions, you need to calculate the output it produces when data is fed into it. This process of calculating the model's outputs from its inputs and current parameters is called forward propagation.

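def forward_prop(X, perameters):
    # Unpack the weights and bias from the parameters dictionary
    weights = perameters['W']
    bias = perameters['B']

    # Matrix product of weights and inputs, plus the bias
    Z = weights @ X + bias
    return Z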

This function reads the weights and bias from the dictionary returned by the parameter-initialization function and calculates the model's output for them.

X is the input matrix; it stores all the input values that we feed into the model. With weights of dimension (1×30), X needs 30 rows, with one column per example.

We take the matrix product of weights and X and then add the bias, which gives us the final output matrix Z.

Z contains one prediction for each example in X.
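A small worked example may help with the shapes. This is only a sketch with made-up numbers: it uses 3 input features and 2 examples instead of 30, and it assumes a single bias value that NumPy broadcasts across the examples.

```python
import numpy as np

# Illustrative values only: 3 input features and 2 examples.
W = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3): one row of weights
B = np.zeros((1, 1))             # assumption: one bias, broadcast over examples
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])       # shape (3, 2): one column per example

Z = W @ X + B                    # shape (1, 2): one prediction per example
print(Z)  # [[4. 5.]]
```

The first prediction is 1*1 + 2*0 + 3*1 = 4 and the second is 1*0 + 2*1 + 3*1 = 5.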

But how do we decide whether our predictions are good enough? We have to calculate the error in our predictions.

To get the best model, we have to minimize the error. We measure that error using a cost function.

COST FUNCTION

The cost function is very simple. You take the predictions of the model and compute the difference between each predicted value and the actual output value for that same input. This gives you the error in your prediction.

We square each error and then add up all the squared errors in our predictions. Averaged over the examples, this is the MSE (Mean Squared Error). There is no square root involved, so it is not RMSE. It is closely related to the squared L2 (Euclidean) norm of the error vector.

We follow the same procedure in the function below. Dividing by m averages the squared errors over the m examples, and the extra factor of 1/2 is a convention that cancels the 2 that appears when the squared term is differentiated.

def cost_function(Y_dash, Y):
    # Number of examples: Y has one column per example
    m = Y.shape[1]
    # Sum of squared errors, divided by 2m
    error = np.sum((Y_dash - Y)**2) / (2 * m)

    return error, m

Y_dash holds our model's predictions (the matrix Z), and Y is a matrix containing all the actual values.

We use np.sum to add up all the squared errors and then divide the total by 2m, where m is the number of columns in the matrix Y (Y.shape[1]), that is, the number of examples.
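Here is a short worked example of the calculation, cost = sum((Y_dash - Y)^2) / (2m). It is a sketch with made-up numbers; `half_mse` is an illustrative name for a function that returns only the cost value.

```python
import numpy as np

def half_mse(Y_dash, Y):
    """Halved mean squared error for row-format arrays of shape (1, m)."""
    m = Y.shape[1]  # number of examples (columns)
    return np.sum((Y_dash - Y) ** 2) / (2 * m)

Y_dash = np.array([[4.0, 5.0]])  # made-up predictions
Y = np.array([[3.0, 7.0]])       # made-up actual values

cost = half_mse(Y_dash, Y)
print(cost)  # 1.25
```

The errors are 1 and -2, their squares are 1 and 4, and 5 / (2 * 2) gives 1.25.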

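In this project, we are using matrices in the row format: Y and Z have dimension (1×m), with one column per example. The other format, which I believe is used a lot as well, is the column one, with one example per row. If you are using column matrices, then swap the order of weights and X in our forward_prop function (and transpose weights) so that the shapes still line up, and read m from Y.shape[0] in the cost function.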
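The two layouts give the same predictions, only transposed. The sketch below checks this with made-up numbers; the names `X_cols` and `X_rows` are illustrative.

```python
import numpy as np

W = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3)

# Layout used in this article: one column per example.
X_cols = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])  # shape (3, 2)
Z_cols = W @ X_cols              # shape (1, 2)

# Alternative layout: one row per example, so the product order flips.
X_rows = X_cols.T                # shape (2, 3)
Z_rows = X_rows @ W.T            # shape (2, 1)

print(np.allclose(Z_cols, Z_rows.T))  # True
```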

The goal of any good ML engineer is to minimize the error.

Minimizing it means adjusting the weights and biases (the parameters) of the model until the predictions improve.

Backpropagation is the part of that process that works out how the cost changes with respect to each parameter, so that the parameters can be nudged in the direction that reduces the error.

However, backpropagation is a topic for another day!