In this article, I will share my mathematical intuition about the R2 score, MSE, and RMSE. After reading this you will have a better sense of what these metrics are and how to make sense of them. All of the metrics mentioned in the title are used to evaluate how well your model performs on a given dataset. This gives you an idea of whether you need to make some modifications to your model and where you are going.
What is an Error?
In machine learning, as in life, there are things that we don’t want. We have expectations, and when something does not go according to them, we call it an error. So in machine learning, there is a particular value that we expect our model to return, and if the prediction does not come out as we want it to, we call the difference an error.
The basic error is calculated as the difference between the actual value and the predicted value. We then like to square it so that it is always non-negative, but more on that later.
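To make this concrete, here is a minimal sketch in Python (NumPy assumed; the numbers are made up purely for illustration):

```python
import numpy as np

# Made-up actual values and model predictions, just for illustration
y_actual = np.array([3.0, 5.0, 2.5, 7.0])
y_predicted = np.array([2.8, 5.4, 2.9, 6.1])

errors = y_actual - y_predicted   # raw errors, can be positive or negative
squared_errors = errors ** 2      # squaring makes every term non-negative

print(errors)          # [ 0.2 -0.4 -0.4  0.9]
print(squared_errors)  # [0.04 0.16 0.16 0.81]
```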
What is R2 score?
Lil Maths
To put it simply, the R2 score is mathematically given as below →
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Now to understand what each term means we have to understand it part by part.
Look at the numerator of the fraction being subtracted from 1; that quantity is called the residual sum of squares.
To understand it more clearly, note that y with the cap (ŷ) is what we call the model’s prediction. For each data point, we subtract the model’s prediction from the actual value. That difference is the error in the model’s prediction, and then we take the square of each of those errors and add them all up.
Now if you look at the term in the denominator you will find another similar-looking sum; this is called the total sum of squares. It is basically the squared difference between each actual value of the dependent variable and the mean of those actual values, summed over all the data points.
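If you want to see those two sums in code, here is a minimal sketch (same made-up numbers as before; NumPy assumed):

```python
import numpy as np

y_actual = np.array([3.0, 5.0, 2.5, 7.0])
y_predicted = np.array([2.8, 5.4, 2.9, 6.1])

# Residual sum of squares: squared errors of the model's predictions
rss = np.sum((y_actual - y_predicted) ** 2)

# Total sum of squares: squared deviations of the actual values from their mean
tss = np.sum((y_actual - y_actual.mean()) ** 2)

r2 = 1 - rss / tss
print(r2)  # roughly 0.91 for these made-up numbers
```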
Exploring the possibilities
Now when you combine both of them you get the ratio RSS/TSS, and there are a few things you can say about that quantity. One is that it will never be negative, because you are adding up squares, so there are no negative values. If the numerator is greater than the denominator the ratio will be more than 1, if it is less than the denominator the ratio will be less than 1, and if the two are equal the ratio will be exactly 1.
Now, if you have understood that clearly, the game is simple. To make it even simpler, here is an image to help guide you through the cases.
If the value of R2 is 0, then the sum of squared differences between your model’s predictions and the actual values is exactly equal to the sum of squared differences between the actual values and their mean, so that beast-looking fraction becomes 1, since the numerator and denominator are the same and cancel out.

1 − 1 = 0.
When your R2 score is 1, it means that the sum of squared differences between the predicted values and the actual values is 0, which means your predictions line up exactly with the actual values of the dependent variable. In other words, your model has predicted exactly the true outputs.
In the case of the R2 score being negative (< 0), it means that the sum of squared differences between the predicted values and the actual values is greater than the sum of squared differences between the actual values and their mean. In other words, your model is doing worse than a trivial model that just predicts the mean every time.
However, in real life your score will mostly fall somewhere in between these three cases.
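You can see all three cases in a quick sketch. The predictions below are deliberately constructed to hit each case; I am using scikit-learn’s r2_score here, but the hand-rolled formula above gives the same answers:

```python
import numpy as np
from sklearn.metrics import r2_score

y_actual = np.array([3.0, 5.0, 2.5, 7.0])

# Case 1: perfect predictions -> R2 = 1
print(r2_score(y_actual, y_actual))  # 1.0

# Case 2: always predicting the mean -> R2 = 0
y_mean = np.full_like(y_actual, y_actual.mean())
print(r2_score(y_actual, y_mean))    # 0.0

# Case 3: predictions worse than the mean -> R2 < 0
y_bad = np.array([10.0, 0.0, 9.0, 1.0])
print(r2_score(y_actual, y_bad))     # a negative number
```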
Mean Squared Error (MSE)
The mathematical formula for mean squared error is given below →
$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Understanding the mathematical aspect of this formula is pretty simple: it gives us the average value of the squared error.
To give an analogy, let us say there are 10 apples, which represent the total squared error of the model, and there are 5 data points. Now ask how many apples each data point gets, or, translating the analogy into mathematical terms, how much squared error there is per data point.
That’s the Mean Squared Error.
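In code, MSE is a one-liner; here is a minimal sketch with the same made-up arrays as before (scikit-learn’s mean_squared_error would give the same result):

```python
import numpy as np

y_actual = np.array([3.0, 5.0, 2.5, 7.0])
y_predicted = np.array([2.8, 5.4, 2.9, 6.1])

# Average of the squared errors: total squared error spread over the data points
mse = np.mean((y_actual - y_predicted) ** 2)
print(mse)  # 0.2925, i.e. a total squared error of 1.17 over 4 points
```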
LIMITATIONS
One of the limitations of MSE is that it squares the errors, so you should not be alarmed when your MSE looks large. It is also heavily affected by outliers, since their already-large errors get squared too.
Another limitation is that it is affected by the scale of your data: if your values are in the hundreds or thousands, then your MSE is going to be pretty high, because, as is clear from the formula, it squares the errors. Better to learn that early.
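Here is a small sketch of that scale problem: multiplying both the targets and the predictions by 10 blows the MSE up by a factor of 100, even though the model is relatively just as good:

```python
import numpy as np

y_actual = np.array([3.0, 5.0, 2.5, 7.0])
y_predicted = np.array([2.8, 5.4, 2.9, 6.1])

mse_small = np.mean((y_actual - y_predicted) ** 2)
mse_big = np.mean((10 * y_actual - 10 * y_predicted) ** 2)

print(mse_small)  # 0.2925
print(mse_big)    # 29.25 -> 100x larger, purely because of the scale
```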
Then what is it that is going to help us?
Root Mean Squared Error (RMSE)
We had to deal with a few issues in the case of MSE, so here we introduce RMSE. The MSE was getting pretty high because we were squaring the errors, so the value we ultimately get is in squared units.
Imagine saying your predictions are off by 2500 dollars squared, that does not make any sense.
Therefore we take the square root of the MSE to fix this issue, and the units of the error become interpretable again.
The mathematical formula for RMSE is →
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
That’s why we have RMSE.
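Here is a minimal sketch of RMSE, which is just the square root of the MSE computed earlier (NumPy assumed):

```python
import numpy as np

y_actual = np.array([3.0, 5.0, 2.5, 7.0])
y_predicted = np.array([2.8, 5.4, 2.9, 6.1])

# Square root of the mean squared error, back in the target's own units
rmse = np.sqrt(np.mean((y_actual - y_predicted) ** 2))
print(rmse)  # about 0.54, in the same units as y_actual
```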
FINAL WORDS
Thanks for reading the whole article. I hope that I have made my point clear.
Do you have any suggestions? You can share them below.
Want to support? Follow!