From 57afa63edfaebf4e12716a879ffcecc38a2f9a06 Mon Sep 17 00:00:00 2001
From: Tanuj Vishnoi
Date: Mon, 5 Nov 2018 16:23:46 +0530
Subject: [PATCH] R Squared Complete

---
 README.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 01b5c76..484a361 100644
--- a/README.md
+++ b/README.md
@@ -13,6 +13,24 @@ Random Forest is a team of Decision Trees.
 
 # Evaluation Regression Models
 
-## R-Squared Intitution
+## R-Squared
 
-SSres = SUM (yi - ypredi)2
\ No newline at end of file
+SSres = SUM (y_i - ypred_i)^2
+SStot = SUM (y_i - yavg)^2
+
+R2 = 1 - (SSres/SStot)
+
+It tells us how good our best-fit line is compared to the average line.
+Ideal: R2 = 1
+R2 can be negative when the model fits the data worse than the average line (SSres > SStot). It means the model is completely broken.
+
+## Adjusted R-Squared
+
+Whenever another variable is added to the regression model, SSres can only stay the same or shrink, so R2 never decreases, even if the new variable does not affect the prediction or affects it negatively, because there is always some correlation in the data. The problem is that we can keep adding variables without knowing whether they helped the model.
+
+Adjusted R2 = 1 - (1 - R2)((n-1)/(n-p-1))
+
+ p = number of regressors
+ n = sample size
+
+ Adjusted R2 has a penalizing factor: it penalizes adding variables that don't help the model.
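The two formulas this patch adds to the README can be checked with a minimal Python sketch. It assumes plain lists of observed and predicted values; the function names `r_squared` and `adjusted_r_squared` and the sample data are hypothetical and are not part of the README.

```python
def r_squared(y, y_pred):
    """R2 = 1 - SSres / SStot, following the README's definitions."""
    y_avg = sum(y) / len(y)
    # SSres: squared error of the model's predictions
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    # SStot: squared error of the average line
    ss_tot = sum((yi - y_avg) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1).

    n is the sample size, p the number of regressors (needs n > p + 1).
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical data, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
good = [1.1, 1.9, 3.2, 3.8, 5.0]   # close to the observations
bad = [5.0, 4.0, 3.0, 2.0, 1.0]    # reversed: worse than the average line

r2_good = r_squared(y, good)
r2_bad = r_squared(y, bad)         # negative, since SSres > SStot

# Same R2, more regressors: the adjusted value is pulled down.
adj_one = adjusted_r_squared(r2_good, n=len(y), p=1)
adj_two = adjusted_r_squared(r2_good, n=len(y), p=2)
```

With this data the good fit gives an R2 just below 1 and the reversed predictions give a negative R2. For a fixed R2, raising `p` lowers the adjusted value, which is the penalty described in the README text.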