R Squared Complete
vishnoitanuj committed Nov 5, 2018
1 parent 32e1377 commit 57afa63
Showing 1 changed file with 20 additions and 2 deletions.
22 changes: 20 additions & 2 deletions README.md
@@ -13,6 +13,24 @@ Random Forest is a team of Decision Trees.

# Evaluation Regression Models

## R-Squared

SS<sub>res</sub> = SUM (y<sub>i</sub> - y<sub>pred<sub>i</sub></sub>)<sup>2</sup>
SS<sub>tot</sub> = SUM (y<sub>i</sub> - y<sub>avg</sub>)<sup>2</sup>

R<sup>2</sup> = 1 - (SS<sub>res</sub>/SS<sub>tot</sub>)

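R<sup>2</sup> tells us how well the best-fit line explains the data compared to the average line (always predicting y<sub>avg</sub>).
Ideal: R<sup>2</sup> = 1
R<sup>2</sup> can be negative when the model fits the data worse than the average line, i.e. when SS<sub>res</sub> > SS<sub>tot</sub>. That means the model is worse than simply predicting the average.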
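
To make the three cases concrete (perfect fit, average line, worse than average), here is a minimal sketch in plain Python. The function name `r_squared` and the sample values are made up for illustration; they are not part of this repository.

```python
def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    y_avg = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: error of the "average line".
    ss_tot = sum((y - y_avg) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all y values are identical")
    return 1 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]                      # made-up observations
perfect = r_squared(y, y)                     # predictions equal the data
average = r_squared(y, [2.5, 2.5, 2.5, 2.5])  # always predict the average
worse = r_squared(y, [4.0, 3.0, 2.0, 1.0])    # trend predicted backwards
print(perfect, average, worse)                # 1.0 0.0 -3.0
```

The backwards prediction gives SS<sub>res</sub> = 20 against SS<sub>tot</sub> = 5, which is why R<sup>2</sup> drops below zero.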

## Adjusted R-Squared

Whenever another variable is added to a least-squares regression model, SS<sub>res</sub> on the training data can only stay the same or shrink, so R<sup>2</sup> never decreases. This holds even if the new variable has no real effect on the prediction, or would hurt it on new data, because there is almost always some chance correlation in the sample. The problem is that R<sup>2</sup> alone cannot tell us whether an added variable actually helped the model.

Adjusted R<sup>2</sup> = 1 - (1 - R<sup>2</sup>)((n-1)/(n-p-1))

p = number of regressors
n = sample size

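Adjusted R<sup>2</sup> has a penalizing factor: it penalizes adding regressors that do not help the model. As p grows, (n-1)/(n-p-1) grows, so a new variable raises Adjusted R<sup>2</sup> only if it improves R<sup>2</sup> enough to offset the penalty.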
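
A small sketch of the penalty at work, in plain Python. The helper name `adjusted_r_squared` and the R<sup>2</sup> values (0.800 rising to 0.805 after adding a third regressor, with n = 20) are hypothetical numbers chosen for illustration, not results from any dataset.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2: ordinary R^2, n: sample size, p: number of regressors.
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for Adjusted R^2 to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 20                                           # hypothetical sample size
before = adjusted_r_squared(0.800, n, p=2)       # model with two regressors
after = adjusted_r_squared(0.805, n, p=3)        # third regressor barely helps
print(round(before, 4), round(after, 4))
print("third regressor helped:", after > before)
```

Here R<sup>2</sup> went up slightly, yet Adjusted R<sup>2</sup> goes down (roughly 0.776 to 0.768), which suggests the extra variable was not worth adding.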
