Next, we find the line that minimizes the sum of the squared vertical distances between the points and the line. This is the linear regression line.
However, the value of Y for a given value of X cannot be determined exactly. We therefore describe the relationship between the firm's sales (Y) and its advertising expenditure (X) in probabilistic terms. The vertical distances are the errors, which gives the following stochastic relationship.
Y = a + bX + U, where a is the intercept, b is the slope, and U is the error term.
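As a rough illustration of how the intercept a and slope b can be estimated, the sketch below applies the standard closed-form least-squares formulas for a single explanatory variable. The function name fit_line and the advertising and sales figures are hypothetical, chosen only to make the example runnable; they are not data from the text.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line Y = a + bX that minimizes the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of X and Y, and sum of squared deviations of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the fitted line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data: advertising expenditure (X) and sales (Y)
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(advertising, sales)
print(f"a = {a:.2f}, b = {b:.2f}")  # a = 2.20, b = 0.60

# The error term U for each point is the vertical distance from the fitted line
errors = [y - (a + b * x) for x, y in zip(advertising, sales)]
```

The fitted line does not pass through every point; the leftover vertical distances in errors play the role of U in the relationship above.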
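r², a measure of goodness-of-fit of linear regression
r² = 1 - (SSreg / SStot)
where SSreg is the sum of the squared errors, that is, the squared vertical distances of the points from the regression line (many texts call this quantity the residual sum of squares), and SStot is the sum of the squared vertical distances of the points from the horizontal line at the mean value of Y.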
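The calculation of r² can be sketched as follows. The function name r_squared, the data, and the coefficients a = 2.2 and b = 0.6 are hypothetical (the coefficients are the least-squares values for this made-up data set), and the variable names ss_reg and ss_tot simply mirror the symbols used above.

```python
def r_squared(xs, ys, a, b):
    """Return r² = 1 - (SSreg / SStot) for the line Y = a + bX."""
    mean_y = sum(ys) / len(ys)
    # SSreg as defined in the text: sum of squared errors about the regression line
    ss_reg = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # SStot: sum of squared distances from the horizontal line at the mean of Y
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_reg / ss_tot


# Hypothetical data: advertising expenditure (X) and sales (Y)
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]

# Hypothetical coefficients, assumed to come from a least-squares fit of this data
r2 = r_squared(advertising, sales, a=2.2, b=0.6)
print(f"r² = {r2:.2f}")  # r² = 0.60
```

A value of 1 would mean every point lies exactly on the line, while a value near 0 means the line explains little more than the mean of Y does.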