On the Misconception of R^2 for (r)^2 in a Regression Model
- December 26, 2019
- Posted by: RSIS
- Categories: IJRSI, Mathematics
International Journal of Research and Scientific Innovation (IJRSI) | Volume VI, Issue XII, December 2019 | ISSN 2321–2705
On the Misconception of R^2 for (r)^2 in a Regression Model
Ijomah, Maxwell Azubuike
Dept. of Maths/Statistics, University of Port Harcourt, Nigeria
Abstract: The coefficient of determination (R^2) is perhaps the single most extensively used measure of goodness of fit for regression models. It measures the proportion of variation in the dependent variable explained by the predictors included in the model. It is, however, widely misinterpreted as the square of the correlation coefficient, and this has led to poor interpretation of research reports that use regression models. In this paper, we investigate the controversy regarding the use of the coefficient of determination as the square of the correlation coefficient in statistical analysis. Differences between the two statistics are illustrated using examples from simple and multiple regression models.
Keywords: linear regression; coefficient of determination; correlation coefficient; multiple correlation; regression coefficient.
I. INTRODUCTION
The classical linear regression model is the standard procedure for extracting statistical information from data by determining the relationship between the study variable and the explanatory variables. In the course of model estimation, it is common practice to assess the adequacy of the fitted model in explaining the variation in the data set. A popular tool for this purpose is the coefficient of determination. The coefficient of determination is a measure used in statistical analysis that assesses how well a model explains and predicts future outcomes. It indicates the level of explained variability in the data set, that is, the variation in the dependent variable explained by the independent variable(s). It provides a summary measure of the goodness of fit of any linear regression model and is based on the proportion of variability of the study variable that can be explained through knowledge of a given set of explanatory variables. These definitions are found in both econometrics and statistics handbooks and are widely accepted among quantitative scholars.
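To make the "proportion of variability explained" definition concrete, the following is a minimal sketch in Python. It assumes a simple linear model with an intercept, fitted by ordinary least squares, and computes the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares. The data values and the helper names `fit_line` and `r_squared` are illustrative assumptions and do not come from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(y, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    return 1.0 - ss_res / ss_tot


# Illustrative data only (assumed for this sketch, not taken from the paper)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(x, y)
fitted = [a + b * xi for xi in x]
r2 = r_squared(y, fitted)
print(f"intercept={a:.3f}, slope={b:.3f}, R^2={r2:.3f}")
```

For this illustrative data set the total sum of squares is 6 and the residual sum of squares is 2.4, so the fitted line accounts for about 60% of the variation in y.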