What is Adjusted R-squared?
The Adjusted R-squared (R² adjusted) is a statistical measure that provides a more accurate representation of the goodness of fit for a regression model. Unlike the traditional R-squared (R²), which can be misleading when you add more predictors to the model, the adjusted R² adjusts for the number of predictors and penalizes for overfitting.
Adjusted R-squared Formula:
Adjusted R-squared ) = 1 – ((1 – R-squared) * ((n – 1) / (n – k – 1)))
Where:
- R-squared is the coefficient of determination (regular R-squared),
- n is the number of data points (observations),
- k is the number of independent variables (predictors).
Steps to Calculate Adjusted R-squared:
- Calculate R-squared: This is computed from the regression model between the dependent variable and the independent variables.
- Obtain nnn and kkk: nnn is the number of observations, and kkk is the number of predictors.
- Apply the Adjusted R-squared formula.
VBA Code Example:
Sub CalculateAdjustedRSquared()
Dim ws As Worksheet
Dim rngY As Range, rngX As Range
Dim n As Long, k As Long
Dim coeff As Double
Dim residualSum As Double
Dim totalSum As Double
Dim SSR As Double, SST As Double, R2 As Double
Dim R2Adjusted As Double
' Define worksheet and data ranges
Set ws = ThisWorkbook.Sheets("Sheet1") ' Change to your sheet name
Set rngY = ws.Range("B2:B101") ' Range for dependent variable Y
Set rngX = ws.Range("A2:A101") ' Range for independent variable X
' Number of observations (n) and number of predictors (k)
n = rngY.Rows.Count
k = 1 ' For this example, we assume a simple linear regression (1 predictor X)
' Perform linear regression to get the coefficients
With Application.WorksheetFunction
coeff = .LinEst(rngY, rngX)(1, 1) ' Coefficient for the regression line
End With
' Calculate the total sum of squares (SST)
totalSum = 0
For i = 1 To n
totalSum = totalSum + (rngY.Cells(i, 1).Value - Application.WorksheetFunction.Average(rngY)) ^ 2
Next i
' Calculate the sum of squared residuals (SSR)
residualSum = 0
For i = 1 To n
residualSum = residualSum + (rngY.Cells(i, 1).Value - coeff * rngX.Cells(i, 1).Value) ^ 2
Next i
' Calculate R-squared (R2)
R2 = 1 - (residualSum / totalSum)
' Calculate Adjusted R-squared (R2 adjusted)
R2 Adjusted = 1 - ((1 - R2) * (n - 1)) / (n - k - 1)
' Display the Adjusted R-squared result
MsgBox "The Adjusted R-squared is: " & R2Adjusted
End Sub
Explanation of the Code:
- Variable Declaration:
- ws: The worksheet object that contains the data.
- rngY and rngX: Ranges for the dependent variable YYY and the independent variable XXX.
- n: The number of observations (rows of data).
- k: The number of independent variables (predictors). For a simple linear regression, this is 1.
- R2: The R-squared value.
- R2Adjusted: The adjusted R-squared value.
- Setting the Worksheet and Data Ranges: The code assumes your data is in « Sheet1 » with the dependent variable YYY in column B and the independent variable XXX in column A, from rows 2 to 101.
- Linear Regression Calculation:
- The LinEst function in Excel calculates the slope (regression coefficient) of the linear model. In this case, it’s used for simple linear regression, so it’s accessing the first coefficient in the result.
- Sum of Squares Calculations:
- SST (Total Sum of Squares): Measures the total variation in the dependent variable.
- SSR (Sum of Squared Residuals): Measures the variation not explained by the regression model.
- These are calculated manually by iterating through the data points.
- R-squared and Adjusted R-squared:
- R-squared is calculated as 1−SSRSST1 – \frac{SSR}{SST}1−SSTSSR.
- Adjusted R-squared is then computed using the formula, which adjusts for the number of predictors kkk and the sample size nnn.
- Displaying the Result: The adjusted R-squared value is shown in a message box.
How to Use:
- Place your data in columns A and B in the Excel sheet (adjust ranges if your data set is larger or smaller).
- Run the macro by pressing Alt + F8 and selecting CalculateAdjustedRSquared.
- The result will be shown in a message box.
Customization:
- Multiple predictors (Multiple Linear Regression): If you have more than one predictor (e.g., X1,X2,…,XkX1, X2, \ldots, XkX1,X2,…,Xk), adjust the range of rngX to include the additional columns, like Set rngX = ws.Range(« A2:C101 ») for 3 predictors.
- More Observations: Adjust the ranges (B2:B101 and A2:A101) to include more data points if necessary.