StdCoef.Rd
Standardize model coefficients by standard deviation or partial standard deviation.
StdCoef(x, partial.sd = FALSE, ...)
PartialSD(x)
The standardized coefficients are meant to allow for a comparison of the importance of explanatory variables that have different variances. Each of them shows the effect on the response of increasing its predictor X(j) by one standard deviation, as a multiple of the response's standard deviation. This is often a more meaningful comparison of the relevance of the input variables.
Note, however, that increasing one X(j) while holding the others fixed may not be possible in a given application, so interpretation of coefficients is always delicate. Furthermore, for binary input variables, increasing the variable by one standard deviation is impossible, since an increase can only occur from 0 to 1; the standardized coefficient is therefore somewhat counterintuitive in this case.
Standardizing model coefficients has the same effect as centring and scaling the input variables.
“Classical” standardized coefficients are calculated as \(\beta^{*}_{i} = \beta_{i} (s_{x_{i}} / s_{y})\), where \(\beta_{i}\) is the unstandardized coefficient, \(s_{x_{i}}\) is the standard deviation of the associated independent variable \(X_{i}\), and \(s_{y}\) is the standard deviation of the response variable.
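To make the formula concrete, here is a minimal sketch in base R (not part of the package; it uses the swiss data that the examples below also use) that reproduces the classical standardized coefficients by hand and checks them against a model refitted to fully scaled data:

```r
# Classical standardization by hand: beta*_i = beta_i * s_x / s_y
fm <- lm(Fertility ~ Agriculture + Examination + Education + Catholic,
         data = swiss)
b  <- coef(fm)[-1]                   # drop the intercept
sx <- apply(swiss[names(b)], 2, sd)  # SDs of the predictors
sy <- sd(swiss$Fertility)            # SD of the response
beta_std <- b * sx / sy
# The same values come from refitting on centred and scaled data:
fm_z <- lm(Fertility ~ Agriculture + Examination + Education + Catholic,
           data = as.data.frame(scale(swiss)))
all.equal(unname(beta_std), unname(coef(fm_z)[-1]))  # TRUE
```

This also illustrates the point above that standardizing the coefficients has the same effect as centring and scaling the input variables before fitting.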
If the variables are intercorrelated, the standard deviation of \(X_{i}\) used in computing the standardized coefficients \(\beta^{*}_{i}\) should be replaced by the partial standard deviation of \(X_{i}\), which is adjusted for the multiple correlation of \(X_{i}\) with the other \(X\) variables included in the regression equation. The partial standard deviation is calculated as \(s^{*}_{x_{i}} = s_{x_{i}} \sqrt{VIF_{x_{i}}^{-1}} \sqrt{(n-1)/(n-p)}\), where VIF is the variance inflation factor, \(n\) is the number of observations, and \(p\) is the number of predictors in the model. The coefficient is then transformed as \(\beta^{*}_{i} = \beta_{i} s^{*}_{x_{i}}\).
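The partial SD computation can be sketched by hand following the formula above, taking the VIFs from the diagonal of the inverse correlation matrix of the predictors (PartialSD() may count \(p\) differently, so this is only illustrative):

```r
# Partial SD by hand: s*_x = s_x * sqrt(1/VIF) * sqrt((n-1)/(n-p))
fm <- lm(Fertility ~ Agriculture + Examination + Education + Catholic,
         data = swiss)
X   <- model.matrix(fm)[, -1]  # predictor columns, intercept dropped
vif <- diag(solve(cor(X)))     # variance inflation factors
n   <- nrow(X)                 # number of observations
p   <- ncol(X)                 # number of predictors
sx  <- apply(X, 2, sd)
psd <- sx * sqrt(1 / vif) * sqrt((n - 1) / (n - p))
```

Since each VIF is at least 1, the partial SDs shrink relative to the ordinary SDs as a predictor becomes more correlated with the others.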
A matrix with at least two columns, holding the standardized coefficient estimates and their standard errors. Optionally, a third column holds the degrees of freedom associated with the coefficients.
Cade, B.S. (2015) Model averaging and muddled multimodel inferences. Ecology 96, 2370-2382.
Afifi A., May S., Clark V.A. (2011) Practical Multivariate Analysis, Fifth Edition. CRC Press.
Bring, J. (1994) How to standardize regression coefficients. The American Statistician 48, 209-213.
# Fit model to original data:
fm <- lm(Fertility ~ Agriculture + Examination + Education + Catholic,
data = swiss)
# Partial SD for the predictors in the fitted model:
psd <- PartialSD(fm)[-1] # remove the first element, which is for the intercept
# Standardize the model data by the partial SDs ('stdize' accepts NA in
# 'scale' to leave a column unscaled):
zswiss <- stdize(swiss[, 1:5], scale = c(NA, psd), center = TRUE)
# Note: the first element of 'scale' is NA so that the first column,
# the response 'Fertility', is left unscaled
# Coefficients of a model fitted to the standardized data:
zapsmall(coefTable(stdizeFit(fm, data = zswiss)))
# Standardized coefficients of a model fitted to the original data:
zapsmall(StdCoef(fm, partial.sd = TRUE))
# Standardizing nonlinear models:
fam <- Gamma("inverse")
fmg <- glm(log(Fertility) ~ Agriculture + Examination + Education + Catholic,
data = swiss, family = fam)
psdg <- PartialSD(fmg)
zswissg <- stdize(swiss[, 1:5], scale = c(NA, psdg[-1]), center = FALSE)
fmgz <- glm(log(Fertility) ~ z.Agriculture + z.Examination + z.Education +
    z.Catholic, data = zswissg, family = fam)
# Coefficients using the standardized data:
coef(fmgz) # (the intercept is unchanged because the variables were not centred)
# Standardized coefficients:
coef(fmg) * psdg