[Stable]

Selects a set of predictors with minimal multicollinearity, using the variance inflation factor (VIF) as the criterion for removing collinear variables. The algorithm will: (i) compute the VIF values from the correlation matrix of the variables selected in ...; (ii) rank the VIF values and delete the variable with the highest VIF; and (iii) repeat the procedure on the remaining variables until the highest VIF value is less than or equal to max_vif.
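The steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name vif_prune() and its use argument are hypothetical, and it assumes the VIF of each variable is taken as the matching diagonal element of the inverse correlation matrix, which equals 1 / (1 - R²) from regressing that variable on the others.

```r
# Hypothetical helper (not part of metan): backward elimination by VIF.
vif_prune <- function(data, max_vif = 10, use = "pairwise.complete.obs") {
  # Start from all numeric columns
  vars <- names(data)[vapply(data, is.numeric, logical(1))]
  removed <- character(0)
  repeat {
    # Step (i): VIFs from the inverse of the correlation matrix.
    # solve() fails if the matrix is exactly singular (perfect collinearity).
    cor_mat <- stats::cor(data[vars], use = use)
    vif <- diag(solve(cor_mat))
    names(vif) <- vars
    # Stop once every VIF is acceptable or only two variables remain
    if (max(vif) <= max_vif || length(vars) <= 2) break
    # Step (ii): drop the variable with the highest VIF
    worst <- names(which.max(vif))
    removed <- c(removed, worst)
    vars <- setdiff(vars, worst)
    # Step (iii): loop again on the remaining variables
  }
  list(selected = vars, removed = removed, vif = vif)
}

# Toy data: x3 is almost an exact linear combination of x1 and x2
set.seed(1)
x1 <- rnorm(100)
x2 <- rnorm(100)
toy <- data.frame(x1 = x1, x2 = x2, x3 = x1 + x2 + rnorm(100, sd = 0.1))

res <- vif_prune(toy, max_vif = 5)
res$selected
res$removed
res$vif
```

Recomputing the correlation matrix after each removal matters because dropping one variable changes the VIF of every variable that was correlated with it.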

Usage

non_collinear_vars(
  .data,
  ...,
  max_vif = 10,
  missingval = "pairwise.complete.obs"
)

Arguments

.data

The data set containing the variables.

...

Variables to be submitted to selection. If ... is empty, all the numeric variables from .data are used. Otherwise, it must be a single variable name or a comma-separated list of unquoted variable names.

max_vif

The maximum Variance Inflation Factor (threshold) accepted in the set of selected predictors. Defaults to 10.

missingval

How to deal with missing values. For more information, please see stats::cor().
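As a rough illustration of what the missing-value option changes, the sketch below calls stats::cor() directly on a small made-up data frame. It assumes that missingval is forwarded to the use argument of stats::cor(), which the default value "pairwise.complete.obs" suggests but this page does not state explicitly.

```r
# Made-up data with missing values in two different rows
toy <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 1, 4, 3, 5),
  c = c(5, 3, NA, 1, 2)
)

# Each pair of variables uses every row where both values are present
pw <- stats::cor(toy, use = "pairwise.complete.obs")

# Every pair uses only the rows with no missing value in any variable
co <- stats::cor(toy, use = "complete.obs")

pw["a", "b"]
co["a", "b"]
```

The two settings can give different correlations for the same pair of variables, and therefore different VIF values, because they are computed from different subsets of rows.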

Value

A data frame showing the number of selected predictors, the maximum VIF value, the condition number, the determinant, the selected predictors, and the predictors removed from the original set of variables.

Examples

# \donttest{
library(metan)
# All numeric variables
non_collinear_vars(data_ge2)
#>          Parameter                                       values
#> 1       Predictors                                           10
#> 2              VIF                                         7.16
#> 3 Condition Number                                       56.797
#> 4      Determinant                                 0.0008810515
#> 5         Selected PERK, EP, CDED, NKR, PH, NR, TKW, EL, CD, ED
#> 6          Removed                          EH, CL, CW, KW, NKE

# Select specific variables and set the VIF threshold to 5
non_collinear_vars(data_ge2, EH, CL, CW, KW, NKE, max_vif = 5)
#>          Parameter          values
#> 1       Predictors               4
#> 2              VIF           2.934
#> 3 Condition Number          11.248
#> 4      Determinant    0.2400583901
#> 5         Selected NKE, EH, CL, CW
#> 6          Removed              KW
# }