Computes Pearson's linear correlation or partial correlation with p-values

Usage

corr_coef(
data,
...,
type = c("linear", "partial"),
method = c("pearson", "kendall", "spearman"),
use = c("pairwise.complete.obs", "everything", "complete.obs"),
by = NULL,
verbose = TRUE
)

Arguments

data

The data set. It understands grouped data passed from dplyr::group_by().

...

Variables to use in the correlation. If no variable is specified, all numeric variables in data are used.

type

The type of correlation to be computed. Defaults to "linear". Use type = "partial" to compute partial correlation.

method

A character string indicating which partial correlation coefficient is to be computed. One of "pearson" (default), "kendall", or "spearman".

use

An optional character string giving a method for computing covariances in the presence of missing values. See stats::cor() for more details.

by

One variable (factor) to compute the function by. It is a shortcut to dplyr::group_by(). This is especially useful, for example, to compute correlation matrices by levels of a factor.

verbose

Logical argument. If verbose = FALSE, the code runs silently.
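
As a rough illustration of how the three options of the use argument differ, the following base R sketch calls stats::cor() directly on a small made-up data frame. The data frame x and its values are assumptions for illustration only, and corr_coef() itself is not called here.

```r
# Toy data with missing values (hypothetical, for illustration only)
x <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 5, 4, 5),
  c = c(5, NA, 3, 2, 1)
)

# "everything": any pair involving a variable with an NA yields NA
r_all <- cor(x, use = "everything")

# "pairwise.complete.obs": each pair uses the rows where both variables are observed
r_pair <- cor(x, use = "pairwise.complete.obs")

# "complete.obs": only rows with no missing value in any variable are used
r_complete <- cor(x, use = "complete.obs")
```

With pairwise deletion, different cells of the matrix can be based on different numbers of observations, which is worth keeping in mind when reading the associated p-values.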

Value

A list with the correlation coefficients and p-values.

Details

The partial correlation coefficient is a technique based on matrix operations that allows us to identify the association between two variables after removing the effects of a set of other variables (Anderson 2003). A general way to estimate the partial correlation coefficient between two variables (i and j) is through the simple correlation matrix that involves these two variables and the m other variables whose effects we want to remove. The estimate of the partial correlation coefficient between i and j, excluding the effect of the m other variables, is given by:

$$r_{ij.m} = \frac{-a_{ij}}{\sqrt{a_{ii} a_{jj}}}$$

where $r_{ij.m}$ is the partial correlation coefficient between variables i and j, without the effect of the other m variables; $a_{ij}$ is the element in row i and column j of the inverse of the simple correlation matrix; and $a_{ii}$ and $a_{jj}$ are the diagonal elements ii and jj, respectively, of the same inverse matrix.
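
The formula above can be sketched in base R as follows. This is a minimal illustration, not the internal code of corr_coef(): the use of the built-in mtcars data, the selected columns, and the object names R, A, and P are assumptions made for the example.

```r
# Illustrative data: four numeric columns from the built-in mtcars data set
vars <- mtcars[, c("mpg", "hp", "wt", "qsec")]

# Simple (Pearson) correlation matrix
R <- cor(vars)

# Inverse of the correlation matrix; its elements are the a_ij of the formula
A <- solve(R)

# r_ij.m = -a_ij / sqrt(a_ii * a_jj), computed for all pairs at once
P <- -A / sqrt(outer(diag(A), diag(A)))

# The formula gives -1 on the diagonal; set it to 1 by convention
diag(P) <- 1

P
```

Each off-diagonal element of P is the correlation between two variables after removing the linear effect of the remaining columns. The same value can be obtained by correlating the residuals of two regressions, for example resid(lm(mpg ~ wt + qsec, data = vars)) and resid(lm(hp ~ wt + qsec, data = vars)) for the pair mpg and hp.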

References

Anderson, T. W. 2003. An introduction to multivariate statistical analysis. 3rd ed. Wiley-Interscience.

Author

Tiago Olivoto tiagoolivoto@gmail.com

Examples

# \donttest{
library(metan)

# All numeric variables
all <- corr_coef(data_ge2)

# Select variables
sel <- corr_coef(data_ge2, EP, EL, CD, CL)
sel$cor
#>           EP        EL        CD        CL
#> EP 1.0000000 0.2634237 0.1750448 0.3908239
#> EL 0.2634237 1.0000000 0.9118653 0.2554068
#> CD 0.1750448 0.9118653 1.0000000 0.3003636
#> CL 0.3908239 0.2554068 0.3003636 1.0000000

# Select variables, partial correlation
sel <- corr_coef(data_ge2, EP, EL, CD, CL, type = "partial")
sel$cor
#>            EP         EL         CD         CL
#> EP  1.0000000  0.2938850 -0.2418441  0.3856626
#> EL  0.2938850  1.0000000  0.9110035 -0.1549749
#> CD -0.2418441  0.9110035  1.0000000  0.2454591
#> CL  0.3856626 -0.1549749  0.2454591  1.0000000

# }