
effect_size() returns the effect-size estimate associated with a visstat() result. If result$effect_size is already present, it is returned unchanged. Otherwise, the estimate is computed from the test object stored in result; for some test objects from base R's stats package, the estimate is instead extracted directly from the object.

Usage

effect_size(result, x = NULL, y = NULL, ...)

Arguments

result

A list returned by visstat() or a compatible test result object; or, in raw-data mode, the first input vector x.

x

First input vector, matching the first argument of visstat(x, y). Required when the effect size cannot be extracted from result alone. In raw-data mode, effect_size(x, y), the first input vector is passed as result, so this argument holds the second input vector y.

y

Second input vector, matching the second argument of visstat(x, y). Required when the effect size cannot be extracted from result alone.

...

Passed to visstat() in raw-data mode (e.g. correlation, conf.level).

Value

A list with components name, estimate, effect_size_method, and optionally conf.int. In raw-data mode (effect_size(x, y)) the list additionally contains selected_test, the name of the test visstat() selected.
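Because conf.int is optional, code that consumes the returned list may want to test for it before use. The following is a minimal sketch of that check. The list es is a stand-in whose fields are copied from the phi example in the Examples section, and the variable label is purely illustrative; neither comes from the package.

```r
# Stand-in for a value returned by effect_size(); the fields are copied from
# the phi example in the Examples section, not computed here.
es <- list(
  name = "phi",
  estimate = 0.3535596,
  effect_size_method = "Phi coefficient for 2 x 2 contingency table"
)

# conf.int is optional, so check for it before formatting.
label <- if (is.null(es$conf.int)) {
  sprintf("%s = %.3f", es$name, es$estimate)
} else {
  sprintf("%s = %.3f [%.3f, %.3f]", es$name, es$estimate,
          es$conf.int[1], es$conf.int[2])
}
label
```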

Details

Notation used below: \(x\) and \(y\) are the two variables entering the selected analysis, \(N\) is the total number of non-missing observations, \(n_j\) is the sample size in group \(j\), \(k\) is the number of groups, \(\bar{y}_j\) is the mean of numeric vector \(y\) in group \(j\), and \(s_j^2\) is the variance of numeric vector \(y\) in group \(j\).

The following estimates are computed internally:

  • Student's two-sample t.test(..., var.equal = TRUE): Hedges' \(g_{s_p} = J(N-2)\cdot(\bar{y}_1-\bar{y}_2)/s_p\), where \(s_p = \sqrt{((n_1-1)s_1^2+(n_2-1)s_2^2)/(N-2)}\) is the pooled standard deviation and \(J(\nu) = \Gamma(\nu/2)/(\sqrt{\nu/2}\,\Gamma((\nu-1)/2))\) is the small-sample bias correction, here evaluated at \(\nu = N-2\).

  • Welch's two-sample t.test(..., var.equal = FALSE): Hedges' \(g_{s^*} = J(\nu^*)(\bar{y}_1-\bar{y}_2)/s^*\), where \(s^* = \sqrt{(s_1^2+s_2^2)/2}\) and \(\nu^* = ((n_1-1)(n_2-1)(s_1^2+s_2^2)^2)/ ((n_2-1)s_1^4+(n_1-1)s_2^4)\).

  • Wilcoxon rank-sum test: signed rank-biserial correlation \(r = 2W/(n_1 n_2)-1\), where \(W\) is the statistic returned by wilcox.test() for the first group.

  • Fisher's one-way ANOVA: omega-squared \(\omega^2 = \nu_1(F-1)/(\nu_1F+\nu_2+1)\), where \(F\) is the ordinary one-way ANOVA statistic, \(\nu_1=k-1\), and \(\nu_2=N-k\). Negative estimates are truncated to zero.

  • Welch's one-way test: approximate omega-squared-type estimate \(\nu_1(F_W-1)/(\nu_1F_W+\nu_2+1)\), where \(F_W\) is the Welch ANOVA statistic, \(\nu_1=k-1\), and \(\nu_2\) is the denominator degrees of freedom returned by oneway.test(), which is usually fractional. Negative estimates are truncated to zero.

  • Kruskal-Wallis test: Kelley-adjusted eta-squared based on \(H\), \(\eta_H^2=(H-k+1)/(N-k)\), where \(H\) is the Kruskal-Wallis statistic. Negative estimates are truncated to zero.

  • Pearson's chi-squared test: Cramér's \(V\) for general \(R\times C\) tables, \(V=\sqrt{\chi^2/(N\cdot(\min(R,C)-1))}\), where \(R\) and \(C\) are the numbers of rows and columns. For \(2\times 2\) tables this reduces to phi, \(\sqrt{\chi^2/N}\). The chi-squared statistic is used as supplied by chisq.test(), so for \(2\times 2\) tables it includes Yates' continuity correction unless chisq.test() was called with correct = FALSE.
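The formulas in the list above can be evaluated by hand with base R. The following is a sketch under stated assumptions, not the package's implementation: the helper hedges_j() and all variable names are hypothetical, and only functions from base R and the stats package are used. The Welch one-way case is omitted; it uses the same expression as omega-squared with the statistic and degrees of freedom from oneway.test(..., var.equal = FALSE).

```r
# Base-R sketch; helper and variable names are hypothetical, not package API.

# Exact small-sample correction J(nu); lgamma() keeps it stable for large nu.
hedges_j <- function(nu) {
  exp(lgamma(nu / 2) - lgamma((nu - 1) / 2)) / sqrt(nu / 2)
}

# Two groups: tooth length by supplement type (first group = "OJ").
y1 <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
y2 <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n1 <- length(y1); n2 <- length(y2)
v1 <- var(y1); v2 <- var(y2)

# Student case: Hedges' g with the pooled standard deviation.
s_p <- sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
g_pooled <- hedges_j(n1 + n2 - 2) * (mean(y1) - mean(y2)) / s_p

# Welch case: non-pooled standard deviation and approximate df nu_star.
s_star <- sqrt((v1 + v2) / 2)
nu_star <- ((n1 - 1) * (n2 - 1) * (v1 + v2)^2) /
  ((n2 - 1) * v1^2 + (n1 - 1) * v2^2)
g_welch <- hedges_j(nu_star) * (mean(y1) - mean(y2)) / s_star

# Wilcoxon rank-sum: signed rank-biserial correlation from W.
# exact = FALSE because the data contain ties.
W <- unname(wilcox.test(y1, y2, exact = FALSE)$statistic)
r_rb <- 2 * W / (n1 * n2) - 1

# Fisher's one-way ANOVA: omega-squared from F and its degrees of freedom.
aov_test <- oneway.test(Petal.Width ~ Species, data = iris, var.equal = TRUE)
F_stat <- unname(aov_test$statistic)
nu1 <- unname(aov_test$parameter[1])  # numerator df, k - 1
nu2 <- unname(aov_test$parameter[2])  # denominator df, N - k
omega_sq <- max(0, nu1 * (F_stat - 1) / (nu1 * F_stat + nu2 + 1))

# Kruskal-Wallis: eta-squared based on H.
H <- unname(kruskal.test(Petal.Width ~ Species, data = iris)$statistic)
N <- nrow(iris); k <- nlevels(iris$Species)
eta_sq_h <- max(0, (H - k + 1) / (N - k))

# Pearson's chi-squared: Cramer's V (phi for a 2 x 2 table).
# chisq.test() applies Yates' correction to 2 x 2 tables by default.
tab <- matrix(c(10, 5, 4, 12), nrow = 2)
chi_sq <- unname(chisq.test(tab)$statistic)
cramers_v <- sqrt(chi_sq / (sum(tab) * (min(dim(tab)) - 1)))

round(c(g_pooled = g_pooled, eta_sq_h = eta_sq_h, cramers_v = cramers_v), 7)
```

If effect_size() follows the formulas as documented, g_pooled, eta_sq_h, and cramers_v should agree with the estimates 0.4880931, 0.8788121, and 0.3535596 shown in the Examples section. With equal group sizes, s_star equals s_p, so g_welch differs from g_pooled only through the correction factor.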

The following estimates are extracted from existing result objects:

  • \(R^2\) from summary(lm())$r.squared.

  • Spearman's \(\rho\) from cor.test(method = "spearman")$estimate.

  • Kendall's \(\tau_b\) from cor.test(method = "kendall")$estimate.

  • The conditional maximum-likelihood odds ratio from fisher.test()$estimate and its confidence interval from fisher.test()$conf.int for \(2\times 2\) tables.
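The extraction paths listed above can be tried directly on base R objects. This is a minimal sketch using the built-in cars data and the 2 x 2 table from the Examples section; the variable names are illustrative, and whether effect_size() accesses these components in exactly this way is an assumption.

```r
# R-squared from a fitted linear model.
fit <- lm(dist ~ speed, data = cars)
r_squared <- summary(fit)$r.squared

# Rank correlations; exact = FALSE because the data contain ties.
rho <- cor.test(cars$speed, cars$dist,
                method = "spearman", exact = FALSE)$estimate
tau <- cor.test(cars$speed, cars$dist,
                method = "kendall", exact = FALSE)$estimate

# Conditional maximum-likelihood odds ratio and its confidence interval
# for a 2 x 2 table.
tab <- matrix(c(10, 5, 4, 12), nrow = 2)
ft <- fisher.test(tab)
odds_ratio <- ft$estimate
or_ci <- ft$conf.int

c(r_squared = r_squared, rho, tau, odds_ratio)
```

The estimate components are named numeric vectors ("rho", "tau", "odds ratio"), and conf.int carries its confidence level in the conf.level attribute.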

References

Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6(2), 107–128. doi:10.3102/10769986006002107.

Delacre, M., Lakens, D., Ley, C., Liu, L., & Leys, C. (2021). Why Hedges' g*s based on the non-pooled standard deviation should be reported with Welch's t-test. PsyArXiv. doi:10.31234/osf.io/tu6mp.

Kerby, D. S. (2014). The simple difference formula: An approach to teaching nonparametric correlation. Comprehensive Psychology, 3, 11.IT.3.1. doi:10.2466/11.IT.3.1.

Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of Experimental Social Psychology, 74, 187–195. doi:10.1016/j.jesp.2017.09.004.

Kelley, T. L. (1935). An unbiased correlation ratio measure. Proceedings of the National Academy of Sciences, 21(9), 554–559. doi:10.1073/pnas.21.9.554.

Cohen, J. (2013). Statistical power analysis for the behavioral sciences. Routledge. doi:10.4324/9780203771587.

Examples

x <- ToothGrowth$supp
y <- ToothGrowth$len
tt <- list("t-test-statistics" = t.test(y ~ x, var.equal = TRUE))
effect_size(tt, x = x, y = y)
#> $name
#> [1] "Hedges' g"
#> 
#> $estimate
#> [1] 0.4880931
#> 
#> $effect_size_method
#> [1] "Hedges' g using pooled standard deviation"
#> 

kw <- list(
  "Kruskal Wallis rank sum test" = kruskal.test(Petal.Width ~ Species,
                                               data = iris)
)
effect_size(kw, x = iris$Species, y = iris$Petal.Width)
#> $name
#> [1] "eta-squared based on H"
#> 
#> $estimate
#> [1] 0.8788121
#> 
#> $effect_size_method
#> [1] "Eta-squared based on H for Kruskal-Wallis rank sum test"
#> 

tab <- matrix(c(10, 5, 4, 12), nrow = 2)
effect_size(chisq.test(tab))
#> $name
#> [1] "phi"
#> 
#> $estimate
#> [1] 0.3535596
#> 
#> $effect_size_method
#> [1] "Phi coefficient for 2 x 2 contingency table"
#> 

## Raw-data mode: select the test with visstat() and return the effect
## size together with the name of the selected test.
effect_size(ToothGrowth$supp, ToothGrowth$len)
#> $name
#> [1] "Hedges' g"
#> 
#> $estimate
#> [1] 0.4880931
#> 
#> $effect_size_method
#> [1] "Hedges' g using pooled standard deviation"
#> 
#> $selected_test
#> [1] "Two Sample t-test"
#> 

if (FALSE) { # \dontrun{
## Large-sample example with a statistically significant Student's
## t-test p-value but a small effect size, measured by Hedges' g
## using the pooled standard deviation. A small mean shift is added
## to noisy normal data. Because N is large, the t-test p-value
## becomes small, while Hedges' g remains close to zero.
## The residual Shapiro-Wilk p-value in the diagnostic panel is NA
## because shapiro.test() is limited to n <= 5000.
set.seed(20260525)
n <- 2501
mean_shift <- 0.1
group <- factor(rep(c("control", "treatment"), each = n))
response <- rnorm(2 * n) + rep(c(0, mean_shift), each = n)
res <- visstat(group, response)
res[["t-test-statistics"]]$method
res[["t-test-statistics"]]$p.value
res$effect_size$effect_size_method
res$effect_size$estimate
} # }