The intelligibility of r or r^{2} as an effect size statistic: dichotomous variables

David Trafimow^{*}

Edited by: Jeremy Miles, Research and Development Corporation, USA

Reviewed by: Thom Baguley, Nottingham Trent University, UK; Wendy Christensen, University of California, Los Angeles, USA

*Correspondence: David Trafimow,

This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

There have been differences in the use of the correlation coefficient (r) or the coefficient of determination (r^{2}) for indexing the effect size (see Rosenthal and DiMatteo,

To flesh out the idea, suppose that there are two variables and each of these is dichotomous and scored 0 or 1. From the point of view of a researcher who believes that the relation between the two variables is important, each case of matching scores (0 on both variables or 1 on both variables) is a success, whereas each case of mismatching (0 on one variable and 1 on the other, or the reverse) constitutes a failure. The straightforward way to index the ability of the two variables to produce successes (agreements with respect to zeroes and ones) would be to use the proportion of obtained successes. However, because a 50% success rate would be expected by chance alone, this raw proportion would likely be misleading.

I suggest controlling for chance by computing an adjusted proportion of successes or adjusted success rate (S_{A}) using Equation (1) below, where

In correlation terms, given the simplification mentioned previously, the usual phi correlation coefficient reduces to the equation made famous by Rosenthal and Rubin (

Substituting Equation (2) into Equation (1) renders Equation (3).

Remembering that when there are two variables, we expect a 50% success rate by chance, 0.5 can be substituted for

In turn, Equation (4) simplifies to Equation (5).
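The chain from Equation (1) to Equation (5) can be sketched as follows. The symbol names are assumptions made here for illustration: S is the observed proportion of successes, C is the proportion of successes expected by chance, S_A is the adjusted success rate, and r is the correlation coefficient. Equation (2) is written in the binomial effect size display form, in which a correlation of r corresponds to a success rate of 0.5 + r/2.

```latex
\begin{align}
S_A &= \frac{S - C}{1 - C} \tag{1} \\
S   &= 0.5 + \frac{r}{2} \tag{2} \\
S_A &= \frac{0.5 + \frac{r}{2} - C}{1 - C} \tag{3} \\
S_A &= \frac{0.5 + \frac{r}{2} - 0.5}{1 - 0.5} \tag{4} \\
S_A &= r \tag{5}
\end{align}
```

In Equation (4) the two 0.5 terms in the numerator cancel, leaving (r/2)/0.5, which is r.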

Put into words, in the dichotomous case when there are equal numbers of zeroes and ones for both variables, the success rate adjusted for chance equals the correlation coefficient!
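A small numerical check may make this concrete. The sketch below uses hypothetical data (100 cases, with 50 zeroes and 50 ones on each variable) and hypothetical function names. It assumes the chance-adjusted rate takes the usual form (successes minus chance) divided by (1 minus chance), with chance set to 0.5, and it computes the phi coefficient as the Pearson correlation of the two 0/1 variables.

```python
import numpy as np

def adjusted_success_rate(x, y, chance=0.5):
    """Proportion of matching scores, adjusted for the chance success rate."""
    x = np.asarray(x)
    y = np.asarray(y)
    success = np.mean(x == y)  # raw proportion of matches (successes)
    return float((success - chance) / (1.0 - chance))

def phi(x, y):
    """Phi coefficient, computed as the Pearson correlation of 0/1 scores."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical data with equal numbers of zeroes and ones on both variables:
# 35 cases of (0, 0), 15 of (0, 1), 15 of (1, 0), and 35 of (1, 1).
x = np.array([0] * 35 + [0] * 15 + [1] * 15 + [1] * 35)
y = np.array([0] * 35 + [1] * 15 + [0] * 15 + [1] * 35)

s_a = adjusted_success_rate(x, y)  # (0.70 - 0.50) / 0.50
r = phi(x, y)

print(f"adjusted success rate = {s_a:.3f}")
print(f"correlation r         = {r:.3f}")
print(f"r squared             = {r ** 2:.3f}")
```

With these made-up frequencies the raw success rate is 0.70, so the adjusted success rate and r agree, whereas r^{2} is considerably smaller. If the marginal frequencies are made unequal, the two quantities need not coincide.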

In summary, then, my argument is simple. Because the proportion of successes, controlling for chance, is a straightforward and easy way to understand an effect size, this should be the preferred effect size statistic. Happily, the correlation coefficient equals this adjusted proportion under the simplified conditions that I set up. Therefore, in terms of straightforward intelligibility, the correlation coefficient is superior to the coefficient of determination as an effect size index.

Although my main point has been made, there are additional issues worth mentioning. First, there are additional reasons to favor r over r^{2}. One such reason is that the former is directional whereas the latter is not. Another reason is that ^{2}.

A second issue is that it is possible for ^{2}. Baguley (^{2} are standardized effect size measures and the reliabilities of the measures of the variables have a strong influence on standardized effect size measures. As reliabilities decrease, standard deviations increase, and so effect size measures that are standardized via standard deviations (in the denominator) decrease. Researchers who wish to have their effect size measures uninfluenced by reliability issues can either use the famous correction formula from classical test theory or use an effect size measure that is not standardized. Each of these options involves considerations that go beyond the present scope.
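As an illustration of the first option, the sketch below assumes that the correction formula meant here is Spearman's correction for attenuation, in which the observed correlation is divided by the square root of the product of the two reliabilities. The function name and the numerical values are hypothetical.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation (assumed to be the formula meant).

    r_xy  : observed correlation between the two measures
    rel_x : reliability of the measure of the first variable, in (0, 1]
    rel_y : reliability of the measure of the second variable, in (0, 1]
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: an observed r of 0.30 with reliabilities 0.80 and 0.70.
corrected = disattenuate(0.30, 0.80, 0.70)
print(f"observed r = 0.30, corrected r = {corrected:.3f}")
```

Lower reliabilities produce a larger correction, and with perfectly reliable measures the observed correlation is returned unchanged.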

The final issue I will consider pertains to the use of the present logic when one is considering correlation coefficients that are not based on dichotomous data with equal frequencies. To address this issue, it is important to remember that Equation (2) played an important role in getting to Equation (5) and that there has been much discussion about it in the literature. Rosenthal and Rubin (

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.