A practical guide to using Cliff's Delta as a measure of effect size where parametric equivalents are inappropriate.
Building: Holme Building
Room: MacCallum Room
Date: 2010-12-02 01:30 PM – 03:00 PM
Last modified: 2010-11-23
Abstract
Effect sizes are of increasing importance within social science reporting. The American Psychological Association (2001, p. 25) states that "it is almost always necessary to include some index of effect size or strength of relationship". Indeed, within two years, at least 23 major social science journals required effect sizes to be reported (Capraro & Capraro, 2003). Cohen's d is by far the most commonly used effect size within the social sciences, yet its popularity is arguably a result of its ease of calculation rather than its suitability for social science data. Micceri (1989) concluded that the majority of data collected within the social sciences do not meet common assumptions made by parametric analyses (in particular, univariate normality), while Breckler (1990) asserts that fewer than 10% of researchers even consider whether normality assumptions are met. Since Cohen's d is calculated using only the means and standard deviations of the two groups being compared (as are other common effect sizes, such as Glass's delta or Hedges' g), it is extremely vulnerable to violations of normality and, in such cases, is not an appropriate quantification of the effect (Hogarty & Kromrey, 2001). However, even where researchers have acknowledged that their data are not normally distributed and have appropriately used non-parametric statistics to test the null hypothesis, many persist in reporting Cohen's d or other parametric effect sizes (Leech & Onwuegbuzie, 2002). The most obvious reason for this is the paucity of information about how to select and calculate a non-parametric effect size, coupled with the inability of major statistical packages to compute one for the researcher. This paper concerns the use of Cliff's delta as a robust and intuitive alternative to Cohen's d in situations where data are either non-normal, or are ordinal and therefore have reduced variance (e.g. Likert scale responses from survey data).
There are already a number of articles that advocate the use of Cliff's delta, and emphasise its robustness in situations where parametric assumptions are not met (e.g. Ledesma, Macbeth & Cortada de Kohan, 2009; Romano, Kromrey, Corragio & Skowronek, 2006; Fern & Monroe, 1996; Hess & Kromrey, 2004), but few give practical advice on how to actually calculate and interpret the statistic. This paper uses real data to compare and contrast the use of Cliff's delta with Cohen's d. An explanation of how it can be calculated by anyone with access to Excel is provided, along with details of interpretation.
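As a rough illustration of the two statistics being compared, the following Python sketch computes Cliff's delta from its standard definition (the proportion of cross-group pairs in which the first group's value is higher, minus the proportion in which it is lower) alongside Cohen's d. It is not the paper's Excel procedure. The function names and the Likert-style data are hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption, since the abstract does not say which variant is used.

```python
from math import sqrt
from statistics import mean, stdev


def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y - #pairs with x < y) / (n_x * n_y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def cohens_d(xs, ys):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    nx, ny = len(xs), len(ys)
    pooled_var = ((nx - 1) * stdev(xs) ** 2 + (ny - 1) * stdev(ys) ** 2) / (nx + ny - 2)
    return (mean(xs) - mean(ys)) / sqrt(pooled_var)


# Hypothetical 5-point Likert responses for two groups (illustration only).
group_a = [4, 5, 3, 4, 5]
group_b = [2, 3, 3, 1, 4]

print("Cliff's delta:", cliffs_delta(group_a, group_b))
print("Cohen's d:    ", cohens_d(group_a, group_b))
```

Cliff's delta ranges from -1 to 1, with 0 indicating complete overlap between the groups. Because it depends only on the ordering of pairs, replacing one value in group_a with a far more extreme one leaves delta unchanged here, whereas Cohen's d shifts with the mean and standard deviation.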
References
American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Breckler, S. J. (1990). Application of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260-273.
Capraro, M. M., & Capraro, R. M. (2003). Exploring the APA fifth edition Publication Manual's impact on the analytic preferences of journal editorial board members. Educational and Psychological Measurement, 63, 554-565.
Fern, E. & Monroe, K. (1996). Effect size estimates: Issues and problems in interpretation. Journal of Consumer Research, 23, 89-105.
Hess, M. & Kromrey, J. (2004). Robust confidence intervals for effect sizes: A comparative study of Cohen's d and Cliff's delta under non-normality and heterogeneous variances. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
Hogarty, K. & Kromrey, J. (2001). We've been reporting some effect sizes: Can you guess what they mean? Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.
Ledesma, R., Macbeth, G. & Cortada de Kohan, N. (2009). Computing effect size measures with ViSta - the visual statistics system. Tutorials in Quantitative Methods for Psychology, 5(1), 25-34.
Leech, N. & Onwuegbuzie, A. (2002). A call for greater use of nonparametric statistics. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, Chattanooga, TN.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105(1), 156-166.
Romano, J., Kromrey, J., Corragio, J. & Skowronek, J. (2006). Appropriate statistics for ordinal level data: Should we really be using t-test and Cohen's d for evaluating group differences on the NSSE and other surveys? Paper presented at the annual meeting of the Florida Association of Institutional Research, February 1-3, 2006, Cocoa Beach, Florida.