ANOVA and the test of normality

Oct 13 2011 Published by under Statistical Reasoning

One still occasionally gets whinging from some corner or other about not being able to run Analysis of Variance statistical procedures (ANOVA) because the data didn't pass a test of normality, i.e., a test of whether they appear to fit a normal distribution.

Paper reviewers, trainees, colleagues....this can come from any corner. It betrays a grad-school class level of understanding of what statistical analysis of data is supposed to do...but not a grasp of what it is doing for us at a fundamental level within the conduct of science.

Your stock response should be "the ANOVA is robust against violations of normality, move along".

I note that the company GraphPad, which makes the Prism statistical/curve fitting package beloved of behavioral pharmacologists, has a tidy FAQ answer.

The excerpted version:

A population has a distribution that may be Gaussian or not. A sample of data cannot be Gaussian or not Gaussian. That term can only apply to the entire population of values from which the data were sampled...In almost all cases, we can be sure that the data were not sampled from an ideal Gaussian distribution... an ideal Gaussian distribution includes some very low negative numbers and some superhigh positive values...When collecting data, there are constraints on the possible values...Other variables can...have physical or physiological limits that don’t allow super large values... plenty of simulations have shown that these tests work well even when the population is only approximately Gaussian...It is hard to define what "close enough" means, and the normality tests were not designed with this in mind.
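The kind of simulation the FAQ alludes to can be sketched in a few lines. This is only an illustration under assumptions of my own, not GraphPad's procedure: three groups of 10, an exponential population standing in for "clearly not Gaussian", and a critical value calibrated empirically from normal data rather than taken from an F table. The function names are hypothetical.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of floats)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate_f(draw, n_sims, rng, k=3, n=10):
    """F statistics for n_sims experiments in which the null hypothesis is true."""
    return [
        f_statistic([[draw(rng) for _ in range(n)] for _ in range(k)])
        for _ in range(n_sims)
    ]


def false_positive_rate(n_sims=4000, alpha=0.05, seed=1):
    """Rejection rate for skewed data, using a cutoff calibrated on normal data."""
    rng = random.Random(seed)
    # Empirical (1 - alpha) quantile of F when every group really is Gaussian.
    normal_f = sorted(simulate_f(lambda r: r.gauss(0.0, 1.0), n_sims, rng))
    critical = normal_f[int((1 - alpha) * n_sims)]
    # Same design, but all groups drawn from one skewed (exponential) population.
    skewed_f = simulate_f(lambda r: r.expovariate(1.0), n_sims, rng)
    return sum(f > critical for f in skewed_f) / n_sims


if __name__ == "__main__":
    print(f"nominal alpha 0.05, observed with skewed data: {false_positive_rate():.3f}")
```

If the observed rate stays near the nominal 0.05, the test is behaving as advertised for that design. Changing the group sizes or the population in `simulate_f` is the quick way to probe where it stops behaving.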

18 responses so far

  • D. C. Sessions says:

    The computers you're using to read this message communicate by way of data channels which are designed around Gaussian models of delay between two events. The error rates in those data channels are predicted very, very well by those Gaussian models despite the fact that the models themselves imply that some of the delays violate causality.

    Shorter: all models are invalid, but some are useful.

  • lylebot says:

    I'm frequently amazed by some of the insanely wrong assumptions some are willing to accept without a second thought even as they claim they can't use an ANOVA because of violation of normality. I wonder how we got to a point where we even have to tell people it's OK to ANOVA when your data doesn't appear normal? Is it because every field has that person that publishes that one paper that says "ANOVA requires data be distributed normally!"?

  • drugmonkey says:

    It is because that is all they remember from the stats class they had to take in the first year of grad school. Or because they have a stat package that tests and gives them an error message or something.

  • queenrandom says:

    Prism also conveniently tests your variance assumption with every parametric test. It's hard to fail the test but it's easy to re-run as a Kruskal-Wallis or transform if it does fail. And it needs to be WILDLY different variances to fail. I like to show my F test assumptions in my methods to stave off reviewers anyway.

    I heart prism.

  • physioprof says:

    If you're that fucken anxious about normality, just run a motherfucken non-parametric rank statistic.

  • DJMH says:

    Is it wrong to admit that I prefer nonparametric tests primarily because they're so much more fun to say? I mean, if you were choosing a fantasy statistics team, wouldn't you rather be able to say, "I picked up Wilcoxon and Kruskal Wallis and Mann Whitney U" rather than "Uh, Student T and ANOVA"?

  • DrugMonkey says:

    Wrong PP, that is precisely one of the problems with this sort of ignorance.

  • physioprof says:

    There are some circumstances where using a rank-based non-parametric can lead to an underestimate of the likelihood of type I error in comparison to a parametric statistic, but these are very rare, and very obvious when they obtain.

  • Joat-mon says:

    I agree with DJMH and PP; non-parametric is the way to go. Also, in what circumstances do you run your normality test? n of 10-15? I would not be comfortable saying the data are normally distributed based on such a small sample size, even if the program tells me so.

  • LeeHW says:

    Largely, I agree. ANOVA even works well when the dependent variable is binary (see Lunney 1970, in J. Educational Measurement). Buuuuuut... Wilcox showed in that lovely volume (2005, something like 'Intro to robust estimation' I think it was called), that ANOVA behaves very naughtily with respect to error rates when you've got both a fairly non-Gaussian distribution /and/ uneven group sizes.

    "It's robust so screw 'em" is about as unhelpful as "It's non-Normal, how dare you?"

  • drugmonkey says:

    Nonparametrics lack power. Type III error, aka using the wrong stats test and missing something.

  • The choice of test depends upon which type of error you consider more of a problem, and whose likelihood you thus want to minimize.

    Honestly, I'd just be happy if I never, ever have to explain again why you can't just do thirteen fuckjillion pairwise t-tests to analyze an experiment with numerous experimental conditions.

  • "why you can't just do thirteen fuckjillion pairwise t-tests"

    Heh heh. I'd like you to meet my friends, Benjamini & Hochberg.
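For anyone who hasn't met those friends: the Benjamini-Hochberg step-up procedure controls the false discovery rate across a batch of tests. A minimal sketch follows, assuming you already have one p-value per comparison. The function name and the p-values in the usage example are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    hypothetical_p = [0.02, 0.03, 0.035, 0.9]  # made-up values
    print(benjamini_hochberg(hypothetical_p))
```

Note the step-up behavior: a p-value that misses its own threshold can still be rejected if a larger p-value further down the ranking meets its threshold.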

  • drugmonkey says:

    Agreed. The idea of alpha inflation with multiple tests is apparently more difficult to grasp than it has any right to be....
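The arithmetic behind alpha inflation is short enough to show. The sketch below assumes the tests are independent, which pairwise comparisons on shared groups are not, so treat the result as a rough guide rather than an exact rate. The function names are hypothetical.

```python
from math import comb


def n_pairwise(n_conditions):
    """Number of distinct pairwise comparisons among n_conditions groups."""
    return comb(n_conditions, 2)


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    for groups in (3, 6, 10):
        tests = n_pairwise(groups)
        rate = familywise_error_rate(tests)
        print(f"{groups} groups -> {tests} t-tests -> familywise error about {rate:.2f}")
```

With six conditions that is already 15 comparisons, and the chance of at least one spurious "significant" result is better than even.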

  • physioprof says:

    Maybe we should start a new blogge called

  • drugmonkey says:

    What, you wanna turn your manuscript reviews into a series of hyperlinks?

  • physioprof says:

  • anon says:

    I'd agree with that
