What Makes a Statistical Analysis Wrong?

One of the most anxiety-laden questions I get from researchers is whether their analysis is “right.” I’m always slightly uncomfortable with that word because often there is no one right analysis.

It’s like finding Mr. or Ms. Right—most of the time, there is not just one Right. But there are many that are clearly Wrong.

Luckily, what makes an analysis right for your data is more easily defined than what makes a person right for you. It pretty much comes down to two things: whether the assumptions of the statistical method are being met and whether the analysis answers the research question.

Assumptions matter because a test needs to reflect the measurement scale of the variables, the study design, and issues in the data. A repeated measures study design requires a repeated measures analysis. A binary dependent variable requires a method designed for categorical outcomes.

But within those general categories, there are often many analyses that meet assumptions. Both a logistic regression and a chi-square test can handle a binary dependent variable if there is only a single categorical predictor. But a logistic regression can also incorporate covariates, directly test interactions, and calculate predicted probabilities. A chi-square test can do none of these.

So you get different information from different tests. They answer different research questions.
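
To make that contrast concrete, here is a minimal Python sketch using only the standard library. The 2x2 counts are invented purely for illustration, and the function names are hypothetical rather than taken from any statistics package. It assumes a single binary predictor, which is the one case where the logistic regression estimates can be written in closed form.

```python
import math

# Hypothetical 2x2 counts, invented for illustration only.
# Rows: predictor group (0 = control, 1 = treatment).
# Columns: binary outcome (no, yes).
table = [[30, 10],
         [20, 20]]


def chi_square_2x2(t):
    """Pearson chi-square test of independence, no continuity correction."""
    row = [sum(r) for r in t]
    col = [t[0][j] + t[1][j] for j in range(2)]
    n = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (t[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail p-value is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def logistic_single_binary_predictor(t):
    """Maximum likelihood fit of logit(p) = b0 + b1 * group.

    With one binary predictor the fitted probabilities equal the observed
    proportions, so the coefficients come straight from the group odds.
    """
    odds_control = t[0][1] / t[0][0]
    odds_treatment = t[1][1] / t[1][0]
    b0 = math.log(odds_control)
    b1 = math.log(odds_treatment / odds_control)
    return b0, b1


def predicted_probability(b0, b1, group):
    """Inverse logit of the linear predictor for a given group (0 or 1)."""
    return 1 / (1 + math.exp(-(b0 + b1 * group)))


stat, p_value = chi_square_2x2(table)
b0, b1 = logistic_single_binary_predictor(table)

# The chi-square test answers: is there any association at all?
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")

# The logistic regression answers: how large is the effect, and what
# outcome probability does the model predict for each group?
print(f"odds ratio = {math.exp(b1):.3f}")
print(f"P(yes | control)   = {predicted_probability(b0, b1, 0):.3f}")
print(f"P(yes | treatment) = {predicted_probability(b0, b1, 1):.3f}")
```

Both analyses are defensible for these data, but they report different things. Once covariates or interactions enter the research question, the closed-form shortcut above no longer applies and the model has to be fit iteratively with statistical software.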

An analysis that is correct from an assumptions point of view is totally useless if it doesn’t answer the research question. A data set can spawn an endless number of statistical tests (and you can spend an endless number of days running them) that don’t answer the research question.

And the real bummer is it’s not always clear that the analyses aren’t relevant until you sit down to write up the research paper.

That’s why writing out the research questions in theoretical and operational terms is the first step of any statistical analysis. It’s absolutely fundamental. And I mean writing them in minute detail. Issues of mediation, interaction, subsetting, control variables, et cetera, should all be blatantly obvious in the research questions.

The part on writing results sections in Daryl Bem’s chapter “Writing the Empirical Journal Article” is an excellent resource for planning a data analysis. It contains the best examples I’ve ever seen on how to write testable research questions. Thinking about how to write results before solidifying the research questions helps ensure the analysis is able to answer the questions. Whether the answer is what you expected or not is a different issue.

So when you are concerned about getting an analysis “right,” clearly define the design, variables, and data, but most importantly, get explicitly clear about what you want to learn from this analysis. Once you’ve done this, it’s much easier to find the statistical method that answers the research questions and meets assumptions.

