Edited by: Heather M. Buzick, Educational Testing Service, USA

Reviewed by: Andrew Jones, American Board of Surgery, USA; Fiona Fidler, LaTrobe University, Australia

*Correspondence: Rens Van de Schoot, Department of Methodology and Statistics, Utrecht University, P.O. Box 80.140, 3508 TC Utrecht, Netherlands. e-mail:

This article was submitted to Frontiers in Quantitative Psychology and Measurement, a specialty of Frontiers in Psychology.

This is an open-access article subject to an exclusive license agreement between the authors and Frontiers Media SA, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.

This mini-review illustrates that testing the traditional null hypothesis is not always the appropriate strategy. Half in jest, we discuss Aristotle's scientific investigations into the shape of the earth in the context of evaluating the traditional null hypothesis. We conclude that Aristotle was actually interested in evaluating

The present mini-review argues that testing the traditional null hypothesis is not always the appropriate strategy. That is, many researchers have no particular interest in the hypothesis “nothing is going on” (Cohen,

One important prior note has to be made. Researchers like Wagenmakers et al. (

Cohen (

The question of the shape of the earth was a recurring issue in scientific debate during the era of Aristotle (384–322 BC; see Russell, ^{1}

We propose that in order to falsify the old Mesopotamian hypothesis, Aristotle might have used an approach based on testing the traditional null hypothesis:

H_{0}: The shape of the earth is a flat disk,

H_{1}: The shape of the earth is not a flat disk.

Clearly, these are not statistical hypotheses, and no actual statistical inference could have been carried out; they are designed purely to serve as an example.

So, in the setup of our reverse science fiction, Aristotle would have gathered data about the shape of the earth and found evidence against the null hypothesis. For example, stars that were seen in Egypt were not seen in countries north of Egypt, while stars that never were beyond the range of observation in northern Europe were seen to rise and set in Egypt. Such observations could not be taken as evidence of a flat earth. H_{0} would have been rejected, leading Aristotle to conclude that the earth cannot be represented by a flat disk.

In actual fact, Aristotle agreed with Pythagoras (582 to ca. 507 BC), who believed that all astronomical objects have a spherical shape, including the earth. So, once again embarking on an episode of imaginary history, Aristotle might also have tested:

Now, imagine that Aristotle continued his search for data and that he gathered data yielding evidence against (!) the null hypothesis^{2}

What can be learned from this conclusion? Not much! Both hypothesis tests reject the traditional null hypotheses _{0} and _{1} and ^{3}

Rather than using the hypothesis tests given above, we might argue that Aristotle was actually interested in evaluating:

_{A}

versus

_{B}

In such a direct comparison the conclusion will be more informative.

Evaluating specific expectations directly produces more useful results than sequentially testing traditional null hypotheses against catch-all rivals. We argue that researchers are often interested in the evaluation of informative hypotheses and already know that the traditional null hypothesis is an unrealistic hypothesis. This presupposes that prior knowledge often is available; if this is not the case, testing the traditional null hypothesis is appropriate. In most applied articles, however, prior knowledge is indeed available in the form of specific expectations about the ordering of statistical parameters.

Let us illustrate this using an example of Van de Schoot et al. (

Suppose we want to compare these five sociometric status groups on the number of committed offenses reported to the police last year (minor theft, violence, and so on). Let μ_{1} denote the mean number of committed offenses for the popular group, μ_{2} for the rejected group, μ_{3} for the neglected group, μ_{4} for the controversial group, and μ_{5} for the average group. Different types of hypotheses can be formulated; these are used in the procedures described in the remainder of this paper.

First, informative hypotheses can be formulated denoted by

Second, there is the traditional null hypothesis (denoted by H_{0}), which states that nothing is going on and all groups have the same score, H_{0}: μ_{1} = μ_{2} = μ_{3} = μ_{4} = μ_{5}. Third, if no constraints are imposed on any of the means and any ordering is equally likely, the hypothesis is called a “catch-all” alternative hypothesis, or an unconstrained hypothesis (denoted by H_{U}), H_{U}: μ_{1}, μ_{2}, μ_{3}, μ_{4}, μ_{5}. In the next section we present an overview of possible alternatives for traditional null hypothesis testing to evaluate one or more informative hypotheses.
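The three types of hypotheses can be read as increasingly permissive restrictions on the same five means. The following Python sketch is only an illustration of that idea: the function names, the numerical means, and in particular the ordering used for the informative hypothesis are hypothetical and are not taken from the example study.

```python
from typing import Sequence


def satisfies_null(means: Sequence[float], tol: float = 1e-9) -> bool:
    """Traditional null hypothesis: all group means are equal (within tol)."""
    return max(means) - min(means) <= tol


def satisfies_ordering(means: Sequence[float], order: Sequence[int]) -> bool:
    """Informative hypothesis: means[order[0]] < means[order[1]] < ..."""
    ordered = [means[i] for i in order]
    return all(a < b for a, b in zip(ordered, ordered[1:]))


def satisfies_unconstrained(means: Sequence[float]) -> bool:
    """Unconstrained hypothesis: no restriction, so any set of means agrees."""
    return True


# Hypothetical means, indexed 0..4 for the popular, rejected, neglected,
# controversial, and average groups (made-up numbers, for illustration only).
means = [0.4, 2.1, 0.9, 1.6, 1.0]

# Hypothetical expectation (an assumption, not the authors' hypothesis):
# popular < neglected < average < controversial < rejected.
illustrative_order = [0, 2, 4, 3, 1]

print(satisfies_null(means))
print(satisfies_ordering(means, illustrative_order))
print(satisfies_unconstrained(means))
```

Because the unconstrained hypothesis admits every configuration of means, agreement with it carries no information; an order constraint excludes most configurations, which is what makes agreement with it informative.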

Different procedures are described in a range of sources that allow for the evaluation of informative hypotheses. We present an overview of technical papers, software, and applications for two types of approaches: (1) hypothesis testing approaches and (2) model selection approaches. Note that we limit ourselves to a discussion of papers where software is available for applied researchers.

Some approaches reported in the literature render a _{I}_{0} or with _{U}

versus

and

versus

where in the second hypothesis test ^{4}

Testing informative hypotheses for structural equation models (SEM) is described in Stoel et al. (^{5}

The procedure described in Van de Schoot et al. (^{6}

A second way of evaluating an informative hypothesis is to use a model selection approach. This is not a test of the model in the sense of hypothesis testing; rather, it is a comparison of statistical models using a trade-off between model fit and model complexity. Several competing statistical models may be ranked according to their value on the model selection tool used, and the one with the best trade-off is the winner of the model selection competition.
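The trade-off between fit and complexity can be made concrete with a small sketch. It uses the standard definition of Akaike's information criterion, AIC = −2 log L + 2k, to rank an equal-means model against an unconstrained-means model for a one-way Gaussian ANOVA with a common variance. The data are invented, the function names are our own, and the sketch does not handle inequality constraints.

```python
import math
from typing import Dict, List


def max_log_likelihood(rss: float, n: int) -> float:
    """Maximized Gaussian log-likelihood with the ML variance estimate rss / n."""
    sigma2 = rss / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def aic(log_lik: float, n_params: int) -> float:
    """AIC = -2 log L + 2k: a misfit term plus a complexity penalty."""
    return -2.0 * log_lik + 2.0 * n_params


def compare_models(groups: List[List[float]]) -> Dict[str, float]:
    """Return the AIC of the equal-means and unconstrained-means models."""
    all_obs = [y for g in groups for y in g]
    n = len(all_obs)
    grand_mean = sum(all_obs) / n
    # Residual sum of squares when all groups share one mean.
    rss_equal = sum((y - grand_mean) ** 2 for y in all_obs)
    # Residual sum of squares when every group has its own mean.
    rss_free = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return {
        # Parameters: one common mean plus the variance.
        "equal_means": aic(max_log_likelihood(rss_equal, n), 2),
        # Parameters: one mean per group plus the variance.
        "unconstrained": aic(max_log_likelihood(rss_free, n), len(groups) + 1),
    }


# Invented offense counts for three groups (illustration only).
groups = [[0, 1, 1, 0], [3, 4, 5, 4], [1, 2, 1, 2]]
scores = compare_models(groups)
best = min(scores, key=scores.get)  # lowest AIC wins the competition
print(scores, best)
```

The model with the lowest value wins. The unconstrained model always fits at least as well as the equal-means model, so it is only preferred when the gain in fit outweighs its larger penalty.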

There is a variety of model selection procedures commonly used in practical applications, most notably Akaike's information criterion (AIC; Akaike,

Alternative model selection tools have been proposed in the literature. First, an alternative model selection procedure is the paired-comparison information criterion (PCIC) proposed by Dayton (1998, 2003), with an application in Taylor et al. (^{7}

Second, the literature also contains one modification of the AIC that can be used in the context of inequality constrained ANOVA models. It is called the order-restricted information criterion (ORIC; Anraku,

Alternatives for the BIC and the DIC are under construction: see Romeijn et al. (under review) and Van de Schoot et al. (under review-a), respectively.

Finally, one other method of model selection, which is receiving more and more attention in the literature, involves the evaluation of informative hypotheses using Bayes factors. In this method each (informative) hypothesis of interest is provided with a “degree of support,” which quantifies how much support the data provide for each of the hypotheses under investigation. This process involves collecting evidence that is meant to provide support for or against a given hypothesis; as evidence accumulates, the degree of support for a hypothesis increases or decreases.
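One way to obtain such a degree of support for an order-constrained hypothesis is to divide the proportion of the unconstrained posterior that agrees with the constraints by the proportion of the unconstrained prior that agrees with them. The sketch below is a deliberately simplified illustration of this idea and not the implementation used in the software discussed here. It assumes that the posterior of each mean can be approximated by an independent normal distribution centered at the sample mean, and that the prior treats all k! orderings of k means as equally likely. All names and numbers are hypothetical.

```python
import math
import random
from typing import Sequence


def bayes_factor_ordering(
    means: Sequence[float],
    std_errors: Sequence[float],
    n_draws: int = 20000,
    seed: int = 1,
) -> float:
    """Approximate Bayes factor of 'mean_1 < mean_2 < ... < mean_k' against
    the unconstrained hypothesis; means are given in the hypothesized order."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # Assumption: independent normal approximation to the posterior.
        draw = [rng.gauss(m, s) for m, s in zip(means, std_errors)]
        if all(a < b for a, b in zip(draw, draw[1:])):
            hits += 1
    fit = hits / n_draws  # posterior proportion in agreement with the ordering
    # Assumption: exchangeable prior, so one full ordering has mass 1 / k!.
    complexity = 1.0 / math.factorial(len(means))
    return fit / complexity


# Hypothetical sample means and standard errors, listed in the expected order.
bf_supported = bayes_factor_ordering([0.5, 1.5, 4.0], [0.3, 0.3, 0.4])
# The same groups listed in the reverse of the observed order.
bf_contradicted = bayes_factor_ordering([4.0, 1.5, 0.5], [0.4, 0.3, 0.3])
print(bf_supported, bf_contradicted)
```

A value above 1 indicates that the data support the ordering relative to the unconstrained hypothesis, and a value below 1 indicates the opposite. Under these assumptions the value cannot exceed k!, which is reached only when the entire posterior agrees with the ordering.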

The methodology of evaluating a set of inequality constrained hypotheses has proven to be a flexible tool that can deal with many types of constraints. We refer to the book of Hoijtink et al. (

Software is available for:^{8}

AN(C)OVA models (Klugkist et al.,

Multivariate linear models including time-varying and time-invariant covariates (Mulder et al.,

Latent class analyses (Hoijtink,

Order-restricted contingency tables (Laudy and Hoijtink,

Statistics has come a long way since the early beginnings of testing the traditional null hypothesis of “nothing is going on.” Developments in statistics, in particular in the evaluation of informative hypotheses, allow researchers to directly evaluate their expectations specified with inequality constraints. This mini-review illustrates that testing the traditional null hypothesis is not always an appropriate strategy. We argued that more can be learned from data by evaluating informative hypotheses than by testing the traditional null hypothesis. These informative hypotheses were introduced by means of an example. Finally, we presented the current state of affairs in the area of evaluating informative hypotheses.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supported by a grant from the Netherlands Organization for Scientific Research: NWO-VICI-453-05-002.

^{1}The historical figure Aristotle never denied that the earth was round; in fact, from the third century BC onward, no educated person in the history of Western civilization believed that the earth was flat. Indeed, Eratosthenes (276–195 BC) gave a reasonable approximation of the earth's circumference and provided strong support for the hypothesis that the earth is round.

^{2}At the time, no one was able to see the earth as a whole and know it to be a sphere by direct observation. But it was possible to derive some conclusions from the hypothesis that the earth is a sphere and use these to test the null hypothesis. For example, one could predict that if someone sailed west for a sufficient amount of time, this person would return to the original starting point (Magellan did this). Or one could predict that if the earth was a sphere, ships at sea would first show their sails above the horizon, and then later, as they sailed closer, their hulls (Galileo observed this). These precise predictions, if exactly confirmed, would establish a provisional objective reality for the idea that the earth is a sphere.

^{3}Admittedly, not all methodologists would agree on this point. In response to Aristotle's imagined disappointment, Popper would have argued that this insight is all that Aristotelian science, or any science for that matter, can hope for. When it comes to general hypotheses, or hypotheses that are beyond the reach of direct verification, we can only be sure of their falsification. Direct positive evidence for hypotheses about the shape of the earth cannot be obtained, so there would be no reason for Aristotle to be disappointed. Popper would have argued that as there is no way to prove that the earth is spherical from direct verification, we can only hypothesize that it has the shape of a sphere. Since Aristotle found evidence demonstrating that the earth is not spherical, this hypothesis is rejected. In fact, according to Popperian reasoning, Aristotle should rejoice in the fact that at least he now knows the earth is not a sphere!

^{4}The software can be downloaded at

^{5}The corresponding scripts can be downloaded from the Web site of Psychological Methods.

^{6}The software can be downloaded at staff.fss.uu.nl/agjvandeschoot

^{7}The software can be downloaded at

^{8}The software can be downloaded at