
Edited by: Pietro Cipresso, IRCCS Istituto Auxologico Italiano, Italy

Reviewed by: Mark D. Reckase, Michigan State University, United States; Prathiba Natesan, University of North Texas, United States

*Correspondence: Javier Revuelta

This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

This article introduces Bayesian estimation and evaluation procedures for the multidimensional nominal response model. The utility of this model is to perform a nominal factor analysis of items that consist of a finite number of unordered response categories. The key aspect of the model, in comparison with the traditional factorial model, is that there is a slope for each response category on the latent dimensions, instead of slopes associated with the items. The extended parameterization of the multidimensional nominal response model requires large samples for estimation. When the sample size is moderate or small, some of these parameters may be weakly empirically identifiable and the estimation algorithm may run into difficulties. We propose a Bayesian MCMC inferential algorithm to estimate the parameters and the number of dimensions underlying the multidimensional nominal response model. Two Bayesian approaches to model evaluation were compared: discrepancy statistics (DIC, WAIC, and LOO) that provide an indication of the relative merit of different models, and the standardized generalized discrepancy measure, which requires resampling data and is computationally more involved. A simulation study was conducted to compare these two approaches, and the results show that the standardized generalized discrepancy measure can be used to reliably estimate the dimensionality of the model, whereas the discrepancy statistics are questionable. The paper also includes an example with real data in the context of learning styles, in which the model is used to conduct an exploratory factor analysis of nominal data.

Nominal variables are routinely obtained from a number of item response formats in the fields of ability measurement, attitude scales, sample surveys, market research, etc. One example is multiple-choice items that contain one correct option and several distractors. When the data come from multiple-choice items, the factorial analysis of nominal variables often proceeds by dichotomizing the data into right and wrong responses and submitting the dichotomous data matrix to a categorical factor analysis procedure. However, there are situations in which dichotomization is not an option because the interest is in the relation between the latent dimensions and the response categories. For example, in an item from a market research survey each category may represent a purchase option, and there is no natural way to dichotomize the data.

The factorial analysis of responses that have an implicit ordering, together with the associated estimation and testing procedures, has long been discussed in the psychometric literature (Christoffersson,

Applications of constrained versions of the multidimensional nominal response model (MNRM) have been published in the psychometric literature. For example, Hoskens and de Boeck (

The estimation problems of the MNRM emerge because the contingency table of the response patterns is typically too sparse due to the large number of response categories that have to be modeled. Maximum likelihood estimates can be obtained using computer programs such as Latent GOLD (Vermunt and Magidson,

The statistical problems of the nominal model may be addressed by the definition of prior distributions for the parameters and moving the inference to a Bayesian context. Bayesian inference combines the information from the sample with the information in the prior distributions, which stabilizes estimates, alleviates the problems of lack of convergence for some parameters and provides a means for simulating the posterior distribution of model evaluation statistics.

The purpose of this article is to introduce a Bayesian inferential algorithm for the evaluation of the latent dimensionality of the MNRM. The proposed procedure is based on standard Bayesian estimation algorithms by Markov chain Monte Carlo (MCMC) procedures. Bayesian estimation has already been applied to ordinal responses (Kieftenbeld and Natesan,

The rest of the article is organized in the following sections. Section Multidimensional Nominal Response Model describes the MNRM, the constraints for parameter identification, and the rotation problem. The MCMC Bayesian estimation algorithm is presented in Section Bayesian Parameter Estimation, and Section Bayesian Model Evaluation describes the model evaluation statistics. Section Simulation Study consists of a simulation study that evaluates the Bayesian inferential algorithm in realistic conditions. Section Real Data Analysis contains a real data study in the context of a questionnaire of learning styles whose response categories represent different learning styles with no implicit order among them. Section Final Remarks concludes the article.

The MNRM was introduced by Takane and de Leeuw ( ). The probability of responding in category k, conditional on the vector of latent dimensions θ = (θ_{1}, …, θ_{d}, …, θ_{D}), is given by the logistic function:

where z_{k} is the response value of category k.

The parameters of the model in Equation (2) are the intercept c_{k} and the slopes a_{k1}, …, a_{kd}, …, a_{kD}. The MNRM is usually estimated under the assumption that the mean of the dimensions is zero; the intercept then represents the response value for an individual whose vector of dimensions equals the population mean. The slope a_{kd} represents the relation of the response value z_{k} with dimension θ_{d}.
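The category probabilities defined by Equations (1) and (2) can be sketched in a few lines of code. The following is an illustrative numpy version (Python, not the paper's own R/Stan code), assuming utilities z_{k} = c_{k} + Σ_{d} a_{kd}θ_{d} and the logistic (softmax) form:

```python
import numpy as np

def category_probabilities(c, A, theta):
    """MNRM category probabilities for one item.

    c     : (K,) intercepts, one per response category
    A     : (K, D) slopes of each category on each latent dimension
    theta : (D,) latent trait vector of one individual
    """
    z = c + A @ theta                   # utilities, Equation (2)
    z = z - z.max()                     # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum()     # logistic form, Equation (1)
    return p

# Simple constraints: the last category has its intercept and slopes set to 0
c = np.array([5.0, 4.0, 0.0])
A = np.array([[1.0, 0.5],
              [0.5, 1.0],
              [0.0, 0.0]])
p = category_probabilities(c, A, np.zeros(2))   # individual at the trait mean
```

With θ = 0 the utilities reduce to the intercepts, so the category with the largest intercept is the most probable, as described in the text.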

The model in Equations (1) and (2) with only one dimension (

An item with

The indeterminacy problem is resolved by imposing a constraint on the utilities. Possibly the easiest methods of identification for the parameters of Equation (2) are

Simple constraints consist of setting to zero the response value of one of the categories. Simple constraints are useful for those items that have a reference category against which the other categories are compared, for example, the correct option of a multiple-choice item. The parameters c_{K} and a_{K} of the last category are set to 0, which implies that z_{K} = 0. The utilities of the remaining categories are interpreted relative to z_{K} using log-odds. In particular, the parameters of category

Deviation constraints consist of setting to zero the sum of the utilities, that is, z_{1} + ⋯ + z_{K} = 0.

Deviation constraints are useful for those items in which it is undesirable to have one category with zero parameters, that is, items that do not have a reference category; Section Real Data Analysis below shows one example. Deviation constraints involve trade-offs between parameters: if one parameter increases, the others must decrease so that the sum of the parameters remains constant at zero. These trade-offs introduce technical complications in the estimation algorithm. For these reasons, the model is estimated under simple constraints and the estimates are subsequently transformed to deviation constraints if necessary. Suppose that v is a vector of item parameters [either a vector of intercepts, v = (c_{1}, …, c_{K}), or a vector of slopes in the same dimension, v = (a_{1d}, …, a_{Kd})]. Parameters can be transformed to deviation constraints by subtracting the mean of the vector:

For example, suppose that an item has the following intercepts under simple constraints: c_{1} = 5, c_{2} = 4, and c_{3} = 0; these parameters indicate that the probability of categories 1 and 2 is higher than the probability of category 3 for an individual whose vector of dimensions is zero. According to Equation (6), the intercepts under deviation constraints are c_{1} = 2, c_{2} = 1, and c_{3} = −3. Although simple and deviation constraints convey the same information regarding the probabilities of the categories, parameter values under simple constraints will vary depending on which category is used as the reference. When the choice of the reference category is arbitrary, deviation constraints are preferred.
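The transformation in Equation (6) is simply a centering of the parameter vector. A minimal illustrative sketch (Python; the function name is ours, not from the paper):

```python
import numpy as np

def to_deviation(v):
    """Transform a vector of intercepts or of slopes in one dimension
    from simple constraints to deviation constraints by centering it
    (subtracting the mean, as in Equation 6)."""
    v = np.asarray(v, dtype=float)
    return v - v.mean()

# The worked example from the text: intercepts (5, 4, 0) under simple
# constraints become (2, 1, -3) under deviation constraints.
d = to_deviation([5.0, 4.0, 0.0])
```

Because the softmax in Equation (1) is invariant to adding a constant to all utilities, centering leaves the category probabilities unchanged.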

Both simple and deviation constraints imply that 1 +

Recent developments of the MNRM have been proposed by Thissen et al. (

The parameters of Equation (7) are the intercept, c_{k}, a vector of item slopes, a_{1}, …, a_{d}, …, a_{D}, and the scoring parameters of category k, s_{k1}, ⋯, s_{kd}, ⋯, s_{kD}. The intercept has the same interpretation as in Equation (2), and the constraint c_{1} = 0 is imposed for identification. The scoring parameter s_{kd} represents the weight of category k in dimension θ_{d}, and the slope a_{d} is the weight of the item in θ_{d}.

The model in Equation (7) assumes that there exists an ordering among the categories, albeit an unknown one. The ordering is represented by the scoring parameters and is estimated from the data. Consider the scoring parameters of the categories in a dimension θ_{d}: s_{1d}, …, s_{kd}, …, s_{Kd}. These scores are used to obtain an ordering of the categories according to their weight in the dimension θ_{d}. The scoring parameters of two categories in θ_{d} must be fixed to constant values to identify the model and serve as anchor points. Typically, the scores of the first and the last category are fixed as s_{1d} = 0 and s_{Kd} = K − 1, and the scores s_{kd} for the remaining categories are estimated.

The model in Equation (7) has

The slopes in Equation (2) and the slopes in Equation (7) are related by the equation:

where s_{1d} = 0 and s_{Kd} = K − 1.

Note that the intercept parameter is the same in Equations (2) and (7), and there is no need to transform one into the other.

The model in Equation (7) has been applied to multiple-choice items, in which category K is the correct response. The score s_{1d} = 0 is arbitrarily assigned to the first distractor, which serves as a reference; s_{Kd} is assigned to the correct response; and s_{kd} is estimated for the remaining distractors. When s_{kd} is smaller than 0, the interpretation is that distractor k ranks below the reference distractor in the dimension; the opposite interpretation applies when s_{kd} is higher than 0.

The classic parameterization of the MNRM is appropriate when the interest is to estimate the relation of each category with each dimension. On the other hand, the parameterization in Equation (7) would be preferable when the interpretation of item slopes and the ordering of the categories are meaningful. For instance, consider the following item taken from a sample survey about social attitudes.

The item in Table

Example of item from a survey of social attitudes.

Choose the most important attitude that children must learn at home

- Independence

- Hard work

- Responsibility

- Imagination

- Tolerance and respect for other persons

- Perseverance

- Religious faith

- Abnegation

- Obedience

- Don't know

The focus of this paper is on the use of Bayesian methods to estimate the number of dimensions under the MNRM. From a computational point of view, simple constraints are preferable for simplicity and numerical stability. However, the other parameterizations may be preferred in applications, depending on the specific items that are being analyzed. The results of this paper regarding Bayesian methods do not depend on the parameterization and are equally applicable when deviation constraints, or item slopes and scoring parameters, are used to interpret the results. The recommended computational strategy is to estimate the model under simple constraints and transform the output of the estimation algorithm to the other parameterizations if desired.

Akin to any other factor model, the parameters for the MNRM are subject to rotational indeterminacy. To fix rotation during estimation, we have implemented the same solution as in the NOHARM computer program, which estimates the normal ogive model for dichotomous data (Fraser and McDonald,

Let

where the subscripts refer to item, category, and dimension, respectively. For example, a_{321} is the slope of item 3 and category 2 in dimension 1. Equation (10) shows the pattern of zeros and ones that has to be imposed on the slopes to avoid rotational indeterminacy during estimation. Bayesian estimation algorithms are applied assuming that these zeros and ones are constant values, and the remaining slopes are estimated. After estimation is complete, the resulting matrix

The vector of utilities can be written in matrix form as:

Rotation consists of finding a nonsingular rotation matrix T and computing the rotated slopes B^{*} = BT.

Rotated scores, θ^{*}, are given by:

Because TT^{−1} = I, it holds that B^{*}θ^{*} = BTT^{−1}θ = Bθ, so the rotated and unrotated models are statistically equivalent.

Matrix
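The rotational invariance of the utilities can be verified numerically. In the illustrative Python sketch below (notation assumed from this section: B is the matrix of slopes, θ the latent scores, and T an arbitrary nonsingular transformation), the product Bθ is unchanged after rotation:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(8, 2))       # slope matrix (categories x dimensions)
theta = rng.normal(size=2)        # latent scores of one individual
T = np.array([[1.0, 0.4],         # any nonsingular rotation matrix
              [0.0, 1.0]])

B_star = B @ T                           # rotated slopes, B* = BT
theta_star = np.linalg.solve(T, theta)   # rotated scores, T^{-1} theta

# Utilities are unchanged: B* theta* = B T T^{-1} theta = B theta
assert np.allclose(B_star @ theta_star, B @ theta)
```

This is why the pattern of fixed zeros and ones must be imposed during estimation: without it, any nonsingular T yields an equally well-fitting solution.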

The process of estimating the model consists of three steps:

Apply the Bayesian estimation algorithm described in Section Bayesian Parameter Estimation to estimate the model under simple constraints, imposing the pattern of zeros and ones described in Section Rotation of Slopes to avoid rotational indeterminacy. The model evaluation statistics described in Section Bayesian Model Evaluation are used to test model fit. If the model does not fit, a model with a higher number of dimensions has to be estimated. The output of this step is a model, parameterized with simple constraints, that satisfactorily fits the data.

Estimated parameters may be transformed to deviation constraints with Equation (6) or to the item slopes and scoring parameters with Equation (9). The transformation of parameterizations is optional and depends on the intended interpretation and the type of items.

Rotate the slopes using a rotation algorithm or by graphical rotation. This step is optional. The choice of a rotation method depends on the judgment of the data analyst.

The MNRM has a heavy parameterization because there are slopes for (

MCMC provides draws from the posterior distribution of the item parameters. These samples can be summarized using descriptive statistics to obtain a point estimate, the simulated expected a posteriori (EAP) estimate, and the posterior variance. Previous applications of MCMC to factorial and multidimensional item response models can be seen, for example, in Béguin and Glas (

One property of factorial models is that the orientation of the dimensions can be reversed without altering the fit of the model. That is, if one dimension θ_{d} and the slopes in that dimension are multiplied by −1, the resulting model is statistically equivalent. This problem is especially compelling for MCMC estimation because several Markov chains of simulated parameters are run in parallel and some procedure must be applied to ensure that all chains are oriented in the same direction. In this article, we have fixed the orientation of each dimension by setting its first slope to 1, as mentioned in Section Rotation of Slopes. This is compensated by freeing the standard deviations of the dimensions (σ_{1}, …, σ_{d}, …, σ_{D}), so that the total number of estimated parameters remains unchanged.

The estimated parameters are the intercepts (

The hyper-parameters δ, γ, μ, and τ will be held to constant values in this article. A more general procedure has been proposed by Natesan et al. (

A crucial problem when performing an exploratory factor analysis is the selection of the number of dimensions. In the frequentist framework, there are many criteria suitable for this purpose: the chi-square goodness-of-fit statistics, the RMSEA statistic for the hypothesis of close fit, parallel analysis, and many others (Brown,

One readily interpretable model evaluation statistic is the standardized generalized dimensionality discrepancy measure (SGDDM), introduced by Levy et al. (

The SGDDM applies to dichotomous and ordinal responses (Yel et al., ). To apply it to nominal responses, the responses are recoded into indicator variables: x_{ijk} takes the value 1 when the response is

where P_{ijk} is the response function given by Equation (1). P_{ijk} is computed conditional on the item parameters and the values of θ.

Posterior predictive checks proceed as follows. Suppose that ω_{1}, …, ω_{l}, …, ω_{L} are the vectors of parameters simulated in the MCMC chains. For each draw ω_{l}, simulate a matrix of predicted responses,

where δ(·) returns the value 0 or 1 when its argument is false or true, respectively.

A discrepancy statistic for the whole model is obtained by averaging the value of

The posterior predictive p-value, p_{post}, is the proportion of cases in which the
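The computation of p_{post} can be sketched as follows (Python, for illustration only; the inputs `realized` and `predicted` are hypothetical arrays holding SGDDM(X; ω_{l}) and SGDDM(X^{pred}_{l}; ω_{l}) for each MCMC draw l, and the direction of the comparison follows the usual posterior-predictive-check convention):

```python
import numpy as np

def posterior_predictive_p(realized, predicted):
    """Posterior predictive p-value: the proportion of MCMC draws in
    which the predicted discrepancy is at least as large as the
    realized discrepancy."""
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(predicted >= realized))

# Hypothetical SGDDM values over three MCMC draws
p_post = posterior_predictive_p([0.072, 0.072, 0.072],
                                [0.070, 0.075, 0.080])
```

Values of p_{post} near 0 indicate that the realized discrepancy is systematically larger than predicted, which is evidence of misfit.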

Alternatively, Levy and Svetina ( ) proposed inspecting the scatterplot of the realized values, SGDDM(X; ω), against the predicted values, SGDDM(X^{pred}; ω).

Model evaluation by posterior predictive checks is a computationally intensive method based on resampling data. Several summary statistics have been proposed to avoid resampling. Possibly, the most popular statistic within the Bayesian context is the deviance information criterion (DIC; Spiegelhalter et al.,

Recently, several alternatives to the DIC have been proposed in the area of Bayesian inference to overcome the dependence of the DIC on a precise point-wise estimator and its assumption of posterior normality. These new statistics are the widely applicable information criterion (WAIC; Watanabe,

Similar to the AIC and other measures of model adequacy based on information theory, the WAIC and LOO quantify the discrepancy between the model and the data while also taking model complexity into account. The purpose is not to test a hypothesis of model fit but to compare several competing models and select the one that most closely approaches the data. The WAIC closely approximates cross-validation, although it is computed in a single sample instead of re-fitting the model using different samples. The WAIC is potentially useful in the psychometric context because it still works with highly parameterized models, where other alternatives such as the AIC and DIC are no longer applicable. However, to our knowledge, these statistics have not been previously applied to item response or factorial models.
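Given a matrix of pointwise log-likelihoods saved from the MCMC run, the WAIC quantities can be computed directly. The sketch below is a numpy-only illustration of the standard formulas (log pointwise predictive density plus a variance-based complexity penalty) that the loo R package used in this paper implements; it is not the paper's actual code:

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (L draws x N observations) matrix of pointwise
    log-likelihoods.

    lppd   : log pointwise predictive density, sum over observations
             of log mean_l exp(log_lik[l, i])
    p_waic : effective number of parameters, sum over observations of
             the posterior variance of log_lik[:, i]
    """
    log_lik = np.asarray(log_lik, dtype=float)
    m = log_lik.max(axis=0)                                   # stability shift
    lppd = np.sum(m + np.log(np.exp(log_lik - m).mean(axis=0)))
    p_waic = np.sum(log_lik.var(axis=0, ddof=1))
    return -2.0 * (lppd - p_waic), lppd, p_waic               # deviance scale

# Synthetic example: 200 draws, 50 observations
rng = np.random.default_rng(0)
w, lppd, p_w = waic(rng.normal(-1.0, 0.1, size=(200, 50)))
```

As in the simulation tables, smaller WAIC values indicate a smaller estimated discrepancy between model and data.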

A Monte Carlo simulation study was conducted to evaluate the performance of the SGDDM and the discrepancy measures (DIC, WAIC, and LOO) in recovering the true number of dimensions for the MNRM.

We simulated 50 data sets from models with one, two, and three dimensions. Models with one, two, and three dimensions were then estimated from each simulated sample. We used only a limited number of samples because MCMC is highly time consuming and the simulation study had to be kept within the limits of our computational resources. The figure of 50 samples was taken from Levy et al. (

Two sets of prior distributions were used: informative priors and uniform priors. The informative priors are given in Equation (14), with the values of δ and γ set to 3 because, in our previous experience, this value renders a relatively flat prior that at the same time avoids the occurrence of extreme values in the estimated parameters. The prior distribution for σ_{d} was more stringent to avoid excessive indeterminacy in the scale of the dimension. σ_{d} had a lognormal (0, 0.5) prior, which has a median of 1, an expectation of 1.13, and a standard deviation of 0.6. This lognormal prior is the same as the one used by the BILOG computer software (Zimowski et al., ) and prevents extreme values of σ_{d} from being realized in the simulated samples when the distribution is too flat. Thus, the informative priors are:

And the uniform priors are:

The simulation was repeated with sample sizes of 250, 500, and 1,000 simulees for each number of dimensions. The total number of conditions is 18 (3 numbers of dimensions × 3 sample sizes × 2 sets of priors). Responses were simulated from a test of four items with four categories each.

Data were simulated using R version 3.2.5. (R Development Core Team,

Deviance measures, DIC, WAIC, and LOO were computed using the loo R package (Vehtari et al.,

Item parameters used in data generation.

Item | Category | c | a_{1} | a_{2} | a_{3}
---|---|---|---|---|---
1 | 1 | −1 | 1.0 | 0.0 | 0.0
| 2 | 0 | 0.5 | 1.0 | 0.0
| 3 | 1 | −1.0 | 0.5 | 1.0
| 4 | 0 | 0.0 | 0.0 | 0.0
2 | 1 | −1 | 1.0 | −0.5 | −1.0
| 2 | 0 | 0.5 | 1.0 | −0.5
| 3 | 1 | −1.0 | 0.5 | 0.5
| 4 | 0 | 0.0 | 0.0 | 0.0
3 | 1 | −1 | 1.0 | 0.5 | 0.5
| 2 | 0 | 0.5 | −0.5 | −1.0
| 3 | 1 | −1.0 | 1.0 | −0.5
| 4 | 0 | 0.0 | 0.0 | 0.0
4 | 1 | −1 | 1.0 | 1.0 | −0.5
| 2 | 0 | 0.5 | 0.5 | 0.5
| 3 | 1 | −1.0 | −0.5 | −1.0
| 4 | 0 | 0.0 | 0.0 | 0.0

Column a_{1} is used for the model with one dimension, columns a_{1} and a_{2} are used for the model with two dimensions, and columns a_{1} to a_{3} are used for the model with three dimensions

The analysis of simulation results includes the means of the model evaluation statistics, the empirical proportion of rejections (EPR) of the estimated model, the empirical proportion of selections (EPS), and the root mean square errors (RMSE) of the estimated parameters. The EPR applies to the SGDDM only. The SGDDM can be used to test the null hypothesis that a model fits, using p_{post} as the p-value; the model is rejected when p_{post} ≤ 0.05. The proportion of simulated samples in which the model is rejected is the EPR. When the model in the null hypothesis (that is, the model used to compute the SGDDM) is the same as the model used to simulate the samples, the EPR is an estimate of the Type I error rate of the SGDDM. When the model in the null hypothesis does not coincide with the model used to simulate the data, the EPR is an estimate of the statistical power of the test.

The EPS applies to the model discrepancy statistics, DIC, WAIC, and LOO. In contrast to the SGDDM, the discrepancy statistics are not used to test a hypothesis but to select the best model from a number of competing models. Recall that three models (with one, two, and three dimensions) are estimated from each simulated sample. The discrepancy statistic evaluates the distance between the model and the data, and the model that minimizes the discrepancy statistic is selected. The EPS of a model is the proportion of times that a model is selected in the 50 simulated samples.

The RMSE measures the difference between the true and the estimated parameters to evaluate parameter recovery (Natesan et al.,
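The RMSE of one parameter is the square root of the mean squared difference between its estimates across replications and its true generating value. A minimal illustrative sketch (Python; the input array is a hypothetical set of EAP estimates of one slope across simulated samples):

```python
import numpy as np

def rmse(estimates, true_value):
    """RMSE of one parameter over replications: the square root of the
    mean squared difference between the estimates and the true value."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - true_value) ** 2)))

# e.g., EAP estimates of a slope whose generating value is 1.0
r = rmse([0.9, 1.2, 1.1, 0.8], 1.0)
```

The values reported in the parameter-recovery table are averages of this quantity over all parameters of a given type.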

Table shows the results for the conditions with a one-dimensional generating model and informative priors. The SGDDM retained the one-dimension model (p_{post} > 0.05); models with two and three dimensions are also retained by the SGDDM, as they are generalizations of the one-dimension model. The DIC consistently supported the one-dimension model; however, the WAIC and LOO showed a tendency to over-factor and supported the model with three dimensions.

Model evaluation statistics for the simulation study.

N | Model | SGDDM | SGDDM^{pred} | p_{post} | EPR | DIC | EPS | WAIC | EPS | LOO | EPS
---|---|---|---|---|---|---|---|---|---|---|---
250 | 1D | 0.072 | 0.070 | 0.42 | 0.00 | 2,542.0 | 1.00 | 2,295.7 | 0.00 | 2,312.2 | 0.22
| 2D | 0.069 | 0.070 | 0.55 | 0.00 | 3,079.5 | 0.00 | 2,281.7 | 0.24 | 2,305.9 | 0.58
| 3D | 0.068 | 0.070 | 0.59 | 0.00 | 3,586.0 | 0.00 | 2,277.9 | 0.76 | 2,305.7 | 0.20
500 | 1D | 0.058 | 0.058 | 0.47 | 0.00 | 5,023.3 | 1.00 | 4,594.5 | 0.00 | 4,621.4 | 0.16
| 2D | 0.057 | 0.058 | 0.57 | 0.00 | 6,101.4 | 0.00 | 4,571.7 | 0.20 | 4,608.8 | 0.64
| 3D | 0.056 | 0.057 | 0.59 | 0.00 | 7,227.7 | 0.00 | 4,565.3 | 0.80 | 4,611.8 | 0.20
1,000 | 1D | 0.049 | 0.048 | 0.44 | 0.00 | 10,047.3 | 1.00 | 9,099.0 | 0.00 | 9,142.3 | 0.18
| 2D | 0.047 | 0.048 | 0.58 | 0.00 | 12,103.0 | 0.00 | 9,066.0 | 0.04 | 9,122.5 | 0.46
| 3D | 0.047 | 0.048 | 0.61 | 0.00 | 14,537.9 | 0.00 | 9,052.0 | 0.96 | 9,124.5 | 0.36

The p_{post} column is the mean of p_{post} over the simulated samples. EPR is the empirical proportion of rejections, where a model is rejected when p_{post} ≤ 0.05. EPS is the empirical proportion of selections

The results for the conditions with a two-dimensional generating model and informative priors appear in Table

Model evaluation statistics for the simulation study.

N | Model | SGDDM | SGDDM^{pred} | p_{post} | EPR | DIC | EPS | WAIC | EPS | LOO | EPS
---|---|---|---|---|---|---|---|---|---|---|---
250 | 1D | 0.089 | 0.070 | 0.02 | 0.86 | 2,534.9 | 1.00 | 2,285.6 | 0.00 | 2,304.5 | 0.00
| 2D | 0.069 | 0.069 | 0.47 | 0.00 | 2,980.9 | 0.00 | 2,199.9 | 0.12 | 2,199.9 | 0.50
| 3D | 0.068 | 0.068 | 0.54 | 0.00 | 3,755.7 | 0.00 | 2,187.3 | 0.88 | 2,187.3 | 0.50
500 | 1D | 0.079 | 0.058 | 0.00 | 0.98 | 5,114.2 | 1.00 | 4,594.7 | 0.00 | 4,623.5 | 0.00
| 2D | 0.057 | 0.056 | 0.42 | 0.00 | 6,027.7 | 0.00 | 4,464.8 | 0.04 | 4,527.8 | 0.64
| 3D | 0.056 | 0.056 | 0.52 | 0.00 | 7,593.6 | 0.00 | 4,451.4 | 0.96 | 4,529.6 | 0.36
1,000 | 1D | 0.078 | 0.049 | 0.00 | 1.00 | 10,204.2 | 1.00 | 9,089.4 | 0.00 | 9,136.8 | 0.00
| 2D | 0.048 | 0.047 | 0.40 | 0.00 | 12,013.9 | 0.00 | 8,780.5 | 0.02 | 8,907.7 | 0.54
| 3D | 0.047 | 0.047 | 0.49 | 0.00 | 15,398.4 | 0.00 | 8,757.6 | 0.98 | 8,906.8 | 0.46

The p_{post} column is the mean of p_{post} over the simulated samples. EPR is the empirical proportion of rejections, where a model is rejected when p_{post} ≤ 0.05. EPS is the empirical proportion of selections

Table

Model evaluation statistics for the simulation study.

N | Model | SGDDM | SGDDM^{pred} | p_{post} | EPR | DIC | EPS | WAIC | EPS | LOO | EPS
---|---|---|---|---|---|---|---|---|---|---|---
250 | 1D | 0.093 | 0.072 | 0.01 | 0.92 | 2,564.9 | 1.00 | 2,272.3 | 0.00 | 2,294.0 | 0.00
| 2D | 0.078 | 0.070 | 0.19 | 0.12 | 3,238.9 | 0.00 | 2,222.2 | 0.00 | 2,262.0 | 0.10
| 3D | 0.070 | 0.069 | 0.46 | 0.00 | 4,315.8 | 0.00 | 2,188.1 | 1.00 | 2,245.0 | 0.90
500 | 1D | 0.099 | 0.059 | 0.00 | 1.00 | 5,308.4 | 1.00 | 4,726.4 | 0.00 | 4,760.0 | 0.00
| 2D | 0.066 | 0.057 | 0.05 | 0.66 | 6,267.0 | 0.00 | 4,585.6 | 0.00 | 4,653.5 | 0.08
| 3D | 0.057 | 0.058 | 0.06 | 0.00 | 8,621.4 | 0.00 | 4,519.1 | 1.00 | 4,626.8 | 0.92
1,000 | 1D | 0.082 | 0.050 | 0.00 | 1.00 | 10,598.8 | 1.00 | 9,337.7 | 0.00 | 9,403.6 | 0.00
| 2D | 0.056 | 0.046 | 0.02 | 0.86 | 13,185.2 | 0.00 | 9,070.7 | 0.00 | 9,197.4 | 0.00
| 3D | 0.064 | 0.046 | 0.45 | 0.00 | 18,358.9 | 0.00 | 8,925.7 | 1.00 | 9,133.6 | 1.00

The p_{post} column is the mean of p_{post} over the simulated samples. EPR is the empirical proportion of rejections, where a model is rejected when p_{post} ≤ 0.05. EPS is the empirical proportion of selections

Figure

Scatterplot of the realized and predicted values of the SGDDM. The line indicates equality of realized and predicted values and is included as a reference. Gray, black, and white symbols refer to fitted models with one, two, and three dimensions, respectively. Circles, diamonds, and triangles stand for 250, 500, and 1,000 simulees, respectively.

The results were almost the same when using uniform priors. For example, Table

Model evaluation statistics for the simulation study.

N | Model | SGDDM | SGDDM^{pred} | p_{post} | EPR | DIC | EPS | WAIC | EPS | LOO | EPS
---|---|---|---|---|---|---|---|---|---|---|---
250 | 1D | 0.091 | 0.071 | 0.01 | 1.00 | 2,478.4 | 1.00 | 2,279.5 | 0.00 | 2,301.4 | 0.00
| 2D | 0.069 | 0.068 | 0.43 | 0.00 | 3,624.4 | 0.00 | 2,150.1 | 0.00 | 2,230.8 | 0.02
| 3D | 0.069 | 0.067 | 0.40 | 0.00 | 6,187.2 | 0.00 | 2,053.4 | 1.00 | 2,204.0 | 0.98
500 | 1D | 0.077 | 0.058 | 0.00 | 0.98 | 5,004.4 | 1.00 | 4,588.4 | 0.00 | 4,619.1 | 0.00
| 2D | 0.057 | 0.055 | 0.40 | 0.00 | 9,812.9 | 0.00 | 4,372.8 | 0.00 | 4,499.0 | 0.02
| 3D | 0.056 | 0.054 | 0.37 | 0.00 | 19,616.7 | 0.00 | 4,204.0 | 1.00 | 4,446.6 | 0.98
1,000 | 1D | 0.081 | 0.050 | 0.00 | 1.00 | 9,888.4 | 1.00 | 9,079.6 | 0.00 | 9,128.1 | 0.00
| 2D | 0.056 | 0.045 | 0.32 | 0.00 | 26,458.3 | 0.00 | 8,542.0 | 0.02 | 8,820.6 | 0.06
| 3D | 0.047 | 0.045 | 0.31 | 0.00 | 65,564.5 | 0.00 | 8,142.4 | 0.98 | 8,674.7 | 0.94

The p_{post} column is the mean of p_{post} over the simulated samples. EPR is the empirical proportion of rejections, where a model is rejected when p_{post} ≤ 0.05. EPS is the empirical proportion of selections

The results about recovery of parameters are summarized in Table

Average of the RMSE for the estimated parameters.

250 | 1D | 0.489 | 0.377 | 0.499 | 0.514 | 0.406 | 0.748 |

2D | 0.590 | 0.559 | 0.416 | 0.715 | 0.738 | 0.642 | |

3D | 0.614 | 0.559 | 0.428 | 1.001 | 1.007 | 0.573 | |

500 | 1D | 0.419 | 0.311 | 0.489 | 0.430 | 0.325 | 0.761 |

2D | 0.475 | 0.377 | 0.462 | 0.574 | 0.546 | 0.675 | |

3D | 0.511 | 0.455 | 0.456 | 0.840 | 0.924 | 0.595 | |

1,000 | 1D | 0.363 | 0.294 | 0.409 | 0.387 | 0.511 | 0.653 |

2D | 0.445 | 0.407 | 0.354 | 0.607 | 0.746 | 0.535 | |

3D | 0.458 | 0.437 | 0.423 | 0.770 | 0.924 | 0.562 |

In conclusion, the SGDDM has proven to be a reliable statistic for evaluating dimensionality under the conditions of this simulation. The statistic had a low tendency to reject the two-dimension model when the generating model had three dimensions and the sample was not large. In practice, this conservative behavior of the SGDDM can be seen as a desirable property, as it protects against the extraction of dimensions that are not well represented in the data. More investigation would be needed to establish the SGDDM as a general measure for evaluating the dimensionality of nominal response models in the Bayesian context. With respect to the discrepancy statistics, their real advantage is that they avoid the resampling of posterior predictive data matrices and can be computed much more quickly and easily than the SGDDM. However, these results, preliminary as they are, indicate that these statistics should not be used to evaluate model dimensionality.

This section describes an exploratory nominal factor analysis in the Bayesian framework using a data sample in the context of learning styles. The purpose is to illustrate the proposed methods in the context of an investigation with real data.

The data set was adapted from a reduced version of Kolb's (

The original version of the LSI consists of 12 self-report items with 4 response categories that subjects must rank order according to their preferences. Each of the categories is designed to load on one of the poles of the bipolar variables: feeling, watching, thinking, and doing. However, the present study is based on a reduced version of the LSI, which makes the task easier for the individuals. The reduced version contains four items and is shown in Table

Reduced version of the Kolb's Learning Style Inventory.

ITEM 1. I learn best when…

- I rely on my feelings to guide me

- I observe the situation

- I set priorities

- I try out different ways of doing it

ITEM 2. I learn…

- feeling

- watching

- thinking

- doing

ITEM 3. When I learn…

- I like to deal with my feelings

- I like to watch and listen

- I like to think about ideas

- I like to be doing things

ITEM 4. I learn best from…

- personal relationships

- observation

- rational theories

- a chance to try out and practice

Subjects were 448 students of the Universidad Católica del Norte (Chile). All the subjects were first-year graduate students: 38% from Psychology, 37% from Engineering, 13% from Architecture, 8% from Journalism, and 4% from Economics. Males and females were equally represented, and ages ranged from 17 to 37 years (mean 19.13, standard deviation 1.75). These data were collected as part of a larger study of learning preferences involving several questionnaires; thus, it was important to reduce the number of items administered to each individual and to simplify the task involved in each item. With four items of four categories each, the number of different response patterns that can be observed is 4^{4} = 256, so there are fewer than twice as many individuals as response patterns. The Bayesian framework is appealing for stabilizing estimates with samples of moderate size like this one.

The classic parameterization of the MNRM in Equation (2) was selected for the analysis of the LSI because the important information that we want to recover is the relation between the categories and the dimensions. Models were estimated using the prior distributions in Equation (19).

The results of the study are organized according to the three steps explained in Section Estimation of the Model.

Step 1 consists of estimating several models with an increasing number of dimensions, parameterized with simple constraints. Models with one to four dimensions were estimated, and the simplest model that fits the data was selected. Table

Model evaluation statistics: posterior predictive checks and discrepancy measures.

Statistic | 1D | 2D | 3D | 4D
---|---|---|---|---
No. of parameters | 11(12)448[1] | 21(12)896[2] | 30(12)1,344[3] | 38(12)1,792[4]
SGDDM | 0.071 | 0.057 | 0.057 | 0.057
SGDDM^{pred} | 0.055 | 0.058 | 0.058 | 0.057
p_{post} | 0.005 | 0.560 | 0.559 | 0.567
DIC | 4,291.0 | 5,612.7 | 8,669.7 | 13,097.4
WAIC | 3,791.8 | 3,504.1 | 3,423.4 | 3,250.2
elpd_{waic} | −1895.9 | −1752.1 | −1711.7 | −1625.1
p_{waic} | 301.8 | 499.0 | 581.9 | 638.6
LOO | 3,867.0 | 3,669.6 | 3,654.1 | 3,591.1
elpd_{loo} | −1933.5 | −1834.8 | −1827.0 | −1797.1
p_{loo} | 339.4 | 581.8 | 697.2 | 810.6

_{j}]. p_{waic} and p_{loo} are estimations of the effective number of parameters. elpd_{loo} is the expected log predictive density
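As an illustration of how the WAIC quantities in the table relate to one another, the sketch below computes WAIC, elpd_waic, and p_waic from a matrix of pointwise log-likelihood draws, following the standard definitions; it uses simulated draws, not the Stan output of this study:

```python
import numpy as np

def waic(log_lik):
    """WAIC from an S x N matrix of log-likelihood draws
    (S posterior draws, N observations), on the deviance scale
    used in the table: WAIC = -2 * elpd_waic."""
    # log pointwise predictive density: log of the posterior mean likelihood
    lppd = np.log(np.exp(log_lik).mean(axis=0)).sum()
    # effective number of parameters: posterior variance of the log-likelihood
    p_waic = log_lik.var(axis=0, ddof=1).sum()
    elpd_waic = lppd - p_waic
    return -2.0 * elpd_waic, elpd_waic, p_waic

# Toy demonstration with simulated draws (not the paper's data):
rng = np.random.default_rng(0)
log_lik = rng.normal(-1.5, 0.2, size=(1000, 100))
w, elpd, p = waic(log_lik)
```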

Visual inspection of the scatterplot of the realized SGDDM against its posterior predictive values (figure below), together with the p_{post} values associated with the SGDDM in the table above, indicates that the one-dimensional model does not fit the data (p_{post} = 0.005), whereas the models with two or more dimensions do (p_{post} ≈ 0.56).

Scatterplot of the realized and posterior predictive values of the SGDDM for the models with one to four dimensions.

Estimated parameters under simple constraints appear in the table below.

Parameter estimates for the two-dimensional model under simple constraints.

| Item | Category | Intercept | Slope dim. 1 | Slope dim. 2 |
| --- | --- | --- | --- | --- |
| 1 | 1 | 3.41 (0.49) | — | — |
| 1 | 2 | 2.38 (0.50) | 0.82 (0.09) | — |
| 1 | 3 | 1.51 (0.49) | 0.35 (0.11) | 0.52 (0.31) |
| 1 | 4 | — | — | — |
| 2 | 1 | 5.24 (1.00) | 1.77 (0.05) | −1.55 (1.07) |
| 2 | 2 | 1.77 (0.96) | 0.89 (0.37) | 0.88 (0.83) |
| 2 | 3 | −0.70 (1.31) | 0.53 (0.35) | 3.89 (1.26) |
| 2 | 4 | — | — | — |
| 3 | 1 | 2.91 (0.54) | 1.57 (0.44) | −1.63 (1.10) |
| 3 | 2 | 2.42 (1.50) | 0.96 (0.27) | 0.80 (0.76) |
| 3 | 3 | 0.61 (0.70) | 0.34 (0.24) | 3.27 (1.10) |
| 3 | 4 | — | — | — |
| 4 | 1 | 1.22 (0.43) | 0.75 (0.24) | −0.11 (0.71) |
| 4 | 2 | 0.24 (0.57) | 1.05 (0.54) | 4.19 (1.32) |
| 4 | 3 | 0.57 (0.41) | 0.55 (0.23) | 2.67 (0.90) |
| 4 | 4 | — | — | — |

Posterior standard deviations in parentheses. Standard deviations of the latent dimensions: σ_{1} = 2.65 (1.80) and σ_{2} = 0.76 (0.51). The slope for categories 1 and 2 of item 1 has been set to 1 to fix the orientation of the dimensions; empty cells (—) correspond to parameters fixed by the simple constraints (category 4 is the reference category).

Simple constraints are not useful for interpreting the LSI questionnaire because category 4 is not a reference category but a substantive one. The information about the relation between category 4 and the dimensions is lost if its parameters are set to zero to resolve the mathematical indeterminacies of the probabilistic model. The simple constraints were therefore transformed into deviation constraints to obtain a more meaningful parameterization in which all categories can have nonzero parameters. The resulting deviation slopes appear in the first two value columns of the table below.

Transformed slopes for the two-dimensional model.

| Item | Category | Deviation dim. 1 | Deviation dim. 2 | Rotated dim. 1 | Rotated dim. 2 |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | 0.46 | −0.38 | 0.50 | 0.32 |
| 1 | 2 | 0.27 | 0.62 | −0.50 | 0.45 |
| 1 | 3 | −0.19 | 0.14 | −0.20 | −0.14 |
| 1 | 4 | −0.54 | −0.38 | 0.20 | −0.63 |
| 2 | 1 | 0.98 | −2.36 | 2.54 | 0.20 |
| 2 | 2 | 0.10 | 0.07 | −0.04 | 0.11 |
| 2 | 3 | −0.27 | 3.09 | −3.02 | 0.69 |
| 2 | 4 | −0.80 | −0.81 | 0.52 | −1.01 |
| 3 | 1 | 0.85 | −2.24 | 2.40 | 0.11 |
| 3 | 2 | 0.24 | 0.19 | −0.11 | 0.29 |
| 3 | 3 | −0.38 | 2.66 | −2.65 | 0.46 |
| 3 | 4 | −0.72 | −0.61 | 0.36 | −0.87 |
| 4 | 1 | 0.16 | −1.80 | 1.76 | −0.40 |
| 4 | 2 | 0.46 | 2.50 | −2.24 | 1.21 |
| 4 | 3 | −0.04 | 0.98 | −0.95 | 0.27 |
| 4 | 4 | −0.59 | −1.69 | 1.43 | −1.08 |
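The transformation from simple to deviation constraints amounts to centering each dimension's slopes over the categories of an item so that they sum to zero. The sketch below applies this to the item 1 slopes from the simple-constraints table (with the fixed values entered as 1 and 0) and reproduces the item 1 deviation slopes, up to the rounding of the published estimates:

```python
import numpy as np

# Item 1 slopes under simple constraints (rows = categories 1-4,
# columns = dimensions 1-2); values fixed by the constraints are
# entered as 1 or 0.
a = np.array([[1.00, 0.00],
              [0.82, 1.00],
              [0.35, 0.52],
              [0.00, 0.00]])

# Deviation constraints: center over categories within each dimension,
# so that no category is forced to be a zero reference.
a_dev = a - a.mean(axis=0)
```

Each column of `a_dev` sums to zero, and the category 4 slopes become informative (negative on both dimensions for item 1, matching the table).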

A visual inspection of the parameters under simple constraints reveals that the interpretation could benefit from a rotation. The rotation was not performed by analytic procedures such as varimax or oblimin: in a simple example like this one, with few slopes to rotate, a visual inspection of the slopes and the judgment of the data analyst can provide a more meaningful interpretation than algorithms that are blind to item content. The rotation was performed by a graphical method (Lawley and Maxwell).

The two figures below display the slopes under deviation constraints and after rotation.

Slopes under deviation constraints. The points are labeled with the number of item and category. For example, the point 4,2 refers to item 4 category 2.

Slopes rotated by an angle of 72°.
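The 72° rotation can be reproduced numerically. The sketch below applies a standard counterclockwise rotation matrix to the item 1 deviation slopes from the table above and recovers the rotated values, up to the rounding of the published slopes:

```python
import numpy as np

def rotate(slopes, degrees):
    """Rotate the rows of a (categories x 2) slope matrix
    counterclockwise by the given angle."""
    a = np.deg2rad(degrees)
    r = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    return slopes @ r.T

# Item 1 deviation slopes (categories 1-4 on dimensions 1-2):
dev = np.array([[ 0.46, -0.38],
                [ 0.27,  0.62],
                [-0.19,  0.14],
                [-0.54, -0.38]])

rot = rotate(dev, 72.0)
```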

The items contain four response categories (feeling, watching, thinking, and doing) that represent the four extremes in the learning model by Kolb.

In our questionnaire, categories 1 and 3 (feeling and thinking) were designed to represent the two extremes of the first bipolar dimension, whereas categories 2 and 4 (watching and doing) represent the two extremes of the second dimension.

The rotated slopes found in the data analysis are consistent with the theoretical foundation of the questionnaire. The first category of the four items has a positive slope on Dimension 1, and the slope is negative for the third category. Therefore, the probability of category 1 is high when the individual's location on Dimension 1 is high, and individuals who are low on Dimension 1 have a high probability of selecting category 3. Based on these results, Dimension 1 can be interpreted as a bipolar dimension with two extremes, feeling and thinking, which corresponds to the first bipolar dimension in Kolb's theoretical model.

Similarly, the second category of the four items has a positive rotated slope on Dimension 2, whereas category 4 has a negative rotated slope. Because the slope indicates how the probability of the category relates to the dimension, the probability of category 2 increases with the dimension and the probability of category 4 increases as the dimension decreases. Thus, the second dimension found in our data corresponds to Kolb's second bipolar dimension. However, the slopes on the second dimension are smaller in magnitude than those on the first, so the questionnaire provides less precise measurement on the second dimension.

In conclusion, the two theoretical dimensions postulated by Kolb emerged in our data, which supports this theoretical model. However, Dimension 1 is more prominent according to the magnitude of the slopes, and an enlarged version of the questionnaire should be considered to obtain precise estimates on both dimensions.

This article described Bayesian methods for evaluating the latent dimensionality of the MNRM, a simulation study, and an example with real data. The initial motivation for moving inference for the MNRM to the Bayesian context was to alleviate the estimation problems caused by its complex parameterization. However, the drawback of leaving the frequentist framework is the loss of the chi-square and other measures of model fit. For this reason, it was necessary to define and evaluate Bayesian measures of model adequacy.

The main focus of the article is dimensionality assessment for the MNRM in the Bayesian context, in particular the use of the SGDDM for the evaluation of dimensionality. An extension of the SGDDM to the nominal model was introduced and evaluated in a simulation study. The results reveal that the SGDDM is a useful statistic for evaluating the dimensionality of the MNRM. The statistic was somewhat conservative in small samples, showing some tendency toward under-factoring. However, this is not necessarily a drawback of the SGDDM, because estimates tend to be unstable in small samples. Bayesian methods implicitly take into account the imprecision of the estimates and tend to avoid extracting dimensions that have weak empirical support.

The SGDDM was compared in the simulations to three discrepancy measures (DIC, WAIC, and LOO). The discrepancy measures have computational advantages, as they do not require resampling; however, in the conditions of the present investigation they were of little utility. The DIC has a strong tendency toward under-factoring. The WAIC and LOO were more useful; the WAIC was more liberal than the LOO and exhibited a preference for models with more dimensions, falling on the side of over-factoring in some cases. Thus, the LOO seems the most promising discrepancy measure, but its performance is still far from that of the SGDDM. All in all, resampling data and computing the SGDDM seems to be the most reliable method for dimensionality assessment in the Bayesian context.

The present investigation can be expanded in several ways. Regarding the discrepancy statistics, the most important open problem is identifying the conditions under which these statistics provide valuable information in conjunction with the MNRM. That would be a valuable contribution because the discrepancy statistics avoid resampling and are computationally much cheaper than the SGDDM (see Vehtari et al.).

Although the SGDDM is a promising approach to evaluating model dimensionality, it has been tested in a limited number of conditions in the simulation study. The generalization of the present results to other conditions and instruments needs further investigation (see Levy et al.).

The simulation study confirmed that prior distributions may help to avoid the problem of high standard errors associated with the item parameters. Our analysis revealed that a normal prior is appropriate for stabilizing the estimates. However, the prior distribution has to be chosen carefully: an overly concentrated prior may bias the estimated parameters, whereas a vague prior may lead to convergence problems and high standard errors (Sheng).

A second line of future research is the determination of the minimum number of simulated samples for a simulation study like this one. MCMC simulation is a computationally intensive method, and estimation is typically much slower than maximum likelihood. For these reasons, simulation studies tend to use a limited number of samples. However, a systematic investigation of the minimum number of samples, following the indications of Koehler et al., would be a valuable contribution.

JR defined the Bayesian estimation and model evaluation procedures for the multidimensional nominal response model. JR was also responsible for writing the computer code in the R and Stan languages and for running the simulation study included in the last section of the article. CX focused on the real-data analysis section of the article, which describes an application of exploratory nominal factor analysis in the context of learning styles; this includes the data collection and the analysis of the results. Both authors, JR and CX, collaborated in writing the paper.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

This research was partially supported by grants PSI2012-31958 and PSI2015-66366-P from the Ministerio de Economía y Competitividad (Spain). Computations have been run with the support of the Center for Scientific Computing at the Autonoma University of Madrid (CCC-UAM). We thank Carlos Calderon for collecting the data of the empirical application.