Contingency-Table Sparseness under Cumulative Logit Models for Ordinal Response Categories and Nominal Explanatory Variables with Two-Factor Interaction

Sujin Sukgumphaphan; Veeranun Pongsapukdee

Authors

Sujin Sukgumphaphan, Department of Statistics, Faculty of Science, Silpakorn University, Nakhon Pathom 73000, Thailand.
Veeranun Pongsapukdee, Department of Statistics, Faculty of Science, Silpakorn University, Nakhon Pathom 73000, Thailand.

Keywords:

contingency table, goodness of fit, interaction effect, multinomial cumulative logit models, sparseness

Abstract

This article investigates contingency-table sparseness and the assessment of goodness of fit for cumulative logit models with ordinal response categories and nominal explanatory variables with a two-factor interaction. Sparseness is measured as the number of simulations, out of 1,000, in which at least one empty cell occurs. The magnitudes of the goodness-of-fit statistics are calculated: the coefficients of determination (R² analogs), the likelihood ratio statistic G_M, AIC (Akaike Information Criterion, [2]), and BIC (Bayesian Information Criterion, Schwarz, 1978). The simulations were conducted for multinomial logit models with K = 3 response categories and two random explanatory variables X₁ and X₂, whose joint distribution is assumed to be multinomial with probabilities π₁, π₂, π₃, and π₄, corresponding to (X₁, X₂) values of (0, 0), (0, 1), (1, 0), and (1, 1), respectively. Three sets of (π₁, π₂, π₃, π₄) are studied to represent different distributional shapes, namely (X₁, X₂) ~ multinomial(0.10, 0.35, 0.45, 0.10), (X₁, X₂) ~ multinomial(0.50, 0.30, 0.10, 0.10), and (X₁, X₂) ~ multinomial(0.25, 0.25, 0.25, 0.25). The model parameters were chosen to induce possibly strong effects, with β₁ = log 2, β₂ = log 3, and β₁₂ ranging from 0.0 to 4.5. Four distributions of the three ordered response categories, corresponding to (X₁, X₂), were then generated through the models under the proportions (p₁, p₂, p₃), namely Y ~ multinomial(p₁, p₂, p₃) with (0.05, 0.20, 0.75), (0.25, 0.50, 0.25), (0.5, 0.20, 0.25), and (0.33, 0.33, 0.33), from which it follows that the true model intercepts are $\alpha_{1} = \log \frac{p_{1}}{p_{2}+p_{3}}$ and $\alpha_{2} = \log \frac{p_{1}+p_{2}}{p_{3}}$, corresponding to the proportions of Y = 1, 2, 3, respectively.
Four sample sizes of 600, 800, 1,000, and 1,500 units were used. Each condition was run for 1,000 repeated simulations using a macro program developed for Minitab Release 11 [17].
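The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' Minitab macro, and several details are assumptions made only for illustration: the model is written as logit P(Y ≤ j | x) = α_j + β₁x₁ + β₂x₂ + β₁₂x₁x₂ (the abstract does not state the sign convention), the contingency table is taken to be the 4 × 3 cross-classification of (X₁, X₂) by Y, and the function names `intercepts`, `cell_probabilities`, `simulate_table`, and `sparseness_rate` are hypothetical.

```python
import numpy as np

# The four (x1, x2) cells, in the order used for (pi1, pi2, pi3, pi4).
X_CELLS = np.array([(0, 0), (0, 1), (1, 0), (1, 1)])


def intercepts(p):
    """alpha_1 = log(p1 / (p2 + p3)), alpha_2 = log((p1 + p2) / p3)."""
    p = np.asarray(p, dtype=float)
    lower = np.cumsum(p)[:-1]             # p1, p1 + p2
    upper = np.cumsum(p[::-1])[::-1][1:]  # p2 + p3, p3
    return np.log(lower / upper)


def cell_probabilities(p, beta1, beta2, beta12):
    """P(Y = k | x1, x2) for each of the four cells (assumed sign convention)."""
    alpha = intercepts(p)
    x1, x2 = X_CELLS[:, 0], X_CELLS[:, 1]
    eta = beta1 * x1 + beta2 * x2 + beta12 * x1 * x2
    # Cumulative probabilities P(Y <= j | x), j = 1, 2, via the inverse logit.
    cum = 1.0 / (1.0 + np.exp(-(alpha[None, :] + eta[:, None])))
    edges = np.hstack([np.zeros((4, 1)), cum, np.ones((4, 1))])
    return np.diff(edges, axis=1)         # shape (4, 3)


def simulate_table(n, pi, p, beta1, beta2, beta12, rng):
    """Draw one 4 x 3 table of counts for a sample of size n."""
    probs = cell_probabilities(p, beta1, beta2, beta12)
    row_counts = rng.multinomial(n, pi)   # how many units fall in each (x1, x2) cell
    return np.vstack([rng.multinomial(m, pr) for m, pr in zip(row_counts, probs)])


def sparseness_rate(n, pi, p, beta12, reps=1000, seed=0):
    """Share of simulated tables that contain at least one empty cell."""
    rng = np.random.default_rng(seed)
    beta1, beta2 = np.log(2), np.log(3)   # main effects quoted in the abstract
    hits = sum(
        bool((simulate_table(n, pi, p, beta1, beta2, beta12, rng) == 0).any())
        for _ in range(reps)
    )
    return hits / reps


if __name__ == "__main__":
    rate = sparseness_rate(
        n=600,
        pi=(0.50, 0.30, 0.10, 0.10),
        p=(0.25, 0.50, 0.25),
        beta12=4.5,
    )
    print(f"share of tables with an empty cell: {rate:.3f}")
```

Calling `sparseness_rate` over a grid of sample sizes, (π₁, π₂, π₃, π₄) sets, and β₁₂ values would mimic the structure of the study, although the numbers it produces depend on the assumptions listed above and need not match the reported results.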

The results indicate that the least sparseness of the contingency tables and the largest values of the goodness-of-fit statistics (the R² analogs and BIC) occur when Y ~ multinomial(0.05, 0.20, 0.75) with (X₁, X₂) ~ multinomial(0.25, 0.25, 0.25, 0.25), as well as when the distributions of both Y and (X₁, X₂) have equal, symmetric proportions. In contrast, the most sparse cells occur when Y ~ multinomial(0.25, 0.50, 0.25) with (X₁, X₂) ~ multinomial(0.50, 0.30, 0.10, 0.10). In addition, when (X₁, X₂) ~ multinomial(0.25, 0.25, 0.25, 0.25), the tendency toward sparseness is always lower than when the distribution of (X₁, X₂) is asymmetric, as the sample size becomes large. Moreover, sparseness tends to increase as the interaction parameter β₁₂ increases, but it decreases as the sample size increases. Hence, when the true model contains correlated structures, the sparseness of the contingency tables increases with the interaction parameter, and the rate of increase declines as the sample size grows. These results indicate and confirm some association patterns in the models and the contingency tables. Therefore, when the distribution of Y is either symmetric with equal proportions or has increasing ordered proportions, and the distribution of (X₁, X₂) is also symmetric, small to moderate sample sizes are feasible; however, when most distributions are asymmetric, we recommend only large sample sizes for a suitable analysis of association in sparse contingency tables.
