Sample Size Requirements for the Bootstrap Likelihood Ratio Test in Latent Class Analysis: Based on Polytomous Items
Abstract: Latent class analysis (LCA) is a statistical technique for identifying potentially heterogeneous subgroups from a set of selected observed variables; it aims to find an array of latent classes that explain the associations among categorical variables. Identifying the correct number of classes is key to detecting the true heterogeneity of the population, so the choice of the number of latent classes strongly influences the substantive interpretation of the model results. Researchers usually determine the number of latent classes with a single set of optimal criteria. The most common methods are information criteria (IC), likelihood ratio tests (LRT), and classification uncertainty (CU). The bootstrap likelihood ratio test (BLRT) is one of the LRT methods: it uses bootstrap samples to derive the sampling distribution of the LRT statistic empirically, which overcomes the limitation of model non-convergence in the LRT. In existing simulation studies, the BLRT was found to be the most accurate index for model selection compared with other indices. Nevertheless, its performance also depends on a number of factors, such as the number of indicators and the sample size. Researchers currently have little guidance on the conditions under which the BLRT is suitable. To explore the power of the BLRT and its sample size requirements with polytomous items, two preliminary studies and a formal Monte Carlo study were conducted. The preliminary studies investigated the power of the BLRT under different combinations of class membership probabilities, item response probabilities, and sample size. Four latent class models were designed in each of the two preliminary studies to compare power. In the formal study, we proposed effect size formulas based on the power based on proportion of P-values (PPP) measure under the combined conditions above. Blind hill climbing was applied to determine the sample size requirements for the practical application of the BLRT in LCA.
We simulated 100 datasets with sample size N = 5000. For each, we fit models and compared the power of the BLRT using a bootstrap test across the 5 × 4 = 20 combinations of J = 6, 7, 8, 9, or 10 five-point polytomous items and K = 2, 3, 4, or 5 classes. All data were generated and analyzed with the poLCA, MASS, and bootstrap packages in R. The results showed that: 1) The power of the BLRT was related to the class membership probabilities, the item response probabilities, and the sample size. Under high response probabilities, power was higher with equal membership probabilities than with unequal membership probabilities, and the required sample size was smaller. 2) The sample size requirement in LCA was related to the number of items and the number of classes. Generally, a sample size of at least 200 was required to achieve a statistical power of 0.8 under any condition, and the required sample size increased as the number of items or classes increased. With the number of classes fixed, the fewer the items, the larger the sample size needed; with the number of items fixed, the more classes, the larger the sample size required.
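The simulation design described above can be illustrated with a minimal sketch. It is written in standard-library Python for illustration only; the study itself used R. It enumerates the 20 (J, K) design cells, draws responses from a latent class model with five-point items, and computes power as the proportion of p-values below a significance level. The function names, the example probabilities, and the level alpha = 0.05 are assumptions and are not taken from the study, and the power function is only one plausible reading of a proportion-of-P-values measure.

```python
import itertools
import random

N_CATEGORIES = 5  # five-point polytomous items, as in the abstract


def design_grid():
    """Enumerate the 5 x 4 = 20 (J, K) design cells."""
    n_items = [6, 7, 8, 9, 10]
    n_classes = [2, 3, 4, 5]
    return list(itertools.product(n_items, n_classes))


def simulate_lca(n, class_probs, response_probs, rng):
    """Draw n response vectors from a latent class model.

    class_probs[k] is the membership probability of class k.
    response_probs[k][j] lists the category probabilities of item j
    for members of class k.
    """
    categories = range(1, N_CATEGORIES + 1)
    data = []
    for _ in range(n):
        # Draw the latent class first, then each item given that class.
        k = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        row = [rng.choices(categories, weights=p)[0] for p in response_probs[k]]
        data.append(row)
    return data


def empirical_power(p_values, alpha=0.05):
    """Proportion of replications whose p-value falls below alpha (assumed level)."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical two-class example with J = 6 items and equal membership.
low = [0.60, 0.20, 0.10, 0.05, 0.05]   # class 0 favours low categories
high = [0.05, 0.05, 0.10, 0.20, 0.60]  # class 1 favours high categories
rng = random.Random(1)
sample = simulate_lca(200, [0.5, 0.5], [[low] * 6, [high] * 6], rng)
print(len(design_grid()), len(sample), len(sample[0]))
```

In a full study, each simulated dataset would be passed to a BLRT routine, and the resulting p-values across replications would be summarized with a function such as `empirical_power`.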