This study compares methods for estimating missing data in simple linear regression. The two methods considered are the Singh method and the Expectation-Maximization (EM) algorithm. The comparison was carried out for sample sizes of 40, 100, 500 and 1,000; variances of 1, 10 and 50; missing-data percentages of 5%, 10% and 15%; and correlation coefficients between the dependent and independent variables of -0.3, -0.6, -0.9, 0.3, 0.6 and 0.9. The comparison criterion is the root mean square error (RMSE). The results show that the EM method is a better estimation method than the Singh method for simple linear regression, because it gives the lower RMSE at every level of correlation coefficient, sample size, variance and percentage of missing data.
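As an illustration only, one cell of the simulation design described above might be sketched as follows. Several choices here are assumptions not stated in the abstract: the data are drawn from a bivariate normal distribution with mean zero and a common variance, only the dependent variable has missing values, values are deleted completely at random, and RMSE is taken between the imputed and the true deleted values. The function names (`simulate`, `em_impute`, `rmse`) are hypothetical. The Singh method is not sketched because the abstract does not give its formula.

```python
import numpy as np


def simulate(n, rho, var, rng):
    """Draw n pairs (x, y) from a zero-mean bivariate normal (assumed design)."""
    cov = var * np.array([[1.0, rho], [rho, 1.0]])
    data = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    return data[:, 0], data[:, 1]


def em_impute(x, y, missing, n_iter=200, tol=1e-8):
    """EM for a bivariate normal in which only y has missing entries.

    Returns the conditional expectation of each missing y given its x,
    evaluated at the final parameter estimates.
    """
    obs = ~missing
    # x is fully observed, so its moments never change.
    mu_x, s_xx = x.mean(), x.var()
    # Start the y moments from the complete cases.
    mu_y, s_yy = y[obs].mean(), y[obs].var()
    s_xy = np.mean((x[obs] - x[obs].mean()) * (y[obs] - mu_y))
    for _ in range(n_iter):
        beta = s_xy / s_xx
        resid_var = s_yy - beta * s_xy
        # E-step: expected y and y^2 for the missing entries.
        ey = np.where(missing, mu_y + beta * (x - mu_x), y)
        ey2 = np.where(missing, ey ** 2 + resid_var, y ** 2)
        # M-step: update the moments that involve y.
        new_mu_y = ey.mean()
        new_s_yy = ey2.mean() - new_mu_y ** 2
        new_s_xy = np.mean(x * ey) - mu_x * new_mu_y
        change = (abs(new_mu_y - mu_y) + abs(new_s_yy - s_yy)
                  + abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break
    beta = s_xy / s_xx
    return mu_y + beta * (x[missing] - mu_x)


def rmse(estimate, truth):
    """Root mean square error between two equal-length sequences."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# One design cell: n = 100, variance 10, correlation 0.9, 10% missing.
rng = np.random.default_rng(0)
n, pct_missing = 100, 0.10
x, y_true = simulate(n=n, rho=0.9, var=10.0, rng=rng)
missing = np.zeros(n, dtype=bool)
missing[rng.choice(n, size=int(n * pct_missing), replace=False)] = True
y_obs = np.where(missing, np.nan, y_true)  # hide the deleted values

y_hat = em_impute(x, y_obs, missing)
print("RMSE of EM imputation:", rmse(y_hat, y_true[missing]))
```

A full comparison would repeat this over every combination of sample size, variance, missing percentage and correlation, and average the RMSE over many replications. The number of replications is not given in the abstract.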
Each article is copyrighted © by its author(s) and is published under license from the author(s).
Laaksonen S. Regression-based nearest neighbor hot decking, Computational Statistics. 2000; 15(1): 65-71.
Little Roderick JA, Rubin Donald B. Statistical Analysis with Missing Data. New York: John Wiley, 1987.
Singh S, Horn S. Compromised imputation in survey sampling, Metrika. 2000; 51: 266-276.
Singh GN, Kumari P, Jong MK. Estimation of population mean using imputation techniques in sample surveys, Journal of the Korean Statistical Society. 2010; 39: 67-74.
Wararit P. The Monte Carlo simulation for estimating the coefficients of skewness when observations are inverse Gaussian distributed, Kasalongkham Research Journal. 2009; 3(1): 14-23.