Ali Hasmy


This study examined the effect of the number of testees (sample size), the number of items (test length), the number of options, and the index of difficulty on various item discrimination statistics and on test reliability. The data were simulated and analyzed with the Test Analysis Program (TAP) version 6.65 under a full factorial design. In general, the results show that the number of testees, the number of items, and the index of difficulty significantly affect the item discrimination statistics and test reliability, whereas the number of options does not. Only the Mean of Item Discrimination and the Spearman-Brown 1-2 Split-Half statistic are robust to these three factors, while the Odd-Even Split-Half statistic is the most sensitive.
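To make the design concrete, the following is a minimal Python sketch of a full factorial simulation over the four factors, with an odd-even split-half reliability and its Spearman-Brown correction computed in each cell. It is not the TAP procedure and not the study's actual code: the response model (a logistic curve with a guessing floor of 1 / number of options), the factor levels, and all function names are assumptions chosen only for illustration.

```python
import itertools
import math
import random


def simulate_responses(n_testees, n_items, n_options, p_correct, rng):
    """Return an n_testees x n_items list of 0/1 item scores.

    Assumed model (not from the source): a logistic response curve with a
    guessing floor of 1 / n_options, centred so that a testee of average
    ability knows the answer with probability p_correct.
    """
    location = -math.log(p_correct / (1.0 - p_correct))
    guess = 1.0 / n_options
    rows = []
    for _ in range(n_testees):
        theta = rng.gauss(0.0, 1.0)  # latent ability of one testee
        p_know = 1.0 / (1.0 + math.exp(-(theta - location)))
        p = guess + (1.0 - guess) * p_know
        rows.append([1 if rng.random() < p else 0 for _ in range(n_items)])
    return rows


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length: 2r / (1 + r)."""
    if r_half <= -1.0:
        return float("nan")  # the correction is undefined at r = -1
    return 2.0 * r_half / (1.0 + r_half)


def odd_even_reliability(rows):
    """Correlate odd-item and even-item half scores, then apply the correction."""
    odd = [sum(r[0::2]) for r in rows]   # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in rows]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return r_half, spearman_brown(r_half)


def run_grid(seed=0):
    """Cross every level of every factor (hypothetical levels, 2*2*2*3 cells)."""
    rng = random.Random(seed)
    levels = {
        "n_testees": (50, 200),
        "n_items": (10, 40),
        "n_options": (3, 5),
        "p_correct": (0.3, 0.5, 0.7),
    }
    results = []
    for cell in itertools.product(*levels.values()):
        rows = simulate_responses(*cell, rng)
        results.append((cell, odd_even_reliability(rows)))
    return results


if __name__ == "__main__":
    for cell, (r_half, r_full) in run_grid():
        print(cell, round(r_half, 3), round(r_full, 3))
```

Each grid cell here holds a single simulated data set. A study of this kind would normally replicate every cell many times and then test the factor effects, for example with an analysis of variance; that step is left out to keep the sketch short.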




DOI: 10.24260/at-turats.v8i2.113