Psychometric Properties of Chemistry Multiple-Choice Paper for Science Students: An Extensive Analysis By Rasch Model

Authors

Rose Enne Emellia Mohamed Razali

School of Inspectorate Kuala Lumpur, Tingkat 24, Blok Utama Menara Takaful Malaysia, No. 4 Jalan Sultan Sulaiman, 50000 Kuala Lumpur (Malaysia)

Ahmad Adnan Mohd Shukri

School of Educational Studies, Universiti Sains Malaysia, 11800 USM Pulau Pinang (Malaysia)

Harris Shah Abd. Hamid

Faculty of Medicine (M), Universiti Kuala Lumpur Royal College of Medicine Perak, 30450 Ipoh, Perak (Malaysia)

Article Information

DOI: 10.47772/IJRISS.2026.10100588

Subject Category: Education

Volume/Issue: 10/1 | Page No: 7586-7596

Publication Timeline

Submitted: 2026-01-30

Accepted: 2026-02-05

Published: 2026-02-19

Abstract

An achievement test is an essential element of the teaching and learning process, as its primary purpose is to measure student performance. However, producing high-quality test items, particularly multiple-choice questions, requires considerable time and effort. Teachers serve as content experts and test developers; however, some may lack knowledge of test development. As a result, some constructed items may be flawed, biased, or unreliable measures of student performance. The present study aims to determine the psychometric properties of newly developed multiple-choice questions for a Chemistry test paper using the Rasch model. The study was conducted among 435 respondents from four randomly selected secondary schools in the Klang Valley, Malaysia. Data were analyzed using the Winsteps software. With respect to unidimensionality, Principal Component Analysis indicated that the dimensionality of the instrument was moderate and acceptable, with 27.8% of the raw variance explained by the measures. The item reliability estimate was 0.99, while the person reliability was 0.87 for the multiple-choice paper. These findings provide evidence for the validity of the test. In conclusion, the Rasch model analysis shows that the Chemistry test paper is a valid and reliable unidimensional instrument for measuring student ability and item difficulty.
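For readers unfamiliar with the model named in the abstract, the dichotomous Rasch model expresses the probability that person n answers item i correctly as a function of person ability θ_n and item difficulty δ_i, and the separation index G reported by Winsteps is a standard transformation of a reliability estimate R. The notation below is conventional Rasch notation, not formulas reproduced from the paper:

P(X_{ni} = 1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}, \qquad G = \sqrt{\frac{R}{1 - R}}

Applying the second relation to the reliabilities quoted in the abstract gives an approximate person separation of \sqrt{0.87 / 0.13} \approx 2.6 and an item separation of \sqrt{0.99 / 0.01} \approx 9.9; these derived values are illustrative only and are not reported in the abstract itself.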

Keywords

Rasch model, chemistry education, unidimensionality, reliability estimates, separation index

