Evaluation of the Feasibility and Reliability of Arabic Multiple Choice Tests in Higher Education

Authors

  • Zakiyah Zakiyah, UIN Sunan Ampel Surabaya
  • Sulfiatin Sulfiatin, UIN Sunan Ampel Surabaya
  • M. Baihaqi, UIN Sunan Ampel Surabaya
  • Nais Musyafaul Faiza, UIN Sunan Ampel Surabaya
  • Muhammad Ilham Revo Alghani, UIN Sunan Ampel Surabaya
  • Fatihuddin Fatihuddin, Al Qasimy University, Kingdom of Saudi Arabia

DOI:

https://doi.org/10.21111/lisanudhad.v11i2.12163

Keywords:

Multiple Choice Test, Feasibility and Reliability Evaluation, Higher Education

Abstract

Multiple-choice tests are widely used in education, including higher education, as an efficient way to evaluate students' understanding. In Arabic language learning, they are used to assess students' comprehension and mastery of the language. Their feasibility and reliability must be evaluated, however, to ensure that the results accurately reflect students' abilities. This study evaluates the feasibility and reliability of Arabic multiple-choice tests in higher education by assessing content validity, construct validity, reliability, and correlation with students' academic performance. The results show that 60% of the test questions were aligned with the existing curriculum, while only 14 of the 25 questions (56%) met the criteria for construct validity. Despite these shortcomings, the test demonstrated high reliability, with a Cronbach's alpha of 0.88, indicating internally consistent results. Test scores also showed a significant positive correlation with students' academic performance (r = 0.44), suggesting that the test reflects students' overall academic achievement. The study concludes that, despite its limitations in construct validity, the multiple-choice test remains a reliable evaluation tool. The findings offer insight into the test's effectiveness in measuring students' understanding and mastery of Arabic at the tertiary level.
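The two statistics reported in the abstract, Cronbach's alpha and the Pearson correlation coefficient r, can be illustrated with a minimal sketch. This is not the authors' analysis procedure, which the abstract does not describe. The function names, the score layout (one row per student, one column per item), and the toy data are assumptions made only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (hypothetical layout).

    scores: list of rows, one row per student, one column per test item.
    Uses the standard formula
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)).
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data (invented for illustration, not taken from the study):
# 4 students x 3 dichotomously scored items (1 = correct, 0 = incorrect).
toy_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
toy_alpha = cronbach_alpha(toy_scores)

# Test totals paired with a hypothetical academic performance measure.
toy_totals = [sum(row) for row in toy_scores]
toy_performance = [3.6, 2.9, 3.1, 2.7]
toy_r = pearson_r(toy_totals, toy_performance)
```

In this toy matrix every item ranks the students identically, so alpha reaches its maximum of 1. Real test data, such as the 25-item test described above, would normally yield a lower value.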

Published

2024-12-21

How to Cite

Zakiyah, Z., Sulfiatin, S., Baihaqi, M., Faiza, N. M., Alghani, M. I. R., & Fatihuddin, F. (2024). Evaluation of the Feasibility and Reliability of Arabic Multiple Choice Tests in Higher Education. Lisanudhad: Jurnal Bahasa, Pembelajaran, Dan Sastra Arab, 11(2), 164–192. https://doi.org/10.21111/lisanudhad.v11i2.12163