Korpus Arab Pesantren: Digitizing the work of Arabic non-Arabic speakers at Modern Islamic Institution Darussalam Gontor





Korpus Arab, Digitizing the work of Arabic, Gontor, data korpus, Sketch Engine


Modern Islamic Institution Darussalam Gontor (PMDG) is one of the educational institutions in Indonesia that has consistently taught Arabic since its inception, and the student community within the institution has produced many Arabic works. The purpose of this study is to describe the Arabic works of students who are non-native speakers of Arabic and to explain the process of collecting and digitizing the Arabic corpus data sources that these students produced at Modern Islamic Institution Darussalam Gontor, Ponorogo, East Java. The research is a field study that uses a descriptive qualitative analysis approach. The results reveal, first, that the students' Arabic works can be categorized by data type into two groups: written Arabic works (written products) and spoken Arabic works (spoken products). Digital written data can be read by software, whereas conventional data cannot. Second, the process of collecting and digitizing the Arabic corpus data was carried out in three stages. The first stage converts the conventional data, in the form of handwriting and voice recordings, into digital data in *.doc format. The second stage converts the digital data from *.doc format into plain text (*.txt) format. The third stage enters the data in *.txt format into a web-based data processing engine, namely Sketch Engine. The digital corpus data is then ready to be processed for studies with specific linguistic objectives.
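The second stage described above (word-processor file to plain text) could be sketched in Python roughly as follows. This is only an illustration, not the procedure used in the study: it assumes the stage-one files are saved as .docx (Office Open XML) rather than legacy binary .doc, and the function names `docx_to_text` and `save_as_txt` are hypothetical. It uses only the standard library and does not cover the upload to Sketch Engine, which is done through that service's own interface.

```python
import zipfile
import unicodedata
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_file) -> str:
    """Return the paragraphs of a .docx file as newline-separated plain text.

    Assumption: the input is a .docx (zip + XML) file, not a binary .doc.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join every text run (w:t) that belongs to this paragraph.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text.strip())
    # NFC normalization keeps Arabic letters and diacritics in one consistent form.
    return unicodedata.normalize("NFC", "\n".join(paragraphs))


def save_as_txt(docx_file, txt_path: str) -> None:
    """Write the extracted text as a UTF-8 .txt file (hypothetical helper)."""
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(docx_to_text(docx_file) + "\n")
```

Saving the output as UTF-8 is a deliberate choice here, since Arabic script needs a Unicode encoding to survive the conversion intact. Empty paragraphs are dropped so that the resulting *.txt file contains only text lines.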






How to Cite

Suryadarma, Y., & Zakaria, G. A. N. (2022). Korpus Arab Pesantren: Digitizing the work of Arabic non-Arabic speakers at Modern Islamic Institution Darussalam Gontor. At-Ta’dib, 17(1), 52–66. https://doi.org/10.21111/at-tadib.v17i1.7067