Measurement of Sentence Meaning Similarity in Indonesian Using the Path Method
DOI: https://doi.org/10.21111/fij.v6i2.4844
Abstract
Sentence similarity measurement aims to obtain a similarity value between one sentence and another. The resulting value can then be used to develop systems based on sentence matching, such as search engines. Several previous studies discussed the effectiveness of semantic algorithms for measuring sentence similarity in English, whereas this study measures the similarity of meaning between sentences in Indonesian. The dataset used to look up and measure meaning in this study is sinonimkata.com, whose entries take the form of nodes, or branches. When sentence similarity is calculated with WordNet, the approaches used are Wu-Palmer, Lin, Path, Resnik, and Hirst-St-Onge (HSO). This study uses the path approach because it is best suited to counting the nodes, or relations, that connect one node to another in sinonimkata.com. The measurement was carried out in five experiments: sentences with a verb-noun arrangement, an active sentence and a passive sentence with the same meaning, two active sentences, two passive sentences, and an active sentence and a passive sentence with different meanings. To calculate word similarity, each sentence is parsed into the verb and noun word classes, and similarity is then calculated contextually using the path approach, with similar words looked up on sinonimkata.com. Across the five experiments, sentence similarity in Indonesian reached a high value of 0.875 in the experiment on sentences with a verb-noun arrangement.
Keywords: sentence similarity, sinonimkata.com, path
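The path approach described in the abstract can be illustrated with a minimal sketch. It assumes the common WordNet-style definition of path similarity, 1 / (1 + d), where d is the number of edges on the shortest path between two words; the abstract does not state the exact formula the authors used. The graph `SYNONYMS`, its edges, and all function names are hypothetical and were not taken from sinonimkata.com. The sentence-level aggregation (average of best word matches in both directions) is likewise an assumption, not necessarily the authors' method.

```python
from collections import deque

# Hypothetical toy synonym graph; the edges are illustrative only and
# were not taken from sinonimkata.com.
SYNONYMS = {
    "makan": ["santap"],
    "santap": ["makan", "lahap"],
    "lahap": ["santap"],
    "nasi": ["beras"],
    "beras": ["nasi"],
    "buku": [],
}


def shortest_path_length(graph, start, goal):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Assumed path similarity 1 / (1 + d); 0.0 when no path connects the words."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def sentence_similarity(graph, words_a, words_b):
    """Average best-match word similarity, taken in both directions (assumed aggregation)."""
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        # For each source word, keep its most similar word in the other sentence.
        return sum(max(path_similarity(graph, w, v) for v in dst) for w in src) / len(src)

    return (directed(words_a, words_b) + directed(words_b, words_a)) / 2.0
```

With this toy graph, `path_similarity(SYNONYMS, "makan", "santap")` gives 0.5 because the two words are one edge apart, and `sentence_similarity(SYNONYMS, ["makan", "nasi"], ["santap", "beras"])` also gives 0.5. The word lists stand in for the verbs and nouns extracted from each sentence; the parsing step itself is not shown.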
License
Copyright (c) 2021 Fountain of Informatics Journal
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The rights and licenses that apply to the Fountain of Informatics Journal (FIJ) are set out below. By submitting an article or manuscript, the author(s) agree to this policy. No separate document sign-off is required.
1. License
Non-commercial use of the article is governed by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
2. Author(s)' Warranties
The author warrants that the article is original, was written by the stated author(s), has not been published before, contains no unlawful statements, does not infringe the rights of others, is subject to copyright that is vested exclusively in the author, and is free of any third-party rights, and that any necessary written permissions to quote from other sources have been obtained by the author(s).
3. User/Public Rights
FIJ aims to disseminate published articles as freely as possible. Under the Creative Commons license, FIJ permits users to copy, distribute, display, and perform the work for non-commercial purposes only. Users must also attribute the authors and FIJ when distributing works in the journal and other publication media. Unless otherwise stated, the authors are public entities as soon as their articles are published.
4. Rights of Authors
Authors retain all their rights to the published works, including (but not limited to) the following:
- Copyright and other proprietary rights relating to the article, such as patent rights,
- The right to use the substance of the article in the author's own future works, including lectures and books,
- The right to reproduce the article for the author's own purposes,
- The right to self-archive the article (please read our deposit policy),
- The right to enter into separate, additional contractual arrangements for the non-exclusive distribution of the article's published version (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal (Fountain of Informatics Journal).
5. Co-Authorship
If the article was jointly prepared by more than one author, the author submitting the manuscript warrants that he/she has been authorized by all co-authors to agree to this copyright and license notice (agreement) on their behalf, and agrees to inform his/her co-authors of the terms of this policy. FIJ will not be held liable for anything that may arise from disputes among the author(s). FIJ will communicate only with the corresponding author.
6. Royalties
Because FIJ is an open-access journal that disseminates articles for free under the Creative Commons license terms mentioned above, the author(s) acknowledge that FIJ entitles the author(s) to no royalties or other fees.
7. Miscellaneous
FIJ will publish the article (or have it published) in the journal if the article's editorial process is successfully completed. FIJ's editors may modify the article to a style of punctuation, spelling, capitalization, referencing, and usage that they deem appropriate. The author acknowledges that the article may be published so that it is publicly accessible, and that such access will be free of charge for readers, as mentioned in point 3.