Implementasi Algoritma Ambiguity Measure Feature Selection pada Kategorisasi Dokumen Teks Bahasa Indonesia
Implementation Ambiguity Measure Feature Selection Algorithm on Categorization of Indonesian Text Document

Taajul Arifin

Basic Information

Views

270 times

Catalog No.

113088028

Classification

005.1

Catalog type

Scientific Work - Undergraduate Thesis (S1) - Reference

Abstract

ABSTRACT: The rapid growth of Internet technology has greatly increased the amount of information available as text documents, so a method is needed that helps readers find information through categorization. However, high data dimensionality can degrade categorization performance. Feature selection addresses this by choosing the features that most influence categorization. Ambiguity Measure (AM) is one of several feature selection algorithms. This final project implements the AM feature selection algorithm and analyzes its results, using a threshold value to select the features that influence the categorization process. Precision and recall are then observed using the Naïve Bayes algorithm provided in the WEKA tool. The experiments show that the higher the threshold, the fewer features the system selects, while categorization performance increases. Categorization performance is highest at a threshold of 0.95. Accuracy is then compared between the dataset before feature selection and the dataset after feature selection; the accuracy obtained after feature selection is higher than before.

Keywords: feature selection, text categorization, Ambiguity Measure
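The threshold-based selection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the AM formula, so this assumes the commonly cited definition, in which a term's score is its highest per-category frequency divided by its total frequency. The function names, the toy corpus, and the lower threshold of 0.5 are hypothetical and are not taken from the thesis, which ran its classification experiments in WEKA.

```python
from collections import Counter, defaultdict


def ambiguity_measure(docs):
    """Return {term: AM score} for an iterable of (tokens, category) pairs.

    Assumed definition: AM(t) = max over categories c of tf(t, c) / tf(t).
    """
    total = Counter()                    # tf(t): occurrences over all documents
    per_category = defaultdict(Counter)  # tf(t, c): occurrences within category c
    for tokens, category in docs:
        total.update(tokens)
        per_category[category].update(tokens)
    return {
        term: max(counts[term] for counts in per_category.values()) / total[term]
        for term in total
    }


def select_features(scores, threshold):
    """Keep the terms whose AM score is at least the threshold."""
    return {term for term, score in scores.items() if score >= threshold}


# Toy corpus invented for illustration; it is not the thesis dataset.
docs = [
    (["gol", "bola", "tim"], "olahraga"),
    (["bola", "pemain", "tim"], "olahraga"),
    (["saham", "pasar", "tim"], "ekonomi"),
    (["saham", "bank", "pasar"], "ekonomi"),
]

scores = ambiguity_measure(docs)
selected_low = select_features(scores, 0.5)
selected_high = select_features(scores, 0.95)
```

In this toy corpus the term "tim" occurs in both categories, so it scores 2/3 and is kept at a threshold of 0.5 but dropped at 0.95, while terms confined to one category score 1.0 and survive. This mirrors the reported trend that a higher threshold leaves fewer features.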

Subject

Main subject

Software Engineering

Additional subject

Catalog

Title

Implementasi Algoritma Ambiguity Measure Feature Selection pada Kategorisasi Dokumen Teks Bahasa Indonesia
Implementation Ambiguity Measure Feature Selection Algorithm on Categorization of Indonesian Text Document

ISBN

Collation

Language

Indonesian