Classification of Closely Related Indonesian Article based on Latent Semantic Analysis

DEWI FATMARANI SURIANTO

Basic Information

116 times
19.05.153
005.13
Scientific Work - Thesis (Master's) - Reference

Latent Semantic Analysis (LSA) is one of the most widely used methods for text classification. LSA extracts and represents the contextual-usage meaning of words in a document. Typically, TF-IDF is used to build the term-document matrix, that is, to generate the features, before Singular Value Decomposition (SVD) is applied in LSA. In an initial experiment, LSA with TF-IDF features did not perform well at classifying closely related texts, such as aqidah (creed) versus ibadah (worship) articles and national versus regional articles. A likely cause is that TF-IDF does not capture the semantic information in an article. To address this issue, this study uses semantic word vectors (word2vec) as the word representation for text classification and then passes them to the LSA method. The results are better than those of the previous method. For national versus regional article classification, using word2vec features in LSA raised the F-score from 74% (LSA with TF-IDF) to 75% (LSA with word2vec) and the accuracy from 74% to 78%. For aqidah versus ibadah article classification, the F-score improved from 63% (LSA with TF-IDF) to 73% (LSA with word2vec) and the accuracy improved markedly from 49% to 72%.
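The baseline pipeline named in the abstract (TF-IDF features followed by SVD) can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis implementation: the four English documents are invented toy data, tokenization is plain whitespace splitting, the smoothed IDF formula is one common variant, and the helper names tfidf_matrix, lsa_project and cosine_similarity are hypothetical. The matrix is laid out as documents by terms, the transpose of a term-document matrix. For the word2vec variant, a document-feature matrix built from word vectors would be passed to lsa_project instead; the abstract does not say how that matrix is constructed, so it is not shown.

```python
import numpy as np

def tfidf_matrix(docs):
    """Return (vocab, X), where X is a documents-by-terms TF-IDF matrix."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: j for j, t in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(tokenized):
        for t in doc:
            counts[i, index[t]] += 1
    # Term frequency: raw counts normalized by document length.
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    # Smoothed inverse document frequency (an assumed variant).
    df = (counts > 0).sum(axis=0)
    idf = np.log((1 + len(docs)) / (1 + df)) + 1
    return vocab, tf * idf

def lsa_project(X, k):
    """Truncated SVD: keep the k largest singular directions.

    Returns the k-dimensional document coordinates and the
    corresponding k term-space directions.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k]

def cosine_similarity(Z):
    """Pairwise cosine similarity between the rows of Z."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    unit = Z / np.maximum(norms, 1e-12)
    return unit @ unit.T

# Hypothetical toy corpus: two "national" and two "regional" articles.
docs = [
    "president signs national budget law",
    "parliament debates national budget",
    "governor opens regional market",
    "regional council repairs village road",
]

vocab, X = tfidf_matrix(docs)
Z, Vt_k = lsa_project(X, k=2)   # documents in a 2-dimensional latent space
sim = cosine_similarity(Z)      # similarity of documents in that space
print(np.round(sim, 2))
```

The reduced vectors in Z would then be given to any standard classifier. The abstract does not state which classifier the study used.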

Subject

Natural language processing
 

Catalog

Classification of Closely Related Indonesian Article based on Latent Semantic Analysis
 
 
Indonesia

Circulation

Rp. 0
Rp. 0
No

Author

DEWI FATMARANI SURIANTO
Individual
ARIEF FATCHUL HUDA
 

Publisher

Telkom University, Master's Program in Informatics (S2 Informatika)
Bandung
2019

Collection

Competencies

  • CSH533 - ALGORITHM ANALYSIS
  • CSH5D3 - BIG DATA ANALYSIS
  • CSH6G3 - TEXT ANALYSIS AND MINING
  • CSH6F3 - INTELLIGENT BIG DATA MINING
  • CSH553 - SOCIAL COMPUTING
  • MTH502 - ICT BUSINESS MANAGEMENT
  • MTH503 - RESEARCH METHODOLOGY
  • CSH513 - MODELING AND OPTIMIZATION
  • CSH563 - PRE-THESIS I
  • CSH613 - PRE-THESIS II
  • CSH522 - PROJECT
  • CSH5E3 - SCIENCE OF ONLINE NETWORK
  • CSH573 - ADVANCED INTELLIGENT SYSTEMS
  • CSH583 - STATISTICS AND DATA ANALYSIS
  • CSH623 - THESIS
  • IEH3N2 - BUSINESS AND INDUSTRIAL FACILITY DESIGN PRACTICUM
  • IEH4G2 - BUSINESS PROCESS DESIGN
  • IEH4CC3 - ADVANCED BUSINESS PROCESS DESIGN
  • IEH4EF3 - RETAIL BUSINESS SYSTEMS
  • IEH4GB5 - BUSINESS INITIATIVE DEVELOPMENT
  • IEI5F3 - BUSINESS PROCESS DESIGN
  • IEI443 - BUSINESS INITIATIVE DEVELOPMENT
  • CII632 - PROJECT
  • CII6F3 - BIG DATA ANALYSIS
  • CII7E3 - TEXT ANALYSIS AND MINING
  • CII733 - THESIS
  • TTI7Z4 - THESIS
  • CII9H5 - DISSERTATION RESEARCH AND SEMINAR 1
  • CII9J5 - DISSERTATION RESEARCH AND SEMINAR 2
  • CII9L5 - DISSERTATION RESEARCH AND SEMINAR 3
  • CII9I1 - SCIENTIFIC PUBLICATION WRITING 1
  • CII9K2 - SCIENTIFIC PUBLICATION WRITING 2
  • CII9M3 - SCIENTIFIC PUBLICATION WRITING 3
