Decision Boundary Setting and Classifier Combination for Text Classification

Moch Arif Bijaksana

Basic Information

16.22.002
003.56
Dissertation - Reference
R1

Text classification is a popular and important text mining task. Many document collections are multi-class, and some are multi-label. Both multi-class and multi-label collections can be handled with binary classification. A major challenge for text classification is noisy text data. This problem becomes more severe in corpora with a small set of training documents, particularly when only a few of those documents are positive. A set of natural language texts contains a large number of words, which leads to another important problem for text classification: high-dimensional data. Feature selection is therefore necessary. A classifier must identify the boundary between classes optimally. However, even after features are selected, the boundary remains unclear because positive and negative documents are mixed. Recently, relevance feature discovery (RFD) has been proposed as an effective pattern mining-based feature selection and weighting model. Document weights are significant for ranking relevant information. However, an effective way to set the decision boundary when ranking relevant information for classification has not yet been found. This thesis presents a promising boundary setting method that addresses this challenging issue and produces an effective text classifier, called RFD?. A classifier combination that boosts the effectiveness of the RFD? model is also presented. The experiments carried out in the study demonstrate that the proposed classifier significantly outperforms existing classifiers, including state-of-the-art ones.

Subject

Text mining
 

Catalog

Decision Boundary Setting and Classifier Combination for Text Classification
 
195 pp., PDF file; 6.8 MB
English

Circulation

Rp. 0
Rp. 0
No

Author

Moch Arif Bijaksana
Individual
 
 

Publisher

Queensland University of Technology
Queensland
2015

Collection

Competency

  • CSH6G3 - TEXT ANALYSIS AND MINING
  • CSH4G3 - DATA MINING
  • CSH4H3 - TEXT MINING
  • CII4I3 - DATA MINING
  • CII7E3 - TEXT ANALYSIS AND MINING
  • CPI4I3 - DATA MINING
