
Multi-Label Classification for Scientific Conference Activities Information Text Using Extreme Gradient Boost (XGBoost) Method

C. A. E. Piter, Setiawan Hadi, I. Yulita
2021 International Conference on Artificial Intelligence and Big Data Analytics

This study classifies 1,005 records of information on scientific conferences in Indonesia into three labels, namely Economics, Science and Engineering, and Social Studies, using the Extreme Gradient Boost (XGBoost) method.

Abstract

One of the government’s aims is to improve the quality of education, which requires lecturers to conduct scientific research. This research must later be disseminated, for example through scientific conferences. One important aspect of submitting a scientific article is the suitability of its field. The large number of scientific conferences is accompanied by a large number of fields, which makes it difficult for researchers to find a suitable conference. To assist in this selection, information on scientific conference activities was grouped automatically. This study classifies 1,005 records of information on scientific conferences in Indonesia into three labels, namely Economics, Science and Engineering, and Social Studies. The classification is multi-label, meaning that each record can have more than one label. This study uses the Extreme Gradient Boost (XGBoost) method, and the hyperparameters are tuned to obtain the optimal model. Tuning yields Word2Vec hyperparameters of dimension 100 and window size 15; for XGBoost, the optimal settings are 100 estimators, a learning rate of 0.1, a maximum depth of 6, a minimum child weight of 10, and a gamma of 3.6. The model was then evaluated using K-fold Cross-Validation, which resulted in an average hamming score of 79.52% and an F1 score of 85.88%.
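The abstract reports a hamming score and the tuned XGBoost hyperparameters without spelling out the metric. The sketch below is illustrative only: it assumes the "hamming score" is the common sample-wise multi-label accuracy, the intersection of true and predicted label sets divided by their union, averaged over samples, and the example label sets are hypothetical, not the paper's data.

```python
# Illustrative sketch of the multi-label setup described in the abstract.
# Assumption: "hamming score" here means the sample-wise overlap
# |Y_true ∩ Y_pred| / |Y_true ∪ Y_pred| averaged over all samples,
# a common definition in multi-label classification.

# Hyperparameters reported in the abstract (Word2Vec: dimension 100,
# window 15; XGBoost as below).
XGB_PARAMS = {
    "n_estimators": 100,
    "learning_rate": 0.1,
    "max_depth": 6,
    "min_child_weight": 10,
    "gamma": 3.6,
}

def hamming_score(y_true, y_pred):
    """Average set overlap between true and predicted label sets."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        t_set, p_set = set(t), set(p)
        if not t_set and not p_set:   # both empty: count as full agreement
            total += 1.0
        else:
            total += len(t_set & p_set) / len(t_set | p_set)
    return total / len(y_true)

# Hypothetical predictions for three conference announcements.
y_true = [{"Economics"},
          {"Science and Engineering", "Social Studies"},
          {"Social Studies"}]
y_pred = [{"Economics"},
          {"Science and Engineering"},
          {"Economics", "Social Studies"}]

print(round(hamming_score(y_true, y_pred), 4))  # → 0.6667
```

In a full pipeline, each announcement text would be embedded with Word2Vec, a per-label XGBoost classifier (e.g. one-vs-rest) trained with `XGB_PARAMS`, and the resulting predictions scored with `hamming_score` inside K-fold Cross-Validation.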