AUTOMATED JOB APPLICATION RESUME SELECTION MODEL USING NATURAL LANGUAGE PROCESSING (NLP) AND MACHINE LEARNING (ML)

Afdi Fauzul Bahar, NIM.: 22206051011 (2024) MODEL OTOMATISASI SELEKSI RESUME LAMARAN KERJA MENGGUNAKAN NATURAL LANGUAGE PROCESSING (NLP) DAN MACHINE LEARNING (ML). Masters thesis, UIN SUNAN KALIJAGA YOGYAKARTA.

Abstract

In workforce recruitment, resume selection requires careful and efficient evaluation to handle a large volume of applications. This study aims to identify the best algorithm for automating the resume selection process on a public dataset. It compares the performance of three machine learning algorithms (Random Forest, Naive Bayes, and Logistic Regression) for automated resume selection using Natural Language Processing (NLP) and Machine Learning (ML). Four dataset-splitting scenarios are used to evaluate each algorithm's performance. The preprocessing steps are punctuation removal, tokenization, stemming, lemmatization, and TF-IDF vectorization. The results indicate that Random Forest performs best, with high accuracy and F1-Score across all scenarios and no signs of overfitting. The best model achieved an accuracy of 99.48%, an F1-Score of 99.49%, and a cross-validation score of 99.61%. Logistic Regression also performed very well, reaching perfect accuracy in several scenarios, but with a risk of overfitting. Naive Bayes, while fast and efficient, showed lower performance, with an accuracy of 92.23%, an F1-Score of 91.35%, and a cross-validation score of 87.00%. In conclusion, Random Forest is recommended for its consistent performance and strong generalization capability. Logistic Regression is suitable when model interpretability is important, provided additional regularization is applied to reduce overfitting. Naive Bayes is more appropriate for situations that require speed and efficiency.
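
As a rough illustration of the preprocessing steps named in the abstract, the sketch below strips punctuation, tokenizes on whitespace, and computes TF-IDF weights using only the Python standard library. It is not the thesis's implementation: the function names `preprocess` and `tfidf`, the sample resume snippets, and the particular smoothed IDF formula are assumptions made for this example, and stemming and lemmatization are left out because they would normally rely on an external NLP library.

```python
import math
import string
from collections import Counter


def preprocess(text):
    """Lowercase, remove ASCII punctuation, and tokenize on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Weight = raw term count * smoothed IDF, where
    idf(t) = ln((1 + N) / (1 + df(t))) + 1 (an assumed variant),
    with no length normalization.
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical resume snippets, invented for illustration only.
resumes = [
    "Python developer; built REST APIs.",
    "Accountant, prepared tax reports.",
    "Python data analyst, built dashboards!",
]
vectors = tfidf(resumes)
```

In this toy corpus, a term that appears in two of the three snippets (such as "python") receives a lower weight than a term unique to one snippet (such as "developer"), which is the property that lets a downstream classifier focus on category-specific vocabulary.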

Item Type: Thesis (Masters)
Additional Information: Supervisor: Prof. Dr. Ir. Shofwatul 'Uyun, S.T., M.Kom., IPM., ASEAN Eng.
Uncontrolled Keywords: NLP, Machine Learning, Random Forest, Naive Bayes, Logistic Regression, Resume Selection
Subjects: 000 Computer Science, Information Science, and General Works > 000 General Works > 005.36 Information Systems
Divisions: Faculty of Science and Technology > Informatics (Master's Program)
Depositing User: Muh Khabib, SIP.
Date Deposited: 02 Oct 2024 13:32
Last Modified: 02 Oct 2024 13:32
URI: http://digilib.uin-suka.ac.id/id/eprint/67438
