COMPARATIVE ANALYSIS OF AUTOMATIC TEXT SUMMARIZATION RESULTS FOR ONLINE NEWS USING THE TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY METHOD WITH AND WITHOUT A CORPUS (CASE STUDY: BBC MAGAZINE ONLINE)

ADITYA DWI PUTRA, NIM. 10651046 (2014) ANALISA HASIL PERBANDINGAN PERINGKASAN TEKS OTOMATIS UNTUK BERITA ONLINE MENGGUNAKAN METODE TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY DENGAN CORPUS DAN NON CORPUS (STUDI KASUS BBC MAGAZINE ONLINE). Undergraduate thesis (Skripsi), UIN SUNAN KALIJAGA.

Abstract

The growth of Internet technology has produced many websites, including ones that publish long articles, such as online magazines. Automatic text summarization is expected to reduce the time needed to read an entire article, so that people can read online magazines faster without losing the meaning of the article. This research begins with pre-processing I, a web scraping process that runs from downloading the text from the website server to filtering it to obtain plain text. The next step is pre-processing II, a text processing stage that aims to obtain the word terms. It consists of case folding, sentence splitting, tokenization, and stop-word filtering. In the approach without stop-word filtering, however, only sentence splitting and tokenization are performed. The tf-idf weight of the resulting terms is then calculated. To generate a summary, the researcher extracts the sentences with high tf-idf values. The test data are articles from BBC Magazine Online. Testing is done in two ways. First, the researcher compares the system's summary with a manual summary (from an abstractor). Second, questionnaires are given to respondents, who assess the summary result. The system that uses corpus filtering generates summaries with 47.50% accuracy (standard deviation 6.9, values ranging from 41.66% to 56.25%), while the system that does not use stop-word filtering produces 45.46% accuracy (standard deviation 6.2, values ranging from 36.84% to 50.00%). In the second test, 4% of respondents strongly agreed, 72% agreed, 23.2% were neutral, 0.8% disagreed, and 0% strongly disagreed.
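
The pre-processing II and sentence extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the tiny stop-word list, the regular expressions, and the choice to treat each sentence as a "document" when computing idf are all assumptions made for illustration. The flag filter_stopwords switches between the two approaches the abstract compares; following the abstract, case folding is applied only in the filtered variant.

```python
import math
import re
from collections import Counter

# Hypothetical, very small stop-word list (assumption; the thesis uses its own corpus).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on", "it"}


def split_sentences(text):
    """Naive sentence splitting: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence, filter_stopwords=True):
    """Tokenize a sentence; with filtering, also case-fold and drop stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    if filter_stopwords:
        tokens = [t.lower() for t in tokens]                # case folding
        tokens = [t for t in tokens if t not in STOPWORDS]  # stop-word filtering
    return tokens


def summarize(text, n_sentences=2, filter_stopwords=True):
    """Return the n_sentences highest-scoring sentences, in original order."""
    sentences = split_sentences(text)
    token_lists = [tokenize(s, filter_stopwords) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences that contain each term
    # (assumption: each sentence plays the role of a document).
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))

    # Sentence score = sum over its terms of tf * log(n / df).
    scores = []
    for tokens in token_lists:
        tf = Counter(tokens)
        scores.append(sum(c * math.log(n / df[t]) for t, c in tf.items()))

    # Pick the top-scoring sentences, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    article = "Cats sleep. Cats eat fish. Dogs chase cats in the park."
    print(summarize(article, n_sentences=2, filter_stopwords=True))
    print(summarize(article, n_sentences=2, filter_stopwords=False))
```

Scoring a sentence by the sum of its term weights favours longer sentences; normalizing by sentence length would be one alternative, and the abstract does not say which the thesis uses.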

Item Type: Thesis (Skripsi)
Additional Information: Supervisor: Agung Fatwanto, Ph.D
Uncontrolled Keywords: automatic text summarization, tf-idf, web data extraction, text processing
Subjects: Informatics Engineering
Divisions: Faculty of Science and Technology > Informatics Engineering (S1)
Depositing User: Miftahul Ulum [IT Staff]
Date Deposited: 01 Jul 2014 12:46
Last Modified: 14 Mar 2016 11:45
URI: http://digilib.uin-suka.ac.id/id/eprint/13273
