A Designing Model System for Identification Violation of the Law Using Web Crawling and Text Mining

Authors

  • Isnin Faried, Perbanas Institute, Jakarta
  • Dwi Atmodjo W.P, Perbanas Institute, Jakarta
  • Lely P.D. Tampubolon

Keywords:

Model System, Violation of the ITE Law, Focused Web Crawling, Text Mining

Abstract

This study aims to build a model of an opinion detection system intended to help prevent violations of Indonesia's ITE Law (the Electronic Information and Transactions Law). The opinions are drawn from Indonesian-language Twitter posts. The system was developed using web crawler technology and text mining, with one classification method applied to the opinions. The research therefore concentrates on classification and does not focus on sentiment analysis.

A web crawler with a focused crawling algorithm makes it easier to find the legal sources used as guidelines for identifying violations in a user's opinion. The legal source in question is the ITE Law. The benefit is that the crawler compiles documents on specific topics only and crawls only the relevant areas of the web. This reduces network traffic, which in turn yields significant savings in hardware and network resources.
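The idea of focused crawling can be illustrated with a minimal sketch. It is not the authors' crawler: the topic vocabulary, the relevance measure (share of topic terms found on a page), the threshold, and the in-memory toy "web" are all assumptions made for illustration. The sketch shows how links are followed only from pages judged relevant, which is where the savings in traffic come from.

```python
import heapq

# Hypothetical topic vocabulary; a real system would derive it from the ITE Law text.
TOPIC_TERMS = {"ite", "undang-undang", "informasi", "elektronik", "pencemaran"}


def relevance(text, topic_terms=TOPIC_TERMS):
    """Fraction of the topic terms that occur in the page text (assumed measure)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(fetch, seeds, threshold=0.2, max_pages=50):
    """Best-first crawl that expands links only from relevant pages.

    `fetch(url)` is a caller-supplied function returning (text, links) or None.
    """
    # heapq is a min-heap, so priorities are stored as negative scores.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    collected = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        fetched += 1
        if page is None:
            continue
        text, links = page
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: keep nothing and do not follow its links
        collected.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return collected


# Toy in-memory "web" standing in for real HTTP requests (hypothetical pages).
TOY_WEB = {
    "law/index": ("undang-undang informasi dan transaksi elektronik ite",
                  ["law/pasal27", "news/sports"]),
    "law/pasal27": ("pasal 27 ite pencemaran nama baik informasi elektronik",
                    ["law/index"]),
    "news/sports": ("hasil pertandingan sepak bola tadi malam",
                    ["news/more-sports"]),
    "news/more-sports": ("jadwal liga minggu depan", []),
}

result = focused_crawl(TOY_WEB.get, ["law/index"])
print([url for url, _ in result])
```

In this toy run the off-topic page is fetched once and discarded, and the page it links to is never requested at all. A general-purpose crawler would have fetched all four pages.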

Opinion mining in this research uses the Naïve Bayes Multinomial Text (NBMT) algorithm. This algorithm is well suited to classifying Twitter opinions and is capable of producing good accuracy. The NBMT implementation also makes the opinion classification process fast.
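For readers unfamiliar with multinomial Naïve Bayes, the following is a minimal sketch of the general technique with Laplace smoothing. It is not the NBMT implementation used in the study: the class name, the whitespace tokenizer, the label names, and the four training sentences are hypothetical and serve only to show how word counts per class turn into a classification.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over whitespace tokens with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed default of 1)

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, doc):
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the study's Twitter corpus.
train_docs = [
    "dasar penipu kamu bodoh",
    "dia penipu dan pembohong besar",
    "selamat pagi semoga harimu menyenangkan",
    "terima kasih atas bantuan kamu",
]
train_labels = ["potential_violation", "potential_violation", "safe", "safe"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("kamu penipu besar"))
print(model.predict("selamat pagi"))
```

Training is a single counting pass and prediction is a sum of logarithms per class, which is consistent with the speed the abstract attributes to this family of classifiers.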

The resulting model offers an alternative means of preventing opinions that potentially violate the law, especially the ITE Law. As a next step, the authors plan further research on opinion mining with better accuracy, particularly for detecting violations of the ITE Law.

 


Published

2023-01-11

How to Cite

Faried, I., Atmodjo W.P, D., & Tampubolon, L. P. (2023). A Designing Model System for Identification Violation of the Law Using Web Crawling and Text Mining. Adpebi Science Series. Retrieved from https://adpebipublishing.com/index.php/AICMEST/article/view/201

Section

Articles