Toxic Text Detection In Vietnamese Language
Corresponding author's email: tanlm@hcmute.edu.vn
DOI: https://doi.org/10.54644/jte.2024.1528
Keywords: Machine Learning, Natural Language Processing, Text Classification, Long Short-Term Memory, Gated Recurrent Unit

Abstract
In recent years, the online world has seen an explosion of platforms for communication and sharing. Social networks, forums, and countless websites have created a vast and diverse online landscape. This abundance of content, while exciting, has also introduced new challenges, particularly when it comes to protecting children. Easy access to the internet can expose them to risks such as toxic language and online bullying. Traditional mitigation methods, like blocking connections or restricting screen time, can be cumbersome and are often not fully effective. This paper proposes a solution that leverages deep learning. By training deep learning models to identify malicious phrases, our models can recognize various forms of inappropriate language, including both explicitly sensitive words and seemingly harmless words used with harmful intent. This intelligent filtering system can be implemented on both the server side and the client side of online platforms, offering a robust layer of protection for users as they navigate the digital world.
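The keywords above name Long Short-Term Memory and Gated Recurrent Unit networks as the model families behind this classifier. As an illustration only (not the authors' actual implementation), the sketch below walks through a single GRU recurrence step in NumPy: gates decide how much of the previous hidden state to keep as each token embedding of a sentence is consumed, and the final hidden state would feed a toxic/non-toxic classifier head. All weight names, dimensions, and the random inputs are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU time step.

    x      : (input_dim,)  current token embedding
    h_prev : (hidden_dim,) previous hidden state
    params : dict of weights W_* (hidden_dim x input_dim),
             U_* (hidden_dim x hidden_dim), biases b_* (hidden_dim,)
    """
    # Update gate: how much of the new candidate replaces the old state
    z = sigmoid(params["W_z"] @ x + params["U_z"] @ h_prev + params["b_z"])
    # Reset gate: how much of the old state feeds the candidate
    r = sigmoid(params["W_r"] @ x + params["U_r"] @ h_prev + params["b_r"])
    # Candidate state from the current input and the gated history
    h_tilde = np.tanh(params["W_h"] @ x + params["U_h"] @ (r * h_prev) + params["b_h"])
    # Interpolate between old state and candidate
    return (1 - z) * h_prev + z * h_tilde

# Tiny demo: run a 3-token "sentence" of random embeddings through the cell
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
params = {k: rng.normal(scale=0.1, size=(hidden_dim, input_dim))
          for k in ("W_z", "W_r", "W_h")}
params.update({k: rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
               for k in ("U_z", "U_r", "U_h")})
params.update({k: np.zeros(hidden_dim) for k in ("b_z", "b_r", "b_h")})

h = np.zeros(hidden_dim)
for token_embedding in rng.normal(size=(3, input_dim)):
    h = gru_step(token_embedding, h, params)

print(h.shape)  # final hidden state summarizing the sequence
```

In practice the same recurrence is provided by library layers (e.g. a Keras or PyTorch GRU), and the token embeddings would come from a trained embedding table such as fastText rather than random vectors.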
License
Copyright (c) 2024 Journal of Technical Education Science

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.