General Understanding of the Procedures for Applying the Naïve Bayes Classifier to Classify Topics

Narzillo Aloyev Raxmatilloyevich

doi:10.51699/cajlpc.v7i1.1400

Authors

Narzillo Aloyev Raxmatilloyevich, Alisher Navoiy Tashkent State University of Uzbek Language and Literature, Foundational doctoral student

DOI:

https://doi.org/10.51699/cajlpc.v7i1.1400

Keywords:

ideology, Naive Bayes, classification, Bayes' theorem, Maximum Likelihood Estimation, text classification, spam filtering, confusion matrix

Abstract

The Naive Bayes classifier is one of the earliest and best-known supervised machine learning algorithms, and it is built on Bayes' theorem. It is used mainly for text classification and rests on probabilistic principles together with simplifying assumptions that make it computationally efficient. The classifier can be quite efficient, but its assumptions, in particular the conditional independence of features, often do not hold, which can reduce performance in real-world applications. The purpose of this paper is to introduce the mechanisms behind the Naive Bayes classifier and to demonstrate its implementation in text classification. Applied to spam email filtering, for example, the model computes prior and conditional probabilities to predict whether an email is spam. The paper also discusses the use of Maximum Likelihood Estimation (MLE) to compute the probabilities used in text classification and the use of confusion matrices to evaluate classifier performance. The results indicate the importance of data preprocessing and of addressing feature dependence in real-life applications of Naive Bayes, and they suggest meaningful avenues for improving its performance.
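The spam-filtering procedure summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy emails, the labels "spam" and "ham", the function names, and the add-one smoothing parameter `alpha` are all hypothetical. Priors are estimated by MLE as class frequencies; the word likelihoods use smoothed counts, because a pure MLE estimate would assign zero probability to any word unseen in a class.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label). Returns priors, per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    # MLE of the prior P(c): fraction of training documents in class c
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict(tokens, priors, word_counts, vocab, alpha=1.0):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for w in tokens:
            if w not in vocab:
                continue  # skip words never seen in training
            # smoothed estimate of P(w | c); alpha = 1 is an assumption
            score += math.log((word_counts[c][w] + alpha) / (total + alpha * len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

def confusion_matrix(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for a binary task."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            cm["tp" if t == positive else "fp"] += 1
        else:
            cm["fn" if t == positive else "tn"] += 1
    return cm

# Hypothetical toy data, for illustration only
train_docs = [
    ("win free prize now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting schedule for monday".split(), "ham"),
    ("project report attached".split(), "ham"),
]
test_docs = [
    ("free prize".split(), "spam"),
    ("monday meeting report".split(), "ham"),
    ("free report".split(), "ham"),
]

priors, word_counts, vocab = train(train_docs)
y_true = [label for _, label in test_docs]
y_pred = [predict(tokens, priors, word_counts, vocab) for tokens, _ in test_docs]
cm = confusion_matrix(y_true, y_pred)
print(y_pred)
print(cm)
```

On this toy data the third test email, "free report", is labelled ham but predicted as spam, so the confusion matrix records one false positive. The scores are sums of per-word log probabilities, which is exactly where the conditional independence assumption enters.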
