Document Type: Review Article

Authors

1 Department of Computer Engineering, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran.

2 Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.

Abstract

With the availability of websites and the growth of comments, reviews of user-generated content are increasingly published on the Internet. Sentiment classification is one of the most common problems in text mining; it categorizes reviews into positive and negative classes. Pre-processing plays an important role when such texts are used by machine learning techniques: without efficient pre-processing methods, the results are unreliable. This research investigates the performance of pre-processing for the sentiment classification problem on three popular datasets. We propose a high-performance framework to enhance classification performance. First, features of users' opinions are extracted using three methods: (1) Backward Feature Selection; (2) High Correlation Filter; and (3) Low Variance Filter. Second, the error rate of the primary classification for each method is calculated with a perceptron. Finally, the best method is selected through the fuzzy analytic hierarchy process. This framework is useful for companies that want to monitor people's comments about their brands, and for many other applications. The authors provide further evidence to confirm the superiority of the proposed framework. The results indicate that, on average, the proposed framework outperforms its counterparts, yielding 90.63% precision, 90.89% accuracy, 91.27% recall, and a 91.05% F-measure.
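
As a rough illustration of two of the filters named in the abstract, the sketch below applies a low variance filter and a high correlation filter to a toy term-count matrix. It is not the authors' implementation: the function names, the thresholds (0.01 and 0.9), and the toy data are assumptions chosen only for this example, and the abstract does not state which thresholds were used.

```python
from math import sqrt
from statistics import pvariance


def low_variance_filter(columns, threshold):
    """Return indices of feature columns whose population variance exceeds threshold."""
    return [j for j, col in enumerate(columns) if pvariance(col) > threshold]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def high_correlation_filter(columns, threshold):
    """Greedily keep a column only if its absolute correlation with every
    previously kept column is below threshold."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data (assumed): each inner list is one feature (term) across six reviews.
columns = [
    [1, 0, 1, 0, 1, 0],  # term A
    [2, 0, 2, 0, 2, 0],  # term B, a scaled copy of term A
    [1, 1, 1, 1, 1, 1],  # term C, constant across all reviews
    [0, 0, 1, 1, 0, 1],  # term D
]

print(low_variance_filter(columns, 0.01))     # drops the constant term C
print(high_correlation_filter(columns, 0.9))  # drops term B, redundant with A
```

In this sketch the two filters are run independently; in practice they are often chained, with the correlation filter applied only to the features that survive the variance filter.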
