Phrase Level Sentiment Analysis Approach

PHRASE LEVEL SENTIMENT ANALYSIS: A Naïve Bayes Approach

Mahnoor Yaqoob
Middle East Technical University, Ankara, Turkey
Email: Mahnoor.yaqoob@metu.edu.tr

Abstract: Sentiment analysis is an enduring field of research in text processing. It is the process of extracting opinions from text. The aim of this paper is to perform sentence-level sentiment analysis on text data. For this purpose we have used a Naïve Bayes classifier with unigram features in a bag-of-words manner. We have evaluated the performance of our model by various methods and have achieved an accuracy of 71% using this technique.

I. Introduction

Sentiment analysis (SA), also referred to as opinion mining (OM), is a computational study of opinions,…
Therefore, the authors of [6] shed light on a new approach to phrase-level sentiment analysis. They first identified whether a sentence is neutral or polar and then identified the contextual polarity of the sentence. Their main idea is that the contextual polarity of a word may differ from its prior polarity, so they presented a new method for automatically distinguishing between the prior and the contextual polarity of words. To perform their experiment they manually annotated their…
Classification Method

In our algorithm we will train a Naïve Bayes classifier for the sentiment analysis task. We will train the classifier on the training set and test it on separate test data.

Equation 1: Naïve Bayes Classifier

C. Preprocessing Methods

The dataset will go through a text pre-processing phase. In pre-processing, each sentence goes through stop-word removal and lower-case conversion. Stop words are usually removed before classification because they are not related to any specific topic. All uppercase characters are converted to lower case.

D. Datasets Description

For our experiment, the dataset we have used is obtained from University of Michigan SI650 (Information Retrieval). Every instance in the dataset (a line in the data file) is a sentence extracted from social media (blogs). The columns of the dataset are separated by a tab; 1 is assigned to positive sentences and 0 to negative sentences. The training set consists of 7086 instances and the testing set consists of 33052 instances. The dataset is available at https://www.kaggle.com/c/si650winter11/data.

E. Feature Extraction

Feature extraction plays a vital role in the performance of a machine learning algorithm. In feature extraction we transform the raw data into numerical features so that it is understandable by the machine learning…
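
The classification, preprocessing, and feature extraction steps described above can be illustrated with a minimal sketch. It is not the author's implementation: the stop-word list, the function names (`preprocess`, `parse_line`, `train`, `predict`), the add-one smoothing, and the four toy sentences are all assumptions made for illustration. The only details taken from the text are the tab-separated `label<TAB>sentence` format, lower-casing, stop-word removal, unigram bag-of-words counts, and a Naïve Bayes classifier.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, abbreviated stop-word list; the paper does not say which list it used.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "i", "it",
              "this", "that", "of", "to", "and"}


def preprocess(sentence):
    """Lower-case the sentence, split on whitespace, and drop stop words."""
    return [tok for tok in sentence.lower().split() if tok not in STOP_WORDS]


def parse_line(line):
    """Split one 'label<TAB>sentence' line into (int label, sentence)."""
    label, sentence = line.rstrip("\n").split("\t", 1)
    return int(label), sentence


def train(labelled_sentences):
    """Collect per-class document counts and unigram (bag-of-words) counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for label, sentence in labelled_sentences:
        tokens = preprocess(sentence)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, sentence):
    """Return the class c maximising log P(c) + sum over words of log P(w | c)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    tokens = [tok for tok in preprocess(sentence) if tok in vocabulary]
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing is an assumption, not stated in the paper.
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy lines in the same format as the dataset (1 = positive, 0 = negative).
# These sentences are invented for illustration and are not from the SI650 data.
toy_training = [
    "1\tI love this movie",
    "1\tThis book is awesome",
    "0\tI hate this movie",
    "0\tThe film was terrible",
]
model = train(parse_line(line) for line in toy_training)
print(predict(model, "I love this awesome book"))
print(predict(model, "That terrible film"))
```

Words that never appeared in training are skipped at prediction time, which is one common choice; a real experiment would read the training file line by line through `parse_line` instead of using the toy list.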
