To better understand their customers, businesses usually rely on surveys or other statistics to determine how to improve their products. What if you could skip parts of this painstaking process and automatically process data from product reviews? This is the problem that sentiment analysis addresses.
Sentiment analysis is a collection of techniques used to analyze subjectivity in text. The analysis can be applied at different levels: entire documents, individual sentences or, at a more fine-grained level, aspects within sentences.
This article will give a brief overview of sentiment analysis using product review data. By product review data I mean the type of data where a customer leaves an unstructured review that contains no score or ranking other than the text itself.
To illustrate this further, a product review might be:
“This computer is much better than my previous one, although the fan sometimes gets quite loud.”
Given this sentence, sentiment analysis techniques could classify the review as positive, negative or neutral.
What Is Sentiment Analysis?
Sentiment analysis consists of several techniques that are used to classify text. The techniques are derived from an interdisciplinary field called Natural Language Processing (NLP) which utilizes machine learning, computer science and linguistics to analyze human text and speech.
Researchers have shown that sentiment analysis applied to Twitter data corresponds to public opinion polls, can help predict the stock market and has successfully been used to determine sentiment towards Twitter query terms.
Why Is Sentiment Analysis Using Product Review Data Important For Businesses?
For businesses, sentiment analysis is arguably the most useful NLP technique available. It shortens the communicative distance between businesses and consumers without relying on surveys or other time-consuming opinion mining options. As a result, sentiment analysis saves both the business and the consumer time, which benefits both parties.
Sentiment analysis using product review data can be useful for businesses, particularly given the amount of data available nowadays in the form of product reviews and comments on social media.
Manual Sentiment Analysis Approach And Automated Approach
The large amount of data can often prevent manual analysis from being a suitable option. Furthermore, manual analysis prevents real-time access to the consumer’s opinions. If the extraction of sentiment is dependent on the additional step of manual investigation of the data, the analysis will be delayed.
The advantage of sentiment analysis is that it allows automatic information extraction from text. This results in real-time analysis.
This gives companies quick access to new developments along with instant feedback, speeding up their potential response time to developing issues.
Examples of Sentiment Analysis Using Product Review Data
Classifying the Sentiment of Sentences
Product reviews are a readily available source of subjective text that a business can use for sentiment analysis.
Customers might generally be satisfied with a product, and this level of satisfaction can be automatically detected using sentiment analysis. This allows businesses to gauge how satisfied their consumer base is with a certain product.
To illustrate this further, consider an example comment made by a customer online. The customer gave no label, but we can use freely available sentiment analysis methods to assign the sentence a probability for each rating from 1 to 5 stars. In this case the review is positive, and it is correctly classified as such, since 5 stars is the most probable rating.
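As a minimal sketch of how such an output can be read, assume a model returns one probability per star rating. The probabilities below are made up for illustration, and the thresholds that turn a rating into a coarse label are an assumption, not part of any particular model.

```python
# Hypothetical model output for one review: probability per star rating.
# These numbers are invented for illustration only.
star_probabilities = {1: 0.01, 2: 0.02, 3: 0.07, 4: 0.25, 5: 0.65}


def most_probable_rating(probabilities):
    """Return the star rating with the highest probability."""
    return max(probabilities, key=probabilities.get)


def polarity_from_rating(rating):
    """Map a star rating to a coarse label (assumed thresholds)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


rating = most_probable_rating(star_probabilities)
label = polarity_from_rating(rating)
print(rating, label)
```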
Classifying Sentences: Informative Enough?
However, this type of information is not necessarily informative enough to be useful.
Let’s say negative sentiment towards a product or company has increased by 5% from one year to the next. What exactly should be done with this information? In order to answer this question we need to step down from the level of paragraphs and sentences to the level of aspects within the sentences.
Aspect based sentiment analysis
More fine-grained analysis is possible with sentiment analysis techniques. Customers might be satisfied with certain aspects of a product while being dissatisfied with others. Analyzing the sentiment towards particular aspects or entities is known as aspect based sentiment analysis.
For example: The general view among customers who have purchased a computer from a company might be that they are satisfied with the computer’s quality, but dissatisfied with the price or the appearance of the computer.
This more fine-grained analysis is also possible using freely available sentiment analysis models as shown in the image below. In this case the customer thought that the computer performed well but disliked its appearance. We get an output that captures this granularity of opinions in the table to the right.
Aspect based sentiment analysis allows the business to understand not just what in particular customers are satisfied or dissatisfied with, but why. The why in turn allows businesses to make better informed decisions that are based on data.
Sentiment Analysis Applications
Sentiment analysis using product review data allows businesses to make better informed decisions by helping them decide what in particular should be prioritized in order to increase customer satisfaction. Furthermore, issues concerning brand names or products can develop quickly on social media, and sentiment analysis helps companies keep up to date with any nuances.
I will provide some further examples of areas where applying sentiment analysis using product review data can be useful.
Marketing
We could apply sentiment analysis to the domain of marketing by asking: How does a particular company measure up to its competitors in different domains?
For example, we could compare all customer reviews on the market in relation to particular aspects of a product, let’s say “camera quality”, and find the aspects where customers generally prefer your company’s product.
We could then use this information for marketing purposes. A business could provide evidence, through the collected data, for marketing slogans like “The best camera quality among all competitors according to customers”.
Decreasing churn rate
Decreasing churn rate might be the most important use of sentiment analysis using product review data. A lost customer might not only mean the loss of a sale, but also bad PR, which should be avoided. Recognizing exactly what aspects customers are dissatisfied with is therefore crucial to keeping customers from churning, since it gives the company the chance to rectify those issues as quickly as possible.
Main Techniques Of Sentiment Analysis
We have established that there is a lot of latent information available in text. The question we will now attempt to answer is: what techniques are available that allow the extraction of as much useful information from the text as possible?
Previous methods
Earlier, more primitive sentiment analysis techniques were based on statistics or rules. For example, given a list of negative, neutral and positive words, one option when classifying the polarity of a sentence could be to count the positive and negative words in that sentence in order to obtain a sentence score. The sentence polarity would then be the sum of the polarities of the individual words.
Sample Sentence
“This camera is great! I really like it, although it was quite expensive”
- First, clean the sentence and keep only adjectives and verbs: “Great, Like, Expensive”.
- Second, compare the three words to lists containing positive and negative words.
- Third, retrieve a predetermined score for each word and calculate the score for the entire sentence: Great: +2, Like: +1, Expensive: -1. Sentence score: (+2) + (+1) + (-1) = +2.
- Finally, retrieve a sentence label. In our toy example we assume sentences with a score higher than 0 to be positive. Since our sentence had a score of +2, we label it positive.
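The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the lexicon holds only the three words from the example, the part-of-speech filtering is replaced by a plain lexicon lookup (words missing from the lexicon contribute 0), and the rules for the negative and neutral labels are assumed, since the example only defines when a sentence is positive.

```python
import re

# Toy lexicon with the predetermined scores from the example above.
# A real lexicon would contain thousands of entries.
WORD_SCORES = {"great": 2, "like": 1, "expensive": -1}


def score_sentence(sentence, lexicon=WORD_SCORES):
    """Sum the scores of all lexicon words found in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(word, 0) for word in words)


def label_sentence(sentence, lexicon=WORD_SCORES):
    """Label a sentence by the sign of its score (assumed rule for <= 0)."""
    score = score_sentence(sentence, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


sentence = "This camera is great! I really like it, although it was quite expensive"
print(score_sentence(sentence), label_sentence(sentence))
```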
However, this is a sub-par solution since natural language is oftentimes complex and the process of manually creating rules is both time-consuming and unscalable.
Consider a sentence like “I hate how much I enjoy your cheeseburgers”. If we count the negative and positive words in the sentence, then based on our previous method we might fail to capture that this particular review is positive since the sentence contains both “hate” and “enjoy”.
Examples of difficult sentences
In order to realize the difficulty of the task, consider these examples:
“My friend told me this was an amazing product, he was wrong” (Negation)
“Microsoft makes really good computers, your company however does not” (Positive towards other company’s product, not your company)
“Great product if you want to waste your money” (Sarcasm)
To handle what customers actually write in reviews, a sentiment analysis method needs to cope with sentence structure and sarcasm, as well as misspelled words and emoticons. This complexity calls for a different approach than rules or statistics altogether, which can be found in machine learning models.
Machine learning methods
The current state-of-the-art way to perform sentiment analysis is to use machine learning models trained specifically for this task. Human text varies tremendously, and detecting sentiment with a high degree of accuracy requires complex models that are able to generalize over the diversity found in the data.
For example, there are many ways to express the same sentiment and the models need to be able to generalize over this sentence variety. In that way the sentence form does not impact the semantic interpretation of the sentence too much.
As we have previously shown, the solution could be as simple as using an already trained machine learning model. However, models learn from data (in this case the data is text), and if the data that the model is trained on is too different from the particular area which it is to be applied to, the result will be sub-par.
General Machine Learning Methods
Machine learning models trained for sentiment analysis will usually take as input some text and output one or several labels.
To train a model, we generally need both text and the correct labels for that text, so that the model can learn to return the correct label for that type of text.
In that way, the model can learn to generalize over text inputs. When new input is seen the model should be able to approximate the correct label for that unseen input as well, despite not having seen it previously. If the data that the model is trained on is sufficiently representative of the type of data it will encounter then the model will generally perform well.
Sentence Classification
The general procedure in which machine learning models are trained for the task of sentiment analysis is detailed below:
- We have a dataset consisting of several product reviews and a rating, for example:
[Comment: “This restaurant is fantastic!”, Rating: 5/5 stars]
We assume there is a correspondence between each comment and the associated rating, since the rating indicates how much the customer liked the product. We take all of our data and split it into two groups, one for training and one for testing.
- We then train a model on the training set, using the ratings as a target for the model in order to teach the model to associate a review with a rating. The model will through the training process adjust its internal parameters to produce more accurate ratings given a review. The model in a sense “learns” to associate certain words and phrases with a certain rating.
- The model is then tested on the test set data with ratings that it has not seen before. If the model predicts a sentiment score that is similar to the actual ratings, it is considered a good model. It has in a sense learned what words and phrases usually accompany a good score.
- The model can then be applied to new text where there are no labels. The model has learned to produce accurate ratings from text, at least for our dataset. If the new text is similar to the dataset the model was trained on, we should get good results.
A similar implementation using machine learning via the data analytics platform KNIME can be found on our blog (article 1, article 2). The general implementation is the same, i.e. training a machine learning model to learn how to classify text based on labels. There is also a “workflow” readily available for sentiment classification of sentences should you be interested in trying out KNIME.
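The procedure above can be sketched with a very small classifier written in plain Python. Everything here is an assumption made for illustration: the eight reviews are invented, the ratings are limited to 1 and 5, and the model is a basic Naive Bayes word counter rather than the kind of complex model a real project would use. The data is split by position to keep the example deterministic; in practice it would be shuffled first.

```python
import math
import re
from collections import Counter, defaultdict

# Invented (comment, rating) pairs; a real dataset needs far more examples.
DATA = [
    ("This restaurant is fantastic!", 5),
    ("Fantastic food and great service", 5),
    ("Great place, I love it", 5),
    ("Terrible food and awful service", 1),
    ("The service was awful, I hate it", 1),
    ("Awful place, terrible experience", 1),
    ("Great food, fantastic experience", 5),
    ("Terrible place and awful food", 1),
]


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each word occurs together with each rating."""
    word_counts = defaultdict(Counter)
    rating_counts = Counter()
    for text, rating in examples:
        rating_counts[rating] += 1
        word_counts[rating].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, rating_counts, vocabulary


def predict(model, text):
    """Return the rating with the highest Naive Bayes log-probability."""
    word_counts, rating_counts, vocabulary = model
    total_examples = sum(rating_counts.values())
    best_rating, best_score = None, -math.inf
    for rating, count in rating_counts.items():
        total_words = sum(word_counts[rating].values())
        score = math.log(count / total_examples)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[rating][word] + 1)
                                  / (total_words + len(vocabulary)))
        if score > best_score:
            best_rating, best_score = rating, score
    return best_rating


# Split into a training set and a held-out test set.
train_set, test_set = DATA[:6], DATA[6:]
model = train(train_set)

# Evaluate on reviews the model has not seen during training.
correct = sum(predict(model, text) == rating for text, rating in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```

With so little data the accuracy figure says nothing about real-world performance. The point is only the shape of the workflow: split, train on one part, evaluate on the other.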
Aspect Based Sentiment Analysis
The main difference when training a model for aspect based sentiment analysis, compared to sentence classification, is the model architecture and the dataset used for training.
In order to perform aspect based sentiment analysis we need to first define what particular aspects we are interested in for a particular type of business. We could define these aspects ourselves by determining what features we are primarily interested in, like quality, appearance and price.
If there are too many aspects for our purposes, we might decide to combine some of them. It might make sense to combine mentions of several aspects like [Color, Appearance, Looks] into a single aspect, Appearance, in order to simplify the information. We might also do this automatically, by associating different terms with particular aspects through their automatically detected semantic similarity.
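A minimal sketch of the manual variant might look like the following. The mapping table and the list of mentions are hypothetical and hand-written for illustration. The automatic variant based on semantic similarity would replace the fixed table with a similarity model and is not shown here.

```python
from collections import Counter

# Hypothetical hand-written mapping from surface terms to merged aspects.
ASPECT_SYNONYMS = {
    "color": "appearance",
    "appearance": "appearance",
    "looks": "appearance",
    "price": "price",
    "cost": "price",
    "quality": "quality",
}


def normalize_aspects(mentions):
    """Map raw aspect mentions to merged aspects, dropping unknown terms."""
    return [
        ASPECT_SYNONYMS[mention.lower()]
        for mention in mentions
        if mention.lower() in ASPECT_SYNONYMS
    ]


# Invented aspect mentions, as they might be extracted from reviews.
mentions = ["Color", "Looks", "Price", "Appearance", "Battery"]
aspect_counts = Counter(normalize_aspects(mentions))
print(aspect_counts)
```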
We could then use machine learning models to determine what sentiment is expressed towards those aspects that we are interested in.
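To make the expected input and output concrete, here is a rule-based stand-in for that step. It is not a machine learning model: it splits a review into clauses and scores each clause with a tiny hand-written word list. The aspect terms, opinion words and example review are all assumptions chosen for illustration. A trained model would replace the function body while keeping the same kind of output, one sentiment label per aspect.

```python
import re

# Hypothetical vocabularies, chosen only for this illustration.
ASPECT_TERMS = {
    "quality": "quality",
    "price": "price",
    "appearance": "appearance",
    "looks": "appearance",
}
OPINION_WORDS = {"great": 1, "good": 1, "terrible": -1, "dislike": -1}


def aspect_sentiments(review):
    """Return {aspect: label}, scoring each clause separately (crude rule)."""
    results = {}
    # Split the review into clauses at commas and a few conjunctions.
    clauses = re.split(r",|\bbut\b|\band\b|\balthough\b", review.lower())
    for clause in clauses:
        tokens = re.findall(r"[a-z']+", clause)
        score = sum(OPINION_WORDS.get(token, 0) for token in tokens)
        if score > 0:
            label = "positive"
        elif score < 0:
            label = "negative"
        else:
            label = "neutral"
        # Attach the clause label to every aspect mentioned in the clause.
        for token in tokens:
            if token in ASPECT_TERMS:
                results[ASPECT_TERMS[token]] = label
    return results


review = "The quality is great, but the price is terrible and I dislike the appearance."
print(aspect_sentiments(review))
```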
Conclusion
In short, there are many ways to make the task of sentiment analysis fit a business’s particular demands. These often include adapting models to a certain domain, like text from restaurant reviews or technology reviews. Businesses might also vary in what particular aspects related to their business they are interested in. Yet more complex sentiment analysis is possible, where we can look at particular psychological states expressed in text, like anger, sadness or excitement.
Sentiment analysis has a proven usefulness for businesses but it is important to keep in mind what type of sentiment analysis is to be performed.
Most businesses have more to gain by looking into aspect based sentiment analysis since classifying sentences based on their polarity is not necessarily that useful.
Once the type of sentiment analysis to be performed has been chosen, a process of selecting and adapting a machine learning model to a business’s particular demand can begin and this is where Redfield review analysis can come to your aid.