Elsevier

Expert Systems with Applications

Volume 84, 30 October 2017, Pages 117-126

An expert system for extracting knowledge from customers’ reviews: The case of Amazon.com, Inc.

https://doi.org/10.1016/j.eswa.2017.05.008

Highlights

  • A system for the prediction of the success of products available on Amazon.

  • Prediction based on user feedback.

  • A system that outperforms existing state-of-the-art techniques.

  • A system based on the concept of semantics.

  • A system able to process a large amount of data in an acceptable amount of time.

Abstract

E-commerce has proliferated in the daily activities of end-consumers and firms alike. For firms, consumer satisfaction is an important indicator of e-commerce success. Today, consumers’ reviews and feedback are increasingly shaping consumer intentions regarding new purchases and repeated purchases, while helping to attract new customers. In our work, we use an expert system to predict the sentiment of a product considering a subset of available customers’ reviews.

Introduction

In the marketing and management literature, customer satisfaction is increasingly emphasized as a vital factor for increasing sales and thus corporate performance (Anderson, Fornell, Lehmann, 1994, Balasubramanian, Konana, Menon, 2003, Szymanski, Henard, 2001). As firms engage ever more in e-commerce initiatives, customer satisfaction is an important indicator of e-commerce success. Despite the growth in sales and volumes seen around the world, e-commerce sites often tend to underestimate customer reviews’ importance for new purchases and repeat purchases, as well as for attracting new customers. More often than not, customer reviews are neglected on the list of what firms believe are the critical success factors of an e-commerce initiative (Liu & Arnett, 2000). In fact, firms tend to focus chiefly on optimizing the design of the website, customer service and support, and the administrative activities related to the management of e-commerce (Bendoly, Kaefer, 2004, Bergendahl, 2005, Teo, Liu, 2007).

Although all of these activities require valuable effort, overlooking customer reviews and feedback can curtail the success of e-commerce (Cui, Lui, Guo, 2012, Kim, Galliers, Shin, Ryoo, Kim, 2012, Lee, Han, Suh, 2014, Qiu, Lin, Li, 2015). More specifically, according to the “Social Commerce 2007” report of Bazaarvoice (Bazaarvoice, 2007), as a result of the opportunity given to customers to provide feedback and reviews on purchased products, 42% of e-commerce managers reported a significant rise in the volume and average value of orders. On the other hand, only 6% of e-commerce managers revealed a decline in orders after the reviews were made public. Similarly, large e-commerce sites such as Amazon and eBay have found that many consumers appreciate the opportunity to evaluate products before their (possible) purchase. About 40% of customers participating in a survey by Nielsen (Nielsen, 2010) stated they would not have purchased an electronic item without having had access to the opinions of other customers, whereas 85.57% of the participants said they read reviews often or very often before buying online. In this perspective, an additional factor with a significant impact on online purchasing behavior is the customer social network. As reported in Verbraken, Goethals, Verbeke, and Baesens (2014), knowledge of a person’s social network can help in predicting that person’s e-commerce acceptance of different products. The quality of reviews is also considered to be very important. A study reported in Lackermair, Kailer, and Kanmaz (2013) found that 75% of customers reported that the quality of such reviews greatly influences their decision to purchase a product from an online store. Last but not least, the possibility to use comments made by customers in order to improve the SEO (Search Engine Optimization) process also needs to be considered as a relevant factor for firms. In fact, the content of customers’ opinions could be indexed and used to produce search engine results. In this case, the advantage is that the reviews are written in natural language, which is able to match many keywords and thus give a further boost to the SEO task.

Obviously, customer feedback cannot be considered the only variable that can (positively or negatively) affect the average value of orders. Thus, firms need to consider the broader interplay of factors to fully comprehend which of them enable and which inhibit e-commerce success (Gefen, 2000). An analysis of the role of all possible components is very complex. Nevertheless, the strong correlation between reviews and firm sales, as well as studies demonstrating the importance of reviews in establishing an e-commerce website’s reliability, have made the inclusion of customer reviews an established practice among e-commerce sites.

The ability to predict the score of future reviews is useful in many applications: for instance, it is possible to suggest, among objects with similar ratings, the one that has the highest expected future review score. It may also be useful for predicting issues with the items: from the detection of sellers with counterfeit objects to issues concerning the manipulation of the reviews (Hu, Liu, & Sambamurthy, 2011). In both cases, a score that varies too much with respect to the predicted one can be interpreted as a signal of a possible anomaly. Other studies dealing with the importance of predicting the review score of an item are presented in Qu, Ifrim, and Weikum (2010), Gupta, Di Fabbrizio, and Haffner (2010) and Ganu, Elhadad, and Marian (2009).
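As a minimal sketch of the anomaly signal described above, the snippet below flags items whose observed review score departs from the predicted one by more than a fixed margin. The function name `flag_anomalies`, the threshold of 1.5 rating points, and the sample scores are all illustrative assumptions; the text does not specify how large a deviation should count as "too much".

```python
from typing import List, Sequence


def flag_anomalies(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 1.5,  # hypothetical cut-off in rating points, not taken from the paper
) -> List[int]:
    """Return the indices of items whose observed score differs from
    the predicted score by more than `threshold`."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [
        index
        for index, (obs, pred) in enumerate(zip(observed, predicted))
        if abs(obs - pred) > threshold
    ]


# Made-up scores for four items, for illustration only.
observed_scores = [4.5, 2.0, 3.8, 5.0]
predicted_scores = [4.2, 4.1, 3.9, 3.0]
flagged = flag_anomalies(observed_scores, predicted_scores)
```

A flagged index only marks an item for closer inspection (for example, possible counterfeits or manipulated reviews); it is not evidence of either on its own.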

All aspects mentioned so far show that customer feedback is an important asset for e-commerce managers. Hence, extracting non-trivial knowledge and manageable information from such rich data pools is a challenging issue of paramount importance for e-commerce managers.

To answer this call, in this paper we propose the use of a machine learning (ML) technique. The application of an ML technique tries to overcome the limitations of traditional statistics-based linear regression methods. Although these techniques and models are reliable, they are not the best choice for managing unstructured data or data where no previous knowledge of the underlying model is available. Hence, more sophisticated means must be employed to extract meaningful information from data. ML methods have shown an ability to perform better when dealing with non-linearity and unstructured and complex data. While existing ML techniques have been successfully used to address problems in different domains, researchers continuously seek to advance existing methods and provide novel ones for analyzing data sets to make sense of the data, extract useful information, and build knowledge to inform decision-making. In this light, and considering the large amount of data available today, in this paper we propose an artificial intelligence system for extracting useful information considering the feedback of e-commerce customers. The proposed algorithm is a variant of the standard genetic programming (GP) algorithm but, unlike the standard one, it is able to scale beyond data sets of a few million elements and it is based on a solid theoretical background that guarantees the existence of certain properties that will help the search process produce more reliable solutions.
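One such property can be illustrated with a small sketch. This excerpt does not define the operators used by the proposed GP variant, so the code assumes a common formulation of geometric semantic crossover, simplified to a single constant mixing coefficient and applied directly to vectors of program outputs rather than to program trees. Under that assumption, the offspring's outputs lie on the segment between the parents' outputs, so its error with respect to the targets cannot exceed that of the worse parent. All names and data below are hypothetical.

```python
import math
import random
from typing import List, Sequence


def rmse(outputs: Sequence[float], targets: Sequence[float]) -> float:
    """Root mean squared error between predicted outputs and targets."""
    squared = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    return math.sqrt(squared / len(targets))


def geometric_semantic_crossover(
    parent_a: Sequence[float],
    parent_b: Sequence[float],
    rng: random.Random,
) -> List[float]:
    """Blend two parents' output vectors with one random coefficient in [0, 1).

    Assumption: a constant coefficient is used for every training case,
    which is a simplification chosen for this illustration.
    """
    r = rng.random()
    return [r * a + (1.0 - r) * b for a, b in zip(parent_a, parent_b)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up review scores (targets) and the outputs of two candidate programs.
targets = [5.0, 3.0, 4.0, 1.0]
parent_a = [4.0, 3.5, 2.0, 1.0]
parent_b = [5.0, 1.0, 4.5, 2.0]

child = geometric_semantic_crossover(parent_a, parent_b, rng)
worst_parent_error = max(rmse(parent_a, targets), rmse(parent_b, targets))
child_error = rmse(child, targets)
```

The bound holds because RMSE is a scaled Euclidean distance, and distance to a fixed point is a convex function along the segment joining the two parents.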

The paper is organized as follows: Section 2 describes the standard GP algorithm and the operators used in the search process. Section 3 presents the geometric semantic operators used in this paper. More specifically, we highlight the benefits of the operators on the search process. Section 4 describes the experimental phase and discusses the results obtained. Section 5 concludes the paper, providing some directions for possible future research.


Genetic programming

Genetic Programming (GP) is a technique that comes from a larger computational intelligence research area called evolutionary computation (EC). GP consists of the automated learning of computer programs by means of a process inspired by biological evolution (Koza, 1992). Generation by generation, GP stochastically transforms populations of programs into new, hopefully improved, populations of programs. The quality of a solution is expressed by using an objective function. The value of this

Geometric semantic operators

Despite the large number of human-competitive results achieved with the use of GP (Koza, 2010), researchers continue to investigate new methods in order to improve GP’s ability to produce optimal or quasi-optimal solutions. In recent years, an emerging idea is to include the concept of semantics in the evolutionary process performed by GP. While several studies exist (e.g., Beadle, Johnson, 2009, Castelli, Vanneschi, Silva, 2014b, Vanneschi, Castelli, Silva, 2014a), the definition of semantics

Experiments

This section describes the business problem that was considered, the available data, the experimental settings, and the obtained results.

Conclusions

We proposed a genetic programming system for predicting review scores based on a subset of existing reviews. The proposed system uses genetic operators that are able to integrate semantic awareness into the search process. The use of these operators induces a unimodal fitness landscape in every problem that entails finding a match between predicted values and targets (like regression and classification problems). Considering the particular problem under scrutiny, it is possible to draw some

References (37)

  • S. Balasubramanian et al.

    Customer satisfaction in virtual environments: A study of online investing

    Management Science

    (2003)
  • Bazaarvoice (2007). Social commerce report 2007....
  • L. Beadle et al.

    Semantically driven mutation in genetic programming

  • M. Castelli et al.

    A C++ framework for geometric semantic genetic programming

    Genetic Programming and Evolvable Machines

    (2014)
  • M. Castelli et al.

    Semantic search-based genetic programming and the effect of intron deletion

    IEEE Transactions on Cybernetics

    (2014)
  • G. Cui et al.

    The effect of online consumer reviews on new product sales

    International Journal of Electronic Commerce

    (2012)
  • G. Ganu et al.

    Beyond the stars: Improving rating predictions using review text content

    WebDB

    (2009)
  • N. Gupta et al.

    Capturing the stars: predicting ratings for service and product reviews

    Proceedings of the NAACL HLT 2010 Workshop on Semantic Search

    (2010)