2022-01-21 14:42:16 - Updated on 2022-01-21 14:42:40

Opinion mining for app reviews: an analysis of textual representation and predictive models

By Ricardo Marcacini

Popular mobile applications receive millions of user reviews. These reviews contain relevant information for software maintenance, such as bug reports and improvement suggestions. This information is a valuable knowledge source for software requirements engineering, since analyzing app reviews helps in making strategic decisions to improve app quality. However, given the large volume of text, manually extracting the relevant information is impractical.

Opinion mining is the field of study that analyzes people’s sentiments and emotions through opinions expressed on the web, such as on social networks, forums, and community platforms for product and service recommendations. In this paper, we investigate opinion mining for app reviews. In particular, we compare textual representation techniques for classification, sentiment analysis, and utility prediction from app reviews. We discuss and evaluate different techniques for the textual representation of reviews, from the traditional Bag-of-Words (BoW) to the most recent state-of-the-art Neural Language Models (NLM).
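To make the Bag-of-Words idea concrete, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline (their code is in the repository linked below); the three example reviews and the helper names `tokenize`, `bow_vectors`, and `cosine` are hypothetical and chosen only for illustration. Each review becomes a vector of term counts over a shared vocabulary, and reviews are compared by cosine similarity.

```python
import math
import re
from collections import Counter

# Hypothetical example reviews, not taken from the paper's datasets.
REVIEWS = [
    "The app crashes when I open the camera",
    "App crashes every time I open the camera",
    "Application freezes on launching photos",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bow_vectors(texts):
    """Return the sorted vocabulary and one term-count vector per text."""
    counts = [Counter(tokenize(t)) for t in texts]
    vocab = sorted(set().union(*counts))
    # Counter returns 0 for missing words, so every vector has len(vocab) entries.
    return vocab, [[c[word] for word in vocab] for c in counts]


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vocab, vectors = bow_vectors(REVIEWS)
sim_close = cosine(vectors[0], vectors[1])  # reviews sharing many tokens
sim_far = cosine(vectors[0], vectors[2])    # reviews sharing no tokens

print(f"vocabulary size: {len(vocab)}")
print(f"review 0 vs review 1: {sim_close:.3f}")
print(f"review 0 vs review 2: {sim_far:.3f}")
```

In this sketch the third review shares no tokens with the first, so their similarity is exactly zero even though both plausibly describe the same kind of problem. The vector length also grows with the vocabulary, which is why pre-processing choices matter for BoW.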

Our findings show that the traditional Bag-of-Words model, combined with a careful analysis of text pre-processing techniques, is still competitive. It obtains results close to those of NLM in the classification, sentiment analysis, and utility prediction tasks. However, NLM proved to be more advantageous since they achieved very competitive performance in all the predictive tasks covered in this work, provide significant dimensionality reduction, and deal more adequately with semantic proximity between the reviews’ texts.

Datasets and Source-Code: https://github.com/adailtonaraujo/app_review_analysis

Araujo, A.F., Gôlo, M.P.S. & Marcacini, R.M. Opinion mining for app reviews: an analysis of textual representation and predictive models. Autom Softw Eng 29, 5 (2022). https://doi.org/10.1007/s10515-021-00301-1