dc.description.abstract | Online reviews have become an important reference for consumers making purchasing decisions. They provide a wealth of information but also create a problem of data overload. Online reviews can be divided into two kinds: search-based reviews and experience-based reviews. Generally, search-based reviews focus more on product features, whereas experience-based reviews are more subjective and emotional because each person's experience is different, which makes it difficult to judge how useful such a review is for a purchase decision.
The research object of this study is movie reviews. NLP techniques are used to analyze how the feature categories of reviews influence the helpfulness of movie reviews, helping consumers identify helpful reviews of experience products among a large number of reviews. Movie reviews and movie information were collected from the IMDb website with a web crawler to form the data set of this study. After preprocessing, the data set contains a total of 116,593 records. The data were divided into three movie genres (Action, Adventure, and Drama) and nine feature categories, and text mining techniques were used to split the feature categories into text and non-text types. This study adopts supervised machine learning, using Random Forest, XGBoost, and AdaBoost as regression prediction methods, and designs five sets of experiments. The forward selection method of stepwise regression is used to test combinations of different feature categories and to select the best combination.
The experiments show that the Random Forest method achieves better results. In the non-text category, readers trust the reviews of trusted reviewers and refer to that reviewer's vote for the movie; in addition, both the age of the review and a shorter interval between the review's posting date and the film's release date predict the helpfulness of the review. The text vector (BERT) is the most important single text feature category. In terms of combinations, the Drama genre draws on more text feature categories than the Action and Adventure genres, because readers attend not only to the comments themselves but also to aspects such as sentiment and keywords. The experiments also indicate that when non-text and text feature categories are used together, the text feature categories do not necessarily improve model accuracy for every movie genre. Finally, stepwise regression effectively improves the accuracy of helpfulness prediction.
| en_US |