Analyzing Steam Reviews with Topic Modeling and Natural Language Processing

In this 9-month project, I explored a general text-mining pipeline and iterated on it to build a dedicated pipeline for analyzing reviews on the Steam platform. I applied Natural Language Processing and topic modeling techniques to make analysis with the pipeline more efficient and accurate, and tested the results on reviews of Dota 2, one of the most downloaded and reviewed games on Steam.


While working on this project, I dove into data analysis and gained hands-on experience building natural language processing and topic modeling tools in R.


The results of this work could help game developers, and community managers in particular, discover insights from large volumes of player feedback and improve the quality of their products.

