Text Analysis

Author

Tiffany Sun

Last updated

May 6, 2025

We can uncover meaningful insights and patterns about the tens of thousands of games people play on Steam through text analysis. What are the prominent patterns in game descriptions? Which words and sentiment patterns are most common across all game descriptions combined? Does the text reveal differences and similarities among games released over time? In what ways can we apply these findings?

Analysis Process

To address these questions, I analyzed the text predominantly through word clouds and bar charts. The word clouds were built with the wordcloud2 package (Timdream 2018). Only games with English text are included. Stop words are always excluded. In some visualizations, words without AFINN values are also excluded to place the emphasis on sentiment. A total of eight datasets were used for this analysis; five of them explore differences in text across time and were created through dynamically named variables in a for-loop, an approach I was aided in by (Globe 2020).
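The first step described above, turning descriptions into countable words, can be sketched as follows. This is a minimal illustration in base R, not the code behind the report: the two descriptions are invented, and the splitting rule (anything that is not a letter or an apostrophe) is an assumption.

```r
# Toy descriptions, invented for illustration (not taken from the Steam data).
games <- data.frame(
  name = c("Game A", "Game B"),
  description = c("Explore a vast world and fight evil.",
                  "Fight your way through a world of magic."),
  stringsAsFactors = FALSE
)

# Lower-case each description and split it on any run of characters that is
# not a letter or an apostrophe.
tokens <- unlist(strsplit(tolower(games$description), "[^a-z']+"))
tokens <- tokens[tokens != ""]  # drop empty strings the split may leave behind

# Count how often each word occurs; this yields a `word` and an `n` column.
counts <- as.data.frame(table(word = tokens),
                        responseName = "n",
                        stringsAsFactors = FALSE)

# Most frequent words first, ties broken alphabetically.
counts <- counts[order(-counts$n, counts$word), ]
rownames(counts) <- NULL
print(counts)
```

Stop words such as “a” and “and” are still present at this stage; they are filtered out in a later step.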

Variables Explored:

There are four main variables that are explored in various visualizations:

  • word - Individual words extracted from the games’ descriptions.

  • year - The release year, treated as a categorical variable.

  • n - The number of times each word occurs.

  • value - The AFINN sentiment value of the word.

Packages Utilized:

  • tidyverse

  • readr

  • wordcloud2

  • RColorBrewer

  • plotly

AFINN:

AFINN, developed by Finn Årup Nielsen, is a lexicon (list) of words rated for their emotional valence. Each word is assigned an integer value from -5 to +5, with negative values indicating negative sentiment and positive values indicating positive sentiment. For instance, words such as “happy”, “adventurous”, and “magic” would have positive values, while words like “disgust” and “evil” would have negative values.
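Attaching sentiment values to word counts amounts to a join against the lexicon. The sketch below uses base R and a tiny stand-in lexicon; the scores are placeholders that only follow the -5 to +5 convention and are not the published AFINN values, and the word counts are made up.

```r
# Stand-in lexicon: placeholder scores, NOT the real AFINN values.
toy_lexicon <- data.frame(
  word  = c("happy", "magic", "evil", "disgust"),
  value = c(3, 2, -3, -3),
  stringsAsFactors = FALSE
)

# Hypothetical word counts from some set of descriptions.
word_counts <- data.frame(
  word = c("world", "magic", "evil", "explore"),
  n    = c(5L, 3L, 1L, 4L),
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default, so words that have no entry in
# the lexicon ("world", "explore") are dropped from the result.
scored <- merge(word_counts, toy_lexicon, by = "word")

# Label each remaining word by the sign of its score.
scored$polarity <- ifelse(scored$value > 0, "positive", "negative")
print(scored)
```

Dropping unmatched words is what the visualizations that "exclude words without AFINN values" rely on; a left join (`all.x = TRUE`) would keep them with a missing value instead.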

Stop Words:

Stop words are common, everyday words that contribute little to the meaning of a text. They include articles such as “the” and common verbs such as “is”.
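Removing stop words is a simple membership filter. The list below is a tiny illustrative one written for this sketch; real stop-word lists contain hundreds of entries, and the exact list used in this analysis is not shown here.

```r
# A very small, illustrative stop-word list (an assumption for this sketch).
toy_stop_words <- c("the", "is", "a", "and", "of", "in")

# Tokens from a hypothetical description.
tokens <- c("the", "world", "is", "a", "realm", "of", "magic")

# Keep only the tokens that are not in the stop-word list.
kept <- tokens[!tokens %in% toy_stop_words]
print(kept)
```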

Note

Disclaimer: Visualizations exclude most games whose descriptions are in other languages.

Overall Analysis

Word Cloud

This word cloud shows the words from all Steam game descriptions, excluding those in other languages, that have a corresponding AFINN sentiment value. Hover to see the prevalence of each word.
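A word cloud like this one can be produced with `wordcloud2()`, which expects a data frame whose first two columns hold the words and their frequencies. The sketch below is an illustration under assumptions: the words and counts are invented, and the `size` and `color` settings are arbitrary choices rather than the ones used for Figure 1.

```r
# Invented frequencies for illustration only.
cloud_data <- data.frame(
  word = c("fight", "evil", "fun", "war", "magic"),
  freq = c(120, 80, 60, 45, 30),
  stringsAsFactors = FALSE
)

# Build the widget only if the wordcloud2 package is installed.
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(cloud_data, size = 0.8, color = "random-dark")
  # Printing `cloud` in an interactive session or a rendered document shows
  # the widget; hovering over a word displays the word and its count.
}
```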

Figure 1

Top Ten Sentiment Words at Five-Year Intervals

Explore the top 10 most common sentiment-bearing words found in all recorded Steam games across time! To explore potential differences and similarities at five-year intervals, I visualized the top ten words found in games released in each of the following years: 2005, 2010, 2015, 2020, and 2025.
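The per-year datasets can be built with dynamically named variables in a for-loop, as mentioned in the Analysis Process section. The sketch below shows one way to do that with `assign()` in base R. The data, the `top_<year>` naming scheme, and the cut-off of two words (instead of ten) are assumptions made to keep the example small.

```r
# Hypothetical scored word counts for two release years (placeholder scores).
scored <- data.frame(
  year  = c(2005, 2005, 2005, 2010, 2010, 2010),
  word  = c("war", "fun", "fight", "war", "magic", "evil"),
  n     = c(9L, 4L, 7L, 12L, 6L, 8L),
  value = c(-2, 4, -1, -2, 2, -3),
  stringsAsFactors = FALSE
)

top_k <- 2  # the report uses 10; 2 keeps the toy output short

for (yr in unique(scored$year)) {
  one_year <- scored[scored$year == yr, ]     # rows for this year only
  one_year <- one_year[order(-one_year$n), ]  # most frequent words first
  # Create a variable named e.g. `top_2005` holding that year's top words.
  assign(paste0("top_", yr), head(one_year, top_k))
}

print(top_2005)
print(top_2010)
```

A named list (for example, one element per year) is a common alternative to separately named variables, since it is easier to loop over when drawing one bar chart per year.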

Note

Disclaimer: The color feature was implemented with help from StackOverflow (benson23 2023).

Figure 2: 2025’s Top 10

Figure 3: 2020’s Top 10

Figure 4: 2015’s Top 10

Figure 5: 2010’s Top 10

Figure 6: 2005’s Top 10

Game-Specific Analysis

While it can be informative to see patterns and trends from a VERY big picture, potentially crucial small details may go unnoticed.

Elder Scrolls vs. Grand Theft Auto?

If you explored Surya’s General Top Ten Network, you’ll notice that the Elder Scrolls and Grand Theft Auto series are among the top Metacritic-rated games. This raises a few questions: What textual commonalities do they share that may contribute to their high ratings? What does that say about Steam’s audience?

Figure 7

Hover to see the word associated with the value.

Figure 8

Figure 9

Hover to see the word associated with the value.

Figure 10

It’s interesting to see that while the two games differ in setting and overall theme, they share several terms, most notably “world”. Other terms such as “characters” and “explore” hint at their similarity as games that empower players to roam their respective realms and engage in roleplaying. Perhaps this similarity also contributes to their status as top-rated games, since games that enable highly immersive interaction and exploration may appeal to most of Steam’s audience. While there are many similarities, a noticeable difference between the two games lies in their average sentiment values. Elder Scrolls IV has a balanced mixture of positive and negative words, while Grand Theft Auto V has more positive words.
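The two comparisons made above, shared terms and average sentiment, can be sketched in a few lines of base R. Everything in this example is hypothetical: the word vectors stand in for the two games’ descriptions, and the scores are placeholders rather than real AFINN values.

```r
# Hypothetical description words for two games (not the real descriptions).
game_a <- c("world", "explore", "characters", "war", "evil", "magic")
game_b <- c("world", "explore", "characters", "fun", "win", "crime")

# Terms that appear in both descriptions.
shared <- intersect(game_a, game_b)

# Placeholder sentiment scores, keyed by word (NOT the real AFINN values).
toy_scores <- c(war = -2, evil = -3, magic = 2, fun = 4, win = 4, crime = -3)

# Average score over the words that have an entry in the lexicon.
mean_score <- function(words) {
  matched <- toy_scores[words]  # NA for words missing from the lexicon
  mean(matched, na.rm = TRUE)
}

print(shared)
print(mean_score(game_a))
print(mean_score(game_b))
```

An unweighted mean over distinct words is only one possible summary; weighting each word by how often it occurs would be an equally reasonable choice.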

Take-Aways

A recurring theme found in most words is action. Whether I analyze game descriptions at a large scale, at a small scale, or chronologically, most of the words I see relate to action. Additionally, prevalent words with negative AFINN values are typically about challenges such as combat or war, while those with positive AFINN values are typically descriptive: “fun”, “glory”, “powerful”, etc. What piqued my curiosity was that the prevalent words remain similar across time, and several even stayed in the top 10. Perhaps this is because action games tend to be more popular, as reinforced by my partners’ analyses. Action games entice players with the prospect of fun challenges and immersive exploration. Hence, more games are released in the action genre than in other genres. Overall, text analysis can reveal trends and insights such as those uncovered in this project, which can contribute to market research in entertainment fields such as video game production.