Introduction
Wordle is a web-based word game created and developed by Welsh software engineer Josh Wardle, and owned and published by The New York Times Company since 2022.
Why? Because the game grew from 90 users in 2021 to around 2 million by February 2022.1
Players have six attempts to guess a five-letter word, with feedback given for each guess in the form of colored tiles indicating when letters match or occupy the correct position.2
Aim of this notebook
Wordle is so addictive that having only one word per day is frustrating (though that also explains the addiction in the first place!). If a player is unable to guess it correctly, they are forced to wait until the next day to try again.
Why not just recreate the game in Python? This is exactly what I’m going to do in this notebook. I took inspiration from this project, which I extended to handle a corner case (double letters) and to take word frequencies into account.
Before starting, let’s correctly understand the rules and corner cases.
Rules & corner cases
Wordle is quite easy to understand:
- You have to guess the Wordle in six goes or fewer
- Every word you enter must be an English word, that is, it should exist in a dictionary
- A letter in the correct place turns green
- A correct letter in the wrong place turns yellow
- An incorrect letter turns gray
- Letters can be used more than once
Despite its simplicity, we should be careful with its implementation.
As a matter of fact, there are 3 main pain points:
1. The guessed word does not exist or its length is not equal to 5
2. Repeated letters
3. Repeated guesses by the user
In this Python implementation, I handled them differently: whilst 1 and 3 are easy if conditions with membership operators, 2 needs some caution.
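Points 1 and 3 can be sketched as plain if checks with the in / not in membership operators. The helper name guess_problem and the tiny word set below are hypothetical, used only for illustration; the game loop later in the notebook applies the same checks inline.

```python
from typing import Optional

# Hypothetical stand-in for the real word list
VALID_WORDS = {"HEART", "ORBIT", "CELLO"}


def guess_problem(guess: str, history: list, valid_words: set) -> Optional[str]:
    """Return a short reason why the guess is rejected, or None if it is acceptable."""
    if guess in history:              # pain point 3: repeated guess
        return "already guessed"
    if len(guess) != 5:               # pain point 1: wrong length
        return "not a 5-letter word"
    if guess not in valid_words:      # pain point 1: unknown word
        return "unknown word"
    return None


print(guess_problem("HEART", ["HEART"], VALID_WORDS))
print(guess_problem("MMMM", [], VALID_WORDS))
print(guess_problem("MMMMM", [], VALID_WORDS))
print(guess_problem("ORBIT", ["HEART"], VALID_WORDS))
```

Returning a reason instead of printing keeps the checks easy to test without typing guesses by hand.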
Repeated letters
Wordle words may have repeated letters. But how does Wordle give hints if a repeated letter is input in any of the five boxes in a row?
I found a blog that helped me understand this rule:
Repeated letters are treated just like any other letter in the game. The second “B” in “Babes” is highlighted green, whereas the “B”s in yellow indicates their wrong letter position.
What about repeating a letter more than it appears in the target word?
If you repeat a letter more than it appears, then the excess will be highlighted in grey.
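One way to read the two quoted rules together: for each letter, the number of colored tiles (green or yellow) is the smaller of its count in the guess and its count in the target, and any extra copies are grey. Here is a minimal sketch of that reading using collections.Counter; the function name colored_copies is made up for this example.

```python
from collections import Counter


def colored_copies(guess: str, target: str) -> dict:
    """For each letter, how many of its copies in the guess get a green or yellow tile."""
    # Counter & Counter keeps, per letter, the minimum of the two counts
    return dict(Counter(guess) & Counter(target))


print(colored_copies("KEEPS", "ABBEY"))  # only one of the two Es can be colored
print(colored_copies("KEBAB", "ABBEY"))  # both Bs, the E and the A can be colored
```

This only counts how many copies are colored; deciding which copy is green and which is yellow is the job of the validation function.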
With this in mind, we are ready to implement Wordle!
Imports & Game setup
To implement Wordle, we’ll need:
- a set of English words
- some colors
- random selection
For this reason, we’ll use this Kaggle dataset to extract an English word list, termcolor to print text with colors, and random to pick a random word from that list.
from random import choice
from termcolor import colored
Setups
From the Kaggle dataset, we’ll read the csv file using Pandas.
This dataset contains the counts of the 333,333 most commonly-used single words on the English language web, as derived from the Google Web Trillion Word Corpus.
import pandas as pd
words = pd.read_csv('../input/english-word-frequency/unigram_freq.csv')
words.head()
| | word | count |
|---|---|---|
| 0 | the | 23135851162 |
| 1 | of | 13151942776 |
| 2 | and | 12997637966 |
| 3 | to | 12136980858 |
| 4 | a | 9081174698 |
words.tail()
| | word | count |
|---|---|---|
| 333328 | gooek | 12711 |
| 333329 | gooddg | 12711 |
| 333330 | gooblle | 12711 |
| 333331 | gollgo | 12711 |
| 333332 | golgw | 12711 |
The tail shows that low-count entries are very rare words (or not real words at all), so keeping them would make the game pick nearly unguessable targets. Let’s set a reasonable frequency threshold.
words.loc[(words['count']>=1000000)].tail(10)
| | word | count |
|---|---|---|
| 26278 | punching | 1000845 |
| 26279 | lagrange | 1000790 |
| 26280 | distinguishes | 1000579 |
| 26281 | treadmills | 1000577 |
| 26282 | poi | 1000422 |
| 26283 | bebop | 1000401 |
| 26284 | streamlining | 1000369 |
| 26285 | dazzle | 1000224 |
| 26286 | trainings | 1000194 |
| 26287 | seeding | 1000093 |
common_words = list(words.loc[(words['count']>=1000000)].astype(str).word.values)
common_words[:10]
['the', 'of', 'and', 'to', 'a', 'in', 'for', 'is', 'on', 'that']
english_5chars_words = [i.upper() for i in common_words if len(i) == 5]
print(len(english_5chars_words))
print(english_5chars_words[5:10])
3254
['FIRST', 'WOULD', 'THESE', 'CLICK', 'PRICE']
We have more than 3k 5-letter words to guess!
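The same two filters can be tried on a handful of rows without the Kaggle file. The sketch below uses plain Python and a few (word, count) pairs copied from the tables above; building a set at the end is an optional tweak for faster membership checks, not something the notebook does.

```python
# Rows copied from the head/tail tables shown above: (word, count)
sample_rows = [
    ("the", 23135851162),
    ("bebop", 1000401),
    ("dazzle", 1000224),
    ("gooek", 12711),
    ("golgw", 12711),
]

MIN_COUNT = 1_000_000  # same frequency threshold as in the notebook
WORD_LENGTH = 5

# Keep frequent words only, then keep the 5-letter ones in upper case
common = [word for word, count in sample_rows if count >= MIN_COUNT]
five_letter = [word.upper() for word in common if len(word) == WORD_LENGTH]

# Optional: a set gives constant-time `in` checks during the game
five_letter_set = set(five_letter)

print(five_letter)
print("GOLGW" in five_letter_set)
```

Here only bebop survives: the and dazzle have the wrong length, while gooek and golgw fall below the threshold.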
Now it’s time to set the colored tiles. For this, we’ll simply copy and paste the three colored squares we need from the internet.
TILES = {
    'correct_place': '🟩',
    'correct_letter': '🟨',
    'incorrect': '⬛'
}
How can we color the text using print? Here’s where the termcolor library comes to the rescue. We simply need to use the colored() function, specifying the color we want.
print(colored('Example in green', 'green'))
print(colored('Example in yellow', 'yellow'))
Example in green
Example in yellow
colored('Example in green', 'green')
'\x1b[32mExample in green\x1b[0m'
colored simply returns our text wrapped in ANSI escape sequences, which the terminal then renders in the desired color when the string is printed.
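To make those extra strings less mysterious, the sketch below builds the same value by hand. It is an illustration only, not how termcolor is implemented internally: ansi_wrap is a hypothetical helper, and 32 and 33 are the standard ANSI foreground codes for green and yellow.

```python
# Standard ANSI foreground color codes (SGR parameters)
ANSI_CODES = {"green": 32, "yellow": 33}
RESET = "\x1b[0m"


def ansi_wrap(text: str, color: str) -> str:
    """Wrap text in the escape sequence that switches the color on, then reset it."""
    return f"\x1b[{ANSI_CODES[color]}m{text}{RESET}"


wrapped = ansi_wrap("Example in green", "green")
print(repr(wrapped))  # the raw string, with the escape codes visible
print(wrapped)        # shown in green by terminals that support ANSI colors
```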
Utils
We now need to implement the core of our Wordle, the validate_guess function, which should be able to:
- return the correct colors for the user’s letters (green if the place is correct, yellow if the letter appears elsewhere in the word, black otherwise)
- deal with repeated letters, so that if the target is abbey and the user inputs keeps, only one e will be colored in yellow. Similarly, if the user inputs kebab, one b will be green and the last one yellow.
To do so, we’ll exploit the replace method with its count parameter, which, according to the docs, is: “A number specifying how many occurrences of the old value you want to replace. Default is all occurrences”
Let’s use the abbey (target) and keeps (guess) example again: when checking the first e, we color it in yellow (it is in the word, but in the wrong place) and replace the target word’s letter with - (abb-y). By doing so, we make sure that the second e will be colored in grey.
This can be obtained with target.replace('e', '-', 1), with 1 specifying that we want to replace only the first occurrence of the letter.
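A quick check of how the count argument behaves, using the target from the example (in upper case, as in the game):

```python
target = "ABBEY"

# With count=1, only the first occurrence of the letter is replaced
after_e = target.replace("E", "-", 1)
after_b = target.replace("B", "-", 1)
# Without count, every occurrence is replaced
all_b = target.replace("B", "-")

print(after_e)
print(after_b)
print(all_b)

# Once the single E is consumed, a second E in the guess no longer matches
print("E" in after_e)
```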
def validate_guess(guess, answer):
    guessed = []
    tile_pattern = []
    # Loop through every letter of the guess
    for i, letter in enumerate(guess):
        # If the letter is in the correct spot, color it in green and add the green tile
        if answer[i] == guess[i]:
            guessed.append(colored(letter, 'green'))
            tile_pattern.append(TILES['correct_place'])
            # Replace the existing letter in the answer with -
            answer = answer.replace(letter, '-', 1)
        # whereas if the letter is in the answer but in the wrong spot, color it in yellow and add the yellow tile
        elif letter in answer:
            guessed.append(colored(letter, 'yellow'))
            tile_pattern.append(TILES['correct_letter'])
            # Replace the existing letter in the answer with -
            answer = answer.replace(letter, '-', 1)
        # Otherwise, the letter doesn't exist, just add the grey tile
        else:
            guessed.append(letter)
            tile_pattern.append(TILES['incorrect'])

    # Return the joined colored letters and tiles pattern
    return ''.join(guessed), ''.join(tile_pattern)
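One caveat of checking letters in a single left-to-right pass: a misplaced copy seen early can use up a letter that a later copy matches exactly. With target WORLD and guess LULLS, for instance, the first L is marked yellow and the L in the fourth position then loses its green. A possible two-pass variant is sketched below as an alternative, not as the notebook's implementation; the name tile_pattern_two_pass is hypothetical, and it returns only the tile string, leaving out the termcolor output.

```python
from collections import Counter

TILES = {"correct_place": "🟩", "correct_letter": "🟨", "incorrect": "⬛"}


def tile_pattern_two_pass(guess: str, answer: str) -> str:
    """Score a guess in two passes so exact matches are claimed before misplaced ones."""
    tiles = [TILES["incorrect"]] * len(guess)
    # Answer letters that are not matched exactly stay available for yellow tiles
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # First pass: mark exact matches in green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            tiles[i] = TILES["correct_place"]

    # Second pass: mark misplaced letters in yellow while unused copies remain
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and remaining[g] > 0:
            tiles[i] = TILES["correct_letter"]
            remaining[g] -= 1

    return "".join(tiles)


print(tile_pattern_two_pass("LULLS", "WORLD"))
print(tile_pattern_two_pass("KEEPS", "ABBEY"))
print(tile_pattern_two_pass("KEBAB", "ABBEY"))
```

Counting the unmatched answer letters up front is what prevents an early yellow from stealing a later green.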
Game implementation
We are ready to implement the game, now that we have our core function in place.
The game itself is nothing but a while loop checking the correctness of the user’s input and some prints!
ALLOWED_GUESSES = 6

def wordle_game(target):
    GAME_ENDED = False
    history_guesses = []
    tiles_patterns = []
    colored_guessed = []

    # Keep playing until the user runs out of tries or finds the word
    while not GAME_ENDED:
        guess = input().upper()
        BAD_GUESS = True
        # Check the user's guess
        while BAD_GUESS:
            # If the guess was already used
            if guess in history_guesses:
                print("You've already guessed this word!!\n")
                guess = input().upper()
            # If the guess does not have 5 letters
            elif len(guess) != 5:
                print('Please enter a 5-letter word!!\n')
                guess = input().upper()
            # If the guess is not in the dictionary
            elif guess not in english_5chars_words:
                print('This word does not exist!')
                guess = input().upper()
            else:
                BAD_GUESS = False

        # Append the valid guess
        history_guesses.append(guess)
        # Validate the guess
        guessed, pattern = validate_guess(guess, target)
        # Append the results
        colored_guessed.append(guessed)
        tiles_patterns.append(pattern)
        # For each result (also the previous ones), it'll print the colored guesses and the tile pattern
        for g, p in zip(colored_guessed, tiles_patterns):
            print(g, end=' ')
            print(p)
        print()

        # If the guess is the target or if the user ran out of tries, end the game
        if guess == target or len(history_guesses) == ALLOWED_GUESSES:
            GAME_ENDED = True

    # Print the results
    if len(history_guesses) == ALLOWED_GUESSES and guess != target:
        print("\nDANG IT! YOU RAN OUT OF TRIES. THE CORRECT WORD WAS {}".format(colored(target, 'green')))
    else:
        print("\nGOOD JOB, YOU NAILED IT IN {}/{} TRIES\n".format(len(history_guesses),
                                                                  ALLOWED_GUESSES))

# Select a random 5 letter word to guess
target_word = choice(english_5chars_words)

print('WELCOME TO WORDLE')
print('NOW GUESS! YOU HAVE {} TRIES\n'.format(ALLOWED_GUESSES))
wordle_game(target_word)
WELCOME TO WORDLE
NOW GUESS! YOU HAVE 6 TRIES
heart
HEART ⬛🟨⬛⬛⬛
orbit
HEART ⬛🟨⬛⬛⬛
ORBIT 🟨⬛⬛⬛⬛
mmmmmmm
Please enter a 5-letter word!!
mmmm
Please enter a 5-letter word!!
mmmmm
This word does not exist!
cello
HEART ⬛🟨⬛⬛⬛
ORBIT 🟨⬛⬛⬛⬛
CELLO ⬛🟨🟩⬛🟨
melon
HEART ⬛🟨⬛⬛⬛
ORBIT 🟨⬛⬛⬛⬛
CELLO ⬛🟨🟩⬛🟨
MELON ⬛🟨🟩🟨⬛
golem
This word does not exist!
moles
This word does not exist!
poles
HEART ⬛🟨⬛⬛⬛
ORBIT 🟨⬛⬛⬛⬛
CELLO ⬛🟨🟩⬛🟨
MELON ⬛🟨🟩🟨⬛
POLES ⬛🟩🟩🟩⬛
soled
This word does not exist!
foley
HEART ⬛🟨⬛⬛⬛
ORBIT 🟨⬛⬛⬛⬛
CELLO ⬛🟨🟩⬛🟨
MELON ⬛🟨🟩🟨⬛
POLES ⬛🟩🟩🟩⬛
FOLEY 🟩🟩🟩🟩🟩
GOOD JOB, YOU NAILED IT IN 6/6 TRIES
Conclusions and next steps
This was a fun experiment and now I’m planning on using streamlit to implement a webapp with this game!
Another possible extension would be to create two distinct word datasets, one easy and one difficult, so that you can play in either easy or hard mode.