First word of Wordle

In the last week, I have started playing the online word game Wordle by Josh Wardle. I was lured in after getting curious about some strange Twitter status updates that showed rows of green, grey and yellow blocks. It turns out it’s a fun game, too.

The basic idea is to try to guess a five-letter word, and you get six guesses. Each day there is a new word, and everyone gets to guess the same one. After each guess (which must be an actual word), you get some information on how close the guess was because the letters in a guess are shown as green (correct letter in correct position), yellow (correct letter in incorrect position) or grey (incorrect letter). After you’ve finished guessing the word, you can share a status update that shows how well you went, in a way that doesn’t give away any information about the word. That’s what I was seeing on Twitter.

I’ve done it four times now, and a natural question is what word should be the first guess. At that point in time, there is no information about the daily word, so it makes sense to me that the first guess should be the same each day. However, what is the best word to use for that first guess?

The conclusion I’ve reached is that the best word should have five different letters, which together are the five letters most likely to appear in a word, i.e. maximise the chance of getting yellows. Additionally, each of those letters should ideally sit in the position where it is most likely to match, i.e. maximise the chance of getting greens.

To figure this out properly, I would need to know the word list being used by Wordle, which unfortunately I don’t. In fact, there may be two word lists: the word list used to allow guesses, and the word list used to pick the daily word. So, I’ll make a big assumption and use the Collins Scrabble Words from July 2019.

My tool of choice is going to be zsh on my MacBook Air. It doesn’t require anything sophisticated. Also, I’ve removed any extra headers from my word list, and run it through dos2unix to ensure proper end-of-line treatment.

First job is to extract just the 5 letter words:

% grep '^.....$' words.txt > words5.txt
%

Now we need to figure out how many words each letter of the alphabet appears in:

% for letter in {A..Z}
for> do
for> echo $letter:`grep -c -i $letter words5.txt`
for> done | sort -t : -k 2 -n -r | head -n 10
S:5936
E:5705
A:5330
O:3911
R:3909
I:3589
L:3114
T:3033
N:2787
U:2436
%

That wasn’t very efficient, but it doesn’t need to be. We have our answer – the most popular letters are S, E, A, O and R. Putting these letters into a free, online anagram tool, it turns out that there are three words made up from these letters: AEROS, AROSE and SOARE.
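The anagram lookup doesn’t strictly need an online tool. As a minimal sketch, assuming the same words5.txt as above, chaining one grep per letter keeps only the words that contain all five letters; since every word in the file has exactly five letters, those words are the anagrams. The function name words_with_seaor is made up for illustration.

```shell
# words_with_seaor FILE (hypothetical helper): print the lines of FILE that
# contain all of S, E, A, O and R, ignoring case. Each grep filters the
# output of the one before it, so only words with every letter survive.
words_with_seaor() {
  grep -i s "$1" | grep -i e | grep -i a | grep -i o | grep -i r
}
```

Calling it as words_with_seaor words5.txt lists the candidates from the same word list that produced the letter counts, which may differ from the dictionary behind an online anagram tool.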

Okay, so while only one of these is a word that you’d actually use, it turns out that Wordle accepts them all. It looks like Wordle might use the Scrabble word list for its guesses.

In any case, this looks like a pretty good set of letters, as the words in the word list are highly likely to have at least one of these letters:

% grep -c . words5.txt
12972
% grep -c -i -e A -e R -e O -e S -e E words5.txt
12395
%

Of the 12,972 words in the word list, 12,395 (96%) will have at least one letter match!
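The percentage can be computed in the shell rather than by hand. This is only a sketch: letter_coverage is a hypothetical helper name, it assumes the file has at least one non-empty line, and it uses a bracket expression such as [AROSE] in place of the separate -e options, which matches the same set of words.

```shell
# letter_coverage FILE LETTERS (hypothetical helper): print the percentage of
# non-empty lines in FILE that contain at least one of LETTERS, ignoring case.
letter_coverage() {
  total=$(grep -c . "$1")            # number of non-empty lines
  hits=$(grep -c -i "[$2]" "$1")     # lines containing any of the letters
  # The shell only does integer arithmetic, so let awk do the division.
  awk -v h="$hits" -v t="$total" 'BEGIN { printf "%.1f%%\n", 100 * h / t }'
}
```

With the counts above, the underlying calculation is 100 × 12395 / 12972, which is the 96% quoted once rounded to a whole number.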

The next job is to figure out which of these three words is most likely to have letters in the same position as other words in the word list.

% grep -c -e A.... -e .E... -e ..R.. -e ...O. -e ....S words5.txt 
6578
% grep -c -e A.... -e .R... -e ..O.. -e ...S. -e ....E words5.txt
3742
% grep -c -e S.... -e .O... -e ..A.. -e ...R. -e ....E words5.txt
5726
%

We have a winner! At least one letter of AEROS is in the right position for 6,578 words (51%).
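Typing out five patterns per candidate gets tedious if you want to try other words. Here is a rough sketch of the same check for any guess, using awk instead of grep so the positions don’t have to be spelled out by hand. The name position_matches is hypothetical, and the sketch assumes one word per line, as in words5.txt.

```shell
# position_matches GUESS FILE (hypothetical helper): count the lines of FILE
# that share at least one letter with GUESS in the same position, ignoring case.
position_matches() {
  awk -v guess="$1" '
    BEGIN { guess = toupper(guess) }
    {
      word = toupper($0)
      # Compare position by position; one match is enough to count the word.
      for (i = 1; i <= length(guess); i++) {
        if (substr(word, i, 1) == substr(guess, i, 1)) { hits++; break }
      }
    }
    END { print hits + 0 }   # "+ 0" prints 0 rather than an empty line
  ' "$2"
}
```

It is called as position_matches AEROS words5.txt, and likewise for AROSE and SOARE, so the three candidates can be compared with one short command each.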

So, it looks like using AEROS as your first guess in Wordle is a pretty good choice. Just don’t tell anyone that’s what you’re doing; otherwise, when you share the standard Wordle status update, it will actually contain spoilers.

Puzzles and Mysteries

I make no secret of the fact that I like reading (what you might call) “ideas books”. Currently, I’m reading the latest Malcolm Gladwell book, What The Dog Saw, which is chock-full of ideas. Every chapter is an essay he’d previously written for the New Yorker magazine.

A particularly interesting chapter (you can also read it in full here) introduces the concept of puzzles and mysteries. For this framework, Gladwell credits Gregory Treverton (who you can read in Smithsonian Magazine discussing it here). While neither Gladwell nor Treverton goes so far as to define puzzles or mysteries precisely, let me summarise the examples they give and how they characterise some of the differences between them.

Puzzles

Examples:

  • How many missiles did the Soviet Union have?
  • Where were they located?
  • How accurate were they?
  • Where is Osama bin Laden?
  • What are the proven oil reserves in country X?

Characterised by:

  • New information makes it easier to solve
  • Relatively stable answer over time
  • Clear measures of effectiveness of problem-solving

Mysteries

Examples:

  • What is the next Al Qaeda plan?
  • What would happen in Iraq after removing Saddam?
  • What is causing a sick person’s symptoms?
  • How much oil will be produced by a given well in its lifetime?

Characterised by:

  • Too much information, some (much?) of which is conflicting
  • Depends on future interactions of many factors

Gladwell argues that the circumstances leading to the collapse of Enron were a mystery, despite many people (especially those involved in the related court cases) considering it to be a puzzle. Treverton, meanwhile, argues that the world of intel has in the past been structured to solve puzzles but from now on will need to handle mysteries if it is to successfully deal with terrorism.

This is an interesting concept and these are interesting arguments. I found myself wondering how this applied to knowledge workers in general. One of the points made by those authors is that special skill sets and organisations are required to tackle the different kinds of problems. I found this appealing, as many of the problems that I tackle in technology strategy might be considered mysteries of this sort, and I naturally like the idea of being special.

However, upon reflection, there may be a trap here. Dividing knowledge workers into two groups has a sense of introducing a class system – an upstairs-downstairs split – that serves to build barriers between groups that ought to work together.

Also, it isn’t at all clear that all problems can be classified as either a puzzle or a mystery, or even that any particular problem can’t be both. In fact, Gladwell gives an example of a WWII problem concerning a German secret super-weapon that was treated (by different groups) as a puzzle and a mystery.

But despite these concerns, the framework of puzzles and mysteries seems valuable. I currently ask a problem-solving question as part of job interviews, and perhaps I ought to tweak it to be more like a mystery in order to better test if people will fit into the work environment.

In any case, it is apt to quote a fabulous line that Winston Churchill spoke in 1939, which suggests that people have been considering mysteries since well before our recent “age of terror”, although perhaps we need a couple more terms:

I cannot forecast to you the action of Russia. It is a riddle, wrapped in a mystery, inside an enigma; but perhaps there is a key. That key is Russian national interest.

http://www.aes.id.au/?p=460