Wednesday 13 April 2022

To Wordle or not to Wordle

You may have come across those shared grids that have begun to infiltrate the anti-social media. A picture being worth a thousand words, those 5x6 grids of squares of increasing verdure...

(which sounds inappropriately botanical; I just mean greenness, green being the signifier of success – olive green for the right letter in the wrong place, and emerald green for the right letter in the right place)

... or 5x5, or 5x4, or 5x3, or even 5x2 (my PB [see update], though someone interviewed on a fairly recent Newscast had got it with a first-time guess) say "Look how clever I am" or, depending on your point of view, "Look how much time I've wasted", or "Look how sad I am, not waving but drowning in a sea of pointless guesswork."

But what's a winning strategy?
Ever since Sherlock Holmes told us, we've known that E is the commonest letter in English: "Elementary, my dear Watson"...
(which, in Conan Doyle's text, the great detective never said, although what happens in plays or on film or TV is anyone's guess)

... Where would we be without E? Lmntary, my dar Watson. And Conan Doyle's source was presumably Samuel Morse's calculation:


But look at the sample size – fewer than 110,000 letters, which, using the rule of thumb used in my publishing days (about 6 letters per word), amounts to fewer than 18,000 words. And that means that the balance is skewed towards whatever kind of text the "sets of printer's type" happened to include: if they were, say, recipe books, then words like boil and heat and teaspoon would be over-represented; it's hardly a representative sample.

As that Notre Dame page goes on to say, the problem of this tiny sample size is solved by using a dictionary as the source...

(and strangely, for a US seat of learning, they chose the OED rather than, say, Webster's.)
See here for an explanation of the 3rd column


...with the result that, whereas Morse calculated that E was 24 times as common as Q, in the OED it is nearly 57 times more common.

More common in glossed words that is. Not content with this, real Wordle-nerds have calculated the relative frequencies of letters in 5-letter words...
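For anyone who wants to see what such a count involves, here is a minimal sketch. The word list is a hypothetical stand-in of my own (it is not the Wordle answer list, nor anybody's published figures), so the numbers it produces mean nothing beyond illustrating the method: tally every letter, then divide by the total.

```python
from collections import Counter

# Hypothetical stand-in for a real list of five-letter words.
SAMPLE_WORDS = ["chunk", "yacht", "heart", "olive", "green", "guess", "words", "morse"]


def letter_frequencies(words):
    """Return each letter's share of all the letters in the given words."""
    counts = Counter(letter for word in words for letter in word.lower())
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def ranked(words):
    """Letters sorted from most to least frequent (ties broken alphabetically)."""
    freqs = letter_frequencies(words)
    return sorted(freqs, key=lambda letter: (-freqs[letter], letter))


if __name__ == "__main__":
    freqs = letter_frequencies(SAMPLE_WORDS)
    for letter in ranked(SAMPLE_WORDS)[:5]:
        print(letter, round(freqs[letter], 3))
```

Swapping in a genuine list of five-letter words is the only change needed to get figures worth arguing about.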

<reservation>
Not that letter-frequency is anywhere near the whole story. What matters more is morphemes (word-building blocks, represented by groups of letters). To take a trivial example from a recent answer:
The target was CHUNK. By chance, my first guess ended CH. So I got two olive green (right letter, wrong place) squares. If I had been a devotee of the "frequencies of letters in 5-letter words" school, I'd have had to consider 4 possible alternative places for C AND 4 possible alternative places for H. But I imagine (if I were a betting man, I'd put money on it) that in any word...
(particularly in any short word; in composite words there are more opportunities for c to follow h; in spatchcock, for example, there's an -hc-, but 5-letter words don't allow for that sort of juxtaposition)

...that includes both C and H, the odds are that they will fall together and in that order. So instead of the 4x4 set of possibilities, there were just 3; and among those 3, CH??? was by far the most probable.
Morphemes matter more than letters.
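The counting in that example can be checked with a short sketch. It assumes, as "ended CH" implies, that the guess had C in square 4 and H in square 5, and that each olive square rules its letter out of that one square only; the function names are mine. Strictly, the 4x4 figure of 16 includes three cases where C and H would have to share a square, so the sketch counts 13 placements that are actually possible, against 3 when H must directly follow C.

```python
from itertools import product

WORD_LENGTH = 5
# Squares are numbered 1 to 5. Assumption: the guess had C in square 4 and
# H in square 5, and both came back olive green.
C_EXCLUDED = 4
H_EXCLUDED = 5


def placements():
    """All (c_pos, h_pos) pairs consistent with the two olive-green squares."""
    squares = range(1, WORD_LENGTH + 1)
    return [
        (c, h)
        for c, h in product(squares, squares)
        if c != C_EXCLUDED and h != H_EXCLUDED and c != h
    ]


def adjacent_ch(pairs):
    """Keep only the placements where H comes immediately after C."""
    return [(c, h) for c, h in pairs if h == c + 1]


def as_pattern(c, h):
    """Render a placement as a pattern such as 'CH???'."""
    cells = ["?"] * WORD_LENGTH
    cells[c - 1], cells[h - 1] = "C", "H"
    return "".join(cells)


if __name__ == "__main__":
    everything = placements()
    together = adjacent_ch(everything)
    print(len(everything), "placements in all")
    print([as_pattern(c, h) for c, h in together])
```

Which of the three survivors is the likeliest is a question about real words, not arithmetic, and the sketch makes no attempt to answer it.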
</reservation> if that sort of thing floats your boat ...

<tangent subject="5-letter vessels">
(incidentally, one of the few words that fit the pattern ??CH?; Onelook lists only 63 "common" words here, but a definition of common that includes words such as "zuche" and "elche" seems to me rather dubious.)
</tangent>
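A wildcard pattern like ??CH? is easy enough to test against any word list you happen to have. The sketch below is only an illustration: it does not query Onelook, and the sample list is a hypothetical one of my own, with each ? treated as exactly one letter.

```python
import re


def pattern_to_regex(pattern):
    """Turn a pattern like '??CH?' into a compiled, case-insensitive regex."""
    body = "".join("[a-z]" if ch == "?" else re.escape(ch) for ch in pattern)
    return re.compile(body, re.IGNORECASE)


def matches(pattern, words):
    """Return the words that fit the whole pattern, letter for letter."""
    rx = pattern_to_regex(pattern)
    return [word for word in words if rx.fullmatch(word)]


# Hypothetical sample list, not Onelook's data.
SAMPLE = ["yacht", "chunk", "ruche", "match", "niche", "orchid"]

if __name__ == "__main__":
    print(matches("??CH?", SAMPLE))
```

Using fullmatch rather than search means that longer words containing the same run of letters are rejected.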


... a Google search will lead you further down this rat-hole. And only wimps stop at a mere five letters; there are more variants than most viruses.

But that's enough for me; and more than enough for now.



Update: 2022.04.15.15:40 – Added PS in red.

Update: 2022.04.10.17:20 – Added <inline-ps />

Update: 2023.06.04.17:25 – Added a few interesting solutions (where "interesting" is, as we used to say in the software engineering world, a signed variable.)

I've done it in 2 before
but not after so inauspicious a start

A Christmas tree, and all bright green

1,2,3,4,5, and still bright green

Not so symmetrical
just frustratingly regular

And an epic fail

Update: 2023.06.08.16:25 – Added PPPS.
PPPS A rather different five-letter based word game is this.

Update: 2023.08.18.19:45 – Added P4S.

No longer: this happened this morning:

The app, which comments on a successful guess, was rendered nearly wordless: 'Genius'. I'm not so sure. Maybe Someone was saying 'Ha! Thought you could kill another 10 minutes? Think again, sucker.'
