Wednesday, 13 April 2022

To Wordle or not to Wordle

You may have come across those shared grids that have begun to infiltrate the anti-social media. A picture being worth a thousand words, those 5x6 grids of squares of increasing verdure...

<parenthesis>
(which sounds inappropriately botanical; I just mean greenness, green being the signifier of success – olive green for the right letter in the wrong place, and emerald green for the right letter in the right place)
</parenthesis>

... or 5x5, or 5x4, or 5x3, or even 5x2 (my PB, though see the P4S update at the foot of this post; someone interviewed on a fairly recent Newscast had got a first-time guess) say "Look how clever I am" or, depending on your point of view, "Look how much time I've wasted", or "Look how sad I am, not waving but drowning in a sea of pointless guesswork."

But what's a winning strategy?
 
Ever since Sherlock Holmes told us, we've known that E is the commonest letter in English: "Elementary, my dear Watson"...
<parenthesis>
(which, in Conan Doyle's text, the great detective never said, although what happens in plays or on film or TV is anyone's guess)
</parenthesis>

... Where would we be without E? Lmntary, my dar Watson. And Conan Doyle's source was presumably Samuel Morse's calculation:

Source

But look at the sample size – fewer than 110,000 letters, which, using the rule of thumb used in my publishing days (about 6 letters per word), amounts to fewer than 18,000 words. And that means that the balance is skewed towards whatever kind of text the "sets of printer's type" happened to include: if they were, say, recipe books, then words like boil and heat and teaspoon would be over-represented; it's hardly a representative sample.

As that Notre Dame page goes on to say, the problem of this tiny sample size is solved by using a dictionary as the source...

<parenthesis>
(and strangely, for a US seat of learning, they chose the OED rather than, say, Webster's.)
</parenthesis>
See here for an explanation of the 3rd column

 








...with the result that, whereas Morse calculated that E was 24 times as common as Q, in the OED it is nearly 57 times more common.

More common in glossed words, that is. Not content with this, real Wordle-nerds have calculated the relative frequencies of letters in 5-letter words...

<reservation>
Not that letter-frequency is anywhere near the whole story. What matters more is morphemes (word-building blocks, represented by groups of letters). To take a trivial example from a recent answer:
<example>
The target was CHUNK. By chance, my first guess ended CH. So I got two olive green (right letter, wrong place) squares. If I had been a devotee of the "relative frequencies of letters in 5-letter words" school, I'd have had to consider 4 possible alternative places for C AND 4 possible alternative places for H. But I imagine (if I were a betting man, I'd put money on it) that in any word...
<inline-pps> 
(particularly in any short word; in composite words there are more opportunities for c to follow h; in spatchcock, for example, there's an -hc-, but 5-letter words don't allow for that sort of juxtaposition)
</inline-pps> 

...that includes both C and H, the odds are that they will fall together and in that order. So instead of the 4x4 set of possibilities, there were just 3; and among those 3, CH??? was by far the most probable.
</example>
Morphemes matter more than letters.
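
For anyone who would rather test that bet than take it on trust, here is a minimal Python sketch. The word list is a hand-picked sample of my own choosing (an assumption for illustration, not Wordle's list), and the names SAMPLE_WORDS, ch_placements and adjacent_ch_patterns are invented here; a real test needs a real list of five-letter words.

```python
from collections import Counter

# Hand-picked illustrative sample (an assumption), NOT the Wordle word list.
# It only shows the mechanics; swap in a real list to test the claim itself.
SAMPLE_WORDS = ["chunk", "chair", "chest", "child", "ketch",
                "beach", "yacht", "ached", "ethic", "hence"]


def ch_placements(words):
    """Tally where C and H fall in words that contain both letters."""
    tally = Counter()
    for word in words:
        w = word.lower()
        if "c" not in w or "h" not in w:
            continue  # only words containing both letters are of interest
        idx = w.find("ch")
        if idx >= 0:
            # C and H adjacent and in that order: record e.g. "CH???"
            tally["?" * idx + "CH" + "?" * (len(w) - idx - 2)] += 1
        else:
            tally["not adjacent"] += 1
    return tally


def adjacent_ch_patterns(excluded_start):
    """Five-letter patterns with CH together, skipping one start position.

    Positions are 0-based, so a guess ending in CH rules out start 3.
    """
    return ["?" * i + "CH" + "?" * (3 - i) for i in range(4) if i != excluded_start]


placements = ch_placements(SAMPLE_WORDS)
for pattern, count in placements.most_common():
    print(pattern, count)

# The guess that ended CH (both squares olive) leaves just three slots.
print(adjacent_ch_patterns(3))
```

The last line prints the three candidates CH???, ?CH?? and ??CH?. The tally above it says nothing reliable about English, because the sample was chosen by hand; only a full word list could confirm or sink the claim that CH??? is the favourite.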
</reservation>
... so if that sort of thing floats your boat ...

<tangent subject="5-letter vessels">
CANOE, KETCH, SKIFF, SLOOP, YACHT...
<meta-tangent>
(incidentally, one of the few words that fit the pattern ??CH?; Onelook lists only 63 "common" words here, but a definition of common that includes words such as "zuche" and "elche" seems to me rather dubious.)
</meta-tangent>

</tangent>

... a Google search will lead you further down this rat-hole. And only wimps stop at a mere five letters; there are more variants than most viruses.
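
For the curious, the letter-counting itself is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the word list is the handful of five-letter words that happen to appear in this post (not Wordle's list, and far too small to mean anything), and the names SAMPLE_WORDS and letter_frequencies are invented here.

```python
from collections import Counter

# Stand-in sample (an assumption): the five vessels above plus CHUNK.
# A real count would read in a full list of five-letter words instead.
SAMPLE_WORDS = ["canoe", "ketch", "skiff", "sloop", "yacht", "chunk"]


def letter_frequencies(words, length=5):
    """Return each letter's share of all letters in words of the given length."""
    counts = Counter()
    for word in words:
        w = word.lower()
        if len(w) == length and w.isalpha():
            counts.update(w)  # adds one to the count for every letter in w
    total = sum(counts.values())
    if total == 0:
        return {}  # no qualifying words, so nothing to report
    # most_common() orders the letters from most to least frequent
    return {letter: n / total for letter, n in counts.most_common()}


freqs = letter_frequencies(SAMPLE_WORDS)
for letter, share in list(freqs.items())[:5]:
    print(f"{letter.upper()}: {share:.1%}")
```

With a sample this small the ranking is an accident of the words chosen (C comes top here, which says more about boats than about English), so the numbers only mean something once the list is a full one.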

But that's enough for me; and more than enough for now.

b

 

Update: 2022.04.15.15:40 – Added PS in red.

Update: 2022.04.10.17:20 – Added <inline-ps />

Update: 2023.06.04.17:25 – Added a few interesting solutions (where "interesting" is, as we used to say in the software engineering world, a signed variable.)


I've done it in 2 before
but not after so inauspicious a start


A Christmas tree, and all bright green


1,2,3,4,5, and still bright green


Not so symmetrical
just frustratingly regular


And an epic fail
ning

Update: 2023.06.08.16:25 – Added PPPS.
PPPS A rather different five-letter based word game is this.

Update: 2023.08.18.19:45 – Added P4S.

No longer: this happened this morning:

The app, which comments on a successful guess, was rendered nearly wordless: 'Genius'. I'm not so sure. Maybe Someone was saying 'Ha! Thought you could kill another 10 minutes? Think again, sucker.'


