<parenthesis>
(which sounds inappropriately botanical; I just mean greenness, green being the signifier of success – olive green for the right letter in the wrong place, and emerald green for the right letter in the right place)
</parenthesis>
But what's a winning strategy?
<parenthesis>(which, in Conan Doyle's text, the great detective never said, although what happens in plays or on film or TV is anyone's guess)</parenthesis>
.... Where would we be without E.? Lmntary my dar Watson. And Conan Doyle's source was presumably Samuel Morse's calculation:
Source
But look at the sample size – fewer than 110,000 letters, which, by the rule of thumb from my publishing days (about 6 letters per word), amounts to fewer than 18,000 words. And that means that the balance is skewed towards whatever kind of text the "sets of printer's type" happened to include: if they were, say, recipe books, then words like boil and heat and teaspoon would be over-represented. It's hardly a representative sample.
As that Notre Dame page goes on to say, the problem of this tiny sample size is solved by using a dictionary as the source...
<parenthesis>
(and strangely, for a US seat of learning, they chose the OED rather than, say, Webster's.)
</parenthesis>
See here for an explanation of the 3rd column
...with the result that, whereas Morse calculated that E was 24 times as common as Q, in the OED it is nearly 57 times as common.
More common in glossed words, that is. Not content with this, real Wordle-nerds have calculated the relative frequencies of letters in 5-letter words...
<reservation>
Not that letter-frequency is anywhere near the whole story. What matters more is morphemes (word-building blocks, represented by groups of letters). To take a trivial example from a recent answer:<example>
The target was CHUNK. By chance, my first guess ended CH. So I got two olive green (right letter, wrong place) squares. If I had been a devotee of the "frequencies of letters in 5-letter words" school, I'd have had to consider 4 possible alternative places for C AND 4 possible alternative places for H. But I imagine (if I were a betting man, I'd put money on it) that in any word...
<inline-pps>(particularly in any short word; in composite words there are more opportunities for c to follow h; in spatchcock, for example, there's an -hc-, but 5-letter words don't allow for that sort of juxtaposition)</inline-pps>
...that includes both C and H, the odds are that they will fall together and in that order. So instead of the 4x4 set of possibilities, there were just 3; and among those 3, CH??? was by far the most probable. Morphemes matter more than letters.
</example>
</reservation>
...so if that sort of thing floats your boat...
<tangent subject="5-letter vessels">
CANOE, KETCH, SKIFF, SLOOP, YACHT...<meta-tangent>
(incidentally, one of the few words that fit the pattern ??CH?; Onelook lists only 63 "common" words here, but a definition of common that includes words such as "zuche" and "elche" seems to me rather dubious.)
</meta-tangent>
</tangent>
... a Google search will lead you further down this rat-hole. And only wimps stop at a mere five letters; there are more variants than most viruses.
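For anyone who does want a first step down that rat-hole, here is a minimal sketch in Python of the two counts discussed above: letter frequencies across a list of 5-letter words, and the places C and H can go once a guess ending in CH has turned both squares olive. Everything in it is my own illustration rather than anything a Wordle-nerd site actually uses: the function names are made up, and the word list is just the five vessels plus CHUNK, far too small to tell you anything about real frequencies.

```python
from collections import Counter
from itertools import product


def letter_frequencies(words):
    """Count how often each letter occurs across a list of words."""
    return Counter(letter for word in words for letter in word.upper())


def independent_placements(length, c_excluded, h_excluded):
    """Every allowed slot for C paired with every allowed slot for H.

    This is the rough 4x4 count from the post; it does not bother to
    weed out pairs where both letters land on the same square.
    """
    c_slots = [i for i in range(length) if i != c_excluded]
    h_slots = [i for i in range(length) if i != h_excluded]
    return list(product(c_slots, h_slots))


def adjacent_placements(length, c_excluded, h_excluded):
    """Only the placements where H comes immediately after C."""
    pairs = independent_placements(length, c_excluded, h_excluded)
    return [(c, h) for c, h in pairs if h == c + 1]


def as_pattern(c, h, length=5):
    """Render a placement as a Wordle-style pattern such as 'CH???'."""
    squares = ["?"] * length
    squares[c], squares[h] = "C", "H"
    return "".join(squares)


# Illustrative word list only: the five vessels plus the target.
SAMPLE_WORDS = ["CANOE", "KETCH", "SKIFF", "SLOOP", "YACHT", "CHUNK"]

frequencies = letter_frequencies(SAMPLE_WORDS)

# The guess ended in CH, so C is ruled out of square 4 (index 3)
# and H out of square 5 (index 4).
loose = independent_placements(5, c_excluded=3, h_excluded=4)
tight = adjacent_placements(5, c_excluded=3, h_excluded=4)
patterns = [as_pattern(c, h) for c, h in tight]
```

With a real word list in place of `SAMPLE_WORDS`, `frequencies.most_common()` would give the letter ranking; `loose` holds the 16 letter-by-letter pairings and `patterns` the 3 that keep CH together.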
But that's enough for me; and more than enough for now.
b
Update: 2022.04.15.15:40 – Added PS in red.
Update: 2022.04.10.17:20 – Added <inline-ps />
Update: 2023.06.04.17:25 – Added a few interesting solutions (where "interesting" is, as we used to say in the software engineering world, a signed variable.)
I've done it in 2 before, but not after so inauspicious a start
And an epic fail