Harmless Drudgery: plurals

Friday, 11 October 2019

Assassins and Dutch courage

The starting point for today's ramblings is the word assassin. Followers of The Old Man of the Mountains (shaik-al-jibal) were known for (in the words of Etymonline) "murdering opposing leaders after intoxicating themselves by eating hashish." It goes on:

1530s (in Anglo-Latin from mid-13c.), via medieval French and Italian Assissini, Assassini, from Arabic hashīshīn "hashish-users," an Arabic nickname for the Nizari Ismaili sect in the Middle East during the Crusades, plural of hashishiyy, from the source of hashish (q.v.).

The Etymonline entry for hashish reads

hashish (n.)

also hasheesh, 1590s, from Arabic hashīsh "powdered hemp, hemp," extended from sense "herbage, dry herb, rough grass, hay."

and quotes English Words of Arabic Ancestry:

Its earliest record as a nickname for cannabis drug is in 13th century Arabic. Its earliest in English is in a traveller's report from Egypt in 1598. It is rare in English until the 19th century. The word form in English today dates from the early 19th century. The word entered all the bigger Western European languages in the early to mid 19th century if you don't count occasional mentions in travellers' reports before then.

That mention of cannabis invites the reflection that the English word canvas is related. Unstressed vowels between consonants (like the second a in cannabis) are, as students of language change over time say, unstable: they tend to disappear.

Here, relatively early in the life of this blog, I was writing about a spiral ring found in Pompeii, with an inscription that included the word domnus (sic, no i).

... no-one could presumably suggest that there was not room, in a 10-15 cm spiral ring, for one little I, or that this one-stroke character was too complex for an otherwise impeccable craftsman! No, people were dropping the unstressed I in speech; and this accounts for words like the Italian Donna and Spanish Doña when the Latin was Domina. (I changed the sex of the lordly person, because in the masculine the attrition of an unstressed vowel has gone one step further in Spanish – Don [which dropped its unstressed vowel {HD 2019: that is, after dropping the unstressed i it dropped the unstressed u}].)

It would have been less contentious to cite the Portuguese dona, as in current Spanish the change has gone further, with the introduction of the ñ.

Anyway, the same happened to the unstressed a in cannabis (though in a different context, of course – not, as linguists are wont to say, diachronic) to produce the word "canvas" – woven from that "herbage, dry herb, rough grass, hay."

The -in of assassin is, incidentally, a false plural, like "a criteria", "a panini", "a cherubim".

<THE_USUAL_PROVISO prescriptivism="0">
(I hasten to add that that "false" is an indication of how the word was formed, not a value judgement. Some of these mistakes are becoming standardized. I won't say "a panini" but at some stage that sort of finger-in-the-dikery will become misplaced. A mistake is at the root of many words. My favourite, and oft-cited, example is the French word for bat – discussed at length here. [I recommend that piece, but if you don't have time the short version is this: a chauve-souris is not a bald-mouse but an owl-mouse.])
</THE_USUAL_PROVISO>

If the notion of a fighting force getting high before spilling blood seems odd, try your preferred search engine with the string US Army Vietnam drug-taking. I get nearly 22 million hits.

But Vietnam was by no means the first theatre of war that encouraged....

<QUOD_SCRIPSI_SCRIPSI translation="You'd better believe it">
"Substance abuse in the Vietnam War wasn’t just limited to the marijuana and heroin enlistees could buy on the black market. Military commanders also heavily prescribed pills to help improve soldiers' performance."

History.com
</QUOD_SCRIPSI_SCRIPSI>

.... drug-taking. The Phrase Finder writes

'Dutch courage' derives from the English derision of the Dutch which came about during the Anglo-Dutch wars.

Strictly, the Phrase Finder is at pains to point out that the use of alcohol to "stiffen the sinews" wasn't the chief aim of the original users of the expression. Rather, the Anglo-Dutch wars encouraged the use of 'Dutch' as a pejorative:

Dutch bargain - a contract made when one is drunk.
Dutch concert - where several tunes are played at the same time.
Dutch feast - where the host gets drunk before the guests.
Dutch treat - a 'treat' at which one has to pay one's own share.
Double Dutch - nonsense.

I'm not sure I buy the pejorative idea. After all, a "Dutch auction" isn't a substandard or risible auction, it's just a different sort of auction. So I am not so quick to dismiss the idea that Dutch fighters had a nip of the hard stuff before an engagement. They wouldn't have been the first to do it, and gin was cheap and plentiful.

Time to return to real life.

b

Monday, 21 March 2016

Ex unibus plurum - "wuggen" revisited

In my last post I stumbled, in passing, on an idea; some of you may have noticed the "hmmm". I thought it might take the form of an update, but it has turned out to be a bit more substantial than that. The idea was to quantify the different ways English has of forming a plural. When I wrote this post nearly two years ago it didn't occur to me to wonder. I was content to say

English has lots of ways of pluralizing a noun – no change (sheep, fish...), change -us to -i (radius → radii...), add -en (ox → oxen [or do something else involving '-en' {child → children, brother → brethren...}]), change -ex or -ix to -ices (matrix → matrices)^† etc, but by far the most common device is to add an s (though this simple idea hides several options [/s/ {rabbits}, /z/ {gardeners}, /ɪz/ {radishes}]). What is the word for 'more than one wug'? Wugs, of course, with /gz/.

It's fairly obvious to a native speaker that the most common way is to add an s. In fact, this rule becomes apparent whenever a young language learner mistakenly adds an s to an irregular plural – sheep becomes sheepS rather than sheep, for example, and when an adult corrects mouseS to mice, the compulsion to keep faith in the add-an-s rule is so strong that the next attempt is quite often miceS.
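For the programmatically minded, the choice among those three pronunciations of the added s can be sketched in a few lines of Python. This is a simplified illustration, not a full phonology: the function name plural_ending and the two sound sets are assumptions made for the example, and the function looks only at the final sound of the singular (written in IPA), ignoring irregular plurals altogether.

```python
# Simplified sketch of the regular English plural rule.
# Assumption: the caller supplies the final sound of the singular in IPA.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # hissing and hushing sounds
VOICELESS = {"p", "t", "k", "f", "θ"}           # voiceless non-sibilants


def plural_ending(final_sound: str) -> str:
    """Return the pronunciation of the plural -s after the given final sound."""
    if final_sound in SIBILANTS:
        return "ɪz"   # radish -> radishes
    if final_sound in VOICELESS:
        return "s"    # rabbit -> rabbits
    return "z"        # every other (voiced) sound: gardener -> gardeners, wug -> wugs


# Final sounds of the examples quoted above (gardener given its non-rhotic ending).
examples = {"rabbit": "t", "gardener": "ə", "radish": "ʃ", "wug": "g"}
for word, final in examples.items():
    print(f"{word}: /{plural_ending(final)}/")
```

The order of the tests matters: sibilants are checked first because /s/ and /ʃ/ are themselves voiceless, and would otherwise be given a plain /s/.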

But I wondered how I could put a number on that. The obvious source of data seemed to me to be the British National Corpus, though it is relatively small, at a mere 100,000,000 words. Some of the publicly available corpuses...

<I_know_ I_know subject="corpora">
There are people who say that corpora is the "correct" plural; some readers may have had the misfortune of being taught by someone who believed so; Firefox is trying to correct my spelling. The latinate plural is not wrong, but I adhere to Fowler's belief (in The King’s English)

...that all words not English in appearance are in English writing ugly and not pretty, and that they are justified only (1) if they afford much the shortest or clearest, if not the only way to the meaning ... or (2) if they have some special appropriateness of association or allusion in the sentence they stand in.
Elsewhere (maybe Modern English Usage) he gives the advice that you’re less likely to make an embarrassing mistake (like mistaking a latinate -us word for a second declension noun instead of a 3rd [such as corpus] or 4th declension one [syllabus, for example], and giving it an -i ending), and more likely to be understood, if you use a native English s plural ending whenever it's possible.
</I_know_ I_know>

... have many more.

It is possible in principle to construct a query that makes a corpus search engine return all the nouns that end with a certain string. But accompany me, if you will, in a thought experiment. Suppose for the sake of argument that in any text in the corpus the percentage of words that are plural nouns is N%.

<back_of_fag_packet>

A few examples, followed by the count of plural nouns:

The cat sat on the mat. N=0

There is a tide in the affairs of men... N=1

Softly softly catchee monkey. N=0

The wages of sin is death. N=1

When shall we three meet again? N=0

Honey I shrunk [sic] the kids N=1

Where have you been all day? N=0

In this mini-corpus (perhaps I should make that nano-corpus) there are 3 plural nouns out of a total of 50 words. They're not too common, plural nouns; 6% in this case, though in for example a recipe book the figure would be much higher. In BNC, that would be 6% of 100,000,000 – 6,000,000.

This is admittedly a VERY dodgy sample; but my point is that even a tiny value for N leads to a big number in a corpus such as BNC.
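The scaling-up step can be written out as a short Python sketch. It uses only the figures quoted above (3 plural nouns, 50 words, a corpus of 100,000,000 words); the function name scale_to_corpus is a hypothetical one chosen for the example, and the sketch inherits all the dodginess of the sample.

```python
def scale_to_corpus(plural_nouns: int, sample_words: int, corpus_words: int):
    """Scale the share of plural nouns in a small sample up to a whole corpus."""
    share = plural_nouns / sample_words
    # Integer arithmetic keeps the scaled-up count free of floating-point noise.
    expected = plural_nouns * corpus_words // sample_words
    return share, expected


share, expected = scale_to_corpus(plural_nouns=3, sample_words=50,
                                  corpus_words=100_000_000)
print(f"{share:.0%} of the corpus, or about {expected:,} plural nouns")
```

Changing the first argument shows how sensitive the estimate is: each extra plural noun in a 50-word sample adds another two million to the total.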

</back_of_fag_packet>

At the British National Corpus I asked for all the plural nouns that end -s. This would catch a few non-standard plurals, like indices or theses; but those would add up to no more than dozens, or hundreds at most, among millions. But the query timed out after finding the first 7500 distinct words (the most common of all was things at 40,453 – a clear 11,000 ahead of the field), by which stage the search had only worked its way down to words that had a total of 27 hits. For comparison, in plural nouns ending -n, the search worked its way down to 23 (there was no 24, 25, 26 or 27) after listing about 96% of all possible hits. Extrapolating from that we can estimate that if a search reaches 27 after 4.5M hits there will be a total of something like 5,000,000 (N = 5 – so my fag packet calculation wasn't too far out).
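That extrapolation can be sketched in Python. The big assumption, labelled in the code, is that the -s search had covered roughly the same share of its hits when it stopped as the -n search had (about 96%); the function name estimate_total and the second, lower coverage figure of 90% are hypothetical, included only to show what coverage the round figure of 5,000,000 would imply.

```python
def estimate_total(hits_listed: float, coverage: float) -> float:
    """Estimate the full number of hits from a partial listing.

    coverage is the assumed fraction of all hits already listed (0 < coverage <= 1).
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return hits_listed / coverage


BNC_WORDS = 100_000_000
HITS_LISTED = 4_500_000  # hits listed before the -s query timed out

# 0.96 is the coverage seen in the -n search; 0.90 is a hypothetical lower figure.
for coverage in (0.96, 0.90):
    total = estimate_total(HITS_LISTED, coverage)
    print(f"coverage {coverage:.0%}: about {total:,.0f} hits, "
          f"{total / BNC_WORDS:.1%} of the corpus")
```

Since the -s listing stopped at words with 27 hits rather than 23, its coverage was probably a little lower than 96%, which is consistent with rounding the estimate up rather than down.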

I've crunched some numbers, thinking at first in terms of some pretty pie charts. But the difference between -s plurals and all the others was so great that pie charts wouldn't be very interesting: most non-s endings would get a tiny (often nearly invisible) sliver. I've shown my working here (none too legibly I'm afraid):

More legible version

In fact, rather than a pie chart, a more helpful image would be a clock-face. The sector occupied by all non-s plurals added together would be the area between 12 o'clock and about 4 minutes to. The only families of non-s plurals that would account for more than a minute or two would be irregular English plurals of all kinds (folk, men, children, feet, teeth,...) and Latin plurals – mostly ending with -i, but sometimes ending with -a or -e; the few Latin plurals that end -ūs (in Latin, as for example syllabus does) are of course lost among the -s endings – if there are any in BNC.
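To translate the clock-face back into figures, here is a small Python sketch. It assumes nothing beyond the "about 4 minutes" mentioned above and the fact that a clock-face has 60 minutes; the function name clock_sector is a hypothetical one chosen for the example.

```python
MINUTES_ON_CLOCK = 60
DEGREES_PER_MINUTE = 6  # 360 degrees spread over 60 minutes


def clock_sector(minutes: int):
    """Return the share of the clock-face and the angle covered by a sector."""
    share = minutes / MINUTES_ON_CLOCK
    degrees = minutes * DEGREES_PER_MINUTE
    return share, degrees


share, degrees = clock_sector(4)
print(f"non-s plurals: {share:.1%} of the face, a sector of {degrees} degrees")
print(f"-s plurals: {1 - share:.1%} of the face")
```

On that reading, all the non-s plurals together amount to roughly one plural in fifteen.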

There. There are some numbers. I may try a similar trick on another corpus; on the other hand I may get on with #WVGTbk2.

b

Update 2016.03.21.14:30 – Added PS

PS Here's a clue:

Stubborn – gathering information on the way (12)

Update 2016.03.23.14:50 – PPS

Added link to spreadsheet.

Update 2016.04.25.11:35 – PPPS

PPPS Time. The answer to that clue: INTRANSIGENT

Update 2018.06.10.10:25 – A few typo fixes

Update 2018.10.23.14:05 – Updated linked spreadsheet (but left old screen-grab as is – I'm sure anyone who's interested in the figures will look in the spreadsheet anyway).