Showing posts with label COCA. Show all posts

Saturday, 21 March 2020

Love in the time of corona


The big news today is that Mr T, the Neanderthal sporter of bling, is treading on people's toes Big Time. Trump's edit of his notes for this speech was reported in the Washington Post, with this photo of his notes:


His defence of this deliberate rabble-rousing was disingenuous and plumbed new depths of insensitivity. The World Health Organization's feelings  were made abundantly clear in this tweet:

I write abundantly not because I'm particularly partial to the cliché (if something is going to be clear in any way, the odds are strongly in favour of abundantly ...
 <COCA_NUMBERS>
Search for *ly clear in this corpus. abundantly is second only to perfectly.
Meanwhile, in this much smaller corpus, dedicated to British English, both perfectly and abundantly are deposed by a brand new number one – absolutely (which clings on to fifth place in COCA). "Go figure", as I gather they say in contemporary American. (In my teaching days I'd've called "abundantly clear" a strong collocation.)
</COCA_NUMBERS>

...as the adverb of choice) but because it has long been the WHO's position on the naming of viruses. Generally, it's unfair (and misleading) to name a virus after a place. The pandemic known commonly as Spanish flu came from everywhere but Spain (Africa, USA, France):
In August 1918, a more virulent strain appeared simultaneously in Brest, France; in Freetown, Sierra Leone; and in the U.S. in Boston, Massachusetts. The Spanish flu also spread through Ireland, carried there by returning Irish soldiers. The Allies of World War I came to call it the Spanish flu, primarily because the pandemic received greater press attention after it moved from France to Spain in November 1918. Spain was not involved in the war and had not imposed wartime censorship.
wikipedia

It was an accident of non-alignment. Cases in Spain only got reported because they could be.  If only they'd had wartime censorship Spain might have avoided that stigma.

But is place relevant in the case of the Coronavirus? Tom Standage, in Go Figure, reported (with no source, but quite plausibly) that a majority (6 in 10) of infections that affect humans started life in other species. (Wikipedia has an article on cross species transmission, which may well point to a source if you want to trawl through it.)

And an obvious source of cross species transmission is wet markets – where all manner of animals are thrown together  in painfully cramped conditions (a crate of chickens,  say,  piled next to a crate of piglets). Such markets aren't peculiar to China, though I suspect most take place in south-east Asia; with a fair few in South America and Africa – and chiefly in the developing world. Articles calling for such markets to be banned or suppressed in some way (like this one or this) are becoming more strident.

"But you can't ban them – they're part of our culture" cry the users. Well bad things need to be banned. And stigmatizing them is a good way of ensuring their demise. If it's a question of  weighing the health of billions of people (and, coincidentally, that of the world economy) against the way of life of a few million, I know where my money is.

Though I hate to appear to side with Mr T, I don't mind the virus being given a name that stigmatizes its source; not geographical, though. The "wet market  virus", perhaps.

But it's a lovely day. I shouldn't be cooped up in here...

b


Wednesday, 29 November 2017

Some thoughts

<rant>
This rant has been bubbling away for a few weeks, ever since Priti Patel's "fulsome apology":

As so often after the breaking of an imagined "rule", this was followed by a Twitterstorm. These snapshots give a taste:

The BBC, to my relief, were a little more measured, allowing themselves a couple of diffident question marks.


(But they still used the loaded phrase "the official definition". For pity's sake, there ISN'T one.)
In the #WATO programme that examined the issue Martha Kearney exemplified this well-meaning misprision...
<digression>
"Tee hee hee, doesn't he mean misapprehension?" hoot the monolexicopaths (OK, I did make that one up) "Misprision means 'wrong action, a failure on the part of authority, early 15c.' [Etymonline], and Ms Kearney certainly did nothing wrong." Well I have chosen to use it to mean failure to grasp (which, incidentally, I have just realized, may well underlie Wilde's choice of name for Miss Prism).
</digression>
 ... by saying that "you and I" as an object phrase is "incorrect" (and was quickly slapped down by Oliver Kamm). And Kamm, at the beginning of the piece, responds to the ubiquitous official definition Shibboleth: "There is no central arbiter of what words mean, they are part of a social contract between the utterer and the hearer or the writer and the reader." Humpty-Dumpty was right (though on the extreme right, where misunderstandings are likely to occur).

One good thing that came out of the kerfuffle was this idea:

which was taken up the next day by Wayne Myers in this tweet (and YouTube posting).

The British National Corpus, for what it's worth, records "fulsome apologies" as the 5th most common "fulsome + <noun>"  collocation; COCA has many more, but neither apology nor apologies. I wonder if this suggests that our American cousins are less tolerant of this usage....
</rant>
Enough of the rant. Another thing that came of the Twitterstorm was my thinking more about -some words. Etymonline has this to say:

-some (1)

word-forming element used in making adjectives from nouns or adjectives (and sometimes verbs) and meaning "tending to; causing; to a considerable degree," from Old English -sum, identical with some, from PIE root *sem- (1) "one; as one, together with." Cognate with Old Frisian -sum, German -sam, Old Norse -samr; also related to same.

Nouns include these:
adventuresome, awesome, bothersome, burdensome, fearsome, frolicsome, handsome, mettlesome, nettlesome, noisome, quarrelsome, toothsome, troublesome, venturesome, winsome. 
The relationship between the noun and -some is not predictable (as often happens when words come together: crocodile shoes are made from part of a crocodile, but crocodile tears aren't). And the other thing that leaps out is that they often hold fossils of words that no longer have a free-standing life in their own right: what is a noi or a win? The Etymonline entries for noisome and winsome explain.

Adjectives include these:
darksome, fulsome, gladsome, lissome/lithesome,  lonesome, wearisome, wholesome/halesome
<autobiographical_note>
I put halesome on the end there as I first met this dialect word in a song I sang at primary school:

Buy ma caller herrin
They're bonny fish and halesome farin

Halesome is to wholesome as hale (now preserved chiefly in the phrase hale and hearty) is to whole. Health comes into it as well. Healing is making whole.
<digression>
One site I visited to find this song introduces an interesting typo: halesome sarin. Sarin can be called many things, but halesome is not one of them.
</digression>
</autobiographical_note>
I'm not sure why Etymonline includes verb as a parenthetical afterthought:  "element used in making adjectives from nouns or adjectives (and sometimes verbs)".
buxom, cumbersome, irksome, loathsome, meddlesome,  tiresome, worrisome
In any case "More or less any noun can be verbed" (as wossname said – Mark Twain?..); so my putting trouble-some among the nouns and worr[y]-some among the verbs is arbitrary.

Again, there are fossils: things don't cumber much nowadays (in fact I wasn't sure at first what part of speech it was). And in the case of buxom, some spelling changes have tried to cover its tracks. The first part of buxom shares its derivation with the bendy sort of bow; and indeed with elbow. It originally meant something like pliable. It would be neat to say that buxom simply means curvaceous, but that would be an oversimplification. To quote Etymonline:
The meaning progressed from "compliant, obliging," through "lively, jolly," "healthily plump, vigorous and attractive," to (in women, and perhaps influenced by lusty) "attractively plump, comely" (1580s). In Johnson [1755] the primary meaning still is "obedient, obsequious." It was used especially of women's figures from at least 1870s...
But enough of this. SOME things are beyond me.

b
PS: A couple of clues:
  • Top dog – a rapper detox, reformed. (4,8)
  • Measure up for inclusion in modification – tricky. (11)
Update: 2017.12.01 – Added PPS

Inspired by Etymonline's 'meaning "tending to; causing; to a considerable degree"' I started to make a Venn diagram showing overlapping shades of meaning (which could be seen as not fitting in with my opening rant – only the meanings I'm toying with are more in a spirit of description rather than of prescription). But I'm not satisfied with the result: I ended up just chasing words from one category to another (and speculating on the usefulness or otherwise of a three-dimensional Venn diagram). Still, here it is:

Update: 2017.12.06 – Fixed typo (although proscription and prescription tend to go together in the same minds).

Monday, 22 May 2017

Numbers

Time for another of my periodical looks at Harmless Drudgery's vital statistics.
In Oct 2016 I wrote of the previous 2 years and 3 months:
It would be unrealistic, I think, to expect a similar near-doubling readership over the coming 9 quarters;  and, besides, it takes quite a bit of (writing) effort to maintain interest – which is at odds with the original purpose of the blog [which, longer-term visitors will know, was to support my other writing efforts].
In April 2015 (in a PS to this) I had written of a record average of daily visits of 55. Well, 55 schmifty-five. The average for this month so far is about four times as much – over 200. The trend started about Christmas 2016, followed by another up-tick at Easter 2017, leading me to think that maybe my key demographic was teachers, who saved their recreational blog-reading for the school holidays, but page visits in May are already (after about two-thirds of the month) almost as high as the total for April (5,147).
HD stats, courtesy of Blogger
And while we're on the subject of numbers, I have long felt something that grates on my ear as "just American"...
<digression>
(pace Susie Dent, whose Americanize!: Why the Americanisation of English Is a Good Thing on Radio 4 last Saturday neither was particularly persuasive nor had to be; I don't need persuading. I prefer -ize myself where admissible – certainly NOT in the lamentable cases of *televize or *analyze, for example. And incidentally, I suppose the inconsistency of that programme's title [Americanize but Americanisation] was intentional)
</digression>
...needed further attention – preferably on the basis of numbers. My source as usual is the British National Corpus (BNC) and its much bigger and more recently updated transatlantic cousin the Corpus of Contemporary American English (COCA). My first three searches seemed to confirm my prejudice:

BNC:
sooner rather than later (just click and sit back while BNC does its thing) 65
sooner than later (just click and sit back while BNC does its thing)
COCA:
sooner than later (just click and sit back while COCA does its thing) 105.

QED. Sooner than later could be assigned, along with I could care less (and incidentally I don't buy Steven Pinker's irony argument – but I don't have time to trace the reference, given the length of the grass) to the Expressions that don't make sense in American English pile.

But COCA is more than five times the size of BNC, so I might have expected a frequency for the preferred form of more than 5 times 65 – well over 300. So I looked again in COCA.
sooner rather than later (just click and sit back while COCA does its thing) 486
So what was demonstratum was not what was demonstrandum. Based on those corpus figures, sooner rather than later is more than 10 times as commonly used by British English speakers/writers as sooner than later. But among American English speakers/writers the predominance is similar; just less pronounced – less than half the ratio of sooner rather than later to sooner than later. And perhaps the preference is on the wane – taken up by a smaller proportion of linguistic ground-breakers on this side of the Atlantic; the sort of comparative-historical corpus query that could prove that, though, is beyond me.
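For anyone who would rather check the arithmetic than trust my mental sums, here is a minimal sketch in Python. The hit counts are the ones quoted above; the corpus sizes (about 100 million words for BNC and 520 million for COCA) are the rough figures given in another post on this blog, and the function and variable names are purely my own invention. The BNC count for sooner than later is not reproduced in the text above, so the sketch leaves it out rather than guess.

```python
# Hit counts as quoted in the post. Corpus sizes (millions of words) are the
# approximate figures this blog gives elsewhere; treat them as assumptions.
BNC_SIZE_M = 100
COCA_SIZE_M = 520

BNC_RATHER = 65      # BNC: "sooner rather than later"
COCA_RATHER = 486    # COCA: "sooner rather than later"
COCA_PLAIN = 105     # COCA: "sooner than later"


def per_million(hits, corpus_size_m):
    """Normalise a raw hit count to hits per million words."""
    return hits / corpus_size_m


def preference_ratio(preferred_hits, variant_hits):
    """How many times more often the preferred form occurs than the variant."""
    return preferred_hits / variant_hits


# If COCA simply mirrored BNC at five times the size, expect about this many hits.
expected_coca_rather = BNC_RATHER * 5

bnc_rather_pm = per_million(BNC_RATHER, BNC_SIZE_M)
coca_rather_pm = per_million(COCA_RATHER, COCA_SIZE_M)
coca_ratio = preference_ratio(COCA_RATHER, COCA_PLAIN)

print(f"Expected COCA hits at 5 x BNC: {expected_coca_rather}")
print(f"BNC  'sooner rather than later': {bnc_rather_pm:.2f} per million words")
print(f"COCA 'sooner rather than later': {coca_rather_pm:.2f} per million words")
print(f"COCA preference ratio: {coca_ratio:.1f} to 1")
```

Normalising to hits per million words is the fairer comparison when two corpora differ this much in size; the raw counts alone are what led me astray in the first place.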

Enough. Biomass destruction is the hors-d'œuvre of the day, and the mower awaits.

b

PS – a clue to be going on with:
  • VIP? Mark; a nut, when crushed. (6,5)
Update: 2017.05.22.22:40 – Added PPS

PPS – Whoops; got the polarity of the comparison wrong, fixed in bold.

Update: 2017.05.26.14:10 – Added PPPS

PPPS – I said I'd write more about Americanisms. I find it hard to say anything new, because I've been fighting this prejudice for so long and in so many different forums.
<digression>
(And there's another one – pluralizing of words with a clear Latin provenance. I'm with Fowler on this one, as I've said before. He wrote:
 ...that all words not English in appearance are in English writing ugly and not pretty, and that they [HD: Latin plurals] are justified only (1) if they afford much the shortest or clearest, if not the only way to the meaning ... or (2) if they have some special appropriateness of association or allusion in the sentence they stand in.
A consequence of the practice of using English endings is that you avoid solecisms such as syllabi; incidentally, for what it's worth – not a lot for writers of English – the Latin plural of syllabus is syllabūs [or a u with some such diacritic – we didn't need them for the exam, so like any self-respecting school-child I ignored them].)
</digression>

A few years ago I wrote here:

...Less well-informed commentators go so far as to say - when asked the difference between authorise and authorize -
No difference at all ... only that americans spell it different cos they feel the need to be different . The correct spelling is with an -s-

Oh dear. In one such discussion I said
There's nothing unBritish about the spelling 'apologize'. It has been the house style of The Times for well over a hundred years, and is used by many large and influential publishers (Oxford University Press, for example). I'm tired of being accused of flirting with modernity and excessive American influence, just because I use a spelling that millions of British people use (so long as they haven't been got at by generations of school-teachers peddling misinformation).
That may have been true of The Times at the time of writing, but 'the times they are a-changin''. A few cases of '-ize' pass the scrutiny of the subs' eyes - especially when there is a strong etymological justification - as in the case of 'baptize' (where there is a zeta rather than a sigma in the original Greek); but fewer and fewer.
But to quote the Oxford Dictionary for Writers and Editors


The first line is crucial:

WHERE verbs can be spelled with either an -ize or -ise ending...

American and British English speakers simply disagree over that can: not, say we, in a case like televise; to give it a z would be to suggest that there was the noun or adjective telev - and if you televized something you made it either more like one (in the case of the noun) or just more televvy (in the case of the adjective).

The rest, as Professor Brian Cox might say, is science (sic).


    Friday, 21 April 2017

    The little things of life

    I have mentioned diminutives before; and they're always lurking quite close to the surface when you think about words. In my last post, for example:
    ...bacilli  [Latin baculum  'little staff'; there's that '-ulus/m' again, denoting a diminutive...]
    Spaghetti are little spaghi ["strings"]; cigarettes (and cigarillos) are little cigars. A scintilla is a little piece that's been cut off (from the irregular verb scindere [whose past participle is scissus, recognizable in the English scissors]). Often, their meanings diverge widely from the mother-word: a tabernacle – ultimately from tabernaculum – doesn't have much of an obvious link with a tavern (> taberna); the altar wine doesn't even go in the tabernacle...
    <autobiographical_note>
     (at least not in my day, when catering was easier [just a mouthful for the celebrant]).
    </autobiographical_note>

    The reason for this focus (on diminutives) is a chance reading of the title of an Italian board game: Il gioco dell'oca.  In Italy (and much of the Romance world) they don't have Snakes & Ladders (although Google Translate says that Snakes & Ladders is an English "translation" of Gioco dell'oca.) Un' oca is thought to have derived from the Vulgar Latin *AUCA(M) (the preceding asterisk signifies that the word is not attested, but is the source of other Romance words that require it to have existed).

    On the right is a rather mangled excerpt [cobbled together from the foot of one column and the top half of the next] from the Romance philologist's bible Romanisches etymologisches Wörterbuch. The book was compiled more than a century ago, when the centre of the philological universe was in Germany (Grimm's Law, remember), so it's not a light read. And it says so much about auca, avicella and avicellus that I missed out an elision after the first four lines on avicellus: Section 828 goes on, but my interest ran out after the French oiseau.
    <tangent status="just thrown out there">
    I wonder if Pooh's Woozle owes anything to A.A.Milne's knowledge of Chaucer's ousel... So little time, so many speculations.
    </tangent>
    Anyway, oca means "goose", and there are diminutives in its back-story. But when I first (knowingly, as I imagine I may have come across the word before I saw that Italian board-game) saw the word I wondered whether it might have any connection with the English word ocarina – this odd-looking musical instrument:

    I went to my usual source for this sort of information, Etymonline:
    ocarina (n.)
    1877, from Italian ocarina, diminutive of oca "goose" (so called for its shape), from Vulgar Latin *auca, from Latin avicula "small bird," diminutive of avis "bird" (see aviary).
    My guess was right (though I'm not sure I buy the so-called for its shape. The instrument comes in all sorts  of shapes, but the most common one doesn't remind  me of a goose; perhaps the noise it makes comes into  it).

    Returning to the game, its instructions were in Italian; and I suspect  – my command of Italian is more of a comma – they claimed a millennial origin for the game, though Wikipedia suggests that the author of this pooh-poohs the idea with a rather curt sniff:
    [The games]...are unlikely to have been the same
    Geese figure elsewhere in much language. The rather dated silly goose, cooking someone's goose, wild goose chase...
    <digression theme="goose">
    In my partial soon-to-be-released new vowel book, the *IL* section says this of the expression wild goose chase:
    When Shakespeare put this expression in the mouth of Mercutio (in the first recorded use), he was probably referring to a certain kind of horse-race, with a leading horse being followed by other riders in the V-shape typical of migrating geese. When used today, it refers more directly (although figuratively) to the notion of chasing after wild geese. (It seems to me that this change in meaning may have been influenced, in days when Latin was more widely studied, by an awareness of the fact that a mission to find the solution to a question that has no anser [=Latin, "goose"] was vain; but there is no documentary proof of this – which, I admit, smacks of folk-etymology.)
    </digression>
    ...what's sauce for the goose is sauce for the gander....[I'm not sure where that "good for the goose" in the UsingEnglish version comes from. Both BNC and COCA prefer sauce as a noun in that context {before "for the goose"}... Oh I get it. I was searching specifically for a noun. BNC prefers the noun, with only a single good; but COCA has a much closer balance (indeed, an ABSOLUTE balance, in its corpus – alliteration trumps gastronomy)] Geese certainly get about. But things need doing. Further reflection on ocarinas and goslings will have to wait, sine die.

    b

    PS A clue:
    • Reportedly Spooner's porcine challenge for a sympathetic cure (3, 4, 2, 3, 3)
    Update: 2018.02.03.10:40 –  Added PPS

    PPS: The answer: THE HAIR OF THE DOG

    Tuesday, 28 February 2017

    The price of education

    ... or rather the cost of its omission.

    Some bumf has just plopped onto my doormat  (is any other verb possible, I wonder? – things might thud if they're particularly heavy, but otherwise plop it is)...

    STOP PRESS: BNC and COCA checked

     Yes; they can fall, drop or land
    and lie or be, of course,
    but I was thinking particularly of falling. 

    ... listing donors  to college funds. There is a list showing percentage participation by year of matriculation whatever that is  – presumably percentage of matriculands giving, rather than the percentage (given by each year) of the total given (which, come to think of it, can't be so, as the average for all years since 1942 [before which there are a few odd nonagenarians] is 14%).

    It would only be to be expected that there would be a bell curve, with earlier years tailing off and later years ramping up (as graduates find either their Heavenly reward or their feet, respectively). My own year, 1971, does quite well: since then, only two years have exceeded its participation rate, and one has equalled it:


    But something happened in 1998 (and I think the Chancellor of the Exchequer knows something about it: tuition fees). Since 1998 the average has fallen to single figures; graduates presumably think Pay more? I should cocoa. You've already had N thousand (where N is typically somewhere between 10 and 100 – at a guess; the NUS probably has more exact figures). And that average is raised by the anomalous 2009, when the reported rate is (dubiously?) more than twice the mean.

    Of course, this is a tiny sample, and says nothing  – prima facie – about state funding or its shortfall; but it strikes me, anecdotally, as at least suggestive.
    <autobiographical_note theme="Primary School" relevance="tenuous">
    In the mid-late '50s I met my father on his return from the 2nd Unit photography for No Time to Die. I remember the BOAC bag he was carrying  at Heathrow, where I met him, but not much else; I had just started school.  The film was released in 1958, so I expect the 2nd Unit work was finished in 1957, or even 1956. At the airport I met and shook hands with Bonar Colleano, reaching up from my height of about 4ft.
    The cast and crew list at IMDB credits him as The Pole, which doesn't suggest immense stardom, but I was convinced he was (anachronistically*) a megastar and didn't hesitate to drop his name at school at the earliest opportunity. The first time, there was no sign of recognition. No accounting for the ignorance of SOME people, I thought, and went on to my next name-drop-ee. It took 4 or 5 such attempts for me to get the message that Mr Colleano's was not a name to conjure with.
    <afterthought>
    Perhaps, I have just thought (with the benefit of hind-sight and Wikipedia), that as many of my schoolfellows were Polish (my father had moved to Ealing because  of the film studios, but Ealing was also a magnet for Poles, because the local church had a "Polish Mass" even in those pre-vernacular-Mass ...
    <background>
    This is reminiscent of an issue I discussed a while ago here, explaining about the introduction of  the vernacular after the 2nd Ecumenical Council in 1966, but also discussing the inherent foreignness of familiar Church Latin texts spoken with foreign phonemes.
    </background>
    ...days) maybe their parents sheltered their children from this portrayal of The Pole. More likely, though, he was just a bit-part player who nobody had heard of anyway.
    </afterthought>
    That list also contains the rather enigmatic (APSEUDONYMOUS?) credit
    "Cyril J. Knowles
    ... photography: second unit (as Cyril Knowles)"

    </autobiographical_note>

    Foggy Nomination

    Regular readers may remember Foggies, my award for spectacularly bad writing. As I wrote here
    The idea for the name derives from Robert Gunning's FOG index, although these awards don't restrict themselves only to obstacles to readability measured by that index.
    It goes to Michael Gove for his review of Timothy Snyder's On Tyranny  in The Times of 25 February 2017. The whole thing is worth a read for its consummate display of self-serving doublethink (hoping to atone for his own craven kowtowing to The-Clown-With-The-Orange-Countenance) and obfuscation. But two notable "sentences" are these:
    He compares Trump's behaviour at campaign rallies to the deployment of the SS and brackets Trump's stump speeches with the "shamanistic incantation" [quotes sic, but what does he mean? Hitler's incantation, or the crowd's, or the crowd saying "Hitler"? – probably Hitler leading the crowd; but in what way is this comparable with deployment?] of Hitler. He also compares Trump's attitude to any opposition to Hitler's approach to critics and feelings of fear on the streets of the US today to totalitarian terror in Nazi Germany in the 1930s.
    Phew. Fifty-seven words and but a single resting place for the weary parser. The last thirty-three-word string is a labyrinth of to's (nearly 1 every 7 words – sort that lot out). I finally worked out that the first comparison ends at critics, and the second is between the sadly unparallel feelings of fear and totalitarian terror (whatever THAT is); only the 2nd and 4th to are the comparison sort. Now I'm not a stickler for "compare... with" as some language Nazis are, but since Gove did use with in the first sentence, switching to to in the second is at best mindless elegant variation and at worst an unforgivable attempt to trip the reader up.

    As Sheridan père (I think it was) said (and as I may have quoted before –it being a favourite of mine)

    We write with ease to show our breeding
    But easy writing's curs't hard reading.

    Hmm... That's enough for now. I'd like to see how Gove's writing in this article measures up to his own prescriptions (as Education Minister). But that will have to wait for an update.

    b


    * Collins English Dictionary supplies this usage graph:

    Saturday, 31 December 2016

    Watching and seeing

    A quick reflection on a quirk of collocation (words that go with each other).

    I noticed the other day when my Daughter-in-law-Elect asked  'Have you watched <film_name>?' that here there was a difference between my collocation rules and hers. I  'SEE a film' and 'WATCH a television programme'. I looked in the British National Corpus, and found these results:

    watch a film    number of instances:  7
    see a film        number of instances: 19

    But the sample size is quite small and quite old (100 million words; 1980-93). The larger and more recently-updated Corpus of Contemporary American English (520 million words, 1990-2015) gives these:

    watch a film    number of instances: 20 (a much smaller proportion)
    see a film      number of instances: 61 (a slightly smaller proportion)

    But, as we're looking at American usage in the case of COCA, perhaps these figures are more representative:

    watch a movie   number of instances:  253
    see a movie       number of instances:  293

    And they give a much more evenly-balanced picture.
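To put rough numbers on "proportion" and "evenly-balanced", here is a small Python sketch. It uses only the hit counts and corpus sizes quoted above (100 million words for BNC, 520 million for COCA); the helper names are mine, and the whole thing is a back-of-envelope check rather than a proper corpus analysis.

```python
BNC_SIZE_M = 100    # millions of words, as quoted above
COCA_SIZE_M = 520   # millions of words, as quoted above

# (corpus, noun, "watch a ..." hits, "see a ..." hits, corpus size in millions)
QUERIES = [
    ("BNC", "film", 7, 19, BNC_SIZE_M),
    ("COCA", "film", 20, 61, COCA_SIZE_M),
    ("COCA", "movie", 253, 293, COCA_SIZE_M),
]


def per_million(hits, corpus_size_m):
    """Normalise a raw hit count to hits per million words."""
    return hits / corpus_size_m


def watch_share(watch_hits, see_hits):
    """Fraction of the combined hits that use 'watch' rather than 'see'."""
    return watch_hits / (watch_hits + see_hits)


for corpus, noun, watch_hits, see_hits, size_m in QUERIES:
    print(
        f"{corpus:4} {noun:5} "
        f"watch: {per_million(watch_hits, size_m):.3f}/M  "
        f"see: {per_million(see_hits, size_m):.3f}/M  "
        f"watch share: {watch_share(watch_hits, see_hits):.0%}"
    )
```

The watch share is the figure that matters for the collocation question: it stays at roughly a quarter for film in both corpora, and rises to nearly half for movie.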

    Besides, this generic vocabulary is rather suspect. I think it's likely that people would say 'I have seen The Magnificent Seven n times' or – to quote a primary school colleague of mine, of dubious taste (and no less dubious veracity) – 'I have seen The Guns of Navarone 15 times.' And I don't see how one could frame a corpus query in a way that would catch all such collocations.

    Perhaps, as the speaker who started this hare was a millennial, as they say, this just indicates the age and movie-consumption mores of the speaker. Whereas I and my contemporaries look on movie-going as going to a (big-screen) show (to see a film), younger speakers are more likely to catch their movies on a smaller screen (and perhaps watch a DVD or something streamed, or whatever these young folks do, m'lud). That could account for the much more even COCA figures.

    Anyway, there goes a year of great notability (make that notoriety in some respects). See you on the other side. :-)

    b

    Update: 2017.01.01.15:00 – Added PS

    PS And while we're on the subject of corpora, one of the many retrospective programmes that have been aired in the last week has reminded me of two things:
    • Jeremy Corbyn's use of ram-packed
    • my response to the question What's wrong with Google as a corpus?
    Google reports nearly 17,000,000 results in a search for ram-packed. But Garbage-In-Garbage-Out. Here's BNC's search for *am-packed (as usual, just click on the link and sit back while the corpus does its stuff): spoiler – 21 jam-packed, 1 dream-packed, nothing else.  COCA has a different story: 266 jam-packed, and a single alternative; but that alternative is cram-packed (only three).

    Ram-packed is an interesting neologism. It combines the idea of jam-packed with the idea of people being pushed willy-nilly into a carriage. As of 2017, I'd hesitate to call it a word; but that certainly doesn't mean it will never be. This Google search shows that only 70-odd thousand of those 17 million results link ram-packed with Corbyn. So it's well on the way to... verbitude? Perhaps OED will name it Word of the Year 2017.

    Happy New Year. :-)

    Thursday, 15 September 2016

    Quips and quiddities

    I've been thinking about seasons – specifically about adding an s. Taking as an example the frame

    "a <season> day"

    I thought it was simple (if arbitrary): you can have a winters day or a summers day, but you can't have a springs day or an autumns day.

    But I was forgetting the importance of collocation – what comes next  (in this case).

    I was led to uncover this when I wanted to put numbers on this pattern. BNC doesn't happen to include an instance of a winters day.  And this made me use the "any noun" search syntax; which led me to conclude that the pattern is even more uneven. All season names are much preferred without the s (I've no  idea what its syntactical status is – some kind of possessive, I suppose; stay tuned for an update.)

    By the wonders of BNC, the following seven links all run a BNC query; depending on line speeds, processor speeds, and all sorts of other techy imponderables, you may have to wait a second or two after clicking for the full story to unfold

    winter [*n]  1590  winters [*n] 19 
    spring [*n]  1086  springs [*n] 90 
    summer [*n] 23163  summers [*n] 21 
    autumn [*n]   774...
    <tangent>
    Why so many more cases of summer? About 30 times as many as of autumn, 20 times as many as of spring, 15 times as many as of winter. (Those numbers are wobbly; I didn't use a calculator. I could've just said an order of magnitude greater, but that expression has been sadly debased.) Hmm...  Meanwhile, back at the a <season>s <noun> pattern...
    </tangent>
     ...But autumns doesn't follow the pattern of feast and famine. In that case it's feast and starvation; absolute starvation. The string autumns [*n]  just doesn't occur in BNC. And precious few occur in the much bigger (520 million words – more than five times bigger) COCA; just three (of which one is a mistake, resulting from a mistaken parsing of the word back):
    autumns [*n] 3

    I'm  not sure what this shows, except that when you put numbers on something you end up finding that things aren't quite as clear-cut as they seemed at first.
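Since I admitted to not using a calculator for the tangent about summer, here is a minimal Python sketch that does the sums. The counts are the BNC figures listed above (with autumns entered as 0, since the string doesn't occur there); the dictionary and function names are just my own labels.

```python
# BNC hit counts for "<season> [*n]" and "<season>s [*n]", as listed above.
SINGULAR = {"winter": 1590, "spring": 1086, "summer": 23163, "autumn": 774}
PLURAL = {"winter": 19, "spring": 90, "summer": 21, "autumn": 0}


def summer_multiple(season):
    """How many times more hits 'summer [*n]' has than '<season> [*n]'."""
    return SINGULAR["summer"] / SINGULAR[season]


def singular_to_plural(season):
    """Ratio of hits without the s to hits with it; None if the s-form never occurs."""
    if PLURAL[season] == 0:
        return None
    return SINGULAR[season] / PLURAL[season]


for season in ("autumn", "spring", "winter"):
    print(f"summer vs {season}: about {summer_multiple(season):.1f} times as many")

for season in SINGULAR:
    ratio = singular_to_plural(season)
    label = "no s-form found" if ratio is None else f"{ratio:.0f} to 1"
    print(f"{season}: without s vs with s = {label}")
```

The wobbly mental figures of 30, 20 and 15 survive the check reasonably well.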

    And here are two numbers that have only the very faintest soupspoon of a connection (and even that is arguable – it's just that the number of refugees world-wide can only increase when nation-states take more than their fair share of the planet's resources).

    The first comes from a UNHCR report. CNN reported it like this:

    The second comes from the ONS:

    But the numbers reached out and grabbed my attention by dint of their similarity. I suppose another way of looking at it is that we have reached a tipping point: the number of refugees world-wide now exceeds the population of the UK (and that's before any readjustment that Nicola Sturgeon might have in mind).

    But time's wingéd chariot could do with some 3-in-ONE.

    b
    PS A couple of clues:

    Affect brilliant fedora without an emergency jump-starter (13)
    Such a way of arguing makes him a demon (2, 7)

    Update 2016.09.16.11:45 – Added PPS and PPPS

    PPS And I meant to add, justifying my subject-line (this is more of a quiddity than a quip), earlier this week I heard something of interest to a one-time translator of songs (a bit of background covered in this old post). In Monday's Woman's Hour (the song starts at about 18'30") Kizzy Crawford sang about a pond skater (no, really). She sang in English, but she sang the last chorus in Welsh. One word leapt out at me from the Welsh. Earlier (English) choruses had involved the word lily pad (see? – it really was about a pond skater); and the word that jumped out at me from the Welsh was lily pad.

    I looked this up in an online Welsh dictionary: pad lili. What was up? Was Kizzy fibbing?

    Of course not. She does say, in the interview that precedes the song, that Welsh is more appropriate to singing about nature. And one of the ways it has of being more flexible may be the choice of translation sources.  By chance I found this alternative translation (in Google Translate): lilypad  – note the lack of word-break.

    Marvellous things, dictionaries. But they have their limits.

    PPPS The promised update on syntax: the jury's still out. In some cases, it's obviously a possessive; summer's lease is the metaphorical lease held by, or belonging to, a metaphorical summer. There's an apostrophe, and its aptness is not in question. But in summers day there's no apostrophe* and no sense of possession. I'll ask some teachers.

    Update 2016.09.18.10:45 – Added P⁴S

    Yes – it's an attributive genitive (also called descriptive). Some people omit the apostrophe, and in cases where the idea of possession is weak (or absent) this tendency is stronger.


    Update 2016.09.26.11:15 – Added P⁵S

    *Nonsense – I just misunderstood the BNC's search algorithm's treatment of the apostrophe. I think it's true, however, that the apostrophe, which usually marks possession, is less widely insisted on (and, from the point of view of a language historian, more likely to be dropped – rather like the apostrophe marking omission before words such as bus, cello or phone) when the sense of possession is weaker.

    Update 2016.10.29.15:05 – Answers to those clues: DEFIBRILLATOR and AD HOMINEM.

    Sunday, 7 August 2016

    Highlights

    <rant id="1"  ferocity="intense, but not as strong as that of MrsK">
    The term highlights has a longer history than one might think, given that its meaning today is so closely related to radio or television. Etymonline, glossing over the pluralized version (which it doesn't distinguish as a headword), says

    highlight (n.) 1650s, originally of paintings, "the brightest part of a subject," from high (adj.) + light (n.). The figurative sense of "outstanding feature or characteristic" is from 1855....  Related: Highlights.

    The Collins Online site avoids this dilemma (geddit? LEMMA), even giving it its own frequency graph:



    In the words of a comment I made recently to the Collins Online site:
    The definition 'a selection on the TV or radio of the most important and exciting parts of an event, esp a sporting event' doesn't work any more. To judge by the BBC's coverage of Rio 2016, "highlights" seems to mean "about an hour of celebrity chat, punctuated by very occasional and sparse clips of sports action".

    </rant>

    <rant id="2" ferocity="mild – not even a rant really, just an occasion of vague regret and nostalgia">

    I know I know  I KNOW, this is the way language develops – I've defended so-called "mistakes" often enough in this blog.

    But I'll never say (or write, except here, of course) appeal the decision. The most recent "infraction" ("He only does it to annoy, because he knows it teases") was probably to do with drug cheats before Rio. I compared this construction (and the version we dinosaurs still use, with a preposition and no direct object) in the British National Corpus and in its American analogue COCA.

    The search appeal against the [n*] (by the magic of BNC, you can just click on that link) occurs 136 times in BNC. Meanwhile, appeal the [n*] occurs only 36 times: the version with the preposition outnumbers the newcomer about 4:1. (That word newcomer suggests a possible PhD study: "The usage of non-traditional grammatical forms – an age-related study". That would put some numbers on what to me at least is a self-evident truth: as language develops over time, the trail-blazers are the young.)

    In COCA, unsurprisingly (although possibly the extent of the preponderance [nearly 20:1] is a bit of a surprise), the relative weights are reversed: the American English strong preference is for appeal the [n*]. (Given the state of the hedge I must leave the workings as an exercise for the reader.)
    </rant>

    b

    Monday, 21 March 2016

    Ex unibus plurum - "wuggen" revisited

    In my last post I stumbled, in passing, on an idea; some of you may have noticed the "hmmm". I thought it might take the form of an update, but it has turned out to be a bit more substantial than that. The idea was to quantify the different ways English has of forming a plural. When I wrote this post nearly two years ago it didn't occur to me to wonder. I was content to say
    English has lots of ways of pluralizing a noun – no change (sheep, fish...), change -us to -i (radius → radii...), add -en (ox → oxen [or do something else involving '-en' {child → children, brother → brethren...}]), change -ex or -ix to -ices (matrix → matrices) etc, but by far the most common device is to add an s (though this simple idea hides several options [/s/ {rabbits}, /z/ {gardeners}, /ɪz/ {radishes}]). What is the word for 'more than one wug'? Wugs, of course, with /gz/.
    It's fairly obvious to a  native speaker that the most common way is to  add an s. In fact, this rule becomes apparent whenever a young language learner mistakenly adds an s to an irregular plural – sheep becomes sheepS rather than sheep, for example, and when an adult corrects mouseS to mice, the compulsion to keep faith in the add-an-s rule is so strong that the next attempt is quite often miceS.

    But I wondered how I could put a number on that. The obvious source of data seemed to me to be the British National Corpus, though it is relatively small, at a mere 100,000,000 words. Some of the publicly available corpuses...
    <I_know_ I_know subject="corpora">
    There are people who say that corpora is the "correct" plural; some readers may have had the misfortune of being taught by someone who believed so; Firefox is trying to correct my spelling. The latinate plural is not wrong, but I adhere to Fowler's belief (in The King's English)
     ...that all words not English in appearance are in English writing ugly and not pretty, and that they are justified only (1) if they afford much the shortest or clearest, if not the only way to the meaning ... or (2) if they have some special appropriateness of association or allusion in the sentence they stand in.
    Elsewhere (maybe Modern English Usage) he gives the advice that you're less likely to make an embarrassing mistake (like mistaking a latinate -us word for a second declension noun instead of a 3rd [such as corpus] or 4th declension one [syllabus, for example], and giving it an -i ending), and more likely to be understood, if you use a native English s plural ending whenever it's possible.
    </I_know_ I_know>
    ... have many more.

    It is possible in principle to construct a query that requires a search engine to return all the nouns in it that end with a certain string. But accompany me, if you will, in a thought experiment. Suppose for the sake of argument that in any text in the corpus the percentage of plural nouns is N%.
    <back_of_fag_packet> 
    A few examples, followed by the count of plural  nouns:
    1. The cat sat on the mat.                               N=0 
    2. There is a tide in the affairs of men...      N=1
    3. Softly softly catchee monkey.                   N=0         
    4. The wages of sin is death.                         N=1
    5. When shall we three meet again?            N=0
    6. Honey I shrunk [sic] the kids                    N=1
    7. Where have you been all day?                 N=0
    In this mini-corpus (perhaps I should make that nano-corpus) there are 3 plural nouns out of a total of 50 words. They're not too common, plural nouns; 6% in this case, though in for example a recipe book the figure would be much higher. In BNC, that would be 6%  of 100,000,000 – 6,000,000. 

    This is admittedly  a VERY dodgy sample; but my point is that even a tiny value for N leads to a big number in a corpus such as BNC. 

    </back_of_fag_packet>
    At the British National Corpus I asked for all the plural nouns that end -s. This would catch a few non-standard plurals, like indices or theses; but those would add up to no more than dozens, or hundreds at most, among millions. But the query timed out after finding the first 7500 distinct words (the most common of all was things at 40,453 – a clear 11,000 ahead of the field), by which stage the search had only worked its way down to words that had a total of 27 hits.  For comparison, in plural nouns ending -n, the search worked its way down to 23 (there was  no 24, 25, 26 or 27) after listing about 96% of all possible hits. Extrapolating from that we can estimate that if a search reaches 27 after 4.5M hits there will be a total of something like 5,000,000 (N = 5 – so my fag packet calculation wasn't too far out).
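    The extrapolation above can be sketched in a few lines of Python. The inputs are the figures quoted in this post (4.5M hits seen, about 96% coverage in the -n comparison, a 100,000,000-word corpus). The function names are hypothetical, and the 90% coverage figure is purely my assumption, added because the -s search stopped at 27 hits rather than 23 and so probably covered a bit less than the -n search did.

```python
BNC_WORDS = 100_000_000  # corpus size quoted above


def estimate_total(hits_seen, coverage):
    """Scale up a partial hit count, given the share of all hits it is assumed to cover."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return hits_seen / coverage


def as_percent_of_corpus(hits, corpus_words=BNC_WORDS):
    """Express a hit count as a percentage of all words in the corpus (the N above)."""
    return 100 * hits / corpus_words


# Back-of-fag-packet figure: 3 plural nouns in 50 words, scaled up to BNC.
fag_packet_total = 3 * BNC_WORDS / 50

# Extrapolation from the timed-out search: 4.5M hits seen.
# 0.96 is the coverage observed in the -n comparison;
# 0.90 is an assumed, more cautious figure (not from the post).
for coverage in (0.96, 0.90):
    total = estimate_total(4_500_000, coverage)
    print(f"coverage {coverage:.0%}: about {total:,.0f} hits, "
          f"N = {as_percent_of_corpus(total):.1f}%")

print(f"fag packet: {fag_packet_total:,.0f} hits, "
      f"N = {as_percent_of_corpus(fag_packet_total):.1f}%")
```

    Either way the estimate lands between roughly 4.7 and 5 million, which is why "something like 5,000,000" seems a fair round number to set beside the fag packet's 6,000,000.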

    I've crunched some numbers, thinking at first in terms of some pretty pie charts. But the difference between -s plurals and all the others was so great that pie charts wouldn't be very interesting: most non-s endings would get a tiny (often nearly invisible) sliver. I've shown my working here (none too legibly I'm afraid):

    More legible version

    In fact, rather than a pie chart, a more helpful image would be a clock-face. The sector occupied by all non-s plurals added together would be the area between 12 o'clock and about 4 minutes to. The only families of non-s plurals that would account for more than a minute or two would be irregular English plurals of all kinds (folk, men, children, feet, teeth, ...) and Latin plurals – mostly ending with -i, but sometimes ending with -a, or -e; the few Latin plurals that end -ūs (in Latin, as for example syllabus does) are of course lost among the -s endings – if there are any in BNC.
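    The clock-face image is easy to play with in code. This is a minimal Python sketch, assuming only that a full circle is 60 minutes. The function names are hypothetical, and the 4-minute figure is the rough one given above, not a number read off the spreadsheet.

```python
MINUTES_ON_CLOCK = 60


def minutes_to_share(minutes):
    """Convert a sector of the clock face (in minutes) to a percentage of the whole."""
    return 100 * minutes / MINUTES_ON_CLOCK


def share_to_minutes(percent):
    """Convert a percentage of all plurals to a sector of the clock face (in minutes)."""
    return percent * MINUTES_ON_CLOCK / 100


non_s_share = minutes_to_share(4)  # the "about 4 minutes to" sector
print(f"non-s plurals: about {non_s_share:.1f}% of all plurals")
print(f"-s plurals: about {100 - non_s_share:.1f}%")
print(f"a family with 1% of plurals would get {share_to_minutes(1):.1f} minutes")
```

    On that reckoning a family of plurals needs well over 1% of the total before it earns even a single minute on the dial, which is why most of the slivers would be nearly invisible.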

    There. There are some numbers. I may try a similar trick on another corpus; on the other hand I may get on with #WVGTbk2. 

    b

    Update 2016.03.21.14:30 – Added PS

    PS  Here's a clue:

    Stubborn – gathering information on the way (12)

    Update 2016.03.23.14:50 – PPS

    Added link to spreadsheet.


    Update 2016.04.25.11:35 – PPPS

    PPPS Time. The answer to that clue: INTRANSIGENT

    Update 2018.06.10.10:25 – A few typo fixes

    Update 2018.10.23.14:05 – Updated linked spreadsheet (but left old screen-grab as is – I'm sure anyone who's interested in the figures will look in the spreadsheet anyway).

    Sunday, 24 January 2016

    Shameless plug

    A choral singer knows he's getting on when, as for me this term, the next concert includes three choral pieces all of which he's sung before with another choir or choirs.

    The first is Vivaldi's Gloria, which I've sung twice before, once with Reading Haydn Choir about 20 years ago, and once when I was driving my son to a concert arranged by a fellow barbershop singer (who was choir master at his local church). As I knew the piece, I became a singing chauffeur.

    The other two pieces involve a setting of Psalm 110, once entirely (Handel's Dixit Dominus) and once as one of several texts in Mozart's Vesperae Solennes de Confessore. When I first sang the Handel, at the first rehearsal, somebody asked me what the opening words of Dixit Dominus meant. One word in the opening sentence was new to me, so I could only say 'The lord said to my lord "Sit on my right, until I do something jolly unpleasant to your enemies."'

    The unknown word  was scabellum* – a footstool. The something jolly unpleasant was turning them into footstools (although I imagine there was an element of metaphor here  – I don't think trans-substantiation was involved).

    The word 'until' seems a bit odd. Does the first lord – the speaker – mean that  the second lord can only occupy the favoured position until the enemies turn up and suffer enscabellation – thereafter to sit somewhere else (on the enemies, perhaps)? But donec, when followed by a subjunctive, usually does mean until. The bible translations listed here all use until or till, with a small handful of exceptions. Only two translate it as while, in which case donec would usually be followed by an indicative (not ponam but pono). Food for thought. But not today – I'm neglecting the cricket.

    Suffice it ...
    <digression> 
    I refer readers to an old discussion, in the UsingEnglish forum, where I explained:
    The fossilized phrase 'Suffice it to say' means 'let it be sufficient to say'; a more modern idiom is 'Enough said' - but, unlike 'suffice it to say', this follows the thing said: 'I shouldn't have done it. I'm sorry. Enough said'.

    You'll have noticed that I keep saying 'Suffice it to say'. This uses the subjunctive, which is hardly used in informal British English. And as both 'it' and 'to' are unstressed in that phrase, they are easily heard as a single /t/ followed by a schwa - particularly by habitual non-users of the subjunctive. This form [HD clarification: the ITless form] is widely used, and has become almost as common as the fuller form: BNC has 53 instances of 'suffice to say' and 88 of 'suffice it to say'.

    In COCA, on the other hand, which is based on N. American usage, has [HD correction: 'there are' (I may have meant háy)] 376 (377 if you include 'sufficeit to say', of which there is a single instance which I found by accident), and only 97 of 'suffice to say'. And that balance makes sense, considering the relative strength of the subjunctive in American English.
    Anyway, I'm an IT-man. 
    </digression>

    ... to say that you should put Saturday 2nd April, 2016 at 7.30pm in your diary. (More details of the concert here.)


    Tales from the word-face

    My Android system's latest exploit in the matter of spelling corrections involves a Character Entity expressed in the Named Entity Syntax (and if you really want to know what all that means, pick the bones out of this). My HTML code makes occasional use of &nbsp; – a non-breaking space (for use when you want to keep a space between two words but keep them on the same line).

    If I used it often enough I'd tell the spell-checker to add it to my dictionary. But for now, whenever it sees "nbsp" it asks me if I'd prefer to use "tbsp", which sounds like the sort of Character Entity that'd come in useful for writers of recipe books.
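    For the curious, Python's standard library knows about named character entities, so the difference between nbsp and tbsp can be shown in a few lines. This is only an illustrative sketch using the standard html module; the sample string is made up.

```python
import html
from html.entities import name2codepoint

# "nbsp" is a real named entity: code point 160 (U+00A0, NO-BREAK SPACE).
print(name2codepoint["nbsp"])  # 160

# "tbsp", sadly for writers of recipe books, is not.
print("tbsp" in name2codepoint)  # False

# Unescaping turns the entity into the actual no-break space character.
sample = html.unescape("2&nbsp;tbsp")  # made-up sample string
print(repr(sample))
print(sample == "2\u00a0tbsp")  # True
```

    So the spell-checker's suggestion would swap a working entity for a string that no browser would recognize as one.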

    b

    PS Another clue:
    Landlubbers' haven in heavy swell (in case of bowel-movement) (5)

    Update 2016.01.27.12:15 – Added PPS

    PPS
    I've been thinking about the until/while problem mentioned in the fifth para. To recap: the Latin text has Donec ponam (="until I put"), not Donec pono (="while I put"). "Until I put" involves the first 'Lord' (the speaker) in some rather strange reasoning, making the sitting at the right hand only a temporary (pre-enscabellatory) position (which I suppose I should gloss as meaning lasting only until the end of the turning-into-a-footstool [sorry about these unfeeling neologisms, but scabellum is too good a word not to have any derivatives in English]). So why is ponam not pono – unless, of course, St Jerome (or one of his predecessors) got it wrong (when translating from David's [or someone's – Wikipedia has a rather ominous "although his authorship is not accepted by modern Bible scholars"] Hebrew)?

    It would take a Hebrew scholar to take this further (and I'm working on that), but I suspect that Hebrew has a way of expressing temporal and/or conditional relations in a way that does not fit in with the Latin way – so that neither "until" nor "while" really does the job. Hmmm...

    Update 2016.01.27.15:05 – esprit d'escalier in blue.

    Update 2016.02.05.10:15 – Added PPPS


    PPPS

    When, in last night's rehearsal, we broached (and on occasion breached) the Magnificat, I was reminded of last summer's post, My soul doth magnify the problem – particularly this bit:
    ...the words of the Magnificat reminded me of a confusion that keeps cropping up in the life of a choral singer. In the text that that link points to you'll see in the third line of the Latin exultavit, translated in the English as "hath rejoiced". But later on the word exaltavit appears, translated in the English as "hath exalted". 
    Italianate pronunciation of Latin now gets involved. Listen to this YouTube clip; the relevant word starts occurring from about 30 seconds in, and is repeated as often as Vivaldi chooses. When this vowel (not unlike the English /ʌ/ phoneme – the one that occurs in, for example, "exulted", although it is closer to [ɑ] {Update note: this is an IPA transcription}) is heard by a strictly Anglophone ear, confusion arises....
     Last summer's post   
    Update 2016.02.06.16:40 – Added P⁴S
    P⁴S Another clue:

    Surfeit of promissory notes – hateful (6)

    Update 2016.02.10.09:15 – Added concert poster.

    Update 2016.03.24.14:40 – Added footnote, and crossword solutions.

    * By chance, flicking through a dictionary looking for something else (the kind of serendipitous Aha-provoking discovery that doesn't happen with an online dictionary – excepting artificial things like Word-of-the-day), I found that Spanish (and indeed Catalan, Provençal, Italian, French etc, I've since determined [courtesy of the wonderful Meyer-Lübke – which I've mentioned before] all have similar words) has the word escabelo. Spanish also has a quite charming metaphorical use for escabelo (which is, on weekdays, "a little stool"); in its Sunday best, figurative, use it is a "stepping stone". Life really is just one digression after another.

    Solutions: BELOW and ODIOUS.