I heart hangry bagel droids (or: How new words form)


You’re probably familiar with the old adage “the only thing that’s constant is change.” Still, so many people tend to think about language as a relatively fixed affair. I’ve said it before (and will inevitably say it again): all living languages change all the time, and at all levels – phonological (sounds!), morphological (word-bits!), lexical (words!), syntactic (clauses!), and semantic (meaning!).

Historical linguistics (also known as diachronic linguistics) is the study of how and why languages change over time. In this post I’m going to discuss categories of change at the morphological and lexical levels – how new words come into being. In the future, I’ll explore semantic and perhaps phonological change.

Without further ado, here are the main mechanisms of word formation. Almost all examples are for English, but these formation types apply to other languages as well. (NOTE: Processes are not mutually exclusive. It is quite possible for a word to undergo multiple processes simultaneously, or one on the heels of another.)

  1. Derivation

New words are born by adding affixes to existing words. Affixes are bound[1] morphemes that can be prefixes, suffixes, and even (for certain languages, although not really for English) infixes and circumfixes. Derivation is a very common process cross-linguistically.

Zero derivation (also known as conversion) is a special case where a new word, with a new word class (part of speech), is created from an existing word of a different class without any change in form.

Examples:
(Derivation) hater [hate + -er], truthiness [truth + -i (-y) + -ness], deglobalization [de- + globalization], hipsterdom [hipster + -dom]

(Zero derivation) heart as verb, as in “I heart coffee” [heart as noun]; friend as verb, as in “he friended me on Facebook” [friend as noun]; green as noun, in the golf lawn sense [green as adjective]; down as verb, as in “Hector downed a beer” [down as preposition]

  2. Back-formation

This process creates a new word through the removal of true or incorrectly assumed affixes. It’s kind of the opposite of derivation. This one is easier to explain through examples:

| New word | Derived from older word | Analysis |
| --- | --- | --- |
| donate, automate, resurrect (verbs) | donation, automation, resurrection (nouns) | The nouns were borrowed into English first from Latin. The verbs were back-formed later by discarding the -ion suffix, which speakers did through analogy with other Latinate verb and (-ion) noun pairs that already existed in English. |
| pea | pease | The older form was initially a mass noun (like water or sand), but was reanalyzed as plural. People then dropped the “plural” -s(e) to form the “singular” count noun pea. |
| beg, edit, hawk (verbs) | beggar, editor, hawker (nouns) | Speakers mistook the -ar, -or, and -er on the ends of these nouns (respectively) for the agentive suffix (that did/does exist in English), and removed it to form corresponding verbs. |
| lime-a-rita, mango-rita; appletini, kiwini | margarita; martini | Actually examples of folk etymology, which is related to back-formation. Here, speakers incorrectly assumed that -rita in margarita and -(t)ini in martini were separate morphemes (indicating the class of cocktail). Under that assumption, they switched out the rest of the word and substituted it with morphemes indicating new twists/ingredients. |

  3. Blending

Also known as portmanteaus. Blends are produced by combining two or more words, where parts of one or both words are deleted.

Examples: smog [smoke + fog], brunch [breakfast + lunch], infomercial [information + commercial], bromance [bro + romance], hangry [hungry + angry], clopen [close + open][2]
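For the programmers in the audience, here is a minimal sketch of blending as string surgery. The function name `blend` and the cut points are my own illustration: each blend keeps the start of the first word and the tail of the second, but where to cut is chosen by hand for each pair. Nothing in this code predicts where speakers will make the cut.

```python
def blend(first: str, second: str, keep: int, skip: int) -> str:
    """Join the first `keep` letters of `first` to `second` minus its first `skip` letters."""
    return first[:keep] + second[skip:]


# Hand-picked cut points for the examples above (an assumption, not a rule).
examples = [
    ("smoke", "fog", 2, 1),
    ("breakfast", "lunch", 2, 1),
    ("information", "commercial", 4, 3),
    ("bro", "romance", 3, 2),
    ("hungry", "angry", 1, 0),
    ("close", "open", 2, 0),
]

for first, second, keep, skip in examples:
    print(f"{first} + {second} -> {blend(first, second, keep, skip)}")
```

Note that some blends keep one of the source words whole (bro in bromance, angry in hangry, open in clopen), while others trim both.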

  4. Borrowing

Also known as loan words. These are expressions taken from other languages. Pronunciation is usually altered to fit the phonological rules of the borrowing language.

Examples: algebra [from Arabic], ménage à trois [from French], whisky [from Scots Gaelic or Irish], bagel [from Yiddish], doppelgänger [from German], karaoke [from Japanese]

  5. Coinage

Words can be created outright to fit some purpose. Many of these are initially product names.

Examples: Xerox, Kleenex, Jell-O, Google, zipper, Frisbee

  6. Compounding

Two or more words join together to form a compound. Frequently the joining words are nouns, but they can belong to different parts of speech, including verbs, adjectives, prepositions, etc. The parts of a compound can be separated by spaces, joined by hyphens, or glued to each other with nothing intervening.

Examples: homework, grocery store, mother-of-pearl, first world problem, binge-watch, weaksauce, fake news

  7. Eponyms

These are words that derive from proper nouns – usually people and place names. If a proper noun is used frequently enough and across multiple contexts, it eventually becomes a common noun (or verb or adjective).

Examples: sandwich [after the fourth Earl of Sandwich], gargantuan [after Gargantua, name of the giant in Rabelais’ novels], boycott [after Capt. Charles C. Boycott], mesmerize [a back-formation from mesmerism, in turn after Franz Anton Mesmer], sadism [after the Marquis de Sade]

  8. Reducing

Several types of reducing processes exist.  The main ones are clipping, acronyms, and initialisms.

a. Clipping

New words can be formed by shearing one or more syllables off an existing longer word. Syllables can be removed from the word’s beginning, end, or both.

Examples: fax [facsimile], flu [influenza], droid [android], fridge [refrigerator], blog [weblog]

b. Acronyms

Words are created from the initial letters of several other words. Acronyms are pronounced as regular words (in contrast to initialisms below).

Examples: NASA [National Aeronautics and Space Administration], RAM [random-access memory], FOMO [fear of missing out]

c. Initialisms

Also known as alphabetisms. As with acronyms, a word is created from the initial letters of other words, but the resulting term is pronounced by saying each letter. This usually happens when the string of letters is not easily pronounced as a word according to the phonological rules of the language.

Examples: NFL [National Football League], UCLA [University of California, Los Angeles], MRI [magnetic resonance imaging], WTF [what the fuck]
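Acronyms and initialisms are built the same way on paper; only the pronunciation differs. Here is a rough sketch of the letter-picking step. The function `initial_letters` and its `skip` parameter are hypothetical names of mine, and the skip lists are chosen by hand to match the attested forms.

```python
import re


def initial_letters(phrase: str, skip: tuple = ()) -> str:
    """Return the upper-cased first letter of each word in `phrase`, ignoring words in `skip`."""
    words = re.findall(r"[A-Za-z]+", phrase)  # hyphens and commas count as word breaks
    return "".join(w[0].upper() for w in words if w.lower() not in skip)


print(initial_letters("National Aeronautics and Space Administration", skip=("and",)))
print(initial_letters("random-access memory"))
print(initial_letters("fear of missing out"))  # "of" contributes a letter here
print(initial_letters("University of California, Los Angeles", skip=("of",)))  # but not here
```

The last two calls show why `skip` has to be supplied per term: whether a little word like "of" contributes a letter is a matter of convention, not something the code can work out.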

  9. Reduplication

Reduplication is one of my favorite phenomena.[3] It’s a process whereby a word or sound is repeated or nearly repeated to form a new word/expression. This is a productive morphological process (meaning, it’s part of the grammar and happens frequently and rather systematically) in many languages – South-East Asian and Austronesian languages particularly (e.g. Malay, Tagalog, Samoan). It’s not an especially productive process in English, although it does still happen.

Examples:
(English) wishy-washy, teensy-weensy, goody-goody, cray-cray, po-po

(Samoan) savali [‘he travels’ – third person singular + verb]; savavali [‘they travel’ – third person plural + verb]
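One way to describe the savali/savavali pair is "copy the second-to-last syllable". The sketch below makes that concrete under two assumptions of mine: that the pattern really is penultimate-syllable copying, and that every syllable is a simple optional-consonant-plus-vowel unit. The function names are made up for illustration, and the English-style full repetition is included for contrast.

```python
import re


def syllables(word: str) -> list[str]:
    """Split a word into simple (C)V syllables: an optional consonant plus one vowel."""
    parts = re.findall(r"[^aeiou]?[aeiou]", word)
    if "".join(parts) != word:
        raise ValueError(f"{word!r} does not fit the simple (C)V pattern")
    return parts


def reduplicate_penultimate(word: str) -> str:
    """Partial reduplication: copy the second-to-last syllable (sa-va-li -> sa-va-va-li)."""
    parts = syllables(word)
    if len(parts) < 2:
        raise ValueError("need at least two syllables")
    parts.insert(-1, parts[-2])  # place the copy just before the final syllable
    return "".join(parts)


def reduplicate_full(word: str) -> str:
    """Full reduplication: repeat the whole form, joined by a hyphen."""
    return f"{word}-{word}"


print(reduplicate_penultimate("savali"))
print(reduplicate_full("cray"))
```

Near-repetitions like wishy-washy or teensy-weensy change a vowel or consonant in the copy, so they would need more than this sketch offers.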

* * * * *

Phew! Since hopefully you can see the light at the end of this long lexical tunnel, I’ll mention that of course languages lose words as well. Diverse factors motivate word loss, but that’s a subject for another post. A few quick examples of words that have fallen out of favor in English:

pell-mell [in a disorderly, reckless, hasty manner]; davenport [couch/sofa – my grandma used to say this]; grass [for marijuana – my mom still says this]; porridge [an oatmeal-like dish boiled in water or milk]; tumbrel [a farmer’s cart for hauling manure]; fain [gladly or willingly]

* * * * *

And now… ADD WORDS TO THE SPREADSHEET – Word shenanigans!

I’ve got almost 200 in there to start us off. If you’re not sure about the process for any particular word, just leave it blank or take a guess. Free bagel droids[4] to all who contribute.

 

[1] Bound meaning they cannot exist on their own, but must be attached to another morpheme.

[2] Describes a shitty situation where one has to work a closing shift followed by an opening shift. We used this term as bartenders, although I’d never seen it in print until recently. It came up in some paperwork I had to sign relating to work week ordinances, and then I saw it here as well.

[3] Some languages even have triplication – where the sound/word is copied twice!

[4] Kidding! These do not exist outside of my head. Sorry.

Literally cray: A linguist’s attitude toward speech errors and slang


In a recent Lyft Line, it surfaced that the other rider in the car with me also had a linguistics background. Our driver was a non-native English speaker (from his accent maybe Russian) – although his English was pretty fluent. As he was deciding whether to make a left turn at a chaotic, construction-clogged intersection, he stuttered a bit and said, “well, it’s not not allowed”. Then, making the turn, he followed that with, “oh boy, and making these language mistakes with two linguists in the car…” The driver was assuming, as many do, that we would be more critical than the average person of said language “mistakes”.

First off, the driver’s statement wasn’t even a real speech error. Although slightly harder for us to process cognitively because of the two negatives, “it’s not not allowed” is in fact a perfectly grammatical sentence of English. A similar utterance might be said that avoids the duplicated not: “it’s not illegal”, for example. But what’s going on here is this:

It’s [not [not [allowed]₁]₂]₃.

Between each opening and closing bracket is a structural unit, called a constituent in syntax. (The sentence as a whole is also a constituent, but I didn’t want to blind you with brackets.) So, allowed by itself is a constituent (subscript 1). The inner not negates allowed; together they’re a constituent (subscript 2). The outer not negates not allowed, and becomes a larger unit of its own (subscript 3). In the end, this structure has a very nuanced meaning – more nuanced than just it’s not illegal – which is something like, “this action is not necessarily encouraged and may even be frowned upon, but it’s not against the law”.
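If it helps to see the bracketing mechanically, here is a small sketch that reads a bracketed string and lists each constituent from the innermost outward. The function name `constituents` and the plain-bracket input format (no subscripts) are my own simplifications; it assumes the brackets are balanced.

```python
def constituents(bracketed: str) -> list[str]:
    """List the bracketed units of a string, in the order their brackets close."""
    open_positions = []  # for each open bracket, how many words preceded it
    words = []
    found = []
    for token in bracketed.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            open_positions.append(len(words))
        elif token == "]":
            start = open_positions.pop()
            found.append(" ".join(words[start:]))  # everything since the matching "["
        else:
            words.append(token)
    return found


for number, unit in enumerate(constituents("[not [not [allowed]]]"), start=1):
    print(number, unit)
```

The numbers printed line up with the subscripts in the example: the bare word first, then each not wrapped around what came before.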

Second, even if the driver had made a speech error, linguists as a group are much less inclined to judge than the average person. There is a prevalent misconception that linguists and English teachers are siblings in a “grammar nazi” family. This is untrue. Indeed, just as biologists thrill at discovering some new mutation in a species, linguists are generally delighted by speech errors and seek them out as important material to study; they give vital insights into how human language and the human brain function.

It shouldn’t come as a surprise, then, that a couple of my colleagues and I have had fun collecting both native and non-native English speech errors we’ve encountered over the past year. Here is a sample:

| Actual speech | Intended speech | Speaker’s native language | Type of error |
| --- | --- | --- | --- |
| “thinking loudly” | “thinking out loud” | Farsi | Idiom |
| “cross the finger” | “fingers crossed” | Farsi | Idiom |
| “stepping over their toes” | “stepping on their toes” | Farsi | Idiom |
| “thank you for fast react” | “thank you for the fast reply/response” | Korean | Dropping definite article; Wrong word |
| “confusication” | probably “confusion” or “miscommunication” | Hindi | Blend |
| “decrepit rules” | “deprecated rules” | English | Wrong word |
| “laids norm” | “Lord’s name” | English | Metathesis[1] |
| “my tights are hip” | “my hips are tight” | English | Metathesis |

 

Of major relevance to the speech attitudes topic are two concepts, flip sides of a coin: descriptivism and prescriptivism.

Descriptivism is an approach that attempts to objectively describe actual language usage, as well as speakers’ basic and intuitive linguistic knowledge. From several centuries of descriptive investigation, researchers have concluded that all languages and dialects are complex and rule-governed. No clearly superior or inferior languages/dialects exist.[2] The judgements we, as members of a society, have about a particular language or dialect are inextricably influenced by sociological factors.

Prescriptivism, on the other hand, is a process which attempts to prescribe, subjectively, what should happen in language. You are familiar with this from years of English/grammar classes and from style guides mandating rules for spoken and written language. What you may not know is that many of these rules are arbitrary, based on personal taste and accidents of history.

A few of the most common “rules” that persist today are actually confused English misappropriations of Latin by pompous old men playing king-of-the-intellectual-castle games. One example is the rule against preposition stranding, which dictates: Do not separate a preposition from its noun, leaving it at the end of a clause. Say “To whom did you talk?” instead of “Who did you talk to?” Seventeenth-century poet John Dryden made this up (misapplying Latin, where preposition-like pieces attach to nouns and truly cannot separate from them) in order to disparage the work of Ben Jonson. Other examples include the predicative nominative, split infinitives, and the count–mass noun distinction (less vs. fewer).

English teachers are not alone in their prescriptivist tendencies. People generally are rather opinionated about language. Certain “errors” even become so despised as to prompt real-world action. Take the word literally. A New York City bar now has signage banning its use and warns that offending customers will be kicked out. Countless online articles and forums bemoan the word’s ubiquity with the rationale that speakers are using it to mean its opposite (figuratively). A bit of history and context, however, lends perspective.

Literally has been used as figuratively, or more precisely, as an intensifier, for over 300 years. Such literary greats as Charles Dickens, Mark Twain, and James Joyce (among others) have used it in this emphatic way. And the adverb’s paradoxical plight is shared by a whole cast of terms, known as auto-antonyms. Interestingly, none of the other English auto-antonyms get the attention that is lavished on “literally”.

Now that I’ve outlined descriptivism and prescriptivism, I would like to add two final clarifications. First, being a descriptivist does not mean throwing out the idea of spelling conventions, or tossing aside standard education. Linguists of course recognize the utility of teaching standardized writing and speaking for particular contexts (school, job, etc.) for purposes of clarity, versatility, and social mobility. Language is rich and its uses are necessarily multifaceted.

All of the above also does NOT mean specific words or expressions or ways of speaking never make linguists cringe. (Enjoy that double negation?) We’re human after all. Despite knowing the full historical and linguistic context of “literally”, I still grind my teeth hearing it many times in succession. I have other personal struggles with clippings (cray, totes, obvi) as well as with internet chat-cum-speech acronyms and initialisms (lol, idk, wtf, omg). Simultaneously, I view them as fascinating lexical change phenomena. And I never take my individual tastes to mean that the language is somehow “degrading”. Languages don’t degrade; they change, and have been changing ever since our ancestors began to talk. If not for such constant metamorphosis, we wouldn’t have the enormous linguistic diversity – the thousands of languages and dialects – that exists today.

 

[1] Where sounds, syllables, or words are switched.

[2] It has been an oft-repeated creed in linguistics over the last few decades to make the stronger claim that “all languages are equal”. However, the statement has not been scientifically proven, as researchers have not yet determined the precise criteria by which languages are to be measured, much less figured out how to measure and compare such enormous complexity. This thought-provoking topic will be the subject of at least one future post.