Sign language, ASL, and baby signing

September was National Deaf Awareness Month. I tried to post this piece before the month ended, but alas! Better late than never. I’d like to discuss and dispel some of the many misconceptions around signed languages. Here are a few of the most common:

  • Sign language is universal – there is only one
  • Sign languages are not “real” languages
    • They’re simpler and easier to learn than spoken languages; they’re just gestures, or body language, or pantomime
    • They’re not as complex as spoken languages; they don’t have true grammars, or large vocabularies, or the ability to express abstract concepts
    • They were “invented”; they didn’t evolve naturally among communities over time
    • They have to be explicitly taught; they cannot be acquired naturally by children through exposure as with spoken language
  • Sign languages are the visual equivalent of spoken languages – for example, American Sign Language is the visual equivalent of English

I’ll also spend some time discussing “baby sign language” (which is of personal import due to last year’s arrival of my very own teacup human).

Sign languages

Sign languages are natural languages whose modality is visual and kinesthetic instead of speech- and sound-based. They exhibit complexity parallel to that of spoken languages, with rich grammars and lexicons. Sign languages developed and are used among communities of deaf people, but can also be used by hearing individuals. These languages are not composed solely of hand movements. A good deal of their prosody, grammar (e.g. syntax, morphology), modification (adjectives and adverbials), and other features are expressed through head movements, facial expressions, and body postures.

American Sign Language (ASL)

American Sign Language (ASL) is the main language of Deaf communities in the U.S. and Canada. Contrary to what many assume, ASL is not grammatically related to English. From Wikipedia:

“On the whole […] sign languages are independent of spoken languages and follow their own paths of development. For example, British Sign Language (BSL) and American Sign Language (ASL) are quite different and mutually unintelligible, even though the hearing people of the United Kingdom and the United States share the same spoken language. The grammars of sign languages do not usually resemble those of spoken languages used in the same geographical area; in fact, in terms of syntax, ASL shares more with spoken Japanese than it does with English.”

ASL emerged in the early 1800s at the American School for the Deaf in Connecticut, from a mix of Old French Sign Language, village sign languages, and home signs. ASL and French Sign Language (LSF – Langue des Signes Française) still have some overlap, but are not mutually intelligible.

One element of ASL that I find particularly neat is its reduplication (repetition of a morpheme or word to serve a particular grammatical function, like plurality)[1]. Reduplication is a common process in many languages, and it performs several important jobs in ASL. It does things like pluralize nouns, convey intensity, create nouns from verbs (e.g. the noun chair is a repeated, slightly altered version of the verb to sit), and represent verbal aspects such as duration (e.g. VERB + for a long time).

Baby sign language

What is “baby sign language”? I haven’t found a very precise definition. The term seems to basically describe signing between (hearing) parents/caregivers and young children, but whether the signs come from a legitimate sign language like ASL, or are invented idiosyncratically by the family using them (and are maybe more iconic[2]), or some combination of the two, varies from source to source.

Anthropologist-psychologist Gwen Dewar, on her blog parentingscience.com, says:

“The term is a bit misleading, since it doesn’t refer to a genuine language. A true language has syntax, a grammatical structure. It has native speakers who converse fluently with each other. By contrast, baby sign language […] usually refers to the act of communicating with babies using a modest number of symbolic gestures.”

When Dr. Dewar mentions symbolic gestures, she is describing things like pointing or other hand motions that accompany speech and make communication with preverbal infants a little easier. Most of the baby sign language resources I’ve come across endorse using ASL as a base, however, so it’s not just “symbolic gestures”. At the same time, the ASL signs are often simplified (both by baby sign teachers and the parents learning), and the fuller grammar is not usually taught to/learned by parents or their children.

In the following post, I’m going to delve further into baby sign language – its supposed benefits, tips, timelines, resources, and my personal experience so far. We’ll look at proponents’ claims versus the scientific research. (Spoiler: Your offspring won’t be the next Einstein just because you taught them how to sign ‘more’ and ‘milk’.)

 

*Photo attribution: “Learn sign language at the playground”


[1] More on this process in “I heart hangry bagel droids (or: How new words form)” – see #9.

[2] I’ll discuss iconicity in the next post.

A Norwegian smörgåsbord


Okay, “smörgåsbord” is a Swedish borrowing, but close enough. It’s appropriate for this post, which will be a buffet of miscellaneous facts about the Norwegian language.

I became interested in and started learning Norwegian because my brother has been living in Oslo for the past several years, where he is getting his Ph.D. in lichenology.[1] My family and I traveled to visit him last summer. To characterize the country in a few words, I’d say Norway is – more iconically – Vikings, fjords, trolls, nature, Norse mythology, and – more personally – lichens, stellar black coffee, gross sweet brown cheese, overly-restricted booze-purchasing hours, part of my paternal ancestry, and vampires.[2]

Heddal stavkirke (stave church), built in the early 13th century

So what’s cool about Norwegian?

Dialects

First (as I mentioned in one of the recent dialect posts), Norwegian forms a dialect continuum with Swedish and Danish, languages with which it is, to a greater or lesser extent, mutually intelligible. These are Scandinavian or North Germanic languages, along with Icelandic and Faroese. My brother, who now has a decent command of Norwegian, says he can understand Swedish relatively well too, although Danish is harder. Have a listen to differences between Danish and Norwegian in this video.

However, there are also a staggering number of Norwegian dialects spread across Norway. People claim it’s often harder to understand someone from a different part of the country (for example, Oslo inhabitants vs. speakers of trøndersk, a group of sub-dialects in north-central Trøndelag county) than it is to understand a Swede speaking Swedish. Wikipedia corroborates: “Variations in grammar, syntax, vocabulary, and pronunciation cut across geographical boundaries and can create a distinct dialect at the level of farm clusters. Dialects are in some cases so dissimilar as to be unintelligible to unfamiliar listeners.”

There are two official standard forms for the written language, even though there is no standard for spoken Norwegian (local dialects rule in most situations). Bokmål (literally “book tongue”) is used in the majority of publications, and Nynorsk (“new Norwegian”) in under 10% of written communication.

Lexicon and Morphology

Onto smaller language-y bits: words and morphemes. Norwegian is super fun because it is prone to extensive compounding (like German), and these compounds often break down into etymologically amusing or charming pieces. By this I mean that the component words reveal interesting (but usually sensible) semantic relationships with the larger compound. Let me give you some examples:

| Norwegian compound | English word | Individual morphemes |
| --- | --- | --- |
| fruktkjøtt | “pulp” | frukt (“fruit”) + kjøtt (“meat”) ⇒ “fruit meat” |
| matbit | “snack” | mat (“food”) + bit (“bite”) ⇒ “food bite” |
| sommerfugl | “butterfly” | sommer (“summer”) + fugl (“bird”) ⇒ “summer bird” |
| morkake | “placenta” | mor (“mother”) + kake (“cake”) ⇒ “mother cake” |
| verdensrommet | “(outer) space” | verden (“world”) + s (possessive) + rom (“room”) + et (“the”) ⇒ “the room of the world” |
| sykehus | “hospital” | syke (“sick”) + hus (“house”) ⇒ “sick house” |
| grønnsak | “vegetable” | grønn (“green”) + sak (“thing”) ⇒ “green thing” |
| støvsuger | “vacuum cleaner” | støv (“dust”) + suger (“suck[er]”) ⇒ “dust suck[er]” |
| flaggermus | “bat” | flagre (“flutter”) + mus (“mouse”) ⇒ “flutter mouse” |
| piggsvin | “hedgehog” | pigg (“spike”) + svin (“pig”) ⇒ “spike pig” |

Morphosyntax 

Rest stop on the road back to Oslo. Rømmegraut is the Nynorsk word for a traditional porridge – kind of like cream of wheat, but sweeter and topped with butter.

One facet of Norwegian morphosyntax that was novel to me is the structure of its determiners. In English, both definite (“the”) and indefinite (“a / an”) articles are independent words that always precede their noun or noun phrase. So we have:

“the house”          “the big blue house”
“a house”             “a big blue house”

The same is true for the Romance languages I know about (French, Spanish, Italian), the other Germanic language I’m familiar with (German)… and it is simply not relevant for the Asian languages I’ve dabbled in (Japanese, Cantonese) because they lack articles entirely.

In Norwegian (as well as in Swedish and Danish), indefinite articles are, familiarly, the independent words which precede the noun, while definite articles are actually suffixes, which attach to the end of the noun they modify. What’s more – if you place something in front of the noun, like an adjective or a number, there’s another set of determiners to use, called demonstratives (in English: this, that, these, those). These precede the noun phrase (adjective/number + noun), where the noun already contains its definite suffix. Again, a table might help illustrate:

Norwegian (Bokmål) determiners

|  | Masc. singular | Fem. singular | Neuter singular |
| --- | --- | --- | --- |
| Indefinite articles | en | ei | et |
| Example | en sykkel (“a bicycle”) | ei jente (“a girl”) | et hus (“a house”) |
| Definite articles (suffixes) | -en | -a | -et |
| Example | bilen (“the car”) | døra (“the door”) | huset (“the house”) |
| Demonstratives (+ noun phrase) | den | den | det |
| Example | den røde bilen (“the red car”) | den røde døra (“the red door”) | det røde huset (“the red house”) |

Because Norwegian and English are closely related in their linguistic genealogy, a native English speaker may have less trouble learning Norwegian than, say, Taa (also known as !Xóõ, a southern African language with possibly the largest phoneme inventory in the world, including dozens of clicks) – but as the determiner situation here demonstrates, it’s still no piece of bløtkake.


View (!) from our rental house deck on Hardangerfjord

Phonology and Prosody

Norwegian is what’s called a pitch-accent language. There are roughly three categories of languages when it comes to stress and pitch. Here’s a super abridged breakdown [3]:

  1. Stress-accented languages

Stress (emphasis) is placed on a syllable in a word, or on a word in a phrase/sentence. This can create a difference in word meaning, but it doesn’t have to. Stress is a combination of loudness, length, and higher pitch.

  • Example languages: English, Czech, Finnish, Classical Arabic, Quechua, Italian
  • Example words/phrases [English]:
    • On a word in a sentence (no difference in meaning) – “I REALLY like your jacket”
    • On a syllable in a word (meaning difference) –

NOUNS vs. VERBS
REcord vs. reCORD
INcrease vs. inCREASE
PERmit vs. perMIT

  2. Pitch-accented languages

A syllable in a word/morpheme is accentuated by a particular pitch contour (instead of by stress). So only pitch is involved, not loudness or length. Distinct tonal patterns occur in words that otherwise look and sound the same, giving them different meanings.

  • Example languages: Norwegian, Swedish, Japanese, Turkish, Filipino, Yaqui (a Native American language)
  • Example words/phrases [Norwegian]:
    • Norwegian has two kinds of tonal accents or pitch patterns:

ACCENT 1 (ACUTE) and ACCENT 2 (GRAVE)

(Audio extracted from video by “Norwegian Teacher – Karin”)

hender – “hands” vs. hender – “happens”
ånden – “the spirit” vs. ånden – “the breath”
bønder – “farmers” vs. bønner – “beans”
været – “the weather” vs. være – “to be”

  3. Tonal languages

Each syllable of the language has an independent tone or pitch contour. Tones are used to distinguish between words (they create a difference in meaning between words that otherwise look and sound the same).

  • Example languages: Mandarin, Cantonese, Thai, Zulu, Navajo, Yucatec (a Mayan language)
  • Example words/phrases [Mandarin]:
    • Tones combine with the syllable ma, resulting in different words:
  1. mā – “mother” [high level tone]
  2. má – “hemp” [mid pitch rising to high pitch]
  3. mǎ – “horse” [low with slight fall]
  4. mà – “scold” [short, sharply falling tone]
  5. ma – (an interrogative particle) [neutral, used on weak syllables]

 

The pitch-accent feature of Norwegian contributes to the language’s sing-song quality. Just listen to the melodiousness of Norway’s King Harald V as he gives a speech:

(Audio extracted from full NRK video)

Orthography

Norwegian writing uses the same Latin alphabet as English, except that it has three additional letters at the end – æ, ø, and å. I highly recommend – nay, insist – that you watch this ridiculous video to hear how the vowels are pronounced, as well as to be entertained in musically nerdy fashion. (Final note: Contrary to the video’s main argument, several letters – c, q, w, x, and z – are not actually used to spell Norwegian-native words, although they’re sometimes used in loan words. One could therefore quibble that they shouldn’t count towards the alphabet size…)


[1] If you want to ogle some gorgeous macrophotography of lichens, scope out his Instagram, https://www.instagram.com/lichens_of_norway/.

[2] The ancient stave churches for some reason reminded me of True Blood (plus three of the show’s main characters, Eric, Pam, and Godric, were Swedish and Norwegian); also I was coincidentally reading The Vampire Lestat while we were there… but NO I’m not generally obsessed with vampires.

[3] This subject gets really complex. There are a lot more subtleties and distinctions than I make above.

O Syntax Tree, O Syntax Tree!


Digital voice agents like Alexa, Siri, and Google Assistant are all the rage these days. But when we talk to our smart devices, are they actually “understanding” our speech in the same way that another human understands it? Take the command, “Find flights from Chicago to New York on February 21.” We can easily comprehend this sentence; our newborn brains were predisposed to acquire language, and we’ve been using it ever since.

Computers, on the other hand, cannot acquire language. They must be trained. In order to train them, computational linguists, other linguists, and engineers have broken language down into more manageable parts that can be tackled individually. Automatic speech recognition (ASR) deals with training machines to recognize speech (via acoustic properties, etc.), and convert that speech to text. Next, natural language processing (NLP) attempts to figure out what is meant by that text[1]. An NLP system itself is composed of multiple modules[2], one of which will likely be a syntactic parser.

Today we’re going to delve into the parser component. Let’s start with some syntactic basics!

Syntax is the set of rules and processes governing sentence structure in any natural language. It involves things like word order, and constituents (words or phrases that form functional units). One of the most common ways to represent syntactic information (at least as of the 20th century) is with a syntax tree. Traditional syntax trees specify:

  • The words of a phrase/sentence
  • Part of speech for each word, usually abbreviated
    • N (noun); V (verb); P (preposition); D or DET (determiner, a.k.a. article); A (adjective); etc.
  • Larger phrases, also abbreviated
    • S (sentence); NP (noun phrase); VP (verb phrase); etc.
  • Relationships between all of the words and phrases
    • These are hierarchical relationships that show how constituents combine into larger ones (or split into smaller ones, if starting from the opposite end of the tree)

Here’s a tree diagram (specifically, a constituency tree) for the sentence, “My parakeet drinks mimosas in the morning”:

[Constituency tree diagram for “My parakeet drinks mimosas in the morning”]

You can see that my parakeet forms a larger chunk which is a noun phrase, in the morning forms a larger chunk which is a prepositional phrase, drinks mimosas in the morning forms an even larger chunk which is a verb phrase, and both the NP and VP combine to form the largest chunk, a full sentence S. Remember that syntax focuses on phrasal order and structure, not meaning or context – so it can’t tell us why on earth you’re feeding boozy orange juice to your pet bird.
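If you’d like to tinker with trees like this yourself, here’s a minimal sketch using NLTK, a Python NLP library, whose Tree class reads the same structure from bracket notation (the bracket string below just encodes the constituents described in the previous paragraph):

```python
from nltk import Tree

# Bracket notation for the constituency structure described above:
# the NP "my parakeet" and the VP "drinks mimosas in the morning"
# combine into the sentence S.
t = Tree.fromstring(
    "(S (NP (D My) (N parakeet))"
    "   (VP (VP (V drinks) (NP (N mimosas)))"
    "       (PP (P in) (NP (D the) (N morning)))))"
)
t.pretty_print()  # prints an ASCII rendering of the tree
```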

Onto the parsing! Very generally, a parser is a piece of software (often a trained machine learning model) that takes input text, and outputs a parse tree or similar structural representation, based on syntactic rules and statistics learned from its training data.

Many syntactic parsers include a component called a Context-Free Grammar (CFG), which has:

  1. A set of non-terminal symbols – abbreviations for language constituents (lexical parts of speech and phrasal types):

{S, NP, VP, PP, D, N, A…}

  2. A set of terminal symbols – words of the phrase/sentence:

{drinks, parakeet, mimosas, morning, my, in, the}

  3. A set of rules like:

S → NP VP  (a sentence S is composed of a noun phrase NP and verb phrase VP)

NP → D N  (a noun phrase NP is composed of a determiner D and a noun N)

VP → VP PP  (etc.)

PP → P NP

  4. A start symbol: S

The parser starts at S, and applies its rules successively, until it arrives at the terminal symbols. The resulting parse is the labeled relationships connecting those terminals (i.e. words).
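To make that concrete, here’s a minimal sketch of this exact toy grammar in NLTK (one convenient library that ships a CFG class and ready-made parsing algorithms); the rules and vocabulary mirror the sets above:

```python
import nltk

# The toy grammar from above, in NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> D N | N
    VP -> V NP | VP PP
    PP -> P NP
    D  -> 'my' | 'the'
    N  -> 'parakeet' | 'mimosas' | 'morning'
    V  -> 'drinks'
    P  -> 'in'
""")

# A chart parser applies the rules until it reaches the terminal words,
# yielding every tree the grammar licenses for the sentence.
parser = nltk.ChartParser(grammar)
for tree in parser.parse("my parakeet drinks mimosas in the morning".split()):
    tree.pretty_print()
```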

There are two main kinds of syntactic parsers: dependency and constituency. To keep this post to a reasonable length, I’ll focus on dependency only, but constituency parsers output structures similar to the parakeet tree above[3]. A dependency parser builds a tree for each input sentence by starting with a sentence root (usually the main verb), and assigning a head word to each word, until it gets to the end of the sentence. (Heads link to dependents.) When it’s done, each word has at least one branch, or relationship, with another word. The parser also characterizes each word-word relationship. These are things like: nominal subject of a verb (“nsubj”); object of a verb or a preposition (“dobj” and “pobj,” respectively); conjunction (“cc” for the conjunction word, and “conj” for the elements being conjoined); determiner (“det”); and adverbial modifier (“advmod”).

A visualized example will probably help. Taking that same sentence, “My parakeet drinks mimosas in the morning,” a visualization of the dependency parse might look like this:

[Dependency parse visualization for “My parakeet drinks mimosas in the morning”]

Can you spot the root, or main verb? It’s the one without any arrows going towards it: drinks. The parser then finds the subject of drinks, which is parakeet, and labels that relationship “nsubj.” It finds mimosas as the direct object of drinks, and labels it “dobj.” And so on and so forth.

Let’s look at another example, for a dollop of variety. Here is “Mr. Vanderloop had smiled and said hello”:

[Dependency parse visualization for “Mr. Vanderloop had smiled and said hello”]

In this one, the past participle smiled is the root/main verb, which has multiple dependents: its subject Vanderloop, its auxiliary (a.k.a. “helping verb”) had, its conjunction and, and the other verb with which it conjoins, said. The subject Vanderloop has a dependent Mr., with which it forms a compound (proper) noun; said’s dependent is the interjection hello.

How about our sentence from the beginning, “Find flights from Chicago to New York on February 21”? How might it be parsed? (You can check your hypotheses by typing the sentence into an interactive demo of the displaCy dependency visualizer, from which the visualizations above also came[4].) Something to keep in mind here is that English imperative structure leaves the subject – whoever is being addressed – implicit.
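If you’d rather not eyeball it in the demo, here’s a minimal sketch of the same check in Python with spaCy (assuming you’ve installed its small English model via python -m spacy download en_core_web_sm; exact labels can vary a bit across model versions):

```python
import spacy

# Assumes the small English model is installed:
#   pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Find flights from Chicago to New York on February 21")

for token in doc:
    # dep_ is the relation label ("ROOT", "dobj", "pobj", ...),
    # and head is the word each token hangs off of.
    print(f"{token.text:10} {token.dep_:10} head: {token.head.text}")
```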

A slight aside: I’ve chosen simple examples for demonstration, but parsing gets decidedly complicated when input sentences are themselves complicated. Questions, subordinate clauses, coordination (or all three: “What’s the name of the movie where the guy drives a flying taxi and saves the human race from aliens?”), and structurally ambiguous sentences (“The horse raced past the barn fell”) get tricky quickly.

So now we have some parsed output. How is this structured, annotated data useful? Well, one thing you can do with these word relations is identify noun phrases. Identifying noun phrases across sentences helps with another step in the NLP pipeline called Named Entity Recognition, or NER. NER tries to recognize nouns/noun phrases (names, places, dates, etc.) and label them with categories of concepts from the real world. In our flights example, “Chicago” and “New York” should get tagged with some label like CITY or GEOGRAPHIC LOCALE, and “February 21” should get tagged with DATE. Once a text has been automatically annotated for such named entities, information about those entities can then be pulled from a knowledge base (say, Wikipedia).
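The same spaCy pipeline from the sketch above exposes NER results directly on the parsed doc; note that spaCy’s own label names (GPE for cities and other geopolitical entities, DATE) play the role of the CITY / GEOGRAPHIC LOCALE / DATE labels I just mentioned:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Find flights from Chicago to New York on February 21")

# Each recognized entity carries a category label from the model.
for ent in doc.ents:
    print(ent.text, ent.label_)
# Expected (roughly): Chicago GPE / New York GPE / February 21 DATE
```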

Having parts of speech and word relations also makes it easier to match up the specifics of a given user command (e.g. “Text mom saying I’ll call tonight,” or “Show popular Thai restaurants near me”) with slightly more generalized intents (e.g. Send text or Get restaurants); machine models can start learning how words typically pattern across the main verb and direct object positions for various commands. Code then uses the more generalized intent to fulfill that request on a device – be it smartphone, tablet, or home speaker. “Find flights from Chicago to New York on February 21” would hopefully be matched with a more general Get flights intent, and the particular noun phrases could be passed to fields for origin, destination, and date.
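As a toy illustration only (real assistants learn this mapping with trained classifiers over many features, not a hand-written lookup), here’s a hypothetical sketch that keys an intent off the root verb and its direct object from a spaCy parse:

```python
from typing import Optional

import spacy

nlp = spacy.load("en_core_web_sm")

# Hypothetical lookup keyed on (root-verb lemma, direct-object lemma).
INTENTS = {
    ("find", "flight"): "Get flights",
    ("show", "restaurant"): "Get restaurants",
    ("text", "mom"): "Send text",
}

def match_intent(utterance: str) -> Optional[str]:
    doc = nlp(utterance)
    root = next(t for t in doc if t.dep_ == "ROOT")  # main verb
    dobj = next((t for t in root.children if t.dep_ == "dobj"), None)
    return INTENTS.get((root.lemma_, dobj.lemma_ if dobj else None))

print(match_intent("Find flights from Chicago to New York on February 21"))
# -> "Get flights" (if the parser assigns the expected root and dobj)
```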

* * * * *

Before leaving you to your holiday leftovers, I’d like to reiterate that syntactic parsing is only one step in an NLP system. Its parses don’t tell us much about the actual semantics of the linguistic input. Language meaning, however, is a whole other ball of wax, best left for the new year…

 

[1] There is often terminological confusion between NLP and NLU (natural language understanding). See this graphic for one common breakdown, although I’ve heard the terms used interchangeably as well.

[2] If you’re interested to learn about other NLP steps, read this accessible post, Natural Language Processing is Fun!

[3] You can also play around with this interactive demo from Stanford CoreNLP, http://corenlp.run. In the second “Annotations” field dropdown, make sure you have “constituency parse” selected.

[4] The visualizer is from the creators of spaCy, an awesome open-source NLP library in Python; a dependency parser is one of its components.

I heart hangry bagel droids (or: How new words form)

[Image: “The fin de siècle newspaper proprietor”]

You’re probably familiar with the old adage “the only thing that’s constant is change.” Still, so many people tend to think about language as a relatively fixed affair. I’ve said it before (and will inevitably say it again): all living languages change all the time, and at all levels – phonological (sounds!), morphological (word-bits!), lexical (words!), syntactic (clauses!), and semantic (meaning!).

Historical linguistics (also known as diachronic linguistics) is the study of how and why languages change over time. In this post I’m going to discuss categories of change at the morphological and lexical levels – how new words come into being. In the future, I’ll explore semantic and perhaps phonological change.

Without further ado, here are the main mechanisms of word formation. Almost all examples are for English, but these formation types apply to other languages as well. (NOTE: Processes are not mutually exclusive. It is quite possible for a word to undergo multiple processes simultaneously, or one on the heels of another.)

  1. Derivation

New words are born by adding affixes to existing words. Affixes are bound[1] morphemes that can be prefixes, suffixes, and even (for certain languages, although not really for English) infixes and circumfixes. Derivation is a very common process cross-linguistically.

Zero derivation (also known as conversion) is a special case where a new word, with a new word class (part of speech), is created from an existing word of a different class, without any change in form.

Examples:
(Derivation) hater [hate + -er], truthiness [truth + -i (-y) + -ness], deglobalization [de- + globalization], hipsterdom [hipster + -dom]

(Zero derivation) heart as verb, as in “I heart coffee” [heart as noun]; friend as verb, as in “he friended me on Facebook” [friend as noun]; green as noun, in the golf lawn sense [green as adjective]; down as verb, as in “Hector downed a beer” [down as preposition]

  2. Back-formation

This process creates a new word through the removal of true or incorrectly assumed affixes. It’s kind of the opposite of derivation. This one is easier to explain through examples:

| New word | Derived from older word | Analysis |
| --- | --- | --- |
| donate, automate, resurrect (verbs) | donation, automation, resurrection (nouns) | The nouns were borrowed into English first, from Latin. The verbs were back-formed later by discarding the -ion suffix, which speakers did through analogy with other Latinate verb and (-ion) noun pairs that already existed in English. |
| pea | pease | The older form was initially a mass noun (like water or sand), but was reanalyzed as plural. People then dropped the “plural” -s(e) to form the “singular” count noun pea. |
| beg, edit, hawk (verbs) | beggar, editor, hawker (nouns) | Speakers mistook the -ar, -or, and -er on the ends of these nouns (respectively) for the agentive suffix (that did/does exist in English), and removed it to form the corresponding verbs. |
| lime-a-rita, mango-rita; appletini, kiwini | margarita; martini | These are actually examples of folk etymology, which is related to back-formation. Speakers incorrectly assumed that -rita in margarita and -(t)ini in martini were separate morphemes (indicating the class of cocktail). Under that assumption, they switched out the rest of the word and substituted morphemes indicating new twists/ingredients. |

  3. Blending

Also known as portmanteaus. Blends are produced by combining two or more words, where parts of one or both words are deleted.

Examples: smog [smoke + fog], brunch [breakfast + lunch], infomercial [information + commercial], bromance [bro + romance], hangry [hungry + angry], clopen [close + open][2]

  4. Borrowing

Also known as loan words. These are expressions taken from other languages. Pronunciation is usually altered to fit the phonological rules of the borrowing language.

Examples: algebra [from Arabic], ménage à trois [from French], whisky [from Scots Gaelic or Irish], bagel [from Yiddish], doppelgänger [from German], karaoke [from Japanese]

  5. Coinage

Words can be created outright to fit some purpose. Many of these are initially product names.

Examples: Xerox, Kleenex, Jell-O, Google, zipper, Frisbee

  6. Compounding

Two or more words join together to form a compound. Frequently the joining words are nouns, but they can belong to different parts of speech, including verbs, adjectives, prepositions, etc. Compounds can be separated by spaces, by hyphens, or glued to each other with nothing intervening.

Examples: homework, grocery store, mother-of-pearl, first world problem, binge-watch, weaksauce, fake news

  7. Eponyms

These are words that derive from proper nouns – usually people and place names. If a proper noun is used frequently enough and across multiple contexts, it eventually becomes a common noun (or verb or adjective).

Examples: sandwich [after the fourth Earl of Sandwich], gargantuan [after Gargantua, name of the giant in Rabelais’ novels], boycott [after Capt. Charles C. Boycott], mesmerize [a back-formation from mesmerism, in turn after Franz Anton Mesmer], sadism [after the Marquis de Sade]

  8. Reducing

Several types of reducing processes exist. The main ones are clipping, acronyms, and initialisms.

a. Clipping

New words can be formed by shearing one or more syllables off an existing longer word. Syllables can be removed from the word’s beginning, end, or both.

Examples: fax [facsimile], flu [influenza], droid [android], fridge [refrigerator], blog [weblog]

b. Acronyms

Words are created from the initial letters of several other words. Acronyms are pronounced as regular words (in contrast to initialisms below).

Examples: NASA [National Aeronautics and Space Administration], RAM [random-access memory], FOMO [fear of missing out]

c. Initialisms

Also known as alphabetisms. As with acronyms, a word is created from the initial letters of other words, but the resulting term is pronounced by saying each letter. This usually happens when the string of letters is not easily pronounced as a word according to the phonological rules of the language.

Examples: NFL [National Football League], UCLA [University of California, Los Angeles], MRI [magnetic resonance imaging], WTF [what the fuck]

  9. Reduplication

Reduplication is one of my favorite phenomena.[3] It’s a process whereby a word or sound is repeated or nearly repeated to form a new word/expression. This is a productive morphological process (meaning, it’s part of the grammar and happens frequently and rather systematically) in many languages – South-East Asian and Austronesian languages particularly (e.g. Malay, Tagalog, Samoan). It’s not an especially productive process in English, although it does still happen.

Examples:
(English) wishy-washy, teensy-weensy, goody-goody, cray-cray, po-po

(Samoan) savali [‘he travels’ – third person singular + verb]; savavali [‘they travel’ – third person plural + verb]

* * * * *

Phew! Since hopefully you can see the light at the end of this long lexical tunnel, I’ll mention that of course languages lose words as well. Diverse factors motivate word loss, but that’s a subject for another post. A few quick examples of words that have fallen out of favor in English:

pell-mell [in a disorderly, reckless, hasty manner]; davenport [couch/sofa – my grandma used to say this]; grass [for marijuana – my mom still says this]; porridge [an oatmeal-like dish boiled in water or milk]; tumbrel [a farmer’s cart for hauling manure]; fain [gladly or willingly]

* * * * *

And now… ADD WORDS TO THE SPREADSHEET – Word shenanigans!

I’ve got almost 200 in there to start us off. If you’re not sure about the process for any particular word, just leave it blank or take a guess. Free bagel droids[4] to all who contribute.

 

[1] Bound meaning they cannot exist on their own, but must be attached to another morpheme.

[2] Describes a shitty situation where one has to work a closing shift followed by an opening shift. We used this term as bartenders, although I’d never seen it in print until recently. It came up in some paperwork I had to sign relating to work week ordinances, and then I saw it here as well.

[3] Some languages even have triplication – where the sound/word is copied twice!

[4] Kidding! These do not exist outside of my head. Sorry.