Sign language, ASL, and baby signing

September was National Deaf Awareness Month. I tried to post this piece before the month ended, but alas! Better late than never. I’d like to discuss and dispel some of the many misconceptions around signed languages. Here are a few of the most common:

  • Sign language is universal – there is only one
  • Sign languages are not “real” languages
    • They’re simpler and easier to learn than spoken languages; they’re just gestures, or body language, or pantomime
    • They’re not as complex as spoken languages; they don’t have true grammars, or large vocabularies, or the ability to express abstract concepts
    • They were “invented”; they didn’t evolve naturally among communities over time
    • They have to be explicitly taught; they cannot be acquired naturally by children through exposure as with spoken language
  • Sign languages are the visual equivalent of spoken languages – for example, American Sign Language is the visual equivalent of English

I’ll also spend some time discussing “baby sign language” (which is of personal import due to last year’s arrival of my very own teacup human).

Sign languages

Sign languages are natural languages whose modality is visual and kinesthetic instead of speech- and sound-based. They exhibit complexity parallel to that of spoken languages, with rich grammars and lexicons. Sign languages developed and are used among communities of deaf people, but can also be used by hearing individuals. These languages are not composed solely of hand movements. A good deal of their prosody, grammar (e.g. syntax, morphology), modification (adjectives and adverbials), and other features are expressed through head movements, facial expressions, and body postures.

American Sign Language (ASL)

American Sign Language (ASL) is the main language of Deaf communities in the U.S. and Canada. Contrary to what many assume, ASL is not grammatically related to English. From Wikipedia:

“On the whole […] sign languages are independent of spoken languages and follow their own paths of development. For example, British Sign Language (BSL) and American Sign Language (ASL) are quite different and mutually unintelligible, even though the hearing people of the United Kingdom and the United States share the same spoken language. The grammars of sign languages do not usually resemble those of spoken languages used in the same geographical area; in fact, in terms of syntax, ASL shares more with spoken Japanese than it does with English.”

ASL emerged in the early 1800s at the American School for the Deaf in Connecticut, from a mix of Old French Sign Language, village sign languages, and home signs. ASL and French Sign Language (LSF – Langue des Signes Française) still have some overlap, but are not mutually intelligible.

One element of ASL that I find particularly neat is its reduplication (repetition of a morpheme or word to serve a particular grammatical function, like plurality)[1]. Reduplication is a common process in many languages, and it performs several important jobs in ASL. It does things like pluralize nouns, convey intensity, create nouns from verbs (e.g. the noun chair is a repeated, slightly altered version of the verb to sit), and represent verbal aspects such as duration (e.g. VERB + for a long time).
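
To make the noun-derivation pattern a bit more concrete, here’s a toy sketch over written glosses (linguists transcribe signs with capitalized glosses like SIT; the real process also alters the sign’s movement, which strings can’t capture):

```python
# Toy sketch: reduplication over written sign glosses.
# Glosses (SIT, WAIT) stand in for signs; actual ASL reduplication
# also changes movement size and speed, which strings can't show.

def reduplicate(gloss: str, times: int = 2) -> str:
    """Repeat a gloss, e.g. SIT -> SIT+SIT (roughly, the noun CHAIR)."""
    return "+".join([gloss] * times)

print(reduplicate("SIT"))      # SIT+SIT -> noun derivation (CHAIR)
print(reduplicate("WAIT", 3))  # WAIT+WAIT+WAIT -> durational aspect (illustrative)
```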

Baby sign language

What is “baby sign language”? I haven’t found a very precise definition. The term seems to basically describe signing between (hearing) parents/caregivers and young children, but whether the signs come from a legitimate sign language like ASL, or are invented idiosyncratically by the family using them (and are maybe more iconic[2]), or some combination of the two, varies from source to source.

Anthropologist-psychologist Gwen Dewar, on her blog parentingscience.com, says:

“The term is a bit misleading, since it doesn’t refer to a genuine language. A true language has syntax, a grammatical structure. It has native speakers who converse fluently with each other. By contrast, baby sign language […] usually refers to the act of communicating with babies using a modest number of symbolic gestures.”

When Dr. Dewar mentions symbolic gestures, she is describing things like pointing or other hand motions that accompany speech and make communication with preverbal infants a little easier. Most of the baby sign language resources I’ve come across endorse using ASL as a base, however, so it’s not just “symbolic gestures”. At the same time, the ASL signs are often simplified (both by baby sign teachers and by the parents learning them), and the fuller grammar is not usually taught to, or learned by, parents or their children.

In the following post, I’m going to delve further into baby sign language – its supposed benefits, along with tips, timelines, resources, and my personal experience so far. We’ll look at proponents’ claims versus the scientific research. (Spoiler: Your offspring won’t be the next Einstein just because you taught them how to sign ‘more’ and ‘milk’.)

 

*Photo attribution: “Learn sign language at the playground”


[1] More on this process in “I heart hangry bagel droids (or: How new words form)” – see #9.

[2] I’ll discuss iconicity in the next post.

Career interviews: Computational linguist for a virtual assistant

[Image: Wugs go to work]

After much delay (eek! just realized it’s been a year!), I have another interview with a career linguist for your reading pleasure. [See the first interview here.] Even though I still get the “I’ve never met a real-live linguist” reaction when telling folks what I do, these days there are indeed people working full-time and earning living wages as these specialized language nuts – and not all as professors in academia or as translators/interpreters for the UN.

* * * * *

Just like with my last interviewee, I met Allan at Samsung Research America, where we worked together on Bixby, Samsung’s virtual voice assistant. On the Bixby linguist team, we worked with engineers, Quality Assurance (QA) testers, and others to develop a personal assistant that would carry out thousands of different spoken user commands. Also like my last interviewee, Allan is no longer at the job I interviewed him about. (He’s now a Language Engineer on Amazon’s Alexa!) I’m keeping questions and answers in present tense, however, because I feel like it.

[Photo: Allan Schwade, as a graduate student in linguistics, winning the Humanities Division Dean’s Award for his poster on the adaptation of Russian words by English speakers]

  1. What kind of work do you do?

I’m a computational linguist, which means I create solutions for natural language processing problems using computers. More specifically, I work on the systems and machine learning models that enable your smart devices to understand you when you say “set an alarm for 7am” or “tell me the weather in Chicago”.

  2. Describe a typical day at your job.

I usually start the day by meeting with my manager. The lab I work in supports products in production and conducts research and development for smart devices. If there is an issue with a product in production, I’ll work with the team to solve the problem. Usually this involves curating the training data for the machine learning models – removing aberrant data from training sets or generating new data to support missing patterns. If nothing is on fire, there are usually several projects I’ll be working on at any given time. Projects generally start out with me doing a lot of reading on the state of the art, then I’ll reach a point where I’m confident enough to build a proof of concept (POC). While I’m creating the POC, the linguists will generate data for the models. Once the code and data are ready, I’ll build the models and keep iterating until performance is satisfactory. The only really dependable things in my schedule are lunch and a mid-afternoon coffee break with colleagues, both of which are indispensable.

  3. How does your linguistics background inform your current work?

My degree in linguistics is crucial for my current line of work. When building machine learning models, so much rests on the data you feed into your models. If your data set is diverse and representative of the problem, your model will be robust.

Having a linguistics background also gives me quick insight into data sets and how to balance them. Understanding the latent structures in the data allows me to engineer informative feature vectors for my models (feature vectors are derived from the utterances collected and are the true inputs to the machine learning model).
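
[To make “feature vectors” a bit more concrete, here’s a minimal sketch of a bag-of-words vectorizer turning invented utterances into the inputs for a toy intent classifier. The data and setup are mine, for illustration only – not how Bixby or any production assistant actually works.]

```python
# Minimal sketch: utterances -> feature vectors -> toy intent classifier.
# The utterances and intent labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

utterances = [
    "set an alarm for 7am",
    "wake me up at six thirty",
    "tell me the weather in Chicago",
    "what's the forecast for tomorrow",
]
intents = ["alarm", "alarm", "weather", "weather"]

vectorizer = CountVectorizer()             # bag-of-words features
X = vectorizer.fit_transform(utterances)   # the feature vectors
model = LogisticRegression().fit(X, intents)

test = vectorizer.transform(["set an alarm for noon"])
print(model.predict(test))                 # expected: ['alarm']
```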

  4. What do you enjoy most and/or least about the job?

I really enjoy getting to see differences between human and machine learning. We have a pretty good idea of the types of things humans will attend to when learning language, but sometimes those things aren’t informative for machines. It can be frustrating when something I’d call “obvious” is useless in a model, and even more frustrating when something “marginal” is highly informative. But I never tire of the challenge; the satisfaction I feel at the end of a project is worth it.

The thing I enjoy least is data annotation. The process of doing it is indispensable because you become intimately familiar with the problem, but after a couple of hours of it my mind goes numb.

  5. What did you study in college and/or grad school?

I got my BA from Rutgers University and my MS from the University of California, Santa Cruz. Both degrees were in linguistics, and both schools specialized in generative linguistic theory. I enjoyed a lot about the programs, but they did a better job of preparing people for careers in academia than in industry. Learning programming or common annotation tools and schemas before graduating would have made industry life easier for me.

  6. What is your favorite linguistic phenomenon?

Loanword adaptation! I wrote my master’s thesis on it. Seeing how unfamiliar phonemes are digested by speakers never fails to pique my interest. In general, I love it when stable systems are forced to reconcile things outside their realm of experience.

  7. (If you had the time) what language would you learn, and why?

As a phonetician I’d love to learn Georgian for its consonant clusters, Turkish for its morpho-phonology, Hmong for its tones, or ASL because it’s a completely different modality than what I specialized in. As a subjective entity who does things for personal enjoyment, I’d love to learn Japanese.

  8. Do you have any advice for young people looking to pursue a career in linguistics?

If you want to go into industry doing natural language processing, I cannot stress enough how important the ability to code is. It’s true that for annotation work you won’t usually need it, but if you want to be an annotation lead, the ability to write utility scripts will save you a lot of time. My own transition from annotator to computational linguist also came from showing basic coding competency – the engineers were too busy to work on some projects, so they threw the smaller ones my way. This brings me to my next piece of advice: always voice your interest to the people with the potential to get you involved. Telling your co-worker you really want to work on a cool project will do next to nothing, but telling your manager or the project lead that you are interested in a project may get you involved.
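
[In the spirit of that advice, here’s the flavor of a small utility script – a hypothetical example that tallies label frequencies in a tab-separated annotation file. The file name and column layout are assumptions for illustration.]

```python
# Hypothetical utility: count labels in a tab-separated annotation
# file whose lines look like "utterance<TAB>label". The file name
# and format are assumptions for illustration.
import csv
from collections import Counter

counts = Counter()
with open("annotations.tsv", newline="", encoding="utf-8") as f:
    for utterance, label in csv.reader(f, delimiter="\t"):
        counts[label] += 1

for label, n in counts.most_common():
    print(f"{label}\t{n}")
```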

Frame Semantics and FrameNet


I’d like to discuss a theory in cognitive linguistics which is very near to my heart[1]: frame semantics. I’ll also present FrameNet, a database built using frame semantic theory, which has been and continues to be an excellent resource in the fields of natural language processing (NLP) and machine learning (ML).

Why is frame semantics cool? Why should you want to learn about it? Just this: the theory is an intuitive and comprehensive way to categorize the meaning of any scenario you could possibly dream up and express via language. Unlike many other semantic and syntactic theories, the core concepts are quickly understandable to the non-linguist. What’s more, frame semantics can apply to language meaning at many different levels (from the tiniest morpheme to entire swaths of discourse), and it works equally well for any particular language – be it English, Mandarin, Arabic, or Xhosa. I’ll try to demonstrate the theory’s accessibility and applicability with some details.

American linguist Charles Fillmore developed the frame semantics research program in the 1980s, using the central idea of a frame: a cognitive scene or situation which is based on a person’s prototypical understanding of real-world (social, cultural, biological) experiences. A frame is ‘evoked’ by language – this can be a single word (called a lexical unit), a clause, a sentence, or even longer discourse. Each frame contains various participants and props, called frame elements (FEs). If you’ve studied syntax/semantics (the generative grammar kind), FEs are somewhat analogous to traditional theta roles.

FrameNet is a corpus-based lexicographic and relational database (sort of a complex dictionary) of English frames, the lexical units evoking them, annotated sentences containing those lexical units, and a hierarchy of frame-to-frame relations. It was built and continues to grow at the International Computer Science Institute (ICSI), a nonprofit research center affiliated with UC Berkeley. FrameNets have also been developed in other languages, such as Spanish, Brazilian Portuguese, Japanese, Swedish, French, Chinese, Italian, and Hebrew.

Each frame entry includes a definition, example sentences, frame elements, lexical units, and annotation that illustrates the various fillers (words) of the FEs as well as their syntactic patterns. Let’s unpack all of this!

We’ll take a look at the Motion frame in FrameNet. Some screenshots of the frame entry follow.

[Screenshot: the Motion frame entry in FrameNet, showing the frame definition]

The Motion frame is first defined. Its definition includes the frame elements that belong to the frame (the text with color highlighting):

“Some entity (Theme) starts out in one place (Source) and ends up in some other place (Goal), having covered some space between the two (Path). Alternatively, the Area or Direction in which the Theme moves or the Distance of the movement may be mentioned.”

After the definition come example sentences, featuring lexical units that evoke the frame (the black-backgrounded text) such as move, drift, float, roll, go.

Further down is the list of frame elements with their definitions and examples.

[Screenshot: frame element definitions and examples for the Motion frame]

Here, the Theme FE is “the entity that changes location,” while the Goal FE is “the location the Theme ends up in.” In order for language to evoke this Motion frame, it must have some words or phrases which instantiate the Theme, the Goal, and the other FEs listed. In the examples above, me is a Theme in The explosion made [me] MOVE in a hurry; and into the slow lane is a Goal in The car MOVED [into the slow lane].
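
To make the structure of such an annotation concrete, here’s a simplified sketch of one way to represent it programmatically. The layout is my own invention for illustration, not FrameNet’s actual storage format:

```python
# A simplified, hypothetical representation of one FrameNet-style
# annotation: the frame, the target lexical unit, and frame-element
# spans given as (start, end) character offsets into the sentence.
sentence = "The car moved into the slow lane."

annotation = {
    "frame": "Motion",
    "target": (8, 13),                   # "moved" (evokes the frame)
    "frame_elements": {
        "Theme": (0, 7),                 # "The car"
        "Goal": (14, 32),                # "into the slow lane"
    },
}

for fe, (start, end) in annotation["frame_elements"].items():
    print(f"{fe}: {sentence[start:end]!r}")
# Theme: 'The car'
# Goal: 'into the slow lane'
```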

At the bottom of the entry is a list of lexical units that belong to or evoke the frame, as well as links to annotation of sentences from real data that contain those words.

[Screenshot: the lexical units belonging to the Motion frame, with annotation links]

Verbs like come, glide, roll, travel, and zigzag all evoke, quite sensibly, the Motion frame.
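
If you’d like to poke at the same data programmatically, NLTK ships a FrameNet corpus reader. A minimal sketch, assuming you’ve installed NLTK and run nltk.download('framenet_v17'):

```python
# Minimal sketch: browsing the Motion frame via NLTK's FrameNet reader.
# Assumes: pip install nltk, then nltk.download('framenet_v17').
from nltk.corpus import framenet as fn

motion = fn.frame("Motion")
print(motion.definition)              # the frame's definition text
print(sorted(motion.FE.keys()))       # frame elements: Goal, Theme, ...
print(sorted(motion.lexUnit.keys()))  # lexical units: 'glide.v', ...
```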

Once you click on the “Annotation” link for a particular lexical item, you’re taken to a page that looks like this:

[Screenshot: annotated corpus sentences for the lexical unit glide]

Natural language sentences pulled from online corpora (texts from newspapers, magazines, books, TV transcripts, scholarly articles, etc.) are annotated for their Motion FEs. Annotation for the lexical item glide gives us an idea of the types of “entities” (the purple-backgrounded text, or Theme FEs) that “change location” (i.e. that glide) – boats, pink clouds, men, cars, planes, gondolas, and so on.

* * * * *

After this mini FrameNet dive, you may be wondering how the database is used in a concrete sense. To illustrate, let’s compare two sentences:

  1. The boat GLIDED into the harbor.
  2. The dinghy DRIFTED away from the harbor.

The entities differ (boat vs. dinghy), the verbs differ (glide vs. drift), and the prepositions differ (into vs. [away] from). Yet at a higher level, both of these sentences describe a Theme which “changes location” – either moving towards a Goal in (1), or from a Source in (2). They both indicate motion. Because FrameNet helps machines “learn” that sentences with a variety of nouns, verbs, prepositions, and syntactic patterns can basically point to the same scenario, it’s a useful tool for many applications in the computational realm.
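
As a small illustration of that higher-level sameness, FrameNet’s lexical units can act as a lookup table from verbs to frames. A minimal sketch, again assuming NLTK with the framenet_v17 corpus downloaded:

```python
# Minimal sketch: FrameNet lexical units as a verb -> frame lookup.
# Assumes NLTK with the framenet_v17 corpus downloaded (see above).
from nltk.corpus import framenet as fn

def frames_for_verb(lemma):
    """Return the names of frames containing the lexical unit '<lemma>.v'."""
    return {lu.frame.name for lu in fn.lus(rf"^{lemma}\.v$")}

print(frames_for_verb("glide"))   # includes 'Motion'
print(frames_for_verb("drift"))   # includes 'Motion'
```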

These days computers do all kinds of language-y things for us: answer questions, paraphrase texts, extract relevant information from text (and then maybe organize it thematically – for instance, around people, places, or events), and even generate new texts. These feats require that a computer parse natural language into accurate semantic chunks. FrameNet’s semantically and syntactically annotated data can be used as training input for models that “learn” how to analyze such meaning chunks, enabling our electronic devices to respond, paraphrase, or extract information appropriately.

To peruse a (very long) list of the projects which have used FrameNet data (organized by requester/researcher), check out the FrameNet Downloaders page.

So – on the off-chance that you find yourself stuck at home and bored out of your mind (?!?!)… you might perhaps enjoy a little investigation of frame-semantic characterization of scenes that involve applying heat, intoxication, or temporal collocation. 🙂

 

[1] Why am I so fond of frame semantics? A terrific professor of mine during grad school introduced the theory, and it resonated with me immediately. I used it in my master’s thesis, then presented the paper at the International Conference on Construction Grammar in 2014. Eventually, I had the privilege of working at FrameNet, where I came to know the brilliant lexicographers/semanticists/cognitive linguists who have dedicated decades of their lives to the theory and the project. Sadly, I never met the legendary Chuck Fillmore, as he passed away the year before I joined the FrameNet team.

Back from hiatus

Why, hello there! It’s been ages since I’ve posted, but I’ve been pretty busy with a tiny new experiment:

[Photo: Ryden]

Ryden was born in October (the photo was taken at not-quite-two-months) and is now emphatically ingesting solids, crawling (but only backwards), and beginning to babble.

Now that my life has gone from hallucinatorily topsy-turvy to relatively stable (in a pandemic – yes, that’s how childbirth and newborn-land will relativize things), I plan on posting again more regularly. Coming up, stuff on:

  • frame semantics and FrameNet
  • “parentese” (apropos, yes?)
  • another linguist career interview
  • “crashblossoms”

Hurray!