{"id":843,"date":"2024-02-23T12:56:14","date_gmt":"2024-02-23T20:56:14","guid":{"rendered":"https:\/\/linguamonium.com\/?p=843"},"modified":"2024-02-23T12:56:14","modified_gmt":"2024-02-23T20:56:14","slug":"real-life-star-trek-humans-fail-to-detect-speech-deepfakes","status":"publish","type":"post","link":"https:\/\/linguamonium.com\/?p=843","title":{"rendered":"Real-life Star Trek: Humans fail to detect speech deepfakes"},"content":{"rendered":"<p>According to a recent study in AI and crime science, humans are not all that good at detecting speech deepfakes. Read the sciencedaily.com summary <a href=\"https:\/\/www.sciencedaily.com\/releases\/2023\/08\/230802162437.htm\" target=\"_blank\" rel=\"noopener\">here<\/a>, and the full paper <a href=\"https:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0285333\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p>A few brief comments:<\/p>\n<p>First off, if you\u2019re not familiar with the concept of a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Deepfake\" target=\"_blank\" rel=\"noopener\"><strong>deepfake<\/strong><\/a>, now would be a good time to learn.<\/p>\n<p>&nbsp;<\/p>\n<p>Second, I\u2019ll observe that while the online world has lately spilled a lot of digital ink over AI-generated <strong>writing<\/strong> and <strong>imagery<\/strong>, I haven\u2019t heard as much about AI-generated <strong>speech<\/strong>.<\/p>\n<p>&nbsp;<\/p>\n<p>One scary result from the study is that even after <em>training<\/em>, people were still not great at the identification task:<\/p>\n<p style=\"padding-left: 40px;\">\u201cParticipants were only able to identify fake speech 73% of the time, <em>which improved only slightly after they received training to recognise aspects of deepfake speech<\/em>.\u201d (Emphasis mine.)<\/p>\n<p>&nbsp;<\/p>\n<p>According to the ScienceDaily report, this investigation is the first to \u201cassess human ability to detect artificially generated speech in a 
language other than English\u201d; researchers looked at both English and Mandarin. I believe this should serve as a reminder that we need more research on AI in a variety of languages \u2013 especially if we\u2019re trying to make generalizations and predictions about how AI tools can best serve (or potentially harm) people around the world.<\/p>\n<p>&nbsp;<\/p>\n<p>Towards the end of the ScienceDaily summary:<\/p>\n<p style=\"padding-left: 40px;\">\u201c[\u2026] there are growing fears that such technology could be used by criminals and nation states to cause significant harm to individuals and societies. Documented cases of deepfake speech being used by criminals include one 2019 incident where the CEO of a British energy company was convinced to transfer hundreds of thousands of pounds to a false supplier by a deepfake recording of his boss&#8217;s voice.\u201d<\/p>\n<p>As terrifying as this real-life scenario is, it made me chuckle inadvertently, for I instantly thought of a scene from \u201cStar Trek: The Next Generation\u201d where Wesley Crusher creates some small piece of technology with which he impersonates Captain Picard\u2019s voice, causing comedic confusion and dismay amongst ship crew members (in \u201cThe Naked Now\u201d, season 1, episode 3). Curious, how creators of the show reliably envisioned our technological trajectory over 30 years ago. 
(Even though I\u2019m a child of the \u201880s and \u201890s, I didn\u2019t really watch Star Trek growing up\u2026but I\u2019ve recently begun watching \u201cThe Next Generation,\u201d and its lighthearted nerdiness is refreshing at the end of a long day.)<\/p>\n<p>The deepfake speech crime above also made me think of how <a href=\"https:\/\/linguamonium.com\/?p=748\" target=\"_blank\" rel=\"noopener\"><strong>forensic linguists<\/strong><\/a> (probably in conjunction with computer scientists) may have their work cut out for them.<\/p>\n<p>&nbsp;<\/p>\n<p>The full paper has a bunch of interesting points and I recommend reading it if you have the time\/inclination. The word cloud graphics (of study participants\u2019 freeform text responses) and corresponding discussion are particularly thought-provoking.<\/p>\n<p>&nbsp;<\/p>\n<h3>Playing catch-up<\/h3>\n<p>Here\u2019s yet another, more recent article on the topic of audio deepfakes: <a href=\"https:\/\/www.scientificamerican.com\/article\/ai-audio-deepfakes-are-quickly-outpacing-detection\/\" target=\"_blank\" rel=\"noopener\">AI Audio Deepfakes Are Quickly Outpacing Detection<\/a>.<\/p>\n<p>One of the big takeaways from the interviewee is how it\u2019s now incredibly easy to <em>create<\/em> \u201cconvincing audio deepfakes,\u201d and yet the skills and tech needed for <em>identifying <\/em>AI-generated speech are much farther behind. Such a mismatch between creation and detection is obviously problematic for many reasons. Crucially, the mismatch contributes to our growing difficulties distinguishing what is real and worthy of trust.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>Photo attribution: <a href=\"https:\/\/unsplash.com\/photos\/a-child-using-a-cell-phone-Lq8ho5dJReg\" target=\"_blank\" rel=\"noopener\">BandLab<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>According to a recent study in AI and crime science, humans are not all that good at detecting speech deepfakes. 
Read the sciencedaily.com summary here, and the full paper here. &nbsp; A few brief comments: First off, if you\u2019re not familiar with the concept of a\u00a0deepfake, now would be a good time to learn. &nbsp;&#8230;<\/p>\n","protected":false},"author":1,"featured_media":847,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"footnotes":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[20],"tags":[247,250,204,121,248,246,249],"class_list":["post-843","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-nlu-nlp-ml-ai","tag-ai-generated-speech","tag-digital-forensics","tag-machine-learning","tag-ml","tag-research","tag-speech-deepfakes","tag-speech-synthesis"],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium<\/title>\n<meta name=\"description\" content=\"Recent research demonstrates that humans are not great at detecting speech deepfakes. 
I share some of my musings on the topic.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/linguamonium.com\/?p=843\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium\" \/>\n<meta property=\"og:description\" content=\"Recent research demonstrates that humans are not great at detecting speech deepfakes. I share some of my musings on the topic.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/linguamonium.com\/?p=843\" \/>\n<meta property=\"og:site_name\" content=\"Linguamonium\" \/>\n<meta property=\"article:published_time\" content=\"2024-02-23T20:56:14+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/linguamonium.com\/wp-content\/uploads\/2024\/02\/boy_waveform_recording_medium_cropped.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1558\" \/>\n\t<meta property=\"og:image:height\" content=\"987\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"hannah\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@linguamonium\" \/>\n<meta name=\"twitter:site\" content=\"@linguamonium\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hannah\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/linguamonium.com\/?p=843#article\",\"isPartOf\":{\"@id\":\"https:\/\/linguamonium.com\/?p=843\"},\"author\":{\"name\":\"hannah\",\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238\"},\"headline\":\"Real-life Star Trek: Humans fail to detect speech deepfakes\",\"datePublished\":\"2024-02-23T20:56:14+00:00\",\"dateModified\":\"2024-02-23T20:56:14+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/linguamonium.com\/?p=843\"},\"wordCount\":565,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238\"},\"keywords\":[\"AI-generated speech\",\"digital forensics\",\"machine learning\",\"ML\",\"research\",\"speech deepfakes\",\"speech synthesis\"],\"articleSection\":[\"NLU\/NLP\/ML\/AI\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/linguamonium.com\/?p=843#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/linguamonium.com\/?p=843\",\"url\":\"https:\/\/linguamonium.com\/?p=843\",\"name\":\"Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium\",\"isPartOf\":{\"@id\":\"https:\/\/linguamonium.com\/#website\"},\"datePublished\":\"2024-02-23T20:56:14+00:00\",\"dateModified\":\"2024-02-23T20:56:14+00:00\",\"description\":\"Recent research demonstrates that humans are not great at detecting speech deepfakes. 
I share some of my musings on the topic.\",\"breadcrumb\":{\"@id\":\"https:\/\/linguamonium.com\/?p=843#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/linguamonium.com\/?p=843\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/linguamonium.com\/?p=843#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/linguamonium.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Real-life Star Trek: Humans fail to detect speech deepfakes\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/linguamonium.com\/#website\",\"url\":\"https:\/\/linguamonium.com\/\",\"name\":\"Linguamonium\",\"description\":\"A tumult of linguistic musings\",\"publisher\":{\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238\"},\"alternateName\":\"Linguamo\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/linguamonium.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238\",\"name\":\"hannah\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/linguamonium.com\/wp-content\/uploads\/2018\/06\/cropped-dummeh-babboh-2-1.png\",\"contentUrl\":\"https:\/\/linguamonium.com\/wp-content\/uploads\/2018\/06\/cropped-dummeh-babboh-2-1.png\",\"width\":512,\"height\":512,\"caption\":\"hannah\"},\"logo\":{\"@id\":\"https:\/\/linguamonium.com\/#\/schema\/person\/image\/\"},\"sameAs\":[\"https:\/\/linguamonium.com\",\"www.linkedin.com\/in\/hannah-vanbrunt\",\"https:\/\/twitter.com\/linguamonium\"],\"url\":\"https:\/\/linguamonium.com\/?author=1\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium","description":"Recent research demonstrates that humans are not great at detecting speech deepfakes. I share some of my musings on the topic.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/linguamonium.com\/?p=843","og_locale":"en_US","og_type":"article","og_title":"Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium","og_description":"Recent research demonstrates that humans are not great at detecting speech deepfakes. I share some of my musings on the topic.","og_url":"https:\/\/linguamonium.com\/?p=843","og_site_name":"Linguamonium","article_published_time":"2024-02-23T20:56:14+00:00","og_image":[{"width":1558,"height":987,"url":"https:\/\/linguamonium.com\/wp-content\/uploads\/2024\/02\/boy_waveform_recording_medium_cropped.jpg","type":"image\/jpeg"}],"author":"hannah","twitter_card":"summary_large_image","twitter_creator":"@linguamonium","twitter_site":"@linguamonium","twitter_misc":{"Written by":"hannah","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/linguamonium.com\/?p=843#article","isPartOf":{"@id":"https:\/\/linguamonium.com\/?p=843"},"author":{"name":"hannah","@id":"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238"},"headline":"Real-life Star Trek: Humans fail to detect speech deepfakes","datePublished":"2024-02-23T20:56:14+00:00","dateModified":"2024-02-23T20:56:14+00:00","mainEntityOfPage":{"@id":"https:\/\/linguamonium.com\/?p=843"},"wordCount":565,"commentCount":0,"publisher":{"@id":"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238"},"keywords":["AI-generated speech","digital forensics","machine learning","ML","research","speech deepfakes","speech synthesis"],"articleSection":["NLU\/NLP\/ML\/AI"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/linguamonium.com\/?p=843#respond"]}]},{"@type":"WebPage","@id":"https:\/\/linguamonium.com\/?p=843","url":"https:\/\/linguamonium.com\/?p=843","name":"Real-life Star Trek: Humans fail to detect speech deepfakes - Linguamonium","isPartOf":{"@id":"https:\/\/linguamonium.com\/#website"},"datePublished":"2024-02-23T20:56:14+00:00","dateModified":"2024-02-23T20:56:14+00:00","description":"Recent research demonstrates that humans are not great at detecting speech deepfakes. 
I share some of my musings on the topic.","breadcrumb":{"@id":"https:\/\/linguamonium.com\/?p=843#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/linguamonium.com\/?p=843"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/linguamonium.com\/?p=843#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/linguamonium.com\/"},{"@type":"ListItem","position":2,"name":"Real-life Star Trek: Humans fail to detect speech deepfakes"}]},{"@type":"WebSite","@id":"https:\/\/linguamonium.com\/#website","url":"https:\/\/linguamonium.com\/","name":"Linguamonium","description":"A tumult of linguistic musings","publisher":{"@id":"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238"},"alternateName":"Linguamo","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/linguamonium.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/linguamonium.com\/#\/schema\/person\/f6a9c49248cb623f9a6061aaacff0238","name":"hannah","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/linguamonium.com\/#\/schema\/person\/image\/","url":"https:\/\/linguamonium.com\/wp-content\/uploads\/2018\/06\/cropped-dummeh-babboh-2-1.png","contentUrl":"https:\/\/linguamonium.com\/wp-content\/uploads\/2018\/06\/cropped-dummeh-babboh-2-1.png","width":512,"height":512,"caption":"hannah"},"logo":{"@id":"https:\/\/linguamonium.com\/#\/schema\/person\/image\/"},"sameAs":["https:\/\/linguamonium.com","www.linkedin.com\/in\/hannah-vanbrunt","https:\/\/twitter.com\/linguamonium"],"url":"https:\/\/linguamonium.com\/?author=1"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/linguamonium.com\/wp-content\/uploads\/2024\/02\/boy_waveform_recording_medium_cropped.jpg?fit=1558%2C987&ssl=1","jetpack_likes_enabled":true,"jetpack_sharing_enabled":true,"_links
":{"self":[{"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/posts\/843","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/linguamonium.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=843"}],"version-history":[{"count":9,"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/posts\/843\/revisions"}],"predecessor-version":[{"id":854,"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/posts\/843\/revisions\/854"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/linguamonium.com\/index.php?rest_route=\/wp\/v2\/media\/847"}],"wp:attachment":[{"href":"https:\/\/linguamonium.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=843"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/linguamonium.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=843"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/linguamonium.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=843"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}