Title: The New Frontier of Speech to Speech Translation: Breaking Down Language Barriers
Speaker: William Lewis, Principal Technical Program Manager, Microsoft Translator team, Microsoft Research
Time: Monday, September 17, 2016, 9:30–10:30
Venue: 文管A116
Host: Prof. Zhu Jingbo (朱靖波)
Abstract:
In 1966, Star Trek introduced us to the notion of the Universal Translator. Such a device allowed Captain Kirk and his crew to communicate with alien species, such as the Gorn, who did not speak their language. In 1979, Douglas Adams introduced us to the “Babel fish” in The Hitchhiker’s Guide to the Galaxy which, when inserted into the ear, allowed the main character to do essentially the same thing: communicate with alien species who spoke different languages. Although flawless communication using speech and translation technology is beyond the current state of the art, major improvements in these technologies over the past decade have brought us many steps closer. The Microsoft Translator team has taken on the challenge of production-level, open-domain Speech to Speech (S2S) Translation. By using the latest technologies, specifically Deep Neural Networks (DNNs), in both Automated Speech Recognition (ASR) and Machine Translation (MT) training, we have significantly improved the quality of speech translation. We provide the speech translation service through an API, so that tools such as Skype Translator, our mobile apps, and many other tools and apps can integrate S2S directly into an existing workflow. Skype Translator, in particular, provides a unique user experience: it allows a Skype user who speaks, say, English, to call a colleague or friend who speaks, say, Mandarin, and hold a bilingual conversation mediated by the translator.
The crucial technologies that make Skype Translator possible are Automated Speech Recognition (ASR), Machine Translation (MT), and Text to Speech (TTS). In this talk, I will review how we built conversational Speech Recognition models, cleaned up the output of conversational SR via disfluency processing (“TrueText”), and built conversational MT models that perform better on output from an SR engine, all of which combine to make near-real-time conversations possible. Further, I will review our S2S API and how anyone can use it to build their own apps or use it in their work.
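To make the pipeline concrete, here is a minimal, purely illustrative sketch of the S2S stages described above (ASR, “TrueText”-style disfluency cleanup, MT, TTS). Every function name and the toy disfluency rules are hypothetical stand-ins for exposition only, not the actual Microsoft Translator API:

```python
# Illustrative sketch of an S2S pipeline: ASR -> disfluency cleanup -> MT -> TTS.
# All names below are hypothetical; real engines replace each toy stage.

def recognize_speech(audio: str) -> str:
    """Hypothetical ASR stage: here we simply pass through a transcript
    that already contains typical conversational disfluencies."""
    return audio

def true_text(transcript: str) -> str:
    """Toy stand-in for disfluency processing: drop common filler words
    and immediate word repetitions before translation."""
    fillers = {"um", "uh", "er"}
    cleaned = []
    for word in transcript.split():
        if word.lower().strip(",.") in fillers:
            continue  # drop fillers ("um", "uh, ")
        if cleaned and cleaned[-1].lower() == word.lower():
            continue  # drop immediate repetitions ("I I think")
        cleaned.append(word)
    return " ".join(cleaned)

def translate(text: str, target: str) -> str:
    """Hypothetical MT stage: a tiny lookup table instead of a real engine."""
    table = {("hello", "zh"): "你好"}
    return table.get((text.lower(), target), f"[{target}] {text}")

def speak(text: str) -> str:
    """Hypothetical TTS stage: returns a label instead of synthesized audio."""
    return f"<audio: {text}>"

def s2s(audio: str, target: str) -> str:
    """Chain the four stages into one speech-to-speech call."""
    return speak(translate(true_text(recognize_speech(audio)), target))

print(s2s("um I I think, uh, this works", "zh"))
# prints "<audio: [zh] I think, this works>"
```

The point of the sketch is the staging: disfluency cleanup sits between recognition and translation precisely because MT models perform better on fluent, text-like input than on raw conversational transcripts.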
Speaker Bio:
William Lewis is Principal Technical Program Manager with the Microsoft Translator team at Microsoft Research. He has led the Microsoft Translator team's efforts to build Machine Translation engines for a variety of the world's languages, including threatened and endangered languages, and has been working with the Translator team to build Skype Translator. He has been leading the efforts to support the features that allow deaf and hard-of-hearing users to make calls over Skype. This work has been extended to the classroom in Seattle Public Schools, where “mainstreamed” deaf and hard-of-hearing children are using MSR’s speech recognition technology to participate fully in the “hearing” classroom. Before joining Microsoft, Will was Assistant Professor and founding faculty for the Computational Linguistics Master's Program at the University of Washington, where he continues to hold an Affiliate Appointment and to teach classes on Natural Language Processing. Before that, he was faculty at CSU Fresno, where he helped found the Computational Linguistics and Cognitive Science Programs at the university. He received a Bachelor's degree in Linguistics from the University of California, Davis, and a Master's and Doctorate in Linguistics, with an emphasis in Computational Linguistics, from the University of Arizona in Tucson. In addition to regularly publishing in the fields of Natural Language Processing and Machine Translation, Will serves on the editorial board of the journal Machine Translation, sits on the board of the Association for Machine Translation in the Americas (AMTA), served as a program chair for the North American Chapter of the Association for Computational Linguistics (NAACL) conference in 2015, served as a program chair for the Machine Translation Summit in 2015, regularly reviews papers for a number of Computational Linguistics conferences, and has served multiple times as a panelist for the National Science Foundation.