A Unified Framework for Multilingual Text-to-Speech Synthesis with SSML Specification as Interface

Authors:	WU Zhiyong, CAO Guangqi, MENG M. Helen, CAI Lianhong

Institution:	a. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; b. Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China

Abstract:	This paper describes the design of a unified framework for Crystal, a multilingual text-to-speech (TTS) synthesis engine. The unified framework defines the common TTS modules for different languages and/or dialects. The interfaces between consecutive modules conform to the speech synthesis markup language (SSML) specification for standardization, interoperability, multilinguality, and extensibility. Detailed module divisions and implementation technologies for the unified framework are introduced, together with possible extensions for TTS algorithm research and evaluation. Implementation of a mixed-language TTS system for Chinese Putonghua, Chinese Cantonese, and English demonstrates the feasibility of the proposed unified framework.

Keywords:	text-to-speech (TTS) synthesis; multilingual; unified framework; speech synthesis markup language (SSML)
This article is indexed in CNKI, Wanfang Data, ScienceDirect, and other databases.