To support speech rehabilitation for children with hearing loss, we identified error-prone pronunciation texts and confusable pronunciation text pairs from a large collection of audio recordings of these children's speech. We designed a data-driven 3D talking head articulatory animation system. The system is driven by articulatory movements recorded with an electromagnetic articulography (EMA) AG500 device and realistically simulates Chinese articulation. Children with hearing loss can thus observe the speaker's lip and tongue movements during articulation, which can support pronunciation training and help correct error-prone pronunciations. A perception test was then conducted to evaluate the system's performance. The results showed that the 3D talking head system can effectively animate both internal and external articulatory motions. In addition, a synthesis method based on a modified, phoneme-based CM coarticulation model was used to generate the articulatory movements for the error-prone pronunciation texts. The root mean square (RMS) error between the real and the synthetic articulatory movements was used to evaluate the synthesis method, and the mean RMS error was 1.25 mm.
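As an illustration of the RMS measure mentioned above, the following minimal sketch computes an RMS error between a recorded and a synthesized trajectory. The abstract does not give the exact formula, so the sketch assumes the error is the root mean square of the per-frame Euclidean distance between corresponding sensor positions; the function name `rms_error` and the sample coordinates are hypothetical and are not taken from the paper's data.

```python
import math


def rms_error(real, synth):
    """RMS of per-frame Euclidean distances between two trajectories.

    Both arguments are equal-length sequences of (x, y, z) positions.
    The result is in the same unit as the input (e.g. mm).
    """
    if len(real) != len(synth) or not real:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_sum = 0.0
    for p, q in zip(real, synth):
        # Squared Euclidean distance between the two positions in this frame.
        squared_sum += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_sum / len(real))


# Hypothetical sensor positions in mm, for illustration only.
real = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 4.0, 0.0)]
synth = [(0.0, 1.0, 0.0), (1.0, 2.0, 0.0), (2.0, 3.0, 0.0)]
print(f"RMS error: {rms_error(real, synth):.3f} mm")
```

Under this assumption, a mean value such as the reported 1.25 mm would be obtained by averaging `rms_error` over all evaluated sensors and utterances.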
Zheng Hongna, Zhu Yun, Wang Lan, et al. Chinese 3D Articulatory Movement Synthesis and Animation [J]. Journal of Integration Technology, 2013, 2(1): 23-28.