International Journal of Speech Technology
6 (1): 33-43, January 2003
Copyright © 2003 Kluwer Academic Publishers
All rights reserved

Hierarchical Structure and Word Strength Prediction of Mandarin Prosody

Greg Kochanski
Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA. gpk@alum.mit.edu

Chilin Shih
Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA. cls@prosodies.org

Hongyan Jing
Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA. hjing@us.ibm.com

Abstract

We use Stem-ML to build an automatic learning system for Mandarin prosody that allows us to make quantitative measurements of prosodic strengths. Stem-ML is a phenomenological model of the muscle dynamics and planning process that controls the tension of the vocal folds. Because Stem-ML describes the interactions between nearby tones or accents, we were able to use a highly constrained model with only one accent template for each lexical tone category, and a single prosodic strength per word. The model accurately reproduces the intonation of the speaker, capturing 87% of the variance of the speech's fundamental frequency, f0. The result reveals strong alternating metrical patterns in words, and suggests that the speaker uses word strength to mark a hierarchy of sentence, clause, phrase, and word boundaries.
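As an illustration of the "87% of the variance" figure, the sketch below computes the fraction of variance in an observed f0 contour that a model contour captures, using the common definition 1 - SS_res / SS_tot. That this is the exact measure used in the paper is an assumption. The function name `variance_explained` and the sample values are hypothetical and are not taken from the paper.

```python
def variance_explained(observed, predicted):
    """Return the fraction of the variance of `observed` captured by `predicted`.

    Uses 1 - SS_res / SS_tot. This definition is an assumption; the paper
    may compute its variance figure differently.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Total sum of squares: spread of the observed f0 around its mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    # Residual sum of squares: what the model contour fails to reproduce.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical f0 samples in Hz (made-up numbers, not data from the paper).
f0_observed = [200.0, 220.0, 180.0, 240.0, 210.0]
f0_model = [205.0, 215.0, 185.0, 230.0, 215.0]

print(round(variance_explained(f0_observed, f0_model), 3))
```

A value near 1 means the model contour tracks the observed f0 closely. A value near 0 means it does no better than predicting the mean f0.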

Keywords
prosody, dynamics, modeling, intonation, tone, Mandarin Chinese

Article ID: 5104947