We build our model on top of Stem-ML because it has several desirable properties. Its representation is understandable, adjustable, and can be transported from one situation to another.
Unlike most approaches, this model separates cleanly into local (word-dependent) and global (speaker-dependent) parameters. For instance, one can generate acceptable speech by combining the templates of one speaker with the prosodic strengths of another: a female speaker's tone templates were used as part of a model to predict a male speaker's f0 contours. Unlike some descriptive models, ours predicts numerical f0 values, so it can be tested quantitatively and can be extended to test linguistic theories. Few other approaches to intonation have these properties.
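The separation of parameters can be illustrated with a toy sketch. This is not the Stem-ML computation itself: the template shapes, the names `TEMPLATES_A` and `render_f0`, the `base_hz` and `range_hz` parameters, and the linear scaling rule are all assumptions made for illustration. The sketch only shows how one speaker's templates might be paired with another speaker's strengths to yield numerical f0 values.

```python
from typing import Dict, List, Sequence

# Hypothetical tone templates for speaker A: normalized pitch shapes in [0, 1].
TEMPLATES_A: Dict[str, List[float]] = {
    "high": [1.0, 1.0, 1.0],
    "rising": [0.0, 0.5, 1.0],
    "falling": [1.0, 0.5, 0.0],
}


def render_f0(
    tones: Sequence[str],
    strengths: Sequence[float],
    templates: Dict[str, List[float]],
    base_hz: float,
    range_hz: float,
) -> List[float]:
    """Combine templates with per-word strengths into f0 values (Hz).

    Assumed toy rule: a strength of 1 realizes the template fully, while a
    strength of 0 flattens it to the middle of the pitch range.
    """
    if len(tones) != len(strengths):
        raise ValueError("need exactly one strength per tone")
    contour: List[float] = []
    for tone, strength in zip(tones, strengths):
        for value in templates[tone]:
            # Scale the template's deviation from mid-range by the strength.
            contour.append(base_hz + range_hz * (0.5 + strength * (value - 0.5)))
    return contour


# Speaker A's templates, paired with strengths taken from a second speaker
# (hypothetical values) and rendered in that second speaker's pitch range.
strengths_b = [1.0, 0.5]
contour = render_f0(["high", "falling"], strengths_b, TEMPLATES_A,
                    base_hz=120.0, range_hz=60.0)
```

Because the output is a list of numbers rather than a symbolic description, it can be compared directly against measured f0, which is what makes a quantitative test possible.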