Modeling Expressive Performances of the Singing Voice

The long-term goal of this work is to develop models of operatic singers and use them to generate expressive performances that are similar in voice quality and style to how original performances by those singers would sound. This paper focuses on learning timing models of expressive performance from high-level descriptors extracted from existing audio recordings. Our approach applies machine learning to discover singer-specific timing patterns in these recordings. The experimental results show a significant correlation between the note durations of real performances and those predicted by our model.