
Tooling for expressive TTS

Over the last several years, I’ve been working on ways to make computer-generated speech more expressive.
Not just a bit, but radically more so.

A Christmas Carol - audiobook excerpt

Several years ago, before the Cambrian explosion of neural speech models, I spent a lot of time working on parametric synthesis. The idea was that one could animate speech in the same way that Pixar animators manipulate wireframe models to create the illusion of life in CG characters. This work resulted in a couple of patents, which have since been outpaced by the rapid development of machine learning and neural speech synthesis.

Once spline control of parameters became obsolete, I switched my focus to developing better tools for managing large scripts and multiple takes across a large cast of characters. The most recent incarnation of these tools is tied to the open-source TTS package “Chatterbox Turbo” from Resemble.ai.

As a proof of concept, I wanted to use these tools to create a long-form excerpt from “A Christmas Carol”. I designed a pleasing narrator voice and an appropriately grumpy one for Scrooge.