Podcastle, a podcast recording and editing platform, has joined other companies in the generative AI field by releasing its own text-to-speech model, Asyncflow v1.0. An API will also be available to developers, letting them integrate the model directly into their applications. With the new model, the company can offer more than 450 AI voices capable of narrating text.
The launch puts Podcastle alongside startups such as ElevenLabs, Speechify, and WellSaid, which have developed technologies for converting text into AI-narrated voice clips. The technology has wide applications in marketing, advertising, content creation, education, and corporate training.
Podcastle founder Arto Yeritsyan said the company had always aimed to build its own text-to-speech model, but the training costs and data requirements were too high. He added that advances in large language models led to significant progress last year, enabling the company to create a high-quality voice model without large amounts of data.
Podcastle is also improving its voice cloning feature so that a voice model can be trained faster. Previously, users had to read about 70 different sentences to train it; now a few seconds of recording are enough to create a clone of a voice. The process uses Magic Dust AI, a technology the company released last year to enhance audio quality. The company said it plans to keep improving the feature over time.