ByteDance’s Volcano Engine Unveils Two DouBao Voice Large Models with Breakthrough Features
ByteDance’s Volcano Engine has introduced two new DouBao voice large models: the DouBao · Voice Podcast Model and the DouBao · Real-Time Voice Model. Both are intended to make speech recognition and generation more intelligent.
The DouBao · Voice Podcast Model turns a single input into finished podcast material. Given a sentence, web link, long text, or document, it can rapidly search, learn, and generate podcast scripts and content. It can also instantly create natural-sounding two-person conversational podcasts that mimic the rhythm of real-life shows. In addition, its built-in deep search function can generate content based on trending topics.
The DouBao · Real-Time Voice Model is designed for real-time speech recognition and generation, with broad applications in online meetings and educational training. Available to enterprise clients, the model supports advanced control through natural-language instructions. It can sing, impersonate voices, and interpret dialects, and it delivers significantly more human-like tones, expressions, and thinking styles. It can also interrupt and proactively initiate conversations.
These two models mark a significant step in Volcano Engine’s voice technology strategy. They show great potential in podcast content creation and real-time voice interaction. Looking ahead, Volcano Engine says it will continue to innovate, driving the development of voice interaction and the arrival of the intelligent era.