Creating natural-sounding spoken dialogues between two people has long been a pain point in AI-driven voice applications. Traditional approaches either…
Spoken Dialogue Generation
Kimi-Audio: A Unified, Open-Source Foundation Model for Speech, Sound, and Spoken Dialogue
Building voice-enabled applications today often means stitching together separate models for speech recognition, sound classification, audio captioning, and spoken response…