Presentation

Track: WebRTC and Real-Time Applications

Building a Low-Latency Voice Assistant Using LLMs: Insights and Challenges

At Telnyx, we provide APIs that enable dynamic phone call interactions. With the rise of Large Language Models (LLMs), we integrated them into our voice flows to enhance customer applications. Our Voice Assistant combines transcription, response generation, and speech synthesis. Initially, we faced significant latency and poor interruption handling. Through optimizations including LLM streaming, service colocation, improved transcription, and a custom text-to-speech system, we reduced latency to 900-1,000 ms. We also improved the user experience with advanced end-of-speech detection and noise handling.
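As a rough illustration of why LLM streaming helps latency, the sketch below groups a stream of generated tokens into sentences so that speech synthesis could begin on the first sentence while the rest of the response is still being generated. This is not the Telnyx implementation: `sentences_from_tokens` and `fake_llm_stream` are hypothetical names, the token stream is simulated, and the sentence-boundary rule is deliberately naive (it would split on a decimal point such as "3.5").

```python
from typing import Iterable, Iterator

# Assumption: a sentence ends at one of these characters.
SENTENCE_END = (".", "!", "?")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as it appears in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Flush once the buffered text ends with sentence punctuation,
        # so a downstream text-to-speech step does not wait for the full reply.
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    # Emit any trailing text that never reached a sentence boundary.
    if buffer.strip():
        yield buffer.strip()


def fake_llm_stream() -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response."""
    for token in ["Hello", " there", ".", " How", " can", " I", " help", "?", " Bye"]:
        yield token


if __name__ == "__main__":
    for sentence in sentences_from_tokens(fake_llm_stream()):
        # In a real pipeline, each sentence would be handed to speech synthesis here.
        print(sentence)
```

In a pipeline like this, the delay before the first audio depends on the time to the first sentence and not on the full response length, which is one plausible reason streaming appears among the optimizations listed above.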

During this presentation, we will explore our progress, the challenges we faced, and the innovative solutions we implemented to build a high-performance Voice Assistant. Today, our system delivers low-latency, high-quality voice interactions, allowing customers to focus on their business logic.

Enzo Piacenza - Speaker


Real Time Communications Conference & Expo at Illinois Tech

IEEE International Conference


RTC Conference 2024
