Track: WebRTC and Real-Time Applications Keynote
Engaging a New Species: How Multimodal LLMs Are Set to Transform RTC Infrastructure
The advent of multimodal large language models has introduced a new kind of participant into real-time communication (RTC) infrastructure built for humans. While these models lack ears, they can still listen; though they have no mouth, they can speak. This raises a compelling question: how will the interfaces for integrating these models differ from the traditional microphones and speakers we use today? And how will these models, as new "customers" of RTC infrastructure, behave compared to humans? For example, codecs designed around human auditory perception may need to evolve. LLMs could "speak" at speeds far beyond human capability, or process several seconds of speech in a single second, provided the audio arrives faster than real time rather than paced for human listening. What new requirements will these capabilities place on RTC infrastructure, and how will it adapt? With over 700 billion minutes of real-time audio and video running through Agora’s RTC infrastructure annually, we are finely tuned for human-to-human communication. We invite you to join this keynote as we explore the potential of real-time communication with large language models and examine the possibilities and challenges of this new form of conversation.
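To make the faster-than-real-time point concrete, here is a minimal sketch of the contrast between pacing audio at playback speed for a human listener and delivering the same buffered audio to a machine listener in a single burst. This is our illustration only, not Agora's API or anything from the talk; the frame sizes and the `deliver`/`consume` names are hypothetical.

```python
# Illustrative sketch: the same 5-second clip, delivered two ways.
# All names and parameters here are assumptions, not a real RTC SDK.
import time

FRAME_MS = 20           # a common RTC audio frame duration
CLIP_SECONDS = 5        # a 5-second utterance
# Fake 20 ms PCM frames standing in for encoded audio packets.
frames = [b"\x00" * 320] * (CLIP_SECONDS * 1000 // FRAME_MS)

def consume(frame: bytes) -> None:
    """Stand-in for "the model listens"; an LLM front end could
    ingest frames far faster than real-time playback."""
    pass

def deliver(frames: list[bytes], paced: bool) -> float:
    """Feed frames to the consumer, pacing at real time only if `paced`."""
    start = time.monotonic()
    for frame in frames:
        consume(frame)
        if paced:
            time.sleep(FRAME_MS / 1000)  # human-style real-time pacing
    return time.monotonic() - start

print(f"paced: {deliver(frames, paced=True):.2f}s")   # ~5 s, human listening rate
print(f"burst: {deliver(frames, paced=False):.2f}s")  # milliseconds
```

The pacing, jitter buffering, and perceptual coding that today's pipelines impose exist for human ears; a machine consumer that tolerates bursts changes what the transport needs to guarantee.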
|
Presentation Video |
Presentation Notes |
ZHAO-ZHONG-ENGAGING-A-NEW-SPECIES-MULTIMODAL-LLMS2.pdf |