
Kyle Phillips

Creative Technologist, Google NYC

Gemini Live Web Console

Bringing Google's low-latency conversational AI to the web

December, 2024

In December 2024, Google DeepMind released the Gemini 2.0 Flash model and its first-ever low-latency, audio-first Live API. While our models were advancing, there was no easy on-ramp for web developers, and I knew there were hard problems that needed to be solved to lower that barrier. The Live API Web Console set out to solve those hard parts.

Along with designer Hana Tanimura and creative director Alexander Chen, I began to build what became Gemini's second-most-popular GitHub repository. The Live API Web Console is a starter project, written in React and TypeScript, that manages the WebSocket connection, worklet-based audio processing, logging, and streaming of your webcam or screen share.
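
To give a flavor of what the starter handles for you, here is a minimal, hypothetical sketch of the WebSocket side; the class name, endpoint, and message handling are illustrative assumptions, not the console's actual API.

```typescript
// Illustrative sketch only: the class name and message handling are
// hypothetical, not the console's real implementation.
class GeminiLiveClient {
  private ws?: WebSocket;

  constructor(private url: string) {}

  // Open the WebSocket connection and wire up basic event handling.
  connect(): Promise<void> {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket(this.url);
      this.ws.onopen = () => resolve();
      this.ws.onerror = (err) => reject(err);
      this.ws.onmessage = (event) => this.handleMessage(event.data);
    });
  }

  // Send a chunk of encoded microphone audio as a binary frame.
  sendAudioChunk(chunk: ArrayBuffer): void {
    this.ws?.send(chunk);
  }

  private handleMessage(data: unknown): void {
    // The console records every inbound message for its logs view;
    // here we simply print it.
    console.log("incoming", data);
  }
}
```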

A diagram showcasing the three main areas of the console.

An animation showing the control tray interface. The control tray shows your WebSocket connection status, your microphone activity, and the model's audio activity, as well as controls to share your webcam or your screen.
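
The activity meters in the tray are driven by audio processed off the main thread. Below is a minimal sketch of how a worklet-based volume meter can work in general; the processor name and message shape are assumptions, not the console's exact code. The file runs in the AudioWorkletGlobalScope and is loaded via `audioContext.audioWorklet.addModule(...)`.

```typescript
// Generic AudioWorklet volume meter sketch; processor name and message
// shape are illustrative, not the console's actual implementation.
class VolumeMeterProcessor extends AudioWorkletProcessor {
  process(inputs: Float32Array[][]): boolean {
    const channel = inputs[0]?.[0];
    if (channel) {
      // Root-mean-square of the current 128-sample block.
      let sum = 0;
      for (let i = 0; i < channel.length; i++) sum += channel[i] * channel[i];
      const rms = Math.sqrt(sum / channel.length);
      // Report the level to the main thread, which updates the UI meter.
      this.port.postMessage({ volume: rms });
    }
    return true; // keep the processor alive
  }
}

registerProcessor("volume-meter", VolumeMeterProcessor);
```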

The logs view shows the details of every message going into and coming out of the model.
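
For a sense of what that involves, here is a hypothetical sketch of a log entry shape and a small recording helper; the field names and example message types are assumptions, not the console's actual schema.

```typescript
// Illustrative log entry shape; the console's real schema may differ.
interface LogEntry {
  date: Date;
  direction: "client" | "server"; // outgoing vs. incoming
  type: string;                   // e.g. "audio", "toolCall"
  message: unknown;               // raw payload for inspection
}

const logs: LogEntry[] = [];

function log(direction: LogEntry["direction"], type: string, message: unknown): void {
  logs.push({ date: new Date(), direction, type, message });
}

// Example usage: record both sides of the conversation.
log("client", "audio", { byteLength: 3200 });
log("server", "modelTurn", { parts: [{ text: "Hello!" }] });
```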


Since its initial launch, this project has had an outsized impact. It has been used in Gemini Robotics to showcase embodied reasoning, and was featured several times at Google I/O 2025, including demos in the developer keynote, the pre-show, and the AI Sandbox. It is often found at the forefront of where AI finds personality.

Supporting Materials

1 Reference

  1. Gemini Robotics

    In December 2024, I launched the Gemini Live Web Console. That project lowered the barrier to prototyping with conversational AI and became an ideal basis for DeepMind researchers to build upon. I worked with Fei Xia and others at DeepMind and Creative Lab to produce a series of robotics demonstrations of embodied reasoning.