After several delays reportedly related to safety and fine-tuning, OpenAI’s much-anticipated “Advanced Voice Mode” (AVM) for ChatGPT is now available in alpha to select users.
The AVM feature was announced and demonstrated back in May. It allows users to have a real-time conversation with the ChatGPT artificial intelligence model via a text-to-speech synthesis module.
Remember Duplex?
Those familiar with the concept may remember Google’s 2018 announcement that its “Duplex” AI service would be available “soon.” At its I/O developer event, the company showed off an AI system capable of calling businesses on a user’s behalf to schedule appointments with humans in real time.
The big idea, according to Google, was that the AI would be robust enough to handle casual conversation and to confirm the correct information.
The Duplex project was eventually shuttered, but its legacy apparently lives on in OpenAI’s ChatGPT.
Advanced Voice Mode
AVM features real-time communication that attempts to mimic human-to-human discussions. ChatGPT responds to user queries in a human-like voice that has a natural cadence. Users can interrupt the chatbot mid-sentence and, based on the demo, it can keep track of what’s been said.
The company is launching the feature in limited alpha to continue evaluating its capabilities and safety implications. While the May demos were impressive, there were some glitchy moments, and it isn’t hard to imagine scenarios where the technology could be misused.
Per OpenAI, safety has been the company’s paramount concern. In a post on X announcing the feature’s launch, the company wrote:
“We tested GPT-4o’s voice capabilities with 100+ external red teamers across 45 languages. To protect people’s privacy, we’ve trained the model to only speak in the four preset voices, and we built systems to block outputs that differ from those voices. We’ve also implemented guardrails to block requests for violent or copyrighted content.”
The staged rollout of AVM has already begun, according to OpenAI, with more users to be added “on a rolling basis.” The company expects the feature to be available to all Plus subscribers in the fall.