The company plans to “start the alpha with a small group of users to gather feedback and expand based on what we learn.”
alpha rollout starts to plus subscribers next week!
— Sam Altman (@sama), July 2, 2025
Advanced Voice, which does away with the text prompt and lets users converse directly with the AI as they would with another human, was initially announced in May alongside the release of GPT-4o during the company’s Spring Update event. Unlike existing digital assistants such as Siri and Google Assistant, which only provide canned answers to user queries, ChatGPT’s Advanced Voice delivers human-like responses with near-zero latency and in multiple languages.

The GPT-4o model can respond to audio inputs in 320 milliseconds on average, on par with human reaction times in normal conversation. As you can see in the demo video below, the model can converse with multiple users simultaneously, improvise talking points and questions in both English and Portuguese, and convey them with human-like emotions, including “laughter.”
As the company announced in June, the feature’s full rollout won’t happen until at least this fall, and its exact timing will, again, depend on it “meeting our high safety and reliability bar.”
Giving ChatGPT the ability to converse naturally with its users is a major advancement. Eliminating the need for a context window reduces user hardware requirements and expands the potential integrations and use cases for AI (such as increasing access for users with mobility or dexterity limitations).
It can also help speed the technology’s adoption by the public, reducing the barrier to entry for less tech-savvy users who are comfortable interacting with their computers via “hey Siri” but blanch at the prospect of prompt engineering.