Source: Quantum
OpenAI is launching an advanced AI voice chatbot that users can talk to. The bot is available now, though initially only to a limited group of users.
The new chatbot represents OpenAI's move toward a new generation of AI voice assistants, similar to Siri and Alexa but far more capable of natural, fluid conversation. It is a step toward more comprehensive AI agents. The new ChatGPT voice bot can interpret the information conveyed by different tones of voice, handle interruptions, and answer queries in real time. It has also been trained to sound more natural and can convey a range of emotions with its voice.
Its voice mode is powered by OpenAI's new GPT-4o model, which combines voice, text, and vision capabilities. To gather feedback, OpenAI initially rolled the chatbot out to a "small number" of paid ChatGPT Plus users, but said it will be available to all ChatGPT Plus subscribers this fall. A ChatGPT Plus subscription costs $20 per month (about 144 yuan). OpenAI said it will notify the first batch of users in the ChatGPT app and provide instructions on how to use the new model.
The new voice feature was announced in May but launched a month later than originally planned; the company said it needed more time to improve safety features, such as the model's ability to detect and refuse objectionable content, and to prepare its infrastructure to serve real-time responses to millions of users.
OpenAI said it has tested the model's voice capabilities with more than 100 external red-team members, whose task was to probe the model for flaws. According to OpenAI, these testers speak a total of 45 languages and come from 29 countries.
The company said it has implemented a number of safety mechanisms. For example, to prevent the model from being used to create audio deepfakes, it worked with voice actors to create four preset voices; GPT-4o will not imitate or generate any other person's voice.
When OpenAI first launched GPT-4o, the company faced backlash for using a voice called "Sky" that sounds a lot like actress Scarlett Johansson. Johansson issued a statement saying the company had approached her about allowing her voice to be used in the model, but she declined. She said she was shocked to hear a voice that was "strikingly similar" to hers in a model demonstration. OpenAI denied that the voice was Johansson's but has suspended the use of Sky.
The company has also been embroiled in multiple lawsuits over alleged copyright infringement. OpenAI said it has implemented filters to identify and block requests to generate music or other copyrighted audio. OpenAI also said it has applied the same safety mechanisms it uses in text-based models to GPT-4o to prevent it from violating laws and generating harmful content.
OpenAI plans to add more advanced features in the future, such as video and screen sharing, that could make the assistant even more useful. In a demonstration in May, employees pointed their phone cameras at a piece of paper and asked the AI model to help them solve a math equation. They also shared their computer screens and asked the model to help them solve a programming problem. OpenAI said those features are not available yet but will launch at an unspecified later date.