OpenAI announced the launch of o1, an AI model with reasoning capabilities, internally codenamed "Strawberry". OpenAI o1 can reason through complex tasks and solve harder problems in science, coding, and mathematics than previous models.
In tests, OpenAI o1 performed comparably to PhD students on challenging benchmark tasks in physics, chemistry, and biology, and did well in mathematics and coding. On a qualifying exam for the International Mathematical Olympiad (IMO), GPT-4o correctly solved only 13% of the problems, while o1 scored 83%; in Codeforces programming competitions, o1's coding ability reached the 89th percentile.
As an early model, OpenAI o1 does not yet have many of ChatGPT's useful features, such as browsing the web for information or uploading files and images, so GPT-4o will remain more capable for many common use cases in the short term. But for complex reasoning tasks, o1 is a major improvement and represents a new level of AI capability.
In view of this, OpenAI is resetting the counter to 1 and naming the series OpenAI o1. Healthcare researchers can use o1 to annotate cell sequencing data, physicists can use it to generate the complex mathematical formulas required for quantum optics, and developers in every field can use it to build and execute multi-step workflows.
OpenAI also released OpenAI o1-mini, a cost-effective reasoning model. o1-mini excels in STEM, especially math and coding, performing nearly as well as OpenAI o1 on evaluation benchmarks such as AIME and Codeforces. OpenAI expects o1-mini, which is 80% cheaper than o1-preview, to be a faster, cost-effective option for applications that require reasoning but not extensive world knowledge. ChatGPT Plus, Team, Enterprise, and Edu users can use o1-mini as an alternative to o1-preview, with higher rate limits and lower latency.
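For developers who want to experiment from code, the minimal sketch below shows one way a reasoning request might be sent to o1-mini through the OpenAI Python SDK. The model identifier, the example prompt, and whether the model is available under a given API key are assumptions for illustration, not details from the announcement.

```python
# Minimal sketch (assumptions: the "o1-mini" model identifier is available
# to your API key, and the standard OpenAI Python SDK is installed).
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="o1-mini",  # assumed reasoning-model identifier
    messages=[
        {
            "role": "user",
            "content": (
                "Write a Python function that returns the nth Fibonacci number, "
                "then explain its time complexity."
            ),
        }
    ],
)

# Print the model's answer
print(response.choices[0].message.content)
```

The request deliberately uses only a plain user message, since reasoning-focused models are positioned for prompts where the model works through the problem itself rather than relying on extensive instructions or world knowledge.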