OpenAI Unveils Advanced AI Models with Enhanced ‘Reasoning’ Capabilities

  • Published September 13, 2024

OpenAI, backed by Microsoft, announced the launch of a new series of AI models developed under the internal code name “Strawberry.”

Released as “o1” and “o1-mini,” the models are designed to handle complex problem-solving tasks in fields like science, mathematics, and coding. Unlike previous iterations, they spend more time processing a query before responding, a design choice aimed at improving their reasoning abilities.

According to OpenAI’s blog post on Thursday, the o1 model scored 83% on a qualifying exam for the International Mathematical Olympiad, far above the 13% achieved by the earlier GPT-4o model. The models also showed improved performance in competitive programming and exceeded human PhD-level accuracy on a benchmark of science questions.

The key advancement in these models is “chain-of-thought” reasoning, which breaks a complex problem into smaller, logical steps. Long known in AI research as a prompting strategy that users apply by hand, the technique is now automated in the o1 series, allowing the models to refine their approach and correct mistakes on their own as they process queries.
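For readers unfamiliar with the manual version of this strategy, the short Python sketch below contrasts a direct prompt with a chain-of-thought prompt. It is only an illustration of the prompting idea: the function names and prompt wording are hypothetical, and it does not represent how OpenAI implements reasoning inside the o1 models, which the company's announcement does not detail.

```python
def build_direct_prompt(question: str) -> str:
    """Ask for the answer only, with no intermediate reasoning."""
    return f"Question: {question}\nAnswer:"


def build_chain_of_thought_prompt(question: str) -> str:
    """Ask the model to write out intermediate steps before answering.

    The instruction wording is an assumption made for illustration.
    """
    return (
        f"Question: {question}\n"
        "Work through the problem step by step, "
        "then give the final answer on its own line.\n"
        "Reasoning:"
    )


if __name__ == "__main__":
    question = (
        "A train travels 120 km in 2 hours. "
        "How far does it travel in 5 hours at the same speed?"
    )
    # The same question, framed two ways.
    print(build_direct_prompt(question))
    print()
    print(build_chain_of_thought_prompt(question))
```

With earlier models, a user had to add the "step by step" instruction by hand. According to OpenAI, the o1 series performs this kind of stepwise reasoning without being asked.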

Noam Brown, a researcher at OpenAI, confirmed the new models’ connection to the Strawberry project on social media, expressing excitement about their potential for more general reasoning.

The o1 series will be integrated into ChatGPT and made available via OpenAI’s API, with a preview version accessible to ChatGPT Plus and Team users. However, these models come with limitations, such as longer response times and text-only capabilities. Additionally, users will face message limits during the preview phase. Despite these drawbacks, OpenAI believes the models represent a significant leap in AI’s ability to solve complex problems, particularly in math, coding, and science.

With input from Reuters, Axios, The Atlantic, Azure.

Written By
Joe Yans