OpenAI releases GPT-5.2 after “code red” Google threat alert

In the rapidly evolving landscape of artificial intelligence, model releases are becoming increasingly frequent. OpenAI’s latest release, GPT-5.2, marks the company’s third major update since August. This new model builds upon the foundation laid by its predecessors, incorporating a new routing system that toggles between instant-response and simulated reasoning modes. Although users initially complained about responses feeling cold and clinical, subsequent updates have focused on making the system more conversational.

November’s GPT-5.1 update introduced eight preset “personality” options, aiming to enhance the model’s conversational capabilities. The latest release, GPT-5.2, is ostensibly a response to the performance of Gemini 3, a competing model. However, OpenAI has chosen not to directly compare the two models on its promotional website, instead highlighting GPT-5.2’s improvements over its predecessors and its performance on the new GDPval benchmark.

Performance Benchmarks

The GDPval benchmark, developed by OpenAI, measures professional knowledge work tasks across 44 occupations. According to the company, GPT-5.2 Thinking beats or ties “human professionals” on 70.9 percent of tasks in this benchmark, outperforming Gemini 3 Pro, which achieves 53.3 percent. Additionally, GPT-5.2 completes these tasks at more than 11 times the speed and less than 1 percent of the cost of human experts.

OpenAI has also shared comparison benchmarks with Gemini 3 Pro and Claude Opus 4.5, showcasing GPT-5.2’s performance on various tasks. On the SWE-Bench Pro software engineering benchmark, GPT-5.2 scored 55.6 percent, compared to 43.3 percent for Gemini 3 Pro and 52.0 percent for Claude Opus 4.5. On the GPQA Diamond graduate-level science benchmark, GPT-5.2 scored 92.4 percent, narrowly outperforming Gemini 3 Pro’s 91.9 percent.

GPT-5.2 benchmarks that OpenAI shared with the press. Credit: OpenAI / VentureBeat

Reducing Confabulations

According to Max Schwarzer, OpenAI’s post-training lead, GPT-5.2 Thinking generates responses with 38 percent fewer confabulations than GPT-5.1. This improvement matters because confabulations, plausible-sounding but false statements a model presents as fact, can quietly mislead users. Schwarzer says the model “hallucinates substantially less” than its predecessor, a meaningful step toward more reliable AI systems.

While these benchmarks are promising, it’s essential to approach them with a critical eye. The science of measuring AI performance objectively is still evolving, and corporate sales pitches often present benchmarks in a favorable light. Independent benchmark results from researchers outside OpenAI will provide a more comprehensive understanding of GPT-5.2’s capabilities.

For now, users can expect competent models with incremental improvements and enhanced coding performance, with independent evaluations still to come as the AI landscape continues to shift.

Image Credit: arstechnica.com
