The release of GPT-5 sparked significant controversy, leading Sam Altman to promise the return of GPT-4o during Friday's Reddit AMA (ask me anything), a rather embarrassing situation.
While GPT-5's performance and the abrupt removal of previous models represent the primary controversy, I'm more interested in the less-discussed aspects of OpenAI's most significant release since 2023, particularly how it reflects a shift in the company's strategic focus.
First, on the developer's end, the enhancements are very agent-friendly:
- Tools: custom tools now accept any free-form text input (code, SQL, shell commands, configuration, etc.) beyond just JSON; an allowed-tools mechanism lets you dynamically specify which subset of tools can be used in each round; parallel tool invocation lets the model call multiple tools simultaneously, greatly improving efficiency for complex, multi-part tasks; and a preamble feature automatically generates concise explanations before tool use, improving transparency and debugging (see the first sketch after this list).
- Structured Output: support for CFGs (context-free grammars) lets developers use Lark or regex syntax to precisely constrain output formats; combined with strict mode, this lifts JSON output success rates from roughly 40% to over 90%. The model now adheres strictly to parameter formats, making outputs more predictable and safer (second sketch below).
- Performance/Latency/Cost Control: a new reasoning-effort setting (minimal/low/medium/high) lets developers tune how deeply the model "thinks": minimal for ultra-low-latency needs, high for complex reasoning tasks. A separate verbosity setting (low/medium/high) controls the level of detail; for code, low yields concise output while high adds detailed explanations and structured formatting (third sketch below).
- Context: chain-of-thought carry-over lets reasoning persist across conversation rounds, giving developers fine-grained control over context management in multi-step tasks (fourth sketch below).
- Pricing: Token caching support for high-frequency scenarios, dramatically lower API pricing (now aligned with Gemini and approaching 1/10th of Claude's rates).
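To make the tools item concrete, here is a minimal sketch using the OpenAI Python SDK's Responses API. The `exec_shell` and `get_weather` tools are invented for illustration, and the exact shape of the `allowed_tools` tool_choice and the `custom_tool_call` output item reflects my reading of OpenAI's GPT-5 documentation, so verify the field names against the current API reference:

```python
from openai import OpenAI

client = OpenAI()

# One classic JSON function tool plus one free-form "custom" tool whose
# input is raw text (here, a shell command) rather than JSON arguments.
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    {
        "type": "custom",
        "name": "exec_shell",
        "description": "Run a read-only shell command and return its stdout.",
    },
]

resp = client.responses.create(
    model="gpt-5",
    input="Check how much disk space is free on this machine.",
    tools=tools,
    # Allowed tools: restrict this round to a subset of the registered tools.
    tool_choice={
        "type": "allowed_tools",
        "mode": "auto",
        "tools": [{"type": "custom", "name": "exec_shell"}],
    },
)

# Parallel tool use simply shows up as multiple tool-call items in one output list.
for item in resp.output:
    if item.type == "custom_tool_call":
        print(f"{item.name} -> {item.input}")  # raw text, e.g. "df -h"
```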
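For grammar-constrained output, a sketch along these lines should work; the `format: {"type": "grammar", "syntax": "lark", ...}` block reflects the documented CFG support as I understand it, and the toy SQL grammar is purely illustrative:

```python
from openai import OpenAI

client = OpenAI()

# A small Lark grammar that only admits "SELECT <cols> FROM <table>;"
sql_grammar = r"""
start: "SELECT " columns " FROM " table ";"
columns: NAME ("," " " NAME)*
table: NAME
NAME: /[A-Za-z_][A-Za-z0-9_]*/
"""

resp = client.responses.create(
    model="gpt-5",
    input="Return the id and email columns from the users table.",
    tools=[{
        "type": "custom",
        "name": "sql_query",
        "description": "Emit exactly one SQL SELECT statement.",
        # Constrain the tool's free-form input with a context-free grammar.
        "format": {"type": "grammar", "syntax": "lark", "definition": sql_grammar},
    }],
    tool_choice="required",  # force the model to go through the constrained tool
)

call = next(item for item in resp.output if item.type == "custom_tool_call")
print(call.input)  # e.g. "SELECT id, email FROM users;"
```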
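Reasoning effort and verbosity are ordinary request parameters; a minimal sketch, assuming the documented `reasoning.effort` and `text.verbosity` fields (the prompts are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Latency-sensitive path: minimal reasoning, terse output.
quick = client.responses.create(
    model="gpt-5",
    input="Give a one-line summary of what a Bloom filter is.",
    reasoning={"effort": "minimal"},  # minimal / low / medium / high
    text={"verbosity": "low"},        # low / medium / high
)
print(quick.output_text)

# Complex task: deeper reasoning, detailed structured output.
deep = client.responses.create(
    model="gpt-5",
    input="Design a sharding strategy for a 2 TB Postgres database.",
    reasoning={"effort": "high"},
    text={"verbosity": "high"},
)
print(deep.output_text)
```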
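Finally, chain-of-thought carry-over across turns works by chaining requests with `previous_response_id` instead of resending the whole transcript; again, the task prompts are just placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Turn 1: the model reasons about the task; that reasoning is kept as part
# of this response's state.
first = client.responses.create(
    model="gpt-5",
    input="Plan the steps to migrate a cron-based ETL job to an event-driven pipeline.",
    reasoning={"effort": "medium"},
)
print(first.output_text)

# Turn 2: chaining on previous_response_id carries the prior turn's context,
# including its reasoning, into this request.
second = client.responses.create(
    model="gpt-5",
    input="Now write out step 1 in detail.",
    previous_response_id=first.id,
)
print(second.output_text)
```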
These API enhancements focus on being "agent-friendly" and "developer-controllable." Developers experienced with model APIs will recognize that while these features aren't directly related to model intelligence, they address significant headaches in day-to-day engineering work. When model intelligence itself isn't the bottleneck, such practical features often become the decisive factor in choosing which AI system to use.
Second is the ecosystem. Few may have noticed that Cursor released a CLI mode on the second day after GPT-5's launch, quickly integrating GPT-5 in response to Claude Code's impact. Frequent users know that Claude Code's true killer feature is its $200 Max subscription, which provides thousands of dollars' worth of tokens. Now that Anthropic can no longer afford such subsidies and has begun limiting usage, the affordable GPT-5 API paired with the Cursor CLI offers an immediate counterattack just as the competition pauses to regroup. Additionally, the recently released open-weight models come in strategically chosen sizes: 120B for high-performance workstations and 20B for consumer PCs and even MacBooks. Their optimization goals and marketing clearly target the recent surge of open-source models from China.
Finally, the most radical change on the consumer side is ChatGPT's completely revamped user experience. GPT-5 is more than just a model: it's a unified system that combines a fast model, a thinking model, and real-time routing between them. OpenAI has removed all other models, offering only GPT-5. This simplification actually benefits the "silent majority" of users who were previously confused by the long model list and likely never understood which one to choose, much like Word users who only know how to copy and paste via right-click. The backlash stems primarily from GPT-5's underwhelming performance (many users believe it's inferior to GPT-4o) rather than from the interface overhaul itself.

The system also adds four preset answer personalities, enhanced memory, better third-party integration (Gmail, calendar), improved accuracy on healthcare questions, significantly reduced hallucinations, and a stronger emphasis on safe output. Power users skilled with prompts may find these features trivial, but they're genuinely transformative for novices. Additionally, for the first time the flagship model was made immediately available to free users, and charging U.S. government employees just $1 serves as both a political statement and a strategic move to capture institutional mindshare.
Analyzing the current competitive landscape helps us understand OpenAI's actions. As AI labs have converged on the general LLM + RL recipe, capturing users' mindshare has become the key competitive advantage. This is crucial not only for direct revenue but also for the second phase of model training: learning in real environments.
On one hand, OpenAI leads in consumer products with 700 million weekly users (85% international). METR's report shows their user stickiness (DAU/MAU) exceeds 20%, while no close competitor surpasses 10% (excluding China's DeepSeek and MS Copilot, which also uses OpenAI's model). They've doubled annualized revenue to $12B with 90% penetration among Global 500 companies.
On the other hand, Anthropic's growth rate is significantly faster: revenue grew more than 10x in each of the past two years and quadrupled in the first half of 2025 to reach $4B annualized. An estimated 60-70% comes from API usage (despite the industry's most expensive API pricing), and Claude Code alone has reached nearly $400 million in annualized revenue within months of launch. Seeing developers gravitate toward Anthropic must concern OpenAI, as historical experience shows that platform-type products always win by attracting creators.
Meanwhile, Google remains a formidable competitor, with model technology that potentially surpasses GPT's. Its innovations span LLMs, video, world models, and AI4S. Though slower in productization, Google's robust platforms (Search, Chrome, Android, YouTube) provide substantial resources to sustain a prolonged competitive battle. Additional threats include Elon Musk's xAI with Grok, a regrouping Meta, and Chinese players competing for the developer ecosystem and enterprise market with open-weight models.
OpenAI aims to establish itself as both the dominant global consumer AI portal and the preferred assistant, while maintaining its strong position with agent developers and the enterprise market. The strategy is well executed; ironically, however, what holds the company back this time is its traditional strength: core model capability...