My Experience With GPT-5 Has Been Decent, But Not That Impressive


I am pretty sure, by now, you all have used ChatGPT at least once in your life.

It is the chatbot that has disrupted nearly every industry over the last two years. OpenAI, the company behind it, has long been seen as a pioneer in shipping AI features and upgrades. However, other organizations are catching up.

In January, DeepSeek, a little-known Chinese startup, released its reasoning model, R1, for free, offering something OpenAI was charging $20/month for. The release rattled tech stocks worldwide. Then there is the Gemini 2.5 series, which was considered the best-performing LLM series on the market for a long stretch.

Just a few weeks before the GPT-5 launch, xAI released Grok 4, which, as per Elon Musk, is “better than PhD — no exceptions.”

The GPT-5 Release

Prior to the release, the anticipation was insane. The hype was at its peak, as Sam Altman, through various interviews and podcasts, had been promoting the next GPT release (GPT-5) as being closer to AGI. 

OpenAI defines AGI as “a highly autonomous system that outperforms humans at most economically valuable work”.

Altman even posted this on X before the launch.

Well, OpenAI released GPT-5 on 7th August, 2.5 years after its predecessor. The company defined GPT-5 as “a unified system that can decide how hard to think before giving you an answer.”

By unified, OpenAI means that users no longer have to choose between multiple, confusingly named models such as o3, its state-of-the-art reasoning model, or GPT-4o, its most popular model, which included image generation capabilities and brought the company huge success in recent times.

OpenAI removed other models

GPT-5 features what OpenAI calls a real-time router, which decides whether to respond instantly or switch to its “GPT-5 thinking” mode for more complex problems. 

If you type something like “think hard about this,” it will deliberately trigger that extended reasoning process. Once you hit the usage limits, a smaller “mini” version takes over to handle the rest of your queries. 

The company says these routing decisions are based on live signals, such as when users switch models, how they rate responses, and how they correct answers. The router is continuously trained to improve over time.
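To make the routing idea concrete, here is a toy sketch in Python. It is not OpenAI's router, which is a trained system whose internals are not public. Every name here (RouterState, estimate_complexity, route), the hint phrases, the quota, and the complexity cutoff are made up for illustration. The sketch only mirrors the three behaviours described above: answer quickly by default, switch to thinking mode on a hint or a hard-looking prompt, and fall back to a mini model once the usage limit is hit. The live feedback signals that train the real router are not modelled.

```python
from dataclasses import dataclass

# Hypothetical hint phrases; the real trigger logic is not public.
THINK_HINTS = ("think hard", "think carefully", "step by step")


@dataclass
class RouterState:
    requests_used: int = 0
    usage_limit: int = 3                # made-up quota for the demo
    complexity_threshold: float = 0.5   # made-up cutoff; a real router would learn this


def estimate_complexity(prompt: str) -> float:
    """Crude stand-in for a learned signal: longer prompts score higher (0.0 to 1.0)."""
    return min(len(prompt.split()) / 50, 1.0)


def route(prompt: str, state: RouterState) -> str:
    """Return which (hypothetical) model tier should answer the prompt."""
    # Past the usage limit, everything goes to the smaller model.
    if state.requests_used >= state.usage_limit:
        return "mini"
    state.requests_used += 1

    # An explicit request for deeper reasoning wins over the heuristic.
    if any(hint in prompt.lower() for hint in THINK_HINTS):
        return "thinking"

    # Otherwise decide based on how hard the prompt looks.
    if estimate_complexity(prompt) >= state.complexity_threshold:
        return "thinking"
    return "fast"


if __name__ == "__main__":
    state = RouterState()
    for prompt in (
        "What is the capital of France?",
        "Think hard about this: why is my build failing?",
    ):
        print(prompt, "->", route(prompt, state))
```

In a real system the length heuristic would be replaced by a model trained on signals like the ones OpenAI mentions, but the control flow stays the same.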

GPT-5 Dominated Benchmarks. Did It Though?

Most viewers of the livestream found the benchmarks highly impressive. However, the comparisons were mainly against OpenAI’s previous models, and they showed incremental improvements rather than groundbreaking leaps.

Comparison of GPT-5 with previous models such as o3 and GPT-4o

The chart above is taken from OpenAI’s official blog post for the GPT-5 release. As you can see, GPT-5’s accuracy is a mere 5% higher than o3’s.

That hasn’t stopped Sam Altman from calling GPT-5 “significantly better” than everything that came before. In fact, he went as far as to say, “GPT-5 is the first time that it really feels like talking to an expert in any topic, like a PhD-level expert.”

But the internet is already calling BS. Within hours of launch, social media lit up with nearly 5,000 posts slamming GPT-5 as “horrible,” “a disaster,” and “underwhelming.” For a model pitched as the closest step to AGI yet, many early users say it feels more like a downgrade than a leap forward.

I have been using the model mostly for coding, and I didn’t find it a major upgrade over GPT-4. It still wasn’t able to fix an issue I was facing at work. The responses are more refined, though. Maybe OpenAI will update the model soon and it will get better. Let’s see how that goes!

