OpenAI unveils GPT-4.5 ‘Orion,’ its largest AI model yet
Don’t get me wrong, I still rely on my senior web developers to craft websites and apps for our customers, but ChatGPT can patch small bugs and glitches. The LLM is capable of writing original lines of code to create webpages, or of scanning existing code to pinpoint errors described in plain language. AI-generated captions aren’t perfect and sometimes lack emotion or humor.
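As a sketch of what that plain-language bug hunting can look like through the chat API: you describe the symptom in English alongside the code and ask the model to locate the fault. The model name, prompts, and buggy snippet below are illustrative assumptions, not a documented OpenAI workflow.

```python
# Hypothetical sketch: describing a bug in plain language and asking a
# GPT model to pinpoint it. Model name and prompts are assumptions.
buggy_snippet = '''
def average(nums):
    return sum(nums) / len(nums)  # fails when nums is empty
'''

messages = [
    {"role": "system",
     "content": "You are a code reviewer. Locate the bug the user "
                "describes and suggest a minimal fix."},
    {"role": "user",
     "content": "This function crashes on an empty list:\n" + buggy_snippet},
]

# The actual request needs an API key, so it is left commented out:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4.5-preview",
#                                        messages=messages)
# print(reply.choices[0].message.content)
```

The point is that the error description lives in ordinary prose, not a stack trace; the model does the mapping from symptom to line of code.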
How ChatGPT actually works (and why it’s been so game-changing)
The industry has held its collective breath for Orion, which some consider a bellwether for the viability of traditional AI training approaches. GPT-4.5 was developed using the same key technique — dramatically increasing the amount of computing power and data during a “pre-training” phase called unsupervised learning — that OpenAI used to develop GPT-4, GPT-3, GPT-2, and GPT-1. DeepSeek became a big competitor to OpenAI in January, with industry leaders finding the DeepSeek-R1 reasoning model to be as capable as OpenAI’s — but more affordable. o4-mini, and its o4-mini-high variant, are great for fast, more straightforward reasoning.
Here’s how I’m going to integrate GPT technology into my life and how I hope you can, too.
GPT-4.5 is OpenAI’s largest model to date, trained using more computing power and data than any of the company’s previous releases. However, many noted that the buzz around GPT-4.5 had little to do with its performance. Instead, people questioned why OpenAI would release a model so expensive that it is almost prohibitive to use, yet not as powerful as its other models. Box CEO Aaron Levie said on X that his company used GPT-4.5 to help extract structured data and metadata from complex enterprise content. The company said the new model “is not a frontier model” but is still its biggest large language model (LLM), with more computational efficiency.
- There’s a lot of AI hype floating around, and it seems every brand wants to cram it into their products.
- It’s also quite limited in its ability to learn from experience, so it may continue to make the same errors, even when they are pointed out to it.
- Scott Swingle, a DeepMind alum and founder of AI-powered developer tools company Abante AI, tested o4 with a Project Euler problem — one of a series of challenging computational problems, with new ones released every week or so.
- As and when it becomes available, you will see a new Deep Search button alongside the text search box.
- The GPT-4.1 model has been a talking point since the developer preview for its advanced coding capabilities and its handling of follow-up instructions.
- GPT-4.1 scored a 38.3% rating — right less than half the time, which isn’t that much better than my dog would manage.
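For context on the kind of puzzle Swingle used: Project Euler problems are short, self-contained computational challenges. The classic first problem — sum every natural number below 1,000 that is a multiple of 3 or 5 — gives a flavor of the genre:

```python
# Project Euler, Problem 1: sum the natural numbers below 1000
# that are multiples of 3 or 5.
def euler_1(limit=1000):
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)

print(euler_1())  # 233168
```

Later problems get much harder, which is why they make a handy informal benchmark for a model’s mathematical reasoning.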
Industry observers, many of whom had early access to the new model, have found GPT-4.5 to be an interesting move from OpenAI, tempering their expectations of what the model should be able to achieve. OpenAI has announced the release of GPT-4.5, which CEO Sam Altman previously said would be the last non-chain-of-thought (CoT) model. Altman described GPT-4.5 in a post on X as “the first model that feels like talking to a thoughtful person.” Given the huge costs of GPT-4.5, though, it is very hard to justify many of the use cases. One of the constant trends we have seen in recent years is the plummeting costs of inference, and if this trend applies to GPT-4.5, it is worth experimenting with it and finding ways to put its power to use in enterprise applications.
Or being able to suggest recipes for a user after analyzing an image of the ingredients they have to hand. GPT-4 is a next-generation language model for the AI chatbot, and though OpenAI isn’t being specific about what changes it’s made to the underlying model, it is keen to highlight how much improved it is over its predecessor. In response to the pre-training hurdles, the industry — including OpenAI — has embraced reasoning models, which take longer than non-reasoning models to perform tasks but tend to be more consistent. By increasing the amount of time and computing power that AI reasoning models use to “think” through problems, AI labs are confident they can significantly improve models’ capabilities. OpenAI did not list one of its top-performing AI reasoning models, deep research, on SimpleQA. An OpenAI spokesperson tells TechCrunch it has not publicly reported deep research’s performance on this benchmark and claimed it’s not a relevant comparison.
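A multimodal request like the ingredient-photo recipe idea above is typically sent as a single chat message that mixes text and image parts. This payload sketch follows the OpenAI chat-message shape; the image URL and prompt text are placeholders, not real endpoints.

```python
# Hypothetical sketch of a text-plus-image chat message in the OpenAI
# chat-message format. The URL and prompt text are placeholders.
message = {
    "role": "user",
    "content": [
        {"type": "text",
         "text": "Suggest a dinner recipe using only these ingredients."},
        {"type": "image_url",
         "image_url": {"url": "https://example.com/my-fridge.jpg"}},
    ],
}

# Sending it would look like this (requires an API key):
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4o",
#                                        messages=[message])
```

The model sees the photo and the instruction together, so it can ground its answer in what is actually in the image.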
Box CEO’s thoughts on GPT-4.5
Indeed, OpenAI says that GPT-4.5’s increased size has given it “a deeper world knowledge” and “higher emotional intelligence.” However, there are signs that the gains from scaling up data and computing are beginning to level off. On several AI benchmarks, GPT-4.5 scores below newer AI “reasoning” models from Chinese AI startup DeepSeek, Anthropic, and OpenAI itself. As a content marketing professional, I’ve been following OpenAI’s GPT series of large language models (LLMs) for a long time. However, it wasn’t until ChatGPT was released last autumn that the general public caught on to the brilliance of this technology that is—pardon the buzzword—truly disruptive.
They’re good at speeding up any quantitative reasoning tasks you encounter during your day. GPT-4.5 also showed improved performance at extracting information from unstructured data. In a test that involved extracting fields from hundreds of legal documents, GPT-4.5 was 19% more accurate than GPT-4o. The bottom line for GPT-4.1 seems to be more of the same, but better. Given that the improved offering now comes baked into all of the ChatGPT pay versions — for those who are contributing to OpenAI’s $415 million monthly revenue stream — better is better. ChatGPT has already shown itself to be a capable programmer, but GPT-4 takes it to a whole new level.
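An accuracy gap like the 19% figure on field extraction is typically measured by scoring each document’s extracted fields against hand-labeled ground truth. A minimal sketch of such exact-match scoring — the field names and values here are invented for illustration:

```python
# Minimal sketch: exact-match accuracy for extracted document fields.
# Field names and values are invented for illustration.
def field_accuracy(extracted: dict, truth: dict) -> float:
    """Fraction of ground-truth fields the extractor got exactly right."""
    correct = sum(1 for key, value in truth.items()
                  if extracted.get(key) == value)
    return correct / len(truth)

truth = {"party": "Acme Corp",
         "effective_date": "2024-01-01",
         "term": "24 months"}
extracted = {"party": "Acme Corp",
             "effective_date": "2024-01-01",
             "term": "12 months"}

print(field_accuracy(extracted, truth))  # 2 of 3 fields correct
```

Averaging this score over hundreds of documents per model is what makes a claim like “19% more accurate” concrete.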
It can write beautifully, is very creative, and is occasionally oddly lazy on complex projects. Feels like Claude 3.7, while Claude 3.7 feels like GPT-4.5. Small models have been gaining traction in the industry for a while now as a faster and more cost-efficient alternative to larger foundation models. OpenAI released its first small model, o3-mini, in January, just weeks after Chinese startup DeepSeek debuted its R1 model, which shocked Silicon Valley — and the markets — with its affordable pricing. However, it also raised copyright questions as critics argued that OpenAI is unfairly profiting off artists’ content.
OpenAI says this variant is “more cost-efficient while preserving high quality.” More importantly, it is available to use for free without any subscription caveat. “Deep Search leverages GPT-4 to find all the possible intents and computes a comprehensive description for each of them,” explains Microsoft. But it’s predictive, context-aware guesswork at best, and even Microsoft acknowledges that Deep Search’s expansion work might falter from time to time. The text-to-image superpowers of Copilot are also being upgraded to the DALL-E 3 engine. We’ve used this one and can confirm that not only are the visuals dramatically improved, but its understanding of prompts is better, too. Another cool trick coming soon to Bing Search is multi-modality, which lets you combine text and image inputs for an improved search experience.