Enterprise companies need to pay attention to OpenAI’s Deep Research. It is a powerful product built on new capabilities, and it is good enough that it could put a lot of people out of jobs.
Deep Research is on the bleeding edge of a growing trend: integrating large language models (LLMs) with search engines and other tools to greatly expand their capabilities. (Just as this article was being reported, for example, Elon Musk’s xAI unveiled Grok 3, which claims similar capabilities, including a Deep Search product. However, it’s too early to assess Grok 3’s real-world performance, since most subscribers haven’t actually gotten their hands on it yet.)
OpenAI’s Deep Research, released on February 3, requires a Pro account with OpenAI, costing $200 per month, and is currently available only to U.S. users. So far, this restriction may have limited early feedback from the global developer community, which is typically quick to dissect new AI advancements.
With Deep Research mode, users can ask OpenAI’s leading o3 model any question. The result? A report often superior to what human analysts produce, delivered faster and at a fraction of the cost.
How Deep Research works
While Deep Research has been widely discussed, its broader implications have yet to fully register. Initial reactions praised its impressive research capabilities, despite occasional hallucinations in its citations. One man said he used it to help his wife, who had breast cancer. It provided deeper analysis than her oncologists did on why radiation therapy was the right course of action, he said. The consensus, summarized by Wharton AI professor Ethan Mollick, is that its advantages far outweigh occasional inaccuracies, as fact-checking takes less time than the AI saves overall. This is something I agree with, based on my own usage.
Financial institutions are already exploring applications. BNY Mellon, for instance, sees potential in using Deep Research for credit risk assessments. Its impact will extend across industries, from healthcare to retail, manufacturing, and supply chain management: virtually any field that relies on knowledge work.
A smarter research agent
Unlike traditional AI models that attempt one-shot answers, Deep Research first asks clarifying questions. It might ask four or more questions to make sure it understands exactly what you want. It then develops a structured research plan, conducts multiple searches, revises its plan based on new insights, and iterates in a loop until it compiles a comprehensive, well-formatted report. This can take between a few minutes and half an hour. Reports range from 1,500 to 20,000 words, and typically include citations from 15 to 30 sources with exact URLs, at least according to my usage over the past week and a half.
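The loop described above (clarify, plan, search, revise, repeat, then report) can be sketched in a few lines of Python. This is only an illustration of the general pattern, not OpenAI’s implementation: every name here (`run_research`, `ResearchState`, the `demo_*` helpers and the example.com URLs) is hypothetical, and the stub functions stand in for what would be model calls and live web searches in a real agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Finding = Tuple[str, str]  # (source URL, snippet)


@dataclass
class ResearchState:
    question: str
    plan: List[str]  # search queries still to run
    findings: List[Finding] = field(default_factory=list)
    done_queries: List[str] = field(default_factory=list)


def run_research(
    question: str,
    clarify: Callable[[str], str],
    make_plan: Callable[[str], List[str]],
    search: Callable[[str], List[Finding]],
    revise: Callable[[ResearchState], List[str]],
    max_steps: int = 10,
) -> str:
    """Clarify the question, then search and revise the plan until it is empty."""
    refined = clarify(question)
    state = ResearchState(question=refined, plan=make_plan(refined))
    steps = 0
    while state.plan and steps < max_steps:
        query = state.plan.pop(0)
        state.findings.extend(search(query))
        state.done_queries.append(query)
        state.plan = revise(state)  # the plan may grow or shrink after each search
        steps += 1
    # Compile a simple cited report: one numbered line per finding.
    lines = [f"Report: {state.question}"]
    for i, (url, snippet) in enumerate(state.findings, start=1):
        lines.append(f"[{i}] {snippet} ({url})")
    return "\n".join(lines)


# Hypothetical stubs; a real agent would call an LLM and a search API here.
def demo_clarify(q: str) -> str:
    return q + " (scope: last 12 months)"


def demo_plan(q: str) -> List[str]:
    return ["market size", "key vendors"]


def demo_search(query: str) -> List[Finding]:
    slug = query.replace(" ", "-")
    return [(f"https://example.com/{slug}", f"Summary for {query}")]


def demo_revise(state: ResearchState) -> List[str]:
    plan = list(state.plan)
    seen = state.done_queries + plan
    # Toy rule: learning the market size prompts a follow-up on growth.
    if "market size" in state.done_queries and "growth rate" not in seen:
        plan.append("growth rate")
    return plan


report = run_research(
    "How big is the widget market?",
    demo_clarify, demo_plan, demo_search, demo_revise,
)
print(report)
```

The `max_steps` cap is a design choice for the sketch: because the plan can grow after every search, an iterative agent needs some budget to guarantee it eventually stops and writes the report.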
The technology behind Deep Research: reasoning LLMs and agentic RAG
Deep Research does this by merging two technologies in a way we haven’t seen before in a mass-market product.
Reasoning LLMs: The first is OpenAI’s cutting-edge model, o3, which leads in logical reasoning and extended chain-of-thought processes. When it was announced in December 2024, o3 scored an unprecedented 87.5% on the super-difficult ARC-AGI benchmark designed to test novel problem-solving abilities. What’s interesting is that o3 hasn’t been released as a standalone model for developers to use. Indeed, OpenAI’s CEO Sam Altman announced last week that the model would instead be wrapped into a “unified intelligence” system, which would unite models with agentic tools like search, coding agents and more. Deep Research is an example of such a product. And while competitors like DeepSeek-R1 have approached o3’s capabilities (one of the reasons why there was so much excitement a few weeks ago), OpenAI is still widely considered to be slightly ahead.
Agentic RAG: The second, agentic RAG (retrieval-augmented generation), is a technology that has been around for about a year now. It uses agents to autonomously seek out information and context from other sources, including searching the internet. This can include other tool-calling agents that find non-web information via APIs; coding agents that can complete complex sequences more efficiently; and database searches. Initially, OpenAI’s Deep Research is primarily searching the open web, but company leaders have suggested it will be able to search more sources over time.
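To make the agentic RAG idea concrete, here is a minimal sketch of the retrieval side: a router picks which tools to call, and their results are assembled into the context handed to the model. All names below (`TOOLS`, `choose_tools`, `build_prompt` and the three stand-in tools) are hypothetical, and the keyword-based router is an assumption made for brevity; in an agentic system the model itself would typically decide which tools to use.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], List[str]]


# Stand-in tools; real ones would hit a search engine, a database and an API.
def web_search(query: str) -> List[str]:
    return [f"web result for '{query}'"]


def internal_db(query: str) -> List[str]:
    return [f"db row matching '{query}'"]


def pricing_api(query: str) -> List[str]:
    return [f"api response for '{query}'"]


TOOLS: Dict[str, Tool] = {"web": web_search, "db": internal_db, "api": pricing_api}


def choose_tools(question: str) -> List[str]:
    """Toy keyword router (assumption): always search the web, add others on demand."""
    q = question.lower()
    chosen = ["web"]
    if "customer" in q or "internal" in q:
        chosen.append("db")
    if "price" in q:
        chosen.append("api")
    return chosen


def build_context(question: str) -> str:
    """Call each chosen tool and label every passage with the tool it came from."""
    passages = []
    for name in choose_tools(question):
        for passage in TOOLS[name](question):
            passages.append(f"[{name}] {passage}")
    return "\n".join(passages)


def build_prompt(question: str) -> str:
    """Assemble the retrieved context and the question into one model prompt."""
    return f"Context:\n{build_context(question)}\n\nQuestion: {question}"


print(build_prompt("What price do internal customers pay?"))
```

Labeling each passage with its tool name keeps the provenance visible, which is what later makes source citations possible.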
OpenAI’s competitive edge (and its limits)
While these technologies aren’t entirely new, OpenAI’s refinements (enabled by things like its head start on these technologies, massive funding, and its closed-source development model) have taken Deep Research to a new level. It can work behind closed doors, and leverage feedback from the more than 300 million active users of OpenAI’s popular ChatGPT product. OpenAI has led in research in these areas, for example in how to do verification step by step to get better results. And it has clearly implemented search in an interesting way, perhaps borrowing from Microsoft’s Bing and other technologies.
While it’s still hallucinating some results from its searches, it’s doing so less than competitors, perhaps in part because the underlying o3 model itself has set an industry low for these hallucinations at 8%. And there are ways to reduce errors still further, by using mechanisms like confidence thresholds, citation requirements and other sophisticated credibility checks.
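As a rough illustration of what a confidence threshold combined with a citation requirement might look like, the sketch below keeps only claims that clear a minimum score and carry a source URL, and flags the rest for human review. The `Claim` structure, the 0.7 default and the idea that each claim arrives with a numeric confidence score are all assumptions made for the example; nothing here reflects how OpenAI actually scores or filters output.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Claim:
    text: str
    confidence: float  # hypothetical score in [0, 1]
    source_url: Optional[str]  # None when the claim has no citation


def filter_claims(
    claims: List[Claim],
    min_confidence: float = 0.7,
    require_citation: bool = True,
) -> Tuple[List[Claim], List[Claim]]:
    """Split claims into (kept, flagged) using a threshold and a citation rule."""
    kept: List[Claim] = []
    flagged: List[Claim] = []
    for claim in claims:
        # A citation counts only if it looks like a web URL.
        has_citation = bool(claim.source_url) and claim.source_url.startswith(
            ("http://", "https://")
        )
        confident = claim.confidence >= min_confidence
        if confident and (has_citation or not require_citation):
            kept.append(claim)
        else:
            flagged.append(claim)  # route to a human fact-checker
    return kept, flagged


claims = [
    Claim("Revenue grew last year.", 0.92, "https://example.com/annual-report"),
    Claim("The CEO plans to resign.", 0.95, None),
    Claim("Margins are shrinking.", 0.40, "https://example.com/analysis"),
]
kept, flagged = filter_claims(claims)
print([c.text for c in kept])
print([c.text for c in flagged])
```

Flagging rather than silently dropping weak claims matches the workflow described earlier, where a person fact-checks the report after the AI has done the bulk of the work.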
At the same time, there are limits to OpenAI’s lead and capabilities. Within two days of Deep Research’s launch, HuggingFace released an open-source AI research agent called Open Deep Research that got results not too far off OpenAI’s, similarly merging leading models and freely available agentic capabilities. There are few moats. Open-source competitors like DeepSeek appear set to stay close in the area of reasoning models, and Microsoft’s Magentic-One offers a framework for most of OpenAI’s agentic capabilities, to name just two more examples.
Furthermore, Deep Research has limitations. The product is really efficient at researching obscure information that can be found on the web. But in areas where there’s not much online and where domain expertise is largely private (whether in people’s heads or in private databases) it doesn’t work at all. So this isn’t going to threaten the jobs of high-end hedge-fund researchers, for example, who are paid to go talk with real experts in an industry to find out otherwise very hard-to-obtain information, as Ben Thompson argued in a recent post (see graphic below). In general, OpenAI’s Deep Research is going to affect lower-skilled analyst jobs.
Deep Research’s value first increases as information online gets scarce, then drops off when it gets really scarce. Source: Stratechery.
The most intelligent product yet
When you merge top-tier reasoning with agentic retrieval, it’s not really surprising that you get such a powerful product. OpenAI’s Deep Research achieved 26.6% on Humanity’s Last Exam, arguably the best benchmark for intelligence. This is a relatively new AI benchmark designed to be the most difficult for any AI model to complete, covering 3,000 questions across 100 different subjects. On this benchmark, OpenAI’s Deep Research significantly outperforms Perplexity’s Deep Research (20.5%) and earlier models like o3-mini (13%) and DeepSeek-R1 (9.4%) that weren’t hooked up with agentic RAG. Google’s Deep Research has yet to be tested against this benchmark, but early reviews suggest OpenAI leads in both quality and depth.
How it’s different: the first mass-market AI that could displace jobs
What’s different with this product is its potential to eliminate jobs. Sam Witteveen, cofounder of Red Dragon and a developer of AI agents, observed in a deep-dive video discussion with me that a lot of people are going to say: "Holy crap, I can get these reports for $200 that I could get from some top-4 consulting company that would cost me $20,000." This, he said, is going to cause some real changes, including likely putting people out of jobs.
Which brings me back to my interview last week with Sarthak Pattanaik, head of engineering and AI at BNY Mellon, a major U.S. bank.
To be sure, Pattanaik didn’t say anything about the product’s ramifications for actual job counts at his bank. That’s going to be a particularly sensitive topic that any enterprise is probably going to shy away from addressing publicly. But he said he could see OpenAI’s Deep Research being used for credit underwriting reports and other “topline” activities, and having significant impact on a variety of jobs: "Now that doesn’t impact every job, but that does impact a set of jobs around strategy [and] research, like comparison vendor management, comparison of product A versus product B." He added: "So I think everything which is more on system two thinking, more exploratory, where it may not have a right answer, because the right answer can be mounted once you have that scenario definition, I think that’s an opportunity."
A historical perspective: job loss and job creation
Technological revolutions have historically displaced workers in the short term while creating new industries in the long run. From automobiles replacing horse-drawn carriages to computers automating clerical work, job markets evolve. New opportunities created by disruptive technologies tend to spawn new hiring. Companies that fail to embrace these advances will fall behind their competitors.
OpenAI’s Altman acknowledged the link, even if indirect, between Deep Research and labor. At the AI Summit in Paris last week, he was asked about his vision for artificial general intelligence (AGI), or the stage at which AI can perform pretty much any task that a human can. As he answered, his first reference was to Deep Research: "It’s a model I think is capable of doing like a low-single-digit percentage of all the tasks in the economy in the world right now, which is a crazy statement, and a year ago I don’t think something that people thought is going to be coming." (See minute three of this video). He continued: "For 50 cents of compute, you can do like $500 or $5,000 of work. Companies are implementing that to just be way more efficient."
The takeaway: a new era for knowledge work
Deep Research represents a watershed moment for AI in knowledge-based industries. By integrating cutting-edge reasoning with autonomous research capabilities, OpenAI has created a tool that’s smarter, faster and significantly more cost-effective than human analysts.
The implications are vast, from financial services to healthcare to enterprise decision-making. Organizations that leverage this technology effectively will gain a significant competitive edge. Those that ignore it do so at their peril.
For a deeper discussion on how OpenAI’s Deep Research works, and how it’s reshaping knowledge work, check out my in-depth conversation with Sam Witteveen in our latest video: