LinkedIn is launching its new AI-powered people search this week, after what looks like a remarkably long wait for what should have been a natural offering for generative AI.
It comes a full three years after the launch of ChatGPT and six months after LinkedIn launched its AI job search offering. For technical leaders, the timeline illustrates a key enterprise lesson: Deploying generative AI in real enterprise settings is hard, especially at a scale of 1.3 billion users. It is a slow, brutal process of pragmatic optimization.
The following account is based on several exclusive interviews with the LinkedIn product and engineering team behind the launch.
First, here's how the product works: A user can now type a natural-language query like, "Who is knowledgeable about curing cancer?" into LinkedIn's search bar.
LinkedIn's old keyword-based search would have been stumped. It would have looked only for references to "cancer." A user who wanted to get sophisticated would have had to run separate, rigid keyword searches for "cancer" and then "oncology" and manually try to piece the results together.
The new AI-powered system, however, understands the intent of the search because the LLM under the hood grasps semantic meaning. It recognizes, for example, that "cancer" is conceptually related to "oncology" and, even less directly, to "genomics research." As a result, it surfaces a far more relevant list of people, including oncology leaders and researchers, even if their profiles don't use the exact word "cancer."
The system also balances this relevance with usefulness. Instead of simply showing the world's top oncologist (who might be an unreachable third-degree connection), it also weighs who in your immediate network, such as a first-degree connection, is "pretty relevant" and can serve as a crucial bridge to that expert.
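To make that trade-off concrete, here is a minimal sketch, assuming cosine similarity over profile embeddings and hypothetical degree-decay weights; none of this is LinkedIn's published code:

```python
# Hypothetical sketch: blend semantic relevance with network proximity so a
# "pretty relevant" first-degree connection can outrank a more relevant but
# unreachable third-degree expert. Weights and embeddings are illustrative.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

DEGREE_WEIGHT = {1: 1.0, 2: 0.75, 3: 0.4}  # assumed decay by connection degree

def score(query_emb: np.ndarray, profile_emb: np.ndarray, degree: int) -> float:
    relevance = cosine(query_emb, profile_emb)  # semantic match, not keywords
    return relevance * DEGREE_WEIGHT.get(degree, 0.25)

# Toy data: an "oncology" profile sits near a "cancer" query in embedding
# space even though the literal keyword never appears on the profile.
rng = np.random.default_rng(0)
query = rng.normal(size=64)
oncologist_3rd_degree = query + rng.normal(scale=0.1, size=64)  # very relevant, distant
researcher_1st_degree = query + rng.normal(scale=0.5, size=64)  # relevant, reachable
print(score(query, oncologist_3rd_degree, degree=3))
print(score(query, researcher_1st_degree, degree=1))
```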
See the video below for an example.
Arguably, though, the more important lesson for enterprise practitioners is the "cookbook" LinkedIn has developed: a replicable, multi-stage pipeline of distillation, co-design, and relentless optimization. LinkedIn had to perfect it on one product before attempting it on another.
"Don't try to do too much all at once," writes Wenjing Zhang, LinkedIn's VP of Engineering, in a post about the product launch. (Zhang also spoke with VentureBeat in an interview last week.) She notes that an earlier "sprawling ambition" to build a unified system for all of LinkedIn's products "stalled progress."
Instead, LinkedIn focused on winning one vertical first. The success of its previously launched AI Job Search, which made job seekers without a four-year degree 10% more likely to be hired, according to VP of Product Engineering Erran Berger, provided the blueprint.
Now, the company is applying that blueprint to a far larger challenge. "It's one thing to be able to do this across tens of millions of jobs," Berger told VentureBeat. "It's another thing to do this across north of a billion members."
For enterprise AI builders, LinkedIn's journey provides a technical playbook for what it actually takes to move from a successful pilot to a billion-user-scale product.
The new challenge: a 1.3 billion-member graph
The job search product created a robust recipe that the new people search product could build upon, Berger explained.
The recipe started with a "golden data set" of just a few hundred to a thousand real query-profile pairs, meticulously scored against a detailed 20- to 30-page "product policy" document. To scale this for training, LinkedIn used the small golden set to prompt a large foundation model to generate a huge volume of synthetic training data. That synthetic data was used to train a 7-billion-parameter "Product Policy" model: a high-fidelity judge of relevance that was too slow for live production but perfect for teaching smaller models.
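As a rough illustration of that scaling step, here is a hedged sketch of golden-set-prompted synthetic data generation; the prompt, JSON schema, and `llm_call` hook are assumptions, not LinkedIn's actual pipeline:

```python
# Hypothetical sketch: use a small, human-scored "golden set" to prompt a
# large foundation model into generating synthetic query-profile-score
# triples, which then train the 7B "Product Policy" judge offline.
import json
import random

GOLDEN_SET = [  # in reality: a few hundred to a thousand scored pairs
    {"query": "expert in curing cancer",
     "profile": "Director of Clinical Oncology Research at ...",
     "score": 0.95},
    # ...
]

PROMPT_TEMPLATE = (
    "You are a relevance judge following the attached product policy.\n"
    "Scored examples:\n{examples}\n"
    "Generate one new (query, profile, score) triple as JSON."
)

def build_prompt(k: int = 8) -> str:
    sample = random.sample(GOLDEN_SET, min(k, len(GOLDEN_SET)))
    return PROMPT_TEMPLATE.format(examples="\n".join(json.dumps(ex) for ex in sample))

def generate_synthetic(llm_call, n: int) -> list[dict]:
    """llm_call: any function mapping a prompt string to JSON text from a
    large foundation model. The output becomes training data for the judge."""
    return [json.loads(llm_call(build_prompt())) for _ in range(n)]
```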
However, the team hit a wall early on. For six to nine months, they struggled to train a single model that could balance strict policy adherence (relevance) against user engagement signals. The "aha moment" came when they realized they needed to break the problem down. They distilled the 7B policy model into a 1.7B teacher model focused solely on relevance. They then paired it with separate teacher models trained to predict specific member actions, such as job applications for the jobs product, or connecting and following for people search. This "multi-teacher" ensemble produced soft probability scores that the final student model learned to mimic via a KL divergence loss.
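A minimal sketch of what such a multi-teacher distillation objective can look like in PyTorch, assuming the teachers and student emit comparable logits; the mixing weights and temperature are illustrative, since LinkedIn has not published them:

```python
# Hypothetical multi-teacher distillation step: blend the teachers' soft
# probability scores into one target, then train the student to mimic it
# with a KL-divergence loss.
import torch
import torch.nn.functional as F

def distill_step(student_logits: torch.Tensor,
                 teacher_logits: list[torch.Tensor],
                 weights: list[float],
                 temperature: float = 2.0) -> torch.Tensor:
    # Weighted mixture of the teachers' softened distributions.
    target = sum(w * F.softmax(t / temperature, dim=-1)
                 for w, t in zip(weights, teacher_logits))
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(target || student): the student learns to mimic the ensemble.
    return F.kl_div(log_probs, target, reduction="batchmean") * temperature ** 2

# Toy usage: one relevance teacher (the 1.7B policy distillate) plus one
# engagement teacher (e.g., predicting connect/follow actions).
student_logits = torch.randn(4, 2, requires_grad=True)
relevance_teacher, engagement_teacher = torch.randn(4, 2), torch.randn(4, 2)
loss = distill_step(student_logits, [relevance_teacher, engagement_teacher], [0.7, 0.3])
loss.backward()
```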
The resulting architecture operates as a two-stage pipeline. First, a larger 8B-parameter model handles broad retrieval, casting a wide net to pull candidates from the graph. Then, the highly distilled student model takes over for fine-grained ranking. While the job search product successfully deployed a 0.6B (600-million) parameter student, the new people search product required even more aggressive compression. As Zhang notes, the team pruned the new student model from 440M down to just 220M parameters, achieving the speed needed for 1.3 billion users with less than 1% relevance loss.
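In outline, that two-stage design reduces to a retrieve-then-rank loop like the sketch below, where `retrieve` stands in for the 8B model plus its index and `rank` for the 220M student; the interfaces are assumptions for illustration:

```python
# Hypothetical retrieve-then-rank skeleton mirroring the two-stage design:
# a big model casts a wide net for recall, a small distilled student
# re-scores the survivors for precision at production speed.
from typing import Callable

def search(query: str,
           retrieve: Callable[[str, int], list[dict]],  # ~8B retriever + index
           rank: Callable[[str, dict], float],          # ~220M distilled student
           k_retrieve: int = 1000,
           k_final: int = 10) -> list[dict]:
    candidates = retrieve(query, k_retrieve)  # broad, recall-oriented pass
    candidates.sort(key=lambda p: rank(query, p), reverse=True)  # fine-grained ranking
    return candidates[:k_final]
```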
But applying this to people search broke the old architecture. The new problem involved not just ranking but retrieval, too.
"A billion records," Berger said, is a "different beast."
The team’s prior retrieval stack was built on CPUs. To handle the new scale and the latency demands of a "snappy" search experience, the team had to move its indexing to GPU-based infrastructure. This was a foundational architectural shift that the job search product did not require.
Organizationally, LinkedIn benefited from multiple approaches. For a time, LinkedIn had two separate teams — job search and people search — attempting to solve the problem in parallel. But once the job search team achieved its breakthrough using the policy-driven distillation method, Berger and his leadership team intervened. They brought over the architects of the job search win — product lead Rohan Rajiv and engineering lead Wenjing Zhang — to transplant their 'cookbook' directly to the new domain.
Distilling for a 10x throughput gain
With the retrieval problem solved, the team faced the ranking and efficiency challenge. This is where the cookbook was adapted with new, aggressive optimization techniques.
Zhang’s technical post (I’ll insert the link once it goes live) provides the specific details our audience of AI engineers will appreciate. One of the more significant optimizations was input size.
To feed the model, the team trained another LLM with reinforcement learning (RL) for a single purpose: to summarize the input context. This "summarizer" model was able to reduce the model's input size by 20-fold with minimal information loss.
The combined result of the 220M-parameter model and the 20x input reduction? A 10x increase in ranking throughput, allowing the team to serve the model efficiently to its massive user base.
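Put together, the flow looks roughly like this sketch: compress the context first, then rank. The function names and token budget are illustrative, and LinkedIn's summarizer is an RL-trained model rather than the generic hook shown here:

```python
# Hypothetical summarize-then-rank step. A much smaller input (roughly 20x
# fewer tokens) plus a smaller model (440M pruned to 220M) is where the
# ~10x ranking-throughput gain comes from.
def rank_with_compression(query: str, profile_text: str,
                          summarize, ranker, budget_tokens: int = 128) -> float:
    context = summarize(profile_text, max_tokens=budget_tokens)  # ~20x smaller input
    return ranker(query, context)  # distilled 220M student scores the pair
```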
Pragmatism over hype: building tools, not agents
Throughout our discussions, Berger was adamant about something else that might catch people's attention: The real value for enterprises today lies in perfecting recommender systems, not in chasing "agentic hype." He also declined to name the specific models the company used for the searches, suggesting it almost doesn't matter. The company selects models based on which one it finds the most efficient for the task.
The new AI-powered people search is a manifestation of Berger's philosophy that it's best to optimize the recommender system first. The architecture includes a new "intelligent query routing layer," as Berger explained, that is itself LLM-powered. This router pragmatically decides whether a user's query, like "trust expert," should go to the new semantic, natural-language stack or to the old, reliable lexical search.
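A hedged sketch of what such a routing layer can look like; the prompt, labels, and `llm_call` hook are assumptions rather than LinkedIn's implementation:

```python
# Hypothetical LLM-powered query router: natural-language intents go to the
# new semantic stack, exact-match lookups go to the legacy lexical index.
ROUTER_PROMPT = (
    "Classify this people-search query as SEMANTIC (natural-language intent, "
    "e.g. 'who knows about curing cancer') or LEXICAL (name or keyword "
    "lookup).\nQuery: {query}\nAnswer with one word."
)

def route(query: str, llm_call, semantic_search, lexical_search):
    label = llm_call(ROUTER_PROMPT.format(query=query)).strip().upper()
    return semantic_search(query) if label.startswith("SEMANTIC") else lexical_search(query)
```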
This entire, complex system is designed to be a "tool" that a future agent will use, not the agent itself.
"Agentic products are only as good as the tools that they use to accomplish tasks for people," Berger said. "You can have the world's best reasoning model, and if you're trying to use an agent to do people search but the people search engine is not very good, you're not going to be able to deliver."
Now that people search is available, Berger suggested, the company will one day offer agents that use it, though he didn't provide details on timing. He also said the recipe used for job and people search will be spread across the company's other products.
For enterprises building their own AI roadmaps, LinkedIn's playbook is clear:
Be pragmatic: Don't try to boil the ocean. Win one vertical, even if it takes 18 months.
Codify the "cookbook": Turn that win into a repeatable process (policy docs, distillation pipelines, co-design).
Optimize relentlessly: The real 10x gains come after the initial model, in pruning, distillation, and creative optimizations like an RL-trained summarizer.
LinkedIn's journey shows that for real-world enterprise AI, emphasis on specific models or flashy agentic systems should take a back seat. The durable, strategic advantage comes from mastering the pipeline: the 'AI-native' cookbook of co-design, distillation, and ruthless optimization.
(Editor's note: We will be publishing a full-length podcast with LinkedIn's Erran Berger, diving deeper into these technical details, on the VentureBeat podcast feed soon.)

