Bright Data, the Israeli web scraping company that defeated both Meta and Elon Musk’s X in federal court, unveiled a comprehensive AI infrastructure suite Wednesday designed to give artificial intelligence systems unfettered access to real-time web data, a capability the company argues Big Tech platforms are trying to monopolize.
The announcement of Deep Lookup, Browser.ai, and enhanced data collection protocols represents a dramatic expansion for the decade-old company, which has transformed from a specialized web scraping service into what CEO Or Lenchner calls “a unique infrastructure layer for AI companies.” The move comes as artificial intelligence companies increasingly struggle to access the current web information needed to power chatbots, autonomous agents, and other AI applications.
“The intelligence of today’s LLMs is no longer its limiting factor; access is,” Lenchner stated in an exclusive interview with VentureBeat. “We’ve spent the last decade fighting for open access to public web data, and these new offerings bring us to the next chapter in our journey, one characterized by truly accessible data and the subsequent rise of contextually-aware agents.”
The launch follows Bright Data’s high-profile legal victories in 2024, when federal judges dismissed lawsuits from both Meta and X alleging the company illegally scraped their platforms. These rulings established crucial legal precedent defining what constitutes “public data” on the internet: information that can be viewed without logging in and therefore can be legally collected and used.
The court cases revealed that both Meta and X had been Bright Data customers even while suing the company, highlighting the contradictory stance many tech giants have taken toward web scraping. The rulings have broader implications for the AI industry, which relies heavily on web data to train and operate language models.
“It was revealed in court that both of them were a Bright Data customer, because everyone needs data, everyone, especially those who are building models,” Lenchner explained. “We are the only company that has the financial resources, and I would even say the courage to do that.”
Judge William Alsup, who presided over the X case, wrote that giving social media companies “free rein to decide, on any basis, who can collect and use data” risks creating “information monopolies that would disserve the public interest.” The ruling established that data viewable without login credentials constitutes public information that can be legally scraped.
Bright Data had previously filed a countersuit against X, alleging the platform violated antitrust laws by trying to create a data monopoly to benefit Musk’s AI company, xAI. However, that case has since been settled. “Though the terms [are] confidential, Bright Data has never backed down from its fundamental belief that public data should be available to the public. Consistent with that belief, we are pleased to report that Bright Data will continue to provide the same industry-leading services that it always has and that our customers have come to expect,” Lenchner stated.
Deep Lookup and Browser.ai target AI companies struggling with data access
The company’s new products address what Lenchner identifies as the three core requirements for AI systems: algorithms, compute power, and data access. While Bright Data doesn’t develop AI algorithms or provide computing resources, it aims to become the definitive solution for the third requirement.
Deep Lookup functions as a natural language research engine designed to answer complex, multi-layered business questions in real time. Unlike general-purpose search engines or AI chatbots that provide summaries, Deep Lookup focuses on comprehensive results for queries beginning with “find all.” For example, users can ask for “all shipping companies that went through the Panama and Suez canals in 2023 whose Q3 revenues declined by over 2 percent.”
The system draws from Bright Data’s massive web archive, which currently contains over 200 billion HTML pages and adds 15 billion monthly. By next year, the archive is expected to exceed 500 billion pages. “It’s not just random web pages, it’s actually what the world cares about, because our 20,000 customers represent billions of internet users,” Lenchner noted.
Browser.ai represents what the company calls “the industry’s first unblockable, AI-native browser.” Designed specifically for autonomous AI agents, the cloud-based service mimics human behavior to access websites without triggering bot detection systems. It supports natural language commands and can perform complex web interactions like booking flights or making restaurant reservations.
The browser infrastructure already processes over 150 million web actions daily, according to the company. “Almost all of them are customers,” Lenchner stated of AI agent companies that have raised significant funding. “Because what we figured out, and they figured out, is that we solve that problem of entering a website without being blocked and executing web actions on the website.”
MCP Servers (Model Context Protocol) provides a low-latency control layer enabling AI agents to search, crawl, and extract live data in real time. The protocol allows developers to build AI systems that can act on current information rather than relying solely on training data.
Patent portfolio and proxy network create competitive moat against blocking
Bright Data’s competitive advantage stems from what Lenchner describes as an “obsession” with overcoming website blocking mechanisms. The company holds over 5,500 patent claims on its technology and operates the world’s largest proxy network with more than 150 million IP addresses across 195 countries.
“We have such a good look into the internet,” Lenchner explained. “For a long time now, we have been mapping the internet, and for a long time now, we’re also archiving big chunks of the internet.”
The company’s approach involves sophisticated techniques to mimic human behavior, using real devices, IP addresses, and browser fingerprints rather than simple automated scripts. This makes detection and blocking extremely difficult for websites.
“The only way to block us, practically, is to put the data behind the login, then we won’t even try,” Lenchner stated. “Sometimes there is a new blocking logic that we won’t solve immediately. It will take our research team 12 hours, three days that’s like the most it was, and we will unlock it.”
Revenue surpasses $100 million as AI demand explodes post-ChatGPT
While Bright Data remains privately held by a private equity firm, Lenchner confirmed with VentureBeat that the company’s annual recurring revenue surpassed $100 million several years ago. The business has experienced explosive growth since the launch of ChatGPT in late 2022, as AI companies scrambled to access training data and real-time information.
“Starting March 2023, which is pretty much when GPT-3 changed the world, the AI, or what we call the data for AI, use case just absolutely exploded for us as a company,” Lenchner stated. “Everything else is also growing, because everyone needs more data, period. But this use case is just like nothing we’ve seen before.”
The company serves over 20,000 businesses, including Fortune 500 companies and major AI laboratories. Traditional customers include e-commerce platforms monitoring competitor pricing, financial services firms seeking market intelligence, and enterprises conducting business research.
GDPR compliance and ethical practices differentiate from competitors
Bright Data has invested heavily in compliance infrastructure to address privacy concerns around data collection. The company follows European GDPR and California CCPA regulations, automatically notifying individuals when their personal information is collected from public sources and providing deletion options.
The company maintains a large compliance team and extensive documentation of its practices, which proved valuable during court proceedings. “Enterprises especially love us because we have our ethical stand that was scrutinized in US courts twice,” Lenchner stated.
Web access wars intensify as tech giants seek data monopolies
The battle over web data access reflects broader tensions in the AI industry about information control and competitive advantage. As AI systems become more sophisticated, access to current, comprehensive web data becomes increasingly valuable and increasingly contentious.
Lenchner predicts the web will become “more closed” over time, similar to how Google maintains exclusive access to its web crawling capabilities while others must use alternative services. “A few tech giants are gonna get free access to every website with their agents,” he stated. “The rest will need to use our infrastructure or someone else’s infrastructure.”
The company is also observing new trends, including businesses scraping AI chatbots for marketing purposes and the emergence of new protocols like MCP that enable AI agents to interact with web services more effectively.
“All of these guys that are consuming massive amounts of data, and all of us are using them, it’s all going towards building the brains of the robots,” Lenchner stated. “It’s okay that you have a chatbot that is talking to a human, because that’s eventually what a robot will do.”
Robot brains and agent economy drive next phase of growth
Bright Data’s transformation from web scraping service to AI infrastructure provider reflects the rapidly evolving needs of the artificial intelligence industry. As companies rush to deploy AI agents and autonomous systems, access to real-time web data becomes as crucial as computing power and algorithmic sophistication.
The legal precedents established through Bright Data’s court victories may prove as significant as its technical innovations, potentially shaping how the entire AI industry accesses and uses web information. With major tech platforms increasingly restricting data access while simultaneously developing their own AI systems, independent infrastructure providers like Bright Data may become essential for maintaining competitive balance in the AI ecosystem.
“We’re an infrastructure company,” Lenchner emphasized. “We’re very talented engineers that hardly go anywhere, just sit with our computers and write code. We’re doing it well. We have no intentions to do anything else.”
The Deep Lookup beta launches Tuesday for enterprise customers, with general public access available through a waitlist. Browser.ai and MCP Servers are already available to enterprise clients through Bright Data’s existing platform.