Meta's new world model lets robots manipulate objects in environments they've never encountered before
Technology

Last updated: June 12, 2025 11:37 pm
Editorial Board | Published June 12, 2025

While large language models (LLMs) have mastered text (and other modalities to some extent), they lack the physical "common sense" to operate in dynamic, real-world environments. This has limited the deployment of AI in areas like manufacturing and logistics, where understanding cause and effect is critical.

Meta's latest model, V-JEPA 2, takes a step toward bridging this gap by learning a world model from video and physical interactions.

V-JEPA 2 can help create AI applications that require predicting outcomes and planning actions in unpredictable environments with many edge cases. This approach can provide a clear path toward more capable robots and advanced automation in physical environments.

How a ‘world model’ learns to plan

Humans develop physical intuition early in life by observing their surroundings. If you see a ball thrown, you instinctively know its trajectory and can predict where it will land. V-JEPA 2 learns a similar "world model," which is an AI system's internal simulation of how the physical world operates.

The model is built on three core capabilities that are essential for enterprise applications: understanding what is happening in a scene, predicting how the scene will change based on an action, and planning a sequence of actions to achieve a specific goal. As Meta states in its blog, its "long-term vision is that world models will enable AI agents to plan and reason in the physical world."

The model's architecture, called the Video Joint Embedding Predictive Architecture (V-JEPA), consists of two key components. An "encoder" watches a video clip and condenses it into a compact numerical summary, called an embedding. This embedding captures the essential information about the objects and their relationships in the scene. A second component, the "predictor," then takes this summary and imagines how the scene will evolve, producing a prediction of what the next summary will look like.
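
To make the encoder/predictor split concrete, here is a minimal PyTorch sketch. The class names, layer sizes, and the use of pre-extracted frame features are illustrative assumptions, not Meta's released V-JEPA 2 implementation.

```python
# Minimal sketch of the encoder/predictor split described above.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class VideoEncoder(nn.Module):
    """Condenses a clip into a compact numerical summary (embedding)."""
    def __init__(self, frame_dim=1024, embed_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(frame_dim, 512), nn.GELU(), nn.Linear(512, embed_dim)
        )

    def forward(self, frames):            # frames: (batch, time, frame_dim)
        per_frame = self.proj(frames)     # embed each frame independently
        return per_frame.mean(dim=1)      # pool over time -> (batch, embed_dim)

class Predictor(nn.Module):
    """Imagines how the scene evolves: current summary -> predicted next summary."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 512), nn.GELU(), nn.Linear(512, embed_dim)
        )

    def forward(self, embedding):
        return self.net(embedding)

encoder, predictor = VideoEncoder(), Predictor()
clip = torch.randn(2, 16, 1024)            # 2 clips of 16 frame features each
z_now = encoder(clip)                      # compact summary of the current clip
z_next_pred = predictor(z_now)             # prediction of the next clip's summary
```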

V-JEPA consists of an encoder and a predictor (source: Meta blog)

This architecture is the latest evolution of the JEPA framework, which was first applied to images with I-JEPA and now advances to video, demonstrating a consistent approach to building world models.

Unlike generative AI models that try to predict the exact color of every pixel in a future frame (a computationally intensive task), V-JEPA 2 operates in an abstract space. It focuses on predicting the high-level features of a scene, such as an object's position and trajectory, rather than its texture or background details, making it far more efficient than other, larger models at just 1.2 billion parameters.

That translates to lower compute costs and makes it more suitable for deployment in real-world settings.
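
To see why the abstract target matters, the toy comparison below contrasts the number of values a generative model must predict for a single frame with the size of a clip-level embedding. The shapes are assumptions chosen purely for illustration.

```python
# Toy comparison of prediction targets: every pixel of a frame vs. a small
# abstract summary. Shapes are assumed for illustration only.
import torch
import torch.nn.functional as F

next_frame = torch.randn(3, 224, 224)          # pixel target: ~150,000 values
next_embedding = torch.randn(256)              # embedding target: 256 values

pixel_loss = F.mse_loss(torch.randn(3, 224, 224), next_frame)
embed_loss = F.mse_loss(torch.randn(256), next_embedding)

print("pixel target size:", next_frame.numel())          # 150528
print("embedding target size:", next_embedding.numel())  # 256
```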

Learning from observation and action

V-JEPA 2 is trained in two stages. First, it builds its foundational understanding of physics through self-supervised learning, watching over one million hours of unlabeled internet videos. By simply observing how objects move and interact, it develops a general-purpose world model without any human guidance.
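
As a rough sketch of what a self-supervised objective of this kind can look like, the snippet below reuses the VideoEncoder and Predictor classes from the earlier sketch and trains them to map the summary of one clip to the summary of the clip that follows it. This is a simplified stand-in objective, not Meta's actual pre-training recipe.

```python
# Simplified stand-in for the self-supervised stage: predict the next clip's
# summary from the current one. Assumes the VideoEncoder and Predictor classes
# from the earlier sketch; this is not Meta's actual recipe.
import torch
import torch.nn.functional as F

encoder, predictor = VideoEncoder(), Predictor()
params = list(encoder.parameters()) + list(predictor.parameters())
optimizer = torch.optim.AdamW(params, lr=3e-4)

clip_t = torch.randn(8, 16, 1024)       # a batch of clips (stand-in frame features)
clip_next = torch.randn(8, 16, 1024)    # the clips that immediately follow them

z_pred = predictor(encoder(clip_t))
z_target = encoder(clip_next).detach()  # stop-gradient on the target summary
loss = F.mse_loss(z_pred, z_target)
loss.backward()
optimizer.step()
```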

In the second stage, this pre-trained model is fine-tuned on a small, specialized dataset. By processing just 62 hours of video showing a robot performing tasks, along with the corresponding control commands, V-JEPA 2 learns to connect specific actions to their physical outcomes. This results in a model that can plan and control actions in the real world.
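
A minimal sketch of that second stage might look like the following: the predictor becomes action-conditioned, so a (current summary, control command) pair maps to a predicted next summary. The 7-dimensional action and all sizes are assumptions for illustration, not Meta's released fine-tuning code.

```python
# Hedged sketch of action-conditioned fine-tuning. Dimensions and names are
# assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActionConditionedPredictor(nn.Module):
    def __init__(self, embed_dim=256, action_dim=7):   # e.g. a 7-DoF arm command
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + action_dim, 512), nn.GELU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, z, action):
        return self.net(torch.cat([z, action], dim=-1))

predictor = ActionConditionedPredictor()
optimizer = torch.optim.AdamW(predictor.parameters(), lr=1e-4)

# One fine-tuning step on (current summary, action, next summary) triples that a
# frozen, pre-trained encoder would produce from the robot video and commands.
z_t, action, z_next = torch.randn(8, 256), torch.randn(8, 7), torch.randn(8, 256)
loss = F.mse_loss(predictor(z_t, action), z_next)
loss.backward()
optimizer.step()
```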

V-JEPA two-stage training pipeline (source: Meta)

This two-stage training enables a critical capability for real-world automation: zero-shot robot planning. A robot powered by V-JEPA 2 can be deployed in a new environment and successfully manipulate objects it has never encountered before, without needing to be retrained for that specific setting.

This is a significant advance over previous models that required training data from the specific robot and environment where they would operate. The model was trained on an open-source dataset and then successfully deployed on different robots in Meta's labs.

For example, to complete a task like picking up an object, the robot is given a goal image of the desired outcome. It then uses the V-JEPA 2 predictor to internally simulate a range of possible next moves. It scores each imagined action based on how close it gets to the goal, executes the top-rated action, and repeats the process until the task is complete.
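
The sketch below illustrates one step of this goal-image planning loop, under the same assumed encoder and action-conditioned predictor as the earlier snippets: sample candidate actions, roll each through the predictor, score by distance to the goal embedding, and return the top-rated action. The helpers robot.camera(), robot.execute(), and task_done() are hypothetical placeholders.

```python
# Sketch of one planning step: simulate candidate actions with the predictor
# and pick the one whose imagined outcome lands closest to the goal embedding.
import torch

def plan_step(encoder, predictor, current_frames, goal_frames,
              num_candidates=128, action_dim=7):
    with torch.no_grad():
        z_now = encoder(current_frames.unsqueeze(0))           # (1, embed_dim)
        z_goal = encoder(goal_frames.unsqueeze(0))             # (1, embed_dim)
        candidates = torch.randn(num_candidates, action_dim)   # sampled candidate actions
        z_pred = predictor(z_now.expand(num_candidates, -1), candidates)
        scores = -torch.norm(z_pred - z_goal, dim=-1)          # closer to the goal is better
        return candidates[scores.argmax()]                     # the top-rated action

# Closed-loop control: execute the best action, observe, and re-plan until done.
# while not task_done(robot):
#     action = plan_step(encoder, predictor, robot.camera(), goal_image)
#     robot.execute(action)
```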

Using this method, the model achieved success rates between 65% and 80% on pick-and-place tasks with unfamiliar objects in new settings.

Real-world impact of physical reasoning

This ability to plan and act in novel situations has direct implications for enterprise operations. In logistics and manufacturing, it allows for more adaptable robots that can handle variations in products and warehouse layouts without extensive reprogramming. This can be especially helpful as companies explore the deployment of humanoid robots in factories and assembly lines.

The same world model can power highly realistic digital twins, allowing companies to simulate new processes or train other AIs in a physically accurate virtual environment. In industrial settings, a model could monitor video feeds of machinery and, based on its learned understanding of physics, predict safety issues and failures before they happen.

This research is a key step toward what Meta calls "advanced machine intelligence (AMI)," where AI systems can "learn about the world as humans do, plan how to execute unfamiliar tasks, and efficiently adapt to the ever-changing world around us."

Meta has released the model and its training code and hopes to "build a broad community around this research, driving progress toward our ultimate goal of developing world models that can transform the way AI interacts with the physical world."

What it means for enterprise technical decision-makers

V-JEPA 2 moves robotics closer to the software-defined model that cloud teams already recognize: pre-train once, deploy anywhere. Because the model learns general physics from public video and only needs a few dozen hours of task-specific footage, enterprises can slash the data-collection cycle that typically drags down pilot projects. In practical terms, you can prototype a pick-and-place robot on an affordable desktop arm, then roll the same policy onto an industrial rig on the factory floor without gathering thousands of fresh samples or writing custom motion scripts.

Lower training overhead also reshapes the cost equation. At 1.2 billion parameters, V-JEPA 2 fits comfortably on a single high-end GPU, and its abstract prediction targets reduce inference load further. That lets teams run closed-loop control on-prem or at the edge, avoiding cloud latency and the compliance headaches that come with streaming video outside the plant. Budget that once went to massive compute clusters can fund additional sensors, redundancy, or faster iteration cycles instead.
