Popular AI image generation service Midjourney has launched its first AI video generation model, V1, marking a pivotal shift for the company from image generation toward full multimedia content creation.
Starting today, Midjourney's nearly 20 million users can animate images via the website, transforming their generated or uploaded stills into 5-second clips, with options to extend the generation up to 20 seconds (in 5-second bursts) and to guide it with text.
With the launch, the bootstrapped small lab Midjourney positions itself in a rapidly intensifying AI video race. At the same time, it is also confronting serious legal challenges from two of the largest entertainment studios in the world.
What does it mean for AI creators and enterprises looking to harness the latest in creative tech for advertising, marketing, or user engagement? And how does Midjourney stack up against a long and growing list of AI video model rivals? Read on to find out.
A new product built directly atop Midjourney's popular AI image generator
Midjourney's new offering extends its familiar image-based workflow, including its new V7 text-to-image model.
Users generate a still image, either within the Midjourney platform or by uploading an external file, then press "Animate" to turn that image into video.
Two main modes exist: one uses automatic motion synthesis, while the other lets users write a custom motion prompt to dictate via text how elements should move in the scene. Midjourney video thus arrives with support for both image-to-video animation and text-guided motion edits.
From a creative standpoint, users can toggle between two motion settings. Low motion mode is optimized for ambient or minimalist movement, such as a character blinking or a light breeze shifting scenery, while high motion mode attempts more dynamic animation of both subject and camera, though this can increase the chance of visual errors.
These settings are accessed beneath a generated or uploaded image on the Midjourney website, in the right-hand options pane under a field labeled "Animate Image."
Each video job generates four different 5-second clips as options, and users can extend an animation by 4 seconds per clip, up to a total of 20 seconds.

While this is relatively short-form, the company has indicated that video duration and features will expand in future updates.
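For readers who prefer to see the workflow as data, here is a minimal Python sketch of the choices described above: the automatic versus text-prompted motion modes, the low/high motion setting, and the clip-extension math. Midjourney has not published a video API, so every name and field below is hypothetical and purely illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnimateJob:
    """Hypothetical model of one 'Animate Image' job as described in this article.
    Midjourney exposes these choices through its website, not a public API, so
    these field names are illustrative, not official."""
    source_image: str                     # a Midjourney generation or an uploaded still
    motion_prompt: Optional[str] = None   # None means automatic motion synthesis
    high_motion: bool = False             # False = ambient "low motion", True = dynamic subject and camera
    extensions: int = 0                   # each extension adds ~4 seconds, capped at 20 seconds total

    def duration_seconds(self) -> int:
        base, per_extension, cap = 5, 4, 20
        return min(base + self.extensions * per_extension, cap)


# Example: an uploaded still, guided by a text motion prompt and extended twice (~13 seconds).
job = AnimateJob("harbor_at_dusk.png", motion_prompt="slow pan across the boats", extensions=2)
print(job.duration_seconds())  # 13
```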
Midjourney, launched in summer 2022, is widely regarded by many AI image creators as the premier or "gold standard" service for AI image generation to this day, thanks to its comparatively frequent updates and its more lifelike and varied generation options, so there were high expectations surrounding its entry into the AI video space.
Here's a sample generation I created with my personal Midjourney account:
Affordable pricing
Midjourney is offering video access as part of its existing subscription plans, starting at $10 per month.
The company states that each video job will cost roughly 8x more than an image generation task. However, since each video job produces 20 seconds of content, the cost per second is roughly equal to producing one still image, a pricing model that appears to undercut many competitors.
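As a rough back-of-envelope check on that claim, the sketch below uses only the figures stated here (a video job costs about 8x an image job and yields four 5-second clips); the "image job" unit is a relative placeholder, not an actual dollar price.

```python
# Relative cost comparison using the figures reported above: one video job costs
# roughly 8x an image generation job and yields four 5-second clips.
IMAGE_JOB_COST = 1.0                    # relative unit, not a dollar amount
VIDEO_JOB_COST = 8 * IMAGE_JOB_COST     # ~8x an image generation task
CLIPS_PER_JOB, SECONDS_PER_CLIP = 4, 5

seconds_per_job = CLIPS_PER_JOB * SECONDS_PER_CLIP   # 20 seconds of footage per job
cost_per_second = VIDEO_JOB_COST / seconds_per_job   # 0.4 image-job units per second

print(f"{seconds_per_job}s per job -> {cost_per_second:.2f} image-job units per second")
# Midjourney frames this as roughly one image's worth of cost per second of video.
```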
A "video relax mode" is being tested for "Pro" subscribers and above. This mode, like its counterpart in image generation, would offer delayed processing in exchange for reduced compute costs. Fast generation remains metered through GPU minutes based on tiered subscription plans.
That puts the entry price in line with rival Luma AI's "Web Lite Plan" at $9.99 per month and below Runway's "Standard" plan ($15 per month).
No sound yet, and a more limited built-in editor than AI video rivals such as Runway, Sora, and Luma
The model's most noticeable limitation is its lack of sound.
Unlike competitors such as Google's Veo 3 and Luma Labs' Dream Machine, Midjourney's system doesn't generate accompanying audio tracks or ambient sound effects.
For now, any soundtrack would have to be added manually in post-production using separate tools.
In addition, Midjourney's outputs remain short, capped at 20 seconds. There is no current support for editing timelines, scene transitions, or continuity between clips.
Midjourney has acknowledged that this is only the beginning and that the initial release is intended to be exploratory, accessible, and scalable.
Rising stakes in a crowded AI video market
The launch lands at a time when AI video generation is rapidly becoming one of the most competitive corners of the generative AI landscape.
Tech giants, venture-backed startups, and open-source projects are all moving fast.
This week, Chinese startup MiniMax released Hailuo 02, an upgrade to its earlier video model. Early feedback has praised its realism, its motion adherence to prompts, and its 1080p resolution, though some reviewers noted that render times are still relatively slow.
The model appears especially adept at interpreting complex motion and cinematic camera angles, putting it in direct comparison with Western offerings like Runway's Gen-3 Alpha and Google's Veo line.
Meanwhile, Luma Labs' Dream Machine has gained traction for its ability to co-generate audio alongside high-fidelity video, a feature missing from Midjourney's new release, and, like Runway, it allows for re-styling or "re-skinning" video with a new feature called Modify Video.
Google's Veo 3 and OpenAI's upcoming Sora model are similarly working toward broader multimodal synthesis, integrating text, image, video, and sound into cohesive, editable scenes.
Midjourney's bet appears to be on simplicity and cost-effectiveness, a "good enough" solution priced for scale, but that also means it launches without many advanced features now standard in the premium AI video tier.
The shadow of litigation from Disney and Universal over IP infringement
In a lawsuit filed shortly before the video launch, the two studios allege that Midjourney has created a "bottomless pit of plagiarism," intentionally enabling users to produce downloadable images featuring characters like Darth Vader, Elsa, Iron Man, Bart Simpson, Shrek, and Toothless with little friction.
Of particular note: the lawsuit preemptively names Midjourney's video service as a potential source of future infringement, stating that the company had begun training the model before launch and was likely already replicating protected characters in motion.
According to the complaint, Midjourney earned $300 million in revenue in 2024 and serves nearly 21 million users. The studios argue that this scale gives the platform a commercial advantage built atop uncompensated creative labor.
Disney's general counsel, Horacio Gutierrez, stated plainly: "Piracy is piracy. And the fact that it's done by an AI company does not make it any less infringing."
For enterprises concerned about infringement risks, services with built-in indemnity, such as OpenAI's Sora or Adobe Firefly Video, are probably better options for AI video creation.
A 'world model' and real-time world generation is the goal
Despite the immediate risks, Midjourney's long-term roadmap is clear and ambitious. In public statements surrounding the video model's launch, the company said its goal is to eventually merge static image generation, animated motion, 3D spatial navigation, and real-time rendering into a single, unified system known as a world model.
These systems aim to let users navigate through dynamically generated environments: spaces where visuals, characters, and user inputs evolve in real time, like immersive video games or VR experiences.
They envision a future where users can issue commands like "walk through a market in Morocco at sunset" and the system responds with an explorable, interactive simulation, complete with evolving visuals and perhaps, eventually, generative sound.
For now, the video model serves as an early step in this direction. Midjourney has described it as a "technical stepping stone" to more complex systems.
But Midjourney is far from the only AI research lab pursuing such ambitious plans.
Odyssey, a startup co-founded by self-driving tech veterans Oliver Cameron and Jeff Hawke, recently debuted a system that streams video at 30 frames per second with spatial interaction capabilities. Its model attempts to predict the "next state of the world" based on prior states and actions, letting users look around and explore scenes as if navigating a 3D space.
Odyssey combines AI modeling with its own 360-degree camera hardware and is pursuing integrations with 3D tools like Unreal Engine and Blender for post-generation editing. However, it doesn't yet allow for much user control beyond moving the camera and seeing whatever sights the model produces as the user navigates the generated space.
Similarly, Runway, a longtime player in AI video generation, has begun folding world modeling into its public roadmap. The company's AI video models, the most recent of which, Gen-4, launched in April 2025, support advanced camera controls that let users arc around subjects, zoom in and out, or glide smoothly across environments, features that begin to blur the line between video generation and scene simulation.
In a 2023 blog post, Runway's CTO Anastasis Germanidis described general world models as systems that understand environments deeply enough to simulate future events and interactions within them. In other words, they are not just producing what a scene looks like; they are predicting how it behaves.
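That framing is easiest to see as a loop: a world model repeatedly predicts the next state of a scene from the current state and the user's action, then renders what that state looks like. The sketch below is a bare conceptual outline of that loop, not any company's actual implementation; every function and field name is a placeholder.

```python
# Conceptual world-model loop: predict the next scene state from the current state
# and the user's action, then render it. All names here are placeholders; this does
# not describe any vendor's real model or API.
from typing import Dict


def predict_next_state(state: Dict, action: str) -> Dict:
    """Stand-in for a learned dynamics model: state_{t+1} = f(state_t, action_t)."""
    return {"step": state["step"] + 1, "last_action": action}


def render_frame(state: Dict) -> str:
    """Stand-in for a generative renderer that turns latent state into pixels."""
    return f"frame {state['step']} after '{state['last_action']}'"


state = {"step": 0, "last_action": "start"}
for action in ["walk forward", "turn left", "look at the market stalls"]:
    state = predict_next_state(state, action)  # simulate how the scene evolves
    print(render_frame(state))                 # draw what the user now sees
```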
Other major AI efforts in this space include:
DeepMind, which has done foundational research into world modeling for robot training and reinforcement learning;
World Labs, the new venture led by AI researcher Fei-Fei Li, focused specifically on simulation-centric models;
Microsoft, which is exploring world models for enterprise applications like digital twins and simulation-based training;
Decart, a stealthier but well-funded startup working on multi-agent simulation models.
While Midjourney's approach has so far emphasized accessibility and ease of use, the company is now signaling an evolution toward these more sophisticated simulation frameworks. It says that to get there, it must first build the necessary components: static visuals (its original image models), motion (video models), spatial control (3D positioning), and real-time responsiveness. Its new video model, then, serves as one foundational block in this longer arc.
This puts Midjourney in a global race, not just to generate beautiful media, but to help define the infrastructure of interactive, AI-generated worlds.
A calculated and promising leap into an increasingly challenging competitive space
Midjourney's entry into video generation is a logical extension of its popular image platform, priced for broad access and designed to lower the barrier to animation experimentation. It offers an easy path for creators to bring their visuals to life, at a cost structure that, for now, appears both competitive and sustainable.
But the launch also places the company squarely in the crosshairs of several challenges. On the product side, it faces capable and fast-moving rivals with more features and less legal baggage. On the legal front, it must defend its practices in a lawsuit that could reshape how AI companies are allowed to train and deploy generative models in the U.S.
For enterprise leaders evaluating AI creative platforms, Midjourney's release presents a double-edged sword: a low-cost, fast-evolving tool with strong user adoption, but with unresolved regulatory and IP exposure that could affect reliability or continuity in business deployments.
The question going forward is whether Midjourney can maintain its velocity without hitting a legal wall, or whether it will need to significantly restructure its business and technology to stay viable in a maturing AI content ecosystem.