ByteDance researchers have developed an artificial intelligence system that transforms single photographs into realistic videos of people speaking, singing and moving naturally, a breakthrough that could reshape digital entertainment and communications.
The new system, called OmniHuman, generates full-body videos showing people gesturing and moving in ways that match their speech, surpassing previous AI models that could only animate faces or upper bodies.
How OmniHuman Uses 18,700 Hours of Training Data to Create Realistic Motion
“End-to-end human animation has undergone notable advancements in recent years. However, existing methods still struggle to scale up as large general video generation models, limiting their potential in real applications,” the researchers wrote in a paper published on arXiv.
The team trained OmniHuman on more than 18,700 hours of human video data using a novel approach that combines multiple types of inputs: text, audio, and body movements. This “omni-conditions” training strategy allows the AI to learn from much larger and more diverse datasets than previous methods.
Credit: ByteDance
AI video generation breakthrough shows full-body movement and natural gestures
“Our key insight is that incorporating multiple conditioning signals, such as text, audio, and pose, during training can significantly reduce data wastage,” the research team explained.
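To make the “data wastage” point concrete, here is a minimal Python sketch. It is a toy illustration, not the researchers' training code: the clip names, hour counts, and the `usable_hours` helper are all hypothetical. It assumes a clip can be used for training whenever it carries at least one conditioning signal the training setup accepts, and compares an audio-only setup with one that accepts text, audio, and pose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    name: str
    hours: float
    conditions: frozenset  # conditioning signals that are usable for this clip


def usable_hours(clips, accepted):
    """Sum the hours of clips that carry at least one accepted signal."""
    return sum(c.hours for c in clips if c.conditions & accepted)


# Hypothetical toy dataset; these numbers are invented for illustration only.
CLIPS = [
    Clip("studio_talk", 2.0, frozenset({"text", "audio", "pose"})),
    Clip("noisy_street", 5.0, frozenset({"text", "pose"})),
    Clip("dance_no_speech", 3.0, frozenset({"text", "pose"})),
    Clip("podcast", 4.0, frozenset({"text", "audio"})),
]

# Audio-only training must discard every clip without usable audio.
audio_only = usable_hours(CLIPS, frozenset({"audio"}))

# Accepting any of the three signals keeps clips that audio-only would drop.
omni = usable_hours(CLIPS, frozenset({"text", "audio", "pose"}))

print(f"audio-only: {audio_only} h, omni-conditions: {omni} h")
```

In this toy dataset the audio-only setup keeps 6 of the 14 hours, while the mixed setup keeps all 14. The real system's filtering criteria and data mix are not described in this article, so the sketch shows only the general idea.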
The technology marks a significant advance in AI-generated media, demonstrating capabilities from creating videos of people delivering speeches to showing subjects playing musical instruments. In testing, OmniHuman outperformed existing systems across multiple quality benchmarks.
Credit: ByteDance
Tech giants race to develop next-generation video AI systems
The development emerges amid intensifying competition in AI video generation, with companies like Google, Meta, and Microsoft pursuing similar technology. ByteDance’s breakthrough could give the TikTok parent company an advantage in this rapidly evolving field.
Industry experts say such technology could transform entertainment production, educational content creation, and digital communications. However, it also raises concerns about potential misuse in creating synthetic media for deceptive purposes.
The researchers will present their findings at an upcoming computer vision conference, though they have not yet specified which one.