EX-4D: A Breakthrough in 4D Video Generation with Innovative Technologies

AI News · Published 5 hours ago · niko

ByteDance's PICO-MR team has open-sourced EX-4D, a 4D video generation framework. It transforms single-viewpoint videos into high-quality, multi-view 4D video sequences, combining 3D space with the time dimension, a notable advance in video generation technology.

The core of EX-4D is its Depth Watertight Mesh (DW-Mesh) technology. Traditional multi-view generation methods rely on costly multi-view camera rigs and datasets and struggle with occluded areas. DW-Mesh instead constructs a fully enclosed mesh that records both the visible and the hidden surfaces of the scene, so complex scene topologies can be handled in a unified way without multi-view supervision. A pre-trained depth prediction model projects the pixels of each frame into 3D space to form the mesh vertices, and occluded regions are explicitly marked, which preserves physical consistency and detail at extreme viewpoints.
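The projection step described above can be sketched in a few lines. This is a minimal illustration, not the EX-4D implementation: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, takes a depth map as a nested list, and uses a simple depth-jump threshold (`jump_threshold`, an assumed rule) to flag triangles that span an occlusion boundary. The extra geometry that closes the mesh is not covered.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]


def unproject_depth(depth: List[List[float]], fx: float, fy: float,
                    cx: float, cy: float) -> List[Vertex]:
    """Lift every pixel (u, v) with depth z to a camera-space 3D vertex."""
    vertices = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Standard pinhole back-projection (assumed camera model).
            vertices.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return vertices


def build_faces(depth: List[List[float]],
                jump_threshold: float) -> Tuple[List[Face], List[bool]]:
    """Triangulate the pixel grid and flag faces that span a depth jump."""
    h, w = len(depth), len(depth[0])
    faces, occluded = [], []
    for v in range(h - 1):
        for u in range(w - 1):
            i00 = v * w + u          # top-left vertex index
            i01, i10 = i00 + 1, i00 + w
            i11 = i10 + 1
            for tri in ((i00, i10, i01), (i01, i10, i11)):
                zs = [depth[i // w][i % w] for i in tri]
                faces.append(tri)
                # Assumed heuristic: a large depth spread inside one
                # triangle means it stretches across an occlusion edge.
                occluded.append(max(zs) - min(zs) > jump_threshold)
    return faces, occluded


depth_map = [[1.0, 1.0, 5.0],
             [1.0, 1.0, 5.0]]
verts = unproject_depth(depth_map, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
faces, occluded = build_faces(depth_map, jump_threshold=1.0)
```

In this toy depth map the right-hand column is much farther away than the rest, so only the two triangles touching it are flagged as occluded.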

To address the shortage of multi-view training data, EX-4D introduces two simulated mask generation strategies: rendering masks and tracking masks. These strategies simulate viewpoint movement and inter-frame consistency, allowing EX-4D to be trained for full-view generation from monocular videos alone, which greatly reduces data collection costs.
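As a loose illustration of what a rendering mask captures, the sketch below shifts a virtual camera sideways along one scanline and records which target pixels still receive a source pixel. It is an assumption-laden toy, not the EX-4D procedure: the function name `rendering_mask_row`, the `focal` and `baseline` parameters, and the 1D forward warp with a z-buffer are all hypothetical simplifications of rendering a mesh from a new viewpoint.

```python
from typing import List


def rendering_mask_row(depth_row: List[float], focal: float,
                       baseline: float) -> List[bool]:
    """Forward-warp one scanline to a shifted camera.

    Returns True for target pixels covered by some source pixel and
    False for holes (disoccluded or out-of-frame regions).
    """
    width = len(depth_row)
    zbuf = [float("inf")] * width
    for u, z in enumerate(depth_row):
        # Nearer pixels move further when the camera translates.
        target = u - round(focal * baseline / z)
        # Keep the nearest surface when several pixels land together.
        if 0 <= target < width and z < zbuf[target]:
            zbuf[target] = z
    return [z != float("inf") for z in zbuf]


# A near object (depth 1.0) in front of a far background (depth 2.0).
mask = rendering_mask_row([2.0, 2.0, 2.0, 1.0, 2.0, 2.0],
                          focal=2.0, baseline=1.0)
```

The holes in `mask` are the regions a generator would have to fill in; masking a monocular training video this way mimics the gaps that appear when the viewpoint changes.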

In performance tests, EX-4D outperforms existing open-source methods on standard metrics such as FID (Fréchet Inception Distance), FVD (Fréchet Video Distance), and VBench. In extreme-viewpoint generation, especially at angles approaching 90°, it produces more realistic details and more plausible occlusions. In a subjective evaluation, 70.7% of participants judged EX-4D superior in physical consistency at extreme viewpoints.
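For reference, FID and FVD are both based on the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated samples (image features for FID, video features for FVD). The symbols below are notation introduced here for illustration: $\mu_r, \Sigma_r$ are the mean and covariance of the real features, and $\mu_g, \Sigma_g$ those of the generated features. Lower values indicate that the two distributions are closer.

```latex
d^2\big((\mu_r, \Sigma_r), (\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
  - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```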

ByteDance has published the EX-4D code and documentation on GitHub, making them freely accessible to developers worldwide. Built on the pre-trained Wan2.1 model with a LoRA-based adapter architecture, EX-4D stays computationally efficient while preserving geometric consistency and frame-to-frame coherence in generated videos. This lightweight design makes it suitable for a range of development scenarios, including resource-constrained environments.
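The efficiency of a LoRA-based adapter comes from leaving the pre-trained weights frozen and training only a low-rank update. The following is a generic, minimal sketch of that idea in plain Python, not the EX-4D adapter itself: the class name `LoRALinear`, the `rank` and `alpha` parameters, and the initialization are illustrative assumptions.

```python
import random
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A."""

    def __init__(self, weight: Matrix, rank: int, alpha: float = 1.0,
                 seed: int = 0) -> None:
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight                      # frozen, never updated
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the adapter
        # initially leaves the base model's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x: List[float]) -> List[float]:
        col = [[v] for v in x]
        base = matmul(self.weight, col)
        delta = matmul(self.B, matmul(self.A, col))
        return [b[0] + self.scale * d[0] for b, d in zip(base, delta)]

    def trainable_params(self) -> int:
        return (len(self.A) * len(self.A[0])
                + len(self.B) * len(self.B[0]))


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
initial = layer.forward([1.0, 1.0])   # identical to the frozen layer
```

For a weight of shape `d_out × d_in`, the adapter trains `rank * (d_in + d_out)` values instead of `d_in * d_out`, which is where the savings come from when the rank is small relative to the layer size.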

EX-4D's release is a major step toward building "world models". It lets users freely explore video content, much like switching perspectives in a "parallel universe". This camera-controllable 4D generation technology opens up new possibilities for immersive content creation in fields such as interactive 3D movies, virtual tourism, and game development. The PICO-MR team will continue to optimize the model and explore broader application scenarios.
