According to Decrypt, Stability AI has announced the release of Stable Video Diffusion, a generative video model designed for high-resolution text-to-video and image-to-video generation. The company's research paper highlights its adaptability and open-source technology, which allows for various applications in advertising, education, and entertainment. Stable Video Diffusion is currently available as a research preview and is claimed to outperform image-based methods at a fraction of their compute budget.
Stability AI has developed two models under the Stable Video Diffusion umbrella: SVD and SVD-XT. The SVD model transforms still images into 14-frame videos at 576x1024 resolution, while SVD-XT uses the same architecture but extends the output to 24 frames. Both models generate video at frame rates ranging from 3 to 30 frames per second, showcasing the cutting edge of open-source video generation technology. Stable Video Diffusion competes with models from Pika Labs, Runway, and Meta in the rapidly evolving field of AI video generation.
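For readers who want to experiment with the research preview, the sketch below shows roughly how the image-to-video workflow looks through the community Hugging Face diffusers integration. It assumes the StableVideoDiffusionPipeline class and the published "stabilityai/stable-video-diffusion-img2vid-xt" checkpoint are available in your environment; the input image path and output settings are illustrative, not part of the original announcement.

```python
# Minimal sketch: image-to-video with the SVD-XT checkpoint via diffusers (assumed available).
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the SVD-XT weights in half precision; a GPU with ample VRAM is assumed.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # offload layers to CPU to reduce GPU memory pressure

# Conditioning image (hypothetical local file), resized to the model's 576x1024 working resolution.
image = load_image("input_frame.png").resize((1024, 576))

# Fixed seed for reproducible sampling.
generator = torch.manual_seed(42)

# Generate the 24-frame clip; decode_chunk_size trades memory for speed during VAE decoding.
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

# Write the frames to an MP4 at a frame rate within the model's supported 3-30 fps range.
export_to_video(frames, "generated.mp4", fps=7)
```

The same pattern applies to the 14-frame SVD checkpoint by swapping in its model identifier; only the number of output frames changes.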
Despite its technological achievements, Stability AI faces challenges, including ethical questions around the use of copyrighted data in AI training. The company emphasizes that the model is not intended for real-world or commercial applications at this stage, focusing instead on refining it in response to community feedback and safety concerns.