A new free, open-source model for generating videos from text and images, “Pyramid Flow SD3,” was recently released to the public. Its headline feature is the ability to generate videos up to 10 seconds long at 768p resolution and 24 frames per second, twice the length of the 5-second clips that competing models produce by default.
The model is fully open and can be run locally, so enthusiasts and developers can work with it without the restrictions of proprietary software. It is published on the Hugging Face platform, which makes it easy to experiment with and to build new video generation tools on top of it.

The foundation of “Pyramid Flow SD3” is a new technique called Pyramid Flow, which combines autoregressive video generation with the Flow Matching method to produce smooth transitions between frames and realistic, dynamic videos. Besides text prompts, the model also supports an “image-to-video” mode, in which a still image serves as the starting point for the clip.
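
For readers curious about what Flow Matching means in practice, here is a minimal toy sketch in Python. It is not the Pyramid Flow code: the function names, the three-number "latent," and the oracle velocity field are all assumptions made for illustration, and the pyramid stages and autoregressive frame conditioning are left out entirely. It only shows the core idea of moving a noise sample toward a data sample along a straight line by following a velocity.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Velocity that a flow-matching network is trained to predict on that path."""
    return x1 - x0


def flow_matching_loss(predicted_velocity, x0, x1):
    """Mean squared error between a predicted velocity and the target velocity."""
    return float(np.mean((predicted_velocity - target_velocity(x0, x1)) ** 2))


def euler_sample(velocity_fn, x0, num_steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


rng = np.random.default_rng(0)
x1 = np.array([1.0, -2.0, 0.5])  # toy "data" sample standing in for a video latent
x0 = rng.standard_normal(3)      # Gaussian noise sample


def oracle_velocity(x, t):
    # Hypothetical stand-in for a trained network: it already knows the
    # exact velocity for this single noise/data pair.
    return target_velocity(x0, x1)


sample = euler_sample(oracle_velocity, x0)
print(np.allclose(sample, x1))
```

In a real model, a neural network trained with a loss like `flow_matching_loss` would take the place of `oracle_velocity`, and the samples would be compressed video latents rather than three numbers.
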
The model could prove a significant step forward for content creators, letting them produce high-quality videos quickly without complex equipment or software. Its open-source code and reliance on publicly available datasets make its development transparent and could accelerate progress in AI video generation.