
Skywork AI's open-source, 1.8B-parameter interactive world model. It generates real-time gameplay video at 25 FPS from keyboard and mouse inputs, keeps long sequences consistent, and ships with free weights on GitHub and Hugging Face.

Category
AI Simulation
Matrix-Game 2.0 is fully open-source and available at no cost. Model weights are downloadable from Hugging Face, and the inference code is available on GitHub. The only cost is compute: either your own GPU hardware or rented cloud GPU time. No subscription, license fee, or API charge is required to use the model.
| Plan | Details |
|---|---|
| Free | Full open-source access to model weights (1.8B parameters), inference pipeline, streaming generation code, and project documentation on GitHub and Hugging Face at no cost under the project's open-source license. |
| Paid | No paid tier. The model is entirely open-source with no commercial licensing requirements. |
Quick Summary
Matrix-Game 2.0 is an open-source interactive world foundation model developed by Skywork AI and released on August 12, 2025. It generates real-time interactive video at 25 frames per second, controlled via keyboard and mouse inputs, across continuous sequences that extend to several minutes. It is built on a 1.8-billion-parameter Multimodal Diffusion Transformer architecture trained on approximately 1,200 hours of footage from Unreal Engine and GTA 5, and is the first fully open-source model of its kind to deliver real-time, long-sequence interactive world generation. The model weights and inference code are freely available on GitHub and Hugging Face, making it a practical research baseline for game AI, embodied AI training, and spatial intelligence research.
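To put the headline numbers in perspective, here is a minimal back-of-the-envelope sketch in Python. The 25 FPS rate and 1.8B parameter count come from the summary above. The 3-minute session length and the 2-bytes-per-parameter (16-bit) weight format are illustrative assumptions, not published specifications, and the helper names are hypothetical.

```python
# Back-of-the-envelope figures derived from the summary's numbers.
# Assumptions (not from the project docs): a 3-minute session and
# 16-bit weights (2 bytes per parameter).

FPS = 25            # frames per second, from the summary
NUM_PARAMS = 1.8e9  # parameter count, from the summary


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame while staying real-time."""
    return 1000.0 / fps


def frames_for_duration(fps: float, seconds: float) -> int:
    """Number of frames in a continuous session of the given length."""
    return round(fps * seconds)


def weights_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Raw weight storage in decimal gigabytes (weights only, no activations)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.1f} ms")
    print(f"Frames in 3 minutes: {frames_for_duration(FPS, 180)}")
    print(f"Weights at 2 bytes/param: {weights_size_gb(NUM_PARAMS, 2):.1f} GB")
```

Under these assumptions, the model has about 40 ms to produce each frame, a 3-minute session is 4,500 consecutive frames, and the weights alone take roughly 3.6 GB. Actual GPU memory use will be higher, since this ignores activations, caches, and any auxiliary components.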
Associated Tags
interactive world model, real-time AI generation, open source world model, AI game engine, embodied AI training, autoregressive video generation, Skywork AI, Genie 3 alternative
Practical workflows and real-world scenarios where Matrix-Game 2.0 is useful.
An AI researcher downloads the Matrix-Game 2.0 weights and runs the inference pipeline to study autoregressive diffusion behavior across long action sequences, using it as a benchmark baseline for a new world model paper.
A game developer uploads a single concept environment image and uses the streaming inference mode to generate a navigable real-time world draft, evaluating whether AI-generated environments can replace early engine prototyping in their workflow.
An embodied AI team generates diverse indoor and outdoor virtual training environments from varied starting images to create a broad dataset for training navigation and manipulation agents without manual scene authoring.
A VFX researcher uses the model's physics-aware frame generation to study how AI world models handle object interaction, movement physics, and scene dynamics without explicit physics engine rules.
A developer fine-tunes the model on a custom domain-specific dataset to adapt the world generation style for a specialized interactive application beyond the original training distribution.
An academic team uses the open-source codebase as a reproducible implementation to compare Matrix-Game 2.0 against Oasis and other world models on the GameWorld benchmark for a survey paper on generative game environments.
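Most of the scenarios above share one control loop: start from a single image, then produce each new frame from recent frames plus the current keyboard and mouse input. The toy Python sketch below shows only that loop shape. It is not the Matrix-Game 2.0 API: `Action`, `stub_next_frame`, and `rollout` are hypothetical names, the "frame" is a simple `(x, y, yaw)` camera state rather than an image, and the fixed-length sliding context window is an assumption about how history might be bounded.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One step of user input (hypothetical structure)."""
    keys: frozenset          # pressed keys, e.g. frozenset({"W"})
    mouse_dx: float = 0.0    # horizontal mouse movement for this step


def stub_next_frame(history, action):
    """Stand-in for the real model: moves a toy camera state.

    A real world model would predict an image from the frame history
    and the action; here a 'frame' is just an (x, y, yaw) tuple.
    """
    x, y, yaw = history[-1]
    yaw += action.mouse_dx
    if "W" in action.keys:
        y += 1
    if "S" in action.keys:
        y -= 1
    if "A" in action.keys:
        x -= 1
    if "D" in action.keys:
        x += 1
    return (x, y, yaw)


def rollout(first_frame, actions, context_len=4, predict=stub_next_frame):
    """Autoregressive rollout: each output is fed back in as input."""
    history = deque([first_frame], maxlen=context_len)  # sliding context
    frames = [first_frame]
    for action in actions:
        frame = predict(list(history), action)
        history.append(frame)   # oldest frame drops out when full
        frames.append(frame)
    return frames


if __name__ == "__main__":
    inputs = [
        Action(frozenset({"W"})),
        Action(frozenset({"W", "D"}), mouse_dx=5.0),
    ]
    for frame in rollout((0, 0, 0.0), inputs):
        print(frame)
```

Because every predicted frame becomes part of the next step's input, small errors can compound over a long session, which is why long-sequence consistency is a meaningful property to benchmark. Swapping `predict` for a call into the real inference pipeline would follow the project's own documentation, which this sketch does not attempt to reproduce.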
Reviewed by Sohail Akhtar
Lead Editor & Founder
What we like
Limitations
Who should use Matrix-Game 2.0?
Open-source AI model by Tencent that generates explorable, interactive 3D worlds from text or image inputs using panoramic scene reconstruction.
Multimodal AI world model by World Labs that generates persistent, navigable 3D environments from text, images, video, or 3D layouts, with in-scene editing and Gaussian splat, mesh, and video export.
Open-source AI world model by Decart and Etched that generates real-time Minecraft-style interactive gameplay at 20 FPS using next-frame prediction, with no traditional game engine required.
Google DeepMind research model for generating interactive virtual environments from text prompts at 720p and 24 FPS.