LocateAnything
Detect and label objects in images and videos
Generate a video from an image with a text prompt
Audio-driven talking-head video generation (Meituan LongCat)
Generate, edit, and understand images and videos with Lance!
Demo of the Collection of Qwen Image Edit LoRAs
Generate images from text prompts instantly
Generate a video from an image with a text prompt
Text-to-video, image-to-video, and video extension
FireRed-Image-Edit and Qwen-Image-Edit-Rapid (Transformers)
Text-to-audio with SA3 Medium / Small Music / Small SFX.
Segment objects in live webcam and uploaded media
High-fidelity 3D Generation from images
High-quality voice cloning TTS for 600+ languages
Demo of the Collection of Qwen Image Editing LoRAs
Generate a 3D model from a single image
Generate videos from text, images, audio, or video clips
Pixel Diffusion Decoder
High-fidelity pixel-aligned image-to-3D generation.
NVIDIA Cosmos3-Nano: text/image to video + audio
High-resolution 2K image editing with FireRed
End-to-end pixel-space 6B diffusion via L2P
Generate a video from an image with a text prompt
Generate a video from an image and motion prompt
3D reconstruction from images/video with VGGT-Omega