Image, video, audio processing, UI frameworks, and design utilities
109 tools
LongLive: Real-time Interactive Long Video Generation
A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime.
Extract any sound with text prompts. Memory-optimized SAM-Audio with modern UI.
Diagram as Code Tool Written in Rust with Draggable Editing
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThi...
Visual workflow builder for AI agents powered by Firecrawl - drag-and-drop web scraping pipelines with real-time e...
OpenReel Video - Professional browser-based video editor. Open source CapCut alternative. 100% browser-based, no inst...
Your personal voice interface into any app. Speak naturally and your words appear wherever your cursor is, with fully...
ViPE: Video Pose Engine for Geometric 3D Perception
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
A free, local, self-hosted video compressor web UI designed for performance and ease of use. Inspired by 8mb.video.
Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"
Voice-activated sticker dreamer and printer.
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
Glass Keep is a Keep Notes alternative using Glass design. Made with React + Tailwind.
Official Python inference and LoRA trainer package for the LTX-2 audio-video generative model.
Fast markdown preview server with live reload and theme support.
Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multi...
MoCha: End-to-End Video Character Replacement without Structural Guidance
From baby GPT to diffusion GPT: An annotated implementation of a character-level discrete diffusion model (adapted fr...
Conversational voice AI agents
High-Quality Text-to-Video Generation with Alpha Channel