Image, video, audio processing, UI frameworks, and design utilities
109 tools
[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Bridge the gap between photo and video color grading. Accurately apply any creative LUT to your RAW files with this t...
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Create stunning visual designs with Stage, a modern canvas editor that brings your ideas to life. Add images, text, ba...
A native macOS menu bar app for managing audio device priorities
Portable file server with accelerated resumable uploads, dedup, WebDAV, SFTP, FTP, TFTP, zeroconf, media indexer, thu...
[arXiv 2025] VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation
The AIO solution to your self-hosted media gathering needs
Open source WhatsApp inbox for the Cloud API. Template messages, interactive buttons, media support, and 24-hour wind...
rCM: SOTA Diffusion Distillation & Few-Step Video Generation based on sCM/MeanFlow
A general-purpose AI image generation framework that supports Hugging Face, Gitee, ModelScope, and more.
LLM-written music
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
An interactive 3D globe that lets you explore the history of any location on the planet. Born from a love of "doom-sc...
Extract any website's design system into tokens in seconds: logo, colors, typography, borders & more. One command.
Trace Anything: Representing Any Video in 4D via Trajectory Fields
Tiny truly local voice-activated LLM Agent that runs on a Raspberry Pi
LSP audio feedback in Neovim
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understan...
A modern self-hosted media management system for your media library
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
SoTA open-source TTS
Speakr is a personal, self-hosted web application designed for transcribing audio recordings
A tool to snap pixels to a perfect grid. Designed to fix messy and inconsistent pixel art generated by AI.