Ovi: Open Source AI Video & Audio Generator Like Sora 2 and Veo 3

A free, open-source alternative to the Veo 3 and Sora 2 AI video generators

File Information

Property | Details
Name | Ovi
File Type | Git repository (zip available)
License | Open Source (Apache 2.0)
Repository | GitHub Repository
Platform | Windows, Linux, macOS
Required Python | 3.10+
Dependencies | PyTorch, torchvision, torchaudio, flash_attn, others

Description

Ovi is an open-source AI model developed by Character AI that simultaneously generates synchronized video and audio, directly from text or from a combination of text and images. Inspired by models like Veo 3 and Sora 2, it lets creators produce short cinematic clips without paid software, although it does need a GPU with a large amount of VRAM (see System Requirements below).

With Ovi, creators can produce 5-second high-quality videos with audio at 24 frames per second, supporting multiple aspect ratios including 9:16, 16:9, and 1:1. This makes it perfect for social media clips, marketing content, and even educational short videos. The model’s flexible input system allows you to simply provide a text prompt describing the scene, or enhance it further by adding an image reference, giving you full creative control over the generated content.
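As a rough sanity check on those numbers, the sketch below computes the nominal frame count of a 5-second clip at 24 FPS and picks a width and height for each aspect ratio under a fixed pixel budget. The 720×720 pixel budget and the rounding to multiples of 16 are assumptions made for illustration, not values stated in this article, and `frame_count` and `dimensions_for` are hypothetical helpers. Check the repository's config files for the resolutions Ovi actually uses.

```python
import math

CLIP_SECONDS = 5            # clip length stated in the article
FPS = 24                    # frame rate stated in the article
PIXEL_BUDGET = 720 * 720    # ASSUMPTION: total pixels per frame, for illustration
MULTIPLE = 16               # ASSUMPTION: dimensions snapped to a multiple of 16


def frame_count(seconds: int = CLIP_SECONDS, fps: int = FPS) -> int:
    """Nominal number of frames; the model's real output may differ slightly."""
    return seconds * fps


def snap(value: float, multiple: int = MULTIPLE) -> int:
    """Round a dimension to the nearest multiple, never below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def dimensions_for(aspect_w: int, aspect_h: int,
                   budget: int = PIXEL_BUDGET) -> tuple[int, int]:
    """Return (width, height) close to the pixel budget for a given ratio."""
    # Solve width * height = budget with width / height = aspect_w / aspect_h.
    height = math.sqrt(budget * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h
    return snap(width), snap(height)


if __name__ == "__main__":
    print("frames:", frame_count())
    for ratio in [(16, 9), (9, 16), (1, 1)]:
        print(ratio, dimensions_for(*ratio))
```

Keeping the pixel budget fixed means every aspect ratio costs roughly the same amount of memory and compute per frame, which is why the wide and tall variants trade width for height rather than growing in both directions.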

One of Ovi’s most impressive features is its ability to generate synchronized audio and video seamlessly. Using special tags like <S> for speech and <AUDCAP> for audio descriptions, users can define exactly what dialogue or sound effects appear in the video. This ensures that every generated clip has perfectly timed narration, effects, or background sounds, opening endless possibilities for storytelling, music videos, or even animated scenes.

Ovi is also designed to be accessible. The model is open-source, meaning developers can modify it, integrate it into their own projects, or experiment with custom enhancements. It targets modern GPUs with plenty of VRAM, and fp8 quantization and CPU offloading lower the memory requirement for cards with less, which helps both hobbyists and professional creators.

For creators who love experimenting, Ovi supports multiple modes including text-to-video, image-to-video, and hybrid workflows. You can generate clips in bulk, test variations, and even fine-tune the output using configuration files. Multi-GPU setups are supported for faster generation, while single GPU usage is straightforward for beginners.

Usage of Ovi AI Video Generator

Whether you’re a filmmaker, content creator, or AI enthusiast, Ovi provides a powerful, flexible, and fully open-source tool to produce high-quality audiovisual content. With its combination of ease-of-use, creative freedom, and advanced AI capabilities, Ovi truly sets itself apart as one of the most exciting video generation tools available today.

You can find sample generations from Ovi in its official GitHub repository.

How Ovi Was Developed

Ovi builds on contributions and ideas from existing open-source video and audio generation projects. Its video branch is initialized from Wan2.2, a text-to-video model, and its audio encoder and decoder components are borrowed from MMAudio.

Features of Ovi AI Video Generator

Feature | Description
Video+Audio Generation | Create synchronized video and audio clips in one go
Flexible Input | Supports text-only or text+image prompts
Multi-Aspect Ratios | 9:16, 16:9, 1:1 and more
Clip Length and Frame Rate | Generates 5-second clips at 24 FPS
Customizable Prompts | Use <S> tags for speech and <AUDCAP> tags for audio descriptions
Multi-GPU Support | Accelerated generation with multiple GPUs
Open-Source | Fully modifiable and extensible
Easy Configuration | YAML files to control generation quality, audio/video balance, and output directory
Example Prompts | Ready-to-use CSV examples for quick start
Flexible Memory Usage | Supports fp8 quantization and CPU offloading for lower VRAM GPUs

System Requirements

Component | Minimum | Recommended
GPU | 32 GB VRAM for standard model, 24 GB for fp8 quantized | RTX 40XX, 30XX, 20XX, GTX 10XX, or equivalent (with VRAM at or above the minimum)
CPU | 8 cores | 12+ cores for multi-GPU setups
RAM | 16 GB | 32+ GB
Disk Space | 10 GB | 50+ GB for models, videos, and caches
OS | Windows, Linux, macOS | Same
Python | 3.10+ | Same
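The VRAM figures in the table can be turned into a quick decision rule for which memory-saving flags to try. The sketch below is illustrative only: `memory_flags` is a hypothetical helper, the `--fp8` and `--cpu_offload` names are the optional flags this article lists for the Gradio app, and the choice to combine both flags below 24 GB is an assumption, not a guarantee that the model will fit.

```python
STANDARD_VRAM_GB = 32   # minimum for the standard model, per the table
FP8_VRAM_GB = 24        # minimum for the fp8-quantized model, per the table


def memory_flags(vram_gb: float) -> list[str]:
    """Suggest memory-saving flags for a GPU with the given VRAM in GB."""
    if vram_gb >= STANDARD_VRAM_GB:
        # Enough memory for the standard model; no extra flags needed.
        return []
    if vram_gb >= FP8_VRAM_GB:
        # Meets the fp8 figure from the table.
        return ["--fp8"]
    # ASSUMPTION: below the fp8 figure, try quantization plus CPU offloading.
    # This may still run out of memory or be very slow.
    return ["--fp8", "--cpu_offload"]


if __name__ == "__main__":
    for vram in (48, 24, 16):
        print(vram, "GB ->", memory_flags(vram) or "no flags")
```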

How to Install Ovi, an Open Source Alternative to Sora 2 & Veo 3, Locally

Step-by-Step Installation

  1. Clone the repository

Clone the repository with the commands below, or download it manually from the download section of this page.

git clone https://github.com/character-ai/Ovi.git
cd Ovi

  2. Create and activate a virtual environment

virtualenv ovi-env
source ovi-env/bin/activate
# On Windows, run ovi-env\Scripts\activate instead

  3. Install PyTorch

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1

  4. Install dependencies

pip install -r requirements.txt

  5. Install Flash Attention

pip install flash_attn --no-build-isolation

Alternative method (if the pip install above fails): build Flash Attention from source.

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
python setup.py install
cd ../..

  6. Download pre-trained weights

python3 download_weights.py
# Optional: custom directory
python3 download_weights.py --output-dir <custom_dir>

  7. Run Ovi

python3 inference.py --config-file ovi/configs/inference/inference_fusion.yaml

For multi-GPU setups, set --nproc_per_node to the number of GPUs (8 in this example):

torchrun --nnodes 1 --nproc_per_node 8 inference.py --config-file ovi/configs/inference/inference_fusion.yaml

You can also launch the Gradio interface:

python3 gradio_app.py
# Optional flags: --cpu_offload, --use_image_gen, --fp8
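If you switch between single-GPU and multi-GPU runs often, it can help to build the launch command in one place. The sketch below assembles the same two commands shown in step 7, depending on the GPU count. `inference_command` is a hypothetical helper written for this article, not part of the Ovi repository, and it only builds the argument list without running anything.

```python
import shlex

# Config path taken from the commands in this article.
DEFAULT_CONFIG = "ovi/configs/inference/inference_fusion.yaml"


def inference_command(num_gpus: int = 1,
                      config: str = DEFAULT_CONFIG) -> list[str]:
    """Return the argument list for a single-GPU or multi-GPU inference run."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single GPU: plain Python launch.
        return ["python3", "inference.py", "--config-file", config]
    # Multiple GPUs on one machine: one torchrun process per GPU.
    return [
        "torchrun", "--nnodes", "1", "--nproc_per_node", str(num_gpus),
        "inference.py", "--config-file", config,
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before running them.
    print(shlex.join(inference_command()))
    print(shlex.join(inference_command(num_gpus=8)))
```

To actually launch a run, the returned list could be passed to `subprocess.run(cmd, check=True)` from the repository root, with the virtual environment activated.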

Download Ovi: An Open Source Veo 3 & Sora 2 Alternative to Generate AI Video with Audio for Free

Tips & Advantages of Ovi AI Video + Audio Generator

  • Generate 5-second clips directly from text prompts.
  • Include synchronized speech or sound effects using special tags.
  • Fine-tune video quality, denoising steps, and audio/video balance via config files.
  • Share prompt files (such as the example CSVs) together with their YAML configurations for collaboration.
  • fp8 quantization and CPU offloading reduce VRAM usage on GPUs that fall short of the 32 GB needed for the standard model.
  • Open-source and fully modifiable, perfect for research or personal projects.

Ovi is a strong choice for anyone exploring AI-driven video creation. It targets the same kind of joint video-and-audio generation as Sora 2 or Veo 3, limited to 5-second clips, while adding open-source flexibility, multi-GPU support, and customizable inputs. Whether you are making short clips for social media, experimental AI films, or storytelling videos, Ovi gives you direct control over the result.
