Audio & Speech Models

Speech recognition, text-to-speech, and audio processing models.

📊 Updated: Dec 31, 2025, 03:58 PM UTC

📊 Full Rankings

sora-2-pro · 🛡️ FNI 53 · ❤️ 0 · 📥 0
speech-02-hd · 🛡️ FNI 52 · ❤️ 0 · 📥 0
music-01 · 🛡️ FNI 51 · ❤️ 0 · 📥 0
whisper-diarization-advanced · 🛡️ FNI 51 · ❤️ 0 · 📥 0
veo-3 · 🛡️ FNI 50 · ❤️ 0 · 📥 0
veo-3.1-fast · 🛡️ FNI 50 · ❤️ 0 · 📥 0
kokoro-82m · 🛡️ FNI 50 · ❤️ 0 · 📥 0
kokoro-82m · 🛡️ FNI 50 · ❤️ 0 · 📥 0
veo-3-fast · 🛡️ FNI 50 · ❤️ 0 · 📥 0
mmaudio · 🛡️ FNI 50 · ❤️ 0 · 📥 0
whisper-diarization · 🛡️ FNI 50 · ❤️ 0 · 📥 0
lipsync · 🛡️ FNI 49 · ❤️ 0 · 📥 0
speech-2.6-hd · 🛡️ FNI 49 · ❤️ 0 · 📥 0
lyria-2 · 🛡️ FNI 49 · ❤️ 0 · 📥 0
speech-2.6-turbo · 🛡️ FNI 49 · ❤️ 0 · 📥 0
gpt-4o-transcribe · 🛡️ FNI 48 · ❤️ 0 · 📥 0
voice-cloning · 🛡️ FNI 48 · ❤️ 0 · 📥 0
kling-lip-sync · 🛡️ FNI 48 · ❤️ 0 · 📥 0
music-1.5 · 🛡️ FNI 48 · ❤️ 0 · 📥 0
canary-qwen-2.5b · 🛡️ FNI 47 · ❤️ 0 · 📥 0
gpt-4o-mini-transcribe · 🛡️ FNI 47 · ❤️ 0 · 📥 0
kimi-audio-7b-instruct · 🛡️ FNI 46 · ❤️ 0 · 📥 0
xtts-v2 · 🛡️ FNI 46 · ❤️ 0 · 📥 0
clipforge · 🛡️ FNI 46 · ❤️ 0 · 📥 0
orpheus-3b-0.1-ft · 🛡️ FNI 45 · ❤️ 0 · 📥 0
tangoflux · 🛡️ FNI 45 · ❤️ 0 · 📥 0
ace-step · 🛡️ FNI 45 · ❤️ 0 · 📥 0
csm-1b · 🛡️ FNI 45 · ❤️ 0 · 📥 0
audio-flamingo-3 · 🛡️ FNI 45 · ❤️ 0 · 📥 0
parakeet-rnnt-1.1b · 🛡️ FNI 44 · ❤️ 0 · 📥 0
realistic-voice-cloning · 🛡️ FNI 44 · ❤️ 0 · 📥 0
cog-orpheus-3b-0.1-ft · 🛡️ FNI 44 · ❤️ 0 · 📥 0
whisperx-a40-large · 🛡️ FNI 44 · ❤️ 0 · 📥 0
sarra-video-maker-v1 · 🛡️ FNI 44 · ❤️ 0 · 📥 0
demucs · 🛡️ FNI 44 · ❤️ 0 · 📥 0
kokoro-82m · 🛡️ FNI 44 · ❤️ 0 · 📥 0
play-dialog · 🛡️ FNI 44 · ❤️ 0 · 📥 0
force-align-wordstamps · 🛡️ FNI 44 · ❤️ 0 · 📥 0
codemusic · 🛡️ FNI 44 · ❤️ 0 · 📥 0
hunyuanvideo-foley · 🛡️ FNI 43 · ❤️ 0 · 📥 0
indextts-2 · 🛡️ FNI 43 · ❤️ 0 · 📥 0
voicecraft · 🛡️ FNI 43 · ❤️ 0 · 📥 0
dia · 🛡️ FNI 43 · ❤️ 0 · 📥 0
resemble-enhance · 🛡️ FNI 42 · ❤️ 0 · 📥 0
thinksound · 🛡️ FNI 42 · ❤️ 0 · 📥 0
whisper · 🛡️ FNI 42 · ❤️ 0 · 📥 0
hate-speech-detector · 🛡️ FNI 42 · ❤️ 0 · 📥 0
musicgen · 🛡️ FNI 42 · ❤️ 0 · 📥 0
video-stitcher · 🛡️ FNI 42 · ❤️ 0 · 📥 0
voxtral-small-24b-2507 · 🛡️ FNI 41 · ❤️ 0 · 📥 0
voxtral-mini-3b · 🛡️ FNI 41 · ❤️ 0 · 📥 0
forced-alignment · 🛡️ FNI 41 · ❤️ 0 · 📥 0
all-in-one-music-structure-analyzer · 🛡️ FNI 41 · ❤️ 0 · 📥 0
all-in-one-audio · 🛡️ FNI 41 · ❤️ 0 · 📥 0
yue · 🛡️ FNI 41 · ❤️ 0 · 📥 0
musicology-80s · 🛡️ FNI 41 · ❤️ 0 · 📥 0
mmaudio-t4 · 🛡️ FNI 41 · ❤️ 0 · 📥 0
music-gen-fn-200e · 🛡️ FNI 41 · ❤️ 0 · 📥 0
rvc-v2 · 🛡️ FNI 41 · ❤️ 0 · 📥 0
musicgen-90s · 🛡️ FNI 40 · ❤️ 0 · 📥 0
musicgen-00s · 🛡️ FNI 40 · ❤️ 0 · 📥 0
higgs-audio-v2 · 🛡️ FNI 40 · ❤️ 0 · 📥 0
whisperx-subtitles-replicate · 🛡️ FNI 40 · ❤️ 0 · 📥 0
trim-video · 🛡️ FNI 40 · ❤️ 0 · 📥 0
kokoro-82m-all-voices · 🛡️ FNI 40 · ❤️ 0 · 📥 0
memo · 🛡️ FNI 40 · ❤️ 0 · 📥 0
video-retalking · 🛡️ FNI 40 · ❤️ 0 · 📥 0
demucs · 🛡️ FNI 40 · ❤️ 0 · 📥 0
piper_persian · 🛡️ FNI 40 · ❤️ 0 · 📥 0
pdf-to-podcast · 🛡️ FNI 40 · ❤️ 0 · 📥 0
f5-tts · 🛡️ FNI 40 · ❤️ 0 · 📥 0
mvsep-mdx23-music-separation · 🛡️ FNI 40 · ❤️ 0 · 📥 0
moss-ttsd · 🛡️ FNI 39 · ❤️ 0 · 📥 0
musicgen-remixer · 🛡️ FNI 39 · ❤️ 0 · 📥 0
whisper-large-v3 · 🛡️ FNI 39 · ❤️ 0 · 📥 0
whisperx · 🛡️ FNI 39 · ❤️ 0 · 📥 0
xtts-v2-fork · 🛡️ FNI 39 · ❤️ 0 · 📥 0
speech-enhancer · 🛡️ FNI 39 · ❤️ 0 · 📥 0
vibevoice · 🛡️ FNI 39 · ❤️ 0 · 📥 0
music-arousal-valence · 🛡️ FNI 39 · ❤️ 0 · 📥 0
aniportrait-audio2vid · 🛡️ FNI 39 · ❤️ 0 · 📥 0
video-audio-merge · 🛡️ FNI 39 · ❤️ 0 · 📥 0
musicgen-fine-tuner · 🛡️ FNI 39 · ❤️ 0 · 📥 0
speaker-diarization · 🛡️ FNI 39 · ❤️ 0 · 📥 0
music-classifiers · 🛡️ FNI 39 · ❤️ 0 · 📥 0
maest · 🛡️ FNI 39 · ❤️ 0 · 📥 0
flux-music · 🛡️ FNI 38 · ❤️ 0 · 📥 0
zonos · 🛡️ FNI 38 · ❤️ 0 · 📥 0
whisper-timestamped · 🛡️ FNI 38 · ❤️ 0 · 📥 0
whisperx · 🛡️ FNI 38 · ❤️ 0 · 📥 0
musicgen-songstarter-v0.2 · 🛡️ FNI 38 · ❤️ 0 · 📥 0
speaker-diarization-3 · 🛡️ FNI 38 · ❤️ 0 · 📥 0
styletts2 · 🛡️ FNI 38 · ❤️ 0 · 📥 0
hierspeechpp · 🛡️ FNI 38 · ❤️ 0 · 📥 0
video-to-audio-and-piano · 🛡️ FNI 38 · ❤️ 0 · 📥 0
whisperx · 🛡️ FNI 37 · ❤️ 0 · 📥 0
whisper-lazyloading · 🛡️ FNI 37 · ❤️ 0 · 📥 0

About Audio & Speech Models

Audio AI models handle speech recognition, text-to-speech synthesis, and audio processing. They enable voice assistants, transcription services, and audio content creation.