Triage of Tiny and Base model families (all quantization and language variants) to evaluate whether these smaller models are worth further benchmarking.

May 9, 2026 · Apple M4 Max / 36 GB / macOS 26.4.1 · 12 models · 50 samples per dataset

Accuracy (%) per language. Speed in ms of processing time per second of audio (lower is faster).
Model	Samples	Disk	RAM	Speed	Avg overall	Avg EN	Avg multi	EN	EN noisy	ES	DA	HU
Base full	50	142 MB	334 MB	57 ms	69.1%	91.2%	54.4%	91.8%	90.6%	88.4%	39.5%	35.3%
Base q8_0	50	78 MB	247 MB	57 ms	69.0%	91.2%	54.2%	92.4%	90.1%	88.6%	38.4%	35.5%
Base q5_1	50	57 MB	219 MB	57 ms	68.7%	90.3%	54.3%	90.8%	89.9%	88.9%	37.6%	36.4%
Tiny full	50	75 MB	224 MB	57 ms	58.0%	87.0%	38.7%	87.6%	86.4%	85.0%	14.2%	16.9%
Tiny q8_0	50	42 MB	174 MB	57 ms	57.6%	87.0%	38.0%	87.2%	86.8%	85.0%	11.2%	17.9%
Tiny q5_1	50	31 MB	157 MB	58 ms	56.5%	86.5%	36.5%	87.1%	86.0%	83.5%	14.4%	11.7%
Base q5_1 en	50	57 MB	217 MB	58 ms	33.7%	92.4%	-5.3%	92.7%	92.1%	3.8%	-9.6%	-10.2%
Base q8_0 en	50	78 MB	247 MB	58 ms	33.7%	92.1%	-5.2%	92.4%	91.8%	3.2%	-7.9%	-10.9%
Base full en	50	142 MB	333 MB	59 ms	33.0%	92.4%	-6.7%	93.1%	91.8%	0.8%	-7.0%	-13.8%
Tiny full en	50	75 MB	224 MB	59 ms	26.5%	88.4%	-14.8%	88.9%	87.8%	-1.2%	-18.7%	-24.4%
Tiny q5_1 en	50	31 MB	157 MB	59 ms	25.2%	88.2%	-16.7%	88.8%	87.5%	-1.0%	-26.0%	-23.2%
Tiny q8_0 en	50	42 MB	173 MB	58 ms	24.7%	88.6%	-17.9%	89.1%	88.1%	-2.3%	-20.7%	-30.7%
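
The summary columns can be cross-checked against the per-condition columns. The sketch below assumes that Avg EN is the mean of EN and EN noisy, that Avg multi is the mean of ES, DA, and HU, and that Avg overall is the mean of all five conditions; these assumptions reproduce the published rows to within rounding. It also converts the speed column into a real-time factor. The names `summarize` and `realtime_factor` are illustrative and not part of the benchmark tooling.

```python
from statistics import mean

# Assumed grouping of the per-condition columns (inferred from the table).
EN_CONDITIONS = ("EN", "EN noisy")
MULTI_CONDITIONS = ("ES", "DA", "HU")


def summarize(scores: dict[str, float]) -> dict[str, float]:
    """Recompute the three average columns from per-condition accuracy (%)."""
    en = [scores[c] for c in EN_CONDITIONS]
    multi = [scores[c] for c in MULTI_CONDITIONS]
    return {
        "avg_en": round(mean(en), 1),
        "avg_multi": round(mean(multi), 1),
        # Overall is the mean over all five conditions,
        # not the mean of the two group averages.
        "avg_overall": round(mean(en + multi), 1),
    }


def realtime_factor(ms_per_audio_second: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return ms_per_audio_second / 1000.0


# "Base full" row from the table above.
base_full = {"EN": 91.8, "EN noisy": 90.6, "ES": 88.4, "DA": 39.5, "HU": 35.3}

print(summarize(base_full))   # avg_en 91.2, avg_multi 54.4, avg_overall 69.1
print(realtime_factor(57))    # 0.057, i.e. roughly 17.5x faster than real time
```

Because the published per-condition values are already rounded to one decimal, a recomputed average can differ from the table by 0.1 in some rows (for example, Base q5_1 en).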

At a glance

Ratings are computed from the benchmark data and scaled from 1 to 10. Accuracy is based on Word Error Rate (WER) and does not yet account for punctuation.
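
To illustrate how a WER-based accuracy can fall below zero, as it does for the English-only models on Spanish, Danish, and Hungarian audio in the table above, here is a minimal sketch. It assumes accuracy is reported as 100 × (1 − WER) and that text is lowercased and stripped of punctuation before scoring; the report does not state the exact formula or normalization, so both are assumptions. The function names are illustrative.

```python
import re


def normalize(text: str) -> list[str]:
    # Assumed normalization: lowercase, drop punctuation, split on whitespace.
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def accuracy_pct(reference: str, hypothesis: str) -> float:
    # Assumed definition: accuracy = 100 * (1 - WER).
    return 100.0 * (1.0 - wer(reference, hypothesis))


print(accuracy_pct("Hola, mundo.", "hola mundo"))           # 100.0
# Five wrong words against a two-word reference: WER = 5 / 2 = 2.5.
print(accuracy_pct("hola mundo", "oh la moon do today"))    # -150.0
```

Because insertions count as errors, a transcript that is longer than the reference and mostly wrong pushes WER above 1, which yields a negative accuracy under this definition.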

Name	Lang	Translate	Speed	Accuracy
Base full	all	✔	9	5
Base q5_1	all	✔	9	5
Base q8_0	all	✔	9	5
Tiny full	all	✔	9	3
Tiny q8_0	all	✔	9	3
Tiny q5_1	all	✔	9	2
Base full en	en	✘	8	1 (en: 9)
Base q5_1 en	en	✘	9	1 (en: 9)
Base q8_0 en	en	✘	9	1 (en: 9)
Tiny full en	en	✘	8	1 (en: 9)
Tiny q5_1 en	en	✘	8	1 (en: 9)
Tiny q8_0 en	en	✘	8	1 (en: 9)

Charts

Figure: Average accuracy by group. Bar chart showing average English, multilingual, and overall accuracy per model.

Figure: Speed comparison across conditions. Bar chart comparing transcription speed across models and test conditions.

Figure: Accuracy by model and test condition. Bar chart comparing model accuracy across English, Spanish, Danish, and Hungarian benchmark conditions.