## About

- Distilled 6B-parameter text-to-image model from Alibaba Tongyi Lab.
- Built on the Scalable Single-Stream DiT (S3-DiT) architecture for high parameter efficiency.
- Sub-second inference latency on H800 GPUs; fits within 16 GB of VRAM on consumer devices.
- Needs only 8 NFEs (sampling steps); use the `euler` sampler with `guidance_scale` 0.0.
- Excels at photorealistic generation and accurate bilingual text rendering (English and Chinese).
- Available as native safetensors (bf16/nvfp4) or as GGUF quantizations.
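The recommended inference settings above can be sketched as a small settings helper. The key names (`num_inference_steps`, `guidance_scale`, `sampler`) follow common diffusion-pipeline conventions and are an assumption for illustration, not a documented API for this model.

```python
# Illustrative generation settings for Z-Image Turbo, based on the notes above.
# Key names mirror common diffusion-pipeline conventions and are an assumption,
# not a confirmed API for this model.
ZIMAGE_TURBO_SETTINGS = {
    "num_inference_steps": 8,   # only 8 NFEs needed for the distilled model
    "guidance_scale": 0.0,      # classifier-free guidance disabled for turbo
    "sampler": "euler",         # recommended sampler
}

def generation_kwargs(prompt: str) -> dict:
    """Merge a prompt with the recommended turbo settings."""
    return {"prompt": prompt, **ZIMAGE_TURBO_SETTINGS}
```

Passing `guidance_scale=0.0` matters: distilled turbo models are trained without classifier-free guidance, so higher values typically degrade output.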
## Variants
| Variant | Format | Size | VRAM | Install |
|---|---|---|---|---|
| bf16 | safetensors | 13.2 GB | 16+ GB | `mods install z-image-turbo --variant bf16` |
| nvfp4 | safetensors | 4.8 GB | 8+ GB | `mods install z-image-turbo --variant nvfp4` |
| gguf-q8-0 | gguf | 7.8 GB | 10+ GB | `mods install z-image-turbo --variant gguf-q8-0` |
| gguf-q6-k | gguf | 6.3 GB | 8+ GB | `mods install z-image-turbo --variant gguf-q6-k` |
| gguf-q5-k-m | gguf | 5.9 GB | 8+ GB | `mods install z-image-turbo --variant gguf-q5-k-m` |
| gguf-q5-k-s | gguf | 5.6 GB | 8+ GB | `mods install z-image-turbo --variant gguf-q5-k-s` |
| gguf-q4-k-m | gguf | 5.3 GB | 6+ GB | `mods install z-image-turbo --variant gguf-q4-k-m` |
| gguf-q4-k-s | gguf | 5.0 GB | 6+ GB | `mods install z-image-turbo --variant gguf-q4-k-s` |
| gguf-q3-k-m | gguf | 4.4 GB | 6+ GB | `mods install z-image-turbo --variant gguf-q3-k-m` |
| gguf-q3-k-s | gguf | 4.1 GB | 6+ GB | `mods install z-image-turbo --variant gguf-q3-k-s` |
## Dependencies
These models are installed automatically when you run `mods install z-image-turbo`. No extra steps are needed; mods resolves and downloads all dependencies for you.