Local Generative AI Models
Privacy-first LLMs that run locally for text transformation, summarization, and AI commands.
Phi-4 Mini 3.8B
LLM
2.2 GB
Latest Microsoft Phi-4 Mini ONNX export. State-of-the-art small model. ~2.3GB.
VRAM: 4GB
CPU: Intel Core i5 / Apple M2 / Snapdragon X Elite
Phi-3 Mini 4K Instruct
LLM
2.7 GB
Microsoft Phi-3 Mini 4K Instruct. Efficient small model with strong reasoning. ~2.1GB.
VRAM: 4GB
CPU: Intel Core i5 / Apple M2 / Snapdragon X Elite
Phi-3 Small 8K Instruct
LLM
5.0 GB
Microsoft Phi-3 Small 8K Instruct. Strong mid-range model. ~4.2GB.
VRAM: 8GB
CPU: Intel Core i7 / Apple M3 / Snapdragon X Elite
Yi 1.5 6B Chat
LLM
3.8 GB
High-performance medium model by 01.AI. Balanced speed and intelligence. ~3.8GB.
VRAM: 8GB
CPU: Intel Core i7 / Apple M3 / Snapdragon X Elite
Llama-3 8B Instruct (FP16)
LLM
16.0 GB
High-fidelity FP16 export of Llama-3 8B. Requires significant VRAM. ~16GB.
VRAM: 24GB
CPU: Intel Core i9 / Apple M4 / Snapdragon X Elite
DeepSeek-R1 Distill Qwen 1.5B
LLM
1.0 GB
Efficient small distilled model by DeepSeek. INT4-quantized for CPU.
VRAM: 2GB
CPU: Intel Core i3 / Apple M1 / Snapdragon 8cx
DeepSeek-R1 Distill Qwen 7B
LLM
6.7 GB
Powerful distilled model by DeepSeek. INT4-quantized for CPU. ~6.7GB.
VRAM: 8GB
CPU: Intel Core i7 / Apple M3 / Snapdragon X Elite
Llama 3.2 1B Instruct
LLM
1.9 GB
Ultra-lightweight Meta Llama 3.2 1B Instruct ONNX model. Fast and efficient. ~1.1GB.
VRAM: 2GB
CPU: Intel Core i3 / Apple M1 / Snapdragon 8cx
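
As a rough illustration of how the figures above might be used, the sketch below filters the catalog by available VRAM. It assumes that each card's VRAM figure is a minimum requirement and that the first size listed on each card is the download size. Neither is stated explicitly on this page, and the `Model` class and `models_that_fit` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    download_gb: float  # first size shown on each card (assumed download size)
    min_vram_gb: int    # VRAM figure on each card (assumed minimum requirement)


# Values copied from the cards above.
CATALOG = [
    Model("Phi-4 Mini 3.8B", 2.2, 4),
    Model("Phi-3 Mini 4K Instruct", 2.7, 4),
    Model("Phi-3 Small 8K Instruct", 5.0, 8),
    Model("Yi 1.5 6B Chat", 3.8, 8),
    Model("Llama-3 8B Instruct (FP16)", 16.0, 24),
    Model("DeepSeek-R1 Distill Qwen 1.5B", 1.0, 2),
    Model("DeepSeek-R1 Distill Qwen 7B", 6.7, 8),
    Model("Llama 3.2 1B Instruct", 1.9, 2),
]


def models_that_fit(vram_gb, catalog=CATALOG):
    """Return models whose VRAM figure is within vram_gb, most demanding first."""
    fitting = [m for m in catalog if m.min_vram_gb <= vram_gb]
    # Sort by VRAM figure, then by size, both descending.
    return sorted(
        fitting,
        key=lambda m: (m.min_vram_gb, m.download_gb),
        reverse=True,
    )


if __name__ == "__main__":
    for model in models_that_fit(8):
        print(f"{model.name}: {model.download_gb} GB, VRAM {model.min_vram_gb}GB")
```

Sorting most-demanding-first is only one possible policy; a real selector would likely also weigh CPU class and free disk space.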