Local Generative AI Models

Privacy-first LLMs that run locally for text transformation, summarization, and AI commands.

Phi-4 Mini 3.8B
LLM 2.2 GB

Latest Microsoft Phi-4 Mini ONNX export. State-of-the-art small model. ~2.3GB.

VRAM 4GB
CPU Intel Core i5 / Apple M2 / Snapdragon X Elite
Phi-3 Mini 4K Instruct
LLM 2.7 GB

Microsoft Phi-3 Mini 4K Instruct. Efficient small model with strong reasoning. ~2.1GB.

VRAM 4GB
CPU Intel Core i5 / Apple M2 / Snapdragon X Elite
Phi-3 Small 8K Instruct
LLM 5.0 GB

Microsoft Phi-3 Small 8K Instruct. Strong mid-range model. ~4.2GB.

VRAM 8GB
CPU Intel Core i7 / Apple M3 / Snapdragon X Elite
Yi 1.5 6B Chat
LLM 3.8 GB

High-performance medium model by 01.AI. Balanced speed and intelligence. ~3.8GB.

VRAM 8GB
CPU Intel Core i7 / Apple M3 / Snapdragon X Elite
Llama-3 8B Instruct (FP16)
LLM 16.0 GB

High-fidelity FP16 export of Llama-3 8B. Requires significant VRAM. ~16GB.

VRAM 24GB
CPU Intel Core i9 / Apple M4 / Snapdragon X Elite
DeepSeek-R1 Distill Qwen 1.5B
LLM 1.0 GB

Efficient small distilled model by DeepSeek. INT4-quantized for CPU inference.

VRAM 2GB
CPU Intel Core i3 / Apple M1 / Snapdragon 8cx
DeepSeek-R1 Distill Qwen 7B
LLM 6.7 GB

Powerful distilled model by DeepSeek. INT4-quantized for CPU inference. ~6.7GB.

VRAM 8GB
CPU Intel Core i7 / Apple M3 / Snapdragon X Elite
Llama 3.2 1B Instruct
LLM 1.9 GB

Ultra-lightweight Meta Llama 3.2 1B Instruct ONNX model. Fast and efficient. ~1.1GB.

VRAM 2GB
CPU Intel Core i3 / Apple M1 / Snapdragon 8cx