Which AI model fits your smart home? From local LLMs to cloud APIs, from speech recognition to image recognition — a practical overview.
LLMs understand natural language and can generate automations, analyze sensor data, or act as a chatbot.
Vision models analyze camera images: they detect objects, recognize people, and spot delivered packages.
Speech to text and text to speech — the building blocks for a local voice assistant.
The software that ties it all together: from workflow engines to PII scrubbing.
All models at a glance.
| Model | Type | RAM | Local | Cost | Best for |
|---|---|---|---|---|---|
| Llama 3 8B | LLM | 8 GB | ✓ | Free | All-rounder |
| Phi-3 Mini | LLM | 4 GB | ✓ | Free | Edge / Pi 5 |
| Mistral 7B | LLM | 8 GB | ✓ | Free | Code / YAML |
| Claude | LLM | n/a | ✗ | $3-15 per M tokens | Complex tasks |
| GPT-4o | LLM+Vision | n/a | ✗ | $2.50-10 per M tokens | Images + text |
| LLaVA | Vision | 8 GB | ✓ | Free | Local image recognition |
| Whisper large-v3 | STT | 8 GB | ✓ | Free | Speech recognition |
| Piper | TTS | <1 GB | ✓ | Free | Text-to-speech |
| Frigate + Coral | Object Det. | 4 GB | ✓ | ~€30 (TPU) | Camera surveillance |
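
To turn the table into a quick hardware check, the rows can be encoded as data and filtered by available memory. The following is a minimal sketch, not a definitive tool: the `Model` class and the `candidates` function are hypothetical names introduced here for illustration, the RAM figures are copied from the table above, and Piper's "<1 GB" is rounded up to 1 GB as an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    """One row of the overview table (hypothetical helper type)."""
    name: str
    kind: str
    ram_gb: Optional[float]  # None for cloud-only models
    local: bool


# Values copied from the table; Piper's "<1 GB" is treated as 1 GB (assumption).
MODELS = [
    Model("Llama 3 8B", "LLM", 8, True),
    Model("Phi-3 Mini", "LLM", 4, True),
    Model("Mistral 7B", "LLM", 8, True),
    Model("Claude", "LLM", None, False),
    Model("GPT-4o", "LLM+Vision", None, False),
    Model("LLaVA", "Vision", 8, True),
    Model("Whisper large-v3", "STT", 8, True),
    Model("Piper", "TTS", 1, True),
    Model("Frigate + Coral", "Object Det.", 4, True),
]


def candidates(ram_gb: float, kind: Optional[str] = None,
               local_only: bool = True) -> List[str]:
    """Return names of models that fit into ram_gb, in table order.

    Cloud models need no local RAM, so they are included only when
    local_only is False.
    """
    result = []
    for m in MODELS:
        if kind is not None and m.kind != kind:
            continue
        if m.local:
            if m.ram_gb is not None and m.ram_gb <= ram_gb:
                result.append(m.name)
        elif not local_only:
            result.append(m.name)
    return result


if __name__ == "__main__":
    # Example: which local LLMs fit on a machine with 4 GB of free RAM?
    print(candidates(4, kind="LLM"))
```

The filter only looks at the table's RAM column. Real memory use depends on quantization, context length, and what else runs on the machine, so treat the result as a shortlist rather than a guarantee.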