AI now fits in your hand: LLMs got about 10x smaller
For comparable chat quality, LLMs are roughly 10x smaller than they were a year ago, and that's huge for device makers considering on-device AI.

- Not just for hyperscalers: any enterprise can build models for its own niche without exorbitant training costs (think contextual AI).
- Targeted applications: perfect for IoT and embedded systems in unique contexts like smart appliances, retail, or industrial HMI.
- Compact and capable: models like Alibaba's Qwen2.5, Meta's MobileLLM, and Hugging Face's SmolLM2 come in sizes under 1B parameters that can run in under 1 GB of RAM.
- Rapid innovation: open-source models like SmolLM2 democratise access with cutting-edge tools and datasets.
- All the benefits of edge: privacy, latency, reliability, and more.
The future of AI is small, efficient, accessible, and on-device.
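
As a rough sanity check on the "under 1B parameters, under 1 GB of RAM" point, here is a minimal back-of-the-envelope sketch. It assumes weight memory is simply parameter count times bits per weight, and it ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The parameter counts and the helper name `weight_memory_gb` are illustrative assumptions, not measured figures for any specific model.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Illustrative parameter counts (assumptions, not measurements of real models).
SIZES = {"135M": 135e6, "360M": 360e6, "500M": 500e6, "1B": 1e9}

for name, num_params in SIZES.items():
    row = ", ".join(
        f"{bits}-bit: {weight_memory_gb(num_params, bits):.2f} GB"
        for bits in (16, 8, 4)
    )
    print(f"{name:>5} params -> {row}")
```

Under these assumptions, a 360M-parameter model fits in well under 1 GB even at 16-bit precision, a 500M-parameter model needs roughly 1 GB at 16-bit, and a 1B-parameter model drops to about 0.5 GB of weights with 4-bit quantization.
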
"The new 1-billion-parameter Llama model (version 3.2) is head-to-head with the 13x larger version from one year ago (Llama 2 13B) on the LMSYS Chatbot Arena."
Thomas Wolf, Co-Founder and Chief Scientist, Hugging Face. (That's a Synaptics Astra Machina SL1680 board in the picture.)