Tagged "cpu-only"

Phison and Intel Roll Out aiDAPTIV to Boost Local AI on Intel AI PC Platforms 2 June 2026
Microsoft and Nvidia to Unveil First Windows PCs with Nvidia CPUs and AI Capabilities 31 May 2026
Snapdragon C Debuts with 6nm Process and Dedicated On-Device AI Engine 30 May 2026
Tweaking Local Language Model Settings with Ollama 29 May 2026
Lenovo Bets on On-Device AI to Lift Business PC Upgrades 28 May 2026
The Anatomy of an LLM 28 May 2026
Developer Switches from LM Studio to llama.cpp, Reports No Performance Downgrade 26 May 2026
Dell Launches 14 Plus Laptop with Intel Core Ultra 9 and 32GB RAM at $1,499.99, Enabling Local Model Inference 26 May 2026
Redditor Successfully Runs 1 Trillion Parameter LLM Using Cheap Intel Optane DIMMs 24 May 2026
Intel llm-scaler-vllm 1.4 Released With Updated Components and Arc Pro B70 Support 21 May 2026
AMD's New Ryzen AI Max Pro 400 with 192GB LPDDR5X Memory 21 May 2026
llama.cpp Adds Multi-Token Prediction, Doubles Qwen 3.6B Throughput for Local Inference 19 May 2026
Linux 7.1-rc4 Released: Kernel Updates Relevant to Local LLM Inference 18 May 2026
AI/ML Benchmark Tool for Local LLM Inference and XGBoost Training 16 May 2026
Show HN: Find the best local LLM for your hardware, ranked by benchmarks 15 May 2026
Running Local AI LLMs on Mini PCs Without NVIDIA GPUs 14 May 2026
Running a Local LLM on a 12-Year-Old Raspberry Pi 13 May 2026
Mainline Linux 6.12 on Annapurna Labs Alpine V2 (Ubiquiti UNVR, UDM-Pro) 13 May 2026
Lucebox Brings Faster Local AI Inference to AMD Strix Halo 13 May 2026
How I Used a Local LLM to Organize the Store on My NAS 13 May 2026
Running a Local LLM on a 12-Year-Old Raspberry Pi: Practical Edge Inference 12 May 2026
How I Used a Local LLM to Organize the Store on My NAS 9 May 2026
Microsoft VibeVoice C++ Port Enables Local Voice AI on CPU and GPU Without Python 6 May 2026
Sarvam Edge: Indian-Built AI Models Run Offline on Phones and Laptops Without Internet 6 May 2026
llama.cpp Now Supports Multi-Token Prediction in Beta 5 May 2026
Building a Raspberry Pi-Based Local LLM Server for Remote Access 1 May 2026
New Open-Source Tool Automatically Matches Local LLMs to Your PC Hardware 1 May 2026
Running Capable Local LLMs Without Expensive GPU Hardware 30 April 2026
Why the Same LLM Gives Different Answers in Different Environments 28 April 2026
The New Linux Kernel AI Bot Uncovering Bugs Is A Local LLM On Framework Desktop + AMD Ryzen AI Max 27 April 2026
Intel OpenVINO 2026.1 Integrates llama.cpp with Wildcat Lake and Arc Pro B70 23 April 2026
The Open-Source AI Ecosystem Keeps Treating llama.cpp Like a Second-Class Citizen 21 April 2026
Sorting 1M u64 KV-Pairs in 20ms on i9-13980HX Using Branchless Rust Implementation 18 April 2026
Dynamic Expert Cache in llama.cpp Achieves 27% Faster Inference on Large MoE Models 15 April 2026
Qwen 3.5 Small – On-Device Multimodal Models Released 14 April 2026
A Deep Dive into Tinygrad AI Compiler 12 April 2026
The Best Local AI Model for Home Assistant Isn't Always the Biggest One 12 April 2026
Building Offline AI Companions on Severely Constrained Hardware (8GB RAM) 10 April 2026
5 Open-Source Projects Running Transformers on CPUs to GPUs in Pure Java 10 April 2026
Speculative Decoding Made My Local LLM Actually Usable 9 April 2026
Run Qwen3.5 on an Old Laptop: A Lightweight Local Agentic AI Setup Guide 9 April 2026
Intel Releases OpenVINO 2026.1 With Backend For Llama.cpp, New Hardware Support 9 April 2026
Your Next Assistant is Your PC: How On-Device AI is Transforming Work, One Workflow at a Time 7 April 2026
Octopoda: Open Source Memory Layer for Fully Offline AI Agents 7 April 2026
AMD Announces Day 0 Support for Google Gemma 4 Across Processors and GPUs 7 April 2026
TurboQuant in Llama.cpp Achieves 6X Smaller KV Cache 6 April 2026
Kokoro TTS Achieves 20× Realtime Speed on CPU-Only On-Device Inference 4 April 2026
Gemma 4 KV Cache Memory Issues Fixed in llama.cpp 4 April 2026
AMD Rolls Out Gemma 4 Model Support Across Full Range of GPUs & CPUs 4 April 2026
OpenUMA – Apple-Style Unified Memory for x86 AI Inference 3 April 2026
Show HN: Extra-Platforms, Python Library to Detect OS, Arch, Shell, CI, AI 2 April 2026
Local AI Ecosystem Extends Far Beyond Ollama 1 April 2026
Claw64 – Full Agentic Loop in <4KB on Commodore 64 1 April 2026
PrismML Announces 1-Bit Bonsai: First Commercially Viable 1-Bit LLMs 1 April 2026
Select the Right Hardware for Your Local LLM Deployment with This Online Guide 30 March 2026
DeepSeek V3 Complete Guide: Deploy and Optimize Local AI in 2026 30 March 2026
TurboQuant KV Cache Compression Achieves 22.8% Faster Decoding at 32K Context 28 March 2026
Samsung Galaxy Book6 Series Brings Intel Core Ultra Chips for On-Device LLM Inference 28 March 2026
TurboQuant Benchmarked in Llama.cpp: Google's Extreme Compression Research Tested in Practice 27 March 2026
Coding Implementation to Run Qwen3.5 Reasoning Models Distilled With Claude-Style Thinking Using GGUF and 4-Bit Quantization 27 March 2026
HP Launches IQ On-Device AI Assistant, Advancing Enterprise AI Adoption on PCs 25 March 2026
.APKs Are Just .ZIPs: Semi-Legally Hacking Software for Orphaned Hardware 25 March 2026