Tagged "on-device-inference"
-
Velr: Embedded Property-Graph Database for Local LLM Applications
-
Self-Hostable AI Agents and Internal Software Framework Released
-
Qt 6.11 Released with Enhanced Cross-Platform Deployment Capabilities
-
Alibaba Commits to Continuous Open-Sourcing of Qwen and Wan Models
-
Building a Production AI Receptionist: Practical Local LLM Deployment Case Study
-
Careless Whisper – Personal Local Speech to Text
-
A Small Gap That Will Determine Whether AI Agents Become Truly Autonomous
-
Running an AI Agent on a 448KB RAM Microcontroller
-
Qualcomm and Samsung's 30-Year AI Alliance Enters a New Phase as On-Device AI Chip Race Heats Up
-
NVIDIA Nemotron 3 Nano 4B Enables On-Device Inference Directly in Web Browsers via WebGPU
-
Meet Sarvam Edge: India's AI Model That Runs on Phones and Laptops With No Internet
-
Multiverse Computing Targets On-Device AI With Compressed Models and New API Portal
-
Dell Pro Max 16 Plus Launches With Enterprise-Grade Discrete NPU for On-Device AI
-
On-Device AI: Tether's QVAC Fabric Enables Local Training
-
Snapdragon 8 Elite Gen 5 Hands the Galaxy S26 the AI Upgrade We've Been Waiting For
-
LucidShark – Local-first, open-source quality and security gate
-
OpenJarvis: Local-First AI Agents That Run Entirely On-Device
-
A New Magnetic Material for the AI Era
-
How I Used Lima for an AI Coding Agent Sandbox
-
The Moment AI Agents Stopped Being a Feature and Started Becoming a System
-
Qwen 3.5 122B Demonstrates Exceptional Reasoning for Local Deployment
-
OmniCoder-9B: Efficient Coding Model for 8GB GPUs
-
Apple's On-Device AI Raises Privacy Alarms Across British Parliament
-
AMD Declares 'AI on the PC Has Crossed an Important Line' – Agent Computers as Next Breakthrough
-
India's Mobile-First AI Strategy Could Accelerate Local Inference Adoption in Emerging Markets
-
Hybrid AI Desktop Layer Combining DOM-Automation and API-Integrations
-
Cicikus v3 Prometheus 4.4B – An Experimental Franken-Merge for Edge Reasoning
-
Local Manga Translator: Production LLM Pipeline with YOLO, OCR, and Inpainting
-
I Fed My Home Assistant Logs Into a Local LLM, and It Found Problems I'd Been Ignoring for Months
-
Best Local LLM Models 2026: Developer Comparison
-
MeepaChat – Slack for AI Agents (iOS, macOS, Web / Cloud, Self-Hosted)
-
Local AI Coding Assistant: Complete VS Code + Ollama + Continue Setup
-
Simple Layer Duplication Technique Achieves Top Open LLM Leaderboard Performance
-
Kali Linux Integrates Local Ollama and MCP for AI-Driven Penetration Testing
-
SK Hynix Develops 1c LPDDR6 DRAM to Boost On-Device AI Performance in Mobile Devices
-
Qwen 3.5 Ultra-Compact Models Enable On-Device AI from Watches to Gaming
-
Google Delivers On-Device AI Features in New Chromebook Plus Model
-
Qwen 3.5 Small Expands On-Device AI to Phones and IoT with Offline Support
-
Engram – Open-Source Persistent Memory for AI Agents
-
Samsung Opens Registration for Vision AI QLED and OLED Television Integration
-
Show HN: Ivy – the first proactive, offline AI tutor
-
Windows 11 Notepad Gets On-Device AI Text Generation Without Subscription
-
Building PyTorch-Native Support for IBM Spyre Accelerator
-
Llama.cpp Merges Automatic Parser Generator to Mainline
-
Turning Your Linux Terminal into a Local AI Assistant
-
IBM Granite 4.0 1B Speech Model Released for Multilingual Speech Recognition
-
Show HN: Asterode – Multi-Model AI App with Memory and Power Features
-
Alibaba Releases Qwen 3.5 AI Model with On-Device AI Support
-
Windows 11 Notepad to Feature On-Device AI Text Generation Without Subscription
-
The Emerging Role of SRAM-Centric Chips in AI Inference
-
OPPO and MediaTek Highlight On-Device AI Innovations at MWC 2026
-
MediaTek Advances Omni Model for Efficient Smartphone Inference
-
Kakao Launches Kanana AI for On-Device Schedule and Recommendation Management
-
Apple Unveils MacBook Pro with M5 Pro and M5 Max Featuring On-Device AI
-
RunAnywhere Launches Production-Grade On-Device AI Platform for Enterprise Scale
-
Qualcomm Snapdragon Wear Elite Brings On-Device AI to Smartwatches
-
OpenWrt 25.12.0 – Stable Release
-
Glyph – A Local-First Markdown Notes App for macOS Built With Rust
-
Apple Unveils MacBook Pro With M5 Pro and M5 Max for On-Device AI
-
Apple M5 Pro and M5 Max: 4× Faster LLM Processing
-
AMD Launches Copilot+ Desktop Chips to Compete in On-Device AI Market
-
VibeWhisper – macOS Voice-to-Text with 100% Local Processing Option
-
Qwen 3.5 Small Models Released: 0.8B to 9B Parameters Optimized for On-Device Inference
-
Apple M4 iPad Air Targets AI Users with Double M1 Speed Performance
-
Alibaba's Qwen 3.5 Small Model Runs Directly on iPhone 17
-
Qualcomm Launches Snapdragon Wear Elite for On-Device AI on Wearables
-
HP ZBook Ultra 14 G1a Workstation Reclaims Local AI Workflows for Professionals
-
Browser Use vs. Claude Computer Use: Comparing Agent Automation Frameworks
-
AMD Expands Ryzen AI 400 Series Portfolio for Consumer and Enterprise AI PC Options
-
ParseHive – AI-Powered Invoice Data Extraction for Windows and Mac
-
DeepSeek V4 Multimodal Model Coming Next Week With Image and Video Generation
-
Apple Intelligence, Galaxy AI, Gemini: Why Your AI-Powered Phone Is Worth Repairing
-
Serve Markdown to LLMs from your Next.js app
-
On-Device AI in Mobile Apps: What Should Run on the Phone vs the Cloud (A 2026 Decision Guide)
-
Meta Reveals AI-Packed Smartwatch In 2026 – Why Wearables Shift Now
-
Galaxy S26 Debuts AI-Powered Scam Detection in Bold Security Push
-
Snapdragon 8 Elite Gen 5 for Galaxy Official: 5 Key Improvements that Push the Boundaries
-
Snapdragon 8 Elite Gen 5 Powers Galaxy S26 Series With Enhanced On-Device AI
-
Show HN: Caret – Tab to Complete at Any App on Your Mac
-
Arduino, Qualcomm Bring On-Device AI and Robotics Learning to Indian School Systems
-
Android Phones Are Getting Smarter Without Internet — Here's Why On-Device AI Is the Next Big Shift
-
Building a Privacy-Preserving RAG System in the Browser
-
Researchers Develop Persistent Memory System for Local LLMs—No RAG Required
-
Ollama for JavaScript Developers: Building AI Apps Without API Keys
-
DeepSeek Paper – DualPath: Breaking the Bandwidth Bottleneck in LLM Inference
-
The Complete Developer's Guide to Running LLMs Locally: From Ollama to Production
-
Apple: Python bindings for access to the on-device Apple Intelligence model
-
Agent System – 7 specialized AI agents that plan, build, verify, and ship code
-
New Era of On-Device AI Driven by High-Speed UFS 5.0 Storage
-
Mirai Announces $10M to Advance On-Device AI Performance for Consumer Devices
-
Show HN: MCP-Enabled File Storage for AI Agents, Auth via Ethereum Wallet
-
Mirai Tech Raises $10 Million for On-Device AI Innovation
-
No, Local LLMs Can't Replace ChatGPT or Gemini — I Tried
-
Kioxia Sampling UFS 5.0 Embedded Flash Memory for Next-Generation Mobile Applications
-
Enhanced Interface Speed Enables High-Performance On-Device AI Features in Smartphones
-
Elastic Introduces Best-in-Class Embedding Models for High Performance Semantic Search
-
Apple Accelerates U.S. Manufacturing with Mac Mini Production
-
Comparing Manual vs. AI Requirements Gathering: 2 Sentences vs. 127-Point Spec
-
South Korea to Launch $687 Million Project to Develop On-Device AI Semiconductors
-
Qwen3's Voice Embeddings Enable Local Voice Cloning and Mathematical Voice Manipulation
-
Custom Portable Workstation Optimized for Local AI Inference Builds
-
Open-Source llama.cpp Finds Long-Term Home at Hugging Face
-
Future of Mobile AI: What On-Device Intelligence Means for App Developers
-
The Complete Stack for Local Autonomous Agents: From GGML to Orchestration
-
AI Is Stress Testing Processor Architectures and RISC-V Fits the Moment
-
How Slow Local LLMs Are on My Framework 13 AMD Strix Point
-
At India AI Impact Summit, Intel Showcases AI PCs and Cost-Efficient Frugal AI
-
Asus ExpertBook B3 G2 with 50 TOPS AI Sets New Enterprise Standard
-
AI PCs Explained: 7 Critical Truths About NPUs and Privacy
-
Vellium v0.3.5: Major Writing Mode Overhaul and Native KoboldCpp Support
-
Taalas Etches AI Models onto Transistors to Rocket Boost Inference
-
I Run Local LLMs in One of the World's Priciest Energy Markets, and I Can Barely Tell
-
[Release] Ouro-2.6B-Thinking: ByteDance's Recurrent Model Now Runnable Locally
-
Open-Source + AI: ggml Joins Hugging Face, llama.cpp Stays Open—Local AI's Long-Term Home
-
GGML.AI Acquired by Hugging Face
-
Apple Researchers Develop On-Device AI Agent That Interacts With Apps for You
-
SanityBoard Adds 27 New Model Evaluations Including Qwen 3.5 Plus, GLM 5, and Gemini 3.1 Pro
-
PaddleOCR-VL Now Integrated into llama.cpp for Multilingual OCR
-
Why AI Models Fail at Iterative Reasoning and What Could Fix It
-
Show HN: Forked – A Local Time-Travel Debugger for OpenClaw Agents
-
Mihup and Qualcomm Collaborate to Advance Secure On-Device Voice AI for BFSI
-
Local-First RAG: Vector Search in SQLite with Hamming Distance
-
Kitten TTS V0.8 Released: State-of-the-Art Super-Tiny Text-to-Speech Model Under 25MB
-
Clipthesis: Free Local App for Video Tagging and Search Across Drives
-
Why My Country's AI Scene Is Built on Sand
-
Sarvam AI Launches Edge Model to Challenge Major AI Players with Local-First Approach
-
Qualcomm Ventures Positions India as Blueprint for Affordable On-Device AI Infrastructure
-
Can We Leverage AI/LLMs for Self-Learning?
-
Cohere Releases Tiny Aya: Efficient 3.3B Multilingual Model for 70+ Languages
-
ASUS Zenbook 14 Launches in India with AI-Capable Hardware, Starting at Rs 1,15,990
-
Asus ExpertBook B3 G2 Laptop Features Ryzen AI 9 HX 470 CPU in 1.41kg Ultraportable Form Factor
-
Ask HN: What is the best bang for buck budget AI coding?
-
I broke into my own AI system in 10 minutes. I built it
-
Sourdine: Open-Source macOS App for 100% Local AI Transcription
-
MiniMax-M2.5 230B MoE Model Released with GGUF Support for Local Deployment
-
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
-
Arm SME2 Technology Expands CPU Capabilities for On-Device AI