Tagged "local-llm-deployment"
-
From Specialists to Builders: How AI Agentic Coding Is Reshaping Software Teams
-
Two LLM UI Patterns That Aren't Chat
-
Nvidia Enters Windows Laptop Market, Taking on Intel and AMD
-
Snapdragon C Specs Revealed: 6nm Process, On-Device AI Engine for Budget Laptops
-
Show HN: Egress WAF to Limit AI Agents and NPM Malware Based on mitmproxy
-
Slow Journal App with AI Integration
-
Show HN: AI-org – Org-mode Powered by AI
-
Tweaking Local Language Model Settings with Ollama
-
The Infrastructure Behind Making Local LLM Agents Actually Useful
-
Google Launches Tiny Board for Running Gemma 3 Locally
-
Money Printer Pro – Open-source AI Content Generator
-
The Anatomy of an LLM
-
Meet EAGLE 3.1: The Speculative Decoding Algorithm That Fixes Attention Drift in LLM Inference
-
Dell Launches 14 Plus Laptop with Intel Core Ultra 9 and 32GB RAM at $1,499.99, Enabling Local Model Inference
-
MCP Servers Transform Local LLM Stack, Replacing $249 Paid Tools
-
Why Your Docker Container Is 1.2GB When It Should Be 80MB
-
How to Self-Host LibreChat with Docker
-
User Migration from LM Studio/Ollama to llama.cpp Shows Growing Preference
-
Google Makes Gemini 3.5 Flash the Default AI Model for Billions of Users
-
A/B Tested Gemini 3.1 Pro vs. Claude Opus 4.6 – Usage Quota and Quality Comparison
-
AMD's New Ryzen AI Max Pro 400 with 192GB LPDDR5X Memory
-
Occupy Wall Street Co-Founder Builds Offline-Running AI Organizing Mentor
-
Local LLMs Enable Intelligent Smart Camera Control Without Cloud Dependency
-
The AI Layoff Receipts: Market Consolidation Accelerates Open-Source Model Adoption
-
Maker Builds Offline Jetson-Powered Chatbot Suitcase
-
HP's On-Device AI Needs More If It Is Going to Compete With Copilot
-
My Thoughts on AI, Part 1: Fears, Opinions, and Mental Journey
-
Local LLM Integration Enables Replacement of Paid Subscription Services
-
I Stopped Paying for ChatGPT and Switched to a Local LLM That Runs on My Laptop
-
Privatemode.ai – AI Provider with Confidential Computing
-
Gemma 4 Replaces Entire Local LLM Stack for Many Practitioners
-
I Built My Second Brain for Meetings. No Monthly Subscription
-
Mlx-serve: Run LLMs Natively on Your Mac
-
EU AI Act Article 50: Transparency Rules Impact on Local Deployments
-
How I Used a Local LLM to Organize the Store on My NAS
-
How to Run LLMs Locally on Your Laptop for Free: A Beginner's Guide
-
Dikaletus: Open-Source Meeting Recording and Transcription Using Mistral AI
-
Bun's Experimental Rust Rewrite Achieves 99.8% Test Compatibility on Linux
-
Perplexity Brings On-Device AI Workflow to Macs with 'Personal Computer' Feature
-
Critical Ollama Memory Leak Vulnerability Exposes 300,000 Servers Globally
-
Local LLM Rewrites Resume Better Than ChatGPT, and It's Not Even Close
-
0ctx – Local-First Project Memory for AI Workflows
-
Locked, stocked, and losing budget: AI vendor lock-in bites back
-
Improving Code Quality with Local Claude and Codex Models
-
Google Accelerates Gemma 4 Inference Speed 3x With Multi-Token Prediction Drafters
-
US State Dept Orders Global Warning About Alleged AI Thefts by DeepSeek
-
Major Smartphone Brands Introduce Advanced On-Device AI Features
-
Ruflo: Multi-Agent AI Orchestration for Claude Code
-
Gemma 4 Just Replaced My Whole Local LLM Stack
-
Building a Jira Alternative with Claude in 8 Days
-
Local AI Just Got Easier on Windows and the Implications Go Beyond the Benchmark
-
Google Drops COSMO: Experimental On-Device AI Assistant for Android
-
Study: AI Models That Consider User Feelings Are More Likely to Make Errors
-
AI Coding Tools Are Silently Disagreeing with Each Other
-
Xmemory: Benchmarking Structured AI Memory Against RAG and Hybrid RAG
-
New Open-Source Tool Automatically Matches Local LLMs to Your PC Hardware
-
Meta Just Killed Open-Source AI
-
Running Capable Local LLMs Without Expensive GPU Hardware
-
Building a Remote-Accessible Local LLM Server on Raspberry Pi
-
Local AI Isn't Just Ollama—Here's the Ecosystem That Actually Makes It Useful
-
Hipfire: A Rust-Native AMD Inference Engine That Outperforms llama.cpp
-
Linux Crushes Windows on llama.cpp Inference by Double Digits
-
The New Linux Kernel AI Bot Uncovering Bugs Is A Local LLM On Framework Desktop + AMD Ryzen AI Max
-
75% of US Health Systems Are Using AI. Only 18% of That Deployment Is Governed
-
Critical Security Flaw: Hackers Can Exploit Ollama Model Uploads to Leak Sensitive Server Data
-
Seed3D 2.0
-
How to Make Sense of AI
-
Intel OpenVINO 2026.1 Integrates llama.cpp with Wildcat Lake and Arc Pro B70
-
Developer Replaced GPT-4 with a Local SLM and CI/CD Pipeline Stability Improved
-
Llama.cpp's Auto Fit Feature Quietly Reshapes Local AI Inference on Consumer Hardware
-
Google's Gemma 4 Finally Makes Local LLM Deployment Compelling for Practitioners
-
Malicious GGUF Models Could Trigger Remote Code Execution on SGLang Servers
-
Intel Extends AI PC Reach With New Core Ultra Series 3 Launch
-
Running DeepSeek R1 Locally: Your Complete Setup Guide
-
The AI-Ready Product Data Framework for B2B Commerce
-
AI Quota Inflation Is No Token Effort. It's Baked In
-
Minisforum Launches N5 Max AI NAS with OpenClaw
-
We Built a Local Model Arena in 30 Minutes — Infrastructure Mattered More Than the App
-
Laimark – 8B LLM That Self-Improves on Consumer GPUs
-
Project Glasswing and the ASF: Open-Source's Chance to Win the AI Era
-
Book Translator: Two-Pass Local Translation with Self-Reflection via Ollama
-
Self-Hosted LLMs Transform Personal Knowledge Management Systems
-
Minisforum N5 MAX AI NAS Delivers 126 TOPS with 200TB Storage for Local LLM Workloads
-
Developer Shares Golden Stack for Local Coding Assistant Integration Directly Inside Code Editors
-
Copilot Rate-Limiting Issues Highlight Cloud AI Service Limitations
-
Show HN: SkillCompass – Open-Source Quality Evaluator for Your AI Skills
-
Defender – Local Prompt Injection Detection for AI Agents
-
ASUS Malaysia to Bring UGen300 USB AI Accelerator in Q2 for Portable On-Device AI Inferencing
-
Universal Knowledge Store and Grounding Layer for AI Reasoning Engines
-
The Best Local AI Model for Home Assistant Isn't Always the Biggest One
-
Google's Gemini Nano 4 Offers Faster, Smarter Local Inference Capabilities
-
Ollama's Limitations for Production Local LLM Deployments
-
LLM Wiki v2: Extended Knowledge Base for LLM Practitioners
-
5 Open-Source Projects Running Transformers on CPUs to GPUs in Pure Java
-
Speculative Decoding Made My Local LLM Actually Usable
-
Run Qwen3.5 on an Old Laptop: A Lightweight Local Agentic AI Setup Guide
-
Ask HN: Local-First Meetings Recorder and Transcriber
-
LiteLLM Integrates with Ollama to Simplify Running 100+ Models Locally
-
Quantization Strategy Comparison: Balancing Quality and Speed on Consumer Laptops
-
Qwen 3.6 Free Model Available via OpenRouter
-
Google Previews Gemini Nano 4 for Android AICore with On-Device Capabilities
-
Gemma 4 26B MoE Emerges as Optimal All-Around Local Model for Consumer Hardware
-
Samsung Launches Galaxy Book6 Series with NVIDIA RTX 5070 and On-Device AI
-
NVIDIA and Google Optimize Gemma 4 AI Models for Local RTX Deployment
-
GPUs vs. TPUs: Decoding the Powerhouses of AI
-
Gemma 4 KV Cache Memory Issues Fixed in llama.cpp
-
5 Useful Docker Containers for Agentic Developers
-
Gemma 4 Makes Local AI Agents Practical
-
How to Integrate VS Code with Ollama for Local AI Assistance
-
Qwen 3.6-Plus Released
-
Show HN: Memsearch – Persistent, Cross-Agent, Cross-Session Memory for AI Agents
-
Lotte Innovate and DeepX Collaborate on Mass Production of Domestic AI Semiconductors
-
git11 Is an AI Workspace for GitHub Engineering Teams
-
Satcove – Query 5 AI Models Simultaneously and Get Structured Verdicts
-
If Your AI Agent Ran NPM Install During the Axios Attack, You're Compromised
-
Local AI Ecosystem Extends Far Beyond Ollama
-
Intel's Arc GPU Offers 32GB VRAM for Local AI, But Software Ecosystem Lags Behind
-
GPU Passthrough to LXCs in Proxmox Simplifies Local Inference Infrastructure
-
ByteShape Releases Qwen 3.5 9B Quantisations with Hardware-Matched Tuning Guide
-
PrismML Announces 1-Bit Bonsai: First Commercially Viable 1-Bit LLMs
-
I built an O(1) physics engine to stop LLM hallucinations in construction
-
Closed Source AI = Neofeudalism
-
Select the Right Hardware for Your Local LLM Deployment with This Online Guide
-
Dell Technologies Unveils 10 AI PC Models for Business, from Ultralight Laptops to Ultracompact Desktops
-
DeepSeek-R1 Chain-of-Thought Debugging: A Developer's Guide
-
Google's TurboQuant Shows Memory Constraints Remain Critical for Local LLM Inference
-
Samsung Galaxy Book6 Brings Consumer-Grade On-Device AI Hardware to Market
-
Samsung Galaxy Book6 Series Brings Intel Core Ultra Chips for On-Device LLM Inference
-
Prompt Security Challenges Emerge as Critical Concern for Local LLM Deployments
-
Introduction to Nyreth v1.0
-
M5 Max Delivers 1.7x Faster Inference Than M3 Max on Qwen 3.5 Models
-
GPU Passthrough to LXCs in Proxmox Simplifies Local LLM Deployment
-
Acer TravelMate AI Laptops Launch in UAE for Business On-Device Inference
-
This Self-Hosted Tool Makes My Local LLMs Feel Exactly Like ChatGPT, but Nothing Leaves My Network
-
RotorQuant: 10-19x Faster Quantisation Alternative Using Clifford Algebra
-
mlx-Code: Run Claude Code Locally with MLX-LM
-
Homelab Consolidation: Replacing 3 Models with Single 122B MoE Model on AMD Ryzen AI MAX+
-
Book on AI Agents for the Layman: Understanding Agent-Based Systems
-
Google's TurboQuant: The Unsexy AI Breakthrough Worth Watching