Tagged "resource-efficiency"

JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks (2 June 2026)
Mistral AI Launches Mistral Vibe (28 May 2026)
Developer Switches from LM Studio to llama.cpp, Reports No Performance Downgrade (26 May 2026)
Users Report Superior Performance Switching from LM Studio to llama.cpp (25 May 2026)
Google Tensor SDK Beta with LiteRT Enables Efficient On-Device AI (20 May 2026)
Gemma 4 Just Replaced My Whole Local LLM Stack (4 May 2026)
GPU Passthrough to LXCs in Proxmox Outperforms VMs and Simplifies Local AI Infrastructure (25 April 2026)
16 Ways to Make a Small Language Model Think Bigger (21 April 2026)
Gemma 4 Makes Local AI Agents Practical (3 April 2026)
TinyGPU Adds Mac Support for External Nvidia GPU Acceleration (2 April 2026)
GPU Passthrough to LXCs in Proxmox Simplifies Local LLM Deployment (28 March 2026)
Mistral AI Releases Voxtral: Open-Source TTS Model Beating ElevenLabs on Local Hardware (27 March 2026)
NVIDIA Releases GPT-OSS-Puzzle-88B, a Deployment-Optimized Model (26 March 2026)
OmniCoder-9B: Efficient Coding Model for 8GB GPUs (16 March 2026)
Cutile.jl Brings Nvidia CUDA Tile-Based Programming to Julia (12 March 2026)
Alibaba's Qwen 3.5 Small Model Runs Directly on iPhone 17 (3 March 2026)
Wave Field LLM Achieves O(n log n) Scaling: 825M Model Trained to 1B Parameters in 13 Hours (23 February 2026)
GPT-OSS 120B Uncensored Model Released in Native MXFP4 Precision (14 February 2026)
Running Mistral-7B on Intel NPU Achieves 12.6 Tokens/Second (12 February 2026)
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts (11 February 2026)