Tagged "developer-tooling"
- GGML Joins Hugging Face: What This Means for Local Model Optimization
- GGML.AI Acquired by Hugging Face
- Enhanced Quantization Visualization Methods for Understanding LLM Compression Trade-offs
- Kitten TTS V0.8 Released: State-of-the-Art Super-Tiny Text-to-Speech Model Under 25MB
- GPT4All Replaces Ollama On Mac After Quick Trial
- Hardware Economics Shift: DDR5 RDIMM Pricing Now Comparable to GPUs for Local Inference
- Aegis.rs: Open Source Rust-Based LLM Security Proxy Released
- Show HN: Shiro.computer Static Page, Unix/NPM Shimmed to Host Claude Code
- Qualcomm Ventures Positions India as Blueprint for Affordable On-Device AI Infrastructure
- Matmul-Free Language Model Trained on CPU in 1.2 Hours
- Real-World Coding Benchmark Tests LLMs on 65 Production Codebase Tasks
- Ask HN: How Do You Debug Multi-Step AI Workflows When the Output Is Wrong?
- AMD Announces Day 0 Support for Qwen 3.5 LLM on Instinct GPUs
- Self-Hosted AI: A Complete Roadmap for Beginners
- Qwen3-Next 80B MoE Achieves 39 Tokens/Second on RTX 5070/5060 Ti Dual-GPU Setup
- Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation
- Show HN: PgCortex – AI enrichment per Postgres row, zero transaction blocking
- Show HN: Inkog – Pre-flight check for AI agents (governance, loops, injection)
- Cohere Releases Tiny Aya: Efficient 3.3B Multilingual Model for 70+ Languages
- Ask HN: What is the best bang for buck budget AI coding?
- I broke into my own AI system in 10 minutes. I built it
- InitRunner: YAML-Based AI Agent Framework with RAG and Memory
- GPU-Accelerated DataFrame Library for Local Inference Workloads
- Alibaba Unveils Major AI Model Upgrade Ahead of DeepSeek Release
- Switching From Ollama and LM Studio to llama.cpp: Performance Benefits
- Optimal llama.cpp Settings Found for Qwen3 Coder Next Loop Issues
- GitHub Announces Support for Open Source AI Project Maintainers
- MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
- ByteDance Releases Seedance 2.0 AI Development Platform
- Qwen Coder Next Shows Specialized Agent Performance
- OpenClaw with vLLM Running for Free on AMD Developer Cloud
- Microsoft MarkItDown: Document Preprocessing Tool for LLMs
- Heaps Do Lie: Debugging a Memory Leak in vLLM
- New Header-Only C++ Benchmark Tool for Predictive Models on Raw Binary Streams
- GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks
- I Tried a Claude Code Rival That's Local, Open Source, and Completely Free
- Analysis Reveals AI's Real Impact on Software Launches and Development
- Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
- 5 Practical Ways to Use Local LLMs with MCP Tools