Tagged "benchmarks"
- Qwen3.5 Series Releases Comprehensive Model Lineup Across All Tiers
- Qwen3.5-35B-A3B Emerges as Game-Changer for Agentic Coding Tasks
- Qwen3.5-27B Identified as Sweet Spot for Mid-Range Local Deployment
- Show HN: 100% LLM Accuracy, No Fine-Tuning, JSON Only
- Advanced Quantization Techniques Show Surprising Performance Gains Over Standard Methods
- No, Local LLMs Can't Replace ChatGPT or Gemini — I Tried
- The Real AI Competition Is Closed-Source vs Open-Source, Not America vs China
- Anthropic Has Never Open-Sourced an LLM: Implications for Local Deployment Strategy
- Which Web Frameworks Are Most Token-Efficient for AI Agents?
- Breaking the Speed Limit: Strategies for 17k Tokens/Sec Local Inference
- How Do You Know Which SKILL.md Is Good?
- Qwen3-Code-Next Proves Practical for Local Development: Real-World Coding Tasks on Mac Studio
- A Tool to Tell You What LLMs Can Run on Your Machine
- GLM-5 Becomes Top Open-Weights Model on Extended NYT Connections Benchmark
- How Slow Local LLMs Are on My Framework 13 AMD Strix Point
- CPU-Trained Language Model Outperforms GPU Baseline After 40 Hours
- Asus ExpertBook B3 G2 with 50 TOPS AI Sets New Enterprise Standard
- Strix Halo Performance Benchmarks: Minimax M2.5, Step 3.5 Flash, Qwen3 Coder
- I Run Local LLMs in One of the World's Priciest Energy Markets, and I Can Barely Tell
- Qwen3 Coder Next Remains Effective at Aggressive Quantization Levels
- Enhanced Quantization Visualization Methods for Understanding LLM Compression Trade-offs
- GPT4All Replaces Ollama on Mac After Quick Trial
- Hardware Economics Shift: DDR5 RDIMM Pricing Now Comparable to GPUs for Local Inference
- Alibaba's Qwen3.5-397B Achieves #3 Position in Open Weights Model Rankings
- Real-World Coding Benchmark Tests LLMs on 65 Production Codebase Tasks
- Qwen3.5-397B-A17B Now Available for Local Inference with Aggressive Quantization
- Optimal llama.cpp Settings Found for Qwen3 Coder Next Loop Issues
- MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
- Running Your Own AI Assistant for €19/Month: Complete Self-Hosting Guide
- Running Mistral-7B on Intel NPU Achieves 12.6 Tokens/Second
- OpenClaw with vLLM Running for Free on AMD Developer Cloud
- New Header-Only C++ Benchmark Tool for Predictive Models on Raw Binary Streams
- Using Recursive Language Models to Handle Huge Contexts for Local LLMs