Qwen3.5-35B Unsloth Dynamic GGUFs Achieve SOTA Across Nearly All Quantisation Levels
The LocalLLaMA community has released comprehensive GGUF quantisations for Qwen3.5-35B that achieve state-of-the-art performance across nearly all bit levels. The release includes exhaustive benchmarking, with over 150 KL Divergence tests and 9TB of quantised variants, giving practitioners detailed performance data to guide model selection.
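The KL Divergence tests mentioned above compare a quantised model's next-token probability distribution against the full-precision reference at each position: lower divergence means the quantisation drifted less. A minimal per-position sketch of that metric (the logit values and function names here are illustrative, not the actual benchmarking harness):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(ref_logits, quant_logits):
    """KL(P_ref || P_quant) in nats for one token position.

    Measures how far the quantised model's next-token distribution
    drifts from the full-precision one; 0 means identical.
    """
    p = softmax(ref_logits)
    q = softmax(quant_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over a tiny 4-token vocabulary:
ref   = [2.0, 1.0, 0.5, -1.0]   # full-precision (e.g. BF16) model
quant = [1.9, 1.1, 0.4, -0.9]   # quantised (e.g. Q4_K_M) variant

divergence = kl_divergence(ref, quant)
```

In a real benchmark this would be averaged over many token positions of a held-out corpus, and the per-variant averages compared to rank quantisation quality.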
Beyond the quantisations themselves, a critical bug in the tool-calling chat template was identified and fixed; the issue affects all existing quantisation uploads. Infrastructure-level fixes like this are crucial for production deployments where tool-use reliability is essential.
For practitioners deploying locally, this release pairs cutting-edge quantisation quality with comprehensive benchmarks for validating which variants suit specific hardware constraints. The breadth of variants covers everything from edge devices to high-end GPUs.
Read the full article on r/LocalLLaMA.
Source: r/LocalLLaMA · Relevance: 10/10