Tagged "apple"
- Llama 4 Scout on MLX: The Complete Apple Silicon Guide (2026)
- Running Gemma 4 on an iPhone 13 Pro
- DFlash Doubles Token Generation Speed of Qwen3.5 27B on Mac M5 Max
- oMLX Framework Implements DFlash Attention for Optimized Inference
- MiniMax M2.7 Achieves SOTA Performance Under 64GB on Mac with TQ Quantization
- DFlash Speculative Decoding Achieves 3.3x Speedup on Apple Silicon
- Parakeet Streaming ASR on Apple Silicon via CoreML
- AIYO Wisper: Local Voice-to-Text for macOS Using WhisperKit
- On-Device Apple Intelligence Vulnerable to Prompt Injection Attacks
- Running a 1.7B-Parameter LLM on an Apple Watch
- Comprehensive Benchmark: 37 LLMs Tested on MacBook Air M5 With Open-Source Tool
- Real-time Multimodal AI on Apple Silicon: Gemma E2B Demo Shows Practical Edge Deployment
- Apple Brings Enhanced On-Device AI Features to iPhone
- Ollama Gets Blazing Fast on Macs with Full MLX Support and 2× Speedups
- Apple Research Shows Self-Distillation Significantly Improves Local Code Generation
- Mixed Precision Quantization on MLX with TurboQuant Implementation
- Kokoro TTS Achieves 20× Realtime Speed on CPU-Only On-Device Inference
- OpenUMA – Apple-Style Unified Memory for x86 AI Inference
- Google Gemma 4 Released with GGUF Quantizations
- Gemma 4 26B A4B Outperforms Qwen 3.5 35B on Apple Silicon
- Apple Silicon Macs Run Local AI Faster with Ollama's New MLX Support
- TinyGPU Adds Mac Support for External Nvidia GPU Acceleration
- Ollama Adopts Apple's MLX Framework for Faster Local AI on Mac
- Is Anyone Working on an AI Operating System?
- Select the Right Hardware for Your Local LLM Deployment with This Online Guide
- TurboQuant KV Cache Compression Achieves 22.8% Faster Decoding at 32K Context
- M5 Max Delivers 1.7x Faster Inference Than M3 Max on Qwen 3.5 Models
- mlx-Code: Run Claude Code Locally with MLX-LM
- Apple Gets Full Gemini Access and Uses Distillation to Build Lightweight On-Device AI
- Apple Plans Slimmed-Down Gemini Models for Local iPhone AI Features
- Running an Open-Weight LLM Locally on an Apple Watch
- Open-Source Tool Helps Determine Which Local LLMs Run on Your PC
- Ditching Paid AI Services: Building Self-Hosted LLM Solutions as ChatGPT, Claude, and Gemini Alternatives
- Multi-Token Prediction support coming to MLX-LM for Qwen 3.5
- Apple M5 Max 128GB real-world performance benchmarks for local inference
- DeepSeek R1 RTX 4090 vs Apple M3 Max: Benchmark & Performance Guide
- Apple's On-Device AI Raises Privacy Alarms Across British Parliament
- Startup Transforms Mac Mini Into Full-Powered AI Inference System With External GPU
- AMD Launches Agent System Optimized for Local AI Inference With Ryzen and Radeon
- Local LLMs on Apple Silicon Mac 2026: M1 M2 M3 Guide
- Apple M5 Max 128GB Benchmark Results for Local LLM Inference
- M5 Max and M5 Ultra Chipsets Demonstrate Significant Bandwidth Improvements for Local LLM Inference
- Apple Launches MacBook Neo with A18 Pro Chip for Affordable Local AI Inference
- Windows 11 Notepad Gets On-Device AI Text Generation Without Subscription
- Real-World Qwen 3.5 9B Agent Performance on M1 Pro Validates Edge Deployment
- Apple Unveils MacBook Pro with M5 Pro and M5 Max Featuring On-Device AI
- Apple Unveils MacBook Pro With M5 Pro and M5 Max for On-Device AI
- Apple M5 Pro and M5 Max: 4× Faster LLM Processing
- AMD Launches Copilot+ Desktop Chips to Compete in On-Device AI Market
- Qualcomm Snapdragon Wear Elite: 2B Parameter NPU for Personal AI Wearables
- Apple M4 iPad Air Targets AI Users with Double M1 Speed Performance
- Alibaba's Qwen 3.5 Small Model Runs Directly on iPhone 17
- Running Local AI Models on Mac Studio 128GB: 4B, 20B & 120B Tested
- Qualcomm Launches Snapdragon Wear Elite for On-Device AI on Wearables
- Apple Neural Engine Reverse-Engineered for Local Model Training on Mac Mini M4
- AMD Expands Ryzen AI 400 Series Portfolio for Consumer and Enterprise AI PC Options
- Apple Intelligence, Galaxy AI, Gemini: Why Your AI-Powered Phone Is Worth Repairing
- Apple: Python bindings for access to the on-device Apple Intelligence model
- Apple Accelerates U.S. Manufacturing with Mac Mini Production
- Qwen3-Code-Next Proves Practical for Local Development: Real-World Coding Tasks on Mac Studio
- AI-Powered Reverse-Engineering of Rosetta 2 for Linux
- Apple Researchers Develop On-Device AI Agent That Interacts With Apps for You
- GPT4All Replaces Ollama On Mac After Quick Trial
- Sourdine: Open-Source macOS App for 100% Local AI Transcription