Tagged "advanced"
-
Show HN: MCP-Enabled File Storage for AI Agents, Auth via Ethereum Wallet
-
Advanced Quantization Techniques Show Surprising Performance Gains Over Standard Methods
-
What Breaks When AI Agent Frameworks Are Forced Into <1MB RAM and Sub-ms Startup
-
Show HN: A Ground-Up TLS 1.3 Client Written in C
-
Enterprise Infrastructure Guide: Running Local LLMs for 70-150 Developers
-
Show HN: Agora – AI API Pricing Oracle with X402 Micropayments
-
Making Wolfram Technology Available as Foundation Tool for LLM Systems
-
Wave Field LLM Achieves O(n log n) Scaling: 825M Model Trained to 1B Parameters in 13 Hours
-
Breaking the Speed Limit: Strategies for 17k Tokens/Sec Local Inference
-
Qwen3 Demonstrates Advanced Voice Cloning via Embeddings
-
Custom Portable Workstation Optimized for Local AI Inference Builds
-
Open-Source Framework Achieves Gemini 3 Deep Think Level Performance Through Local Model Scaffolding
-
Nvidia Could Launch Its First Laptops With Its Own Processors
-
Massu: Governance Layer for AI Coding Assistants with 51 MCP Tools
-
FORTHought: Self-Hosted AI Stack for Physics Labs Built on OpenWebUI
-
The Complete Stack for Local Autonomous Agents: From GGML to Orchestration
-
AI-Powered Reverse-Engineering of Rosetta 2 for Linux
-
AI Is Stress Testing Processor Architectures and RISC-V Fits the Moment
-
O-TITANS: Orthogonal LoRA Framework for Gemma 3 with Google TITANS Memory Architecture
-
Google Open-Sources NPU IP, Synaptics Implements It for Hardware Acceleration
-
CPU-Trained Language Model Outperforms GPU Baseline After 40 Hours
-
AI PCs Explained: 7 Critical Truths About NPUs and Privacy
-
Taalas Etches AI Models onto Transistors to Rocket Boost Inference
-
Search and Analyze Documents from the DOJ Epstein Files Release with Local LLM
-
Qwen3 Coder Next Remains Effective at Aggressive Quantization Levels
-
[Release] Ouro-2.6B-Thinking: ByteDance's Recurrent Model Now Runnable Locally
-
24 Simultaneous Claude Code Agents on Local Hardware
-
Sarvam Brings AI to Feature Phones, Cars, and Smart Glasses
-
Running Local LLMs and VLMs on Arduino UNO Q with yzma
-
Complete Offline AI System: Voice Control and Smart Home via Local LLM and Radio Without Internet
-
LayerScale Launches Inference Engine Faster Than vLLM, SGLang, and TRT-LLM
-
Hardware Economics Shift: DDR5 RDIMM Pricing Now Comparable to GPUs for Local Inference
-
Aegis.rs: Open Source Rust-Based LLM Security Proxy Released
-
Show HN: Shiro.computer Static Page, Unix/NPM Shimmed to Host Claude Code
-
Alibaba's Qwen3.5-397B Achieves #3 Position in Open Weights Model Rankings
-
Qualcomm Ventures Positions India as Blueprint for Affordable On-Device AI Infrastructure
-
Same INT8 Model Shows 93% to 71% Accuracy Variance Across Snapdragon Chipsets
-
GLM-5 Technical Report: DSA Innovation Reduces Training and Inference Costs
-
Matmul-Free Language Model Trained on CPU in 1.2 Hours
-
Cloudflare Releases Agents SDK v0.5.0 with Rust-Powered Infire Engine for Edge Inference
-
Ask HN: How Do You Debug Multi-Step AI Workflows When the Output Is Wrong?
-
Qwen3-Next 80B MoE Achieves 39 Tokens/Second on RTX 5070/5060 Ti Dual-GPU Setup
-
Qwen 3.5-397B-A17B Now Available for Local Inference with Aggressive Quantization
-
Show HN: PgCortex – AI enrichment per Postgres row, zero transaction blocking
-
I attacked my own LangGraph agent system. All 6 attacks worked
-
Show HN: Inkog – Pre-flight check for AI agents (governance, loops, injection)
-
High Bandwidth Flash Memory Could Alleviate VRAM Constraints in Local LLM Inference
-
I broke into my own AI system in 10 minutes. I built it
-
First Vibecoded AI Operating System for Local Deployment
-
Switching From Ollama and LM Studio to llama.cpp: Performance Benefits
-
MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
-
Ming-flash-omni-2.0: 100B MoE Omni-Modal Model Released
-
Scaling llama.cpp On Neoverse N2: Solving Cross-NUMA Performance Issues
-
Samsung's REAM: Alternative Model Compression Technique
-
Heaps Do Lie: Debugging a Memory Leak in vLLM
-
New Header-Only C++ Benchmark Tool for Predictive Models on Raw Binary Streams
-
GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks
-
Using Recursive Language Models to Handle Huge Contexts for Local LLMs
-
Mistral AI Debugs Critical Memory Leak in vLLM Inference Engine
-
175,000 Publicly Exposed Ollama Servers Create Major Security Risk
-
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
-
Building a RAG Pipeline on 2M+ Pages: EpsteinFiles-RAG Project
-
Energy-Based Models Compared Against Frontier AI for Sudoku Solving
-
DeepSeek Launches Model Update with 1M Context Window
-
Carmack Proposes Using Long Fiber Lines as L2 Cache for Streaming AI Data
-
Anthropic Releases Claude Opus 4.6 Sabotage Risk Assessment
-
Community Member Builds 144GB VRAM Local LLM Powerhouse