Ask HN: What do you use for local embeddings?

A timely Hacker News question asking the community about their preferred solutions for local embedding generation. Embeddings are a critical component of RAG (Retrieval-Augmented Generation) pipelines and many production local LLM deployments, yet embedding model selection often receives less attention than base model choice.

Local embeddings are essential for maintaining data privacy, reducing API costs, and enabling offline-capable systems. The discussion likely covers popular options like ONNX-optimized models, sentence-transformers, and lightweight alternatives suitable for resource-constrained environments. For practitioners building local LLM stacks, embedding choice directly impacts latency and memory requirements.
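To make the latency/memory trade-off concrete, here is a minimal sketch of how locally generated embeddings feed a retrieval step. The vectors below are toy stand-ins; in a real pipeline they would come from a local embedding model (for example, a sentence-transformers or ONNX-exported model), and the corpus, document IDs, and dimensions are illustrative assumptions.

```python
import math

# Toy stand-ins for locally generated embeddings. In practice these
# would be produced by a local embedding model; 3 dimensions are used
# here only for readability (real models emit hundreds of dimensions).
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}

def cosine_similarity(u, v):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, corpus, top_k=2):
    # Rank documents by similarity to the query embedding and
    # return the IDs of the top_k matches.
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

query = [0.85, 0.15, 0.05]  # hypothetical query embedding
print(retrieve(query, corpus))  # → ['doc_a', 'doc_b']
```

Every query and every indexed document passes through the embedding model, which is why the model's size and inference speed dominate end-to-end retrieval latency in a local deployment.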

This community-driven conversation provides practical insights into what experienced developers are actually using in production, making it valuable reference material for selecting embedding solutions for your local deployment pipeline.

Source: Hacker News · Relevance: 8/10