Jokes aside, on a technical level, Google/Brave search and vector stores work in a very similar way. The main difference is scale. From an LLM point of view, both fall under RAG. You can even ignore embedding models entirely and just use TF-IDF or BM25. Elastic and OpenSearch (and technically Lucene) are powerhouses when it comes to this kind of retrieval. You can also enable a small BERT model as the embedding model, around 100 MB (FP32), running on CPU, within either Elastic or OpenSearch. If your document set is relatively small (under ~10K documents) and has good variance, a small BERT model can handle the task well, or you can even skip embeddings entirely. For deeper semantic similarity or closely related documents, more powerful embedding models are usually the go-to.
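To make the "skip embeddings and just use BM25" point concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is only an illustration, not how Lucene implements it internally: the whitespace tokenizer, the function names, the sample documents, and the `k1`/`b` values are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase + whitespace tokenizer (an assumption; real engines use analyzers).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query string."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical mini-corpus, purely for illustration.
DOCS = [
    "elastic and opensearch are built on lucene",
    "bert models produce dense vector embeddings",
    "bm25 ranks documents by term frequency",
]

scores = bm25_scores("lucene opensearch", DOCS)
best = max(range(len(DOCS)), key=lambda i: scores[i])
print(DOCS[best])
```

Documents that share no term with the query score exactly zero, which is the main limitation the post alludes to: lexical scoring cannot match paraphrases, and that is where embedding models come in.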
I came from data engineering before jumping into LLMs, and I am surprised that many people in this space have never heard of Elastic/OpenSearch
Reddit r/LocalLLaMA / 3/23/2026
💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage
Key Points
- Elastic/OpenSearch and Lucene are presented as strong retrieval options for LLM-backed pipelines, comparable to vector stores and traditional search, with scale being the main differentiator.
- A small BERT model (around 100 MB FP32) can run on CPU inside Elastic/OpenSearch to generate embeddings, enabling embedding-based retrieval within existing infrastructure.
- For small document sets (roughly under 10,000) with good variance, a compact embedding model can suffice, and in some cases embeddings can be skipped in favor of simpler methods like TF-IDF or BM25.
- The overall takeaway is that Elastic/OpenSearch can be a practical, scalable choice for RAG workflows, especially when you want to leverage familiar tooling and avoid introducing new stack complexity.
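As a rough illustration of the two retrieval modes in the key points above, the sketch below builds search request bodies as plain Python dicts, without any client library. It assumes Elasticsearch 8.x-style request shapes (a `match` query, which is scored with BM25 by default, and a top-level `knn` section); OpenSearch expresses k-NN search with a different query shape. The field names `body` and `body_vector`, the helper names, and the toy query vector are hypothetical.

```python
import json


def lexical_request(field, text, size=5):
    # Full-text `match` query; Elasticsearch scores it with BM25 by default.
    return {"size": size, "query": {"match": {field: text}}}


def vector_request(field, query_vector, k=5, num_candidates=50):
    # Approximate k-NN search over a dense-vector field (Elasticsearch 8.x shape).
    # `query_vector` would come from the same embedding model used at index time.
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }


# Hypothetical field names and a toy 3-dimensional vector, for illustration only.
lexical = lexical_request("body", "how to rotate api keys")
vector = vector_request("body_vector", [0.12, -0.40, 0.33])

print(json.dumps(lexical, indent=2))
print(json.dumps(vector, indent=2))
```

Either body would be sent to the index's `_search` endpoint, and the returned hits would then be passed to the LLM as context, which is the retrieval half of RAG.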