Vector databases make semantic search possible — finding documents by meaning rather than exact keywords. Combined with LLMs, they power RAG (Retrieval-Augmented Generation) applications that answer questions from your own data. Here's the practical implementation.
## What Vector Search Solves
Keyword search finds documents containing the exact words; vector search finds documents with similar meaning.

Query: "how do I reset my password"

- Keyword search finds: "password reset instructions", "reset password page"
- Vector search also finds: "account recovery", "forgot credentials", "login issues"
For documentation, customer support, and knowledge bases: vector search returns dramatically more relevant results.
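Under the hood, "similar meaning" is geometric: texts are mapped to high-dimensional vectors, and closeness is typically scored with cosine similarity. A minimal sketch of the metric itself:

```typescript
// Cosine similarity between two vectors:
// 1 = same direction (very similar), 0 = orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

In practice the database computes this for you (pgvector's `<=>` operator below), but the intuition carries through the whole pipeline.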
## The Embedding Pipeline
```typescript
import OpenAI from 'openai'

const openai = new OpenAI()

async function embedText(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small', // 1536 dimensions, $0.02/1M tokens
    input: text,
  })
  return response.data[0].embedding
}

// For a Claude-based stack, Anthropic recommends Voyage AI embeddings:
// voyage-3-lite is fast and cheap for large-scale indexing
```
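For large-scale indexing, one request per chunk is slow; the OpenAI embeddings endpoint accepts an array of inputs, so you can batch. A small helper for the batching step (the name `toBatches` and the batch size of 100 are illustrative choices, not from the pipeline above):

```typescript
// Split items into fixed-size batches. Pass each batch as the `input`
// array of a single embeddings request to cut round trips dramatically.
function toBatches<T>(items: T[], batchSize = 100): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize))
  }
  return batches
}
```

Each batched response returns one embedding per input, in order, under `response.data`.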
## Storing Vectors in Postgres with pgvector
```sql
-- Enable the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Table with a vector column
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  metadata JSONB,
  embedding vector(1536) -- dimension matches your model
);

-- Index for fast similarity search
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
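`lists = 100` is a reasonable default, not a universal one. The pgvector README suggests roughly `rows / 1000` lists for tables up to ~1M rows and `sqrt(rows)` beyond that, and recommends building the ivfflat index after the table has data. A tiny helper encoding that guidance (the function name is ours):

```typescript
// Heuristic from the pgvector README: lists ≈ rows/1000 up to ~1M rows,
// sqrt(rows) above that. Build the ivfflat index after bulk-loading.
function recommendedLists(rowCount: number): number {
  if (rowCount <= 1_000_000) return Math.max(1, Math.round(rowCount / 1000))
  return Math.round(Math.sqrt(rowCount))
}
```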
```typescript
// Insert a document with its embedding (db is a PrismaClient instance)
async function indexDocument(content: string, metadata: object) {
  const embedding = await embedText(content)
  await db.$executeRaw`
    INSERT INTO documents (content, metadata, embedding)
    VALUES (${content}, ${JSON.stringify(metadata)}::jsonb, ${JSON.stringify(embedding)}::vector)
  `
}
```
## Semantic Search Query
```typescript
async function semanticSearch(query: string, limit = 5) {
  const queryEmbedding = await embedText(query)
  const results = await db.$queryRaw<Array<{
    id: number
    content: string
    metadata: object
    similarity: number
  }>>`
    SELECT id, content, metadata,
           1 - (embedding <=> ${JSON.stringify(queryEmbedding)}::vector) AS similarity
    FROM documents
    ORDER BY embedding <=> ${JSON.stringify(queryEmbedding)}::vector
    LIMIT ${limit}
  `
  // <=> is pgvector's cosine distance operator, so 1 - distance = similarity
  return results.filter(r => r.similarity > 0.7) // tune this threshold for your data
}
```
## RAG: Answering Questions from Your Data
```typescript
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

async function answerFromDocs(question: string): Promise<string> {
  // 1. Find relevant documents
  const relevantDocs = await semanticSearch(question, 5)
  if (relevantDocs.length === 0) {
    return 'I couldn\'t find relevant information to answer that question.'
  }

  // 2. Build context from retrieved documents
  const context = relevantDocs
    .map((doc, i) => `[${i + 1}] ${doc.content}`)
    .join('\n\n')

  // 3. Ask Claude with grounding context
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024, // required by the Messages API
    system: 'Answer questions using only the provided context. If the context doesn\'t contain the answer, say so.',
    messages: [{
      role: 'user',
      content: `Context:\n${context}\n\nQuestion: ${question}`,
    }],
  })

  const block = response.content[0]
  return block.type === 'text' ? block.text : ''
}
```
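One practical wrinkle: five retrieved chunks of long documents can blow past your prompt budget. A rough guard is to cap the context by estimated token count before sending it (the ~4 characters per token ratio is a common English-text heuristic, not an exact tokenizer):

```typescript
// Trim a list of chunks to an approximate token budget, keeping the
// highest-ranked chunks (the input is assumed sorted by similarity).
function capContext(chunks: string[], maxTokens = 3000): string[] {
  const kept: string[] = []
  let tokens = 0
  for (const chunk of chunks) {
    const estimate = Math.ceil(chunk.length / 4) // ~4 chars per token
    if (tokens + estimate > maxTokens) break
    kept.push(chunk)
    tokens += estimate
  }
  return kept
}
```

Dropping the tail is usually safe because `semanticSearch` already returns results in descending similarity order.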
## Chunking Strategy
Document chunking significantly affects retrieval quality:
```typescript
function chunkDocument(text: string, chunkSize = 500, overlap = 50): string[] {
  const words = text.split(' ')
  const chunks: string[] = []
  for (let i = 0; i < words.length; i += chunkSize - overlap) {
    chunks.push(words.slice(i, i + chunkSize).join(' '))
    if (i + chunkSize >= words.length) break
  }
  return chunks
}

// Index each chunk separately
for (const chunk of chunkDocument(document.content)) {
  await indexDocument(chunk, { sourceDocId: document.id })
}
```
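Fixed word windows can cut sentences in half, which hurts embedding quality at chunk boundaries. A common refinement (a sketch of our own, not part of the starter) is to split on sentence boundaries and pack sentences greedily up to the size limit:

```typescript
// Greedy sentence packing: split on sentence-ending punctuation, then
// accumulate whole sentences until a chunk would exceed maxWords.
function chunkBySentence(text: string, maxWords = 500): string[] {
  const sentences = text.match(/[^.!?]+[.!?]+(\s|$)|[^.!?]+$/g) ?? [text]
  const chunks: string[] = []
  let current: string[] = []
  let count = 0
  for (const sentence of sentences) {
    const words = sentence.trim().split(/\s+/).length
    if (count + words > maxWords && current.length > 0) {
      chunks.push(current.join(' '))
      current = []
      count = 0
    }
    current.push(sentence.trim())
    count += words
  }
  if (current.length > 0) chunks.push(current.join(' '))
  return chunks
}
```

The trade-off: no overlap between chunks, so pair it with slightly larger windows if answers tend to span sentences.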
## Managed Options
If you don't want to manage pgvector yourself:
- Pinecone: Fully managed, generous free tier
- Qdrant: Open-source, self-hostable or cloud
- Supabase Vector: pgvector on Supabase
- Neon: pgvector on Neon (same DB as your app)
The AI SaaS Starter at whoffagents.com includes a vector search module with pgvector + Prisma, embedding pipeline, semantic search, and RAG pattern pre-built. $99 one-time.