Vector DB 2026: pgvector vs Pinecone vs Qdrant comparison

Vector databases became critical infrastructure with the rise of generative AI. Storing embeddings and running fast similarity search over them is the foundation of RAG, semantic search, and recommendation systems. Three options dominate in 2026: pgvector (a Postgres extension), Pinecone (managed SaaS), and Qdrant (open-source, self-hostable).

TL;DR
- pgvector: Postgres extension, simplicity, $0 hosting if Postgres exists.
- Pinecone: managed SaaS, massive scalability, $70+/month.
- Qdrant: open-source self-host, performance + control.

Detailed comparison

Criterion	pgvector	Pinecone	Qdrant
Model	Postgres extension	Cloud SaaS	Open-source + cloud
Setup	30 sec	5 min	1-4h self-host
Starter cost	0 (Postgres exists)	$0 (free tier 5 indexes)	0 (self-host)
1M vectors scaling	OK	OK	OK
100M vectors scaling	Hard	Excellent	OK
1B vectors scaling	No	OK ($$$)	OK
p95 latency	50-200ms	20-50ms	30-100ms
Hybrid SQL filters	✓ excellent	Limited	Good
API	SQL	REST	REST + gRPC
Lock-in	None	High	None

When to pick pgvector

Ideal cases:

Project already with Postgres
<10M vectors
Complex SQL filters (per tenant, date, etc.)
Tight budget

Setup:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536),
  organization_id UUID,
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

CREATE INDEX ON documents (organization_id);
```

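```ts
// Assumes `openai` (OpenAI SDK client) and `prisma` (PrismaClient) are already initialised.
const embedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: documentContent,
});

// pgvector accepts a text literal such as '[0.1,0.2,...]', so serialise the array before casting.
const vector = JSON.stringify(embedding.data[0].embedding);

await prisma.$executeRaw`
  INSERT INTO documents (content, embedding, organization_id)
  VALUES (${documentContent}, ${vector}::vector, ${orgId}::uuid)
`;

// `queryVector` is the query embedding, serialised the same way as `vector` above.
const results = await prisma.$queryRaw`
  SELECT id, content, 1 - (embedding <=> ${queryVector}::vector) AS similarity
  FROM documents
  WHERE organization_id = ${orgId}::uuid
  ORDER BY embedding <=> ${queryVector}::vector
  LIMIT 10
`;
```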
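The `<=>` operator in the query above returns cosine distance, which is why the query computes `1 - distance` to report a similarity score. The following is a minimal sketch of that arithmetic in plain TypeScript, plus the brute-force scan that an HNSW index avoids. The helper names (`cosineSimilarity`, `cosineDistance`, `bruteForceTopK`) are hypothetical and are not part of pgvector.

```typescript
// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance is 1 - similarity, so it ranges from 0 (same direction) to 2 (opposite).
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}

interface StoredDoc {
  id: string;
  embedding: number[];
}

// Full scan: score every document, sort by similarity, keep the best k.
// Cost grows linearly with the number of documents, unlike an approximate index.
function bruteForceTopK(query: number[], docs: StoredDoc[], k: number) {
  return docs
    .map((d) => ({ id: d.id, similarity: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

A scan like this is fine for a few thousand vectors in a test suite, and it is a handy way to check the recall of an approximate index on a small sample.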

When to pick Pinecone

Ideal cases:

Scale >50M vectors
No Postgres
Multi-region serverless
Critical latency (<50ms)

Setup:

```ts
import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('kolonell-docs');

// One namespace per tenant keeps each organization's vectors isolated.
await index.namespace(orgId).upsert([
  {
    id: docId,
    values: embedding,
    // Range operators such as $gte only compare numbers, so store the timestamp as epoch milliseconds.
    metadata: { content, createdAt: Date.now() },
  },
]);

const results = await index.namespace(orgId).query({
  vector: queryEmbedding,
  topK: 10,
  includeMetadata: true,
  filter: { createdAt: { $gte: Date.parse('2026-01-01') } },
});
```

2026 pricing:

Free: 5 indexes, 100K vectors
Standard: $70/month (1 pod)
Enterprise: custom

When to pick Qdrant

Ideal cases:

Open-source preference
Self-host for data sovereignty
Critical performance
Hybrid search (vector + filters)

Self-host Hetzner setup:

```bash
docker run -d --name qdrant \
  -p 6333:6333 \
  -v qdrant_data:/qdrant/storage \
  qdrant/qdrant:latest
```

```ts
import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({ url: 'https://qdrant.kolonell.com' });

await client.createCollection('documents', {
  vectors: { size: 1536, distance: 'Cosine' },
  // optimizers_config is a collection-level setting, a sibling of `vectors`.
  optimizers_config: { default_segment_number: 2 },
});

await client.upsert('documents', {
  points: [
    {
      id: docId, // must be an unsigned integer or a UUID string
      vector: embedding,
      payload: { content, organizationId: orgId },
    },
  ],
});

const results = await client.search('documents', {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [{ key: 'organizationId', match: { value: orgId } }],
  },
});
```

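Self-host cost: $20-50/month on a Hetzner CX31-class server. The 50M+ vector capacity figure assumes small or quantized vectors stored on disk; full-precision 1536-dimension embeddings at that count will not fit in the RAM of a server this size.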
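Capacity figures like the one above are easier to judge with a quick size estimate. The sketch below only counts raw vector storage (count × dimensions × bytes per component) and ignores index overhead and payloads, so treat the result as a lower bound. The function name `rawVectorBytes` is hypothetical.

```typescript
// Raw storage for the vectors alone, excluding HNSW graph links and payload data.
// bytesPerComponent: 4 for float32, 1 for 8-bit scalar quantization.
function rawVectorBytes(
  count: number,
  dimensions: number,
  bytesPerComponent: number = 4,
): number {
  if (count < 0 || dimensions <= 0 || bytesPerComponent <= 0) {
    throw new Error('count must be >= 0; dimensions and bytesPerComponent must be > 0');
  }
  return count * dimensions * bytesPerComponent;
}

const GB = 1e9;

// 50M vectors of 1536 float32 components.
const float32Gb = rawVectorBytes(50_000_000, 1536) / GB;

// Same vectors with one byte per component.
const int8Gb = rawVectorBytes(50_000_000, 1536, 1) / GB;

console.log(`float32: ${float32Gb.toFixed(1)} GB, int8: ${int8Gb.toFixed(1)} GB`);
```

Running the same estimate for 1M vectors gives about 6 GB at float32, which is why the 1M-vector scenario fits comfortably on a small server while 50M does not without quantization or on-disk storage.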

Standard RAG implementation

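```ts
// Assumes `openai` and `anthropic` SDK clients are initialised, and that `searchVector`
// wraps one of the similarity queries shown earlier, returning rows with `id` and `content`.
async function ragQuery(question: string, orgId: string) {
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: question,
  });
  // Pass the raw vector, not the whole API response.
  const questionEmbedding = embeddingResponse.data[0].embedding;

  const relevantDocs = await searchVector(questionEmbedding, orgId, 5);
  const context = relevantDocs.map(d => d.content).join('\n\n');

  const response = await anthropic.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 1024,
    system: 'Answer based on context only. If not in context, say "Not in the documentation".',
    messages: [{
      role: 'user',
      content: `Context:\n${context}\n\nQuestion: ${question}`,
    }],
  });

  // Content blocks are a union type, so narrow to a text block before reading `.text`.
  const first = response.content[0];
  return {
    answer: first.type === 'text' ? first.text : '',
    sources: relevantDocs.map(d => d.id),
  };
}
```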

Real case — Africa SaaS support RAG

Performance comparison on a knowledge base of 1M vectors:

Metric	pgvector	Pinecone	Qdrant
Setup time	30 min	1h	4h
Monthly cost	$0 (Postgres exists)	$70	$25 (Hetzner)
p95 latency	95ms	35ms	60ms
Tenant filtering	Excellent	Limited	Good
Monthly maintenance	0 (managed Postgres)	0	1-2h

Optimal choice for this case: pgvector (existing Postgres plus SQL tenant filters).

Common pitfalls

Wrong embedding model: text-embedding-3-small (1536 dimensions) is a good default. Don't mix models within one index, since vectors from different models are not comparable.
No HNSW index: a full scan over 1M vectors takes 5-10 seconds, versus about 50ms with an index.
No chunking strategy: split long docs into 500-1000 token chunks before embedding.
No re-ranking: retrieve the top 50 by vector search, then re-rank with a cross-encoder (around +30% quality).
Embedding cost ignored: $0.10/1M tokens adds up. Cache embeddings by document hash.
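The last pitfall, caching embeddings by document hash, can be sketched as a thin wrapper around whatever embedding call you use. This is a minimal in-memory version under stated assumptions: `EmbedFn`, `contentHash`, and `withEmbeddingCache` are hypothetical names, and a production setup would more likely persist the cache in a database table or key-value store.

```typescript
import { createHash } from 'node:crypto';

// Any function that turns text into a vector, for example a wrapper around an embeddings API.
type EmbedFn = (text: string) => Promise<number[]>;

// SHA-256 of the exact text: identical content always maps to the same cache key.
function contentHash(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// Wraps an embedding function so that repeated content is only embedded once.
function withEmbeddingCache(
  embed: EmbedFn,
  cache: Map<string, number[]> = new Map(),
): EmbedFn {
  return async (text: string) => {
    const key = contentHash(text);
    const hit = cache.get(key);
    if (hit !== undefined) return hit; // cache hit: no API call, no cost
    const vector = await embed(text);
    cache.set(key, vector);
    return vector;
  };
}
```

If you switch embedding models, include the model name in the cache key so vectors from different models never mix.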

FAQ

Q: Open-source embeddings vs OpenAI?

A: Sentence-transformers models such as bge-large-en are free and good. OpenAI is simpler to use and slightly higher quality.

Q: Migrate between vector DBs?

A: Possible but costly. Re-embedding may be needed if dimensions differ.

Q: How large should chunks be?

A: 500-1000 tokens per chunk with 50-100 tokens overlap. Test for your domain.
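A sliding window with overlap is one simple way to get chunks of that shape. The sketch below is an illustration, not a recommended tokenizer: it approximates tokens with whitespace-separated words, which is an assumption, and real token counts from a model tokenizer will differ. The function names `chunkTokens` and `chunkText` are hypothetical.

```typescript
// Splits a token list into windows of `chunkSize`, each sharing `overlap` tokens with the previous one.
function chunkTokens(tokens: string[], chunkSize: number, overlap: number): string[][] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error('overlap must be >= 0 and smaller than chunkSize');
  }
  const step = chunkSize - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once a window has reached the end, to avoid a trailing chunk made only of overlap.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

// Assumption: whitespace-separated words stand in for tokens.
function chunkText(text: string, chunkSize: number = 800, overlap: number = 80): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  return chunkTokens(words, chunkSize, overlap).map((chunk) => chunk.join(' '));
}
```

Each returned string is then embedded and stored as its own row or point, usually alongside the source document id so results can be traced back.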

Conclusion

2026 vector DB choice:

Postgres exists + <10M vectors: pgvector (clear winner)
Massive scale + multi-region: Pinecone
Sovereignty + open-source: Qdrant self-host

Migration remains possible later, so pgvector is a sound starting point if Postgres already exists.

Tags:#Vector DB#pgvector#Pinecone#Qdrant#RAG#AI

Mohamed Bah

Founder, Kolonell

Passionate about digital and entrepreneurship in Africa, Mohamed has been helping Senegalese businesses with their digital transformation since 2020. Founder of Kolonell, he believes every SME deserves a professional and accessible online presence.
