

Vector Database Picker: 12 Options + Decision Tree (2026)

pgvector beats Qdrant at 1M scale in Supabase's HNSW benchmark. 70% of teams should never leave Postgres. Here is the honest decision tree for the other 30%.

# Vector Database Picker: 12 Options + Decision Tree

The vector-database SERP in 2026 is a marketing landfill. Eight of the top ten ranking results for "best vector database 2026" are SaaS-vendor blogs or roundup sites that cycle the same five names — Pinecone, Weaviate, Qdrant, Milvus, Chroma — and never mention pgvector. The result is that most teams pick a dedicated vector DB they did not need. The numbers say something else: Supabase's published HNSW benchmark shows pgvector matching or beating Qdrant at 1M scale at 99% accuracy. Cluster A4 of our research surfaced this as the single most counter-intuitive finding in the entire RAG cluster: 70% of teams should use pgvector, not a dedicated vector DB. This page is the decision tree.

The decision tree

Start at the top. Stop at the first answer.
  1. Is your corpus under 10M vectors AND your query rate under 1K QPS?
     - Yes: use pgvector. You probably already have Postgres. Move on.
     - No: continue.
  2. Do you need hybrid search (BM25 + vectors) as a first-class feature?
     - Yes: Weaviate (managed) or OpenSearch (if Elasticsearch-savvy).
     - No: continue.
  3. Is your corpus over 100M vectors AND do you have a DevOps team?
     - Yes: Milvus (Zilliz Cloud if you want managed).
     - No: continue.
  4. Are you multi-modal (text + image + audio in one query)?
     - Yes: LanceDB.
     - No: continue.
  5. Do you want zero-ops managed, and is budget not the constraint?
     - Yes: Pinecone serverless.
     - No: Qdrant self-hosted on a VPS.

That tree handles roughly 95% of real RAG projects. The remaining 5% — multi-tenant SaaS at scale (Turbopuffer), sub-millisecond latency requirements (Redis), GPU-accelerated cosine similarity (FAISS) — are addressed below.

The 12 options with honest notes

Tier 1 — defaults

pgvector — The Postgres extension that wins by being already installed. HNSW since 0.5.0 (2024). At 1M scale on a properly tuned Postgres, recall@10 hits 99% with sub-50ms query latency. Operational cost is roughly zero if your team already runs Postgres. Hits the wall around 10-50M vectors depending on hardware.

Pinecone — Managed serverless, two clicks to provision, charged per vector and per query. The right choice when you do not want to operate anything. Cost grows linearly with corpus; a 100M-vector index is $300+/mo at default pricing. Strong dashboard, weak portability — once your data is in Pinecone, getting it out is a rebuild.

Qdrant — The strongest OSS option for self-hosting. Rust core, HNSW, sane defaults, decent dashboard. A $30/mo Hetzner VPS comfortably runs 10M+ vectors with room to grow. The Qdrant Cloud managed option exists; most teams self-host because the cost difference is large at scale.

Weaviate — The hybrid-search specialist. Built-in BM25 + vector hybrid retrieval with tunable alpha, which is the right architecture for noisy real-world corpora. Schema is heavier than Qdrant's; expect a steeper learning curve in week one. Pays back if hybrid is core.
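Weaviate's "tunable alpha" has a simple shape: a weighted blend of normalized keyword and vector scores, where alpha=1.0 is pure vector search and alpha=0.0 is pure BM25. The sketch below shows that shape only; it is not Weaviate's actual API, and the scores are assumed to be pre-normalized to [0, 1].

```python
# Illustrative alpha-blended hybrid ranking (the shape of Weaviate-style
# hybrid retrieval, not Weaviate's API). Scores assumed normalized to [0, 1].

def hybrid_rank(docs, alpha=0.5):
    # docs: list of (doc_id, bm25_score, vector_score)
    blended = [(d, alpha * v + (1 - alpha) * b) for d, b, v in docs]
    return sorted(blended, key=lambda x: x[1], reverse=True)

docs = [("a", 0.9, 0.2), ("b", 0.3, 0.8), ("c", 0.5, 0.5)]
# alpha=0.7 leans toward vector similarity, so "b" wins
print(hybrid_rank(docs, alpha=0.7))
```

Tuning alpha per corpus is exactly the kind of work that pays back on noisy real-world data: short keyword-heavy queries tend to want a lower alpha, natural-language questions a higher one.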

Tier 2 — niches

Milvus — The billion-scale option. Distributed architecture, operationally heavy (etcd, MinIO, Pulsar dependencies). Below 100M vectors the complexity does not pay back. Zilliz Cloud is the managed flavor; same engine, less ops.

Chroma — The notebook-first option. Excellent ergonomics for prototyping, fine for personal projects up to a few hundred thousand vectors, becomes the bottleneck above that. Many teams ship Chroma to staging and discover it does not scale to prod. Plan for the migration if you use it.

LanceDB — The multi-modal specialist. Stores text, image, and audio embeddings in one index; queries can be cross-modal. The Arrow-format storage is fast for bulk reads. Weakness: smaller ecosystem, fewer integrations.

FAISS — Facebook's library, GPU-accelerated, fastest per-query latency at scale. Not a database — it is a library. You build a service around it yourself. The right choice if you are already running on GPU and need sub-10ms vector search.

Tier 3 — situational

Elasticsearch / OpenSearch — Pick when your stack already has them for keyword search. Native hybrid via the script_score query or the dedicated k-NN plugin (OpenSearch). Operationally heavy; do not adopt for vectors alone.

Turbopuffer — Multi-tenant by design. The right option if you ship a SaaS that needs per-tenant vector indices and want one billing contract. Newer, smaller ecosystem.

Redis — The sub-millisecond latency option. Vector search via the RediSearch module. Pick when latency is the constraint and the corpus fits in RAM. Cost grows with RAM, not disk.

Zilliz Cloud — Managed Milvus. Same trade-offs as Milvus; pick when you want billion-scale without the ops.
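"Cost grows with RAM" for Redis is easy to put numbers on. A minimal back-of-envelope sizing, assuming float32 vectors, 1536 dimensions (an OpenAI-style embedding size, my assumption), and a ~1.5x overhead factor for the index graph and metadata (also an assumption; real overhead varies by engine and parameters):

```python
# Back-of-envelope RAM sizing for an in-memory vector index.
# dims=1536 and overhead=1.5 are illustrative assumptions, not measurements.

def index_ram_gb(n_vectors: int, dims: int = 1536,
                 bytes_per_float: int = 4, overhead: float = 1.5) -> float:
    raw = n_vectors * dims * bytes_per_float   # raw float32 payload
    return raw * overhead / 1e9                # index graph + metadata overhead

print(round(index_ram_gb(10_000_000), 1))  # ~92 GB for 10M x 1536-dim vectors
```

At 10M vectors you are already shopping for a ~128 GB box, which is why the "corpus fits in RAM" qualifier in the Redis note above does real work.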

The pgvector vs Qdrant benchmark in detail

The Supabase benchmark (cited in cluster A4 §10) is the most-quoted comparison in this space. The summary numbers:

| Metric | pgvector + HNSW | Qdrant |
|---|---:|---:|
| Recall@10 at 1M vectors | 99% | 99% |
| p50 latency | 5-10ms | 3-8ms |
| p99 latency | 20-50ms | 15-40ms |
| QPS (single node) | ~600 | ~1,000 |
| Index build time (1M vectors) | 8-15 min | 5-10 min |

Qdrant wins on throughput by ~60% and on p99 latency by ~25%. pgvector wins on operational simplicity (one process to run, one backup story, one set of credentials). At 1M scale and 100 QPS — which describes most production RAG pipelines — both options serve the load. The deciding factor is whether you would rather operate one Postgres or one Qdrant.
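"Both options serve the load" is a headroom statement. A quick sanity check using the single-node QPS figures from the table, with a 50% utilization ceiling as the operating rule of thumb (the ceiling is my assumption, not part of the benchmark):

```python
# Headroom check against the table's single-node QPS numbers.
# ceiling=0.5 is an assumed utilization rule of thumb.

def headroom_ok(offered_qps: float, capacity_qps: float,
                ceiling: float = 0.5) -> bool:
    return offered_qps / capacity_qps <= ceiling

print(headroom_ok(100, 600))    # pgvector at 100 QPS -> True
print(headroom_ok(100, 1000))   # Qdrant at 100 QPS   -> True
```

At 100 QPS both engines sit under 20% utilization, which is why the benchmark gap, real as it is, rarely decides the choice at this scale.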

Cost math at three scales

The numbers below are May 2026 sticker prices. Discounts and reserved capacity vary.

| Corpus size | Pinecone serverless | Qdrant on Hetzner | pgvector on Supabase | Milvus self-host |
|---|---:|---:|---:|---:|
| 1M vectors, 100 QPS | $40-60/mo | $30/mo (VPS) | included in plan | not recommended |
| 10M vectors, 500 QPS | $200-300/mo | $50-80/mo (VPS) | $25-50/mo (dedicated CPU) | not recommended |
| 100M vectors, 1K QPS | $1,500-2,500/mo | $150-300/mo (dedicated server) | $100-200/mo (large Postgres) | $400-800/mo (3-node cluster) |
| 1B vectors, 5K QPS | $15K+/mo | not recommended | not recommended | $2-5K/mo (multi-node) |

The Pinecone-to-Qdrant ratio grows from roughly 2x at 1M vectors to nearly 10x at 100M. The pgvector option is competitive up to 10M and breaks down above 100M (Postgres connection pooling and index size become real constraints).
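The ratio claim falls straight out of the table. Taking the midpoints of each price range (midpoints are my simplification; the ranges above are the actual figures):

```python
# Pinecone-to-Qdrant cost ratio from the table's range midpoints.
# Midpoints are an illustrative simplification of the published ranges.

table = {
    "1M":   {"pinecone": 50,   "qdrant": 30},
    "10M":  {"pinecone": 250,  "qdrant": 65},
    "100M": {"pinecone": 2000, "qdrant": 225},
}

for scale, cost in table.items():
    ratio = cost["pinecone"] / cost["qdrant"]
    print(f"{scale}: Pinecone/Qdrant ~ {ratio:.1f}x")
```

The gap widens with scale because Pinecone bills per vector while a self-hosted VPS bill moves in coarse hardware steps.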

Where this fails

1. The pgvector recommendation assumes a healthy Postgres. If your team's Postgres is already at 80% CPU on the existing workload, adding 1M vector queries per hour will tip it over. Run the load test first.
2. Hybrid search is not a checkbox. Vendors that claim hybrid retrieval often mean "we run BM25 and vector search separately and union the results." Real hybrid (per-document score fusion, RRF, learned alpha tuning) is rare. If hybrid is critical, demo it on your corpus before committing.
3. Migrating between vector DBs is reindexing. Same warning as embedding models — switching vector DBs means re-embedding (if formats differ) or at minimum re-uploading every vector. Pick once with intention.
4. The "we serve 1B vectors" claim is doing work. Vendors that publish billion-scale numbers are running them at low recall (~85%) and low query rate (~10 QPS). For practitioner workloads at 99% recall and >100 QPS, the realistic ceiling per node is closer to 50-100M vectors.
5. Multi-tenant isolation. If you ship a SaaS where each customer needs an isolated index, naive single-collection-with-tenant-id schemes leak. Use per-tenant collections (Qdrant, Weaviate) or a true multi-tenant DB (Turbopuffer). Plan this on day one; it is painful to retrofit.
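The "real hybrid" that point 2 asks for — per-document score fusion — is small enough to show in full. Reciprocal rank fusion (RRF) combines ranked lists without needing comparable scores; k=60 is the damping constant commonly used in the RRF literature.

```python
# Reciprocal rank fusion: each doc scores sum(1 / (k + rank)) across the
# ranked lists it appears in. No score normalization needed, only ranks.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_top   = ["d1", "d3", "d2"]
vector_top = ["d2", "d1", "d4"]
print(rrf([bm25_top, vector_top]))  # ['d1', 'd2', 'd3', 'd4']
```

A vendor that does this (or better) per document is selling hybrid; a vendor that unions two result sets is selling two searches. Run exactly this kind of fusion on your own corpus during the demo.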

The contrarian summary

The dominant narrative is "pick a real vector DB, pgvector is for prototypes." The evidence says otherwise. At sub-10M-vector scale, pgvector matches Qdrant at 99% recall with roughly 60% of the throughput — still well above what most RAG pipelines need. The Supabase HNSW benchmark, the cost math, and the operational simplicity all favor staying in Postgres until you have a concrete reason to leave. The concrete reasons exist (hybrid as a first-class feature, billion-scale corpora, multi-modal queries, sub-ms latency) but they apply to a minority of projects. Most teams should not be migrating to a dedicated vector DB. They should be tuning pgvector.


Sources

  • Qdrant. Benchmarks page. Vendor-published HNSW comparisons across pgvector, Weaviate, Milvus.
  • Hetzner. Cloud pricing page. For self-host cost baselines.

Frequently asked

Is pgvector actually enough for production RAG?
For most teams, yes. Supabase's published HNSW benchmark shows pgvector matching or beating Qdrant at 1M scale with 99% accuracy; the HNSW index landed in pgvector 0.5.0, which is what made this possible. The "pgvector is for prototypes" framing is outdated. The threshold to leave pgvector is roughly: corpus over 10M vectors, throughput over 1K queries/sec, or hybrid-search-as-first-class requirement. Below those, the simpler stack (one Postgres instance) wins on operational cost and team focus.
Pinecone vs Qdrant — which?
Pinecone if you do not want to operate infrastructure and the per-vector cost is acceptable. Qdrant if you can run a $30/mo Hetzner VPS (handles 10M+ vectors) and want to keep cost under control. Qdrant's self-hosted story is the strongest in the OSS tier — strong HNSW implementation, sane defaults, the dashboard is usable. Pinecone serverless prices grow linearly with vectors; Qdrant grows with your VPS plan, which is closer to flat for most teams.
Why does every listicle leave pgvector off the 'best vector DB 2026' list?
Because the SERPs are dominated by SaaS vendor blogs (Pinecone, Weaviate, Qdrant, Zilliz) and roundup sites that follow vendor PR cycles. pgvector is a Postgres extension shipped by an open-source maintainer; it does not have a marketing budget. The contrarian 'pgvector is enough' angle is supported by the Supabase benchmark and HN production-RAG threads, but it does not surface in the listicle layer.
What about Milvus / LanceDB / Chroma?
Milvus is the right answer at billion-scale and almost never at smaller scale — its operational complexity does not pay back below 100M vectors. LanceDB is excellent for multi-modal (image + text + audio in one index) and weak elsewhere. Chroma is excellent for prototyping inside a Jupyter notebook and weak at production scale. Each has a clear niche — none of them is a general-purpose default.
Should I use Elasticsearch for vector search?
Only if Elasticsearch is already in your stack for keyword search and you want hybrid (BM25 + vector) in one query. OpenSearch with k-NN is the equivalent. As a fresh choice for vectors alone, neither is the right tool — they win on hybrid retrieval, not on pure vector performance.
