May 2023 · T1

pgvector Comes of Age — Putting AI Vector Search inside the RDB

Through 2023 the PostgreSQL extension pgvector advanced from v0.4.x to v0.5.x and exploded in popularity as a vector database for RAG (Retrieval-Augmented Generation) in the LLM era. Embeddings—high-dimensional vector representations of text or images—are stored inside ordinary PostgreSQL tables, queried by cosine similarity, L2 distance, or inner product. As RAG demand surged after ChatGPT's November 2022 release, pgvector's 'add an extension to your existing PostgreSQL' approach went head to head with purpose-built vector databases such as Pinecone, Weaviate, and Chroma. The major managed PostgreSQL services—AWS RDS, Azure Database for PostgreSQL, and Supabase—rolled out pgvector support in quick succession.

Diagram of Retrieval-Augmented Generation (RAG): documents are embedded into vectors, stored in a vector database, and retrieved to inform LLM-generated answers
Source: Gknor (Wikimedia Commons) · CC BY-SA 4.0

Metadata

Date: May 2023
Decade: 2020s
Tier: T1
Sources: 04
Connections: 00


In May 2023 the PostgreSQL extension pgvector went mainstream in a matter of weeks. On 3 May, AWS announced pgvector support in Amazon RDS for PostgreSQL. Azure, Supabase, and Google Cloud SQL followed in quick succession. By July, essentially every major managed PostgreSQL service treated pgvector as an official feature.

Six months after ChatGPT's release in November 2022, pgvector was beginning to give a clear answer to the LLM-era industry debate: specialised vector database, or PostgreSQL extension?

What a Vector Database Is — Essential RAG Infrastructure

Embedding models convert text into embeddings: high-dimensional vectors, typically of 768 to 3,072 floating-point dimensions. OpenAI's text-embedding-3-small produces 1,536 dimensions by default; text-embedding-3-large produces 3,072. Semantically similar text lands at nearby coordinates in this space.

LLM applications—and Retrieval-Augmented Generation (RAG) in particular—store such embeddings in bulk, embed a user's question into the same space, and search for nearby vectors. The retrieved documents are placed into the LLM's prompt so that fresh information, internal documents, or domain knowledge outside the model's training can shape its answers.
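The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is a toy illustration only: `embed` is a hypothetical word-count stand-in for a real embedding model, the three documents are made up, and names such as `retrieve` and `build_prompt` are assumptions introduced here, not part of any library.

```python
import math

def tokenize(text):
    """Lower-case words with surrounding punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]

def build_vocab(docs):
    """Assign each distinct corpus word one vector dimension."""
    vocab = {}
    for doc in docs:
        for word in tokenize(doc):
            vocab.setdefault(word, len(vocab))
    return vocab

def embed(text, vocab):
    """Toy embedding (NOT a real model): unit-length word-count vector."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def retrieve(question, docs, vocab, k=1):
    """Return the k documents most similar to the question.

    Vectors are unit length, so the dot product equals cosine similarity.
    """
    q = embed(question, vocab)
    scored = [(sum(a * b for a, b in zip(q, embed(d, vocab))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(question, context_docs):
    """Place the retrieved documents ahead of the question in the prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Made-up corpus for illustration.
DOCS = [
    "pgvector stores embeddings in PostgreSQL tables.",
    "HNSW is a graph-based index for nearest neighbour search.",
    "The cafeteria serves lunch from noon.",
]
VOCAB = build_vocab(DOCS)
QUESTION = "Where does pgvector store embeddings?"
TOP = retrieve(QUESTION, DOCS, VOCAB, k=1)
PROMPT = build_prompt(QUESTION, TOP)
print(PROMPT)
```

In a real deployment the embedding call would go to a model and the scan over `DOCS` would be replaced by a vector index; only the shape of the pipeline carries over.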

What makes this work is Approximate Nearest Neighbour (ANN) search: fast retrieval of the vectors closest to a query out of millions or billions, trading a small amount of recall for speed. The three standard measures are cosine distance (one minus cosine similarity), L2 (Euclidean) distance, and inner product. The mainstream ANN index types are HNSW (Hierarchical Navigable Small World), a layered proximity graph, and IVFFlat (Inverted File with Flat lists), which partitions vectors into lists around cluster centroids.
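As a minimal sketch, the three measures can be written out in plain Python, together with an exact (brute-force) nearest-neighbour search that ANN indexes approximate. The function names are illustrative choices made here. For orientation, pgvector exposes the same measures as SQL operators: `<->` for L2 distance, `<=>` for cosine distance, and `<#>` for negative inner product.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus cosine similarity; 0.0 means the same direction."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norm

def nearest(query, vectors, distance, k=1):
    """Exact search: indices of the k vectors with the smallest distance."""
    order = sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))
    return order[:k]

# The choice of measure changes the ranking: [3, 3] points the same way
# as the query but lies farther away in Euclidean terms.
vectors = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
query = [1.0, 1.0]
by_l2 = nearest(query, vectors, l2_distance)
by_cosine = nearest(query, vectors, cosine_distance)
print(by_l2, by_cosine)
```

The exact search above compares the query with every stored vector, which is the cost that HNSW and IVFFlat are designed to avoid.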

pgvector's Origins — Andrew Kane, 2021

pgvector is a PostgreSQL extension published on GitHub by Canadian engineer Andrew Kane in April 2021. Kane is a well-known OSS author from the Ruby on Rails community who has maintained a number of Ruby machine-learning libraries.

The original pgvector was simple. It offered only exact search by full scan, with no IVFFlat and no HNSW index, which was practical up to roughly 100,000 rows. In 2022, IVFFlat was added, raising the practical ceiling into the millions of rows.
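The difference between a full scan and an inverted-file index can be illustrated with a toy sketch. The `ToyIVF` class below is a hypothetical simplification written for this article, not pgvector's implementation: the centroids are supplied by hand rather than trained with k-means, and the data is one-dimensional in effect. It does show the core trade-off, which pgvector exposes through its `lists` index option and `ivfflat.probes` setting: scanning fewer lists is faster but can miss the true nearest neighbour.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVF:
    """Toy inverted-file index: one list of vectors per centroid."""

    def __init__(self, centroids):
        # Assumption: centroids are given; a real index would train them.
        self.centroids = centroids
        self.lists = [[] for _ in centroids]

    def _closest_lists(self, vec, n):
        """Indices of the n centroids nearest to vec."""
        order = sorted(range(len(self.centroids)),
                       key=lambda i: l2(vec, self.centroids[i]))
        return order[:n]

    def add(self, vec):
        """Store vec in the list of its nearest centroid."""
        self.lists[self._closest_lists(vec, 1)[0]].append(vec)

    def search(self, query, k=1, probes=1):
        """Scan only the `probes` nearest lists, then rank the candidates."""
        candidates = [v for i in self._closest_lists(query, probes)
                      for v in self.lists[i]]
        return sorted(candidates, key=lambda v: l2(query, v))[:k]

index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 0.0]])
for point in ([0.0, 0.0], [4.9, 0.0], [6.0, 0.0], [10.0, 0.0]):
    index.add(point)

query = [5.1, 0.0]
approx = index.search(query, k=1, probes=1)  # scans one list only
exact = index.search(query, k=1, probes=2)   # scans every list
print(approx, exact)
```

With one probe the search returns `[6.0, 0.0]`, because the true nearest point `[4.9, 0.0]` sits just across the boundary in the other list; probing both lists recovers it at the cost of scanning everything.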

The decisive turn came in August 2023, when pgvector 0.5.0 added HNSW. HNSW is the de facto algorithm of the vector-DB world; Pinecone, Weaviate, Milvus, and Qdrant all use it. With HNSW in pgvector, the performance gap to dedicated vector databases narrowed sharply.

The 2023 Landscape — Specialised vs Extension

After the shock of ChatGPT, the vector-DB market grew rapidly. Specialised vector databases included:

  • Pinecone (2019-, managed SaaS only) — raised at a US$750 million valuation in April 2023; the pioneer that made "RAG = Pinecone" a default mental model.
  • Weaviate (2019-, OSS + managed) — from the Netherlands; GraphQL-based API and hybrid search as strengths.
  • Milvus (2019-, donated to LF AI & Data; commercialised by Zilliz) — Chinese origin; strong on very large-scale workloads.
  • Qdrant (2021-, Rust, OSS + cloud) — Berlin-based; speed-focused newcomer.
  • Chroma (2022-, OSS) — developer-experience focused; close ties with LangChain.

Against these, pgvector hit a different need: "don't add another system to operate". If your application already runs on PostgreSQL, standing up a separate database purely for vector search is operational overhead. Transactions, joins, a consistent type system—keeping all of it inside PostgreSQL is simpler. That practical calculation pulled pgvector up rapidly.
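What "keeping it inside PostgreSQL" looks like in practice can be sketched as follows. The SQL uses pgvector's `vector` type, its `hnsw` index method (available from 0.5.0) and the `<=>` cosine-distance operator; the `authors` and `documents` tables, their columns, and the three-dimensional embeddings are hypothetical and chosen only to keep the example small. The Python part merely prepares the statements and parameters; executing them is assumed to happen through a driver such as psycopg, which is not shown.

```python
def to_vector_literal(values):
    """Format a sequence of numbers in pgvector's text form, e.g. '[1.0,2.5,3.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

# Hypothetical schema: embeddings live next to ordinary relational columns.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE authors (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    author_id bigint NOT NULL REFERENCES authors (id),
    body      text NOT NULL,
    embedding vector(3)  -- real embeddings would use e.g. vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
"""

# Vector similarity, a join, and a relational filter in a single query.
SEARCH_SQL = """
SELECT d.id, d.body, a.name,
       d.embedding <=> %(query)s::vector AS cosine_distance
FROM documents AS d
JOIN authors AS a ON a.id = d.author_id
WHERE a.name = %(author)s
ORDER BY d.embedding <=> %(query)s::vector
LIMIT 5;
"""

# Parameters as they might be passed to a driver's execute() call.
params = {"query": to_vector_literal([0.1, 0.2, 0.3]), "author": "alice"}
print(params["query"])
```

Because the embedding is just another column, inserts and updates to `documents` take part in ordinary transactions, which is the operational simplification the paragraph above describes.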

May 2023 — The Managed-PostgreSQL Consensus

On 3 May 2023, AWS announced official pgvector support in Amazon RDS for PostgreSQL. That triggered a cascade.

  • May 2023: AWS RDS for PostgreSQL.
  • May 2023: Supabase (effectively standard support, with "Vector Embeddings" promoted as a headline feature on the home page).
  • May-June 2023: Microsoft Azure Database for PostgreSQL preview.
  • July 2023: Google Cloud SQL for PostgreSQL.
  • October 2023: AWS Aurora PostgreSQL (new release including HNSW).
  • 2024: Neon, PlanetScale, and CockroachDB (other cloud OLTP players) follow.

"If you are running a managed PostgreSQL, you do not need a separate database for vector search." By late 2023, that had become the industry consensus.

"Add AI Capability to the Database" as an Architecture

pgvector's success is not merely about one extension. It established "add AI capability to the existing RDB" as a viable architectural choice, in opposition to the Pinecone-style "build a new database for AI" approach.

The same current has continued in 2024 and beyond. MySQL added a VECTOR type in version 9.0 (2024). SQL Server announced a VECTOR data type and DiskANN support in 2024. Oracle Database 23ai puts "AI Vector Search" among its headline features. Snowflake's Cortex Search exposes vector search through SQL. Every major DBMS is moving toward "vector search lives inside me".

That does not mean Pinecone, Weaviate, and the rest disappear. At billion-scale vectors, in heavy multi-tenant settings, with unusual distance functions, or with GPU acceleration, specialised products keep the edge. But for the majority of "ordinary RAG applications", the verdict is settling toward "PostgreSQL plus pgvector is enough".

What May 2023 Means

The mainstreaming of pgvector is the first concrete case in which the LLM wave reshaped existing data infrastructure. After ChatGPT, the industry faced a binary choice: build new AI-specific infrastructure, or extend what already exists. In the concrete area of vector search, pgvector answered: "extend it."

PostgreSQL is also a rare case of a system whose 1986 design decision, putting extensibility at the centre, paid off massively in an unforeseeable future. Stonebraker designed Berkeley Postgres's user-defined type system in 1986. Thirty-seven years later, the LLM era's standard data type, the vector, was added to PostgreSQL as an extension. "A DB that bet on extensibility can adapt to a future it never imagined." pgvector is the proof.

Sources

  1. Primary: Azure Database for PostgreSQL: pgvector is now generally available

    Accessed 2026-05-25

  2. Secondary: Retrieval-augmented generation (Wikipedia)

    Accessed 2026-05-25
