DF
December 29 2025
Updated December 29 2025

pg_textsearch for PostgreSQL - BM25 full-text search and relevance ranking

pg_textsearch is an open-source PostgreSQL extension that implements full-text search with relevance ranking based on the BM25 (Best Matching 25) algorithm. BM25 is a standard ranking function in search engines because it accounts for term frequency, term rarity across the corpus, and document length. With pg_textsearch, PostgreSQL can serve relevance-ranked search directly, which for many workloads removes the need for an external search engine such as Elasticsearch or Algolia.
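
For reference, the textbook Okapi BM25 scoring function is sketched below. It is the commonly cited formulation and not necessarily the exact variant pg_textsearch implements. Here f(t, D) is the number of times term t occurs in document D, |D| is the document length, avgdl is the average document length in the corpus, N is the number of documents, and n(t) is the number of documents containing t. The parameters k_1 (often around 1.2) and b (often 0.75) are conventional defaults, assumed here for illustration.

```latex
\mathrm{score}(D, Q) = \sum_{t \in Q} \mathrm{IDF}(t)\,
  \frac{f(t, D)\,(k_1 + 1)}
       {f(t, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(t) = \ln\!\left(\frac{N - n(t) + 0.5}{n(t) + 0.5} + 1\right)
```

The IDF factor rewards rare terms, the fraction saturates as f(t, D) grows, and the |D| / avgdl ratio penalizes documents that are longer than average.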

The project’s source code and documentation are available on GitHub:

https://github.com/timescale/pg_textsearch

Practical use cases

The pg_textsearch extension is especially useful when you need to:

  • add relevance-ranked full-text search to applications without a dedicated search server;
  • improve result ranking quality compared to PostgreSQL’s built-in ts_rank;
  • build hybrid search by combining BM25 ranking with semantic embedding search (for example, via pgvector / pgvectorscale) for AI and RAG applications;
  • use PostgreSQL as a unified data and search stack, simplifying architecture and reducing operational costs.

Typical scenarios: searching product catalogs, documentation, blogs, chatbots, recommendation systems, and dashboards.
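
The hybrid search case can be illustrated with a small sketch. One common way to combine a keyword ranking with an embedding ranking is Reciprocal Rank Fusion (RRF); this is a technique chosen here for illustration, not something the extension prescribes. The document ids, the two input rankings, and the function name are hypothetical, and k = 60 is a commonly used smoothing constant rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked well by several retrievers accumulate the most.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep output stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, each ordered best-first.
bm25_ranking = [3, 1, 2]    # e.g. ids returned by a BM25 keyword query
vector_ranking = [1, 4, 3]  # e.g. ids returned by an embedding query

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # [1, 3, 4, 2]
```

RRF uses only rank positions, so the BM25 scores and the vector distances never need to be put on a common scale.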

How to install pg_textsearch

1. Requirements

PostgreSQL v17 or v18 (these versions are supported by the extension).

2. Installing the extension

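Once the pg_textsearch files are installed on the PostgreSQL server (see the GitHub repository for build and packaging instructions), enable the extension in the target database: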

CREATE EXTENSION pg_textsearch;

3. Creating a BM25 index

Example table:

CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT
);

INSERT INTO documents (content) VALUES
('PostgreSQL is a powerful database system'),
('BM25 is an effective ranking function'),
('Full-text search with custom scoring');

Creating a BM25 index:

CREATE INDEX docs_bm25_idx
ON documents
USING bm25(content)
WITH (text_config = 'english');

4. Running a search

SELECT *
FROM documents
ORDER BY content <@> 'database system'
LIMIT 10;

The <@> operator returns the BM25 score negated, so a plain ascending ORDER BY puts the most relevant documents first: the lower (more negative) the value, the better the match.

Example of a real search query and ranking explanation

Below is a practical example of a full-text search query using pg_textsearch and an explanation of why certain documents are ranked higher than others.

SELECT
    id,
    content,
    content <@> 'postgresql database search' AS bm25_score
FROM documents
ORDER BY bm25_score
LIMIT 5;

In this example, the query searches for documents related to “postgresql”, “database”, and “search”, and ranks them using the BM25 algorithm.

Documents that contain all query terms usually receive a better (lower) BM25 score than those matching only one or two terms. At the same time, BM25 does not simply reward repetition: each additional occurrence of the same word adds less than the previous one, so the contribution of a term levels off at a saturation point. This prevents documents stuffed with repeated keywords from dominating the results.
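
The saturation effect is easy to see numerically. The sketch below evaluates only the term-frequency factor of the textbook BM25 formula for a document of average length, assuming the conventional k1 = 1.2; the function name is hypothetical and the extension's internal arithmetic may differ.

```python
def tf_component(tf, k1=1.2):
    """Textbook BM25 term-frequency factor for an average-length document."""
    return tf * (k1 + 1) / (tf + k1)


# The factor grows quickly at first, then flattens below k1 + 1 = 2.2.
for tf in (1, 2, 5, 20, 100):
    print(tf, round(tf_component(tf), 3))
```

Going from one occurrence to two raises the factor noticeably, while going from 99 to 100 barely changes it.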

BM25 also accounts for document length. Shorter documents that match the query terms precisely are often ranked higher than longer texts where the same terms appear less densely. In addition, rare terms such as “postgresql” typically contribute more to the final score than common words, because BM25 incorporates inverse document frequency (IDF).

As a result, the top-ranked documents are those that balance term frequency, term rarity, and document length, providing more meaningful and relevant search results compared to basic frequency-based ranking.
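
To make the ranking concrete, here is a minimal Python sketch that scores the three sample documents with the textbook BM25 formula. It rests on several assumptions: a naive lowercase tokenizer with no stemming or stop-word removal (unlike the 'english' text configuration), the conventional k1 = 1.2 and b = 0.75, and hypothetical helper names. It illustrates the algorithm and will not reproduce the extension's exact scores.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer: lowercase alphanumeric runs, no stemming.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(docs, query, k1=1.2, b=0.75):
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        norm = k1 * (1 - b + b * len(tokens) / avgdl)  # length normalization
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


docs = [
    "PostgreSQL is a powerful database system",
    "BM25 is an effective ranking function",
    "Full-text search with custom scoring",
]
scores = bm25_scores(docs, "postgresql database search")
# Negate so that ascending order means "most relevant first".
negated = [0.0 - s for s in scores]
ranked = sorted(range(len(docs)), key=lambda i: negated[i])
for i in ranked:
    print(round(negated[i], 3), docs[i])
```

With this toy tokenizer the first document matches two query terms, the third matches one, and the second matches none, so they are ranked in that order.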

Frequently Asked Questions (FAQ)

  • Q: How is pg_textsearch better than PostgreSQL’s built-in FTS?
    A: PostgreSQL’s built-in ts_rank scores matches mainly by term frequency within each document. It does not use corpus-wide statistics such as inverse document frequency (IDF), it has no BM25-style TF saturation, and its length normalization is optional rather than relative to the average document length. pg_textsearch implements BM25, which takes all of these factors into account.
  • Q: Can pg_textsearch be used on large tables?
    A: Yes, it supports partitioned tables and works with real-world data volumes.
  • Q: Are multiple languages supported?
    A: Yes. You can use standard PostgreSQL text search configurations (english, french, german, and others) via the text_config parameter.
  • Q: Does pg_textsearch require a separate server or service?
    A: No. The extension runs in the same PostgreSQL database where your data is stored, which reduces architectural complexity.

Conclusion

pg_textsearch is a powerful PostgreSQL extension that brings modern BM25-ranked full-text search directly into a relational database. It helps simplify application architecture, improve search quality, and reduce dependence on external search systems. This is especially valuable for startups, MVPs, and hybrid AI applications where having a unified data and search stack is critical.

If you need to add relevance-ranked search quickly and without extra services, pg_textsearch is a practical option for developers and database architects.
