What this usually means
PostgreSQL's full-text search (FTS) is sensitive to text analysis configuration. The most common cause is a mismatch between the text search configuration used to build the tsvector and the configuration used to parse the query. For example, storing a tsvector built with 'english' configuration but querying with 'simple' or a different language. Another frequent culprit is that the search terms are stop words (common words filtered out by the configuration), so they produce an empty tsquery. Additionally, the index might not be used if the query's WHERE clause doesn't match the indexed expression exactly, or if statistics are outdated. Silent failures also occur when the tsvector column is NULL or when the data contains characters that are discarded during parsing (e.g., punctuation).
The first ten minutes — establish facts before touching code.
- 1Run EXPLAIN ANALYZE on the full-text search query to see if the index is used and what the estimated vs actual row counts are.
- 2Test the tsvector directly: SELECT to_tsvector('english', body) FROM your_table LIMIT 5; check if it contains the search term.
- 3Test the tsquery: SELECT to_tsquery('english', 'yoursearchterm'); ensure it returns a valid lexeme, not an empty string.
- 4Check if the search term is a stop word: SELECT * FROM ts_debug('english', 'yoursearchterm'); if lexeme is empty, it's a stop word.
- 5Verify index expression matches the query: \d+ your_index; compare the index definition with the WHERE clause.
- 6Check for NULL tsvector columns: SELECT count(*) FROM your_table WHERE your_tsvector_column IS NULL;
The specific files, logs, configs, and dashboards that usually own this bug.
- searchPostgreSQL logs (pg_log directory) for any errors or warnings about text search configuration.
- searchThe table definition: \d your_table to see the tsvector column and its generation expression (if using generated columns).
- searchThe index definition: \d+ your_index to verify the expression matches the query.
- searchpg_stat_user_tables to check for sequential scans on the table.
- searchStop word files in the PostgreSQL share directory for the used configuration (e.g., /usr/share/postgresql/XX/tsearch_data/english.stop).
- searchApplication code that builds the tsquery – often the query is built incorrectly in ORM layers.
- searchThe database's default text search configuration: SHOW default_text_search_config;
Practical causes, not theory. These are the things you will actually find.
- warningMismatched text search configuration between tsvector and tsquery (e.g., 'english' vs 'simple' or different languages).
- warningSearch term is a stop word filtered out by the configuration, resulting in an empty tsquery.
- warningTsvector column is NULL or not updated after row insert (e.g., missing trigger or generated column).
- warningIndex expression does not exactly match the query's WHERE clause (e.g., index uses to_tsvector('english', body) but query uses to_tsvector('simple', body)).
- warningData contains characters that are ignored by the parser (e.g., numbers, punctuation) or the search term is part of a multi-word phrase but parsed differently.
- warningOutdated table statistics causing the planner to choose a sequential scan over the index.
Concrete fix directions. Pick the one that matches your root cause.
- buildStandardize the text search configuration: always use the same configuration for both tsvector and tsquery, preferably the default one or a custom configuration.
- buildUse a generated tsvector column (PostgreSQL 12+) with ALTER TABLE ... ADD COLUMN ... GENERATED ALWAYS AS (...) STORED to ensure consistency.
- buildCreate a GIN index on the tsvector column exactly matching the expression used in queries.
- buildFor stop words, either use a custom text search configuration that includes them, or search with the 'simple' configuration which does not filter stop words.
- buildUpdate table statistics with ANALYZE to help the planner use the index.
- buildIf using ORM, ensure the query is built correctly – test the raw SQL generated.
A fix you cannot prove is a guess. Close the loop.
- verifiedRun the full-text search query again after the fix and confirm results appear.
- verifiedTest with a word that is definitely not a stop word (e.g., 'database', 'server').
- verifiedVerify the index is being used: EXPLAIN (ANALYZE, BUFFERS) your_query and look for 'Bitmap Heap Scan' or 'Index Scan'.
- verifiedConfirm tsvector column has values: SELECT count(*) FROM your_table WHERE your_tsvector_column IS NOT NULL;
- verifiedTest edge cases: empty string, single character, special characters, multi-words.
- verifiedCheck application logs for any ORM-generated queries and run them directly in psql.
Things that make this bug worse or harder to find.
- warningAssuming the index will match any query – PostgreSQL requires exact expression match.
- warningForgetting to update the tsvector column after data changes (if not generated).
- warningUsing different configurations for different tables without documenting.
- warningIgnoring stop words – they are silently ignored, not errors.
- warningOver-indexing – creating multiple full-text search indexes on different configurations can confuse developers.
- warningNot testing with EXPLAIN ANALYZE before and after changes.
Production Search Returns Zero Results After Upgrade
Timeline
- 09:15Alert: User-facing search returns no results for common queries
- 09:20Check Heroku logs: no errors, application up
- 09:25Run query in psql: SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database'); returns 0 rows
- 09:30Test with ILIKE: SELECT * FROM articles WHERE body ILIKE '%database%'; returns 500 rows
- 09:35Check tsvector: SELECT to_tsvector('english', body) FROM articles LIMIT 1; returns 'databas':42 - plural issue?
- 09:40Check tsquery: SELECT to_tsquery('english', 'database'); returns 'databas' - stemmed correctly
- 09:45EXPLAIN ANALYZE: sequential scan, no index used
- 09:50Check index: \d+ idx_fts_articles; shows expression: to_tsvector('simple', body)
- 09:55Root cause: Index built with 'simple' config, query uses 'english'. Also tsvector column not generated, trigger missing on bulk update.
- 10:05Fix: Rebuild index with 'english' config and add trigger to update tsvector.
- 10:15Search returns results. Monitor: no further issues.
The alert came in as a P1: user search on our knowledge base returned zero results for any query. Our search was powered by pg_search gem which uses PostgreSQL full-text search. Initial panic – is the database down? No, other queries worked. I jumped into psql and ran the same query the app used. Nothing. But a simple ILIKE found plenty of rows. That's the classic sign of a full-text search configuration problem.
I checked the tsvector output and saw that words were stemmed correctly (e.g., 'database' became 'databas'). The tsquery also stemmed correctly. So why no matches? I ran EXPLAIN ANALYZE and saw a sequential scan – the index wasn't used. That's when I checked the index definition: it was built with 'simple' configuration, but our query used 'english'. The index was useless for our queries. Also, the tsvector column was not a generated column – it was updated only on insert via a trigger, but a recent bulk import bypassed the trigger.
I dropped the old index, created a new one with 'english' config, added a trigger to update the tsvector on every update, and ran a manual update to backfill. After ANALYZE, the query returned results. The lesson: always match index configuration to query configuration, and use generated columns or bulletproof triggers. I also added a test that compares full-text search results to ILIKE results to catch this early.
Root cause
Index built with 'simple' text search configuration while query uses 'english', causing index to never match. Additionally, tsvector column was not updated on bulk data import.
The fix
Rebuilt GIN index with 'english' configuration, added trigger to update tsvector on row update, and backfilled existing rows.
The lesson
Always ensure consistency between text search configuration used in index expressions and queries. Use generated columns if possible.
PostgreSQL's full-text search relies on text search configurations that define parser, dictionaries, and stop words. When you call to_tsvector('english', body), it uses the 'english' configuration to tokenize and stem words. Similarly, to_tsquery('english', 'search') parses the query with the same configuration. If these configurations differ, the lexemes may not match. For example, 'english' stems 'running' to 'run', while 'simple' keeps 'running' as is. So a query with 'english' looking for 'run' won't match a vector built with 'simple' that has 'running'.
To debug, always verify both sides: SELECT to_tsvector('config', body) FROM table WHERE id = 1; and SELECT to_tsquery('config', 'word');. If the lexemes look different (e.g., one has a suffix the other doesn't), you have a mismatch. Fix by standardizing on one configuration across your entire codebase and indexes.
Every text search configuration has a set of stop words that are filtered out during lexeme generation. For example, the 'english' configuration removes words like 'the', 'and', 'is'. If a user searches for 'the', to_tsquery('english', 'the') returns an empty tsquery (since 'the' is a stop word). The query then becomes @@ to_tsquery('') which matches nothing. This is by design, but often confuses developers.
To detect, use ts_debug: SELECT * FROM ts_debug('english', 'the');. If the lexeme field is empty, it's a stop word. Solutions: use a custom configuration that removes stop words from the stop list, or use the 'simple' configuration which has no stop words. However, 'simple' does not do stemming, so be aware of the trade-off.
A GIN index on a tsvector expression is only used if the query's WHERE clause contains the exact same expression. For example, if the index is CREATE INDEX idx ON articles USING GIN (to_tsvector('english', body)), then the query WHERE to_tsvector('english', body) @@ to_tsquery('english', 'word') will use it. But if the query uses a different configuration (e.g., 'simple'), the index won't match – PostgreSQL falls back to a sequential scan.
Also, if table statistics are outdated, the planner might think a sequential scan is cheaper even if the index exists. Always run ANALYZE after large data changes. Use EXPLAIN (ANALYZE, BUFFERS) to see actual rows vs estimated rows. If estimates are way off, ANALYZE fixes it.
Keeping tsvector columns in sync with source data is critical. PostgreSQL 12+ introduced generated columns: ALTER TABLE articles ADD COLUMN fts_vector tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED; This ensures the vector is always up-to-date. For older versions, you need triggers on INSERT and UPDATE. But triggers can be missed during bulk loads or migrations.
The fix is to use a generated column if possible, or ensure your trigger covers all paths. After adding a trigger, manually update existing rows: UPDATE articles SET fts_vector = to_tsvector('english', body); (if column exists) or rebuild index. Verify with SELECT count(*) WHERE fts_vector IS NULL;
Frequently asked questions
Why does full-text search return no results for common words like 'the' or 'and'?
These words are stop words. PostgreSQL's default text search configurations filter them out because they are too common to be useful for search. Use ts_debug to check if a word is a stop word. If you need to search for them, use a custom configuration with an empty stop word list or use the 'simple' configuration.
My index exists but EXPLAIN shows a sequential scan. What's wrong?
The index is not being used because the query's expression does not match the index expression exactly. Check the index definition with \d+ index_name and compare to the WHERE clause. Common mismatch: different text search configuration (e.g., 'english' vs 'simple'). Also run ANALYZE to update statistics.
Can I use multiple text search configurations in the same query?
Yes, but it's not recommended because it can lead to confusion and maintenance issues. If you need to support multiple languages, consider creating separate tsvector columns for each language and an index on each. Then query the appropriate column based on user preference.
How do I create a custom text search configuration without stop words?
First, copy the default configuration: CREATE TEXT SEARCH CONFIGURATION my_config (COPY = english); Then, set an empty stopword file: ALTER TEXT SEARCH CONFIGURATION my_config ALTER MAPPING FOR word WITH simple; This uses the 'simple' dictionary which has no stop words. Test with ts_debug.
What is the difference between to_tsvector and plainto_tsquery?
to_tsvector converts text into a tsvector lexeme list. to_tsquery parses a query string into tsquery with operators. plainto_tsquery simplifies a user input into tsquery by adding & (AND) between words. Both rely on the same text search configuration. Mismatches between these functions can cause empty results.