What this usually means
When MySQL sees multiple single-column indexes that could each filter part of the WHERE clause, it may try to merge them using the index_merge optimization. This uses 'union' or 'intersect' algorithms to combine rowid sets from each index, then does a secondary sort and row lookup. On large tables, the rowid sets can be huge, causing massive temporary storage, sorting, and random IO. The optimizer's cost estimates often underestimate this overhead, leading to a bad plan. The fix is usually to either force a single index with optimizer hints or create a covering composite index that eliminates the merge entirely.
The first ten minutes — establish facts before touching code.
- 1Capture the slow query from slow_query_log or performance_schema; extract the exact SQL.
- 2Run EXPLAIN FORMAT=JSON <query> and look for 'index_merge', 'union', 'intersect', or 'sort_union' in the query_block.
- 3Check the 'rows_examined_per_scan' and 'cost_info' in the JSON output to see if the merge plan's cost is unexpectedly high.
- 4Compare with EXPLAIN after adding FORCE INDEX (idx_single) to see if a single index improves the estimated rows.
- 5If the query runs in <100ms with a simple index but >1s with merge, you've confirmed the issue.
- 6Check optimizer_switch status: SELECT @@optimizer_switch; look for 'index_merge=on' (default).
The specific files, logs, configs, and dashboards that usually own this bug.
- searchMySQL slow query log: /var/log/mysql/mysql-slow.log or configured path
- searchperformance_schema.events_statements_history_long for recent queries
- searchEXPLAIN FORMAT=JSON output for the problematic query
- searchINFORMATION_SCHEMA.OPTIMIZER_TRACE (set optimizer_trace='enabled=on') for detailed optimizer decisions
- searchSHOW INDEX FROM table_name to see available indexes and their cardinality
- searchMySQL error log for any optimizer warnings or cost model changes
- searchpt-query-digest output if using Percona Toolkit
Practical causes, not theory. These are the things you will actually find.
- warningMultiple single-column indexes on columns used in WHERE clauses combined with OR or AND
- warningLarge rowid sets: index merge becomes expensive when each index selects >10% of rows
- warningOutdated index statistics: ANALYZE TABLE not run after significant data changes
- warningOptimizer cost constants not tuned for modern hardware (e.g., default disk seek cost)
- warningUsing OR conditions that cannot be satisfied by a single index
- warningMissing composite index that covers the query's WHERE + ORDER BY or GROUP BY
Concrete fix directions. Pick the one that matches your root cause.
- buildForce a single index with FORCE INDEX (index_name) or USE INDEX, then test performance
- buildReplace multiple single-column indexes with a composite index covering all WHERE columns
- buildRewrite OR conditions as UNION ALL queries if each branch can use a different index efficiently
- buildDisable index_merge for the session or globally: SET optimizer_switch='index_merge=off' (last resort)
- buildFor AND conditions, ensure a composite index exists with leading columns matching the most selective filters
- buildIncrease sort_buffer_size if index_merge involves sorting (sort_union), but prefer index changes
A fix you cannot prove is a guess. Close the loop.
- verifiedCompare EXPLAIN output before and after fix: the 'index_merge' should disappear, replaced by 'Using where' or 'Using index condition'
- verifiedRun the query 10 times, record max and avg latency; should be < 1/10th of pre-fix time
- verifiedCheck Handler_% status variables: SHOW STATUS LIKE 'Handler_read%'; expect drop in Handler_read_rnd_next
- verifiedMonitor InnoDB buffer pool reads: SHOW ENGINE INNODB STATUS\G; check 'Number of rows read'
- verifiedVerify with pt-query-digest that the query no longer appears in the top-N slow queries
- verifiedRun ANALYZE TABLE after creating new indexes to keep statistics fresh
Things that make this bug worse or harder to find.
- warningAdding more single-column indexes hoping one will be used — can make index merge worse
- warningDisabling index_merge globally without understanding impact on other queries that benefit from it
- warningForgetting to run ANALYZE TABLE after creating composite indexes — old stats mislead optimizer
- warningUsing FORCE INDEX on a query that will receive different WHERE values — test with representative data
- warningAssuming index merge is always bad: it can be efficient for small rowid sets or on SSD with low latency
- warningIgnoring the ORDER BY clause: index merge may cause filesort even if a single index could provide sorted output
E-commerce order search crawls after 5M orders
Timeline
- 09:15Alert: order search endpoint p99 latency spikes to 12s (normally 200ms)
- 09:18Check slow query log: SELECT * FROM orders WHERE status='pending' OR priority='high' ORDER BY created_at DESC LIMIT 20
- 09:22EXPLAIN shows 'Using union(idx_status,idx_priority); Using where; Using filesort' with rows=4.2M
- 09:30Confirm index merge: both status and priority have single-column B-tree indexes
- 09:35Force index idx_created_at: query runs in 150ms
- 09:45Create composite index on (status, priority, created_at)
- 09:50Drop old single-column indexes; ANALYZE TABLE
- 09:55Query latency back to 180ms; alert clears
At 09:15, our monitoring showed the order search endpoint suddenly taking 12 seconds. The team was scrambling because this endpoint powers the admin dashboard. I pulled the slow query log and found a simple SELECT with two OR conditions on 'status' and 'priority'. The query had been fine for months with under 1M orders, but we'd crossed 5M overnight due to a bulk import.
EXPLAIN showed 'Using union(idx_status,idx_priority)' with 4.2 million rows examined. The index merge was fetching rowid lists from both indexes, merging them, then doing a filesort. That's 4.2M row lookups plus a massive sort on disk. I forced the query to use the index on created_at — suddenly it only scanned 20 rows and returned instantly. The root cause was clear: index merge was optimizing for the wrong thing.
I created a composite index on (status, priority, created_at) which covered both the WHERE and ORDER BY. After dropping the old indexes and running ANALYZE TABLE, the query consistently ran under 200ms. The lesson: never assume multiple single-column indexes are better than one well-designed composite index, especially under OR conditions.
Root cause
MySQL's index merge optimization combined two large rowid sets from idx_status and idx_priority, causing 4.2M row lookups and a disk-based filesort.
The fix
Created composite index (status, priority, created_at) covering both WHERE conditions and ORDER BY, then dropped the single-column indexes.
The lesson
Index merge can be a trap on large tables; always verify with EXPLAIN and prefer composite indexes that satisfy the entire query.
MySQL's cost-based optimizer evaluates multiple access paths. For a WHERE with OR conditions, it compares the cost of full table scan, each single index scan, and index merge. The index merge cost is estimated as (cost of index1 scan + cost of index2 scan + cost of rowid union + cost of row lookup). However, the row lookup cost is often underestimated because it assumes sequential IO, while in reality the rowid sets can be sparse, causing random IO.
The cost constants in mysql.engine_cost and mysql.server_cost (e.g., row_evaluate_cost, io_block_read_cost) are tuned for HDD era. On modern SSDs with low latency, the actual cost of random reads is much lower, but the merge overhead of sorting rowid sets and temporary table creation remains high. You can inspect these with SELECT * FROM mysql.server_cost and adjust if needed, but the better fix is to avoid merge altogether.
The JSON output is more detailed than tabular EXPLAIN. Look for 'index_merge' key under 'access_type'. Under 'index_merge' you'll see 'keys' (list of indexes used) and 'rows_in_scan' for each. The 'rows' field is the sum of rowid retrievals. If 'filtered' is low (e.g., 5%), it means the index merge still needs additional filtering after row lookup. Pay attention to 'sort_union' if there's a filesort step — that's a red flag.
Example JSON snippet: "access_type": "index_merge", "index_merge": { "keys": ["idx_status","idx_priority"], "rows_in_scan": 4200000, "algorithm": "union" }. That's 4.2M rowid reads. Compare with a single-index plan: "access_type": "ref", "key": "idx_created_at", "rows": 20. The difference is orders of magnitude.
Index merge is not always evil. On tables with highly selective indexes (each returning <1% of rows) and with OR conditions, union can be efficient because the rowid sets are small and the merge is cheap. Similarly, intersect for AND conditions can be beneficial if each index filters many rows. The problem occurs when selectivity is low (e.g., each index returns 20% of rows) or when the rowid sets are large.
A quick heuristic: if the product of estimated rows from each index (e.g., rows_idx1 * rows_idx2) is greater than the table size, the merge will likely scan most rows. Use pt-index-usage or Percona's schema analysis to identify queries that heavily use index merge and test with composite indexes.
For a deeper dive, enable optimizer trace: SET optimizer_trace='enabled=on', end_markers_in_json=on; then run the query. SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G will show the full decision tree, including cost estimates for each access path. Look for 'index_merge' trace and compare 'cost' vs 'chosen'. You'll often see that the chosen plan's cost is higher than an alternative, indicating a costing bug.
Example: you might see "index_merge": { "cost": 100000, "rows": 4000000 } while "ref": { "cost": 50000, "rows": 1000 }. If the optimizer still chose index_merge, check if the 'ref' path was pruned due to OR conditions or because the index is not considered usable. The trace shows why — look for 'restricted_columns'.
The best defense against index merge is a solid index design strategy. For tables with multiple filterable columns, create composite indexes that cover the most common query patterns. Use the 'index merge' warning in slow query log as a signal to review your schema. Tools like pt-index-usage can analyze your slow log and suggest missing indexes.
Remember that MySQL 8.0 introduced 'functional indexes' and 'descending indexes' which can further optimize ORDER BY. Also consider using generated columns to create indexes on expressions. Finally, set up regular index health checks: run ANALYZE TABLE weekly and monitor performance_schema.table_io_waits_summary_by_index_usage to identify unused indexes that could be dropped.
Frequently asked questions
Does index merge always cause slow queries?
No. Index merge can be efficient when each index is highly selective (returns small rowid sets). It becomes problematic when the rowid sets are large (e.g., >10% of table each) because the merge and row lookups cause excessive IO. Always verify with EXPLAIN.
How do I force MySQL to use only one index?
Use FORCE INDEX (index_name) in the query: SELECT * FROM t FORCE INDEX (idx_single) WHERE ... . This tells the optimizer to consider only that index. Alternatively, USE INDEX is a hint but can be overridden. For production, prefer composite indexes over hints.
Should I disable index_merge globally?
Not recommended. Disabling index_merge globally (SET GLOBAL optimizer_switch='index_merge=off') can harm queries that legitimately benefit from it. Instead, fix the specific query by adding a composite index or using optimizer hints. Only disable as a temporary workaround.
What's the difference between union, sort_union, and intersect?
Union merges rowid sets from OR conditions; sort_union is used when the rowids need sorting before union (e.g., with ORDER BY); intersect merges rowid sets from AND conditions. All three can be slow with large rowid sets. Look for 'Using union' or 'Using intersect' in EXPLAIN Extra.
How does ANALYZE TABLE affect index merge?
ANALYZE TABLE updates index cardinality estimates, which the optimizer uses to decide whether index merge is cheaper than alternatives. Stale statistics can cause the optimizer to choose merge when a single index would be better. Run ANALYZE after significant data changes or index creation.