When IN(…) is Not Enough
How to improve MySQL performance using JSON_TABLE in place of IN(…).
MySQL’s IN(…)
operator often appears in performance-critical queries — especially when fetching rows by IDs provided externally.
But what if this common pattern can be improved?
In this article, you’ll learn how to use JSON_TABLE
MySQL function to sidestep limitations of IN(…)
operator, even when you’re not working with JSON data.
If your service relies on MySQL and frequently executes such queries, keep reading.
Note: Tested around the date of publication on MySQL 8.0.40 (AWS Aurora). Your mileage may vary slightly across versions.
The Catch with IN(…) and Prepared Statements
Some patterns often justify reaching for its IN(…)
operator.
For example, when the code needs to fetch multiple records using IDs received externally (coming from an API, a webform, and so on).
Consider the query shaped like this:
SELECT * FROM big_table WHERE id IN(…)
This query makes sense when the number of query parameters is not known upfront, otherwise it could be written like this (2 parameters option):
SELECT * FROM big_table WHERE id=? OR id=?
But even if the query seemingly stays the same, as long as the number of parameters for the IN(…)
operator varies between executions, MySQL treats them as different queries.
That is:
SELECT * FROM big_table WHERE id IN(?, ?)
and
SELECT * FROM big_table WHERE id IN(?, ?, ?)
have different number of parameters — 2 and 3 respectively — and cannot be treated as a single shape of a query by MySQL.
This distinction is important if you consider queries executed frequently, over and over again, which is often the case in services backed by MySQL. In such scenarios, to improve performance, it is recommended to use prepared statements : let MySQL parse and build an internal cached representation of a query for a session, and then execute such prepared statement with different parameters. It follows the same principle as session reuse — reducing repeated setup overhead by keeping a prepared statement across multiple queries.
But in the above example two queries have different number of parameters, so they cannot be covered by the same prepared statement.
A Better Way with JSON_TABLE
Even though this use case has nothing to do with JSON, MySQL JSON functions can help here, specifically, the JSON_TABLE function. It extracts data from a JSON document and returns it as a relational table with the specified columns. Its syntax looks like this:
JSON_TABLE(expr, path COLUMNS (column_list)) [AS] alias
An example to make it more clear:
SELECT * FROM JSON_TABLE(
'["1","100","1000"]',
'$[*]' COLUMNS (
id BIGINT UNSIGNED PATH '$' ERROR ON EMPTY ERROR ON ERROR
)
) AS jt;
This call:
- takes in a json array
["1","100","1000"]
(note how numbers can optionally be quoted); - applies json path expression
$[*]
that means “take all elements of a root array”; - converts each matching value into a single-column table row, using json path
$
to select the whole value (string): “id” is the column name, andbigint unsigned
its type.
When run against MySQL, it produces this result:
+------+
| id |
+------+
| 1 |
| 100 |
| 1000 |
+------+
This function effectively allows creating an implicit temporary table, which the main table can be joined against to filter required rows. Queries like:
SELECT * FROM big_table WHERE id IN(?, ?, …, ?)
can be translated into a combination of a join over a common table expression on an implicit temporary table expanded from a JSON_TABLE
function:
WITH t1 AS (
SELECT id FROM JSON_TABLE(
?, -- a single placeholder
'$[*]' COLUMNS (id BIGINT UNSIGNED PATH '$' ERROR ON EMPTY ERROR ON ERROR)
) AS jt
) SELECT big_table.* FROM big_table, t1 WHERE big_table.id=t1.id
Why This Can Be Faster
When shaped like this, the query has only one placeholder taking a json array value.
By using a single placeholder for the JSON array (regardless of the array size), this query can be effectively used as a prepared statement, which improves resource reuse.
This approach is more efficient than a traditional query with IN(…)
operator, even in its most optimal use case — filtering on a table primary key.
Another curious benefit, more important on older MySQL engine versions, is that versions 8.0.22 and up to 8.0.30 had a bug ( fixed in 8.0.31 ), where queries using IN(…)
operator took an increasingly excess amount of CPU time as the list of values grew in length.
Queries using the JSON_TABLE
function to expand json array into a temporary table were fast independent on the number of array values, whereas performance of queries relying on IN(…)
operator was dependent on the number of arguments used.
When to Reach for This Trick
Queries involving IN(…)
can seem as optimized as they get — especially when hitting indexed primary keys.
But if those queries vary in the number of values passed, you’re likely losing performance by bypassing prepared statement caching.
Consider rewriting one such query using the json_table
approach.
Try it as a prepared statement , and watch your performance telemetry.
Especially (but not exclusively) on older MySQL versions, the gains may surprise you.