I have a MOVIE table. My data looks, something like this
- a relational table with various text, numeric, temporal types
- several JSON objects or arrays of objects stored as JSON
And here’s exactly what it looks like:
REST API: GET the distinct list of GENRES
If each row had a single value, acquiring a distinct list of those values would be very straightforward, e.g. the list of years the library encompasses:
select distinct YEAR from movie order by 1 asc;
But alas, I don’t have a single value per row. I have these JSON objects or array of objects. I need a distinct list of say, GENREs or AWARDS or the CAST columns?
Enter the ‘Simple Dot [.] Notation for SQL over JSON’
I’m not going to sit here and tell you I figured this out on my own. What I did was read my own posts and do some googling and came up empty, probably because I had a hard time asking the right question.
So I cheated. I pinged the folks that wrote the code, earned the patents, wear the {JSON} hats and t-shirts around here.
Josh is awesome, go follow Josh.
So, does this work (I mean, of course it does, if I can follow the directions.)
select distinct genre
from movie,
json_table ( genres,'$[*]'
columns (
genre path '$'
)
)
order by 1 asc;
JSON_TABLE (Docs)
SQL/JSON function json_table
projects specific JSON data to columns of various SQL data types.
We’re taking the genres column, and taking all of the values in the array, represented by ‘$[*]’, and pivoting those to a table, with a genre column.
The path is quite simple here because everything is at the ‘root’ level, so just ‘$’.
Why the JOIN?
Again from the docs,
“Typically a json_table
invocation is laterally joined, implicitly, with a source table in the FROM
list, whose rows each contain a JSON document that is used as input to the function. json_table
generates zero or more new rows, as determined by evaluating the row path expression against the input document.”
The REST API would look like this –
But wait, there’s more.
There was another suggestion on how to tackle this, from Rajesh –
Wait, this looks simpler? What does ‘nested data,’ do, exactly?
Again, from the docs:
In a SELECT
clause you can often use a NESTED
clause instead of SQL/JSON function json_table
. This can mean a simpler query expression…
The NESTED
clause is a shortcut for using json_table
with an ANSI left outer join.
Cool, so I run the equivalent of this, do I get back the same set of values?
The docs say they are ‘equivalent,’ so the plans should be as well, yeah? Let’s take a look:
The answer was, ‘out there’
Not everyone can go directly to the product team here at Oracle for help. So I should say if i had looked a bit longer, I would have found the answer in several places, including:
And of course, I should have tried harder looking through the JSON Developer’s Guide, lot’s great stuff there, you should definitely bookmark this!