Getting JSON out of SQL Server

SQL Server 2016 brings support for JSON. For those of you who don’t know, JSON is JavaScript Object Notation – it’s a way of representing application level objects as a string. Imagine XML without anywhere near as much line noise, and you’re pretty much there.

Our Sample Document

In this example, to retrieve orders and their line items from the AdventureWorks database. We need to get the data out in this format for fast processing. In the end, our data should look something like:

Holy cow, that’s a lot of stuff just to represent and order and the line items.

Here’s our starter query:

First Attempt

The documentation hints that we might just be able to use FOR JSON AUTO option to automatically format our JSON. What could go wrong?

Syntax is hard
Syntax is hard

Our results aren’t what we want at all! The structure is close, but not quite what we’re looking for.

Getting the Structure Right

We are getting all the data we want, but we’re not getting the structure that we want. Our requirements say that we need to have and orders array and then each order should be a separate item in that array. An orders line items should also be stored in a separate array.

The square brackets make an array in JSON. The curly brackets denote an object.

Using the root() function will nest our results in a root object. In this case, we can use root('orders') to create a basic orders object to hold our list of orders.

Making each item part of an array inside the order is going to be trickier. We can still use FOR JSON AUTO to get the job done:

We’re almost right!

So wrong I can taste it.
So wrong I can taste it.

Well, that’s … something. To get the output we want, we’re going to have to use a correlated subquery – fans of generating XML in SQL Server may remember jumping through hoops to get the right structure out of SQL Server.

We’re on the right track, but what we really need is better formatting.

The Best Solution, For Now

Since the FOR JSON AUTO isn’t generating the JSON output that we want, we can go down a manual route and use FOR JSON PATH. If this is starting to sound like XML, you’re absolute right.

After attaching a debugger to SQL Server, Paul White shows that the JSON writer is using the old XML code under the hood:

Anyway, the SQL to generate the right JSON document:

Check out the correct (and formatted) results!

Getting JSON out of SQL Server

It’s easy to get incorrectly formatted JSON out of SQL Server. If you do need to get well formatted JSON out of SQL Server be prepared to put in a lot of work coercing your query into the right shape. You may find that you need to do a large number of nested sub-queries, too.

Previous Post
Fake moustaches, barbecue, and SQL Server.
Next Post
Consulting Lines: “Keep going. What else?”

20 Comments. Leave new

  • Interesting!

    Now; how about a feature to take *in* properly formatted JSON and do an insert/update accordingly.This would be useful for doing mass inserts and updates.

    I suppose the same could be said about XML, but I don’t believe inbound XML is supported for this purpose (or I’ve just missed it along the way).

    Reply
  • Is this one of those things that’s nice in theory, but in practice maybe should be done on an app server where the CPUs aren’t the most expensive ones in the building?

    Reply
    • I agree, this is something that I would usually want to push out through an application server. Sometimes you just can’t avoid generating JSON or XML in the database and it’s good to know what you’ve gotta do to get data back to the service that will consume it.

      Reply
      • Jovan Popovic
        October 11, 2015 7:23 am

        You say formatting JSON is too expensive – do you have some numbers? E.g. if you run the following query on AW database:

        SELECT *
        FROM Person.Person
        FOR JSON PATH

        you will get 0% cost of JSON SELECT operator compared to the clustered index scan. You might even try with more complex joins, order by, aggregations, and compare results. FOR JSON similar to the CONCAT function with some built-in logic for adding {, }, :, “, other special characters. If you use AUTO mode and many dots in PATH mode there is some overhead for nesting.
        Could you show some query with and without FOR JSON with a big difference in CPU usage?

        Reply
        • Ah, so you’re saying it’s free and takes no time to convert at all? Don’t you find that moderately suspicious?

          Windows Server Standard Edition is $800 for the entire server.

          SQL Server 2014 Standard Edition is around $2000 per core.
          SQL Server 2014 Standard Edition is around $7000 per core.

          Which is less expensive- an 8 core server with only Windows or an 8 core server with Windows and SQL Server?

          For the record, I never said it was expensive, I said I’d rather do the work somewhere the CPUs are cheaper.

          Reply
          • Jovan Popovic
            October 11, 2015 8:38 am

            Nothing is free, but comparing to other standard operators additional cost is minimal (create any query and look at the execution plan). My question is do you have some query where FOR JSON clause adds additional 10-30% complexity on the batch, CPU usage, or execution time, so you can justify that it should not be used?
            If you want to say that SQL Server is too expensive that is not related to JSON – other operators (JOIN, SORT) are even more expensive. So what you proposing – not to use SQL at all and do everything in C#/LINQ 🙂

          • No, but again, I never said that it added significant cost to the query. You’re the one who has brought that up multiple times now. Is JSON, perhaps, your feature?

            In my experience, the types of queries required to build correct XML or JSON, don’t optimize well. They’re complex, have deeply nested correlated subqueries, and produce sub-optimal query plans.

            Given the choice between adding additional licenses, or possible moving to Enterprise Edition because of core count limitations, the decision is simple – move it out of the SQL Server. I’m guessing that you’ve read our blog once or twice before and you know that we’re big fans of moving things out of SQL Server when the decision makes sense – you can find a lot of content on this site about how to moving sorting, pagination, full text search, file storage, and other actions to more cost effective services when it makes sense.

          • (for some reason I don’t have reply button at the end of the thread so I need to reply this way, sorry)
            Ok, maybe I didn’t understand you right, but when you say “JSON in SQL Server is too expensive” I assumed that you are talking about additional cost that it adds to the query.

            I agree that if you have some operations that add to much cost, or don’t work fine you should move them out of SQL Server (e.g. if you think that CLR is slow/not secure, FILESTREAMs/Service Brokers are complex, etc.) If you are already paying for SQL Server licence and you have almost no additional cost for feature X why not use it?

            Other question is what is typical JSON query and standard JSON format? JSON in your example adds some noise properties such as “order” and “item”. If you try to reference an item in JavaScript using your format you would need to use following path:

            orders[0].order.items[0].item

            If I’m referencing first order why should I use orders[0].order instead of orders[0]?

            However, if you just remove additional “order” and “item” keys you would get much simpler JSON, you can generate it with standard FOR JSON AUTO, and client can use simpler and intuitive path:
            orders[0].items[0]

            So if you keep JSON simple you will have simple queries, standard syntax, no additional cost, no problem with performance. Otherwise you will need new XSLT/XQuery engine in SQL Server to optimize it, and in that case I agree that you should move that kind of processing outside of SQL Server.

            I’m working on JSON I know some use cases where it can help you and a lot of anti-patterns where you should avoid it but I think that what you descried here is not one of the anti-patterns. JSON will be public soon so we will publish some use cases that describes when you should use it.

          • Thanks for pointing out the JSON structure, I’ll get that fixed.

            Out of interest, Will the JSON use cases include actual customer success stories?

  • Bryan Hamilton
    August 31, 2015 5:14 pm

    What about using powershell Invoke-Sqlcmd | ConvertTo-Json have you experimented with the cmdlet as yet?

    Reply
  • Something like this can be achieved?

    Query:

    SELECT
    u.name AS namefather,
    (SELECT name
    FROM children AS c
    WHERE c.uid = u.id) AS namechildren
    FROM users AS u
    FOR JSON PATH

    Result:

    [
    { “namefather”:”Brad”, “namechildren”:[“Angelina”, “Forrest”] },
    { “namefather”:”Maurice”, “namechildren”:[“Ben”, “Owen”] },
    { “namefather”:”Alan”, “namechildren”:[“Joan”, “Maritza”] },
    ]

    Note that namechildren is an Array.

    Reply
  • Once I have the correct format, without copy/pasting the results, how can I get it to write directly to a text file or export to text file?

    Reply
  • Thanks Brent! Right now when save the results from the query panel to a text file it doesn’t come across a single line of text. Know of any good tools on the market that would connect to SQL to create the JSON file from the stored procedure I wrote to produce the results?

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.