Announcing Row-Level Security in Azure SQL Database

Your users probably shouldn’t be able to view all of the data.

You might have regional sales managers who should only see sales for their region, or human resource staff who should be able to see employee details but not salaries.

This is usually challenging with databases.

How We’ve Historically Built Row-Level Security in SQL Server

We modify application code to pass in the user’s name or group name as part of their query, like this:

SELECT * FROM dbo.vwSalesHeader WHERE SalesVisibleTo = ‘MaryJane’

MaryJane doesn’t have to be a Windows or SQL authentication account – it can be anything we want to use as a security identifier. Some apps use a UserID number from their Users table, for example.

Photo: Locked Up Love by Tit Bona?
Photo: Locked Up Love by Tit Bona?

Then the vwSalesHeader view joins our SalesHeader table out to several other security tables where user names or roles are listed. The SalesHeader table might have a StoreID field, and then we’d have other tables that listed which users could see which StoreIDs.

This approach typically works well (enough) in reporting applications where we can guarantee all access is done through views. However, it isn’t as effective when users and applications are accessing the tables directly – and that’s where the database server needs to handle row-level security.

Serious Security: Enforcing It at the Server Level

If you create database logins for every end user, and the end user is authenticated in the database, then some database platforms can perform row-level security directly against tables.

The really cool part of this approach is that you don’t need to modify your applications – the apps don’t have to access the data via stored procedures or views in order to get row-level security. If you try to read or modify a row (or in some cases, even a column) that you’re not allowed to, the database simply stops you.

PostgreSQL added support for this via Row Security Policies recently in 9.5, and Oracle’s had Oracle Label Security since at least 10g. In all vendor implementations, the database needs a map between users, roles, tables, and rows. PostgreSQL does this with row security policies that let you build any query you want as the check expression. Oracle builds hierarchical security in at the data label level with compartments and groups. Both are radically different approaches, so there’s not really a concern about how Microsoft’s implementation adheres to standards.

How Azure SQL Database Does It

Books Online explains that the v12 preview will let you:

  1. Create a security predicate function to do the security check
  2. Create a security policy on a table that points to your new security function
  3. Enforce that security policy based on whoever’s logged in – without changing their queries

That’s awesome. Instead of this:

SELECT * FROM dbo.vwSalesHeader WHERE SalesVisibleTo = ‘MaryJane’

Your apps can just select from the table directly (or views still if they want):

SELECT * FROM dbo.SalesHeader

And SQL automatically applies the security policy.

That’s my favorite kind of new feature – one that can be introduced without app changes. It’s absolutely wonderful if you’re letting end users connect directly to the database with PowerBI tools like Excel.

I’m not going to write about the full implementation T-SQL here (function and policy creation) because we’re still ahead of the release date, and you can expect these details to change through the course of the release previews.

Server-Level Security Drawbacks

Most modern web and reporting applications use connection pooling with a single database login for all users, and no concern for “execute as user” statements. It’s challenging to run every query in the security context of the end user – especially if you have end users who aren’t in your actual security system. (Think public end users who don’t have Active Directory accounts.) The number of connections from your web/app tier may skyrocket, although most of those connections will end up being idle or disconnected over time. The alternative is to build in dynamic “execute as user” statements in your data access logic, and that’s nowhere near as trivial as it looks.

Plus, enforcing security at the database server level requires adding a security definition field to every secured table. If you’re not allowed to modify the tables, this approach won’t work. I don’t see this as a serious drawback because it’s still less work than modifying your application to work entirely on views and stored procedures.

Performance will suffer no matter how you implement row-level security. In the big picture, I don’t see this as a drawback because you shouldn’t implement it unless you need it, and if you need it, you should be willing to amp up your hardware requirements in order to pay for the additional business logic requirements. There’s no free lunch.

Silent security generates support calls. Anytime the users know that the system is influencing their query results (be it through Resource Governor, dirty reads, simultaneous loads & queries, or row-level security) then they’re going to think their reporting data is wrong/slow/unavailable because your infrastructure is at fault. If this is a concern for you, you can enable auditing of the security policies, but keep in mind that now you’re talking about even more performance impact.

And of course, there’s the obvious gotcha of having to create database users for these roles. In a perfect on-premise world, you don’t really want to create those in SQL Server – instead, you create groups in Active Directory and then in SQL Server. Let your help desk team manage the group memberships of individual users, and only rarely make changes to the groups in SQL Server. (But still, if you’re constantly adding/editing sales regions, and your queries need to be region-aware, you’re going to be constantly making security changes in production, QA, and dev servers.)

In summary, I think row-level security is one of the coolest new engine features I’ve seen in quite a while, and it’s in Azure SQL Database first. It solves a real business pain, and continues to bring Microsoft’s databases closer to feature parity with Oracle and PostgreSQL.

Previous Post
Three reasons to use Extended Events
Next Post
Failing over an Asynchronous Mirror in SQL Server (video)

12 Comments. Leave new

  • This is one area that I really like about Azure and wish would somehow make it to the on-prem version of SQL Server. It would be great to just automatically lock users into certain views of the data without needing to modify the code to do so, especially as a SaaS multi-tenant solution. There are so many great tools MS offers that just won’t work for us because we can’t easily control the users modifying queries to see other customer data. (It’s doable, but much more complex.) The Azure model really simplifies that and keeps users from adjusting parameters to see something they shouldn’t.

    Reply
  • The feature is a tiny bit too narrow. With just a few extensions it would have been perfect for multi-tenancy (with many tenants and connection pooling).

    For example they could have allowed setting a session-level variable for the tenant identifier or so. The filter function could have used that to make filter decisions.

    Brent, how would you use this feature to provide filtering based on tenants? Imagine thousands or millions of tenants.

    Reply
    • Tobi – as I wrote in the post, I wouldn’t use it for thousands or millions of tenants because the database has to know who’s running the query. If you have thousands or millions of tenants, they’re not querying the database directly (think Excel or PowerBI) – instead, they’re going through a tool (think your web app). Your web app would need to manage the WHERE clause in that scenario.

      Reply
      • If I remember correctly, they did build something like this into the connect strings or something similar. I remember thinking that it was pretty cool back when they announced the multi-tenant usage and showed off segmenting data by Customer ID/Key. If you want restricted access, use that, if not don’t include that column. Any tables with that column would be restricted to just that customer. It’s been a while, but I remember thinking even that would be useful to have on-prem for situations like ours with one DB, multiple customers.

        Reply
        • That was Federations. They made a decision to discontinue that in favor of manual sharding (they are putting out guidance).

          Reply
          • Thanks. I totally missed that announcement. Seems a shame – Federations had a lot of ease of use for security and partitioning by large customers or sets of smaller customers. At least there’s something with security that can be configured another way behind the scenes, but Federations was the big thing I was hoping to expand and perhaps hit on-prem boxes.

    • You can do exactly this by using CONTEXT_INFO() as your connection level variable.

      Reply
  • Are they not planning to role this to the on-prem version? It’s great they have this in azure, but would love to see it in other versions as well.

    Reply
  • Rob Nicholson
    February 1, 2015 6:53 pm

    The true power of this feature will be released once it can be integrated with Windows Authentication. Of course, AD is only available in on-prem versions and not in Azure just yet 🙂

    Reply
  • Brent,

    Thanks for posting this. I wonder if I can use RLS based on database role instead of user login? If I can group the users into database role, it is much easier for me to maintain.

    Thanks,
    Chung.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.