Why Use Schemas?

Development
100 Comments

The ever-insightful Buck Woody (Blog@BuckWoody) wrote a post telling you to use database schemas.  He writes:

“But I still see a lot of applications, developed by shops I know as well as vendors, that don’t make use of a Schema. Everything is piled under dbo…. But if you’ll use them properly you can make your application more understandable and portable.”

I disagree – and Buck and I started a conversation on Twitter which quickly turned into a group discussion.  It merits more than 140-character comments, but since Buck’s blog temporarily has comments turned off, I wanted to give people a place to discuss.

My thoughts: schemas are naming conventions in lipstick.  When I look at somebody’s code, the last thing I ever think is, “Man, if these tables just had schemas, I could understand everything better.”  In fact, when I see objects broken up into different schemas, I usually end up asking questions like, “Why are we writing user reports against something in the ETL schema?”  The answer comes back, “Well, it used to be an ETL table, but then our needs changed, and we couldn’t rewrite all our code, so it just turned into a production table.”  Great.

If they’re used right, they let you segregate permissions by groups of objects.  But here’s the thing – almost nobody ever does that.  It’s more work than it’s worth except in the largest of databases.  When we’re talking about a vendor app that gets sold to third parties, you usually don’t want anybody jury rigging permissions together – just query through the app like you’re supposed to, and keep your filthy hands out of my database.

Thoughts?

Previous Post
Free Quest TechEd 2010 T-Shirts
Next Post
Why I Pimp Our Book

100 Comments. Leave new

  • Great post Brent, as always.

    I think they help me in two ways – I segment the information into logical containers (which, by the way, are easily transferred if things change) and I can set the permissions on the schema without knowing if the end-user shops are fond of roles, users, Windows auth and so on. I document these permissions and let the end DBA decide how they want to handle it.

    (Cause I’m good like that.)

    Reply
    • Thanks, sir! I gotta drill down into that though.

      How specifically are they “easily transferred if things change”? I don’t understand how it’s easier to transfer objects by schema than if you used a naming schema for objects, and if they’ve got foreign keys that cross schemas, you’re still out of luck.

      When you say “let the end DBA how they want to handle it,” if you’re really writing an app for wide distribution and frequent upgrades, why would you want the DBA to go in and change permissions? That’s a recipe for a broken app.

      Reply
      • I see your point there – missed the part about the FK’s. As far as perms, I guess it all depends, right? All I meant was that the end DBA can assign owners and default schemas – they don’t have to mess with the actual perms themselves.

        Some shops I see user roles heavily, others use users, some are Windows auth, some even change the db user name from the Instance login (!) so this lets me not worry about that.

        Does that make a little more sense?

        Reply
        • But if you don’t know the end DBA, how can you predict how they would want the objects organized into schemas? You’re still guessing based on what YOU think the DBA wants. It’d probably help me understand if I heard a story from a DBA who said, “I inherited this database from a vendor, and thank goodness the objects were broken up into schemas, because I was able to ___, which I couldn’t have done otherwise.” I don’t see a lot of DBAs tackling third party app databases and making changes.

          Reply
      • Forgive my ignorance, but how are FK’s affected by schemas? I need more details b

        Reply
  • Somebody (probably Buck) likened schemas to folders on your hard drive. So following your thought, do you shun folders, and just name all your files with prefixes?

    SQLServerpedia_MyFirstArticle.docx
    SQLServerpedia_ThingsToDo.docx
    VeryPersonal_PeopleToShun.docx

    I see a lot of merit in the security argument, but equally appealing to me is the organization/grouping factor.
    -J

    Reply
    • No, I use *databases* for that. When I have applications, I separate them into databases. If I have dramatically different projects, then I probably want to have different recovery options and maintenance options for them. Schemas takes that ability away from me.

      Reply
      • Brandon Reno
        May 19, 2010 2:09 pm

        So let’s take a simple sales application:
        you need to track data in high level areas, such as inventory, customers, orders, security, a product catalog, and probably some set of tables that are simply there to support the application (logs, etc)

        you have a few options:
        1) naming conventions
        2) seperate databases (?!?!)
        3) schemas

        we don’t want these in seperate databases due to RI. we want to make sure orders refer to actual products, and made by actual customers, and that our inventory contains actual products.

        as far as naming conventions vs schemas, i suppose the difference in many cases is simply ‘_’ vs ‘.’
        (though one downside to schemas is the existance of only one level of grouping)

        however, with schemas you can quickly assign permissions, for example: the users responsible for orders can, by default, have read only permissions on everything else. other users can be seperated by area of responsibility.

        it’s not the schema’s “Killer App” per say, but for me, i organize schemas as above in nearly every db i create. it just makes things simple.

        also: i can grant execute permissions on the appropriate schemas, and then i don’t have to remember to add all roles with execute permissions to each new proc that gets created 🙂

        Reply
        • I have to agree with Brandon here, schemas do two nice things for me when I’m building out a database:

          1) They allow me to group my objects together and organize them easily (I suspect that this argument is more a matter of design style than right vs wrong)

          2) They make it easier for me to grant exec to my sprocs by doing so at the schema level instead of having to do so for every sproc I create.

          Reply
  • To be honest, the concept of Schemas in SQL Server has always confused me.

    I look to FKs to figure out how objects are related. The security thing is interesting…but seems like more work for little return?

    Reply
  • I agree with Buck that schemas, in a perfect world, would easy our lifes making all the related tables being part of a nice container..
    but in the real life the permissions benefit are not in fact a benefit ’cause 99% of the vendor applications uses 1 sql or windows user to all DB access and the few databases that do use schemas have those problems of tables that shouldn’t be in the XYZ schema.. but it is ’cause the effort to change the code to point to the new schema is too big.. off course.. this is DBA fault that didn’t did a good DB design in the first place.. but this is the industry reallity..

    ah.. and one more thing.. no one here ever saw databases where we have more than 1 table with the same name but in different schemas? and a lot of queries in the wrong table ’cause the DEV didn’t correctly specified the schema name..
    ow my…. sometimes I do hate developers..

    Reply
  • I don’t see permissions in schemas as any different than a role. They can organize things, but when code changes, and needs change, which seems to happen often, you don’t want to move objects from under the schema to a new one, especially if you have code referencing the schema.

    I never saw an issue in SQL2K using dbo as the default schema for everything, and I’d argue that having the same table in two schemas can be confusing.

    The exception might be with ETL work where you stage data, and it arguably is clearer to have ETL.Orders and dbo.Orders

    Reply
  • I use schema’s (in a warehouse environment) to provide a layer between the users… which access everything through the default “dbo” view’s that point to the underlying tables in a schema.
    Provides several things…

    Blogged on this one a few weeks ago:

    http://stef-bauer.com/2010/03/30/warehouse-design-tip-schemas/

    1) “fix” things in view with joins/case statements etc until permanant changes can be made to ETL/historical data, tables etc.
    2) Point the view to a copy of the table while you do maintenance, repartition, compress, etc etc …. and your users don’t even know it is happening.
    3) Who wants columns in alphabetical order… one of my users does…
    4) Security… if you have significant security needs your views can point to multiple schemas each secured independently.

    Reply
    • Stef – interesting, but I don’t understand why you can’t accomplish that in a single schema. You could put all of these objects in the dbo schema and it’d work fine. If you’re saying you have the same object name in different schemas, and you use the same select statement but you select the right one based on who’s logged in, that’s a troubleshooting mess.

      Reply
      • No.. not same object in different schemas…I agree that would be a MESSSSSS but…
        The table-name (and the view-name in dbo) do match. So when someone does the dreaded “select * from test” I get to control what they see from where, in what order etc.

        (This does not work….same name cannot exist)
        CREATE TABLE dbo.test
        (test_col INT)

        CREATE VIEW dbo.test
        AS
        SELECT test_col
        FROM dbo.test

        (assuming dw_dbo schma exists… this does work):
        CREATE TABLE dw_dbo.test
        (test_col INT)

        CREATE VIEW dbo.test
        AS
        SELECT test_col
        FROM dw_dbo.test

        Thus you can point the view to dw_dbo.test2 or whatever you want while you work on the dw_dbo.test table, and users are not affected.

        Reply
        • Whoa – you’re recommending that people create multiples of the same object name as tables and views plus in different schemas, and then use them without prefixes? How on earth do you troubleshoot queries in the procedure cache? You can’t tell what object they’re actually trying to access without doing a ton of work. That strikes me as a total nightmare, and I would neeeeever want to recommend that to anyone. Are you saying you actually encourage people to do this?

          Reply
          • OK… your taking this too way far! …. this is VERY tightly controlled, with the ONLY access by way of the dbo objects (views). The ONLY time I swap out objects is during maintenance that can’t be done off-hours, or during the implementation of a new table/feature. All views point to the underlying tables at all times, thus all queries are reading from objects named the same as the underlying tables. Load processes ALL access the tables, thus the tables MUST be there for the loads to run daily…NO insert/update etc is performed through a view ONLY to the object with the schema prefix.

            The queries in proc-cache, plans etc are still valid… don’t missunderstand the flexibility I am talking about as something to flip-flop tables around and leave them pointing to different objects…. that is NOT what I am advocating at all.

            I am NOT suggesting multiple schemas (with the excpetion of the possible need for security reasons… and even then…. not sure) I am advocating 1 schema for the warehouse data tables that is not dbo.

            I think we actually agree…. I fear I am just not making myself very clear.

  • I use schemas to control security access to the underlying tables. Things that anyone can access goes into dbo. Procedures, views, and functions specific to a single application go into a schema. I create a certificate with the security for that schema and then I create a user from that cert.

    Typically, I only key from a custom schema to dbo. We do a lot of design before we create tables in the database (mainly by yelling over the cube walls). If there’s even a thought that something will be used by multiple applications the table goes into dbo but we put the sprocs and views into a schema.

    Reply
  • As someone who sits on both sides of the fences, developer and database designer (as well as BI guy and bottle washer) I like schemas. If nothing else when working with a large database it’s much easier to find tables I need when writing stored procs, views, reports, and classes if things are organized into schemas. And while it’s true it’s not done much, we have actually used the to control permissions on a few projects.

    Scary story, we had an app that had been running fine, then one day it just broke. After half a day of phone calls, etc, we finally tracked down why. A junior DBA had been doing maintenance on an adjoining database and for whatever reason looked at our production database. He’d never seen anyone use schemas before, assumed we were idiots and on his own changed them all to dbo.

    Reply
  • I’ve seen the extremes. Databases with horridly complex naming conventions and even more unmaintainable permissions, all the way to multiple databases with 10 – 20 tables each for a single application. I think Schemas would have been useful in both cases.

    Overall, I see schemas as an added layer of granularity. They’re there to be used when needed. They may not always be needed but there are certainly cases where they would be very appropriate, and not just for security. What about breaking out schemas into different filegroups for a performance benefit?

    Reply
    • “10 – 20 tables each for a single application” — stop working on mom and pop applications and try working on real applications. Any serious application will have a LOT more tables.

      Reply
      • I think his point was that a single application had divided its tables through several databases, each with *only* 10 or 20 databases.

        Reply
  • The biggest problem for me is that because schemas had been lacking in SQL for so long, it is really hard to argue that we really need them however….

    When performing script deployments on the dark side (Oracle) I find the use of schemas as a very good safety net. Since the schema user does not have permission outside of the schema, this prevents problems being very big problems since the change is encapsulated by the schema.

    The other issue we have is that the SQL database is incredibly accessible, this is primarily down to Windows authentication which means that connection to any instance from any tool is VERY quick. This encourages us to login using our privilege user account rather than authenticating with just the right amount of permissions that we require for that operation. That surely has to be better practice right? Conversely connection from Oracle clients are a real pain and usually encourages the entry of user/password thereby making it logical to enter the schema user.

    There are certain things like this (schemas not poor authentication!) in Oracle that we as dbas or developers could adopt which could only strengthen SQL’s excellent Enterprise credentials.

    Another reason the Oracle dbas like schemas is that it does allow for them to treat the schema and its objects as one interdependent unit. This is particularly useful for data refreshes and the like.

    Finally, I believe another reason why schemas haven’t really been adopted in SQL as opposed to Oracle is that in many ways SQL server databases are probably more comparable to Oracle schemas and therefore we could probably get by without them. Oracle would struggle.

    In conclusion in the right situations the adoption of schemas should (imo) by encouraged, but like everything if they are implemented incorrectly in SQL Server it is only going to cause more overhead for the dba.

    Reply
  • I don’t buy your argument. In your example, just because whomever was in charge of that schema was too lazy to properly refactor it when the purpose of the formerly-etl table changed doesn’t make schemata a bad idea. Schemata do make things like permissions management easier. Riddle me this: when you have a database with even 100 stored procedures, what’s easier:

    1) Remembering to grant permissions on each procedure to all the users/roles that need it?
    2) Granting that permission to the schema once and forgetting about it?

    I know which one I prefer.

    Reply
    • Ben – I only grant permissions once. I create roles in the database that match what my end users need. When people come on board, I add them to the appropriate roles. That way I don’t have to touch any objects directly. When I add new objects, I grant the role the right permissions. You never, never want to grant 1:1 permissions between end users and objects.

      Reply
      • That’s not what I’m suggesting. Even if you use roles, you’re still granting object-wise permissions. So, when you add that 101st stored procedure, you have to remember to grant execute permission to that same role. Yes… you can get around this by having a template for your procedure DDL, but why go through the hassle? If you could run one grant statement that not only covered your current needs but your future ones too, my question is: why wouldn’t you do that?

        Reply
    • Finally (!!) someone used the proper plural.

      Reply
  • I’d use schemas for:

    1) developers to have playgrounds
    2) caching other databases locally
    3) application support of a second set of rules (such as a reporting schema)
    4) Snapshots, when consistent snapshot is required, and the data model does not
    support it.
    5) Copy and paste a schema to test new development.

    1 and 2 allow unhindered naming schemes.
    3 and 4 usually go together.

    I would not use more than one schema for the same data model. There’d be no way to
    support option 5. Besides, a schema can handle a model.

    To separate objects in the schema, i use prefixes on the TABLE names.

    IMO, using schema TO make permission-management easier, is a hack.

    Reply
    • Brian – in all of those instances, why wouldn’t you use a separate database? All of the scenarios you describe should have a different recovery model, high availability strategy, and so forth from your production database.

      If you’re putting dev/cache/snapshot data inside the production database, I can see why schemas would be attractive, but I’d make the recommendation to avoid doing that in the first place.

      Reply
      • Brent,

        Here’s what i am thinking.

        For 1 and 5, the user can simply change the schema being referred to, and things run the same. They can even change the schema on only some of the TABLEs. (I do this all the time when hacking one TABLE for testing, but do not want to hurt anyone elses development. CREATE SYNONYMs in a new schema to point to the real TABLEs, except the one i want to hack at.) Changing databases requires more statement to be run, or fully qualified names.

        For 2, it’s another corporate database being stored locally. We cannot trust them to always be up, or not to change anything. So, based on the time granularity we need, just pull another snapshot. To give it its own namespace, go a level up to a schema. Though i guess a database is an option, but it seems like overkill.

        Recovery? Points i never thought of that. I come from Oracle where switching schemas is easy, but databases is harder. I don’t even think of using another database unless its something completely different.

        I guess another point is databases can be taken offline. Schemas cannot. So, if my code is running in one schema, the other schema is available. The same assumption cannot be made for another database.

        Reply
  • We’re using schemas at my job for several different reasons.

    We give each database/application a unique schema because we just don’t care to have everything sit in dbo. We generally use dbo for temporary objects that we are testing with.

    We also have several other schemas, such as log, staging, rs, cg, etc. log is for any logging tables. staging is usually for importing data from old systems (ie, access, excel, etc) to then transform into the new system. rs is for Reporting Services objects (mainly stored procedures). and cg is for code generated stored procedures. we have an app built inhouse that runs against a database and creates crud procedures for each table that are then used in the application.

    Reply
    • Thirster42 – okay, that’s interesting, but what do you *accomplish* with the different schema names that you couldn’t accomplish with naming prefixes instead?

      Reply
      • i can set security on a schema level. So if x,y,z should be able to read and write schema a, then i set them up on the schema (or if i was smart i made a role assigned to the schema and assigned them to the role). if you just used a table prefix then you have to specificaly select those objects in the role, and new objects aren’t automatically added (which could be either a good or bad thing depending).

        Also, I hate table prefixes. I prefer real words for table names. I’d rather write log.Employees instead of dbo.logEmployees (which is also longer :-D).

        Reply
        • also, x,y,z would (if we were following our internal standards) actually be in an AD security group.

          Reply
      • Brandon Reno
        May 19, 2010 2:13 pm

        on the flip side though, you can see schemas as simply an extension of using naming conventions, but with a small amount of additional functionality (specifically quicker permission allocation)

        Reply
  • I have not extensively used schemas, but I can see their worth for organizational purposes and security (even if they may not be now). Especially in apps that have “modules” that you pay for separately, a different schema for each module. Yes, you could do it with different DB’s for each module, but I think that makes maintenance harder.

    Reply
  • I prefer schemas, but I have multiple applications hitting many of the same databases. A product can be accessed by many applications, and they all want different slices of the data. By setting up service accounts, roles and schemas. I don’t have to manage individual object permissions, grant the role access to the schema, and let the developers in each group create objects in that schema. No table access and no permission management issues. We still maintain a naming standard, but the schemas make it alot easier on the DBA staff.

    Reply
  • Going off on a tangent, I think schema should be called something else. Maybe not “container”, “group” or “family” but something like that. Something that shows that they are logical containers.

    I’m saying this because whenever I use the word “schema” I’m almost always referring to the definition of tables sprocs and other db objects as used in the sentence “Changes to the schema require testing, especially for any changing or dropped sprocs”

    Fortunately the meaning of “schema” at my workplace is never ambiguous. It always means ‘db-object-definitions’ We use naming conventions to split things into their domains and everything is “piled under dbo”.

    Part of the reason is because the db was designed before 2005 and there’s no good motivation to change it.

    Reply
    • I’d also like to say that I appreciate that everyone in this discussion uses the plural schemas rather than schemata. It reminds me of indexes vs. indices.

      Reply
      • …but schemata is the proper plural of schema according to dictionary.com. Are you suggesting that I not use the proper plural because people are unfamiliar with it? Ever seen the movie Idiocracy?

        Reply
        • Whatever you do, don’t remind them that tables have columns and not fields. I try to be a stickler for using the right words, but I forget from time to time. Schemata included. Although it does sound dumb as hell.

          Reply
        • My dictionary says the word “schemas” is acceptable. I’m not going to get into a descriptivist vs prescriptivist debate, but I will point out Ben, that you also use words and terms like ‘whomever’ and ‘riddle me this’.
          Chacun son gout.

          Reply
  • “If they’re used right, they let you segregate permissions by groups of objects. But here’s the thing – almost nobody ever does that. ”

    Why is the reason that no one does it right an excuse to not do it at all? LaRock gave this reason for Linked Servers the other day (http://thomaslarock.com/2010/05/what-5-things-should-sql-server-get-rid-of/):

    “Seriously, they need to go. Not because they are bad, but because they get abused.”

    Just because people don’t implement it “correctly” doesn’t mean that it doesn’t have implementations or merits that should be looked at and considered. Which is similar to what someone said about SQLCLR on RunAs Radio (http://www.runasradio.com/default.aspx?showNum=159). 🙂

    Personally I have used them to segregate functionality and/or logical breaks but are still dependant on the data somehow. I think a great example of this is CDC, which creates its own schema.

    Reply
    • Aaron – that’s a great question. I’m of the “keep it simple” school. The more options I can remove, the easier I can make it for not just end users, but for developers who have to maintain the code.

      I love your CDC example though. That’s an interesting example, and I think it’s a great use for a separate schema.

      Reply
    • A schema is nothing more than a database inside of a database. Sometimes it makes sense, sometimes it doesn’t. What is usually lacking is the 50k foot goal for even discussing their use in the first place.

      My frustration with using schemas has more to do with the native development tools. I have seen issues where we grant users access through AD groups and the get back errors when they try to navigate and use schemas. It makes for a very frustrating experience for everyone involved.

      Reply
  • Having bought software and later needed reports the vendor was NOT going to add just for me, I am one of those devs who go into the db and pull data on my own. I figure the data is mine so if I need to put my grubby hands into the database 🙂 it stopped being your database when the software shipped. I know in some cases this may violate tos but I think tos should not be legally allowed to keep me away from my own data.

    Off topic I know but I had to defend things from the dev side.

    Reply
    • Keith – heh, how does that work out from a support perspective, though? For example, according to Microsoft’s SharePoint support policy, you’re not allowed to touch the database if you ever expect to call for support.

      Reply
  • I like using schemata for the very reasons Buck states. I think it’s a case of you say tomato, I say tomato…

    That doesn’t work on the Internet.. but you get the idea.

    Reply
  • James Luetkehoelter
    May 18, 2010 11:14 am

    An interesting discussion. Schemas could be 1) a tool for developers, in which case I agree with you Brent or 2) a tool for DBAs for security. I personally like the use of 2). I don’t like them being used as a grouping technique or naming convention, unless that grouping will have distinct and GLOBAL security needs.

    One scenario I use it in is in smaller shops where there is only a “Dev” server, or no “Dev” server (yes, those exist!). I’ll create at least 3 schemas – one meant as a playground, where the development staff has DBO rights, one for real development work – usually senior staff have full rights – (where unit testing and the like should be done), and a TEST/QA environment, where the developers have no access (only the QA people). As the code gets developed and moved into testing and finally QA, the DBA only has to move objects in between schemas. If QA rejects some code, the DBA just moves it back down. Very handy.

    Security is where I think schema usage is appropriate, and that should be controlled by the DBA.

    Reply
  • SQLRockstar said what I was going to say: Schemas are another layer of abstraction in the object model.

    When designing databases I usually use schemas. I like the organization they bring to the design. Microsoft needs to update some of their management functions/procs to support/manage/report schemas better. Even though they are difficult to support, there are valid business intelligence use cases for doing exactly what Stef Bauer suggested earlier (multiple schemas containing objects of the same name).

    After the application and database mature, I sometimes look back and think “Schemas were a waste here.” In the database development world we call this “20/20 hindsight.”

    :{> Andy

    Reply
    • Andy brings up a great point, the BI world is one place schemas do well. At PASS last year Eric Veerman showed some examples where he’d created a Fact and Dim schema rather than naming everything dbo.Fact…

      After some discussion with him I decided to try this on a project and it worked very well, it also made it easy to have Staging, Log, Temp etc schemas. It made it much clearer what tables did what.

      In addition it also made security much easier, for most tasks we only granted rights to Dim and Fact, leaving the other tables only exposed to the ETL processes.

      For those schema lovers, how many of you also apply schemas to other database objects (views, sprocs, etc)?

      Reply
      • Robert – you’re saying to have staging, log, and temp data in your production data warehouse database? It’s already hard enough to back up data warehouses and do high availability for them – why put these objects inside the production database? Why wouldn’t you have separate databases for these databases that have throwaway data that you don’t want to back up so fast?

        Reply
        • In ETL scenarios, the words “temp” and “staging” can have different meanings than in other contexts. A common design pattern involves using a Data Flow Task to land Updated Rows somewhere near the destination database – a different schema inside the Production data warehouse database is an option – and then performing a set-based update from a subsequent Execute SQL Task. That would be a good use for a staging schema.

          As for a temp schema, SSIS doesn’t lend itself to using temp (# or##) tables, and sometimes you need those. One solution is to create a temp schema somewhere – and this can also be in the Production data warehouse database – and materialize ETL “temp” tables there.

          :{> Andy

          Reply
          • Andy echos what I would have said had I not been stuck in a meeting all afternoon. In addition, I should clarify Log is not a log in the sense of the transactional log but more a specialized ETL history log that holds stats about the ETL execution process. That kind of data is often retained for long periods of time for analysis to determine run times, help with performance tuning, etc. In some cases it also provides proof to auditors that our ETL process ran on a regular basis since management made decisions based on data in the warehouse. (And buddy, you just try telling an inspector from the Nuclear Regulatory Commission you don’t have some data they request? Not a pretty sight.)

            As for putting Temp and Staging in a seperate database, yes that is an option, but often I’ve run into issues where we were only allowed one database. Usually it was some sort of cost allocation reason where projects were “charged” (by a central IT group) based on the number of databases they had, the thought being the more databases you had the more of a DBA’s time it would take. I’m sure we’ve all run into situations where decisions are not made based purely on technical merit but some company political or cost rule.

            Excellent post though, seems to have really set off a firestorm of debate. I admit I’m learning a lot by the back and forth of the comments here.

  • Another reason for NOT using schemas are that if you have a schema that is owned for a user, and then later you want to migrate this DB to another server and want to recreate the user you can’t drop the user because it owns a schema:

    You get this:
    Msg 15138, Level 16, State 1, Line 1
    The database principal owns a schema in the database, and cannot be dropped.

    So, you have to change the schema’s owner to anyone else, so then you can drop the user.

    But guess what??? If you granted permissions in the schema to any user (Select, exec, etc) these permissions are lost when you tranfer the owner to someone else, you have to transfer the ownership to anyone else, drop the user, migrate the DB, create the user, transfer ownership again to this user, and grant again all other permissions that you lost when you first changed the owner.

    This ends becoming an administrative nightmare for a DBAs, In fact, I had to make a script to check if a schema is owned for anyone else different than sa that I run every day in all DBs, to avoid future headaches.

    Reply
  • Clifford Dibble
    May 18, 2010 3:07 pm

    Hi,

    I was one of the MSFT PMs who worked on schemas during the Yukon time frame.

    If I recall correctly, goals were (in no particular order):

    a) Give developers the convenience of having separate namespaces in a single database.

    b) Simplify permissions setting for all current and future objects my GRANT-ing at the schema level.

    c) The “WinFS” team asked us for them

    d) Better ANSI/ISO SQL compliance

    e) Needed for the new “sys.” schema

    f) Make it easy to drop users but leave behind the objects they owned (had several customer requests for this). Similarly, it is easy to transfer schema ownership.

    g) Allow schemas to be owned by a role

    h) Allow users to have a specifed default schema (+ modifications to the TSQL name resolution algorithm to make the default schema part of the search path).

    Regards,
    Cliff

    Reply
    • Thanks for the insight Clifford, as an occasional developer myself I’m a fan of A. And as someone who gets into debates with others over the merits of SQL versus other DB platforms D is always helpful.

      Reply
  • In our large enterprise (hundreds of SQL servers, thousands of SQL databases), our SQL 2000 environment was extremely difficult to control and maintain, especially when it came to security. We refer to it as our “Wild, Wild West”

    Because of the problems we encountered, we implemented some pretty strict policies for all SQL2005+ databases developed in-house:

    1. All objects have to be under a named schema. No DBO objects at all.

    2. All security is set to AD groups at the schema level, with a fairly narrow set of options (pre-defined “webuser” permissions vs “developer” permissions, some exceptions allowed on appeal)

    The main thing that it does is force our dev teams to at least *think* a little bit before they implement their schema and security, or when they upgrade a database.

    Sure, for fully 50% of databases, that means a single named schema “Appname” that contains all objects, but it still makes it much easier for the DBAs to monitor our policy via scripts.

    Reply
  • We have a system which is built on top of legacy database. We decided to use schemas to separate ‘new’ objects from ‘old’ objects, which in vast majority are in dbo schema.
    At first teams were reluctant to adopt idea of creating schemas in the new system, mostly because of the assumption that schemas = users. Once they were convinced that they are really namespaces, using schemas became very natural. I agree, in homogenous system which is not accessed by end users tweaking security on schema level is not what is done very often.
    We introduced schemas to divide objects into some business oriented groups so it is easier to understand their purpose and meaning of the data in them.
    I wish management studio could display database objects groupped by schemas, for databases grew a lot for last 7 years, and management studio did not change the way of navigation – it is terribly lame to scroll hundreds or thousands of objects and wait for the tree to populate the leaf data.

    Reply
  • schemas are good for many things. We pull data from DB2, which has “libraries” proddta, devdta, stgdta, etc. I have found that using schemas in a SQL DB map nicely to those. I can have 3 schemas with same tables in db, one SSIS package that works on all, just a different schema. Yes of course could have 3 different db’s on my end, but that is 2 more databases to backup, manage, etc, etc.

    Reply
  • I would use schemas more if there was a way to backup an individual schema’s objects without having to resort to segregating it to a different file group. Sure would be cool.

    Reply
    • Jason – yeah, agreed. See, that I could really use. That, or the ability to have different recovery models per schema – I wouldn’t care about logging in my staging schema, for the examples Andy Leonard sited where he puts staging data in the production database.

      Reply
  • For the way I tend to use SQL Server (mainly data warehousing) I’d say that schemas are more of an extended naming convention than anything else, I generally have more databases (DEV/UAT/PROD) or use naming conventions (e.g. fct_table, dim_table, arc_table, stg_table).

    That said, I did some work at a hosted application provider who would build many short-term-use (but sometimes reused over time) applications for their clients. For security and convenience (reusing the same table names) developers would create an entirely new database for each application leading to us having literally hundreds of databases on their servers. The perfect solution to me would have been to have one database per customer and then one schema per application and since developers tend to like copy/pasting you could even configure the schema once at the top of each ASP script to make it easy to re-version code and database in a stroke.

    Reply
  • We have a database that stores tables, views and procedures that are used for reporting against a 3rd party application. I’ve used schemas to keep objects in logical groups (like those related to business metrics). I’ve also used schemas to group objects by the people who have request them so that I can remove objects when the people who have requested them leave.

    Reply
  • Ed Leighton-Dick
    May 19, 2010 4:32 pm

    Count me among those who like schemas. We primarily use them for namespaces within an application’s database and for easier management of security, like others have mentioned. Yes, you can do both of those with other methods (and we do have them on legacy apps), but I find schemas a clean way to implement both.

    To control the “schema sprawl” that some have mentioned, we lock down permissions so that only DBAs can create schemas, and no user-specific schemas are created.

    Reply
  • Having a multiple applications in the same database i see the necessity of using different schemas. Schemas give such a good option of a logical content separation, and doing it the old and not so good SQL 2000 way is something that i do not wish to encounter again.
    Same applies for the security reasons – it is so much easier to manage the access to the collection of objects, and those who are starting out have much less probability of executing a wrong action.

    Reply
  • I’m not sure I understand the idea of using schemas to separate multiple applications in the same database. If the applications do not share any data (tables) or objects then why would anybody ever put them in the same database? If they do share data (tables) or objects then which schema do the shared items belong in? Is there a “Shared” schema? What if there are three applications and some items are shared between only two of them?

    Reply
  • In a couple of my presentations which cover this, I talk about the usefulness of schema. First and foremost is to use the same way we use namespaces in Java, .NET, etc. The security is secondary. But if you can group objects logically, then it works well. I’ve got some sample code which shows a cabin registration system for state parks… based it on my experience with the South Carolina state park system and who can do what. I really need to write up that SQL Server Standard article…

    Reply
  • I grew up on Oracle 8 and 9. In my history, we sometimes stacked multiple applications into a single Oracle Instance seperated only by schema. Outside of the simplified security of using schemas in this model, we could also “pull” an application out of this shared database and into it’s own database if the need arose.

    SQL Server doesn’t really need to worry about this type of model, since it’s so easy to just creat a new database, and the licensing of multiple instances on the same hardware. So to me, using a schema to federate logical objects into their own namespaces or domains just doesn’t make much sense.

    Reply
  • Ok after a serious and funny read of all the posts above, I can write my opinion/way to proceed.
    I started with SQL Server in its 2000 Version, where schemas were embryonic and in some cases a nightmare. So when things changed with 2005, I’ve never felt the need to use them. If I’ve the need to add an abstraction layer, I use Databeas Roles, Stored Procedures whose execution is granted to this or that Database Role. I think I have the same Brent thoughts: “keep it simple”
    As someone said above, I’d appreciate much more Schemas if I could manage schema backup, but I’m so confident with Filegroup/File Backup that this sholud be an option for DBAs are currently using Schemas.
    I have to manage issues harder than Schemas, like poor designed Databases, data consistency, poor performances, so I don’t want to add a layer to be managed.

    Reply
  • Stephanie Sullivan
    June 22, 2010 5:28 am

    1) Having bought hosting before for small companies, and wanting a MSSQL db, those on a shoestring budget, can often only have on database. This si where schema’s start coming in very handy
    2) Logical organisation of tables and other objects is paramount for me. Coming froma non-technical background I shudder when I see tables like [Application]_[ApplicationModule]_[ObjectType]_[Entity]
    3) Again, coming from a non-technical background I’m loathe to do things like cross-server and cross-database queries. It strikes me that data you need should all be in one database. Schemas then allow for logical, but not physical seperation of data.
    4) As I work in BI, I have worked in companies with multiple departments who all perform ad-hoc analysis and modelling from the database. In these instances, it is useful to allow every read-only access to tables in a central schema and give them unlimited access to a team schema which they can create tables in, produce stored procs etc. This is also useful for audit purposes because it allows people to keep results stored for a period of time without cluttering up the important tables.

    In summary, I like schemas because they allow a logical seperation of data and make development easier.

    Reply
  • Idea Deadbeat
    June 22, 2010 11:22 am

    Our team loves schemas. Below is an excerpt from our BP (for BI datamarts). We generally use a 3-4 layer approach (staging layer, data warehouse layer, optional integration layer and presentatino layer. We also add a schema for procedures that support reporting only. Naming conventions without schemas add clutter to the name. One comment above – using databases in lieu of schemas – this means more databases on a set of LUNS (more potential for I/O contention).

    Our guideline:
    There are gray areas as to what schema should be used for views, procedures and functions. The design considerations are based on function of object and security considerations for access to schema. Here are some general guidelines to use during application design.

    1. Views can support selecting data for reports or aid in loading tables. Views in the data warehouse layer support loading integration layer; views in integration layer support relational reporting from integration layer and loading presentation layer tables. Views should not be used in the presentation layer. As a rule data should be persisted in tables in all layers and views are to assist reporting and data loading and are not intended to replace the function of the architectural layer.
    2. Stored procedure that change persisted data should use the schema of the objects being changed. Stored procedures (utility in nature) that cross layers should use the [dbo] schema.
    3. Stored procedures and functions for reporting (no change to persisted data) should use the reporting schema (rpt).

    Exceptions to the above guidelines would be based on security requirements where exposure of underlying data needs to be restricted based on role.

    Reply
  • I’ve used schemas for small projects that may well get integrated into an existing DB. Imagine writing the latest and greatest forum software – nice to create your our schema and if they want to integrate all is separate and clear, and if they decide to stick to a separate db, that works too. But it’s still one schema, just not dbo!

    Reply
    • Andy – now I like that idea. That makes perfect sense to use a separate schema there to avoid problems like table name overlaps and versioning. I like it.

      Reply
  • Adelio Stevanato
    December 3, 2014 9:04 am

    At our company we have an application where each customer has a seperate database. We are thinking or trying to reduce the number of databases by creating a seperate schema per customer.
    The thing is, the permissions stuff is OK it is just the deployment and management that looks very hard.
    i.e. New customers happen all the time. at the moment we just restore a copy of the BASE database and rename it. Backups and restores are fine as each customer has their own database.

    If we move to a schema per customer then we would have to generate a script (on demand as the system changes all the time) for a Base Schema. then modify the script to apply it to a new schema.
    This then has repercusions for any SP/functions that have hard coded schema or database names.

    The final issue is if one ofthe customers (schema) requires their data to be restored back. Very easy if one customer for a database but must harder for the data in a single schema.

    Reply
    • Hi Adelio,

      The “one schema per client in one database” model can sound good on paper, but in practice it tends to be really problematic. You end up with all the pains of a large, unwieldy database which takes a long time to restore and is tricky to performance tune, but none of the simplicity of shared tables. (In addition to the other issues that you mention.)

      Typically you’re better off with either the one-client-per-database model or the one-database-for-everyone-with-shared-tables model. Brent walks through those here: https://www.brentozar.com/archive/2011/06/how-design-multiclient-databases/

      Reply
      • Personally, I wouldn’t take the “one schema per client in one database” approach, but at least the schemas have meaning with this use case as apposed to attempting to logically group tables using schema names.

        (Apologies for practicing necromancy. Hello 2014!)

        Reply
  • I wrote an app for my company, and was soon asked to deploy a second instance of it, followed by several more. So I took all instance-specific data and put it into separate schemas, keeping global data in dbo. So for each instance, I have a set of tables with the same names and structures, differing only by schema qualification.

    On the application side, I removed schema references in queries and had each instance connect to the database with a different user whose default schema was set to the one corresponding to the instance.

    This system has worked really well so far. The only drawback is that a couple of recently added use cases require having data that refers to multiple instances/schemas, which case I had to choose between losing FK/referential integrity, denormalizing, or collapsing all the instance tables into dbo, with instance ids added to all the primary keys.

    Reply
  • I think it is nothing more than a design principal. If I take an example of Class or Interface, I can put all my functions or logic at same place and everything will work well, you can do any dam which you can do by separating your logic into different class.
    Similarly, I can do everything without schema but it’s just different design approach to have it, I don’t see any other point…

    Reply
  • Buddy Arbuckle
    November 4, 2015 4:27 pm

    I have been working on schema base security and found something a little confusing. I have 3 schemas Common, Messaging, and Ordering (schema_id 4 – 7). When I look in sys.objects I see the names of Common and Ordering as sysrowsets and sysallocunits. This is based on the object_id matching the actual schema id and a type of “S”. What I do not see in sys.objects is the object_id which matches the schema_id for Messaging and there is not an object_id of 6. It is in sys.schemas and I can do what I need to do for the task. I was just wondering why there would be no reference in sys.objects and is “Messaging” some kind of key word.

    Reply
  • There is nothing more entertaining than watching a room full of developers try to find a [Vendor] table in the SSMS tree, only to finally discover it at the top of the list in the [profile] schema.

    They make modeling more difficult, they play hell with Entity Framework, SMO become more tedious and I’ve seen tables grow out of their schema. For example, a [static] schema table that receives updates.

    Schemas are not folders, please do us DBAs and Database developers a favor and avoid using them as such!

    Reply
  • Olivier Desmedt
    March 8, 2016 7:13 am

    Hello,
    I use different schema in my staging database for my source tables, one schema for one source and my tec tables with tec that represent all intermediary tables and process in order to fill my staging dbo tables that are the respresentation of my presentation layer.
    I also use schema when I need to separate objects from their use.
    Stored procedure for reporting, …
    These objects have specific use and so will not be used in other context.
    This is to allow fast answer time to bug and failed process.
    Of course It needs to be weighted and not create a schema for everything and also considering objetcs that might be used in a mixed situation…
    when you have a staging with a lot of table from different sources, this help the whole process.

    Reply
  • Surely (ha) you can take your lead from Microsoft with the dbo. sys. and INFORMATION_SCHEMA. schemas. Sometimes MS themselves use prefixes on the table to separate things and sometimes they use a schema.

    Looking at their use of schemas, they seem reserved for ‘grand concepts’ rather than simple groups of like tables and also as people have said likely security divisions.

    Reply
    • …and referencing comments above: You can grow out of a logical table group but you rarely grow out of a ‘grand concept’ There is always a place for an info schema and always a place for system objects, that do not really cross the ‘normal’ tables and the division between them is well defined.

      Reply
    • Not sure that there is a “higher” scheme at microsoft. They are composed by many team and each one has it’s own rules…
      For example 10 years ago, we saw 2 “example” db coming into sql server… Pubs and adventureworks… Simply because they had decided that ms access would disappear, they had assigned the ms access team to sql server and they compete each others saying their example db was the best…
      There are other example where you can see that they put things together without any common idea…

      Reply
  • what if I use a framework like Django that has ORM?

    Reply
  • Wow… Y’all have missed one of the most real reasons I like to use schema’s: Intellisense. (I hear some of you snorting and rolling your eyes… I have very sensitive hearing.) But, here’s the deal, Schemas are USEFUL for both security AND organization. And that organization leads to productivity gains. I’m a pretty fast typist, but I probably write 50+ queries a day on average, and using schemas and leveraging intellisense is the difference between typing 6 characters vs. 16+ characters for something like Settlement.Type (“Se” + [tab] + “.T” + [tab]) vs (dbo.SettlementT + [tab]) 100 times per day (considering most queries I write use 2 or more tables). It also helps “enforce” naming convention – which developers tend to screw up if left to their own devices… you end up with stuff like: dbo.SettlementType, dbo.SettlementsSource (plural), dbo.StlmtEvnt, and dbo.TransSettlement all in the same database. If it makes life a little easier and has additional possible benefits (security management), then why NOT? In code, we use Namespaces to keep our code organized (as well as to provide some scoping benefits as needed). It is necessary and best practice. I don’t see why data organization in DB should be treated differently. In a well-normalized relational database where each table has a pretty clear and specific context, Schemas make sense to me even if only for organizing (and to help me with Intellisense when writing queries). Aside from the schemas not being well-thought-out to begin with and diverging from original intent, I don’t see any legitimate downside… and even in that case, you are no worse off than you are with just naming conventions (that everyone will start to ignore anyhow).

    Reply
  • Clearly Brent Ozar is highly opinionated, and his opinions are against using schema’s as organizational tools, and he is going to fight tooth and nail to discredit any attempts to use schema’s in such a way.

    Brent, your arguments cannot be proved. “But if you don’t know the end DBA, how can you predict how they would want the objects organized into schemas?” uhhh… maybe don’t. I don’t design based on predicting the expertise, mood, or opinions of a DBA. I build things so that they work well, make sense, are logical, and all-told generally bad-ass. It sounds to me like your db’s suck. I hope they don’t since you seem to be well established. But based on your closed-minded opinions, I would not hire you. You’re welcome to your opinions, but don’t push your opinions on me as facts. I’m just guessing, of course, but I would imagine your databases do not hold up well over time. You’re too narrow minded.

    Reply
  • Jyoti Sankaran
    May 18, 2021 9:42 pm

    Brent,
    Apropos the discussion on SQL Server schemas, I am more than a little amused at your re-igniting an old issue which you have passionately argued against in your blog many moons ago. I think it was in Dec 2016 or 2017 that I ran into you in the corridors of a SQL Saturday conference in Washington DC. As a schema aficionado myself who had followed your blog discussion with interest I could not help bringing up your stubborn opposition to the concept during our brief conversation and expected a passionate, dogmatic reply from you. To my surprise, after listening to me patiently without interruption, you smiled at me and gave a reply (I forget the details) which suggested that you might be open to a “conversion”. Maybe your mind was on the keynote talk you were going to give later that afternoon. But what struck me was your humility, considering that you are a “rock star” in SQL Server circles! And to that I shall tip my hat and say “three cheers”!

    Reply
  • We use schema as follows:
    1. dbo for all production tables. This is the heart of the system.
    2. audit schema – when ever a table is updated, the previous version of the row is saved in an audit table with the same structure. that audit table is in a different schema.
    3. import schema – we import data in a temp table then post into real tables. these temp tables are in import schema.
    4. log schema. we log user clicks, request, response para, etc. These logs are in a log schema.
    Then after every few months, we truncate tables in audit, import and log schema.

    Reply
  • Daniel Dickler
    July 9, 2021 10:14 pm

    Very nice discussion here and it looks like the common ground is that schemas have their merits but can easily be misused and contrary to their purpose may mess things up.

    Prefixes instead of schemas may be as useful but by considering the simpler assignment of privileges, schemas can save some time and effort as emphasized by others here.

    I am using schemas as a design pattern to keep things physically separated that are logically separated, e.g. with an application that transforms data from input tables and saves the results in output tables. Input tables follow a 3NF form while output tables follow a star schema. Referential integrity is in place within each schema. Effectively the schema is separating the models. In between the input and output schema we have a calc schema. Here the application developers don’t have referential integrity in place, do not store any historical data, and don’t follow one of the strict schema designs (3NF, star) for performance reasons.

    Process wise ETL loads source data to tables residing in input schema where the history of input data is stored, application reads input data into calc schema, performs the calculations and stores data in tables of output schema where the history of results are stored, the OLAP engine is attached, and finally gives an option to wipe (truncate) calc schema tables before it starts another run.

    Having all schemas in just the dbo schema with prefixes in*, out*, calc* on each table, view and procedures like in2calc* and calc2out* is of course an option, but since in and out tables are growing, taking a backup of the database for support reasons can be very cumbersome. Hence, some of my colleagues would prefer to carve out the calc schema into a sperate database (understandable). So the discussion diverted from one schema (dbo) vs. multiple schema to one database with multiple schemas vs. multiple databases with one schema each.

    Backing up all objects that belong to a schema is always a problem especially if objects of the schema that is about to be backed up refer to objects of another schema. One of the reasons why I believe schema backups are not available in SQL Server. Other DBMS that allow schema backups will drop warnings/errors as well if it detects dependencies on other schemas. Luckily there are tools available to carve out objects of one schema to a new or different database.

    Long story short, I think the answer to whether one should (1) just dump everything into dbo schema and add meaningful prefixes to tables, views, and stored procedures, (2) create different schemas instead of prefixes, (3) create different databases, all depends on the type of application, the type of process (workflow), the type of users/roles (security), and the performance of the database/application as the database grows (scalability). The developers perspective alone is not sufficient but it is certainly a necessary one as well!

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.