The ever-insightful Buck Woody (Blog – @BuckWoody) wrote a post telling you to use database schemas. He writes:
“But I still see a lot of applications, developed by shops I know as well as vendors, that don’t make use of a Schema. Everything is piled under dbo…. But if you’ll use them properly you can make your application more understandable and portable.”
I disagree – and Buck and I started a conversation on Twitter which quickly turned into a group discussion. It merits more than 140-character comments, but since Buck’s blog temporarily has comments turned off, I wanted to give people a place to discuss.
My thoughts: schemas are naming conventions in lipstick. When I look at somebody’s code, the last thing I ever think is, “Man, if these tables just had schemas, I could understand everything better.” In fact, when I see objects broken up into different schemas, I usually end up asking questions like, “Why are we writing user reports against something in the ETL schema?” The answer comes back, “Well, it used to be an ETL table, but then our needs changed, and we couldn’t rewrite all our code, so it just turned into a production table.” Great.
If they’re used right, they let you segregate permissions by groups of objects. But here’s the thing – almost nobody ever does that. It’s more work than it’s worth except in the largest of databases. When we’re talking about a vendor app that gets sold to third parties, you usually don’t want anybody jury rigging permissions together – just query through the app like you’re supposed to, and keep your filthy hands out of my database.
Thoughts?
We have a system which is built on top of legacy database. We decided to use schemas to separate ‘new’ objects from ‘old’ objects, which in vast majority are in dbo schema.
At first teams were reluctant to adopt idea of creating schemas in the new system, mostly because of the assumption that schemas = users. Once they were convinced that they are really namespaces, using schemas became very natural. I agree, in homogenous system which is not accessed by end users tweaking security on schema level is not what is done very often.
We introduced schemas to divide objects into some business oriented groups so it is easier to understand their purpose and meaning of the data in them.
I wish management studio could display database objects groupped by schemas, for databases grew a lot for last 7 years, and management studio did not change the way of navigation – it is terribly lame to scroll hundreds or thousands of objects and wait for the tree to populate the leaf data.
schemas are good for many things. We pull data from DB2, which has “libraries” proddta, devdta, stgdta, etc. I have found that using schemas in a SQL DB map nicely to those. I can have 3 schemas with same tables in db, one SSIS package that works on all, just a different schema. Yes of course could have 3 different db’s on my end, but that is 2 more databases to backup, manage, etc, etc.
I would use schemas more if there was a way to backup an individual schema’s objects without having to resort to segregating it to a different file group. Sure would be cool.
Jason – yeah, agreed. See, that I could really use. That, or the ability to have different recovery models per schema – I wouldn’t care about logging in my staging schema, for the examples Andy Leonard sited where he puts staging data in the production database.
I think I smell an MS Connect issue. Who wants to submit it?
For the way I tend to use SQL Server (mainly data warehousing) I’d say that schemas are more of an extended naming convention than anything else, I generally have more databases (DEV/UAT/PROD) or use naming conventions (e.g. fct_table, dim_table, arc_table, stg_table).
That said, I did some work at a hosted application provider who would build many short-term-use (but sometimes reused over time) applications for their clients. For security and convenience (reusing the same table names) developers would create an entirely new database for each application leading to us having literally hundreds of databases on their servers. The perfect solution to me would have been to have one database per customer and then one schema per application and since developers tend to like copy/pasting you could even configure the schema once at the top of each ASP script to make it easy to re-version code and database in a stroke.
[...] Why Use Schemas? (Brent Ozar) [...]
We have a database that stores tables, views and procedures that are used for reporting against a 3rd party application. I’ve used schemas to keep objects in logical groups (like those related to business metrics). I’ve also used schemas to group objects by the people who have request them so that I can remove objects when the people who have requested them leave.
Count me among those who like schemas. We primarily use them for namespaces within an application’s database and for easier management of security, like others have mentioned. Yes, you can do both of those with other methods (and we do have them on legacy apps), but I find schemas a clean way to implement both.
To control the “schema sprawl” that some have mentioned, we lock down permissions so that only DBAs can create schemas, and no user-specific schemas are created.
Having a multiple applications in the same database i see the necessity of using different schemas. Schemas give such a good option of a logical content separation, and doing it the old and not so good SQL 2000 way is something that i do not wish to encounter again.
Same applies for the security reasons – it is so much easier to manage the access to the collection of objects, and those who are starting out have much less probability of executing a wrong action.
I’m not sure I understand the idea of using schemas to separate multiple applications in the same database. If the applications do not share any data (tables) or objects then why would anybody ever put them in the same database? If they do share data (tables) or objects then which schema do the shared items belong in? Is there a “Shared” schema? What if there are three applications and some items are shared between only two of them?
In a couple of my presentations which cover this, I talk about the usefulness of schema. First and foremost is to use the same way we use namespaces in Java, .NET, etc. The security is secondary. But if you can group objects logically, then it works well. I’ve got some sample code which shows a cabin registration system for state parks… based it on my experience with the South Carolina state park system and who can do what. I really need to write up that SQL Server Standard article…
[...] How Much Space Am I Taking Up? Are you running out of space? Filling up drives? Don’t know where to go? Is your SAN administrator complaining about your continued demands for more spindles? Do you need to juggle databases around on different servers but don’t know how you’re going to figure out the size of your index, table, and materialized view filegroups as well as the filegroups you’ve created for every schema just to piss off the DBA? [...]
I grew up on Oracle 8 and 9. In my history, we sometimes stacked multiple applications into a single Oracle Instance seperated only by schema. Outside of the simplified security of using schemas in this model, we could also “pull” an application out of this shared database and into it’s own database if the need arose.
SQL Server doesn’t really need to worry about this type of model, since it’s so easy to just creat a new database, and the licensing of multiple instances on the same hardware. So to me, using a schema to federate logical objects into their own namespaces or domains just doesn’t make much sense.
Ok after a serious and funny read of all the posts above, I can write my opinion/way to proceed.
I started with SQL Server in its 2000 Version, where schemas were embryonic and in some cases a nightmare. So when things changed with 2005, I’ve never felt the need to use them. If I’ve the need to add an abstraction layer, I use Databeas Roles, Stored Procedures whose execution is granted to this or that Database Role. I think I have the same Brent thoughts: “keep it simple”
As someone said above, I’d appreciate much more Schemas if I could manage schema backup, but I’m so confident with Filegroup/File Backup that this sholud be an option for DBAs are currently using Schemas.
I have to manage issues harder than Schemas, like poor designed Databases, data consistency, poor performances, so I don’t want to add a layer to be managed.
1) Having bought hosting before for small companies, and wanting a MSSQL db, those on a shoestring budget, can often only have on database. This si where schema’s start coming in very handy
2) Logical organisation of tables and other objects is paramount for me. Coming froma non-technical background I shudder when I see tables like [Application]_[ApplicationModule]_[ObjectType]_[Entity]
3) Again, coming from a non-technical background I’m loathe to do things like cross-server and cross-database queries. It strikes me that data you need should all be in one database. Schemas then allow for logical, but not physical seperation of data.
4) As I work in BI, I have worked in companies with multiple departments who all perform ad-hoc analysis and modelling from the database. In these instances, it is useful to allow every read-only access to tables in a central schema and give them unlimited access to a team schema which they can create tables in, produce stored procs etc. This is also useful for audit purposes because it allows people to keep results stored for a period of time without cluttering up the important tables.
In summary, I like schemas because they allow a logical seperation of data and make development easier.
Our team loves schemas. Below is an excerpt from our BP (for BI datamarts). We generally use a 3-4 layer approach (staging layer, data warehouse layer, optional integration layer and presentatino layer. We also add a schema for procedures that support reporting only. Naming conventions without schemas add clutter to the name. One comment above – using databases in lieu of schemas – this means more databases on a set of LUNS (more potential for I/O contention).
Our guideline:
There are gray areas as to what schema should be used for views, procedures and functions. The design considerations are based on function of object and security considerations for access to schema. Here are some general guidelines to use during application design.
1. Views can support selecting data for reports or aid in loading tables. Views in the data warehouse layer support loading integration layer; views in integration layer support relational reporting from integration layer and loading presentation layer tables. Views should not be used in the presentation layer. As a rule data should be persisted in tables in all layers and views are to assist reporting and data loading and are not intended to replace the function of the architectural layer.
2. Stored procedure that change persisted data should use the schema of the objects being changed. Stored procedures (utility in nature) that cross layers should use the [dbo] schema.
3. Stored procedures and functions for reporting (no change to persisted data) should use the reporting schema (rpt).
Exceptions to the above guidelines would be based on security requirements where exposure of underlying data needs to be restricted based on role.
I’ve used schemas for small projects that may well get integrated into an existing DB. Imagine writing the latest and greatest forum software – nice to create your our schema and if they want to integrate all is separate and clear, and if they decide to stick to a separate db, that works too. But it’s still one schema, just not dbo!
Andy – now I like that idea. That makes perfect sense to use a separate schema there to avoid problems like table name overlaps and versioning. I like it.