Should we use stored procedures or queries built in the app?

A client asked a great architecture question that deserved its own blog post:

Should we use more “stored procedures” for select, insert, update, delete or should that just be sent as queries from the application?

If you ask a junior database administrator where to put something, she’ll probably say “in the database” because that’s the thing she has control over. That’s fair – DBAs love having control over stuff. However, let’s step back a little.

If your shop lacks T-SQL expertise, but has a lot of sharp senior application developers, then it may be easier to keep the queries in the app. If you simply mandate that “all queries come from stored procedures,” but you don’t give your staff the training necessary to write fast, scalable stored procedures, then the result won’t perform. Database administrators love to bash the queries written by ORMs like Entity Framework and NHibernate, but the reality is that these days, those tools write better queries than untrained developers. Sure, in a perfect world, we’d all be perfectly trained on all of the tools we use, but….

If your shop is good at testing & source controlling app code, but doesn’t have that same level of expertise for testing & source controlling stored procedures, then you’ll probably find versioning and deployment easier when the queries are built into the app code. In the year 2019, testing & source controlling database objects like stored procedures and functions still isn’t quite as easy as app code. It’s getting better, but it’s not there yet.

If you’re good at measuring app code performance – say, with tools like New Relic that instrument every step of your application code – then you’ll find it more natural to track down slow queries with those tools. However, if you’re not good at app performance monitoring, you may find it easier to track down slow queries with plan cache tools like sp_BlitzCache.
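For example, here’s roughly what the plan cache approach looks like – a minimal sketch, assuming you’ve installed sp_BlitzCache from the First Responder Kit on the server you’re connected to:

EXEC sp_BlitzCache @SortOrder = 'cpu', @Top = 10;
EXEC sp_BlitzCache @SortOrder = 'reads', @Top = 10;

Sort by cpu, reads, duration, or executions depending on what you’re hunting, and the worst queries surface whether they came from a stored procedure, hand-written app code, or an ORM.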

If you have predictable performance needs, like your business has the same exact workload for months straight, then you can keep code in the application tier. When you start to see performance issues, you can make gradual changes to the code, test that as part of your regular testing processes, and gradually deploy those changes out through your normal release process.

However, if your business has extremely unpredictable performance needs, like a sudden high volume of transactions every few months, and if you don’t do a great job of performance testing ahead of time, then stored procedures can be a better fit. In the heat of a performance emergency, it can be easier to change and deploy a new stored procedure than deploy an entire new build of your application. For example, during one client’s emergency, I implemented this bizarro stored procedure caching technique because it meant the difference between making money, versus their online store being…not online.

In this client’s case, with a team of sharp app developers, none of whom had database training, and no database developers or DBAs on the team, the answer was clear: keep the queries in the code. I do try to give folks an avenue to up their game if they want, though – and in your case, if you’ve got developers who want to level up, start them on Itzik’s classic, T-SQL Fundamentals.

42 Comments

  • What about reports, SSRS? I’m of the belief that everything depends upon something else, or multiple something elses. What are all of those something elses, and is the overhead of running a query from SSRS one of them?

    • I’ve found that from an everyday dev perspective alone (not even thinking about having to fix and redeploy reports at crunch time) report query code is better off as stored procedures. Like everything in the MS “BI Stack”, the UI element you’re given for embedding T-SQL is a miserably small textbox that doesn’t even use a fixed-width font. I wouldn’t say there was an execution performance overhead of any kind for having your queries in the dataset directly, but there are multiple hidden costs from development to maintenance to diagnosis.

      Judicious use of procedure-backed shared datasets etc. is probably going to give you the least headaches overall, but to be honest I find SSRS to be inflexible in all the places I need it not to be, and a bit of a tired old dog to boot (not that I’m saying I know of anything better at the same price (free), just that it’s very difficult to make a decision in it that doesn’t manifest itself as pain somewhere else down the road).

      With SSRS (and indeed your apps and databases), sticking to one paradigm is arguably one of the most important things you can do. I’ve seen apps before that used non-parametrised stored procedures, parametrised stored procedures, parametrised queries in the app, non-parametrised queries in the app with the results filtered on the client side, dynamic SQL constructed in the app, dynamic SQL constructed in stored procedures, _and_ an ORM simultaneously. The strain on every part of the dev and support lifecycle trying to juggle different ways of doing things is staggering (not to mention that you obviously get crossover between the different approaches, meaning that queries that “do the same thing” end up out of sync with each other).

  • Jason Markantes
    March 27, 2019 9:23 am

    I would add, if your systems have several applications accessing the database (maybe a web app, mobile app, several small internal web interfaces for employees, etc.), then stored procedures could help keep data consistent across the access methods. Another layer of security could be enforced at that layer also.

    • That’s one of my main issues with it. I go into organizations all the time that have code embedded in apps, and often they say they need to be agile. Yet there are often multiple apps, spreadsheets, reports, Power BI, etc., etc., and the people looking after the DB get to a point where they’re not prepared to change anything, as they have no visibility into what any change they make will affect. They know if they change something, someone will scream at them; they just don’t know who, or how big a mess they will have just created. Then, what they’ve actually achieved is the complete opposite of agile.

      More to the point, as well as multiple apps, the apps themselves come and go. Data doesn’t. It’s usually the most valuable thing the organization owns, and it tends to live on and morph from shape to shape over time. Designing your data storage for the needs of one version of one app today isn’t the cleverest thing you’re ever going to do.

  • Jeff Mergler
    March 27, 2019 9:47 am

    If your application must support SQL Server and Oracle, you’re going to code it into the app (and run it through an agnostic Data Access Layer), unless you enjoy doing everything 2x in 2 slightly different dialects. I don’t know how common this is elsewhere, but at my shop, it’s our day, every day: we try to avoid the duplication because who says this and is happy to hear it? “OK, your code passes on SQL Server, now go clone it for our Oracle customers” (cue the sobbing sounds now). Some things that are very loop intensive, like crunching rows of data (maybe in a cursor, gasp!) and creating new data from that, are definitely done in stored procs, but in general we try to code smart SQL in our app and use sp_BlitzCache to find the stinkers on the sucker board. And we do find stinkers, in code -and- in stored procs, so location matters less than SQL competency, in my experience.

    • Jeff – you are absolutely right. The cross-platform thing is a big cause of that heartache. I went down that path as a developer, and I’ve got the scars. Seemed like such a good idea at the time…

  • I second Budd’s question. Don’t SPs make sense for SSRS? It seems as if they would offer easier source code control and allow us to swap in upgrades to individual SPs.

  • “with a team of sharp app developers, none of which had database training, the answer was clear”. The answer is clear to me (but different to the suggested one): if you don’t have someone in your team with critical skills that are needed by the team, you hire someone with those skills, instead of more clones of yourself. The biggest hassle with database code that’s tucked inside apps, spreadsheets, access dbs, reports, etc. is that anyone who needs to modify the DB has no visibility on it at all. All they know is that if they change something, someone will scream at them, and they don’t know who. That actually leads to the opposite of the agility they no doubt claim they were trying to achieve.

    • Greg – yeah, I totally hear you, but in a startup with 3 developers, “hire someone with those skills” is a non-trivial spend. That’s the difference between having, say, 6 months of runway, versus 3 months of runway.

      • What I find frustrating about those teams is that data is often the most important thing they do, yet it’s the area they aren’t skilled in. That is not sensible, even for small teams. But the problem is that it permeates even large teams. I spent a lot of time at a site with 500 devs, all clones of each other, none with data experience, and yet that’s the primary thing they do.

      • To give a specific example: I was at a startup in North Sydney last year. They had 14 developers who’d worked for 4 years on a SaaS application. They were two weeks from needing to show it to the shareholders. The manager told me if they could have 1000 concurrent users, they had a viable business. If it was only 500, the business was shaky. At the time, they couldn’t get past 9. All issues with how they used the DB.

        • John Zabroski
          March 29, 2019 3:58 am

          That still happens today (2019)? I would love to read what they were doing.

        • “They were two weeks from needing to show it to the shareholders.”

          I was laughing out loud before I even finished reading the comment because I knew where that was going. You are such a phenomenally good fit for situations like that – you’ve got the knowledge and such a wonderfully polite, delicate touch. I’d have loved to have been a fly on the wall.

      • There is another thing: DB procedure logic is always going to be quicker than the app, given app developer skills == SQL developer skills. Obviously, you need people who actually know what an execution plan is; SQL gives you enormous possibilities if you know how to use them.

  • Often, in these discussions, the 2 options are considered to be mutually exclusive – “an ORM and stored procedures can’t possibly coexist.” I’m a proponent of the “Not every problem is a nail. Don’t always use a hammer.” approach. Both solutions have their “sweet spots”, and when you deviate from them, there are going to be problems.

    I have seen countless times where developers try to make Entity Framework do things that it is not particularly good at – extremely complex queries, reporting queries, batch DML operations, etc. In these cases performance always suffers significantly, and stored procedures are a much better option. Entity Framework generally allows for reduced development time and query flexibility. There are some very subtle gotchas, as well – selecting all columns, lazy loading, implicit conversion problems from incorrect table mappings, performance regressions from internal calls to DetectChanges(), misuse of LINQ deferred execution, a plethora of issues with IEnumerable.Contains(), etc. There are also certain query patterns in T-SQL that do not have a corresponding syntax in LINQ.

    If you need finer control over the SQL being generated for better performance, use a properly written and tuned stored procedure. With that being said, stored procedures can easily be misused, as well. Brent has documented these problems again and again – optional parameter queries, function calls, cursors, implicit conversions, SQL injection vulnerabilities, etc.
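    To make the optional-parameter problem concrete, here’s a hedged sketch of the classic catch-all pattern and one common mitigation – the table, columns, and procedure name are invented for illustration:

    CREATE OR ALTER PROCEDURE dbo.usp_SearchUsers
        @DisplayName nvarchar(100) = NULL,
        @Location    nvarchar(100) = NULL
    AS
    BEGIN
        SELECT Id, DisplayName, Location
        FROM dbo.Users
        WHERE (@DisplayName IS NULL OR DisplayName = @DisplayName)
          AND (@Location    IS NULL OR Location    = @Location)
        /* One mitigation: pay a compile on every call so the plan
           matches the parameters that were actually passed. */
        OPTION (RECOMPILE);
    END;

    Without the hint (or dynamic SQL built per search), one cached plan has to serve every combination of parameters, and that’s where the pain starts.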

    I say this coming from a hybrid viewpoint – a former developer and current database developer that speaks both T-SQL and LINQ.

    • Alex Friedman
      March 28, 2019 12:16 am

      This is very interesting – I would love to read a blog post detailing all these issues and their solutions!

    • This is where I tend to come down too. We’ve tended to use a combination of stored procs and ADO.NET – stored procs for CRUD and other more complicated tasks, and ADO.NET for simpler things (like fetching values to populate a drop-down box).

      Other considerations:
      Sometimes the app and the related database are done by different groups or even different vendors. When there’s a problem, it is useful to use stored procs as a line of demarcation – so we can quickly find out who owns the problem (or at least who takes the lead).

      Some problems can be solved at the database/stored proc level but don’t really involve the rest of the application. If all the queries are handled in the application, then you’re in for a (typically more complicated) “bugfix” / “release” cycle.

  • Every query sent from the application code will require its own dedicated compilation. In the case of SPs in the database, these are not recompiled at every execution request, and it is a big performance factor.

    • This is definitely not true. The text of the query (including all whitespace) is hashed & stored with the execution plan.

      If you dynamically generate the query text, then each distinct formulation will be cached.
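      If you want to see that for yourself, here’s a rough sketch against the plan cache DMVs (nothing assumed beyond VIEW SERVER STATE permission):

      SELECT TOP (20)
          cp.objtype,        -- 'Adhoc', 'Prepared', or 'Proc'
          cp.usecounts,      -- how many times this cached plan was reused
          st.text
      FROM sys.dm_exec_cached_plans AS cp
      CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
      ORDER BY cp.usecounts DESC;

      Run the same ad hoc query a few times and its usecounts climbs; change so much as the whitespace and a brand new row (and plan) shows up.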

  • No, ad hoc queries are cached as well.

  • I don’t have experience with all ORMs, but Entity Framework (starting in version 5) also automatically compiles LINQ queries on first execution, so that the generated SQL statement for the LINQ query is cached for subsequent executions (similar to how SQL Server caches a query plan at the database level). There are some documented exceptions to this rule (IEnumerable.Contains(), for example). The EF-level query caching is a big performance boost, as complex LINQ statements typically take significantly longer to compile than SQL Server does to build a query plan for the corresponding SQL statement.

  • Alex Friedman
    March 28, 2019 5:29 am

    Caught EF generating fun SQL like this one today:

    SELECT
    CAST(NULL AS bigint) AS [C1],
    CAST(NULL AS uniqueidentifier) AS [C2],
    CAST(NULL AS varchar(1)) AS [C3],
    CAST(NULL AS int) AS [C4],
    CAST(NULL AS datetimeoffset) AS [C5],
    CAST(NULL AS datetimeoffset) AS [C6]
    FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
    WHERE 1 = 0

  • Always use sprocs – as DBAs, we should be teaching our devs why.
    1) Consolidated business logic (see Jason Markantes’ comment) was the reason the stored procedure layer was built in the ’70s.
    2) Multiple query plan cache, based on parameters that survive reboot.
    (Greg Low) Yes, ad-hoc query plans are cached as long as the hash and the parameter types are the same. Pass in a null and the plan is void. If you have an OR against the values, then the plan you are using might be very expensive, instead of leveraging a better plan (OR [deleted] = @deleted).
    3) Leverage the concept for Oracle. Every other DB engine will require a DAL to be constructed or use ad-hoc. If you build your DAL with a view for both, then you can leverage either. Call GetCustomer {if SQL Exec else Build Query}

    The only permission I grant in Secure Databases is EXEC
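    For readers who haven’t seen that pattern, a minimal sketch – the role and procedure names are invented:

    CREATE ROLE AppUsers;
    GRANT EXECUTE ON SCHEMA::dbo TO AppUsers;              -- procs only, no direct table access
    GRANT EXECUTE ON OBJECT::dbo.GetCustomer TO AppUsers;  -- or grant per procedure

    The app’s login is added to that role and can call the procedures, but can’t touch the tables directly.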

    • Randy – I’m going to be a little gentle here, but a lot of your assumptions here aren’t quite correct. It’s a little beyond what I can cover here in a blog post comment, but I’ll just suggest maybe checking out our Mastering Query Tuning classes when you get time. Might be time to hit the books.

    • Randy, this is a little late, but for future readers – while it’s true that SPs are a saviour when you need an optimization or for other reasons, most proponents of modern app architecture will advise that you keep as much of the business logic as possible in the code. I have found this to be a beneficial thing as a dev; the SPs just get me to the data.

  • I really like SQL, so in personal projects everything is a stored proc/inline TVF. But at companies I work for, that is not typically the case. I can see a use case for both. But you also get insight into your procs/functions when you have them all on the database. And you get the performance boost from having actual queries. Right now we hit the tables on a per-table basis, with a few exceptions. So we are just nailing the database instead of using joins. Seems very inefficient to me, but whatever – that’s the way they want to do it, so that is the way I’ll do it.

    I’m also surprised how many high-quality, knowledgeable devs don’t have basic knowledge of SQL, so perhaps it is better this way.

  • Oh, and by the way, I love that book. I feel like I have a fairly decent grasp of T-SQL, but that book still taught me quite a bit. I’m a dev first and a database guy second, so it’s not my main job, but I still find myself doing quite a bit of work in SQL.

    I’ve been passing that book around, well, to at least one dev so far. Once he is done I’ll pass it around to others.

  • Last comment, maybe :-). One downside of using stored procs/inlined TVFs is that it can be more difficult to create abstractions of code, which can be a good thing (you don’t over-abstract) or a bad thing (you need to abstract, but the language makes it difficult or impossible).

    For example, joining tables so that you end up with a single set of languages, and ordering them efficiently, requires the same language filter in every query that needs it. But since T-SQL doesn’t have a good way to abstract that code, you end up either doing code generation or manually changing it in all the different stored procs/functions.
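    For what it’s worth, an inline table-valued function is about as close as T-SQL gets to that kind of reusable abstraction – a rough sketch with invented table and column names:

    CREATE OR ALTER FUNCTION dbo.fn_Translations (@LanguageCode char(2))
    RETURNS TABLE
    AS
    RETURN
        SELECT t.ItemId, t.LanguageCode, t.TranslatedText
        FROM dbo.Translations AS t
        WHERE t.LanguageCode = @LanguageCode;   -- the filter lives in one place
    GO

    -- Reused from any proc or query; the optimizer inlines it like a view.
    SELECT i.ItemId, i.Name, tr.TranslatedText
    FROM dbo.Items AS i
    INNER JOIN dbo.fn_Translations('en') AS tr ON tr.ItemId = i.ItemId
    ORDER BY tr.TranslatedText;

    It only goes so far, though – anything the function can’t express still has to be repeated or generated.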

    So, as much as I love SQL, it has its warts.

  • […] Should we use stored procedures or queries built in the app? Brent (T) discusses the scenarios when queries from an app might be reasonable. […]

  • John Hennesey
    April 1, 2019 7:40 am

    Goodness, this hit some nerves with the community! 🙂 I used to believe putting it in the db was the best approach [it can cache the plans much cleaner, right?]. But then I started working in my current shop.

    * Support both Oracle and SQL Server.
    * Prevent customers from modifying the stored procs [they really do that?!? Yep].
    * Stronger devs than db people.

    As with everything we do, there are pros and cons to each approach – in our case it opened my eyes and made me much more accepting of the embed-in-code approach.

  • In my opinion, it comes down to the speed at which you can deploy fixes. If your SQL statement has a bug, and that statement is in a stored procedure, I can update the stored proc on the client’s system right away to get them working, and then later follow up with that in an official software update. If it’s in application code, I’m forced to deploy new software to fix the issue. Depending on your application deployment strategy, that may or may not be a big difference.
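    In stored-procedure terms, the hotfix is just an ALTER – a minimal sketch with an invented procedure, table, and predicate:

    ALTER PROCEDURE dbo.usp_GetOpenOrders
    AS
    BEGIN
        SELECT OrderId, CustomerId, OrderDate
        FROM dbo.Orders
        WHERE Status = 'Open'
          AND IsDeleted = 0;   -- the one-line fix ships here, no application build required
    END;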

    In the products I work on, we try hard to reduce the time between finding the bug and having a new build ready to install, so it isn’t really an issue. Most of the time the bugs aren’t in the interface to SQL Server, so keeping that cycle time as small as possible brings benefits to all bug classes.

  • Apologies if someone already addressed this, but one thing I prefer about stored procs over queries embedded within the application is that stored procs make it so much easier to keep track of dependencies. For example, if I need to make a schema change or upgrade my database version I can run a simple query to find all of my code dependencies. Not so with embedded queries.
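    For example, here’s a rough sketch using the built-in dependency views – the table name is invented:

    -- Everything in this database that references dbo.Customers
    SELECT referencing_schema_name, referencing_entity_name
    FROM sys.dm_sql_referencing_entities('dbo.Customers', 'OBJECT');

    -- Or the catalog view equivalent
    SELECT OBJECT_SCHEMA_NAME(referencing_id) AS referencing_schema,
           OBJECT_NAME(referencing_id)        AS referencing_object
    FROM sys.sql_expression_dependencies
    WHERE referenced_entity_name = N'Customers';

    Queries that live in app code (or in an ORM) are invisible to both of these, which is exactly the problem with embedded queries.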

    • John Zabroski
      April 1, 2019 2:29 pm

      Phil, that is why I created the following ticket with Entity Framework: https://github.com/aspnet/EntityFrameworkCore/issues/12669

      It appears this suggestion has been reiterated many times but keeps getting closed, such as this one: https://github.com/aspnet/EntityFrameworkCore/issues/14134
      https://github.com/aspnet/EntityFrameworkCore/issues/15127

    • That particular advantage of stored procs can be addressed by creating a data service that all application code must use. The database cannot be accessed directly in any way (i.e., no client-side dynamic SQL or prepared statements) and the schema is not exposed.

        • But that only works in simple environments. Most serious environments have many app layers that need to access the DB. How would you even construct a data service that could service, say, Java apps and Excel?

        • Mark Messer
          July 25, 2019 5:48 pm

          I worked for a San Francisco-based financial software company that has been selling trading software to the largest financial institutions in the world for over twenty years. The API is via RMI – clients would connect to a data server that exclusively talked to the underlying relational database. The system is also pub/sub event driven. The API dealt with business concepts, not tables, rows and attributes. There are many engines (tiers) which subscribe and publish events and make calls to the data server, for example a market data engine, position engine, message engine, etc. A client can create their own engines and remote API’s to customize. Of course they’ve modernized the architecture a bit over the years (“cloud”, “microservices”), but the general architecture remained pretty much the same for at least fifteen years.

  • Adelio Stevanato
    April 2, 2019 2:29 am

    The main issue with having lots of code outside of the database is the assumption that there is only a single route to get data into the database.
    We have at least two different ways to get data into the database: SSIS and a web-based app.
    SSIS only sees the database, so to my mind as much logic as is possible/reasonable should be at the lowest level.
    I still hear some people arguing that we do not need referential integrity in the database because it “slows it down” and it can all be done in the front end. O.M.G.
    Well, I can tell you that I have worked on applications where the database has ZERO foreign keys or check constraints and many routes to get data into the database, and you can guess how well that works!
    To my mind, if you can put FK or check constraints (or even triggers) in the database, then it does not matter how many paths the data comes in from – you should end up with a lot fewer issues with the data. All it takes is someone performing a manual update to some data and missing out some important stuff to mess up the data.
    I do not care if the FK makes the update take a little longer. Generally the speed difference is minor, but the chances of invalid data getting into the database are greatly reduced. Fixing invalid data retrospectively can also take up a lot of time.
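    A minimal sketch of what that protection costs to write – the table and column names are invented:

    ALTER TABLE dbo.OrderLines
        ADD CONSTRAINT FK_OrderLines_Orders
            FOREIGN KEY (OrderId) REFERENCES dbo.Orders (OrderId);

    ALTER TABLE dbo.OrderLines
        ADD CONSTRAINT CK_OrderLines_Quantity
            CHECK (Quantity > 0);

    Every path into the database – SSIS, the web app, a manual UPDATE in Management Studio – hits the same rules.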

  • Antonio Cortina
    May 30, 2019 2:49 pm

    What about Test-Driven Database Development (TDDD)? There are some good frameworks out there.
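    One of the better-known frameworks is tSQLt; here’s a rough sketch of what a test can look like – the procedure under test and its behaviour are invented for illustration:

    EXEC tSQLt.NewTestClass 'OrderTests';
    GO
    CREATE PROCEDURE OrderTests.[test usp_AddOrder rejects zero quantity]
    AS
    BEGIN
        EXEC tSQLt.FakeTable 'dbo.Orders';   -- isolate the test from real data
        EXEC tSQLt.ExpectException;
        EXEC dbo.usp_AddOrder @CustomerId = 1, @Quantity = 0;
    END;
    GO
    EXEC tSQLt.Run 'OrderTests';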

  • Antonio Cortina
    July 2, 2019 7:20 am

    I’m working on some data-driven projects (BI/warehousing projects) and we’re using test-driven database development to manage database changes, with great results. Does anyone have any experience with it?

