What the Users Want: Ad Hoc Reporting

Eventually we all run into a situation that makes us cringe: the users want to design their own reports. If this were something as simple as giving users a color picker and some visual layout tools, that would be one thing. Most of the time it’s more complex. Business users know their problem domain very well; they want to be able to view widget sales for the last three quarters broken up by sales person, location, and four other factors that you couldn’t possibly dream of, but you know you have the data for. What are the best options? How do you get the job done?

Dynamic SQL

It’s no secret that I’m a fan of dynamic SQL and when you’re building ad hoc reports, dynamic SQL can be the quickest way to get the job done. For those of you unfamiliar with the term, dynamic SQL is SQL that has been built in the database in response to inputs from some outside source. Dynamic SQL is frequently used to respond to two separate code paths. For example:

Depending on the age of the data, the code will read from one of two tables. The idea, of course, is to keep our implementation invisible from the user. This doesn’t tell us how we can use dynamic SQL to respond to ad hoc queries from users.

In order to use dynamic SQL to respond to user reports, the features must meet some pretty strict requirements – we need to be able to base all of our ad hoc reports on a stable core query. “Stable core query” is just my way of saying “all of these reports join to the same base set of tables.” This makes it easy to dynamically add join conditions to the end of the FROM clause. In effect, we’re just shoving strings together in the database to get data back to the users.

Unfortunately, dynamic SQL doesn’t always produce the best results. It’s difficult to build true ad hoc querying capacity with dynamic SQL. Users can be given a reasonable facsimile of dynamic SQL by providing many canned choices, but creating this SQL may be time consuming, error prone, and ultimately it can become difficult to maintain. The more options that users have, the more complex the T-SQL becomes. Clearly this pattern doesn’t work for large applications.

Lucene and SOLR

Apache Lucene is an open source search engine written in Java. SOLR adds functionality like document indexing (PDF, Word documents, etc), database integration, geospatial search, and faceted search. Both of these products are free and scale independently of SQL Server. Another benefit is that Lucene and SOLR can be fed documents to index in many formats – XML, JSON, PDF, or Word docs are all acceptable. Developers merely have to point the documents at SOLR and they will be indexed.

If you think this sounds an awful lot like full text search, you’d be right. Lucene and SOLR share a lot with SQL Server’s full text search feature. One of the big differences is that they exist outside of the database. Another difference is that you can query SOLR directly and return all of the fields that you want without having to read from tables in a database.

What’s more, SOLR has a very fluent search syntax making it possible to many types of complex queries. SOLR’s rich syntax makes it very easy to construct complex queries according to user needs without having to worry about writing SQL that can perform the underlying search. Finding sales data for the third quarter of the last five years where the sale contained headphones would look like this: items_sold:headphones sale_date:[NOW-5YEARS/DAY TO *] quarter:3. It’s certainly not something that business users could write, but neither is SQL.

OLAP

I’m not a huge fan of OLAP databases because I fear all things I don’t understand.

Okay, that last paragraph is a lie. SQL Server Analysis Services is a fantastic way to build crazy reports, the likes of which the world has never seen. I hadn’t even thought of this mechanism until I talked with a client recently and they told me what they were doing to deliver ad hoc reporting. New or modified data is marked with a timestamp. The data is periodically fed into an Analysis Service cube, right now this is once a day. The users get ad hoc reporting capabilities through the cube. The data currently isn’t real time enough, but the cube processing is fast enough on the current hardware that the frequency could be increased dramatically.

Here’s the best part: many different tools can be pointed at the cube to make this happen. Right now, users are generating reports with Excel and pivot tables, but nothing says that additional front ends can’t be stood up. Excel, Power Pivot, and SQL Server Reporting Services all make it easy for users to create their own reports and get to data in ways that the business can use.

What’s Your Secret?

There are as many ways to implement ad hoc reporting as there are developers implementing it. What are some of the ways that you’ve seen ad hoc reporting implemented. Sound off in the comments.

Previous Post
Read-Only View Permissions and Roles in VMware vSphere
Next Post
My Thoughts on SQL Server 2012’s Licensing Changes

10 Comments. Leave new

  • Great question! While there are, of course, a bazillion options let’s start with one really important constraint right off the bat:

    For the love of all that’s holy, please never report off of your transactional store.

    I know that there’s a historic tension between developers (like me) and DBAs. A lot of this tension comes from trying to make the database accomplish too many conflicting goals. If you separate your data into transactional and reporting, you can optimize each for its intended use. When it comes to reporting, I’m a huge OLAP fan since it tends to structure data the way that business people think. Regardless of what you choose, as long as you separate your concerns you’ll be able to change your approach without causing huge ripples . . . and rifts.

    Jay

    P.S. I’m assuming that if you’re discussing things like ad-hoc reporting, OLAP, Lucene, etc., then you’re talking about a medium-to-large system. The “never report off your transactional store” battlecry doesn’t apply to small systems because, well, they’re small and you can get away with “doing bad things”.

    Reply
    • Well said! Many people fall into the small shop category, and these are all options that make just as much sense for the smaller shops as they do for the big shops. Some options (Lucene and Dynamic SQL) work better than others (OLAP) for small shops because developers can accomplish much of the work – OLAP requires a specialized skill set. Dynamic SQL requires knowledge of string concatenation and Lucene requires reading the docs (at least to get started).

      In a perfect world, DBAs like Brent and Kendra would never get upset with developers like you and I. Thankfully, we can just buy a lot of tiny servers, load them full of SSDs and pretend that resources are infinite, right?

      Reply
  • There is a crazy-good product by Yellowfin (http://www.yellowfinbi.com/) that allows for users to define ad-hoc reports.

    Kind of like Access Report builder (well, in concept anyway). It’s delivered over the web and of course you can lock-down to stop your users from cross joining every table in the db 🙂

    Dave

    Reply
  • I second the call for SSAS cubes. Not only does it lend itself to the way that business users think (like Jay said), but it can be accessed via Excel which is the business user’s #1 goto analytics tool across nearly every industry. And with the bells and whistles added in v2010 (slicers, sparklines, PowerPivot, etc), it makes for a pretty decent ad-hoc reporting solution. Then for the power-users, you can throw a Reporting Services model on top of the Cube and provide them with access to Report Builder.

    Dynamic SQL is just a smoke and mirrors “technique” and can hardly be considered a full “ad-hoc solution”, imo.

    Reply
  • I did some work with the National Health Service (NHS) in the UK. The data was very hiearchichal so SSAS was perfect. We published the cubes online allowing users to self report using a third party control (from Dundas). The users loved it!

    Reply
  • Why Lucene and SOLR remind me so much about 5th generation languages and query by example?

    Reply
  • > the users want to design their own reports
    Crystal Reports 2008 and up will fit perfectly.
    Sure, it’s not a “MS way” to do the job. It’s “the” best way to solve initial problem: “design their own reports”.
    I have experience with both technologies, RS and Crystal, and found Crystal more advanced, user-friendly and more flexible in many ways.

    Reply
  • In my opinion, Ad-Hoc reporting is the Holy Grail of reporting. Give the business users the ability to produce their own reports, when they want them, how they want them.

    The traditional process is to have a dedicated reporting team service the end users, which is timely, inefficient and costly.

    Working with Crystal Reports since 1996, version 5, the product has developed over time. The learning curve is easy, although the admin utility is a bit bloated.

    Microsoft, similar to missing the boat on the web, also missed the boat on reporting as well in my opinion as the space was dominated by Seagate Crystal Reports, Actuate and even Microsoft Access.

    However, they partnered with Crystal embedded in .net and then caught up quickly with SSRS in 2005.

    When I first saw SSRS 2005 I thought it was a kids application, but the more I learned the more I saw the power behind it and the flexibility and awesomeness.

    In fact, I’ve fully converted over to SSRS and no longer develop in Crystal/Universes.

    With that said, SSAS is capable of providing some dynamic ad-hoc reports to the end users, but does not fit every scenario.

    So if someone can develop an true ad-hoc reporting system, they will ride into the sunset…

    Jon

    Reply
  • We provide HR/payroll services to companies and have been using a product called Izenda to do ad-hoc reporting. The dumbing down of the schema is accomplished through database views. Overall, I think the product does a decent job as a web-based ad-hoc report writer.

    But our users still have problems with it being too difficult to use. They start joining in stuff they don’t know what they are doing and then wonder why the data is wrong. We try to coach them to start by copying an existing report but it’s just too complicated for most.

    Might have to check out Yellowfin mentioned above to see if it might be a help for us.

    Reply
  • For small databases Access is perfect for this scenario. Tables can be converted to read only views and linked into Access as DSNless tables. You would then distribute the Access file to power users and let them have at it for custom queries and reports.

    We include a Report Generator in all our Access/SQL projects that has the frame work to quickly add reports for our clients, usually they let us do the reports for them instead of messing around on their own.

    Juan Soto
    Access MVP

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.