
Advice to an IT newcomer

Women in Technology (yes, Clippy is a woman, too)

We recently got the following question for Kendra and Jes in the Brent Ozar Unlimited® mailbox:

Six months ago I stumbled into the IT world. Do you have any advice for someone (a woman) starting off in the IT industry, especially someone without a computer science degree? I really enjoy working with databases and data. I would love to one day be a Database Administrator and get into business intelligence.

JES SAYS…

There has been a lot written about women in technology – positive and negative. I’m happy to say that my experience as a woman in technology has been incredible. The SQL Server community has been accepting, helpful, and nurturing. If anyone, man or woman, came to me, asking how to succeed, I’d give them this advice.

Ask questions. Constantly learn. There is no such thing as too much knowledge. The most successful people I know in IT – not just the SQL Server world – are lifelong learners. They read books and blogs, attend user group meetings and conferences, and learn new programming languages in their spare time. Build your own computers. Know how to set up a network. You don’t have to be an expert at every facet of technology, but knowing enough to talk to other people in “their” language will go a long way.

Speaking of user groups…find the closest user group, and attend it regularly. Don’t have one nearby? Start one, or join a virtual user group (like those at http://www.sqlpass.org/PASSChapters/VirtualChapters.aspx). Not only will you learn about things you may not be exposed to at work, you’ll have a chance to build a network. I can’t tell you the number of questions I’ve gotten answered through this channel, or the number of people I know that have gotten new (better!) jobs this way.

Never, ever be afraid to say, “I don’t know, but I can find the answer.” People will respect you far more if you are honest. Don’t try to be a know-it-all. If you haven’t dealt with a particular technology or situation before, acknowledge that, make note of what the person is asking for, and research it later. You’ll learn something new, and you won’t get caught giving bad – or wrong – answers.

Don’t think, “I’ll never know as much as him/her.” Yes, there are some people in IT that started building computers or robots or software before they could drive a car. Instead of thinking that you will never know as much as they do, remind yourself how many years of experience they have – and how they can help you. Ask to pick their brains. Ask them what books they read or what tools they use. Learn from them.

Most of all, don’t fall victim to the so-called “impostor syndrome”. Someone always appears smarter, faster, more organized, more accomplished, less stressed, or hasn’t spilled coffee all over herself yet today. Don’t let that invalidate where you started from and what you have accomplished. Keeping a blog – even if it’s private – that you can go back and reference over the years is a great way to value yourself. I know I’ve seen a dramatic change in my writing over five years, from both a technical and editorial perspective.

Good luck! Being in IT – and especially the SQL Server field – is exciting and rewarding.

KENDRA SAYS…

Remind yourself that you’re qualified to be a DBA. You mention that you don’t have a computer science degree. Great news: that’s not required. There’s no degree or certification that is the Official DBA Training Program or Proving Ground.

I had a period where I felt I wasn’t “legitimately a DBA” because I also worked with some C# and drove some non-SQL software development processes. But I was doing some really cool things with SQL Server administration, too. I should have felt awesome about being a DBA, regardless of my job title.

Never feel that your background or your current job have to fit a specific mold for you to be a “real” DBA. There is no mold!

Remind yourself that you are designing your own career as you go. Just like there’s no set educational and certification “solution” to end up with all the right skills, job progression for DBAs is usually not linear. It’s a rare person who starts with a “Junior DBA” job title and works their way up to “Mid Level DBA” and then “Senior DBA”.

Instead, most people these days struggle to find the right training, the right mentor, and discover if they want to specialize (Virtualization? Performance Tuning? Data modeling?), be a generalist, or dive off into uncharted waters (Hadoop? Different Platforms? Business Intelligence?). There are many paths, and there are new paths all the time.

Expect to repeatedly redefine your interests and redesign your career. Make sure that every six months you have a conversation with your manager about what kind of work makes you excited, and where you’d like to be in a year or two.

Remind yourself that you do great stuff – and write it down. For a few years, I had a formal review process where I was regularly required to write out my accomplishments. And you know what? That was great for me! Each item on the list seemed small, but when I put it all together it gave me a new view of myself.

This may be difficult, but it’s worth it. Keeping track of the great things you do boosts your confidence and makes you ready when opportunity comes around.

GET INSPIRED

Is this advice only for women? Heck no! Sometimes it’s just nice to ask for advice from someone who’s part of a group you identify with and hear their perspective.

Want to learn more about how to build a great career working with data? Hop on over to Brent’s classic post, “Rock Stars, Normal People, and You.”

Going, Going, Gone: Chicago Performance Class Selling Out

Your application wants to go fast, so I guess that means our tickets go fast.

For our upcoming Chicago training classes in May, the 3-day SQL Server Performance Troubleshooting class is just about sold out. (There are still about 15 seats left for our How to Be a Senior DBA training class, though.)

If you missed out this round, we’re doing the same classes in Philadelphia this September. Read more about the classes here, or read about how our San Diego attendees liked it earlier this year.

Don’t hold out for additional cities or dates – this is our entire 2014 training calendar. We’re working on our 2015 lineup as we speak, though – let us know what you’re looking for, and you might just influence our training class decisions.

Brent Teaching with Hardware

SQL Server 2014 Buffer Pool Extensions

SQL Server 2014 contains some interesting new features. Although SQL Server Standard Edition is limited to 128GB of memory, teams deploying on Standard Edition have an option to fit more of their working set in low latency storage – SQL Server Buffer Pool Extensions.

How SQL Server Normally Deals With Data

During SQL Server’s normal operations, data is read from disk into memory. At this point, the data is clean. Once the data is changed, it is marked as dirty. Eventually the dirty pages are written to disk and marked as clean; clean pages may be flushed from memory when the data cache (the buffer pool) comes under pressure. At this point the data is gone from memory and SQL Server has to read it from disk the next time a user runs a query that requires the data.
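
If you want to see the clean/dirty split on your own instance, you can peek into the buffer pool yourself. Here’s a minimal sketch using the documented sys.dm_os_buffer_descriptors DMV – column behavior can vary slightly by version:

/* Rough sketch: count clean vs. dirty pages per database in the buffer pool.
   sys.dm_os_buffer_descriptors exposes one row per page currently in memory;
   is_modified = 1 marks a dirty page that hasn't been written to disk yet. */
SELECT  DB_NAME(database_id) AS database_name,
        COUNT(*) AS total_pages,
        SUM(CASE WHEN is_modified = 1 THEN 1 ELSE 0 END) AS dirty_pages,
        SUM(CASE WHEN is_modified = 0 THEN 1 ELSE 0 END) AS clean_pages
FROM    sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY total_pages DESC;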

As long as you have less data than you have memory, this isn’t a problem. As soon as you have more data than you have memory, you’re at the mercy of physical storage.

Right now, some of you are probably saying “So what? I have an awesome EMC/3PAR/Hitachi VX6.” You also probably have SQL Server Enterprise Edition and a pile of unused RAM sticks. This blog post isn’t for you. Go away.

The rest of you, the 99%ers, listen up.

Speeding Up Data Access with Buffer Pool Extensions

SQL Server 2014 Buffer Pool Extensions are our new secret weapon against not having enough memory. Like most secret weapons, there’s a lot of hype surrounding Buffer Pool Extensions (BPE).

The idea behind BPE is a lot like the idea behind virtual RAM (better known as swap space): fast, low latency persistent storage is used to replace a portion of memory. In the case of BPE, SQL Server will use the disk space to store clean buffers – specifically BPE will hold unmodified data pages that would have been pushed out of RAM.
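
Turning BPE on is a single configuration change. Here’s a sketch of what that looks like – the file path and size below are placeholders, not the values from this test:

/* Point the buffer pool extension at fast SSD storage.
   The FILENAME and SIZE here are placeholders - adjust for your environment. */
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = N'E:\SSD\BufferPoolExtension.BPE', SIZE = 64 GB);

/* Verify the current BPE configuration. */
SELECT * FROM sys.dm_os_buffer_pool_extension_configuration;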

To see just how fast this would perform, I created a test instance of SQL Server and decided to find out.

Test Methodology

I ran SQL Server through a relatively boring test harness – TPC-C running through HammerDB. The database was created at a scale factor of 444 warehouses – this yields 44GB of data on disk, or near enough. SQL Server 2014 RTM was installed on Windows Server 2012 on an Amazon i2.8xlarge instance. To find out more about how the instance was physically configured, you can check out the instance type details page.

SQL Server was set up in a fairly vanilla way:

  • Max degree of parallelism was set to 8
  • Max server memory was left alone
  • tempdb was given 4 data files located on a local all SSD RAID 0 of four drives
  • A second all SSD RAID 0 of four drives was reserved for BPE

Tests were run multiple times and the results of the first test were discarded – many changes during this process could clear the buffer pool; as such, the first test results were assumed to be anomalous. The remaining results were averaged to produce the following chart:

It’s faster than not having enough RAM.

Conclusions

Having an appropriately sized buffer pool was far more effective than allocating considerable space to buffer pool extensions. BPE improved performance by 42.27%. This is not an insignificant performance gain, but BPE is no substitute for memory.

BPE will shine for customers deploying in dense, virtualized environments where memory is constrained but SSD is cheap and plentiful. Given the near ubiquity of SSD, and high speed consumer grade SSD being available for as little as $0.50 per GB, BPE may seem tempting. It may even provide some respite from performance issues. However, BPE is no substitute for RAM.

Collecting Detailed Performance Measurements with Extended Events

Analyzing a workload can be difficult. There are a number of tools on the market (both free and commercial). These tools universally reduce workload analysis to totals and averages – details and outliers are smeared together. I’m against using just averages to analyze workloads; averages and totals aren’t good enough, especially with the tools we have today.

Collecting Performance Data with Extended Events

A note to the reader: SQL Server Extended Events let us collect detailed information in a lightweight way. These examples were written on SQL Server 2012 and SQL Server 2014 CTP2 – they may work on SQL Server 2008 and SQL Server 2008 R2, but I have not tested on either version.

We need to set up an Extended Events session to collect data. There are three Extended Events that provide information we want:

  • sqlos.wait_info
  • sqlserver.sp_statement_completed
  • sqlserver.sql_statement_completed

In addition, add the following actions to each event, just in case we need to perform deeper analysis at a later date:

  • sqlserver.client_app_name
  • sqlserver.client_hostname
  • sqlserver.database_id
  • sqlserver.database_name
  • sqlserver.plan_handle
  • sqlserver.query_hash
  • sqlserver.query_plan_hash
  • sqlserver.session_id
  • sqlserver.sql_text

In a production environment you should enable sampling. Sampling is just a predicate that filters out random events, typically by using the modulus of the session identifier (e.g. SPID modulo 5 = 0 will sample 20% of activity). Enabling sampling makes sure that you don’t collect too much data (yes, too much data can be a bad thing). During a routine 5 minute data load and stress test, I generated 260MB of event data – be careful about how often you run this and how long you run it for.
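
To make that concrete, here’s a sketch of what such a session could look like. The session name, file target path, and the 20% sampling predicate (package0.divides_by_uint64 against the session ID) are illustrative – the exact definition is in the downloadable scripts:

/* Illustrative session definition - not the exact script from the download.
   Samples roughly 20% of sessions via package0.divides_by_uint64. */
CREATE EVENT SESSION [DetailedQueryPerf] ON SERVER
ADD EVENT sqlos.wait_info (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
            sqlserver.database_id, sqlserver.database_name,
            sqlserver.plan_handle, sqlserver.query_hash,
            sqlserver.query_plan_hash, sqlserver.session_id,
            sqlserver.sql_text)
    WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))
),
ADD EVENT sqlserver.sp_statement_completed (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
            sqlserver.database_id, sqlserver.database_name,
            sqlserver.plan_handle, sqlserver.query_hash,
            sqlserver.query_plan_hash, sqlserver.session_id,
            sqlserver.sql_text)
    WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))
),
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
            sqlserver.database_id, sqlserver.database_name,
            sqlserver.plan_handle, sqlserver.query_hash,
            sqlserver.query_plan_hash, sqlserver.session_id,
            sqlserver.sql_text)
    WHERE (package0.divides_by_uint64(sqlserver.session_id, 5))
)
ADD TARGET package0.event_file (SET filename = N'C:\temp\DetailedQueryPerf.xel');
GO

ALTER EVENT SESSION [DetailedQueryPerf] ON SERVER STATE = START;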

To get started, download all of the scripts mentioned in this post.

Reporting on Extended Events

Go ahead and set up the Extended Events session and run a workload against that SQL Server for a few minutes. If you’re analyzing a production server, copy the files to a development SQL Server or to a SQL Server that has less workload – you’ll thank me later. Once you’ve collected some data in the Extended Events session, stop the session, and run the results processing script. On my test VM, processing the data takes a minute and a half to run.

The data processing script creates three tables to materialize the XML data and speed up processing; shredding XML takes a long time. Relevant data is extracted from each of the different events that were collected and persisted into the processing tables. If you’re going to be doing multiple analysis runs across the data, you may even want to put some indexes on these tables.

Once the data is loaded, run the analysis query. This doesn’t take as long as the processing script, but it does take some time; it’s worth it. The analysis script collects a number of statistics about query performance and waits. Queries are grouped into 1 minute time blocks and metrics around reads, writes, duration, and CPU time are collected. Specifically, each metric has the following statistics built up:

  • Total
  • Average
  • Standard Deviation
  • Minimum
  • Maximum
  • Percentiles – 50th, 75th, 90th, 95th, and 99th

The same thing happens for each wait as well – each wait is time boxed on the minute and then both signal and resource waits are analyzed with the same statistics as query duration metrics.

In both analyses, the work is performed on a query-by-query basis. At the end of the analysis, you get a multi-dimensional view of the data by time and query. It should be easy to perform additional analysis on the data to create broader time windows or to analyze the entire dataset at once.
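
To give you an idea of the shape of that analysis, here’s a simplified sketch of the pattern. The table and column names (#query_events, event_time, duration_ms) are hypothetical stand-ins for the processing tables built by the scripts:

/* Simplified sketch of per-minute, per-query statistics.
   #query_events, event_time, and duration_ms are hypothetical names -
   the real processing tables come from the downloadable scripts. */
WITH blocked AS (
    SELECT  query_hash,
            duration_ms,
            DATEADD(MINUTE, DATEDIFF(MINUTE, 0, event_time), 0) AS minute_block
    FROM    #query_events
)
SELECT DISTINCT
    minute_block,
    query_hash,
    SUM(duration_ms)   OVER (PARTITION BY minute_block, query_hash) AS total_duration_ms,
    AVG(duration_ms)   OVER (PARTITION BY minute_block, query_hash) AS avg_duration_ms,
    STDEV(duration_ms) OVER (PARTITION BY minute_block, query_hash) AS stdev_duration_ms,
    MIN(duration_ms)   OVER (PARTITION BY minute_block, query_hash) AS min_duration_ms,
    MAX(duration_ms)   OVER (PARTITION BY minute_block, query_hash) AS max_duration_ms,
    PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY duration_ms)
        OVER (PARTITION BY minute_block, query_hash) AS median_duration_ms,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration_ms)
        OVER (PARTITION BY minute_block, query_hash) AS p95_duration_ms
FROM blocked
ORDER BY minute_block, query_hash;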

Want to see what the output looks like? Check out these two screenshots:

Query Performance by the Numbers

Query Waits

Why Produce All of These Metrics?

All of these metrics give us insight into how these queries really run; an average just doesn’t help. Standard deviation alone lets us be aware of the variability of a particular metric – high standard deviation on a particular wait type means that we have a lot of variability in how long we wait on a resource. We also collect percentiles of all of these different metrics to help understand the distribution of data.

With this data at our disposal, we can make a better analysis of a workload. Now we can identify variations of a query that are producing bad plans, taking too long, or just reading an excessive amount of data. Or, better yet, if query performance is constant, we know that your code is just plain awful.

How’s that for a parting thought?

Updated “How to Think Like SQL Server” Videos

On my East Coast user group tour last month, I presented my How to Think Like the SQL Server Engine course to a few hundred folks. I’m always trying to learn and adapt my delivery, and I noted a lot of attendee questions this round, so I updated the course.

This is my favorite course I’ve ever done. In about an hour and a half, I cover:

  • Clustered and nonclustered indexes
  • Statistics, and how they influence query plans
  • How sargability isn’t just about indexes
  • T-SQL problems like table variables and implicit conversions
  • How SQL turns pages into storage requests

If you’ve always wanted to get started learning about SQL internals, but you just don’t have the time to read books, this course is the beginning of your journey.

Attendees give it rave reviews, and this one is one of my favorites:

Go check it out and let me know what you think.

SQL Server 2014 Licensing Changes

With the release of SQL Server 2014, we get to learn about all kinds of new licensing changes. While I don’t work for Microsoft legal, I do have a PDF reader and a web browser. You can follow along in the SQL Server 2014 Licensing Datasheet… if you dare.

Server + CAL Licensing is Still Around

It’s only for Standard Edition and BI Edition.

Microsoft is highly recommending that VMs be licensed as Server + CAL (rather than per core). This can make a lot of sense when there are small, single-application SQL Servers that cannot be consolidated for security reasons. Having a number of Server + CAL licenses for 1 or 2 vCPU instances can be much more cost effective than buying a large number of core-based licenses.

Of course, it makes even more sense to just license the entire VM host…

How much money would it take to give you assurance?

Standby Servers Require Software Assurance

Prior to SQL Server 2014, many shops were able to deploy a single standby server without licensing SQL Server. Log shipping, mirroring, and even failover clustering allowed for an unlicensed passive node, provided that the passive node didn’t become the primary for more than 28 days.

That’s gone.

If you want to have a standby node, you’ve got to pony up and buy software assurance. Head over to the SQL Server 2014 Licensing Datasheet; at the bottom of page three, it reads “Beginning with SQL Server 2014, each active server licensed with SA coverage allows the installation of a single passive server used for fail-over support.” The passive secondary server doesn’t need to have a complete SQL Server license, but Software Assurance is a pricey pill to swallow. In short, Software Assurance (SA) is a yearly fee that customers pay to get access to the latest and greatest versions of products as well as unlock additional features that may have complex deployment scenarios.

In case you were confused, high availability is officially an enterprise feature. Note: I didn’t say Enterprise Edition. I mean enterprise with all of the cost and trouble that the word “enterprise” entails in our modern IT vernacular.

All Cores Must Be Licensed

You heard me.

To license a physical server, you have to license all of the cores. Don’t believe me? Check out this awesome screenshot:

License ALL THE CORES

It’s even more important to consider alternatives to having a number of SQL Servers spread throughout your environment. SQL Server consolidation and virtualization are going to become even more important as SQL Server licensing changes.

Finding new ways to analyze, tune, and consolidate existing workloads is going to be more important than ever before. Your ability to tune SQL Server workloads is going to be critical in successful SQL Server deployments. The days of worrying when the server hit 25% capacity are fading into history – as licensing costs increase, expect server density and utilization to increase, too.

Standard Edition Has a New Definition

“SQL Server 2014 Standard delivers core data management and business intelligence capabilities for non-critical workloads with minimal IT resources.” You can read between the lines a little bit on this one – SQL Server Standard Edition isn’t getting back mirroring or anything like it. In fact – SQL Server Standard Edition sounds an awful lot like the database that you use to run your non-critical ISV applications, SharePoint, and TFS servers.

Software Assurance Gives You Mobility

If you want to move your SQL Server around inside your VM farm, you need to buy Software Assurance. VM mobility lets teams take advantage of VMware DRS or System Center Virtual Machine Manager. This isn’t new for anyone who has been virtualizing SQL Servers for any amount of time. What is explicitly spelled out, though, is that each VM licensed with SA “can be moved frequently within a server farm, or to a third-party hoster or cloud services provider, without the need to purchase additional SQL Server licenses.”

In other words – as long as you’re licensed for Software Assurance, those SQL Servers can go anywhere.

SQL Server 2014 Licensing Change Summary

Things are changing. DBAs need to take stock of their skills and help the business get more value from a smaller SQL Server licensing footprint. Realistically, these changes make sense as you look at the broader commercial IT landscape. Basic features continue to get cheaper. More resources are available in SQL Server 2014 Standard Edition, but complex features that may require a lot of implementation time, and Microsoft support time, come with a heavy price tag.

What happens to in-flight data compression in an emergency?

Data compression can have many uses and advantages, but it also has its drawbacks. It’s definitely not a one-size-fits-all strategy. One of the things to be aware of is that initial compression of a table or index can take quite some time, and will be resource-intensive. It also is an offline operation, so the object can’t be accessed while it’s being compressed or uncompressed. (Clarification: by default, an ALTER TABLE statement is an offline operation. You can declare it online, but, as David notes in the comments, “Although the operation is ONLINE, it’s not completely idiot-proof.”)
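
For reference, here’s a sketch of how a compression rebuild can be declared online. This requires Enterprise Edition, and the table and index names below are placeholders:

/* Sketch: rebuilding an index with page compression while keeping it online.
   Requires Enterprise Edition; object names are placeholders. */
ALTER INDEX IX_YourIndex
ON dbo.YourTable
REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);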

So what would happen if your SQL Server service or server restarted while you were in the middle of (or, as it usually goes, 90% of the way through) compressing an index? Let’s investigate.

I have a table that is 1.1 GB in size, with a 1.0 GB nonclustered index.

Table name Index name total_pages PagesMB
bigTransactionHistory pk_bigTransactionHistory 143667 1122
bigTransactionHistory IX_ProductId_TransactionDate 131836 1029

I need to reduce the size of the nonclustered index, so I decide to compress it. After using sp_estimate_data_compression_savings, I determine that I will benefit more from page compression.
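
Here’s roughly what that estimation call looks like – a sketch using the table and index from this post. The @index_id of 2 is an assumption; check sys.indexes for the real value on your system:

/* Estimate savings for both page and row compression on the nonclustered index.
   @index_id = 2 is an assumption - look it up in sys.indexes for your table. */
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'bigTransactionHistory',
    @index_id         = 2,
    @partition_number = NULL,
    @data_compression = 'PAGE';

EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'bigTransactionHistory',
    @index_id         = 2,
    @partition_number = NULL,
    @data_compression = 'ROW';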

I apply page compression, and allow my server to work.

ALTER INDEX IX_ProductId_TransactionDate
ON dbo.bigTransactionHistory
REBUILD WITH (DATA_COMPRESSION = PAGE);

Now, let’s say there is an emergency during this operation. Perhaps a component in the server breaks; maybe the data center loses power. (Or I restart the SQL Server service.) Uh-oh! What happened?

When the service is running again, I check SSMS and see the following error.

Msg 109, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. 
(provider: Shared Memory Provider, error: 0 - The pipe has been ended.)

What happened to the data? Is compression all or nothing, or is it possible that some pages are compressed and others aren’t? I first check the index size.

Table name Index name total_pages PagesMB
bigTransactionHistory pk_bigTransactionHistory 143667 1122
bigTransactionHistory IX_ProductId_TransactionDate 131836 1029

Nothing has changed – the index is not one byte smaller. This tells me the operation was not successful.

I also check the error log to see what information it can provide.

error log rollback

Two transactions were rolled back. As I am the only person in this instance right now (the benefits of a test environment), I know those were my transactions.
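
If you’d rather not scroll through the log in SSMS, you can search it from T-SQL. Here’s a sketch using the undocumented xp_readerrorlog procedure – its parameters can vary between versions:

/* Search the current SQL Server error log (log 0, type 1) for rollback messages.
   xp_readerrorlog is undocumented, so treat this as a convenience, not a contract. */
EXEC sys.xp_readerrorlog 0, 1, N'rolled back';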

SQL Server has treated my data compression operation as a transaction. If there is a restart at any point, the operation will be rolled back to maintain data integrity.

Microsoft Cloud Rebranded as Microsoft Pale Blue

It’s time to learn another new set of acronyms.

Effective today, Microsoft’s as-a-service brand is changing names again. As recently as last week, the product’s name had been changed from Windows Azure to Microsoft Azure, but industry observers noted that Microsoft’s web pages actually referred to a different name - Microsoft Cloud.

“Our research found that the primary barrier to Azz..As..Accu..cloud adoption was pronunciation,” said an inside source. “No one could say the damn word correctly, and nobody wanted to look stupid, so they just recommended Amazon Web Services instead.”

Thus the new name, Microsoft Cloud – but it ran into more branding problems right away, said the source. “We tried to trademark our virtual machines and databases, but you-know-who had a problem with our names, MC-Compute and MC-Database. People kept calling them McCompute and McDatabase. It probably didn’t help that our combination program was called the Value Menu.”

Enter the New Brand: Microsoft Pale Blue

The new Microsoft Pale Blue logo

The new Microsoft Pale Blue logo

Satya Nadella, Microsoft’s new CEO, picked the name himself. “Microsoft Pale Blue captures the wide-open possibilities of the empty sky. Everybody knows that blue is the best color for logos, so why not take it to the next level? Let’s use the color name as our brand.”

“Microsoft has learned to play to their core strength – product rebranding,” said industry analyst Anita Bath. “Nobody goes through product names like they do. Metro, Vista, Zune, PowerWhatever, Xbone, you name it, this is a company that understands brands are meaningless.”

Nadella has realigned Microsoft’s organizational structure to support the new mission. “Developers are building more and more applications with cloud-based services and Javascript. We have to understand that it’s the right combination for today’s agile startups.” The new Pale Blue & Javascript division will be led by John Whitebread, a developer widely known in the community as the beginning and end of this kind of work.

“We’re also announcing new datacenters – or as we call them, Pale Blue Regions – in China, North Korea, and Iran,” said Microsoft spokesperson Pat McCann. “We don’t believe politics should stop people from having access to the best technology, and we’re committed to aggressively growing our regions. Anytime we see new cloud demand, we’ll open a PBR.”

Today’s announcements did not include any numbers about customers or revenues, however, and questions remain. A few European reporters at today’s announcement asked Nadella if he thought security concerns around Microsoft reading customer data or NSA back doors might be barriers to cloud adoption. Nadella paused for a moment, then said, “No, no way. That can’t be it. It’s gotta be the brand name.”

Refactoring T-SQL with Windowing Functions

You’ve been querying comparative numbers like Year To Date and Same Period Last Year by using tedious CTEs and subqueries. Beginning with SQL Server 2012, getting these numbers is easier than ever! Join Doug for a 30-minute T-SQL tune-up using window functions that will cut down dramatically on the amount of code you need to write.

Looking for the scripts? Grab them below the video!

Script 1: Create Windowing View

--******************
-- (C) 2014, Brent Ozar Unlimited (TM)
-- See http://BrentOzar.com/go/eula for the End User Licensing Agreement.

--WARNING:
--This script is suitable only for test purposes.
--Do not run on production servers.
--******************

/* Script 1: Create Windowing View
   Note: This script requires the AdventureWorks2012 database,
   which can be found here: https://msftdbprodsamples.codeplex.com/releases/view/55330
*/

USE AdventureWorks2012
GO

/* Duplicates the data with a DueDate field set to [year] - 1. */

CREATE VIEW vWindowing
AS
    ( SELECT    CustomerID ,
                CAST(SUM(TotalDue) AS DECIMAL(18,2)) AS TotalDue ,
                DATEPART(mm, DueDate) AS DueMonth ,
                CASE WHEN DATEPART(mm, DueDate) BETWEEN 1 AND 3 THEN 1
                     WHEN DATEPART(mm, DueDate) BETWEEN 4 AND 6 THEN 2
                     WHEN DATEPART(mm, DueDate) BETWEEN 7 AND 9 THEN 3
                     WHEN DATEPART(mm, DueDate) BETWEEN 10 AND 12 THEN 4
                     ELSE -1
                END AS DueQtr ,
                DATEPART(yy, DueDate) AS DueYear
      FROM      sales.SalesOrderHeader
      GROUP BY  CustomerID ,
                DATEPART(mm, DueDate) ,
                CASE WHEN DATEPART(mm, DueDate) BETWEEN 1 AND 3 THEN 1
                     WHEN DATEPART(mm, DueDate) BETWEEN 4 AND 6 THEN 2
                     WHEN DATEPART(mm, DueDate) BETWEEN 7 AND 9 THEN 3
                     WHEN DATEPART(mm, DueDate) BETWEEN 10 AND 12 THEN 4
                     ELSE -1
                END ,
                DATEPART(yy, DueDate)
      UNION
      SELECT    CustomerID ,
                CAST(SUM(TotalDue) AS DECIMAL(18,2)) AS TotalDue ,
                DATEPART(mm, DueDate) AS DueMonth ,
                CASE WHEN DATEPART(mm, DueDate) BETWEEN 1 AND 3 THEN 1
                     WHEN DATEPART(mm, DueDate) BETWEEN 4 AND 6 THEN 2
                     WHEN DATEPART(mm, DueDate) BETWEEN 7 AND 9 THEN 3
                     WHEN DATEPART(mm, DueDate) BETWEEN 10 AND 12 THEN 4
                     ELSE -1
                END AS DueQtr ,
                DATEPART(yy, DueDate) - 1 AS DueYear
      FROM      sales.SalesOrderHeader
      GROUP BY  CustomerID ,
                DATEPART(mm, DueDate) ,
                CASE WHEN DATEPART(mm, DueDate) BETWEEN 1 AND 3 THEN 1
                     WHEN DATEPART(mm, DueDate) BETWEEN 4 AND 6 THEN 2
                     WHEN DATEPART(mm, DueDate) BETWEEN 7 AND 9 THEN 3
                     WHEN DATEPART(mm, DueDate) BETWEEN 10 AND 12 THEN 4
                     ELSE -1
                END ,
                DATEPART(yy, DueDate) - 1
    )

GO

Script 2: The Old Way of Querying

--******************
-- (C) 2014, Brent Ozar Unlimited (TM)
-- See http://BrentOzar.com/go/eula for the End User Licensing Agreement.

--WARNING:
--This script is suitable only for test purposes.
--Do not run on production servers.
--******************

/* Script 2: The Old Way of Querying
   Note: This script requires the AdventureWorks2012 database,
   which can be found here: https://msftdbprodsamples.codeplex.com/releases/view/55330
*/

USE AdventureWorks2012
GO

SET STATISTICS IO ON
GO

/* --------------------------------------- */
-- The old way to blend YTD and detail. 
-- Pretty simple so far with just one join.
/* --------------------------------------- */

SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
, SUM(vsum.TotalDue) AS YTDTotalDue
FROM dbo.vWindowing AS vdet
JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear AND vsum.DueMonth <= vdet.DueMonth
WHERE vdet.CustomerID = 11091
GROUP BY vdet.CustomerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
ORDER BY vdet.DueYear, vdet.DueQtr, vdet.DueMonth

/* --------------------------------------- */
-- That's great, but what if you need other metrics?
-- How do we put QTD and YTD in the same T-SQL statement?
/* --------------------------------------- */

SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
, SUM(CASE WHEN vsum.DueQtr = vdet.DueQtr AND vsum.DueMonth <= vdet.DueMonth THEN vsum.TotalDue ELSE 0 END) AS QTDTotalDue
, SUM(vsum.TotalDue) AS YTDTotalDue
FROM dbo.vWindowing AS vdet
LEFT JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear 
																		AND vsum.DueMonth <= vdet.DueMonth
WHERE vdet.CustomerID = 11091
GROUP BY vdet.CustomerID, vdet.DueYear, vdet.DueQtr, vdet.DueMonth, vdet.TotalDue
;

/* --------------------------------------- */
-- Looks good. What about same YTD last year?
/* --------------------------------------- */

SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
, SUM(CASE WHEN vsum.DueQtr = vdet.DueQtr AND vsum.DueMonth <= vdet.DueMonth THEN vsum.TotalDue ELSE 0 END) AS QTDTotalDue
, SUM(vsum.TotalDue) AS YTDTotalDue
, SUM(vsum2.TotalDue) AS LastYTDTotalDue
FROM dbo.vWindowing AS vdet
LEFT JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear 
																		AND vsum.DueMonth <= vdet.DueMonth
LEFT JOIN dbo.vWindowing AS vsum2 ON vsum2.CustomerID = vdet.CustomerID AND vsum2.DueYear = vdet.DueYear - 1 
																		AND vsum2.DueMonth <= vdet.DueMonth
WHERE vdet.CustomerID = 11091
GROUP BY vdet.CustomerID, vdet.DueYear, vdet.DueQtr, vdet.DueMonth, vdet.TotalDue
;

/* --------------------------------------- */
-- And now we have problems. What WILL give us
-- the Last YTD number we want?
/* --------------------------------------- */
;
WITH va AS
	(SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
	, SUM(CASE WHEN vsum.DueQtr = vdet.DueQtr AND vsum.DueMonth <= vdet.DueMonth THEN vsum.TotalDue ELSE 0 END) AS QTDTotalDue
	, SUM(vsum.TotalDue) AS YTDTotalDue
	FROM dbo.vWindowing AS vdet
	LEFT JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear AND vsum.DueMonth <= vdet.DueMonth
	WHERE vdet.CustomerID = 11091
	GROUP BY vdet.CustomerID, vdet.DueYear, vdet.DueQtr, vdet.DueMonth, vdet.TotalDue
	)

SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
	, vdet.QTDTotalDue
	, vdet.YTDTotalDue 
	, vsum.QTDTotalDue AS SameQTDLastYearTotalDue
	, vsum.YTDTotalDue AS SameYTDLastYearTotalDue
FROM va AS vdet
LEFT JOIN va as vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueMonth = vdet.DueMonth AND vsum.DueYear = vdet.DueYear - 1

/* --------------------------------------- */
-- That's great but now we're four joins deep.
-- How much deeper dare we go? Roll 2d8 to find out.

-- Roll		Joins
-- -----	-----
-- 2-6		6
-- 7-10		8
-- 11-15	10
-- 16		Save vs. Traps or suffer 1d6 + 2 damage.

-- And how's the execution plan looking?
/* --------------------------------------- */

/* --------------------------------------- */
-- Let's add the TotalDue for everyone over that same period
-- ...if we dare.
/* --------------------------------------- */
;
WITH va AS
	(SELECT vdet.customerID, vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
	, SUM(CASE WHEN vsum.DueQtr = vdet.DueQtr AND vsum.DueMonth <= vdet.DueMonth THEN vsum.TotalDue ELSE 0 END) AS QTDTotalDue
	, SUM(vsum.TotalDue) AS YTDTotalDue
	FROM dbo.vWindowing AS vdet
	LEFT JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear AND vsum.DueMonth <= vdet.DueMonth
	WHERE vdet.CustomerID = 11091
	GROUP BY vdet.CustomerID, vdet.DueYear, vdet.DueQtr, vdet.DueMonth, vdet.TotalDue
	)
	, vAll as
	(SELECT vdet.DueMonth, vdet.DueQtr, vdet.DueYear, vdet.TotalDue
	, SUM(CASE WHEN vsum.DueQtr = vdet.DueQtr AND vsum.DueMonth <= vdet.DueMonth THEN vsum.TotalDue ELSE 0 END) AS QTDTotalDue
	, SUM(vsum.TotalDue) AS YTDTotalDue
	FROM dbo.vWindowing AS vdet
	LEFT JOIN dbo.vWindowing AS vsum ON vsum.CustomerID = vdet.CustomerID AND vsum.DueYear = vdet.DueYear AND vsum.DueMonth <= vdet.DueMonth
	GROUP BY vdet.DueYear, vdet.DueQtr, vdet.DueMonth, vdet.TotalDue
	)

	SELECT * FROM vAll
	-- ...and so on...

/* --------------------------------------- */
-- I'll just stop here and say 
-- it's a whole lotta work.
/* --------------------------------------- */

Script 3: The New Way of Querying

--******************
-- (C) 2014, Brent Ozar Unlimited (TM)
-- See http://BrentOzar.com/go/eula for the End User Licensing Agreement.

--WARNING:
--This script is suitable only for test purposes.
--Do not run on production servers.
--******************

/* Script 3: The New Way of Querying
   Note: This script requires the AdventureWorks2012 database,
   which can be found here: https://msftdbprodsamples.codeplex.com/releases/view/55330
*/

USE AdventureWorks2012
GO

SET STATISTICS IO ON
GO

/* --------------------------------------- */
-- Remember the old way got us roughly one 
-- table join per metric? 
--
-- Hold on to your seat.
/* --------------------------------------- */

SELECT customerID, DueMonth, DueQtr, DueYear, TotalDue
FROM dbo.vWindowing
WHERE CustomerID = 11091
ORDER BY DueYear, DueMonth

/* We'll start slow by just adding QTD */

SELECT customerID, DueMonth, DueQtr, DueYear, TotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear, DueQtr 
						ORDER BY DueMonth 
						ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS QTDTotalDue
FROM dbo.vWindowing
WHERE CustomerID = 11091
ORDER BY DueYear, DueMonth

/* + YTD */

SELECT customerID, DueMonth, DueQtr, DueYear, TotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear, DueQtr ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS QTDTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTDTotalDue
FROM dbo.vWindowing
WHERE CustomerID = 11091
ORDER BY DueYear, DueMonth

/* What are we waiting for -- let's really see what this can do! */

SELECT customerID, DueMonth, DueQtr, DueYear, TotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear, DueQtr ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS QTDTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTDTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID) AS AllTimeTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotalTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS RemainingFutureTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
	/SUM(TotalDue) OVER (PARTITION BY CustomerID) AS PctRemainingFutureTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear, DueQtr ORDER BY DueMonth) AS ThisQtrTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear ORDER BY DueMonth) AS ThisYearTotalDue
, SUM(TotalDue) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS Rolling12TotalDue
, TotalDue/SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear, DueQtr) AS PctOfThisQtrTotalDue
, TotalDue/SUM(TotalDue) OVER (PARTITION BY CustomerID, DueYear) AS PctOfThisYearTotalDue
, LAG(TotalDue, 1) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth) AS PrevMonthTotalDue
, LAG(TotalDue, 3) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth) AS SameMonthPrevQtrTotalDue
, LAG(TotalDue, 12) OVER (PARTITION BY CustomerID ORDER BY DueYear, DueMonth) AS SameMonthPrevYearTotalDue
, AVG(TotalDue) OVER (PARTITION BY CustomerID, DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTDAvgTotalDue

/* ------------------ */
-- If I want to see how this customer stacks up to others, 
-- I can remove the where clause, add these columns, and filter at the client

--, AVG(TotalDue) OVER (PARTITION BY DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTDAvgTotalDueAllCustomers
--,	AVG(TotalDue) OVER (PARTITION BY CustomerID, DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
--		/ AVG(TotalDue) OVER (PARTITION BY DueYear ORDER BY DueMonth ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS PctVarYTDAvgTotalDueVsAllCustomers
/* ------------------ */

FROM dbo.vWindowing AS va
WHERE CustomerID = 11091
ORDER BY CustomerID, DueYear, DueMonth

Don’t Fear the Execution Plan! [Video]

Have you ever been curious about how SQL Server returns the results of your queries to you? Have you ever opened an execution plan but been bewildered by the results? Have you dabbled in query tuning, but weren’t sure where to focus your efforts? Join Jes as she takes you through the basics of execution plans. She’ll show you how to read them, how to spot common problems, where to go for help, and tools that will make your job easier.
