Blog

SQL Interview Question: “How do you respond?”

SQL Server
40 Comments

Brent’s in class this week!

So you get me instead. You can just pretend I’m Brent, or that you’re Brent, or that we’re both Brent, or even that we’re all just infinite recursive Brents within Brents. I don’t care.

Here’s the setup

A new developer has been troubleshooting a sometimes-slow stored procedure, and wants you to review their progress so far. Tell me what could go wrong here.

You are now reading this in Pat Boone’s voice.

Remember, there are no right answers! Wait…


[Video] Office Hours 2016/06/01 (With Transcriptions)

SQL Server, Videos
0

This week, Brent, Angie, Erik, Tara, Jessica, and Richie discuss SSMS issues, security auditing, snapshot replication, SSIS Cache Connection Manager, AlwaysON Availability Groups, deadlocks, and Jessica’s trip to Mexico.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-06-01

Jessica Connors: All right, I guess we should be talking about SQL Server.

Erik Darling: Nah.

Brent Ozar: Oh no.

Erik Darling: Boring.

Jessica Connors: That one product.

Angie: Meh.

Erik Darling: Snoozefest.

Brent Ozar: Which is out new today. So ladies and gentlemen, if you’re watching this, SQL Server 2016 is out right now. You can go download it on MSDN or the partner site. There’s places where you can go get it. Developer Edition is free so you can go download the latest version right now. As we speak, Management Studio is not out yet but will be coming any moment.

Jessica Connors: That was our first question too: Is 2016 out yet?

Brent Ozar: Dun dun dun.

Jessica Connors: Are you hearing any rumblings on problems with it?

[Laughter]

Brent Ozar: We all start laughing. There were a lot of problems with the community previews. For example, SQL Server Management Studio would crash every time I would close it. So I’m really curious. Usually you don’t see stuff quite this buggy as you get close to release. But at the same time, I’m like, well, no one ever goes live with it in production the day it comes out anyway. People are just going to get widespread experience in development, in dev environments, and QA, then they’ll go find bugs hopefully and fix them. Hopefully.

Angie Rudduck: Wait. So I shouldn’t install that in our production servers running everything?

Brent Ozar: Yeah, no. I would take a pass for a week or two. Just let things bake out just a little bit.

Jessica Connors: Let it just wait.

 

Jessica Connors: Let’s take a question from Dennis, SSMS question. Is there a way to have SSMS format the numbers in the output messages? Not data in the query, like the row count at the bottom?

Tara Kizer: What are you trying to solve here? Because this is a presentation layer issue. Management Studio, it’s just a tool for us to query data, why does the formatting of the output matter? If you have an application you’re developing in .NET, format your data there. The row count at the bottom. No, Management Studio, there isn’t a way to format it. You can change the font and things like that in the tools options but I’m not sure that that’s what you’re asking.

Richie Rump: Is there a way, Brent? Could we format that?

Erik Darling: One thing you can do, if you’re interested in just having commas, is cast it as money or convert it with a different style and you can get commas put in. But other than that, I’m not really sure what you’re after, so be a little more specific.

Brent Ozar: Well and return it as results. Whatever you’re looking for, return it as results instead of looking at what comes out of SSMS. Then you can format it there as well.
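Erik’s money-cast trick and the newer FORMAT() function can be sketched like this (a minimal example; the variable name is made up for illustration):

```sql
DECLARE @rowcount bigint = 1234567;

-- Erik's trick: style 1 on a money conversion adds thousands separators
SELECT CONVERT(varchar(30), CAST(@rowcount AS money), 1) AS WithCommas;  -- 1,234,567.00

-- On SQL Server 2012 and later, FORMAT() is simpler,
-- though it is noticeably slower on large result sets
SELECT FORMAT(@rowcount, 'N0') AS WithCommas;  -- 1,234,567
```

Either way, this only formats data you return as a result set, per Brent’s suggestion; the row count in the Messages tab itself can’t be reformatted.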

Jessica Connors: Dennis hasn’t replied to us.

 

Jessica Connors: Let’s go to Ben. He says, “[inaudible 00:02:24 old] to SQL. Hearing rumors about going to the cloud, MS, or Amazon, specifically in terms of security. What are the gotchas and pain points? Security is not our forte.”

Brent Ozar: This is so totally different from on-premises because on-premises you don’t have any security risks at all. No one could possibly access your data. I’m sure it’s locked down tighter than the pope’s poop chute. I mean it is completely secure as all get out. Just me, I’m usually like… Erik says, “Pull my finger.” I would say usually it’s more secure because you don’t go wild and crazy with giving everybody sysadmin. So I just turn it back to people on-premises and go, “So let’s talk about your security. Let’s go take a look at what you got. Everybody is SA. You haven’t changed your password in three years? Yeah, I think you should get out of on-premises. On-premises is probably the worst thing for you.” Nate says, “The pope’s poop chute? Really?” Yes. This is what happens when you work for a small independent company. You can say things like “tighter than the pope’s poop chute.” Probably can’t say that but we’ll find out later.

[Laughter]

Angie Rudduck: You’ve already said it at least three times, we’re going to find out. You’re going to get an official letter from the pope.

Brent Ozar: The Vatican, yep.

Angie Rudduck: Yeah.

Brent Ozar: “The pope does not have a poop chute.”

[Laughter]

Erik Darling: Going for a world record, most references to the pope’s butt in one webcast.

Angie Rudduck: Stop it.

Brent Ozar: Dad always said that to me, so yeah, there we go. Someone else should probably tackle the next question.

Richie Rump: Yeah, somebody else talk now, please.

Jessica Connors: Brent, I’ll just put him on mute.

Erik Darling: Looser than Brent’s…

[Laughter]

Erik Darling: Wallet, wallet, wallet.

Angie Rudduck: Wallet on the company retreat.

Brent Ozar: There we go.

Jessica Connors: I’m glad it’s a short week.

 

Jessica Connors:  Question from J.H. “Would creating a server trigger and emailing our DBA team if someone makes changes to the server role be safe? I’ve heard triggers affect performance, but it’s rare in our case that we have server role changes; we just want to catch it if a network admin puts himself in the sysadmin role without letting us know.”

Tara Kizer: We had security auditing at my last job. I’m not too sure what was used. Well, I think the other DBA who set this all up, he just set up a job and queried for the information. Then the job would run every few minutes I believe and would send the DBA team an alert if anything changed.

Brent Ozar: Yeah, I like that. My first reaction was Extended Events.

Tara Kizer: We had really strict auditing that we had to put in place due to the credit card information. It was encrypted but we had to be very careful with everything.

[Erik and Brent speaking at same time]

Brent Ozar: Would you say you had tight security? How tight was security? Go ahead, Erik, I dare you.

Erik Darling: Oh, sorry. I was going to say that you can set up the event data. You got me all flustered now. You can set up event data XML. It’s pretty good for modification triggers like that. It’s not like, you know, if you put triggers on tables and you’re doing massive shifts of data or you know before and after stuff. It’s a pretty lightweight way to just log changes as they happen.
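Both suggestions can be sketched in T-SQL. This is a minimal, hedged sketch: the audit table name is hypothetical, and the server-level role events require SQL Server 2012 or later.

```sql
-- Tara's approach: a job that polls role membership every few minutes
-- and diffs the result against a saved copy, alerting on changes.
SELECT r.name AS role_name, m.name AS member_name
FROM sys.server_role_members AS rm
JOIN sys.server_principals AS r ON r.principal_id = rm.role_principal_id
JOIN sys.server_principals AS m ON m.principal_id = rm.member_principal_id
WHERE r.name = N'sysadmin';
GO

-- Erik's approach: a server DDL trigger that shreds EVENTDATA() XML
-- into an audit table (dbo.RoleChangeAudit is a hypothetical table).
CREATE TRIGGER audit_server_role_changes
ON ALL SERVER
FOR ADD_SERVER_ROLE_MEMBER, DROP_SERVER_ROLE_MEMBER
AS
BEGIN
    DECLARE @e xml = EVENTDATA();
    INSERT INTO master.dbo.RoleChangeAudit (EventTime, LoginName, CommandText)
    VALUES
    (
        @e.value('(/EVENT_INSTANCE/PostTime)[1]', 'datetime'),
        @e.value('(/EVENT_INSTANCE/LoginName)[1]', 'sysname'),
        @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)')
    );
END;
```

Since server role changes are rare, either approach is lightweight; the trigger catches the change immediately, while the polling job avoids any trigger risk entirely.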

 

Jessica Connors: Let’s see here. Question from Terry. “Is there a way to set up databases in an AG without doing a backup and restore?”

Erik Darling: Not a good one.

Tara Kizer: No.

Brent Ozar: 2016 there is. 2016 we get direct seeding where we can seed directly from the primary, so starting today you can. But unfortunately, not before today.
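Direct seeding in SQL Server 2016 looks roughly like this (availability group and replica names are hypothetical):

```sql
-- On the primary: switch a replica to automatic (direct) seeding
ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON N'SQL2016B' WITH (SEEDING_MODE = AUTOMATIC);

-- On the secondary: allow the AG to create the seeded databases
ALTER AVAILABILITY GROUP [MyAG] GRANT CREATE ANY DATABASE;
```

The database then streams directly from the primary over the AG endpoint, with no manual backup and restore.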

 

Jessica Connors: All right, a security question. This is from Nate regarding security auditing. “Any suggestions on getting some basic setup that tracks and alerts for security changes and schema changes?”

Tara Kizer: I don’t know.

Brent Ozar: I don’t know either. Is there like an Extended Event or something you could hook into?

Tara Kizer: Probably. What we had set up for security would have just been queries, just to query for the information. Look at the system tables and views. For schema changes, I don’t know.

Angie Rudduck: I think somebody set up a simple, “Hey, there’s somebody new in this group” for a security group. I think it was PowerShell at my last place just to like all of a sudden somebody is in the DBA sysadmin group. How did you get there? It would fire off of one server in the domain but I don’t know anything about schemas.

Brent Ozar: Yeah, schemas are tricky because you can log DDL changes. The problem is if your trigger fails, then the change to the table can fail and that can be kind of ugly. You can also set up event notifications and dump stuff into a queue with Service Broker, but it is kind of challenging and kind of risky. If you want to learn more about it, search—god, I’ve got to type this woman’s name out—Maria Zakourdaev. So if you search for “event notifications and SQLblog,” that’s what you do: “SQLblog Maria.” SQLblog is all one word. Maria Zakourdaev, and I’m sure I’m butchering her name, from Israel has a post on how you go about setting up event notifications and how they break because they do break under some circumstances.
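A bare-bones version of the event notification setup Brent describes might look like this (queue, service, and notification names are hypothetical; the Service Broker contract name is the built-in one):

```sql
-- In a database with Service Broker enabled:
CREATE QUEUE dbo.DDLAuditQueue;

CREATE SERVICE DDLAuditService ON QUEUE dbo.DDLAuditQueue
    ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);

-- Queue up every table DDL event, server-wide, asynchronously
CREATE EVENT NOTIFICATION ddl_table_changes
ON SERVER
FOR DDL_TABLE_EVENTS
TO SERVICE 'DDLAuditService', 'current database';
```

Unlike a DDL trigger, the notification is asynchronous, so a logging failure can’t roll back the schema change itself, but as Brent warns, notifications can break under some circumstances, so read up before relying on them.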

Erik Darling: Everyone mark it, not only with SQL Server 2016 release today but Brent recommended Service Broker.

[Laughter]

Brent Ozar: It’s a great solution.

Richie Rump: You didn’t see the disdain on my face when he said that? You didn’t see that at all?

 

Jessica Connors: Let’s talk about snapshot replication from Trish L. “I have…

Tara Kizer: I’ve got to go get my coffee.

Brent Ozar: I know, we’re all like, “I’m out of here.”

Jessica Connors: Maybe we could tackle this. “I have snapshot replication which is scheduled to run one time per day but recently I’ve started to see blocking done by the snapshot replication. Do I need to [Inaudible 00:07:53] the distribution agent as well because it is running automatically now?”

Tara Kizer: I’m not sure about that but the blocking, you’re going to encounter that because it has to lock the schema. That’s one of the last steps it does. So anytime you have to do a snapshot, whether it be snapshot replication or transactional replication, I assume with merge replication too. Anytime you have to do that initial snapshot or reinitialize a snapshot, it does block changes—data changes, schema changes, you’ll see a lot of blocking as it’s going through the last bits of the snapshot creation.

Brent Ozar: What would make you choose snapshot replication? Like what would be a scenario where you’d go—or have there been any scenarios where you go, “Hey, snapshot replication is the right thing for something I encountered.”

Tara Kizer: I’ve never used it, but if users are willing to accept that their data is a day old, let’s say. Any time that I’ve used transactional replication, they’ve wanted near real time data. They wanted zero latency. We couldn’t deliver that in replication. But yeah, snapshot replication, it just depends on what your user wants as far as the data goes.

Richie Rump: I’ve used it for reporting solutions.

[Richie and Erik speaking at the same time]

Jessica Connors: What?

Erik Darling: I was asking Tara if a different isolation level would help with that blocking.

Brent Ozar: Oh.

Tara Kizer: We were actually using RCSI so, yeah, it was definitely a schema lock. We definitely still had blocking.

Brent Ozar: Makes sense. It was probably worse without the snapshot, or without RCSI, probably horrible.

Tara Kizer: It was very rare we had to do the snapshot but sometimes replication would be broken for whatever reason and we couldn’t figure it out and we’d just have to restart replication. Our database was large. It took like five to eight hours to do. Not the snapshot portion, the snapshot took like about 45 minutes I believe but there was a lot of blocking during that time.

Richie Rump: I like snapshot replication for reporting purposes, right? So again, you just dump the data over there and it’s okay that there’s a time delay for the reporting aspect and there’s your data.

Tara Kizer: I just wonder instead of snapshot replication if people should be, not backup and restore because that might take too long on larger databases, but a SAN snapshot, a daily SAN snapshot, because it’s just available right away. You don’t have to wait for anything.

Brent Ozar: No schema locks, it doesn’t matter what the volume of change data is, yeah.

 

Jessica Connors: While we’re on the hot topic of replication, there’s another one from Paul. “I am replicating a database using merge and had an issue where if the developers changed a procedure on the original database, the change would not be pushed to the replicated database. Replicate schema changes is set to true. Any guidance on the reason why the changes won’t replicate? I did a snapshot before initiating replication.”

Tara Kizer: So replicate schema changes has to do with the table changes, it does not have to do with stored procedure, views, functions, or anything like that. So if you do an alter table, add a column, that will get replicated if you have the replicate schema changes set to true but you would have to also have in a publication either your current publication or a different one to also replicate the stored procedures.

Brent Ozar: I wouldn’t do that in merge either. Like I would—if you’re going to change stored procedures, just keep them in source control and apply them to both servers.

Tara Kizer: Yeah.
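Tara’s point about adding procedures as their own articles can be sketched like this (publication and procedure names are hypothetical):

```sql
-- Add a stored procedure to a merge publication as a schema-only article,
-- so ALTER PROCEDURE changes flow to subscribers
EXEC sp_addmergearticle
    @publication   = N'MyMergePub',
    @article       = N'usp_GetOrders',
    @source_object = N'usp_GetOrders',
    @type          = N'proc schema only';
```

That said, Brent’s advice usually wins in practice: keeping procedures in source control and deploying them to both servers is simpler and keeps replication out of your deployment path.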

 

Jessica Connors: Let’s move onto a question from Justin, SSIS Cache Connection Manager question. “I want to load several objects into cache, about one to five million records, but can’t figure out how to access that cache’s source of data. It’s quite a bit faster for us to load to a cache versus staging tables. Is this possible? If not, how would you store this?”

Brent Ozar: Have any of us used the caching stuff in SSIS? No, everybody is…

Tara Kizer: No, I’ve used SSIS a lot and have not used that.

Brent Ozar: The one guy I know who does is Andy Leonard. If you search for Andy Leonard SSIS, he’s written and talked about this. I know because it was in his book. I didn’t read the book, I just remember seeing the book. It was on my shelf at one time. It was a great paperweight. Smart guy, really friendly. Just go ask him the question, he’ll be able to give you that right away. Normally we’re all about, “Go put it on stack exchange.” Just go ask Andy. Just go “Andy Leonard SSIS” and he’s super friendly and will give you that answer right away.

Erik Darling: Tell him Brent sent you.

Brent Ozar: Tell him Brent sent you on this.

 

Jessica Connors: Question from Tim L. He says, “I’ve got an ancient Access expert here at my company. He has SA access. He has a lot of ODBC from multiple Access dbs into my 2008 R2 SQL Server. How do I find out what tables he updates? There’s nothing in terms of jobs or stored procedures that references his data pull and updates.”

Tara Kizer: You could do an Extended Event, run a trace, add a trigger.

Brent Ozar: It’s 2008 R2 though. I like the trigger.

Angie Rudduck: I like cutting his access.

Richie Rump: I love that, “ancient.”

Tara Kizer: Yeah, why does he need SA access?

Brent Ozar: Just go ask him. He’s ancient. He’ll be a nice guy. He’s mellow by now. If you run a trace, that’s going to be ugly, performance intensive. The trigger will be intensive.

Erik Darling: Well you can at least filter the trace down to table name.

Brent Ozar: Well but if he wants to know what tables he’s doing, it’s going to be every time…

Erik Darling: Oh, never mind.

Brent Ozar: Yeah.

Tara Kizer: He could filter by his login at least, if that’s what it’s going through at least to connect to SQL Server.

Brent Ozar: And don’t try to log his insert, update, or delete statements. Just put a record in a table the first time he does an insert, update, or delete, and then immediately turn off the trigger on that table, or the trace. But, yeah. That’s tough. Just go ask the guy. Go talk to the guy. It would be nice.
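Brent’s fire-once idea could be sketched as a trigger that logs a single row and disables itself (table, audit table, and login names are all hypothetical):

```sql
-- Logs one row the first time the target login writes to the table,
-- then shuts itself off to keep overhead minimal
CREATE TRIGGER catch_access_writes ON dbo.SomeTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    IF SUSER_SNAME() = N'DOMAIN\AccessExpert'
    BEGIN
        INSERT INTO dbo.WriteAudit (TableName, LoginName, TouchedAt)
        VALUES (N'dbo.SomeTable', SUSER_SNAME(), GETDATE());

        -- fire once, then get out of the way
        DISABLE TRIGGER catch_access_writes ON dbo.SomeTable;
    END
END;
```

You’d repeat this per table of interest, then review dbo.WriteAudit to see which tables he actually touches.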

Erik Darling: Shoot him email.

Brent Ozar: Yeah, shoot him an email. Buy him a bottle of Bourbon.

Erik Darling: Yeah.

Brent Ozar: It’s a human being.

Richie Rump: Yeah, just don’t give away the wine. Right, Brent?

Brent Ozar: If you were going to give somebody wine, you should give them like Robert Mondavi.

[Laughter]

Brent Ozar: He’s Access. He’s not, you know. That’s not true. Cliff Lede, ladies and gentlemen. This webcast is brought to you by Cliff Lede wines.
Jessica Connors: Do any of us participate in SQL Cruise?

Brent Ozar: I cofounded that with Tim Ford. Tim and I cofounded it and when we split up the consulting company versus the training and cruise-type business, I wanted to let him go do his own thing there and not be on it because I felt like I would kind of shadow in on it and make the thing murky. It is a wonderful experience. I strongly recommend it to anyone who thinks about going. It’s fantastic for your professional development. It’s limited to just say 20 attendees and like 5 to 10 presenters, so the mix, the ratio of presenters and attendees is fabulous. You get to hang out with them. You get to have dinners, from all of this you get to know them really well. So it can be a rocket ship for your career and it helps you really build networking bonds with not just the presenters but the other attendees who are there. The downside is you get to hang out with the presenters in hot tubs so that may be a pro or a con depending on what your idea of a good time is there. So it’s not for everybody but it is truly fantastic.

Erik Darling: Grant Fritchey in a speedo, ladies and gentlemen.

[Laughter]

Jessica Connors: Do you still go on the cruise then? Are you done?

Brent Ozar: I don’t. I totally stopped doing that. I go off and do my own cruises. My next one is in Alaska in August I think, going on that one with my parents. But I haven’t done a technical cruise since. Most of the time what I like to do now is just go out on a cruise and not talk to anyone. I like to go out and sit and read books.

Erik Darling: You did Alaska before, right?

Brent Ozar: This is my fifth time I think, yeah. Absolutely love it. It’s gorgeous. I never was a snow kind of a guy but you get out there in the majestic snow and mountains and bears and all that, it’s beautiful.

Jessica Connors: Nice.

Angie Rudduck: Minus the jacket.

Brent Ozar: Yes.

 

Jessica Connors: Let’s talk to Graham Logan, he’s got some problems. He says, “SSMS crashes when expanding database objects in Object Explorer. Database is about 1.2 terabytes and has about two million objects.”

Tara Kizer: Oh good lord.

Jessica Connors: But, he says, “[inaudible 00:15:43] design. It’s not mine. How to view all database objects without SSMS crashing?”

Tara Kizer: You just cannot use object explorer. You’re not going to be able to use object explorer. You can’t use the left pane in Management Studio. You’re going to have to write queries to see things. It’s very unfortunate but that’s a heck of a lot of objects in the database.

Brent Ozar: Before you expand the list, you have to right click on the tables thing and click filter. Then you can filter for specific strings but without filtering, it’s useless… I’d go information schema tables, information schema, yeah, all columns, all kinds of stuff.
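Querying the metadata views directly is the workaround Brent means; a minimal sketch (the LIKE filter string is just an example):

```sql
-- Browse tables without Object Explorer, filtered to a name pattern
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE N'%Customer%'
ORDER BY TABLE_SCHEMA, TABLE_NAME;
```

INFORMATION_SCHEMA.COLUMNS, ROUTINES, and VIEWS cover columns, procedures, and views the same way, returning results instantly where Object Explorer would choke on two million objects.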

 

Jessica Connors: Kyle Johnson has a new one. “We have a 4.2 terabyte database with a single data file. I’m working on a plan to migrate to multiple ones. Shrinking the database to level out the data between files isn’t really practical with a six-hour window of no users. Any other suggestions? Reindexing tables and specifying the filegroups to move the tables to?” From Kyle Johnson.

Brent Ozar: Not a bunch of good options here.

Erik Darling: Brent is getting ready to tell you about Bob Pusateri.

Brent Ozar: I was. You are psychic. You are phenomenally psychic. Tell us more. I want to subscribe to your newsletter.

Erik Darling: Bob Pusateri, which I feel like this webcast has been obscene enough without me saying that, has a blog post about moving file groups, a lot of the gotchas, and you know, bad things that can happen to you. I will track down the link for it and send it to you but I would not do it justice just explaining what goes on it, because it’s scripts and everything, so.

Brent Ozar: Bob had a 25 terabyte data warehouse with thousands of files in it because the prior DBA thought it was a good idea to create a separate file group for every employee and then later came to regret that decision so he has a great set of scripts on how you go about moving stuff around and keeping them online wherever possible. So it’s really slick. So you do that prepping leading up to the six-hour window so that your six-hour window is only dealing with stuff that you can’t do offline, like moving the LOB data if I remember right.
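The core move-a-table trick in that prep work is rebuilding the clustered index onto the new filegroup; a hedged sketch with hypothetical table, index, and filegroup names:

```sql
-- Rebuilding the clustered index onto another filegroup moves the
-- table's in-row data with it
CREATE UNIQUE CLUSTERED INDEX PK_Orders
    ON dbo.Orders (OrderID)
    WITH (DROP_EXISTING = ON, ONLINE = ON)  -- ONLINE requires Enterprise Edition
    ON [NewFileGroup];
```

Note the catch Brent alludes to: LOB data stays on the original filegroup with a plain rebuild, which is why that piece tends to be what’s left for the offline window.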

 

Jessica Connors: Question from Claudio. “I’m trying to understand the differences between the new AlwaysOn Basic Availability Groups in synchronous commit mode and mirroring in high-safety mode, but they look identical except AlwaysOn seems more complicated to set up and manage. Are there any benefits to either solution: features, performance, licensing, reliability? Which one would you recommend we adopt?”

Tara Kizer: Database mirroring is being deprecated so you’re going to want to move over to the AG basic availability groups. Get on it now. It’s the replacement for database mirroring.

Brent Ozar: The drawbacks, so you’ve managed both too. What would you say the strengths of AlwaysOn Availability Groups are over mirroring and vice versa? That’s not a trick question, I promise.

Tara Kizer: Mirroring you’re not failing over groups at a time. You’re failing over a database at a time. So availability groups let you failover in groups which is good when you have an application with multiple databases that it needs.

Brent Ozar: To be clear, so you’re saying the guy is saying Standard too, so you only do one database at a time. You could script those too, just like you would with mirroring. I’m trying to think if there’s anything that would be… have to have a cluster but you don’t have to have a domain with mirroring. But you don’t either with 2016 either. You can do it between standalone boxes.

Tara Kizer: With mirroring, if you want the automatic failovers, you need a witness. With AGs you do need a quorum but it could be a file share on another server, you know, on a file server that you have or a disk on a SAN could be a quorum. Mirroring does require another box, a VM, it can be Express Edition.

Brent Ozar: Yeah, I used to be the biggest fan of mirroring. I’m having a tough time coming up with advantages as 2016 is starting here.

Tara Kizer: I did a lot of failovers with mirroring, log shipping, and then later availability groups and by far I like availability groups best for DR failovers. It was just so much easier. You just run a failover command and you’re done. With mirroring, you’re doing it database by database. Log shipping is, you know, all sorts of restores going on. Mirroring is certainly easy, definitely easy, but I like the slickness of availability groups and readable secondaries and the choice of asynchronous and synchronous.

Brent Ozar: Yeah, that’s where I was going to go too. Because even in Standard, you get choice between synch and asynch now. And you can use one technology that works on your Standard stuff and your Enterprise stuff so you only have to learn one feature instead of learning two. That’s kind of slick too.

Tara Kizer: When we used mirroring, we would use asynchronous mirroring to the DR site then for high availability solution at the primary site we used failover clustering. So availability groups it just solves both solutions in one feature, plus reporting, we got rid of replication.
Jessica Connors: All right. Let’s move on to a question from Chris Woods, a regular attendee. He says, “Migrating an MDF with LOB data, L-O-B data, I don’t know how you call that, from one drive to another with minimal/no downtime. Can you use log mirroring to mirror it to a new database on the same server, then shut down the original during a quick downtime?”

Brent Ozar: You can’t do mirroring to the different database on the same server, can you? You can do log shipping, can you do mirroring to the same instance?

Tara Kizer: No.

Brent Ozar: You can do log shipping to the same instance. That will work. Your downtime will be super fast. Because what your failover process would look like is when it comes time for failover, you do a tail of the log backup up on the main database, then restore that tail-log on the other database. Rename the old primary as like, the old primary database just like “database old.” Then rename the new one as “database new” and then whatever the new database name is or the original database name is. So you could do that in like a 30-second outage. You don’t have to change connection strings because it’s all the same server still. So that’s kind of slick.
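Brent’s cutover steps can be sketched like this (database and path names are hypothetical, and SalesDB_new is assumed to have been restoring log backups WITH NORECOVERY up to this point):

```sql
-- 1. Tail-log backup of the original, leaving it unrestorable to writers
BACKUP LOG [SalesDB] TO DISK = N'X:\backup\SalesDB_tail.trn' WITH NORECOVERY;

-- 2. Bring the log-shipped copy fully online with that tail
RESTORE LOG [SalesDB_new] FROM DISK = N'X:\backup\SalesDB_tail.trn' WITH RECOVERY;

-- 3. Swap the names so connection strings never change
ALTER DATABASE [SalesDB]     MODIFY NAME = [SalesDB_old];
ALTER DATABASE [SalesDB_new] MODIFY NAME = [SalesDB];
```

Because everything stays on the same instance, the outage is roughly the time those four statements take, on the order of seconds.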

Tara Kizer: If this is a SAN drive, even moving from one SAN to the next, we did all this stuff live. I don’t know what the technologies are called but we would move arrays live. The SAN administrators did some magic and it just copied over the data and once the copy was complete, it did a switcheroo between the two pointers, or, I don’t know what the technology was but the SAN can handle this without any down time.

 

Jessica Connors: Rob is adding a new instance to an existing active/active cluster. I think he’s telling us about his process so that we can say yea or nay. He says, “I would need to fail over the existing instances to one node, install the new instance on the node with no instances, service pack it up, and fail over the instances to the node I was just on. Then run the install on the other node, apply service packs, then rebalance the instances.” Does that sound about right?

Tara Kizer: It does but you know we don’t recommend active active clusters. What happens if you lose a node? I don’t at least. I’ve had four-node clusters where all four nodes were active. It’s just a nightmare. If you lose a node, can your other nodes support all of the instances at the same time until you get that other node fixed?

Brent Ozar: Richie is showing something on his iPad. What I would say is…

Erik Darling: It’s too bright.

Brent Ozar: I still can’t see it. We do recommend active active with a passive node.

Tara Kizer: Yeah, okay. Right.

Brent Ozar: Yeah, multi-instance clusters, just have a passive in there somewhere. Your scenario is exactly why you want a passive node laying around.

Tara Kizer: At least what you wrote out here for the question, yeah, that is the process.

Brent Ozar: Also known as miserable.

Tara Kizer: Yeah. At least since SQL Server 2008 we’ve been able to have where it can install it just on one node. Prior to that, all nodes in the cluster had to be online and have the exact right status in order for the installation. Because the installation occurred across all nodes at the same time. Service packs, the engine, everything. On a four-node cluster, there’d always be one node that was misbehaving. It just says, “I need a reboot.” And you’d reboot it 20 times and it would still say, “I need a reboot.” Then finally that one would be okay and now another node would say, “I need a reboot.” It was just ridiculous. So I’m glad that Microsoft changed the installation process starting with 2008.

Brent Ozar: It’s like taking kids on a road trip. “Everybody ready…?” “No.”

Erik Darling: “I have to pee.”

Richie Rump: I got excited, I thought we had a Node.js question but I guess not.

Erik Darling: Never have, never will.

Richie Rump: Brent has.

Brent Ozar: I have.

 

Jessica Connors: Let’s take one more question. Let’s see here. “Good morning, Brent, Tara, Erik, Richie, and Angie,” he says. “Yesterday we had a problem with the process that normally moves data from a table queue and deletes it after it’s done. This is a standalone database. We stopped the inflow of data but it didn’t help. We got thousands of deadlock alerts. I notice that the disk queue length on the log drive is higher than usual. Here is a sample of the deadlock.” He provides it. “Is there anywhere I could look for this issue?”

Tara Kizer: If you’re getting deadlocks you should turn on the deadlock trace flag 1222, maybe run an Extended Event to capture the deadlock graph. Having just the deadlock victim isn’t enough to be able to resolve it.
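Tara’s trace flag and the system_health capture Erik mentions later can be sketched as follows; the XML-shredding query is a commonly used pattern against the built-in session, not anything specific to this environment:

```sql
-- Tara's suggestion: write full deadlock graphs to the error log
DBCC TRACEON (1222, -1);

-- Pull recent deadlock graphs from the always-on system_health session
SELECT xed.event_data.query('(event/data/value/deadlock)[1]') AS deadlock_graph
FROM
(
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s ON s.address = st.event_session_address
    WHERE s.name = N'system_health'
      AND st.target_name = N'ring_buffer'
) AS tab
CROSS APPLY tab.target_data.nodes(
    'RingBufferTarget/event[@name="xml_deadlock_report"]') AS xed (event_data);
```

The resulting XML is the full deadlock graph, showing both participants and the resources they held, which is what you need beyond just the victim’s statement.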

Brent Ozar: It’s a separate technique I think that not a lot of database administrators get good at because it’s one of those things where you’re kind of like, “Hey, you should fix your indexes in your queries.” Then people go off and do their own thing. It’s one of those where when you do want to do it, it takes a day or two to read up and go, “Here’s exactly how the [Inaudible 00:25:00].” There’s also not a lot of good resources on our site for it. We don’t go into details on deadlocks either. Have any of you guys seen resources on deadlocks that you liked?

Erik Darling: I like just hitting Extended Events for it. The system health session has quite a bevy of information on deadlocks and you can view the graphs and everything which is pretty swell.

Tara Kizer: I attended a session at PASS in 2014, Jonathan Kehayias from SQLskills, it was all about deadlocks. It was invaluable information. He went over different scenarios and stuff. He said that he loves deadlocks. It was like, whoa, I don’t know that anyone has ever said that before. But it was really great information. I haven’t looked at—I do read his blogs—but I suspect he’s got a lot of deadlock information on the blog to help you out.

Richie Rump: He also loves XML.

Brent Ozar: He loves XML and Extended Events. If you have a Pluralsight subscription. So Pluralsight has online training. I want to say it’s like $39 a month or something like that. I think Kehayias has a course on deadlocks. I’m not 100 percent sure but if you search for SQL Server deadlocks if Kehayias has a course on there, it would be wonderful.

Erik Darling: Also, if you don’t have Pluralsight but you want to try it, Microsoft has a Dev Essentials site I believe where if you sign up for that, you get a 30-day free trial of Pluralsight and you also get Developer Edition and a copy of Visual Studio that’s free, Visual Studio Community or something for free. So it’s not just the Pluralsight subscription for 30-days but you do get a couple other goodies in there that last you a little bit longer.

Richie Rump: The course is called SQL Server Deadlock Analysis and Prevention.

Angie Rudduck: Someday still has a Pluralsight account.

Jessica Connors: All right guys, that’s all we’ve got for today.

Brent Ozar: But thanks for hanging out with us. Man, time goes so fast now. Gee, holy smokes.

Erik Darling: And they’re sobering up.

Brent Ozar: Well, back to work. The Cliff Lede, ladies and gentlemen. Enjoy the High Fidelity. See you guys next week.

 


[Video] Office Hours 2016/06/08 (With Transcriptions)

SQL Server, Videos
0

This week, Angie, Erik, Doug, Jessica, and Richie discuss DB migration, rebuilding large indexes, recommendation for SQL dev ops tools, best practices for disabling SA accounts, compression, and more!

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-06-08

Jessica Connors: Question from Justin. He always asks us something. Justin says, “Is it advisable to remove the public role’s ability to query sys logins, sys databases, and/or sys configurations in master?”

Erik Darling: Advisable for what? I’ve never done it. I never cared that much. But I’m not like a big security guy. Any other big security guys want to talk about it…?

Doug Lane: Yeah, I’ve never done anything with public’s role and I’ve never seen it be a problem, but again, we’re not security experts.

Erik Darling: Again, we always recommend that when people ask sort of offhand security questions, Denny Cherry’s book Securing SQL Server is probably the go-to thing to read to figure out if what you’re doing is good or bad.

Jessica Connors: Yeah, Justin says that they got audited and [Inaudible 00:00:47].

Erik Darling: What kind of audit was it that brought those up? I’d be curious.

 

Two Servers, One Load Test

Jessica Connors: Let’s move on to a question from Claudio. He says, “I would like to load test a new SQL Server instance with real production data. Is there anything we could put between the clients and two SQL Servers that will intercept the queries, send them to both SQL Servers, and return the response from only one SQL Server?”

Erik Darling: Yes, and I also have a magic spell that turns rats into kittens. No. That’s a bit much and a bit specific. You’re going to have to come up with something else. If you want to get really crazy, you’re going to have to look at Distributed Replay and come back in three years when you finish reading the documentation.

 

How do I configure multi-subnet AG listeners?

Jessica Connors: Okay. Let’s see here. This is a long one from Richard. Let’s tackle this one. “I will be adding a remote DR replica, non-readable, to an existing local availability group on a multi-subnet cluster to be able to use the listener at the DR site. I know a remote site IP address will be added to the listener. Is there anything else that has to be configured in the availability group or cluster besides DNS and firewall rules?”

Erik Darling: Brent?

Doug Lane: Yeah.

Jessica Connors: Where are you, Brent?

Erik Darling: I don’t know actually. I would be interested so I want you to try it out and email me if you hit any errors because I would be fascinated.

[Angie Rudduck enters webcast]

Jessica Connors: Oh, hi.

Doug Lane: Oh, we heard Angie before we saw her.

Angie Rudduck: Thought I had my mute on.

Doug Lane: As for the AG mystery, we’re going to leave that one unsolved.

Jessica Connors: Unsolved mysteries.

 

How should I configure database maintenance tasks?

Jessica Connors: Question from David. He says, “For routine tasks, index maintenance, backup, etcetera, is it preferred to use agent jobs or maintenance plans? It seems to be the DBA preference. Any reasons to lean one way or the other?”
Erik Darling: Ola Hallengren. Angie, tell us about Ola Hallengren.

Angie Rudduck: Ola Hallengren is amazing. I tell every single client about Ola Hallengren. I used it at my last place, in production across every server. You can do all the backups: full, differential, log. You can do it separately for your user databases versus your system databases. You get your CHECKDBs in there, user versus system databases. You even get index optimize, and even better, Brent (aka Erik) has a really good blog post about how you can use it to just do update stats, which is a great follow-up from his post about why you shouldn’t do index maintenance anyway, right? Just update stats. I love Ola. I’m working on a mini-deck to pitch all of his stuff in one instead of just the indexing.
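The stats-only setup mentioned here looks something like this with Ola Hallengren’s IndexOptimize, a sketch built from his documented parameters:

```sql
-- Sketch: skip index rebuilds/reorgs entirely and only update statistics,
-- using Ola Hallengren's MaintenanceSolution parameters.
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,      -- do nothing at low fragmentation
    @FragmentationMedium = NULL,   -- do nothing at medium fragmentation
    @FragmentationHigh = NULL,     -- do nothing at high fragmentation
    @UpdateStatistics = 'ALL',     -- update both index and column statistics
    @OnlyModifiedStatistics = 'Y'; -- skip stats with no row modifications
```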

Erik Darling: Nice.

Angie Rudduck: But I’m too busy with clients.

Doug Lane: Plus, he’s a Sagittarius.

Angie Rudduck: Gemini.

Erik Darling: I’ve heard rumors that I’m a Scorpio but I’ve never had that confirmed.

Jessica Connors: Use your Google machine.

Doug Lane: [Imitating Sean Connery] Do you expect me to talk, Scorpio?

[Laughter]

 

How do I set the default port for the DAC?

Jessica Connors: Let’s take one from Ben. He says, “Oh, SQL stuff. Here’s one. In old SQL, we had to set a registry key to set a static remote DAC port. Is there a better way in SQL 2012, 2014, 2016? What’s the registry key?”

Erik Darling: A static remote direct administrative connection port?

Jessica Connors: Mm-hmm.

Erik Darling: Weird. No, I don’t know, I’ve never done that.

Doug Lane: Yeah, me neither.

Angie Rudduck: What is old SQL? Like what version is old SQL?

[Laughter]

Angie Rudduck: 2005?

Doug Lane: 2005 he says.

Erik Darling: Hmm, I don’t believe that’s changed much since then.

Richie Rump: Yeah, it sounds like a blog post you need to write, Erik.

Angie Rudduck: We’ve got something on the site about remote DAC because…

Doug Lane: That doesn’t say anything about the port though.

Angie Rudduck: No, but it’s pretty detailed, isn’t it? I don’t know maybe go check that out, Ben, and go from there. I think it’s just go/dac. I don’t know. I’m making up things now.

Erik Darling: Brentozar.com/go/dac, D-A-C.

Jessica Connors: What’s the oldest version of SQL you guys have worked on?

Erik Darling: ’05.

Angie Rudduck: 2000.

Doug Lane: In Critical Care, ’05.

Angie Rudduck: Oh, yeah.

Richie Rump: No 6.5 people? No?

Angie Rudduck: Tara is not here.

Jessica Connors: Yeah, she’d probably have a story about the oldest version she’s used. She’s got the best stories.

Erik Darling: “It was on a floppy disk…”

[Laughter]

Doug Lane: I worked on 7 once upon a time. I didn’t actually like do real work on 7, it was just, believe it or not, writing stored procedures in the GUI window.

Angie Rudduck: Query explorer or whatever it is?

Doug Lane: No, it was like the properties of the—it was crazy when I think back on it. There was like no validation of any kind except the little parse button. This was back when Query Analyzer and Enterprise Manager were separate and I was doing it in Enterprise Manager.

Angie Rudduck: We had a 2000 box at my last place and I knew nothing about 2000. I tried logging in there and I was like, “Wait, where is Management Studio?” That was really hard to try to figure it out. The management administrative part is really scary in 2000 and I was on the server directly. It was like already a precarious server about to tip over. So, scary.

 

What’s the best way to rebuild a 2-billion-row table?

Jessica Connors: Question from Joe. He says, “What is the best way to rebuild a very large index without taking an outage or filling the log? Rebuilding after a two-billion-record delete.”

Doug Lane: Oh, are you sure you need to delete two billion rows from a table?

Erik Darling: Maybe he was archiving.

Doug Lane: Yeah, I don’t know if you want to flag them as deleted and then move them out some other time or what, but, wow, that’s a lot of log stuff. You can do minimal logging if it’s a table that you really don’t care about being fully logged, but there are disadvantages to that too.

Erik Darling: What I would probably do, I mean, if you’re not on Enterprise you’re kind of out of luck either way, right? There are no online index operations there. You can help with the log backup stuff if you put it into bulk-logged and continue taking log backups, but at that point, if anything else happens that you need to be recoverable after it starts bulk logging something, you’re going to lose all that information too. So bulk-logged does have its downsides. It’s not a magic bullet. So depending on your situation, you might be in a little bit of a pickle. A better bet: if you’re deleting two billion records, depending on how many records are left over, you might just want to dump the stuff that you’re not deleting into another table and then do an sp_rename and switch things around.

Doug Lane: You can actually just drop the index and recreate it. Sometimes that goes a lot faster.
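A hedged sketch of the copy-and-rename approach described above; the table, column, and date are made up for illustration:

```sql
-- Sketch: instead of deleting 2 billion rows in place, copy the rows
-- you're keeping into a new table, then swap the names.
SELECT *
INTO dbo.BigTable_Keep
FROM dbo.BigTable
WHERE CreatedDate >= '2016-01-01';   -- the rows you want to keep

-- Recreate indexes and constraints on dbo.BigTable_Keep here, then:
BEGIN TRAN;
    EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';
    EXEC sp_rename 'dbo.BigTable_Keep', 'BigTable';
COMMIT;
-- Drop dbo.BigTable_Old whenever you're comfortable it's no longer needed.
```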

 

Are there any problems with SQL role triggers?

Jessica Connors: Question from J.H. He says, “Anything to be aware of or downsides of setting up SQL role triggers, mainly sysadmin role changes?”

Erik Darling: All these security questions.

Doug Lane: Yeah.

Erik Darling: We bill ourselves as not security people.

Doug Lane: Like the one before, I think we’re going to punt on that.

Jessica Connors: Thomas Cline says, “No security questions.”

Angie Rudduck: Too bad the slides aren’t up.

Jessica Connors: Yeah.

Erik Darling: “For security questions…”

Angie Rudduck: “Please call…”

Erik Darling: Yeah, there we go.

Angie Rudduck: I’ll do them because it usually works for me.

Erik Darling: Attendees… staff… Angie. I’ll just mute you, just kidding. There we go. You are presenting.

 

What are the HA and DR options with Azure VMs?

Jessica Connors: All right, who wants to answer some Azure questions?

Erik Darling: Nope.

[Laughter]

Jessica Connors: Does anybody here know the HA and DR options with SQL 2012 Standard in Azure VMs?

Doug Lane: Oh, no. Not me.

Erik Darling: Using a VM? If you’re just using the VMs, I assume it’s the same as are available with anything else. It’s only if you use the managed databases that you get something else but I think it’s mirroring either way. I know Amazon RDS uses mirroring.

Richie Rump: Yeah, I think they have like three copies and if one goes down it automatically fails over to the other two or something like that. Don’t quote me.

Jessica Connors: Okay, we’re all being quoted. We’re actually all being transcribed. We’re all being recorded. We’re all being watched.

Erik Darling: Really?

 

Is there a better solution for replication than linked servers?

Jessica Connors: Question from Cynthia. She says, “My developers have a product that uses linked servers for parameter table replication. I’ve read that linked servers aren’t the greatest. Is there another way to do this?”

Doug Lane: Okay, that’s actually kind of a two-part question because you’ve heard that linked servers aren’t the greatest. You’re right. So with SQL Server 2012 SP1 and later, you don’t have to blast a huge security hole in order to get statistics back from the remote side in linked servers. It used to be that you had to have outrageous permissions like ddl admin or sysadmin in order to reach across, get a good estimate, when it then builds the query plan on the local side. That’s not the case anymore. The problem that you can still run into though is that where clauses can be evaluated on the local side. Meaning, if you do a where on a remote table what can happen is SQL Server will bring the entire contents of that remote table over and then evaluate the where clause locally. So you’re talking about a huge amount of network traffic potentially. That’s what can go wrong with them. The other question, “Is there a better way?” That kind of depends on what flexibility the app gives you because you say that this is a product. So I don’t know if this is something that you have the ability to change or not but if you’re talking about replicating from one side to the other, there’s any number of ways to move data from A to B.

Jessica Connors: And why do linked servers suck so bad?

Doug Lane: I just explained that.

Jessica Connors: Oh, did you? I didn’t hear you say why they suck so bad, sorry.

Doug Lane: Because you can end up with really bad plans either because permissions don’t allow good statistics or you end up pulling everything across the network just to filter it down once you’ve got it on the other side.
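One common way to dodge the bring-everything-across problem is to push the filter to the remote side yourself with OPENQUERY; the linked server and object names here are hypothetical:

```sql
-- Filtered locally: SQL Server may pull the whole remote table over
-- the network before applying the WHERE clause.
SELECT o.OrderID, o.Total
FROM RemoteServer.Sales.dbo.Orders AS o
WHERE o.OrderDate >= '2016-01-01';

-- Filtered remotely: OPENQUERY ships the query text to the linked
-- server, so only matching rows travel over the network.
SELECT *
FROM OPENQUERY(RemoteServer,
    'SELECT OrderID, Total
     FROM Sales.dbo.Orders
     WHERE OrderDate >= ''2016-01-01''');
```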

 

Are there any good devops tools for SQL Server?

Jessica Connors: Question from Joshua. This might be one for Richie. “Do you have any recommendations for Microsoft SQL dev ops tools?”

Richie Rump: There’s not a ton. I guess Opserver, from Stack Overflow, would be one of them, but I don’t know of out-of-the-box ways to do that kind of stuff. I know when I was consulting with one firm, they had built their own dev ops tools. I think they had Splunk and then they just threw stuff out from SQL Server logs and then did a whole bunch of other querying to put dashboards up so they could do monitoring amongst the team and do all that other stuff. I think Opserver does a lot of that stuff for you but it’s a lot of configuration to get it up and running. I’d say test it out, try it out, and see if that works for you, but I’m not aware of anything you could buy that does those kinds of ops-y things. I don’t know, what do you guys think?

Erik Darling: I agree with you.

Doug Lane: I don’t live in the dev ops world.

Jessica Connors: I agree with you, Richie.

Angie Rudduck: Yeah, whatever the developer says.

Jessica Connors: What he said.

 

Should we disable the SA account and set the DB owner to something else?

Jessica Connors: Question from Curtis. He says, “I’m looking for clarification on SA usage. sp_Blitz [inaudible 00:12:03] to having DB owner set to SA, not a user account. But what about the best practice of disabling SA? Should DB owner be set to a surrogate SA account?”

Erik Darling: Nope. It’s not really catastrophic, but it’s something that you should be aware of, because usually what happens on a server is someone will come in and restore a database, usually from an older server to the new one. They’ll be logged in with their user account, so they’ll be the owner of that database. The owner of the database has elevated privileges in the database, equal to SA, which you may not want always and forever. That’s why SA should be the owner, even if it’s disabled, and the user account shouldn’t be. Even if the user is a sysadmin, you just don’t want them to also be the owner of a database.
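For reference, checking and changing a database owner is quick; the database name below is a placeholder:

```sql
-- See who owns each database.
SELECT name, SUSER_SNAME(owner_sid) AS owner_name
FROM sys.databases;

-- Set the owner back to sa (this works even if the sa login is disabled).
ALTER AUTHORIZATION ON DATABASE::YourDatabase TO sa;
```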

 

How do I migrate databases in simple recovery?

Jessica Connors: Question from Monica M. “We are migrating and upgrading from SQL 2008 R2 to 2014. We use simple recovery as our reporting/analysis rather than OLTP. Our IT department said after I copy/restore the databases to the new server it will take them two weeks to go live. By this time, our DBs will obviously be out of sync. What simple method would be best to perform this move?”

Angie Rudduck: Every time I moved, we did some server upgrades where we just created a new VM and ended up renaming it to the old server name eventually. What we did was take a full backup, like the day before hopefully, but if you have to wait two weeks, we took the full backup when we knew, and then we took a differential right when we were ready to make the cutover. So let’s say at 6:00 p.m. the maintenance window opens and I’m allowed to take the database offline. I put it in single-user mode, took a differential, and then applied that to the new server. Then took it out of single-user mode on the new server. Then we did all of our extra work. It’s not perfect for two weeks of data change, so if you could keep applying fulls until the night before, that would give you a little bit better changeover.
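A sketch of that full-plus-differential cutover; database names and paths are placeholders:

```sql
-- Days before cutover: restore the full backup on the new server,
-- leaving it in a restoring state so more backups can be applied.
RESTORE DATABASE Reporting
FROM DISK = N'\\backups\Reporting_full.bak'
WITH NORECOVERY, REPLACE;

-- At cutover: take a differential on the old server...
BACKUP DATABASE Reporting
TO DISK = N'\\backups\Reporting_diff.bak'
WITH DIFFERENTIAL;

-- ...apply it on the new server, then bring the database online.
RESTORE DATABASE Reporting
FROM DISK = N'\\backups\Reporting_diff.bak'
WITH RECOVERY;
```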

 

Jessica Connors: Trying to find some questions here. You guys are real chatty today.

Erik Darling: Everyone is all blah, blah, blah, problems, blah, blah, blah.

Jessica Connors: “Here is my error…” They copy and paste it. I’m never reading those.

Erik Darling: “Here’s the memory dump I had.”

Angie Rudduck: Jessica likes to be able to read the questions and she doesn’t read SQL, so nobody reads computer. Nobody really reads computer, including us.

Erik Darling: “Yeah, I found this weird XML…”

Jessica Connors: Richie reads computer.

Angie Rudduck: That’s true, Richie reads computer.

Richie Rump: I was reading XML before I got on.

Angie Rudduck: That’s disturbing.

Erik Darling: Naughty boy.

 

How do I shrink a 1.2TB database?

Jessica Connors: Here’s a question from Ben. He says, “I have a large 1.2 terabyte [inaudible 00:14:51] queuing database. Added a new drive and a new file device. DBCC SHRINKFILE does not seem to be working on the original file. Seems that the queuing application reuses space before it can be reclaimed. Any suggestions?”

Angie Rudduck: Don’t shrink.

Erik Darling: I don’t know what you’re trying to do. Are you trying to move the file to the new drive or what are you up to? I don’t think you’re being totally honest with us here.

Angie Rudduck: Yeah.

Jessica Connors: But you shouldn’t shrink, huh?

Doug Lane: Spread usage across drives, okay.

Angie Rudduck: Maybe put it on one drive, I don’t know? I guess that’s hard to do with such a large file size.

Jessica Connors: 1.2 terabytes.

Erik Darling: So you have your database and you bought a new drive. Did you put like files or file groups on the new drive? Did you do any of that stuff yet?

Angie Rudduck: He says he has to shrink because the original drive is maxed and he needs workspace. I think it’s just not creating—maybe he has to do what you’re saying, Erik, about creating an additional file group to be on the other drive.

Erik Darling: Right, so what you have to do is actually move stuff over to that other file. So if you haven’t done it already, you have to pick some indexes or nonclustered or clustered indexes and start doing rebuild on the other file group.

Angie Rudduck: Then you’ll be able to clear out space to shrink your file.

Erik Darling: Hopefully.

Angie Rudduck: Maybe, yeah. Let us know next Wednesday.
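A sketch of the rebuild-onto-another-filegroup approach described above; the database, filegroup, table, and index names are hypothetical:

```sql
-- Add a file on the new drive in its own filegroup.
ALTER DATABASE QueueDB ADD FILEGROUP FG2;
ALTER DATABASE QueueDB
ADD FILE (NAME = QueueDB_2, FILENAME = 'E:\Data\QueueDB_2.ndf')
TO FILEGROUP FG2;

-- Rebuild an index onto the new filegroup, which moves its pages there
-- and frees space in the original file for the shrink.
CREATE CLUSTERED INDEX CX_BigQueueTable
ON dbo.BigQueueTable (QueueDate)
WITH (DROP_EXISTING = ON)
ON FG2;
```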

 

Has anybody played with SQL Server 2016 yet?

Jessica Connors: Have we played with SQL 2016 yet?

Erik Darling: Oh, yeah.

Doug Lane: Yep.

Jessica Connors: No? Some of you?

Erik Darling: Yes.

Jessica Connors: Have you played around with the 2016 cardinality estimator and do you know if it works better than SQL 2014?

Erik Darling: It’s the same one as 2014.

Jessica Connors: Is it?

Doug Lane: So there’s the new and the old. Old is 2012 and previous and the new is 2014 plus. There’s all kinds of other new stuff in 2016 but the cardinality estimator actually hasn’t been upgraded a second time.

Erik Darling: Yeah, Microsoft is actually approaching things a little bit differently where post 2014 with a new cardinality estimator, they’ll add optimizer fixes and improvements for a version but you won’t automatically be forced into using those. You’ll have to use trace flag 4199 to apply some of those. So even if you pop right into 2016, you may not see things immediately. You may have to trace flag your way into greatness and glory.
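For a single query, those optimizer hotfixes can be enabled with a hint; 4199 can also be turned on server-wide as a startup trace flag:

```sql
-- Enable post-RTM optimizer hotfixes for just this query.
SELECT o.OrderID, o.Total
FROM dbo.Orders AS o
WHERE o.OrderDate >= '2016-01-01'
OPTION (QUERYTRACEON 4199);

-- Or globally, until the next restart:
DBCC TRACEON (4199, -1);
```

The table and column names above are made up; QUERYTRACEON requires sysadmin (or a plan guide) to use.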

 

Are high IO waits on TempDB a problem?

Jessica Connors: Here’s a good question from Mandy. She says, “I’ve been on a SQL 2014 standard cluster with tempdb stored on SSDs for several months. The last few days we’ve been seeing a lot of alerts in Spotlight saying that we have high IO waits on those tempdb files. The IO waits are as high as 500 to 800 milliseconds. Is this a high value? I’m new to using SSDs with SQL Server and I admit that I just don’t know what high is in this case. Any thoughts?”

Doug Lane: It’s high but how frequent is it? Because if you’re getting an alert like once a day that you’re hitting that threshold, it may not be something you need to worry about too much depending on what it is that’s hitting it. So what you want to do is look at your wait stats, and look at those as a ratio of exactly how much wait has been accumulated versus hours of uptime. If you’re seeing a lot of accumulated wait versus hours of uptime, not only will you know there’s a problem but you’ll also be able to see what that particular wait type is and get more information about what’s causing it. Then you can put that together with what might be happening in tempdb and possibly come up with an explanation for what’s going on.

Erik Darling: Yeah. I’d also be curious if something changed that started using tempdb a whole lot more or if maybe you might be seeing some hardware degradation just after some time of use.
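A sketch of the waits-versus-uptime comparison described above, using the wait stats DMV:

```sql
-- Hours since SQL Server last started.
SELECT DATEDIFF(HOUR, sqlserver_start_time, SYSDATETIME()) AS hours_uptime
FROM sys.dm_os_sys_info;

-- Top accumulated waits; compare wait_hours against hours_uptime above.
SELECT TOP (10)
    wait_type,
    wait_time_ms / 1000.0 / 3600.0 AS wait_hours,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```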

 

What should I do when my audit stops working?

Jessica Connors: Question from James. He says, “I’ve installed a SQL Server audit and noticed it stopped working. Is there any way to be alerted when a SQL Server audit stops or fails?”

Angie Rudduck: Is that the Redgate tool? Because I feel like Redgate had some auditing tool or encrypting tool that went out of support. When I was at my last place and we had to change over so I’m not sure what that is.

Doug Lane: If it throws a certain severity error then you can have SQL Server notify you of those kinds of things. But as far as like audit as a product, I’m not sure.

 

Will backup compression compress compressed indexes?

Jessica Connors: Then we’ll move on to J.H. He says, “When compressing all tables in a database with the page option, does compressing its backup gain even more compression?”

Erik Darling: Yes.

Angie Rudduck: Compression squared.

Erik Darling: Compression times compression. Are you really compressing all your tables to get smaller backups?

Jessica Connors: Is that really bad?

Erik Darling: No. It’s just kind of a funny way to approach it.

Doug Lane: I don’t know if that’s the purpose but…

Angie Rudduck: I think he has no drive space, tiny, tiny, tiny SAN.

Erik Darling: Buy a new thumb drive.

Doug Lane: Talk to Ben because he apparently has the budget to have new large drives.

 

Are there performance issues with SSMS 2016?

Jessica Connors: We have somebody in here that’s playing with SQL 2016. He says, this is from Michael, “SQL Server Management Studio 2016 sometimes goes into not responding status when using the Object Explorer window, such as expanding the list of database tables. These freezes last around 20 seconds. Are there any known performance issues with SSMS 2016?”

Doug Lane: I found one. I was trying to do a demo on parameter sniffing where I return ten million rows of a single int-type column, and maybe about half the time SSMS would stop working; it would crash and force a restart. So I think SSMS 2016, at least the RTM release, is a little bit flakey.

Jessica Connors: For now.

Erik Darling: Yeah, it might depend on just how many tables you’re trying to expand too. I’ve been using it for a bit and I haven’t run into that particular problem with just expanding Object Explorer stuff. So how many tables are you trying to bring back would be my question.

Angie Rudduck: I was just about to say, we had that question last week or the week before about SSMS crashing when they tried to…

Erik Darling: Oh, that’s right.

Angie Rudduck: Remember? They were trying to expand their two million objects.

Erik Darling: Yeah, that’s not going to work out well.

Angie Rudduck: So maybe this is the same person, different question.

Doug Lane: I was going to say I think it might just be a little…

Angie Rudduck: Yeah. It’s brand new, what do you expect? It’s a week old. It’s going to be flakey.

Richie Rump: Something to work when you release it?

Angie Rudduck: No, come on.

Richie Rump: I’m just saying, it’s a crazy idea, I know. I have all these crazy ideas but…

Angie Rudduck: Unrealistic expectations, Richie.

Erik Darling: That would require testing.

Jessica Connors: Richie has never released anything with bugs.

Angie Rudduck: Who needs to test things? I did have a client recently ask me what test meant when I was talking about test environment.

Jessica Connors: I know, what?

Richie Rump: What’s this test you speak of?

Erik Darling: Just for the record, Richie wipes cooties on everything he releases.

Angie Rudduck: Kiddie cooties.

Doug Lane: All right, looks like we’ve got two minutes left. Lightning round, huh?

[Group speaking at the same time]

 

What’s the best SQL Server hardware you’ve ever worked on? And the worst?

Jessica Connors: Question from Dennis. He wants to know, “Tell me the best SQL hardware environment that you have ever worked on.”

Doug Lane: I would say when we went down to Round Rock last year. I got to play with I think it was a 56-core server, that was pretty fun.

Erik Darling: Yeah, I think my best was 64 cores and 1.5 terabytes of RAM.

Richie Rump: Yeah, I had 32 cores and 2 terabytes of RAM.

Erik Darling: Nice.

Jessica Connors: What about the worst you’ve seen with clients?

Erik Darling: Ugh. Probably an availability group with 16 gigs of RAM across them. That was pretty bad. And it had like one dual core processor. It was pretty, yeah. It was Richie’s laptop.

Angie Rudduck: Worse than Richie’s laptop.

Doug Lane: That sounds about like the worst I’ve seen: dual core, 10 or 12 gigs of RAM.

Angie Rudduck: 500 gigs of data.

Erik Darling: I’ve had faster diaries than that.

Jessica Connors: All right, well, we’re out of time.

All: Bye.


Getting Started With Oracle Week: Creating Indexes and Statistics

Oracle
9 Comments

This is not a deep dive

If you’re looking for lots of internals and explanations of what happens behind the scenes, don’t read past here. I almost made a READPAST joke. It’s that kind of day. This is just a basic overview of creating some indexes and gathering statistics. Why? Because someone just paid about $47.5k for every 0.75 cores of Oracle Enterprise licensing and they probably expect some performance out of it. This isn’t MySQL. We don’t have all day to get query results.

If you remember last time, we created a couple tables of random data, HR.T1 and HR.T2. They are currently sitting in TABLESPACE, where no one can hear you scream.

The first thing you want to do is forget about clustered indexes. Oracle has Cluster Indexes, which allow frequently joined rows from separate tables to sit on the same block of data, to reduce I/O when joining tables. Oracle has Index Organized Tables, which can be defined by a Primary Key.

But that’s more than I want to bite off!

Index Gang

Creating a Primary Key ain’t too far off from SQL Server. But there are some weird points, too. For instance, when you create a Primary Key, you can let it create an associated index, specify an index to associate with, or let Oracle pick the first index it finds to associate with the Primary Key. Full disclosure: it may not be the index you’d pick; it may be the first index that has the PK column as a leading column.

Here are some examples!
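The original screenshots are gone, so here are reconstructed sketches against the HR.T1 table from last time (the `id` column name is assumed):

```sql
-- Let Oracle create the supporting index automatically.
ALTER TABLE hr.t1 ADD CONSTRAINT t1_pk PRIMARY KEY (id);

-- Or create the index yourself and point the constraint at it.
CREATE UNIQUE INDEX t1_id_ix ON hr.t1 (id);
ALTER TABLE hr.t1
  ADD CONSTRAINT t1_pk PRIMARY KEY (id) USING INDEX t1_id_ix;
```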

If, at some point, you realize you chose the wrong Primary Key, you can drop it without dropping the index.
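The KEEP INDEX clause is what preserves the index; constraint and table names here match the sketch naming above and are assumptions:

```sql
-- Drop the Primary Key constraint but hang on to the underlying index.
ALTER TABLE hr.t1 DROP CONSTRAINT t1_pk KEEP INDEX;
```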

Can’t cluster this

You can also create some pretty familiar looking index structures. There’s even an online option, if you paid through the nose. You can, of course, define your index as UNIQUE for free (for now, anyway)!

But man oh man, the best part of this to me brings in a little something from when we created the tables and test data! Creating indexes with no logging! Creating indexes online with no logging is like perf tuning God mode. Oracle for the IDDQD!
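Sketches of those options, with assumed column names (and remember ONLINE is the paid-through-the-nose part):

```sql
-- A plain index and a unique index, just like home.
CREATE INDEX t1_c1_ix ON hr.t1 (c1);
CREATE UNIQUE INDEX t1_c2_ix ON hr.t1 (c2);

-- God mode: build it online, without logging the build.
CREATE INDEX t1_c3_ix ON hr.t1 (c3) ONLINE NOLOGGING;
```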

Other options

Oracle doesn’t exactly have filtered indexes. They have function-based indexes, but to my SQL Server-soaked brain, they seem more like a computed column with an index on it than a filtered index.

You can also create bitmap indexes, which are good for low density columns. That’s fancy talk for ‘not very unique’. Our bit column would fall into that category. Other entrants, like gender, marital status, or Favorite Rebecca Black Song, would also probably qualify.
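Sketches of both, assuming a text column and the bit-style column on T1:

```sql
-- Function-based: indexes the result of an expression.
CREATE INDEX t1_upper_c4_ix ON hr.t1 (UPPER(c4));

-- Bitmap: good for low-cardinality columns like flags.
CREATE BITMAP INDEX t1_bit_ix ON hr.t1 (bit_col);
```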

Ain’t no STATMAN here

To create, update, or otherwise manage statistics, you use the DBMS_STATS package. It has subprograms for so many things, it’s hard to list them all. Oracle treats statistics as much more important than SQL Server does, and with good reason: THEY ARE!

I also like the advice that the Oracle crowd has had on index fragmentation, since around 2002:

My opinion — 99.9% of all reorgs, rebuilds, etc are a total and utter waste of time and energy. We spend way way way too much time losing sleep over this non-event. If you are going to spend time on this exercise — make sure you come up with a way to MEASURE what you’ve just done in some quantitative fashion you can report to your mgmt (eg: these rebuilds I spend X hours a week doing save us from doing X IO’s every day, or let us do Y more transactions than otherwise possible, or …..) No one, but no one, seems to do that (keep metrics). They just feel “it must be better”. Who knows — you may actually be DECREASING performance!! (you’ll never know until you measure)

If we wanted to gather statistics on all columns in our T1 and T2 tables, we could run commands like this:
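The original commands were a screenshot; a reconstructed sketch using the GATHER_TABLE_STATS subprogram looks like this:

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'HR',
    tabname    => 'T1',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO');
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'HR',
    tabname    => 'T2',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO');
END;
/
```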

You can check on Oracle statistics in the GUI, and see that they provide pretty commensurate information to SQL Server’s statistics.

That’s a thick milkshake.

I’ll revisit this down the line

But there’s a lot I want to explore here, first. Hopefully you learned a few things along the way. I know I did writing this!

Thanks for reading!


Getting Started With Oracle Week: NULLs and NULL handling

Oracle
33 Comments

We’re not so different, you and I

In any database platform, you’ll have to deal with NULLs. They’re basically inescapable, even if you own an island. So let’s compare some of the ways they’re handled between Oracle and SQL Server.

Twofer

If you take a look at the two queries below, there are a couple things going on. First is the NVL function. It’s basically the equivalent of SQL Server’s ISNULL function, where it will return the second argument if the first is, well, NULL.

The second thing you may notice is the ORDER BY. In here we can do something really cool and specify whether to put NULLs at the beginning, or end, of our results. SQL Server will just put them first in ascending order, for better or worse. If you want to put them last, you need to do some dancing with the devil. Or just use a CASE expression in your ORDER BY.
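The query screenshots didn’t survive, so here are reconstructed sketches, with column names assumed from the earlier T1 setup:

```sql
-- NVL swaps in a default when the value is NULL, like ISNULL.
SELECT NVL(c1, 0) AS c1_no_nulls
FROM hr.t1;

-- NULLS LAST (or NULLS FIRST) controls where NULLs sort.
SELECT c1
FROM hr.t1
ORDER BY c1 NULLS LAST;
```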

I love stuff like this, because it gives you easy syntactic access to presentation goodies.

There’s another function, NVL2, which I haven’t quite figured out a lot of uses for, but whatever. It takes three arguments. If the first argument is NULL, it returns the third argument. If the first argument isn’t NULL, it returns the second argument.
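A sketch, again with an assumed column name:

```sql
-- NVL2(x, a, b): returns a when x is NOT NULL, b when x is NULL.
SELECT NVL2(c1, 'has a value', 'is null') AS c1_status
FROM hr.t1;
```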

The results end up something like this below.

I just learned how to do this, too.

There’s also NULLIF! Which does what you’d expect it to do: return a NULL if the two arguments match. Otherwise, it returns the first argument. Dodge those divide by zero errors like a pro.
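The divide-by-zero dodge looks like this (column names assumed):

```sql
-- If c2 is 0, NULLIF returns NULL and the division yields NULL
-- instead of ORA-01476: divisor is equal to zero.
SELECT c1 / NULLIF(c2, 0) AS safe_ratio
FROM hr.t1;
```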

At long last, not a Rump

Last, but certainly not least, is the lovely and talented COALESCE. It’s a dead ringer for SQL Server’s implementation, as well.

Is it me you’re looking for?

Intentionally left blank

NULLs happen to the best of us. Three-valued logic can be sneaky. I prefer to use canary values when possible. Those are values that could never naturally occur in data (think -999999999 or something). Again, this isn’t meant to be an exhaustive piece on NULLs and NULL handling, just a toe in the water for any SQL Server people who need to start working with Oracle.

Thanks for reading!


Getting Started With Oracle Week: Aggregating

Oracle
11 Comments

I probably should have written this one first

Most of these are exactly the same as in SQL Server. There are a whole bunch of interesting analytic functions, most of which should look pretty familiar to anyone who has spent time querying SQL Server. Most, if not all, can be extended to be window functions, if you need per-group analysis of any kind.

Counting!

All of this works the same as in SQL Server.

Oracle does have something kind of cool if you only need an approximate count of distinct values. Don’t ask me why there isn’t a similar function to get an approximate count of all values. I wasn’t invited to that meeting. This is good for really large data sets where you just need a rough idea of the values you’re working with.
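That function is APPROX_COUNT_DISTINCT, introduced in Oracle 12c; a sketch with an assumed column name:

```sql
-- Fast, approximate distinct count; trades exactness for speed
-- on very large tables.
SELECT APPROX_COUNT_DISTINCT(c1) AS approx_distinct_c1
FROM hr.t1;
```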

Sums and Averages

Fun fact: under the covers, AVG is just a SUM and a COUNT anyway.

The Max for the Minimum

You also have your MIN and MAX functions, along with the HAVING clause, to filter aggregates.

Something not boring

The LISTAGG function is something I’d absolutely love to have in SQL Server. It takes column values and gives you a delimited list per group, using the character of your choice. It’s pretty sweet, and the syntax is a lot easier to bang out than all the XML mumbo jumbo in SQL Server.
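A sketch, assuming a grouping column and a string column on T1:

```sql
-- One delimited list of c2 values per c1 group.
SELECT c1,
       LISTAGG(c2, ', ') WITHIN GROUP (ORDER BY c2) AS c2_list
FROM hr.t1
GROUP BY c1;
```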

And the puppies is staying, yo!

For reference, to do something similar in SQL Server, you need to do this:
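The screenshot of the T-SQL version is gone; a reconstruction of the classic STUFF plus FOR XML PATH trick, with assumed names and assuming c2 is a string column:

```sql
SELECT g.c1,
       STUFF((SELECT ', ' + t.c2
              FROM dbo.t1 AS t
              WHERE t.c1 = g.c1
              ORDER BY t.c2
              FOR XML PATH('')), 1, 2, '') AS c2_list
FROM (SELECT DISTINCT c1 FROM dbo.t1) AS g;
```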

Good luck remembering that!
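To give you an idea of the difference (the ORDERS table here is hypothetical), the Oracle version and the SQL Server incantation look something like this:

```sql
-- Oracle: one comma-delimited list of reps per order.
SELECT ORDER_ID,
       LISTAGG(SALES_REP_ID, ', ')
         WITHIN GROUP (ORDER BY SALES_REP_ID) AS REP_LIST
FROM   ORDERS
GROUP BY ORDER_ID;

-- SQL Server: the FOR XML PATH plus STUFF mumbo jumbo.
SELECT o.ORDER_ID,
       STUFF((SELECT ', ' + CAST(o2.SALES_REP_ID AS varchar(11))
              FROM   ORDERS o2
              WHERE  o2.ORDER_ID = o.ORDER_ID
              ORDER BY o2.SALES_REP_ID
              FOR XML PATH('')), 1, 2, '') AS REP_LIST
FROM   ORDERS o
GROUP BY o.ORDER_ID;
```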

SQL is a portable skill

Once you have the basics nailed down, and good fundamentals, working with other platforms becomes less painful. In some cases, going back is the hardest part! My knowledge of Oracle is still very entry level, but it gets easier and easier to navigate things as I go along. I figure if I keep this up, someday I’ll be blogging from my very own space station.

Thanks for reading!


Getting Started With Oracle Week: Joins

Oracle
4 Comments

Oh, THAT relational data

Thankfully, most major platforms (mostly) follow the ANSI Standard when it comes to joins. However, not all things are created equal. Oracle didn’t have CROSS and OUTER APPLY until 12c, and I’d reckon they’re only implemented to make porting over from MS easier. It also introduced the LATERAL join at the same time, which does roughly the same thing.

Here are some familiar joins to keep you calm.

That was boring, huh? It’ll all work just as you expect it to. But we’re not done! Oracle is not without a couple neat things that it wouldn’t hurt SQL Server to implement.

Using, Naturally

I think these constructs are pretty neat. The first one is a Natural Join. This is kind of like Join Roulette, in that Oracle will choose a join condition based on two tables having a column with the same name.

The other slightly more exotic join syntax I like uses USING to shorten the join condition.

You can extend the USING syntax to join multiple columns, too, which I like because it cuts down on typing.
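Here’s a sketch of both, against hypothetical ORDERS and SALES_REPS tables that share a SALES_REP_ID column:

```sql
-- Join Roulette: Oracle matches every column with the same name
-- in both tables and joins on all of them.
SELECT *
FROM   ORDERS NATURAL JOIN SALES_REPS;

-- USING: name the shared column once, instead of typing out
-- ORDERS.SALES_REP_ID = SALES_REPS.SALES_REP_ID.
SELECT ORDER_ID, SALES_REP_ID, REP_NAME
FROM   ORDERS
JOIN   SALES_REPS USING (SALES_REP_ID);
```

Note that with USING, the join column can’t be qualified with a table alias in the select list; Oracle treats it as a single shared column.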

But what else?

Oracle also has some pretty fancy syntax for dealing with hierarchies. Even with all the options, it’s about 6 universes ahead of the recursive CTEs you have to bust out in SQL Server (if you’re not using a hierarchyid, which you’re probably not).

Here’s the Norse God table I used to show that SQL Server’s recursive CTEs are still serial in 2016:

Ignoring the somewhat awkward looking inserts I was experimenting with, that’ll get you the table structure. In Oracle, CONNECT BY is used to generate hierarchies. There are many built-in components that give you information about the structure of your hierarchy as well.

Let’s talk about some of that!

  1. The first thing you may notice is that we use LPAD to give an indented structure to the names to make the chain of command more obvious.
  2. Both LEVEL and SYS_CONNECT_BY_PATH are built-in components you can use to see which step in the hierarchy you’re on, and how you’ve stepped through it so far.
  3. I’m also using START WITH to dictate which part of the hierarchy I want to begin recursion at. Since this is a small table, I want everything. I can either specify GODID = 1, or MANAGERID IS NULL. In this case, both indicate Odin.
  4. Here comes CONNECT BY PRIOR, which joins GODID to MANAGERID, which is the point of the whole thing.
  5. Lastly, ordering SIBLINGS by either GODID or MANAGERID NULLS FIRST gives us our desired display order.
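Putting the five steps above together, the query looks roughly like this (the table name NORSE_GODS and columns GODID, GODNAME, and MANAGERID are assumed from the description):

```sql
SELECT LPAD(' ', 2 * (LEVEL - 1)) || GODNAME  AS GOD_NAME,  -- step 1: indent by depth
       LEVEL                                  AS DEPTH,     -- step 2: which level we're on
       SYS_CONNECT_BY_PATH(GODNAME, ' -> ')   AS PATH       -- step 2: how we got here
FROM   NORSE_GODS
START WITH MANAGERID IS NULL                                -- step 3: begin at Odin
CONNECT BY PRIOR GODID = MANAGERID                          -- step 4: parent-to-child join
ORDER SIBLINGS BY GODID;                                    -- step 5: display order
```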

Here are the results!

Any similarity to Marvel characters that might get us sued is absolutely ridiculous.

JOIN ME NEXT TIME

Just kidding, I wouldn’t do that to you.

Having standards is important. Especially if you drink a lot. The ANSI Standard gives us a good starting point for writing code that’s portable across multiple systems. Though it will (likely) never be flawless, joins are one area you can worry a bit less about.

Thanks for reading!


Getting Started With Oracle Week: Generating Test Data

Oracle
10 Comments

Bake your own cake

Pre-cooked example databases are cool for a lot of things. The first being that everyone can have access to them, so they can follow along with your demos without building tables or inserting a bunch of data. If you mess something up, it’s easy to restore a copy. The main problem is that they were usually designed by someone who didn’t have your issues.

Most of the time it only takes a table or two to prove your point; you just need to cook up some data that doesn’t have anyone’s real information in it. With Oracle, you have a couple different options.

SQL Server-ish

Baked into every Oracle table I’ve queried thus far are two intrinsic pseudocolumns: ROWID and ROWNUM. The ROWNUM column, at least, gives us the ability to skip generating sequential numbers with the ROW_NUMBER function. The ROWID column is a confusing string of nonsense.

AAAY LMAO
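For example (hypothetical ORDERS table again), both pseudocolumns come along for free with any query:

```sql
-- ROWNUM numbers rows as they're returned; ROWID is the row's
-- physical address, hence the string of nonsense.
SELECT ROWNUM, ROWID, ORDER_ID
FROM   ORDERS
WHERE  ROWNUM <= 10;
```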

To die for

Have you ever wished you could create a table… Let’s say a demo table! And since it’s of no consequence, SQL just wouldn’t log any changes to it? Or even that it ever existed? I mean, we have minimal logging, but what if we wanted no logging at all? Minimal logging doesn’t always work, and requires a few prerequisites, and, well, you get the idea. If it’s high tide on St. Patrick’s Day and Jesus is eating Cobb Salad with a T-Rex in a rowboat, your inserts will be minimally logged.

Oracle can do that, with the magic of NOLOGGING!

I tried to make the rest of the code as close as possible to the usual demo table SELECT INTO stuff I normally do.

  • ID is an incrementing integer
  • ORDER_ID is a random number between 1 and 10,000
  • SALES_REP_ID is a random number between 1 and 100
  • CUSTOMER_ID is a random number between 1 and 1 million
  • The three date columns use the ROWNUM to subtract a span of days, and then a static number of days are added to put a little distance between each activity
  • The two name columns are based on substrings of GUIDs
  • CUST_PHONE is a random 9 digit number
  • IS_SOMETHING is a random 1 or 0 bit column
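Here’s a rough sketch of how a table like that could be built, assuming the column descriptions above (names and exact ranges are mine, not gospel; NOLOGGING on the CREATE TABLE is the part that skips redo logging):

```sql
-- Demo table built with CREATE TABLE ... AS SELECT, generating
-- 10,000 rows from DUAL with the CONNECT BY LEVEL trick.
CREATE TABLE ORDERS NOLOGGING AS
SELECT ROWNUM                                          AS ID,           -- incrementing integer
       TRUNC(DBMS_RANDOM.VALUE(1, 10000))              AS ORDER_ID,     -- 1 to 10,000
       TRUNC(DBMS_RANDOM.VALUE(1, 100))                AS SALES_REP_ID, -- 1 to 100
       TRUNC(DBMS_RANDOM.VALUE(1, 1000000))            AS CUSTOMER_ID,  -- 1 to 1 million
       SYSDATE - ROWNUM                                AS ORDER_DATE,   -- ROWNUM days back
       SYSDATE - ROWNUM + 3                            AS SHIP_DATE,    -- plus a static offset
       SYSDATE - ROWNUM + 7                            AS DELIVERY_DATE,
       SUBSTR(SYS_GUID(), 1, 8)                        AS FIRST_NAME,   -- GUID substrings
       SUBSTR(SYS_GUID(), 9, 8)                        AS LAST_NAME,
       TRUNC(DBMS_RANDOM.VALUE(100000000, 1000000000)) AS CUST_PHONE,   -- 9 digits
       ROUND(DBMS_RANDOM.VALUE(0, 1))                  AS IS_SOMETHING  -- random 1 or 0
FROM   DUAL
CONNECT BY LEVEL <= 10000;
```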

Easy enough! And quick. 10,000 rows get inserted on my woeful VM in 0.297 seconds. That’s about as long as it just took you to blink.

Of course, there are some built in Oracle goodies to generate data a little differently, but they’re (in my mind, anyway) a bit more complicated. They rely on the DBMS_RANDOM functions. There’s a lot you can do with them! The documentation is right over this way. In particular, the STRING subprogram can give you all sorts of nice junk data.

Here’s a quick example using DBMS_RANDOM.

Quick example, he said! Alright then! The date stuff in here took me quite a while to get right. If you follow along, you have to: cast the truncated value from a range between the current system date cast as a string in Julian date format as a Julian date and… I think there’s more? I forgot this as soon as I went to bed.

But the number and string stuff is really easy! Feeding in a range of numbers is super simple. The string stuff is just one upper case character with a random-length string appended to it. These look more like names than GUID substrings, but are probably only useful to anyone trying to come up with names for an entire colony of aliens.
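Here’s roughly what that looks like; DBMS_RANDOM.STRING takes a character class (for example 'U' for upper case, 'l' for lower case) and a length:

```sql
-- One upper case character, plus a lower case string of
-- random length, for a nice alien-sounding name.
SELECT DBMS_RANDOM.STRING('U', 1) ||
       DBMS_RANDOM.STRING('l', TRUNC(DBMS_RANDOM.VALUE(3, 11))) AS ALIEN_NAME
FROM   DUAL
CONNECT BY LEVEL <= 5;
```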

I apologize if your name is in here.

This insert took 1.466 seconds, plus who knows how long getting date ranges figured out. Julian! JULIAN! Why I never.

So now we have some tables

We should probably add some indexes, and figure out how to join them, huh?

Those sound like good future blog post topics.

Thanks for reading!


SSMS 2016: It Just Runs More Awesomely

SQL Server
71 Comments

Step 1: configure SSMS to only show file names on the tabs. Click Tools, Options, Text Editor, Editor Tab and Status Bar, and set all of the tab texts to false except file name. After all, it’s not like all this stuff fits on the tab.

Step 1: SSMS tab name configuration

Step 2: while you’re in Tools, Options, click on Tabs and Windows. Check the box for “Show pinned tabs in a separate row.”

Step 2: Put pinned tabs in a separate row

Step 3: start new windows for your favorite utility queries, and save them with the right names. For example, I have a window open just for sp_WhoIsActive, so I save that as sp_WhoIsActive.sql. It doesn’t actually have the sp_WhoIsActive CODE in that window, just a simple EXEC sp_WhoIsActive.

Then pin those tabs – click the little pin on those windows, and voila, they show up in their own row:

Presto – named, pinned tabs in their own row

Go get SSMS 2016. It just runs more awesomely.


Is your SAN’s cache killing tempdb?

SQL Server, Storage, TempDB
3 Comments

Let’s start with definitions

Many SANs have caching built in. What kind of cache is important, because if you’re dealing with non-SSD storage underneath, you could be waiting for a really long time for it to respond.

Let’s start with some definitions of the most popular caching mechanisms available for SANs. I’m not going to say ‘only’, because some vendor out there might have some proprietary stuff going on that I haven’t heard of.

Write-through: Much like synchronous Mirroring or AGs, writes have to be confirmed twice. They’ll write to the cache, but they’ll also write to the underlying disks, and then throw a secret handshake saying that it’s committed and all is well. This SUCKS if your underlying pool of disks is slow, saturated, or otherwise abused.

Write-around: Skips the cache entirely, writing straight to disk. This can be fast, but then any data you write directly to disk will have to be read into cache when something needs it. If your application relies heavily on recent data, this can be a really lousy choice.

Write-back: Like good ol’ asynchronous commits, this writes to the cache, says everything is cool, and eventually writes it to hard storage. That means your most recent data is in cache and available, but maybe not on the most stable ground just yet. If you have slow disks underneath, and the power goes out before it writes to them, you could potentially lose some data here, unless your cache has some resiliency built in. So be careful what you wish for, here.

Why tempdb?

Because you people beat so much tar and sand out of it that you’re either going to strike oil or find a new dinosaur. If writes here are slow, if SQL is waiting more than 1 second for data to just write out to here, then all of your subsequent reads are at the mercy of those writes.

  • What’s the sense in tuning a query that will always have overhead writing to tempdb?
  • Users complain that inserts are slow because you have a trigger (you know those use tempdb, right?) that stalls out for three seconds every run
  • Your maintenance (DBCC CHECKDB, indexes sorted in tempdb) can’t finish because your tempdb write stalls are the envy of only a Gutenberg Press.

The moral of the story

If you’re using local storage, there’s no excuse for not going SSD.

If you went out and got yourself an expensive SAN, and now you can’t afford to put good drives in it, you’re SAN poor and you made a bad choice.

If you run tools like CrystalDiskMark or DiskSpd, if you’re using SQL’s DMVs to check on disk performance, or if your monitoring tool is showing bad write latency, check to see what kind of caching you’re using. Start asking questions about the underlying drives, the SAN connections, and ask for numbers from your SAN admin. Downloading more RAM won’t fix slow writes!

Thanks for reading!

Brent says: Intel’s speedy 400GB PCI Express SSDs are down in the $700 range. Just do it.


#DellDBADays 2016: What Would You Do with Unlimited Hardware?

Humor
94 Comments

Last August, we got the team together in person for Dell DBA Days. We ran all kinds of interesting experiments with SQL Server, and shared the results with you via live webcasts.

https://www.youtube.com/watch?v=Gn43sOLrcVs

You can watch our recorded episodes from last year – I’d highly recommend the last one, Watch SQL Server Break and Explode. Erik showed how to make a SQL Server crash instantly and reboot. Kendra demonstrated what happens when you run thousands of databases in an Availability Group. Doug and I yanked hard drives out of a server one by one to show how RAID controllers react.

This August, we’re heading out to Round Rock again – and you can be a part of it. What experiments would you like to see us run on SQL Server 2016? We’ve got all the hardware a DBA could want, and the only limit is your imagination.

If we pick your idea (and we may pick more than one!), we’ll give you a free Everything Bundle, plus credit you on air during the webcasts. Leave your idea in the comments – let’s see what you’d do if you were let loose in the Dell data center.

Update – let’s focus on experiments where you can actually learn something helpful. Think about what we could test in a lab that might change the way you administer databases, like whether TempDB still really needs 8 files in the year 2016, or what the impact of Transparent Data Encryption might be on a particular workload. We don’t need your help coming up ways to set SQL Server on fire. 😉


SQL Server 2016 and the Internet: Forced Updates, Phoning Home

SQL Server
91 Comments

SQL Server 2016’s End User License Agreement (EULA) contains a couple of surprises for those who let their SQL Servers connect to the internet. No, I don’t mean where the Internet connects to you – I mean where the SQL Server can reach the internet, like open a web page.

Issue #1: You may get updates whether you want them or not.

You probably shouldn’t run 2016 side-by-side with older versions because:

IMPORTANT NOTICE:  AUTOMATIC UPDATES TO PREVIOUS VERSIONS OF SQL SERVER.  If this software is installed on servers or devices running any supported editions of SQL Server prior to SQL Server 2016 RC (or components of any of them) this software will automatically update and replace certain files or features within those editions with files from this software.  This feature cannot be switched off.  Removal of these files may cause errors in the software and the original files may not be recoverable.  By installing this software on a server or device that is running such editions you consent to these updates in all such editions and copies of SQL Server (including components of any of them) running on that server or device.

You consented.

Issue #2: Your SQL Server phones home to Redmond by default.

We collect data about how you interact with this software. This includes data about the performance of the services, any problems you experience with them, and the features you use…. It includes information about the operating systems and other software installed on your device, including product keys. By using this software, you consent to Microsoft’s collection of usage and performance data related to your use of the software.

Before 2016, you had to manually opt-in by checking a checkbox during installation.

With SQL Server 2016, there’s no checkbox – you’re opted in by default.

I’m actually a huge fan of app telemetry – sending crash reports and usage data back to the application developers in order to help make the app better. I want developers to know how I use their apps, because I want them to improve the parts of the app that I use the most. Heck, I’d be fine if SSMS turned on the microphone while I worked, and then did sentiment analysis. (They would see a very high number of four-letter words tied to the term “IntelliSense.”)

Here’s the relevant part of setup:

If you install it, it’s phoning home

The Privacy Statement links to https://www.microsoft.com/EN-US/privacystatement/SQLServer/Default.aspx, which at first glance looks like it has some juicy hyperlinks, but they’re not links. You have to click on the Learn More link at the bottom right:

Would you like to learn more?

How to Turn Off the Phone-Home Option for Standard and Enterprise Edition

That above link explains:

Enterprise customers may construct Group Policy to opt in or out of telemetry collection by setting a registry-based policy. The relevant registry key and settings are as follows:

Key = HKEY_CURRENT_USER\Software\Microsoft\Microsoft SQL Server\130

RegEntry name = CustomerFeedback

Entry type DWORD: 0 is opt out, 1 is opt in

Be aware that editing the registry, much like the Wu-Tang Clan, is nothing to, uh, mess around with.

Issue #3: You can’t Turn Off the Phone-Home Option for Developer, Express, and Evaluation Editions

Jason Ash points out that KB 3153756 says:

You can disable the sending of information to Microsoft only in paid versions of SQL Server. You cannot disable this functionality in Developer, Enterprise Evaluation, and Express editions of SQL Server 2016.

I’m curious to see how customers react to these new changes. I bet in these days of phone app telemetry, folks are okay with it. I certainly am – as long as we don’t find out that things like memory dumps with end user queries (especially insert statements) are making their way to places unknown.

Update 2016/06/01 – Microsoft’s Jeff Papiez points out KB 3153756: How to configure SQL Server 2016 to send feedback to Microsoft. That KB explains the registry changes required to turn off telemetry, and also lists a couple of sample DMV queries whose results could get sent back to Microsoft.

If you believe you should be able to disable phone-home telemetry feedback for Developer, Express, and Evaluation Editions, vote for this Connect item.


First Day Deal Breakers

Starting a new job can be scary

If you’re not already established in your field, don’t know the company all that well, or are taking on a role with a higher level of responsibility, it’s totally okay if you start drinking in the parking lot. Just kidding! Start drinking at home; that gives you more time to drink and bet on horses. Lifehack!

Assuming you make it into the office and don’t spend your day betting on the pony with the best name, and HR doesn’t immediately hand you substance abuse pamphlets, you begin your glorious career as Employee #2147483647. Hooray.

But will it last? Or are there things that may have you hastily editing your resume and angrily calling your recruiter by lunch?

Office Oddity

I’ve had some strange things happen to me when I started jobs (sober, I promise) that let me know exactly how long I’d be sticking around.

  • Boss wasn’t sure a second monitor was in the budget
  • SA password on the whiteboard by the developer cubicles
  • IT contractor passed out in the hallway outside the door

First Day, Last Day

Feel free to share in the comments if you’ve had any first day deal breakers. If you’ve ever:

  • Quit by lunch
  • Looked at glassdoor.com after it was too late
  • Found out you replaced a dead person

This is the right blog post for you!

All of these things I do
To get away from you

Altered Images – “I Could Be Happy”

Brent says: I was interviewing for a DBA job, and the final interview took place in their offices. I took a tour, and one of the IT rooms was loaded with a couple dozen student desks. Each desk was barely big enough for a single small flat panel, a keyboard, and a mouse. Any team member could reach out their arms sideways and touch the person on either side of them. I didn’t even care what their jobs were, or if I’d be working in that room – I was done right there. Any company that treats any of their people that way, isn’t somewhere I wanna work.


Preparation, Is It In You?

Backups aren’t just for databases.

BitLocker Blowup
Are you ready for the BitLocker of Doom?

Back in 2012, I started on a journey of sharing my technical knowledge by giving technical presentations. Now this might scare the living jeepers out of most people, but I found it exciting and fulfilling. Since then, I try to speak at ten events a year. Recently, I had the opportunity to speak at the most awesome SQL Saturday Houston. Everything lined up for this to be a slam dunk. I was scheduled to give a presentation on Entity Framework, a presentation that I had given many times including the PASS Summit last year.

The afternoon before the event I was in my hotel room about to rehearse the presentation one more time. I take out my laptop, hit the power button, and nothing. It doesn’t boot, it just sits there. I whip out the power cord, plug it in to the laptop and it boots! Disaster averted. As it turns out the battery totally failed. Now, I could have used the laptop as is and everything would have gone just fine but I pulled out a second laptop, started it up, and rehearsed from that. I didn’t have to install anything. I didn’t have to restore a database or move files. I just opened Visual Studio, SSMS, and PowerPoint and I was ready to go. When you’re a speaker you need to be ready for anything, especially hardware problems.

So in celebration of my near disaster here are some of my tips for a disaster proof presentation.

BRING A SECOND LAPTOP

It’s going to happen at some point. Your machine is going to die. Sad but true. You don’t know where, you don’t know when. So have a second machine ready when you present. Have all of your demos and decks ready to go before you walk on stage. I like to put the second machine in sleep mode so that if I have to switch to it, it’s up in seconds. If you don’t have a spare laptop lying around, see if you can borrow one from work or from a friend. Although now that I think about it, if a friend is going to let you borrow a laptop, that’s a great friend. You should take them to a nice dinner.

USE A FILE SYNCHRONIZATION SERVICE

I’m finding file synchronization services like Dropbox or OneDrive invaluable these days. These services serve three functions: 1. Putting a copy of your files in the cloud. I hear backups are good. 2. Letting you share your files between different machines. 3. Letting you share and collaborate with others. That’s all well and good, but the short of it is that when you update a file on one machine, these services sync the changes to the others. So when you update that PowerPoint presentation or that demo script, you can be sure it will make it to your demo machine.

CREATE A DEMO FREE BACKUP PRESENTATION

Ah, the infamous demo failure. It’s the bubonic plague of technical conferences. Don’t fall victim to this epidemic. Wash your hands after…wait, that’s not it. Create a backup presentation that has screenshots of your demo. This way when the demo plague hits you can make a nice easy transition to your backup deck with the screenshots like a pro. Some presenters have even recorded their demos and used the videos as a backup. Sounds like a good way to spoof cloud presentations if you ask me.

KEEP DEMOS AND PRESENTATION FILES ON A THUMB DRIVE

This may be overkill but hear me out. This is your last line of defense. The line must be drawn here! This far, no further! When all else fails, this is your secret weapon. You now have a portable copy of your presentation. Beg to use someone’s machine, pop in the thumb drive, and you go get ’em tiger!

Brent says: at a recent conference, another speaker was struck by disaster in the prep room: his video card died. I handed him my spare laptop, and off he went. So don’t just think of your spare as your own spare – it can help others, too.


[Video] Office Hours 2016 2016/05/25 (With Transcriptions)

This week, Angie, Erik, Tara, and Richie discuss Veeam, replication, setting up alerts, using multiple instances, and much more.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

Office Hours Webcast – 2016-05-25

Does turning on trace flag 2861 cancel out optimize for ad-hoc workloads?

Angie Rudduck: Let’s go with Gordon for the first question of the day.

Erik Darling: Oh no.

Angie Rudduck: He wants to know if switching on trace flag 2861, which he thinks caches zero-cost plans, will cancel out enabling optimize for ad hoc workloads?

Erik Darling: No, it shouldn’t because optimize for ad hoc workloads affects plans differently. Optimize for ad hoc the first time you run something you’ll get a stub. The next time it will cache the actual plan. So the zero-cost plan cache engine should not change that behavior. You can test it though if you don’t believe me.

Angie Rudduck: How would you test that?

Erik Darling: Run a zero-cost plan more than once.

Angie Rudduck: Hopefully that answered your question, Gordon.

 

Why am I getting update cascade errors?

Angie Rudduck: We’ll just go down the line. Wes. He is trying to use update cascade and he’s getting an error about it not being able to cascade directly. When he takes it away, it works. Is there a way to get around this error?

Erik Darling: Yes. Stop having foreign keys reference in a circular motion because it will just try to delete in a big, horrible ouroboros until all your data is gone.

Tara Kizer: So you won’t be able to do that specific foreign key. You may need to handle the delete, the update, in a stored procedure instead, or maybe a trigger. We don’t recommend foreign keys, cascading foreign keys, anyway.

Erik Darling: Why not?

Tara Kizer: I looked at your slide. The serializable isolation level, the most pessimistic isolation level it is.

[Laughter]

Erik Darling: That’s right. Wes, there’s another hidden gotcha to those foreign key cascading actions. That’s whenever they run they take out serializable locks on all the tables that have the cascading action happening on them. So if you’re doing a large delete or update or whatever with a cascading action, you could take out some pretty heinous locks on your objects that really block other processes out. So be very careful when using those.

Angie Rudduck: He’s so smart.

Erik Darling: Nah.

Richie Rump: From an architecture perspective, I’ve always wanted to be explicit with what we’re deleting and/or updating. So from an architect’s perspective, that’s why I just never used any of that stuff. At least not since Access 2.0.

Tara Kizer: Access 2.0.

Angie Rudduck: I didn’t know that was a thing. Moving along.

Richie Rump: You weren’t born yet, Angie.

Angie Rudduck: Speaking of born, tomorrow is my birthday, if you didn’t see it on Brent’s calendar.

Erik Darling: Happy tomorrow birthday because I’ll forget tomorrow.

Angie Rudduck: Don’t worry I will unless I look at my calendar.

Erik Darling: Brent didn’t send the card for me?

Angie Rudduck: Yeah, we’ll blame Brent. He’s not here.

 

How do I get around CRLF characters in an input file?

Angie Rudduck: J.H. has a situation. “A user entered a carriage return in a free form text field. Now trying to bulk insert along with format file relevant flat text file.” How can he skip the unwanted carriage return and use only the last column carriage return for that particular row?

Tara Kizer: You just need to pick a different delimiter, right?

Erik Darling: Right but I think the problem is that the delimiter is a carriage return for all the other rows.

Tara Kizer: You’re right.

Angie Rudduck: Can you just edit the file and delete that carriage return?

Erik Darling: Yeah, what Angie is saying, I would probably do a process that pre-cleans files for things like that.

Richie Rump: Yeah, you would encode it in a different string type or whatever, like Base64 or something or other.

Tara Kizer: I wonder if he could get around it by quoting the column so that the carriage return is in between quotes so it won’t read it. I’m not sure though.

Angie Rudduck: You’d still have to edit the actual file before. He’s going to have to do some processing on the file before it goes anywhere it sounds like.

Tara Kizer: But I wonder if you don’t need to do it though. If you select the option to—or not select the option obviously since it’s… well, okay, yeah.

Richie Rump: By quoting it with a pipe or something like that?

Tara Kizer: Something so that it’s surrounded, but yeah, the file does need to be modified to include that. It might just be easier just to get rid of the carriage returns and not allow those carriage returns from the application side.

Richie Rump: Yeah, unless you need them.

Tara Kizer: Yeah.

Angie Rudduck: Carriage returns.

Erik Darling: Yeah, as someone who’s done a lot of horrible file loading in his lifetime, I always try to make my delimiters as unlikely to happen in real life data as possible, either like double pipes or something like that just really, you’re not going to see it pop up too often.

Angie Rudduck: That’s a smart idea.

 

Could Veeam replication cause 8TB databases to go into suspect mode?

Angie Rudduck: All right, we’ll move on. Eugene wants to know has anyone seen issues running Veeam replication interfering with large SQL Server databases where the databases go into suspect mode? They have an 8 terabyte database and it seems like “each time our server team has enabled Veeam replication it caused issues where the 8 terabyte database goes offline.”

Erik Darling: I think it sounds like what’s happening is it’s freezing IO for a long time. A frozen IO makes SQL think that the database, or perhaps the drive that it’s on, is unavailable. So I would check your SQL error log, I would search for frozen IO, and see how long the IO is being frozen for. Because I think if it happens for over a certain amount of time, SQL just starts assuming that the disk that the file is on isn’t available and it just says, “Nope. Not here.” Sort of a similar thing happens, I’m sure, Tara, you’ve seen it, where a log file goes missing on a drive like that.

Tara Kizer: Mm-hmm.

Erik Darling: It will go right into suspect mode.

Tara Kizer: I wonder if it’s due to the database size because it’s having to freeze it for so long to take a snapshot of an 8 terabyte database. Just might be something that you just can’t do due to the size of the database. I would contact the vendor and see what they say as far as database size and how long that snapshot is going to take.

Angie Rudduck: I learned a trick for when you’re searching your SQL Server error log. We always look for frozen, but then you don’t see how long it was before it resumed. If you search for resume instead, you see both the frozen and the resumed messages. That gives you a little bit more insight into how long your database files were really getting held up.

Tara Kizer: Yeah and it should be less than a second that IO is frozen. I mean we’re talking milliseconds. I don’t know about an 8 terabyte database though.

 

Can I do both Veeam full backups and native SQL Server full backups?

Angie Rudduck: Yeah, well we’re just going to keep going with Veeam because Sean has a question as well. They have a maintenance plan, the full backup, running every day as well as Veeam doing backups every day. “How do these two tasks affect logs? Don’t both tasks truncate logs? Should one be scheduled before the other? Or should one not be done if the other one is being done?”

Tara Kizer: No, they don’t impact the logs because the logs care about LSNs. You can use either of them too, either of the full backups to do your restore. You just have to make sure you get the right log sequence. It does impact differentials though so if the Veeam backups aren’t using the copy only option, the differential base could be a Veeam backup or the regular native backup. If you’re not doing differentials then it doesn’t matter.

Erik Darling: Yeah. Only transaction log backups will truncate the log file, nothing else will. So backups don’t do that.

Tara Kizer: Right.

 

Should I run the publisher and distributor together, or the distributor and subscriber?

Angie Rudduck: All right. Paul says, “Currently at my job they are running the distribution service off the same database server that the database is on. I’m going to propose an additional replication to a reporting server; the original replication is merge. What considerations are there when you do this? Would it be better to run a separate distribution server?” Tara.

Tara Kizer: Yes, definitely.

Angie Rudduck: We all know that answer is yes.

Tara Kizer: The best practice is the publisher needs its own server. The distributor needs its own server. The subscriber needs its own server. The distributor does a lot of work and it should not be on the same box with the publisher especially. When I've had to share servers, I've put it on the subscriber, but it really depends upon the load of those boxes. You are going to need licenses for the distributor to be on another server, so that's something to take into consideration, but the best practice is that they're all on their own boxes.

Erik Darling: Heck yeah.

Angie Rudduck: What Tara said. Justin, I will not be turning 21, thank you. I am a tiny bit older than that.

Richie Rump: 19.

Angie Rudduck: I will be celebrating my one and only 29th birthday. Thanks very much.

Tara Kizer: We can do the other replication question from David.

Angie Rudduck: Yeah? All right. I have to scroll way up for that one now.

Tara Kizer: Okay.

 

How do I know when replication’s initial load is done?

Angie Rudduck: David said they're setting up replication. How does he confirm that replication's initial data load is complete so he can start the job that loads the app DBs?

Tara Kizer: So what you can do is, in Management Studio's Object Explorer, expand Replication, then Local Publications, and you'll see your publication. Right-click on it and you can do View Snapshot Agent Status; that's the agent that is snapshotting the publisher. Also right-click on the publication for View Log Reader Agent Status. Then you can right-click on Replication and launch Replication Monitor. I'm pulling mine up right now. Mine is in failure; I created one as a test. After you pull up Replication Monitor, you navigate to the publication and then you click on the All Subscriptions tab in the right pane. Then double-click on the row that shows up there. Then you can look at the publisher-to-distributor history and the distributor-to-subscriber history, and you will be able to see where it's at in the process. You can also insert a tracer token into the publisher. It's not a row in the table, but it's like a test row basically, and once it makes it to the subscriber you know the whole process is complete.
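The tracer token Tara mentions can also be posted and checked with T-SQL at the publisher; a sketch with a hypothetical publication name:

```sql
-- Post a tracer token into the publication database
EXEC sys.sp_posttracertoken @publication = N'YourPublication';

-- List posted tokens, then check publisher->distributor->subscriber latency
EXEC sys.sp_helptracertokens @publication = N'YourPublication';

EXEC sys.sp_helptracertokenhistory
    @publication = N'YourPublication',
    @tracer_id   = 1;   -- hypothetical; use an id returned by sp_helptracertokens
```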

Angie Rudduck: You’re so smart about replication.

Tara Kizer: I don’t know about that.

Angie Rudduck: I’m so glad we have you around.

Tara Kizer: There are definitely smarter people.

Angie Rudduck: To answer all these questions for these poor souls who still have to use it.

Erik Darling: It’s nice because I get to zone out while you…

Angie Rudduck: I know, I’m like, I’ll read the questions to see who should answer it next.

 

How do I find out all SSIS packages that ran under a specific login?

Angie Rudduck: Nate wants to know if there’s a way to find all SSIS packages that have been executed under a specific server login. He’s trying to remove this login because the employee has left the company. Anybody got any suggestions?

Tara Kizer: If you're using the SSISDB catalog, you can probably go through its tables, which have that information. If it's not using that, there is a logging table in msdb somewhere; I don't remember what it's called, but there is a table there. So you're going to have to do some sleuthing here to figure out which tables they are, but these are things that people have blogged about, so you should be able to find which tables they are.

Erik Darling: Yeah, we don’t do a ton of work with SSIS in these parts. So if you really want a better answer for that, you might want to go to dba.stackexchange.com and ask around there if nothing in the great blog world turns up for you.
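Two likely places to look, sketched below with a hypothetical login name; the msdb table only has rows if SSIS logging to SQL Server was enabled for the packages:

```sql
-- SSISDB catalog (2012+ project deployment model)
SELECT execution_id, folder_name, project_name, package_name,
       executed_as_name, start_time
FROM SSISDB.catalog.executions
WHERE executed_as_name = N'DOMAIN\departed.user';

-- Legacy SSIS logging table in msdb (package deployment model)
SELECT source, starttime, operator, message
FROM msdb.dbo.sysssislog
WHERE operator = N'DOMAIN\departed.user';
```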

 

When did my AG last fail over?

Angie Rudduck: Tammy wants to know if there is a way to use T-SQL to determine when databases and availability groups failover last. She’s been looking through DMVs but hasn’t found anything.

Tara Kizer: Yes. I don’t have the DMVs memorized but, yeah, it’s in there.

Angie Rudduck: It’s dmv_ag_failover. Just kidding. Don’t really go to that.

Tara Kizer: I'm pretty sure at least one of them shows when it failed over last. If not, search through the error log, because it's definitely in there.
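Two places to dig, sketched below: the error log, and the AlwaysOn_health extended-events session that availability groups set up by default (adjust the .xel path to your LOG directory if needed):

```sql
-- Error log: replica role-change messages mention the availability group
EXEC sp_readerrorlog 0, 1, N'availability replica', N'chang';

-- AlwaysOn_health session files: look for replica state changes
SELECT CAST(event_data AS xml) AS event_xml
FROM sys.fn_xe_file_target_read_file('AlwaysOn_health*.xel', NULL, NULL, NULL)
WHERE event_data LIKE '%availability_replica_state_change%';
```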

 

Are there any drawbacks of setting up SQL Server alerts?

Angie Rudduck: A. Deals wants to know if there are any impacts of setting up alerts from 17 through 25, 823, 824, 825, and 832. Did you get that off of our website?

Erik Darling: I think the bigger impact would be if you don’t set them up.

Angie Rudduck: Yeah.

Erik Darling: And then one day your database is corrupt and you don’t know how long or what happened to it or any of those things.

Tara Kizer: I think that our link says 19 through 25. I’m not too sure what you’re going to get for 17 to 18 so you might get too much to work on. I’m not sure about 832 either. I know 823 to 825 are in there but I’m not positive about 832.

Erik Darling: Yeah, I think if you turn on alerts for 17 and 18 you're going to get a lot of stuff for failed logins and some of the lower level old SQL Server errors. So we try to stick to 19 through 25, which are the severe ones, and then 823, 824, and 825, which cover the hard and soft IO errors that come along with corruption. What about that one from Dennis?
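Setting these up is a handful of msdb calls; a sketch for one severity alert and one error-number alert (the operator name is a placeholder, and you'd repeat the pattern for severities 19-25 and errors 823-825):

```sql
-- Alert on severity 19, emailing a pre-existing operator,
-- at most once per minute per alert
EXEC msdb.dbo.sp_add_alert
    @name = N'Severity 019',
    @severity = 19,
    @delay_between_responses = 60,
    @include_event_description_in = 1;

EXEC msdb.dbo.sp_add_notification
    @alert_name = N'Severity 019',
    @operator_name = N'The DBA Team',   -- hypothetical operator
    @notification_method = 1;           -- 1 = email

-- Error-number alert for one of the corruption-related errors
EXEC msdb.dbo.sp_add_alert
    @name = N'Error 823',
    @message_id = 823,
    @severity = 0,
    @delay_between_responses = 60,
    @include_event_description_in = 1;
```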

 

Should I stack multiple instances on the same physical box?

Angie Rudduck: Dennis, all right. Dennis says he's got an opinion question. Using SQL 2008 R2 Enterprise, what's better: five servers with 8 to 12 cores and 128 gigabytes of RAM each, or one 60-core server with 1 terabyte of RAM? Oh I see.

Erik Darling: So I would only assume that you’re talking about stacking a bunch of instances on the server which we are all vehemently opposed to. Stacked instances are horrible ideas.

Angie Rudduck: Now if you have a bunch of Lego servers and you can break them apart and have five different servers or one mega Transformer server and you’re looking at spec’ing this out before purchasing, maybe there’s some more questions. What do you guys think if he’s not instance stacking? If he’s legitimately just spec’ing out, should he buy one server or five servers? You should buy five servers if you need five instances.

Erik Darling: Right. So it sounds like you're either going to stack instances or you're going to stack applications together on a single instance. Either way, I'm very much opposed. If you have no choice then you have no choice. Or if, you know, you have your crappy app server that you stick a whole bunch of internal apps on, that's one thing. At the size of servers you're talking about, though, I'm looking at very special, specific hardware for each one. So if there's a performance problem with one database or one application, I'm troubleshooting a performance problem with that one database and that one application, not with that one database or one application and the 28 that you have surrounding it. Make life easier on yourself. Separate things as much as you can.

Angie Rudduck: Dennis follows up saying that he was gifted two of the big servers and now he has to find a use for them. If you have spare hardware sitting around…

Erik Darling: Dev servers.

Angie Rudduck: Yeah, staging. Dev test environments.

Erik Darling: I'll forward a DBCC CHECKDB your way.

Angie Rudduck: Because you’re doing that, right? Nightly? Full checks?

 

What’s a good way of encrypting data in a database?

Angie Rudduck: What’s a good way of encrypting data in a database? Cameron wants to know.

Erik Darling: Transparent data encryption if you’re on Enterprise edition. If not, there are the various third-party tools out there. I believe one from a company called—well it used to be called GreenSQL now it’s called something I can’t remember the name of, Hexa… Dexa… Donkey something. They changed their name to something.

Angie Rudduck: So look up GreenSQL and see what really pops up.

Erik Darling: Yeah, see what their new name is, and if you can remember it and spell it, shoot them an email about how to encrypt your database. Unless you're on Enterprise, then use TDE, then you can come back next week and ask us questions about TDE.

Richie Rump: Yeah, I’ve used TDE as well as you know regular third-party or just middleware-type components where we would just encrypt certain parts of data and then throw the encryption stuff into the database.

Erik Darling: Yep.

 

If I’m having CXPACKET issues, what should I do with MAXDOP?

Angie Rudduck: J.H. wants to know, he has a particular database sometimes experiencing frequent CXPACKET waits. He wants to know if changing MAXDOP is recommended. If yes, could it hurt performance for other databases running at the same time?

Erik Darling: You betcha. MAXDOP, unless you're on SQL Server 2016, is a server-level setting. So if you change MAXDOP to 1, then you will affect all the other databases. Your best bet is to actually do some work tuning queries and indexes on that database to make sure that query costs stay low enough that they don't have frequent CXPACKET problems.

Angie Rudduck: What if he doesn’t have MAXDOP configured at all? Could that possibly be his problem?

Erik Darling: Sure could.

Angie Rudduck: What would you recommend starting out with MAXDOP?

Erik Darling: I don’t know, it depends on how many cores you have, right?

Angie Rudduck: It’s a multi-layered questionnaire.

Richie Rump: Would it be better to just set MAXDOP in a particular query?

Tara Kizer: You could.

Erik Darling: If you change that, sure. If you can change the code, sure.

Angie Rudduck: I like asking questions on questions.
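Pulling the options from the exchange above together, a sketch (the values and table name are placeholders, not recommendations for any particular server):

```sql
-- Server-wide (pre-2016, the only knob; affects every database on the instance)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;

-- Raising cost threshold keeps cheap queries serial, per Erik's point on cost
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;

-- SQL Server 2016+: scope MAXDOP to a single database instead
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

-- Or per query, if you can change the code
SELECT COUNT(*)
FROM dbo.BigTable          -- hypothetical table
OPTION (MAXDOP 1);
```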

 

What’s the best isolation level to use with Entity Framework?

Angie Rudduck: Let’s go with developers from Paul. The developers are using Entity Framework to create their application, what’s the best isolation level to use?

Richie Rump: Well what would be the best isolation used without any framework? Same thing?

Tara Kizer: Who is the question by?

Angie Rudduck: Paul. He’s kind of in the middle. Right below all the happy birthdays.

Tara Kizer: I sorted by asker so maybe it will find it.

Angie Rudduck: Oh. So smart.

Erik Darling: There’s really no single best isolation level. It kind of depends on what you end up running into as your application grows and matures. So if you run into a lot of weird blocking and deadlocking and you’ve tried your best to sort it out then read committed snapshot isolation could really help you. Otherwise just leave SQL as is.
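Turning on read committed snapshot isolation is a single database-level switch; WITH ROLLBACK IMMEDIATE kicks out open transactions so the change can take effect (YourDb is a placeholder):

```sql
ALTER DATABASE YourDb
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;
```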

Tara Kizer: I don’t think that Entity Framework being used is even related to why you would set the isolation level to be something different than the default.
Richie Rump: Right, exactly. If you are having problems with locking and stuff like that, then you need to fix the queries, the Entity Framework queries. Either you need to change the LINQ statement, or you just trash the LINQ statement if it's too complex and then write your own SQL statement to be embedded either in the code or in a stored procedure.

Erik Darling: My preference is to have everything in a stored procedure and then just have Entity Framework call the stored procedure.

Richie Rump: Yeah, it’s up to the team itself. I’ve been on some teams where they didn’t want any stored procedures because they wanted the app to update everything when they do deployments because they’re doing deployments once a day and it’s like, “Oh, okay, we’ll do it that way.” And I’ve been on some other teams where they’re all stored procedures. Where we have thousands of stored procedures sitting out there and everyone is rewriting the same stored procedure over again because they can’t find the old one. So you know, it’s up to the team on how they want to do it.

Angie Rudduck: More horror stories there, Richie?

Richie Rump: Oh, I’ve got many. So many.

 

I’m getting backup IO errors and CHECKDB errors on my backups.

Angie Rudduck: Let’s move on to Gordon. He says SQL log is reporting VSS backup IO errors and suggesting running DBCC CHECKDB on the VSS backup. Is that relevant and if so, how does he do that?

Tara Kizer: I don’t know how you would do that.

Erik Darling: I don't think… there's currently no way to run DBCC CHECKDB on a backup. You would have to just run it on the database itself.

Tara Kizer: You have to restore it.

Erik Darling: Yeah.

Angie Rudduck: Yeah, that seems kind of strange. Maybe it's referring to what we talked about earlier: taking your backup, restoring it elsewhere, and then running CHECKDB.

Tara Kizer: I’d be interested in the error that is actually suggesting running DBCC CHECKDB on the VSS backups. I wonder if you’re misreading the error. If you can pop the error in the message box, we could take a look at it. We might not be able to answer it though.

Angie Rudduck: Yeah. We’ll see if he gets that error in there.

 

How can I get a readable replica database without replication?

Angie Rudduck: In the meantime, let’s see what Nate has to say. “Given the complexities and general distaste for replication, what’s the best alternative to maintain a quote ‘read replica type of database’ in Standard Edition?”

Tara Kizer: Wait, I know this answer. Availability groups. Even with Standard Edition you can use replicas. Enterprise Edition gives you asynchronous replicas, but am I right that Standard Edition offers synchronous replication? Or does it not offer synchronous at all?

Erik Darling: No. SQL Server 2016.

Tara Kizer: 16?

Erik Darling: Right. That’s when availability groups came to Standard Edition. They are asynchronous only.

Tara Kizer: That’s what I was trying to remember.

Erik Darling: And you don't get a readable replica with it. So it really does put the "basic" in basic availability group. So what I would do, depending on how fresh the data needs to be, is probably try log shipping, with the caveat that every X hours people are going to get kicked out so the data can be updated by restoring logs. Or, if I'm using mirroring, I'm going to take a database snapshot and go through all the pain of programmatically funneling people off to a different database snapshot.

Tara Kizer: And you have to make sure that you refresh that snapshot on a schedule. Otherwise it’s going to just keep growing and growing. I would actually use replication in this scenario though. Yeah, there are some issues and it can be complex and it takes a lot of time to set up on a large database but it works. If you really don’t like it, move to Enterprise Edition or upgrade to 2016 after it’s released so Standard Edition has the readable secondaries.
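The mirror-plus-snapshot approach Erik and Tara describe looks roughly like this on the mirror server (all names and paths are hypothetical; NAME must match the source database's logical data file name):

```sql
-- Create a snapshot of the mirror; readers query the snapshot, not the mirror
CREATE DATABASE YourDb_Snap
ON (NAME = YourDb_Data,                       -- logical data file of YourDb
    FILENAME = N'X:\Snaps\YourDb_Snap.ss')
AS SNAPSHOT OF YourDb;

-- On a schedule: funnel readers off, drop, and recreate the snapshot
-- so the sparse file doesn't just keep growing
DROP DATABASE YourDb_Snap;
```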

Angie Rudduck: Yeah, just because we don’t like replication all the time, it doesn’t mean that it’s not valuable. It just means that it’s not used correctly a lot of the times.

Erik Darling: Yeah, unfortunately a lot of people…

Tara Kizer: And it’s hard to troubleshoot.

Erik Darling: Yeah, there’s that. But a lot of people use it for HADR which it really isn’t a good solution for.

Tara Kizer: I like it for reporting if you can’t use availability groups to do your reporting. You just have to know that you might be spending a lot of time working on replication if that’s the technology you use. It took a lot of my time.

Erik Darling: Yeah, make sure that you are very precise in what you’re replicating over.

Tara Kizer: Yeah.

Erik Darling: If you can put filters on it, put filters on it. If you can get away with only moving certain tables that are necessary for whatever reporting over, only move those tables over. Move as little as possible. Don’t just push everything.

Angie Rudduck: Yeah, that’s good.

 

Should I disable my SA account, rename it, or both?

Angie Rudduck: Okay, Konstantinos wants to know, auditors are telling them—excuse me, they’re requesting to disable their SA account or to rename it. Is it safe to do either of these?

Erik Darling: Yes.

Angie Rudduck: And when is it the best time to do it? Do you do it in a production environment that is two years old for example?

Tara Kizer: I love that. I mean you’re not supposed to be using the SA account. Disable it. Create a new account and grant that sysadmin or just rename the SA account to something else. Don’t ever use SA. And don’t ever even use this other account. Give it a really complex password, put it in a lockbox, never use it. You should be connecting with your own Windows account or you know if you’re having Windows authentication issues, some other SQL account, but really Windows authentication, you should almost never need the SA account. You don’t really need to do this in a maintenance window unless the other things are using SA which they shouldn’t be.

Erik Darling: Yeah, just a quick word of warning on renaming SA: in the past, some SQL updates have broken when SA was renamed, because even if the account is disabled, the update was attempting to do something with the account and got screwed up. So you don't have to rename it; disabling it is fine.

Tara Kizer: Yeah.
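Both options are one-liners; the new name below is a placeholder:

```sql
-- Disable SA (the safer option, per the patching caveat above)
ALTER LOGIN sa DISABLE;

-- Or rename it
ALTER LOGIN sa WITH NAME = [definitely_not_sa];
```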

 

How do I convince my server admin that multiple instances are a bad idea?

Angie Rudduck: All right, I have a good one from Karen. How can she convince the server admin that multiple instances are a bad idea? I’m guessing she means stacked instances.

Tara Kizer: Why do they think it's a good idea? I mean, why do we have to install more? It's hard to maintain. How are you going to split the resources? You know, memory, CPU, everything is shared. I mean, yeah, you can use Resource Governor and specify your server configurations, but I don't know.

Angie Rudduck: It's more a matter of worse performance than it is cheaper, Karen. It will cost them more in performance headaches, with your clients complaining, than it would if, for instance, you had a couple of VMs and you were being really diligent about not overcommitting your host's resources across all of your VMs.

 

What’s the right threshold to use for index maintenance jobs?

Angie Rudduck: Let's see, we've got a few more minutes. Did anybody see any? I thought I saw one I liked but I'm trying to find it.

Erik Darling: Yeah, there was a Mandy Birch one.

Angie Rudduck: Oh, yes, okay. I didn't see hers. Mandy wants to know about determining the optimal index fragmentation thresholds to use when deciding between reorgs and rebuilds. They've been using the default 5 percent threshold for reorgs and 30 percent for rebuilds, but they've been reading those are probably unnecessarily low. What do you guys recommend? What do we recommend?

Tara Kizer: What are you trying to solve? Is fragmentation really a problem?

Angie Rudduck: Are you on a Standard Edition?

Tara Kizer: Yeah, if you’re on Standard Edition, you’re not going to get online rebuilds so these indexes are coming offline while you’re rebuilding them, but, yeah. I’m not sure why the SQL industry just wants to rebuild their indexes all the time to get rid of fragmentation. It can help you with storage because the more fragmentation you have the more storage it’s using but it’s not helping you with performance. I mean you are getting updated statistics but rebuild indexes less frequently than you probably are doing now and I would set much higher thresholds.

Angie Rudduck: Yeah, I think commonly we say anywhere from 30 to 50 for reorganize and even 60 to 80 percent for rebuilds. Something else I would say is check out Erik’s blog about just updating stats.

Tara Kizer: Update statistics daily or more often and then rebuild or reorg indexes less frequently. I mean I don’t think you even need to do it weekly.

Angie Rudduck: Is there any performance impact to updating statistics?

Tara Kizer: It does add a load, yeah, especially if you’re doing a full scan on a large table but you can have custom scripts to decide on what the sampling is going to be. The bigger the table, the less sampling you may need.

Erik Darling: The other thing that can happen is that if you update a lot of statistics, then a lot of queries decide to recompile. Also, updating the statistics themselves can have an IO impact on your server, you know, reading a lot of statistics on big tables can certainly make some stuff happen. There is no such thing as free maintenance, but updating statistics is a much more lightweight operation than index reorgs or rebuilds.
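The sampling trade-off Tara describes is just a WITH clause; a sketch with hypothetical table names:

```sql
-- Full scan is affordable on a small table
UPDATE STATISTICS dbo.SmallTable WITH FULLSCAN;

-- Lighter sampling on a big table to limit the IO hit
UPDATE STATISTICS dbo.BigTable WITH SAMPLE 5 PERCENT;
```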

Angie Rudduck: Yeah. All right, well, it’s 45 after the hour now in everybody’s time zone. Should we call it a day?

Erik Darling: Yeah, might as well.

Angie Rudduck: Let’s all go to the pool, drink, or something, you know.

Erik Darling: Boat drinks.

Angie Rudduck: Bye guys. See you next week.

All: Bye.


Breaking News, Literally: 2014 SP1 CU6 Breaks NOLOCK

Breaking News, SQL Server
31 Comments

Just announced on the Microsoft Release Services blog, if you run a SELECT query with the NOLOCK hint and your query goes parallel, it can block other queries.

This is a bug, and it will be fixed soon, but it is a very big deal for people who think NOLOCK means, uh, NOLOCK.

More technical details:

  • While one transaction is holding an exclusive lock on an object (Ex. ongoing table update), another transaction is executing parallelized SELECT (…) FROM SourceTable, using the NOLOCK hint, under the default SQL Server lock-based isolation level or higher. In this scenario, the SELECT query trying to access SourceTable will be blocked.
  • Executing a parallelized SELECT (…) INTO Table FROM SourceTable statement, specifically using the NOLOCK hint, under the default SQL Server lock-based isolation level or higher. In this scenario, other queries trying to access SourceTable will be blocked.

If you haven’t already installed CU6, don’t.

If you have installed it, Microsoft recommends that you leave it in place unless you experience this exact issue, at which point you’d need to uninstall CU6.

To know when a fix comes out, watch the CU6 download page, or subscribe to SQLServerUpdates.com and we’ll email you.

UPDATE 2016/05/31 – It’s back! Microsoft has released a fixed CU6.


sp_Blitz v51: @IgnorePrioritiesAbove = 50 gets you a daily briefing

sp_Blitz
15 Comments

You have a monitoring tool, but you’ve set up an email rule to dump all the alerts into a folder.

You’re not particularly proud of that, but it is what it is. You’re just tired of the spam.

Group query in the registered servers list

But when you get in in the morning, you want a simple screen that shows you if anything is really and truly broken in your environment.

Step 1: set up a list of registered servers or a Central Management Server. This lets you execute a single query across multiple servers.

Step 2: start a group query. Right-click on the group of servers, and click New Query.

Step 3: run sp_Blitz @IgnorePrioritiesAbove = 50, @CheckUserDatabaseObjects = 0. This gets you the fast headline news, especially when used with the improved priorities in the latest version in our SQL Server download kit:
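Run against the server group, the call from Step 3 is just:

```sql
EXEC sp_Blitz
    @IgnorePrioritiesAbove = 50,
    @CheckUserDatabaseObjects = 0;
```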

sp_Blitz across multiple servers

You’ll discover when:

  • Corruption has been detected
  • Databases in full recovery mode aren’t getting log backups
  • You’re running a known dangerous build of SQL Server
  • Poison waits have struck
  • And much more

On Monday mornings, start here. I know, you’re probably not going to find anything, because your servers are in flawless shape and nothing ever goes wrong.

But just in case….


Spring Cleaning Your Databases

Index Maintenance, SQL Server
11 Comments

Even with lots of monitoring in place, we should perform periodic checks of our SQL Servers.

Think of this like “Spring Cleaning”, except I would recommend that it be more frequently than just once a year. Doing it monthly might be a bit ambitious due to our busy schedules, but quarterly could be achievable.

Below are the areas I recommend for Spring Cleaning your databases.

SP_BLITZ

There is so much good stuff reported by sp_Blitz. You'll find common health, security, and performance issues in there. Once you've fixed the issues of concern, you should periodically check if there are any new issues being reported. Did someone enable xp_cmdshell? Is it reporting any poison waits? Is there a new sysadmin that you weren't aware of? This once happened at my previous job: the Desktop Support team had added a user to our DBA group because it resolved an issue. This user was not even in IT. Imagine the damage that could have been done, since it was common for most people to have Management Studio installed.

For more information about sp_Blitz, go here and here.

SP_BLITZINDEX

sp_BlitzIndex does a sanity check of the indexes in your database. You can increase your query performance by having the right indexes on your tables. sp_BlitzIndex helps with that. Don’t just look at it once, review the output on a regular basis, especially the High Value Missing Index, Duplicate Keys, Borderline Duplicate Keys, Unused NC Index, Active Heap sections.

For more information about sp_BlitzIndex, go here.

HIGH-VALUE MISSING INDEXES

When adding high-value missing indexes, be sure you aren’t creating a duplicate or borderline duplicate key index. Review your current indexes to determine if the high-value missing index can be combined with an existing one. Maybe an existing index just needs some INCLUDEs.

DUPLICATE AND BORDERLINE DUPLICATE INDEXES

Duplicate key indexes mean that you have two or more indexes that have the same exact key. Borderline duplicate key indexes mean two or more indexes start with the same key column but do not have completely identical keys. You may be able to combine these into one index, but analysis is needed as there are some that shouldn’t be touched.

Check this out to get more details about duplicate and borderline duplicate keys.

UNUSED INDEXES

Unused indexes are tricky. When you are analyzing this data, you have to keep in mind that this data is only available since the last restart. If you rebooted the server yesterday and are viewing the data today, you might see a lot of unused indexes in the list. But are they really unused? Or have the indexes not just been used YET? This is a very important point when deciding to disable or drop an index based on this list. If you reboot your servers monthly due to Microsoft security patches, consider reviewing the list the day prior to the reboot. I once dropped an index 3 weeks after the server was rebooted, thinking that the entire application workload must have been run by now. A few days later, I got a call on the weekend that the database server was pegged at 100% CPU utilization. I reviewed which queries were using the most CPU and found that the top query’s WHERE clause matched the index I had dropped. That query only ran once a month, which is why it hadn’t recorded any reads yet. We later moved that monthly process to another server that was refreshed regularly with production data.
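A sketch of this kind of check; always compare against the instance start time first, since the usage DMV resets on restart:

```sql
-- When did the usage counters last reset?
SELECT sqlserver_start_time FROM sys.dm_os_sys_info;

-- Nonclustered indexes with no reads recorded since then (current database)
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       s.user_updates                -- writes still paying the maintenance cost
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id = i.object_id
      AND s.index_id = i.index_id
      AND s.database_id = DB_ID()
WHERE i.index_id > 1                 -- skip heaps and clustered indexes
  AND COALESCE(s.user_seeks, 0)
      + COALESCE(s.user_scans, 0)
      + COALESCE(s.user_lookups, 0) = 0;
```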

HEAPS

Generally speaking, a table should have a clustered index. A good exception is staging tables, such as those needed for ETL processing. When a table doesn’t have a clustered index, it’s called a heap.

Heaps are great for INSERTs but not for SELECTs. DELETEs leave the space behind unless a table lock is used during the delete, either via a table hint or by lock escalation. Empty space takes up space in backups, restores and memory. If you scan the heap, you must also scan the empty space even if there are no rows. And then there’s UPDATEs. UPDATEs can cause forwarded records if the updated data does not fit on the page. Forwarding pointers are used to keep track of where the data is (a different page). This means extra and random IO.

Heaps cause fragmentation, extra reads and sometimes a huge waste of space.

Just say NO to heaps unless it’s a staging table or some other really, really good reason.
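The forwarded-record problem described above is measurable; a sketch for the current database (the DETAILED mode needed to populate the counter can be slow on big tables):

```sql
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.page_count,
       ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
WHERE ps.index_type_desc = 'HEAP'
  AND ps.forwarded_record_count > 0;
```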

DATABASE SIZE AND GROWTH

Is your database growing faster than you expect? Knowing how big your database is and how fast it is growing can help you plan for future hardware upgrades including memory and disk space.

DISK SPACE

Most of us have monitoring in place to alert when a drive is running out of free space. Wouldn’t it be nice to proactively add storage before you receive an alert? Keep track of total and free disk space over time to help you determine when to add more space or even when to order more storage.

VIRTUAL LOG FILES

If a database has a high number of Virtual Log Files (VLFs), it can impact the speed of transaction log backups and database recovery. I once had a database with 75,000 VLFs. I didn't even know what a VLF was at the time (hence having so many). After rebooting the server, a mission critical database with extremely high SLAs took 45 minutes to complete recovery. We were in the process of opening a support case with Microsoft when the database finally came online. The next day, I contacted a Microsoft engineer and learned about VLFs.
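Counting VLFs is quick:

```sql
-- One row per VLF in the current database; a high row count is the red flag
DBCC LOGINFO;
```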

For more information about VLFs and how to fix them, go here.

BACKUP TABLES

During a production problem, you might be saving data to a backup table to later review or possibly restore from. But are you remembering to drop these objects at a later time? Search for key words such as “backup”, “bkp”, “bak”, “temp”, or even your name or your initials.
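That keyword search can run against the catalog views; a sketch for the current database:

```sql
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name,
       t.create_date
FROM sys.tables AS t
WHERE t.name LIKE '%backup%'
   OR t.name LIKE '%bkp%'
   OR t.name LIKE '%bak%'
   OR t.name LIKE '%temp%'
ORDER BY t.create_date;
```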

SP_WHOISACTIVE

Hopefully you’re saving sp_WhoIsActive data to a table regularly, such as every 30-60 seconds. You may be using this data to help you find blocking, slow-performing queries, bad execution plans, or tempdb contention. But you probably are looking at the data when there is a current production problem. It might make sense to review the data periodically even if there isn’t a production problem. You might be able to spot a problem or a trend before it becomes a larger problem.

If you aren’t saving sp_WhoIsActive data, check this out for one method.
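One way, assuming Adam Machanic's sp_WhoIsActive is installed and the destination table already exists (its @return_schema option can generate the CREATE TABLE for you); schedule this from an Agent job every 30-60 seconds:

```sql
EXEC sp_WhoIsActive
    @get_plans = 1,
    @destination_table = 'dbo.WhoIsActiveLog';   -- hypothetical table name
```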

SP_BLITZCACHE

For sp_BlitzCache, I would take a peek to see if anything stands out. Capture the output of sp_BlitzCache into a table so that you can compare it to previous checks. Is there a stored procedure that's running slower than it did previously? Is there anything surprising in there, such as a stored procedure executing several times per second? I once supported a system that had a stored procedure running several hundred times per second. This isn't necessarily a problem, but I wasn't sure if it should be running this often. After speaking with the developer, I learned that it was an application bug. The developer fixed the bug in the next release, and I verified it by checking how often it was executing.
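Recent versions of the script can write their output to a table directly; a sketch (database, schema, and table names are placeholders, and parameter names may vary by version of the script, so check your copy):

```sql
EXEC sp_BlitzCache
    @OutputDatabaseName = N'DBAtools',
    @OutputSchemaName   = N'dbo',
    @OutputTableName    = N'BlitzCacheLog';
```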

For more information about sp_BlitzCache, go here.

WRAPPING IT UP

Wouldn’t it be great to automate collecting all of this data? I leave that exercise up to the reader, but I think it’s important to also do manual checks of your SQL Servers. Set aside some time to proactively fix problems.

How often do you “Spring Clean” your SQL Servers and databases?

What else would you add to this list?

Brent says: as a DBA, it’s so easy to become completely reactive, putting out fires. There’s always gonna be a fire to distract you – you just have to buckle down and set aside time to get proactive.


Temporal Tables, Partitioning, and ColumnStore Indexes

This post is mostly a thought experiment

The thought was something along the lines of: I have a table I want to keep temporal history of. I don’t plan on keeping a lot of data in that table, but the history table can accumulate quite a bit of data. I want to partition the history table for easy removal of outdated data, and I want to use ColumnStore indexes because I’m just so bleeding edge all of my edges are bleeding edges from their bloody edges.

Fair warning here

This post assumes you’re already familiar with temporal tables, Partitioning, and ColumnStore indexes. I’m not going to go into detail on any of the subjects, I’m just walking through implementation. If you’re interested in temporal tables, Itzik Ben-Gan has a two part series here and here. We have a list of great Partitioning resources here, and of course, Niko Neugebauer has a (so far) 80 part series on ColumnStore over here.

On to the experiment!

The hardest part was getting the ColumnStore index on the history table. Let’s look at the process. There’s a lot of braindumping in the code. Feel free to skip the setup stuff, if you don’t care about it.

Wew

If you're here, I should make a couple notes. Microsoft added a really cool feature to Temporal Tables recently: the ability to mark the period columns as hidden. This is gravy for existing apps and tables, because the row versioning columns don't show up alongside all your other data in SELECT * queries (they're still stored, just not returned unless you ask for them by name). It would be really nice if they'd add valid date ranges (read: expiration dates) to the syntax, but hey, maybe in the next RC…
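Here's a sketch of what marking the period columns as hidden looks like on an existing table (the table, column, and constraint names are all hypothetical):

```sql
/* Sketch: add hidden period columns to an existing table. HIDDEN keeps
   them out of SELECT * results, so existing apps keep working unchanged.
   Defaults are required when adding period columns to a populated table. */
ALTER TABLE dbo.Rockwell
ADD SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START HIDDEN
        CONSTRAINT DF_Rockwell_SysStart DEFAULT SYSUTCDATETIME(),
    SysEndTime   DATETIME2 GENERATED ALWAYS AS ROW END HIDDEN
        CONSTRAINT DF_Rockwell_SysEnd
        DEFAULT CONVERT(DATETIME2, '9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime);
```

Existing INSERT statements that don't list columns keep working too, since hidden columns are excluded from implicit column lists.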

I explicitly named our history table, because SQL will name it something horrible and dumb if you don't. You don't have much control over history table creation or indexing at conception, but you can make changes afterwards. By default, SQL puts a clustered index on your history table that mirrors the clustered index definition of the base table.
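A sketch of naming the history table at creation time (the table definition here is hypothetical; only the RockwellHistory name comes from the errors later in this post):

```sql
/* Sketch: explicitly name the history table. Skip the HISTORY_TABLE
   clause and SQL Server generates a name along the lines of
   MSSQL_TemporalHistoryFor_<object_id>. Horrible and dumb, as promised. */
CREATE TABLE dbo.Rockwell
(
    Id           INT IDENTITY(1, 1) NOT NULL,
    SomeValue    NVARCHAR(100) NULL,
    SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    SysEndTime   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime),
    CONSTRAINT PK_Rockwell PRIMARY KEY CLUSTERED (Id)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.RockwellHistory));
```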

ColumnStore Party!

So let’s see here. I have a base table. I have a history table. I have a Partitioning Scheme and Function. How does one get their history table to Partitioned and ColumnStored status? With a few catches!

First, you have to drop the index on the history table:
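The drop itself is one line; the index name here is an assumption, so check sys.indexes on your system for the real one:

```sql
/* Sketch: drop the auto-created clustered index on the history table.
   ix_RockwellHistory is an assumed name - query sys.indexes to confirm. */
DROP INDEX ix_RockwellHistory ON dbo.RockwellHistory;
```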

The first thing I tried was just creating my Clustered ColumnStore index in place:
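A sketch of that attempt, with an assumed index name and no partition scheme specified:

```sql
/* Sketch of the attempt that fails: creating the clustered ColumnStore
   index without putting it on the partition scheme. */
CREATE CLUSTERED COLUMNSTORE INDEX ccx_RockwellHistory
ON dbo.RockwellHistory;
```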

But that throws an error!

Msg 35316, Level 16, State 1, Line 119
The statement failed because a columnstore index must be partition-aligned with the base table.

For reference, trying to create a nonclustered ColumnStore index throws the same error.

The next thing I did was create a nonclustered index, just to make sure I could create something aligned with the Partitioning. That works!
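A sketch of that aligned nonclustered index; the partition scheme name (psRockwell) and partitioning column (SysEndTime) are assumptions:

```sql
/* Sketch: a plain nonclustered index created on the partition scheme,
   just to prove something partition-aligned can be built here. */
CREATE NONCLUSTERED INDEX ix_RockwellHistory_SysEndTime
ON dbo.RockwellHistory (SysEndTime)
ON psRockwell (SysEndTime);
```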

Please and thank you. Everyone’s a winner. But can you create ColumnStore indexes now?

Nope. Same errors as before. Clearly, we need a clustered index here to get things aligned. The problem is, you can’t have two clustered indexes, even if one is ColumnStore and the other isn’t.

Msg 35372, Level 16, State 3, Line 121
You cannot create more than one clustered index on table 'dbo.RockwellHistory'. Consider creating a new clustered index using 'with (drop_existing = on)' option.

Ooh. But that DROP_EXISTING hint! That gives me an idea. Or two. Okay, two ideas. Either one works; it just depends on how, uh, bottom-side retentive you are about how things are named. This will create a ColumnStore index over your clustered index, using DROP_EXISTING.
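A sketch of idea one, assuming the existing clustered index is named ix_RockwellHistory and the partition scheme is psRockwell:

```sql
/* Sketch, idea one: reuse the existing clustered index's name and let
   DROP_EXISTING swap the rowstore index out for ColumnStore in place.
   Downside: the name no longer tells you it's ColumnStore. */
CREATE CLUSTERED COLUMNSTORE INDEX ix_RockwellHistory
ON dbo.RockwellHistory
WITH (DROP_EXISTING = ON)
ON psRockwell (SysEndTime);
```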

This will drop your current Clustered Index, and create your Clustered ColumnStore index in its place, just with a name that lets you know it’s ColumnStore. Hooray. Hooray for you.
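A sketch of idea two, with the same assumed names as above:

```sql
/* Sketch, idea two: drop the rowstore clustered index outright, then
   create the ColumnStore index under a name that says what it is. */
DROP INDEX ix_RockwellHistory ON dbo.RockwellHistory;

CREATE CLUSTERED COLUMNSTORE INDEX ccx_RockwellHistory
ON dbo.RockwellHistory
ON psRockwell (SysEndTime);
```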

SUCCESS!

Never tasted so… Obtuse, I suppose. Maybe like the parts of a lobster you shouldn’t really eat. Anyway, I hope this solves a problem for someone. I had fun working out how to get it working.

I can imagine more than a few of you seeing different ways of doing this through the course of the article, either by manipulating the initial index, creating the history table separately and then assigning it to the base table, or using sp_rename to get the naming convention of choice. Sure, that’s all possible, but a lot less fun.

Thanks for reading!

Brent says: when Microsoft ships a feature, they test its operability. When you use multiple new features together, you're testing their interoperability – the way they work together. Microsoft doesn't always test – or document – the way every feature works together. If you want to play along as a reader, your next mission is to look at query plans that span current and history data, see how the data joins together, and how it performs.
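If you want a starting point for that mission, here's a sketch of a query that touches both tables (base table and column names assumed from the RockwellHistory naming in the errors above):

```sql
/* Sketch: FOR SYSTEM_TIME ALL returns rows from both the current table
   and the history table - check the plan to see how SQL Server combines
   the rowstore base table with the ColumnStore history table. */
SELECT Id, SomeValue, SysStartTime, SysEndTime
FROM dbo.Rockwell
FOR SYSTEM_TIME ALL
WHERE Id = 1;
```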


SQL Interview Question: “Tell me what you see in this screenshot.”

You’re a data professional working with (or at least applying to work with) a company using the StackOverflow database (I’m using the March 2016 version today). Your users are complaining that this stored procedure is slow:

usp_GetPostsByOwnerUserId

They didn’t give you parameter 26837 – I’m just giving you that so you can see an execution plan.
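The proc's body isn't reproduced here, but based on its name, it likely looks something like this sketch against the StackOverflow database (the exact column list and ORDER BY are assumptions):

```sql
/* Hypothetical sketch of the proc under discussion - not the actual code. */
CREATE PROCEDURE dbo.usp_GetPostsByOwnerUserId
    @OwnerUserId INT
AS
BEGIN
    SELECT *
    FROM dbo.Posts
    WHERE OwnerUserId = @OwnerUserId
    ORDER BY CreationDate DESC;
END;
GO

/* The example parameter from the post: */
EXEC dbo.usp_GetPostsByOwnerUserId @OwnerUserId = 26837;
```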

You don’t have to talk me through the query itself, or what you’d want to do to fix it. In fact, I want you to avoid that altogether.

Instead, tell me what things you need to know before you start tuning, and explain how you’re going to get them.

I’ll follow up in a few days with my thoughts.


Update 2016/05/28 – 71 comments, holy smokes! One thing is clear: you folks like interviewing for jobs. Because there were so many comments, here’s what I’ll do this week: I’m going to start by talking about what I had in mind when I wrote the interview question, without looking at your answers, then I’m going to read yours because I bet you had even better ideas than I did.

For me, the most important part is, “Can you follow directions?” It’s so tempting to jump in and start fixing the query, but I asked two very specific questions, and I was looking for the answers to those.

Before I start tuning this query, I want to know:

  • What parameters make this query run slow?
  • What does “slow” mean – in concrete time terms?
  • Is it always slow for those same parameters, or does the time vary widely?
  • How fast does it need to run when I’m done? Or, how long should I spend tuning it? (Typically, I ask the user to tell me if I should spend 15 minutes, 1 hour, or 1 day tuning the query.)
  • How often does it run?
  • Could we cache the data in the application layer?

To get these answers, I’m going to:

  • Talk to the users for speed guidance
  • Query the execution plan cache using sp_BlitzCache® to see if this query shows up in our top 10 most resource-intensive queries, and if so, does it have warnings for Frequent Executions, Parameter Sniffing, and/or Long-Running Queries
  • Look at the execution plan to see what parameters it was compiled for
  • Talk to the developers to see if caching in the app tier is an option
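The plan cache check from that list can be sketched like this, using @SortOrder values from the version of sp_BlitzCache current at the time of writing:

```sql
/* Sketch: look for the proc among the top resource consumers from a
   couple of angles, and read the Warnings column for Frequent Executions,
   Parameter Sniffing, or Long Running Queries. */
EXEC dbo.sp_BlitzCache @SortOrder = 'executions', @Top = 10;
EXEC dbo.sp_BlitzCache @SortOrder = 'cpu',        @Top = 10;
```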

Now, let’s see what you submitted, and look for neat ideas.

  • Mazhar wanted to know the table’s size – great idea! You’d want to tune indexes on a 1GB table differently than you’d tune a 1TB one.
  • Raul @SQLDoubleG was one of the first to point out that this code and execution plan are a perfect example of a parameter sniffing problem, good job.
  • Mike Taras asked who’s running this proc, users or apps? Good question – you might tune end-user facing code differently than service-based code. He also asked if we really need to return all of these fields.
  • Russ suggested zooming out and checking the server’s health overall. A+! That’s the first step in my BE CREEPY tuning process, blitzing the box with sp_Blitz®.
  • James Anderson turned it around on the users and said, how do you know this is the proc? Are you running traces to find out that it’s slow? I like James. I bet he has ways of making his users talk.
  • Thomas Pullen asked if it was lunchtime yet. I’ll meet him at the pub for a pint and we can laugh about the slow queries in our shops.
  • Mike F wondered what the users’ SET operations are, because that’s one of the things that can make troubleshooting parameter sniffing harder.
  • Jason Strate emphasized getting real numbers because without that, it’s like “knocking on doors at random in Chicago hoping to find Brent’s swag penthouse.” For the record, I have a doorman, but you’re on the guest list, Jason.
  • Stephen Falken wants to know what’s changed on that server recently, ask who has elevated permissions on the box, and what they changed.
  • Chintak Chhapia asked how frequently data is added to and updated in this table, and which columns are updated. Very good question, because it determines your indexing strategy.
  • And If_Luke_Skywalker_Was_A_Troll gets very high marks for asking excellent followup and challenge questions throughout, getting folks to think deeper through their suggested answers.