In order to improve your applications and your databases, they will need to change over time. The structure of the database changes, the structure of the tables change, the data in the tables change, the application changes, the queries against the data change. Indexes that once aided performance now sit unused while new indexes are added.
When index needs change and you want to test how removing an index will affect performance, you have two options – you can disable or drop the index. Which is better?
Disabling an Index
To disable in index, issue an ALTER INDEX command.
ALTER INDEX IX_IndexName ON Schema.TableName DISABLE
What happens when you do this? The metadata about the index is retained, as are the usage statistics in sys.dm_db_index_usage_stats. The index is not usable by the query optimizer, however. If you have an index hint in a query specifying the index, the query will fail. Also, foreign key constraints are disabled.
Disabling a nonclustered index will delete the index pages – the space is freed in the database.
Disabling a clustered index has additional effects. The data in the table still exists, but will be inaccessible for anything other than a drop or rebuild operation. All related nonclustered indexes and views are also unavailable. Foreign key constraints that reference the table are disabled. Queries against the table will fail.
If you want to re-enable the index, you need to rebuild it – there is no ALTER INDEX…ENABLE command.
ALTER INDEX IX_IndexName ON Schema.TableName REBUILD
When you rebuild an index, the usage stats will be reset in sys.dm_db_index_usage_stats.
Dropping an Index
To drop a clustered or nonclustered index, issue a DROP INDEX command.
DROP INDEX IndexName ON Schema.TableName
When you do this, the metadata, statistics, and index pages are removed. If you drop a clustered index, the table will become a heap.
Once an index has been dropped, it can’t be rebuilt – it must be created again. Thus, if you are going to drop an existing index, make sure you script the CREATE statement also. If it is determined that you need it again in the future, it will be much easier to open the script than try to remember how you had it set up previously!
Do you want to maintain statistics?
The biggest difference between disabling and dropping an index is whether the metadata and statistics are persisted. If disabled, they are. If dropped, they are not. Make sure you carefully weigh your options before performing either action, and always have a way to recreate the index available.
Log shipping is a tried and true method in SQL Server for keeping a copy of your data on a secondary server. If you haven’t worked with this feature yet, join Jes to learn the basics, such as what log shipping is used for, what the moving pieces are, and how to set it up in this 30-minute video:
Want to learn more about monitoring and failing over? Watch Part 2: When Disaster Strikes!
Getting data out of SQL Server is easy—you don’t need special tools. With just a few lines of code, you can easily send the results of a query to a file. When you’re building a prototype, this is also the easiest way to work; you create quick and dirty tools that get the job done. ETL (extract, transform, load) processes are no exception. During a recent project working with SQL Server and Hive, I assumed that most of the pain would involve Hive. I was sorely mistaken: SQL Server and dirty data were the source of our problems.
Extracting the Errors
Getting data out of SQL Server seemed painless. It only took a few SQL statements and several lines of C# before I had a working data extractor. Data came out of SQL Server and was written to disk in a delimited file format—to make life easy, I decided to use a tab character as the delimiter. The hardest part of the extraction process was checking the data for funny characters.
Hive doesn’t support quoted strings; this made it especially important to make sure that the source data was as clean as possible. Hive’s schema on read makes it possible to support files with a different number of fields in each record, but the downside of this flexibility is that stray delimiter characters will be picked up as a new column.
The initial portion of the data looked correct, so I proceeded with data loads. As we began loading more and more data, problems started appearing in the queries. Data wasn’t where it should be and some of the aggregations were failing—Hive was reporting errors converting strings to numbers. I was perplexed and started looking at the data in the source system. Everything looked correct. Rather than examine the source data, I looked into the extracted data and rapidly got lost in a sea of massive files because Hive’s MapReduce jobs create a detailed log of all activity.
While digging through the chain of MapReduce logs, I found references to the problem data. Ultimately, I found several lines of code where an enterprising user had put a tab character in a string. Not to be outdone, a few other users had managed to sneak newline characters into text as well. After creating additional data cleansing rules, I reloaded all of the source data. Somewhat surprisingly, everything began working at this point. The users and their desire to put random characters in text had been defeated by a simple
A Transformative Experience
Applying data transformations in Hive was anticlimactic. Really. The process of transforming source data into a queryable format was painless—barring date and time bugs, there’s nothing to report. Hive was able to fly through the source data, pivot several hundred gigabytes of tables, and produce complex aggregations in a matter of minutes.
The only difficulty came from figuring out the right techniques for aggregating data. That, however, is a different story.
There’s always a letdown. After a reasonable run of success building a prototype, something had to break and break it did.
After getting data out of SQL Server, pushing data into Hive, running queries, and pulling results out of Hive, you would assume that the hard part was over. After all, the only thing left to do was load the results into SQL Server. And that’s where the problems started.
There are a lot of techniques to load data into SQL Server and they pose different problems. Although the most efficient techniques involve using
BULK INSERT, they require that SQL Server has access to the files. Although that works, troubleshooting a combination of Active Directory and SQL Server security is nobody’s idea of a good time; sorting out the required permissions proved time consuming and difficult. Once security issues were resolved, a new error cropped up:
BULK INSERT may encounter problems loading data into tables created with LOB data types. It was easy to get around this problem by using the correct data types with appropriate precision. This wasn’t difficult to fix, but it did slow down our end to end testing.
In the end…
After a lot of troubleshooting security issues and data formatting issues, we were able to resolve our problems and get an end to end test running. It’s rewarding to watch data move through a processing pipeline, especially after struggling with multiple ETL problems.
ETL remains the most difficult part of database projects. Moving data between different instances of SQL Server can be problematic, depending on the methods used; moving data between SQL Server and different databases can cause problems in ways that you didn’t expect. Building end to end systems is difficult. If you haven’t done it before make sure you pad your schedule; I was able to solve these problems quickly through previous experience and by reaching out to my network, but everyone may not be so lucky. ETL can be problematic even for the most seasoned SQL Server professional.
Curious if you might have a problem suitable for Hadoop but don’t want to run into some of the same problems with your implementation? Check out our Introduction to Hadoop training class or
Whether you’re planning a major migration, a code release, or an update to Windows security patches, there’s always the risk that something may go terribly wrong. How can you prevent disaster? In this 30 minute webcast, Kendra will share the top three elements to planning a successful change for SQL Server.
Aggregate functions are convenient – they solve a business need and they make development easy. Unfortunately, not all business requirements are so easy to solve. Let’s look at one example: rolling averages.
The Rolling Average
A rolling average is a simple concept; an average is computed over a fixed subset of data. Rolling average calculations are most frequently used with time series data and help remove short term fluctuations while highlighting long term trends – utility bills often feature a rolling average of consumption to help the customer understand their usage. Consumers aren’t concerned about electricity usage being high on one day in August when there was a heat wave; they want to see how their consumption is changing over time.
Rolling Averages with Common Table Expressions: 2005 – 2008R2
Starting with SQL Server 2005, it became easy to write a rolling average in a single T-SQL statement using a Common Table Expression. CTEs rapidly became popular; an incredible amount of prevously difficult functionality was made possible through CTEs including recursive queries and rolling averages. Take a look at this example in the
AdventureWorks2012 sample database:
WITH cte AS ( SELECT DENSE_RANK() OVER ( ORDER BY tm.YearName, tm.MonthOfYearNumber ) AS r , tm.YearName AS [year] , tm.MonthOfYearNumber AS [month] , SUM(SubTotal) AS SubTotal FROM dbo.TimeMaster tm LEFT JOIN Sales.SalesOrderHeader AS soh ON tm.ActualDate = soh.OrderDate WHERE tm.ActualDate BETWEEN '2005-07-01' AND '2008-08-01' GROUP BY tm.YearName , tm.MonthOfYearNumber ) SELECT cte1.[year] , cte1.[month] , AVG(cte1.SubTotal) AS AverageSubTotal FROM cte AS cte1 JOIN cte AS cte2 ON cte1.r > ( cte2.r - 12 ) AND cte1.r <> cte2.r GROUP BY cte1.[year] , cte1.[month] ORDER BY cte1.[year] , cte1.[month] ;
While not the most straightforward approach to constructing a rolling average, the CTE manages to get the job done. In this query, we are using the CTE to create a work table and then performing a self-join. This same sort of thing is possible using a temporary table or table variable, but the CTE accomplishes it in one statement and is, arguably, easier to read.
Common Table Expressions also hide a dark secret – SQL Server executes the CTE body every time the CTE expression,
cte in this example, is referenced. The more complex the Common Table Expression is, the more work that has to be performed. Running this rolling average with
STATISTICS IO turned on, it’s easy to see the multiple executions in the form of two scans on each table:
Table 'TimeMaster'. Scan count 2, logical reads 14, physical reads 1, read-ahead reads 5, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'SalesOrderHeader'. Scan count 2, logical reads 208, physical reads 1, read-ahead reads 102, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Worktable'. Scan count 1, logical reads 147, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
On a database this small, this doesn’t pose major performance problems, but this will cause big problems for a moderately sized database.
Rolling Averages with Window Functions: 2012 and beyond
SQL Server 2012 provided better support for windowing functions. Although support for
OVER() was already available in SQL Server 2005, SQL Server 2012 brings considerably more functionality to the table. By using the
ROW or RANGE clause of the windowing function, it’s possible to simplify the query and improve performance. Take a look:
SELECT YearName , MonthOfYearNumber , AVG(st) OVER ( PARTITION BY YearName, MonthOfYearNumber ORDER BY YearName, MonthOfYearNumber ROWS 12 PRECEDING ) FROM ( SELECT tm.YearName , tm.MonthOfYearNumber , SUM(COALESCE(SubTotal, 0)) AS st FROM dbo.TimeMaster tm LEFT JOIN Sales.SalesOrderHeader AS soh ON tm.ActualDate = soh.OrderDate WHERE tm.ActualDate BETWEEN '2005-07-01' AND '2008-08-01' GROUP BY tm.YearName , tm.MonthOfYearNumber ) AS x ;
Although the two queries are remarkably different, the biggest difference is the introduction of
ROWS 12 PRECEDING. This takes the place of the self join in the previous example. Instead of writing out a join ourselves, we simply tell SQL Server that we’d like an average of
st over the last 12 rows sorted by year and month. What kind of effect does this have on the work SQL Server performs?
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'TimeMaster'. Scan count 1, logical reads 7, physical reads 1, read-ahead reads 5, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'SalesOrderHeader'. Scan count 1, logical reads 104, physical reads 1, read-ahead reads 102, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
111 reads instead of 369 reads. Clearly this change makes for a substantial performance improvement for SQL Server. We’ve reduced the number of reads, eliminated some query complexity, and made it somewhat obvious to the future developers how they could modify or build on this going forward. Changing the new query to a rolling average by day instead of by month is simple and requires even fewer lines of code:
SELECT ActualDate , st AS SubTotal , AVG(st) OVER ( ORDER BY ActualDate ROWS 365 PRECEDING ) AS RollingAverageSales FROM ( SELECT ActualDate , SUM(COALESCE(soh.SubTotal, 0)) AS st FROM dbo.TimeMaster tm LEFT JOIN Sales.SalesOrderHeader soh ON tm.ActualDate = soh.OrderDate WHERE tm.ActualDate BETWEEN '2005-07-01' AND '2008-08-01' GROUP BY ActualDate ) AS x ORDER BY x.ActualDate;
This performs the name number of logical and physical reads as the monthly rolling average using a window function.
There you have it – two different ways to perform a rolling average in SQL Server. One method is clearly a lot easier than the other. There are a number of optimizations in SQL Server 2012 to make it easy for you to build this functionality and to improve SQL Server performance at the same time.
Let’s step back and take a look at the big picture. (Today, I’m writing for beginners, so you advanced gurus can go ahead and close the browser now. I’m going to simplify things and leave a lot out in order to get some main points across. Don’t well-actually me.)
Microsoft SQL Server databases are stored on disk in two files: a data file and a log file.
What’s Stored in the Data File (MDF)
Let’s start with a simple table. If you want to follow along with my code, this will work on SQL Server 2005 & newer, but please do it in a brand new database rather than reusing one of your existing ones. We’ll be looking at the log file later, and you won’t be able to quickly find the relevant entries in a sea of unrelated ones. Off we go:
CREATE TABLE dbo.Friends (id INT IDENTITY(1,1), FriendName VARCHAR(30)); GO INSERT dbo.Friends (FriendName) VALUES ('Brent Ozar'); INSERT dbo.Friends (FriendName) VALUES ('Jeremiah Peschka'); INSERT dbo.Friends (FriendName) VALUES ('Jes Schultz Borland'); INSERT dbo.Friends (FriendName) VALUES ('Kendra Little'); GO
We now have a table. I was going to say you’ve got four friends, but we’re not your friends. Let’s take this slow, alright? We just met. You can start by buying us a drink first. Let’s see how the table is stored in SQL Server behind the scenes – look under the table, as it were:
DBCC IND('MyDatabaseName', 'Friends', -1);
This command is totally safe to run – it just lists out where SQL Server is storing your data. Replace ‘MyDatabaseName’ with your database’s name. The result is a list of pages where SQL Server stored the Friends table:
Data Files Are Broken Up Into 8KB Pages
These pages are the smallest unit of storage both in memory and on disk. When we write the very first row into a table, SQL Server allocates an 8KB page to store that row – and maybe a few more rows, depending on the size of our data. In our Friends example, each of our rows is small, so we can cram a bunch of ‘em onto a page. If we had bigger rows, they might take up multiple pages even just to store one row. For example, if you added a VARCHAR(MAX) field and stuffed it with data, it would span multiple pages.
Each page is dedicated to just one table. If we add several different small tables, they’ll each be stored on their own pages, even if they’re really small tables.
If we shut down the SQL Server, started it back up again, and then issued the following query:
SELECT * FROM dbo.Friends WHERE FriendName = 'Brent Ozar'
SQL Server would check to see what page the dbo.Friends table is on, then read our entire 8KB page from disk, and cache that 8KB page in memory. I say “entire” as if it’s a big deal, but I want to make a point here: pages are stored identically both in memory and on disk, and they’re the smallest unit of caching. If you use SQL Server’s data compression, the data isn’t uncompressed from the page until it needs to be read again to satisfy another query – you get the benefit of compression in memory as well as on disk.
What happens if we change a data page? For example, if we issue the following command, what happens:
INSERT dbo.Friends (FriendName) VALUES ('Lady Gaga');
That’s where the log file comes in.
What’s Stored in the Log File (LDF)
The log file is a sequential record of what we did to the data. SQL Server writes down, start to finish, what we’re trying to do to those helpless, innocent data pages.
Your first reaction is probably, “Wow, I never want to look in there because my users do horrible, unspeakable things to my database server.” Good news – SQL Server doesn’t need to log the SELECT statements because we’re not affecting the data, and that’s usually where the worst nastiness happens. Bad news – even if you did want to look in the log file, SQL Server doesn’t give you an easy way to do it. The log file exists for SQL Server, not for you.
When we insert, update, or delete rows in our table, SQL Server first writes that activity into the log file (LDF). The log file must get hardened to disk before SQL Server says the transaction is committed.
But not the change to the data page – that part doesn’t have to hit the disk right away. See, SQL Server knows you’re the kind of person who makes lots of changes to the same data, over and over. You’re a busy person with things to do and data to trash. SQL Server can keep the same data page in memory for a while, and then flush it out to disk later – as long as the log file was written.
When Windows crashes hard or somebody pulls the power cables out from under your SQL Server, SQL Server will use the database’s log file on startup. SQL uses the log file to reconcile the state of the data file, deciding which transactions should be applied to the data file and which ones should be rolled back.
How the Data File and Log File are Accessed
This starts to point to a significant storage difference between these two files.
Log files are written to sequentially, start to finish. SQL Server doesn’t jump around – it just makes a little to-do list and keeps right on going. Eventually when it reaches the end of the log file, it’ll either circle back around to the beginning and start again, or it’ll add additional space at the end and keep on writing. Either way, though, we’re talking about sequential writes. It’s not that we never read the log file – we do, like when we perform transaction log backups. However, these are the exception rather than the norm.
Data files, on the other hand, are a jumbled mess of stuff. You’ve got tables and pages all over the place, and your users are making unpredictable changes all over the place. SQL Server’s access for data files tends to be random, and it’s a combination of both reads and writes. The more memory your server has, the less data file reads happen – SQL Server will cache the data pages in memory and just work off that cache rather than reading over and over. This is why we often suggest stuffing your SQL Server with as much memory as you can afford; it’s cheaper than buying good storage.
More Resources for SQL Server Data Storage
Want to learn more? We’ve got video training explaining it! In our 90 minute video series How to Think Like the SQL Server Engine, you’ll learn:
- The differences between clustered and nonclustered indexes
- How (and when) to make a covering index
- The basics of execution plans
- What determines sargability
- How SQL Server estimates query memory requirements
- What parameter sniffing means, and why it’s not always helpful
For $29, you get 18 months of access to the videos for one person. You can watch them at work, at home, even on your iPad. Learn more about it now.
Hi. I’m a former developer who’s moved into database administration, and here’s what I wish somebody would have told me when I got started.
7. SQL functions rarely perform well.
Good developers like to reuse code by putting it into functions, and then calling those functions from multiple places. That’s a great practice in the app tier, but it has huge performance drawbacks in the database tier.
Check out Paul White’s excellent post on Forcing a Parallel Query Plan – in particular, the list of things that produce a serial zone in the plan. Most functions will cause your query to go single-threaded. Sad trombone.
If you do want to reuse code, consider stored procedures and views instead. (Granted, they can come with their own performance drawbacks, but I’m just trying to get you started on the right foot as quickly as possible here, and functions are a broken foot.)
6. “WITH (NOLOCK)” doesn’t actually mean no locking.
At some point in your career, you’re going to start using WITH (NOLOCK) on everything because it gets your query results faster. That’s not necessarily a bad idea, but it can come with some surprising side effects that Kendra discusses in her “There’s Something About Nolock” video. I’m going to focus on one of them here, though.
When you query a table – even WITH (NOLOCK) – you take out a schema stability lock. No one else can change that table or indexes until your query is finished. That doesn’t sound like a big deal until you need to drop an index, but you can’t because people are constantly querying a table, and they think there’s no overhead as long as they use WITH (NOLOCK).
There’s no silver bullet here, but start by reading about SQL Server’s isolation levels – I bet READ COMMITTED SNAPSHOT ISOLATION is an even better choice for your app. It gets you consistent data with less blocking hassles.
5. Use 3 connection strings in your app.
I know, you’ve only got one SQL Server today, but trust me, this is worth it. Set up three connection strings that all point to the same destination today, but down the road, when you need to scale, you’ll be able to set up different database servers to handle each of these:
- Connection for Writes & Realtime Reads – this is the connection string you’re already using today, and you think that all data needs to come from here. You can leave all of your code in place, but as you write new code or touch existing pages, think about changing each query to one of the below connections.
- Connection for Data 5-15 Minutes Old – this is for data that can be slightly stale, but still needs to be from today.
- Connection for Data as of Yesterday – for management reports and trending data. If you run an online store, you might pull reviews from here, for example, and tell users that their reviews take a day to publish.
That first connection string is the toughest one to scale; we don’t have a lot of options in SQL Server to scale out multiple servers that handle writes. (We do have options – they’re just painful to implement and manage.) The lower-tier connection strings 2 and 3 are much, much easier and cheaper to scale. For more about this technique, check out my 3 Favorite Connection String Tips.
4. Use a staging/apptempdb database.
Your app probably uses the database for some scratch work – processing, sorting, loading, caching, etc. It wouldn’t break your heart if this data disappeared, but you’d like to keep the table structures around permanently. Today, you’re doing this work in your main application database.
Create a separate database – call it MyAppTemp – and do your work in there instead. Put this database in simple recovery mode, and only back it up once daily. Don’t hassle with high availability or disaster recovery on this database.
This technique accomplishes a lot of really cool scalability stuff. It minimizes the changes to the main app database, which means you get faster transaction log backups and differential backups for it. If you’re log shipping this database to a disaster recovery site, your important data will arrive faster – and not be impeded by all the scratch work. You can even use different storage for these different databases – perhaps cheap local SSD for MyAppTemp, keeping your shared storage connection free for the critical production stuff.
3. Yesterday’s articles and books are often wrong today.
SQL Server has been out for over a decade, and a lot has changed over the years. Unfortunately, the old material isn’t updated to cover what’s happening today. Even today’s material from reputable sources is often wrong – take this critique of Microsoft’s Performance Tuning SQL Server guide. Fellow Microsoft Certified Master Jonathan Kehayias points out a bunch of really bad advice that comes straight from a Microsoft document.
When you read something that sounds like good advice, I like to try the Anti-Doctor-Phil strategy. Dr. Phil preaches that you should love every idea for fifteen minutes. Instead, try hating it – try to disprove what you read before you put it into production. Even when advice is generally good, it might not be good advice for your own environment. (Yes, that includes my advice too.)
2. Avoid ORDER BY; sort in the app instead.
To sort your query results, SQL Server burns CPU time. SQL Server Enterprise Edition goes for about $7,000 per CPU core – not per processor, but per core. A two-socket, 6-core-each server rings up at around $84k – and that’s just the licensing costs, not the hardware costs. You can buy a heck of a lot of application servers (even ones with 256GB or more of memory) for $84k.
Consume all of the query results as fast as possible into memory in your app, and then sort. Your application is already designed in a way that you can scale out multiple app servers to distribute CPU load, whereas your database server…is not.
UPDATE: I’ve gotten a lot of comments wailing about how the app only needs ten rows of a ten million row dataset. Sure, if you’re doing TOP 10, you’ll need an order by – but how about reworking the query to avoid juggling so much data? And if the data sounds like too much for the app server to sort, it’s probably causing work on the SQL Server too. We talk about how to find those queries in the webcast listed at the bottom of this post. Also, keep in mind that I said “Avoid ORDER BY”, not “Never use ORDER BY”. I use ORDER BY myself too – but if I can avoid that work in the very expensive data tier, I’ll avoid it. That’s what avoid means.
(This part here is where the MySQL and PostgreSQL guys start screaming about how you can avoid licensing costs altogether with open source databases.) (This part here is where you would expect me to have a witty retort, but I don’t. If you’re building a brand new app and you’re choosing a database, read my StackOverflow answer on which database handles the most load.)
1. SQL Server has built-in zero-impact instrumentation tools.
SQL Server’s dynamic management views (DMVs) can tell you all kinds of killer stuff like:
- Which SQL statements are causing the most load on your server
- Which indexes are wasting space and slowing down inserts/updates/deletes
- How fast storage is responding to requests on a database-by-database level (and even more fine-grained than that)
- Where your server’s bottleneck is, like CPU, disk, network, locking, etc
All you have to know is where to look, and I’ll show you.
Want to learn more? Check out our classes.
We have a 3-day SQL Server Performance Troubleshooting class that teaches you internals, TempDB, indexing, T-SQL anti-patterns, and more. Learn about the class.
Right now is an incredible time to do what you love.
Take this one-minute video from Red Bull:
It’s chock full of people doing cool, crazy stuff that they’re absolutely passionate about – skating, dancing, shooting hoops, singing – all the stuff the cool kids do.
Red Bull’s logo is scattered in throughout the video because they sponsor these events. The people you see skating, dancing, whatever – they’re getting paid by Red Bull to do what they love, preferably while in close proximity to a Red Bull logo. Notice that not once in the video do you see one of the skaters/dancers/singers drinking from a Red Bull can. In fact, water bottles are more present in the video than Red Bull cans.
But those are just the cool kids, right? Not us geeks. We never get paid to do what we love.
Or do we?
That’s the DBA in Space trailer from Red Gate, and it stars Brad McGehee, an actual DBA. Well, technically his job title was evangelist when it was filmed, but he’s back to being a full time DBA again. Red Gate spent a ton of money on a slick ad campaign that brought fun and games to DBA work.
Evangelist sounds like such a crazy title, like an oddball one-off that only the luckiest people around get. But check out this timeline:
- 2005: Quest hires Kevin Kline
- 2006: Red Gate hires Steve Jones
- 2007: Red Gate hires Brad McGehee
- 2008: Quest hires Brent Ozar
- 2010: Confio hires Tom LaRock
- 2010: SQL Sentry hires Aaron Bertrand
- 2011: Red Gate hires Grant Fritchey
- 2011: Idera starts their ACE Program
- 2012: Idera contracts Robert Davis
- 2012: SQL Sentry hires Kevin Kline
That’s a hell of a lot of evangelist-style jobs for a niche market like SQL Server DBAs.
A few years ago, when one of these announcements came out, I’d say to myself, “Well, that’s it. Everybody’s hired their DBA spokesmodel now. We won’t see another one of those hires again for a while. There’s no big vendors left, and no software vendor needs more than one DBA spokesmodel on staff.”
In 2013, this will happen even faster.
There’s a lot of vendors out there – just check out the list of sponsors for the last PASS Summit. Ask yourself, “Do I know anyone who works for each of these vendors?” For every vendor you don’t know personally, there’s somebody at that vendor who desperately wants to fix that problem. Every vendor on that list wants you to know who they are, and old-school advertising isn’t working anymore. They can’t just buy ads in publications you don’t read.
To get your attention, vendors have to produce valuable content that brings you to their site. At Quest we did this with SQLServerPedia, Red Gate does it with Simple Talk, and SQL Sentry’s doing it with SQLPerformance.com - and that’s really only the tip of the iceberg. Vendors do webcasts, whitepapers, books, free software, and free mini-conferences like Red Gate’s SQL in the City.
And of course we do this too – you’re reading my free content right now, and our First Aid page has webcasts, posters, video archive, our sp_Blitz and sp_BlitzIndex scripts, and our email newsletter. We spend days per week putting together good stuff to give away for free. (I’m writing this on a Sunday.)
Marketing teams are even starting to talk about content inflation: a skyrocketing amount of really good content available for free.
Doing this much good content requires people who know their stuff.
Today, Lori Edwards is joining SQL Sentry.
Yep, a vendor acquired another database pro – this time, @LoriEdwards is going to work with Kevin Kline and Aaron Bertrand over at SQL Sentry. She’ll be telecommuting full time, staying in her current Arizona residence with her husband Tim and their kids. She’ll get to work with the SQL Server she knows and loves.
Sounds cool, doesn’t it?
Working from home and doing what you love? It’s like the geek equivalent of a Red Bull commercial. (Sometimes, I even dance around my office while pretending there’s a camera crew.)
I’m not blogging about this because I’m fond of SQL Sentry, or Kevin, or Aaron, or Lori, or any vendor in particular. Every vendor has their pros and cons, and that’s true for both hardware and software. I’m blogging about this because vendors will keep hiring people like you and me because we love what we’re doing, and we’re good at it. (Well, you are, anyway – I’m a hot mess.)
It’s not just for presenters and bloggers anymore.
Lori’s hire marks an interesting point in the timeline because you might not know her. (Cover your eyes for a second, Lori.) She hasn’t blogged much in months, and she hasn’t been doing the conference circuit. She’s just been plugging along, getting good at what she loves to do, and building a network of people that respect what she does. I contracted out the last round of sp_Blitz™ updates & testing to her, for example. She’s not an evangelist per se – but she gets to do cool stuff with SQL Server from home.
I bet you think you can’t get a dream job with a vendor because you haven’t been speaking or blogging enough. You’re wrong: it’s not just about speaking or blogging anymore. We’ve reached a point where SQL Server vendors have seen enough benefits of hiring SQL Server DBAs that they’re willing to hire ‘em even if the DBA hasn’t been grabbing the spotlight.
Stop learning technology. If you’re scared of getting up in front of the podium, you can even stop trying that (although I’d recommend you keep going, because presenting really does change your life.) Just start working on your soft skills by building your network, and start by reading my post Rock Stars, Normal People, and You.
Want more incentive? Quest doesn’t have an evangelist right now, I know at least one other vendor that’s ready to start looking, and you never know when we’ll hire again.
DBAs have been asking me, “Brent, why do you have a book of Lady Gaga photos on your standing desk?”
Also, “Brent, why are you guys offering training for developers instead of training for DBAs? Have you lost your minds? Why have you crossed over to the other side? Have you forgotten that we know where you live and where you walk your dog?”
Reason #1: A lot of shops don’t have DBAs.
A good database administrator is hard to find, and when you find ‘em, they’re expensive. We’ve got clients who have been trying to hire a DBA for over a year – in some cases, two years! They’ve given up on all the crappy job candidates who can’t tell the difference between their elbow and their LDF file. It’s cheaper and quicker to take one of their savviest developers and train them to become a SQL Server DBA.
Except that it’s not, because it’s really hard to find training to do that.
There’s local training companies who just take Microsoft’s certification curriculum, but what if your goal isn’t to get certified? What if your goal is just to get good, practical advice for real-world database administration as quickly as possible? What if you don’t want to learn about a dozen nearly useless features just to check a box? What if you want to spend more time on the most vital features, like, say, indexes?
That’s where our training comes in – we focus on giving developers the stuff they really want to know about SQL Server to make it faster and more reliable.
DBAs, that’s really the only reason. You can close the browser now. Or if you’re bored, go check out the Lady Gaga photo book – it’s on sale. My favorite photo is the one of her and Ellen DeGeneres in an elevator at the MTV VMAs. I’d give you the page number but like any indecipherable Terry Richardson work, there’s no friggin’ page numbers.
Are they gone?
Okay, good. Developers, here’s the rest of the reasons.
Reason #2: You can’t get time from your DBA.
If your shop has gotten big enough to hire a full time DBA, they’re probably overwhelmed with work just keeping things running, dealing with backups, managing security, and making sure there’s enough free drive space.
You’ve got questions about making your queries faster, and you’ve got the feeling that the answers are nearby, but just out of reach. You’ve tried the Index Tuning Wizard & Database Tuning Advisor, but they’re out of options.
Reason #3: Maybe you wanna be a DBA.
I used to be a developer myself. After going through four languages in three years (Topspeed Clarion, VBscript ASP, VB.NET, Java), I decided I wasn’t dedicated enough to stick with it. I switched over to the database side because there’s only one language to learn – SQL is basically the same across not just versions of SQL Server, but across different platforms like PostgreSQL, MySQL, and Oracle. Even the new kids on the block like Apache Hive use similar syntax.
Learning the basics of performance tuning on any database will help you everywhere. Index tuning, isolation levels, and sargability are universal concepts.
Reason #4: A lot of shops don’t need DBAs.
I’m not saying you don’t need a DBA’s advice now and then, but you don’t have enough work to keep a DBA busy full time. I know this because a lot of our clients – and I mean a lot of them – don’t have DBAs, and after our initial SQL Critical Care™ session, they only need to get our advice about once a quarter.
Your questions aren’t complex, but you’re tired of Googling and getting answers you don’t really trust. You want clear, concise training that teaches you the important parts of indexing, query antipatterns, isolation levels, and performance troubleshooting. You don’t want boring best practices – you want to cut to the chase.
Sound good? Check it out.
On our SQL Server for Developers training page, we list the class abstracts. At the bottom, there’s a business justification PDF to help you convince management that the training is worth the money. And speaking of money, it’s cheaper than a conference or a week-long local training class taught by somebody who’s never actually been a DBA, much less a Microsoft MVP and Microsoft Certified Master. Come join us and have a good time while we poke fun at DBAs.*
* – Did I say poke fun at DBAs? I meant sing the praises of DBAs. Listen, I told you to go look at Lady Gaga pictures, and why are you still reading this, DBA?
Hoowee, we’ve got a lot of good stuff lined up! You know the routine: click the links, check boxes, and get smart.
Kendra Little, Technology Triage Tuesday
Whether you’re planning a major migration, a code release, or an update to Windows security patches, there’s always the risk that something may go terribly wrong. How can you prevent disaster? In this 30 minute webcast, Kendra will share the top three elements to planning a successful change for SQL Server. Register now.
Brent Ozar, Technology Triage Tuesday
When you create tables and insert data, what’s SQL Server doing behind the scenes? How do your rows become part of the files on disk? This session isn’t about solving problems – it’s about just going fishing inside SQL Server to see what’s going on behind the scenes. We’ll use undocumented commands like DBCC PAGE to track down your data. Register now.
Jes Schultz Borland, Technology Triage Tuesday
Log shipping is a tried and true method in SQL Server for keeping a copy of your data on a secondary server. If you haven’t worked with this feature yet, join Jes to learn the basics, such as what log shipping is used for, what the moving pieces are, and how to set it up. Register now.
Before you join us in Atlanta for our 2-day training class (or if you’re just thinking about it), let’s talk about your environment. In this 30-minute session, we’ll give you some vital statistics to check in your SQL Server before you leave work, and we’ll explain how our upcoming training class will help you improve those metrics.
Jes Schultz Borland, Technology Triage Tuesday
Log shipping is a tried and true method in SQL Server for keeping a copy of your data on a secondary server. You have your primary and secondary server set up, and it’s working great. Are you monitoring it? Do you know what to do if you need to fail over to the secondary? Join Jes to find out! Register now.
Jeremiah Peschka, Technology Triage Tuesday
Session state frequently ends up on a busy SQL Server. What seemed like a good idea in development turns into a problem in production. While there are valid business reasons for persisting session state to permanent storage; there are equally valid reasons to avoid using SQL Server as the permanent storage. We’ll investigate why session state poses problems for SQL Server and cover an alternate solution that allows for persistent session state. This talk is for developers and DBAs who want a better way to safely track ASP.NET session state. Register here.
Brent Ozar, Technology Triage Tuesday
Get frustrated when you read conflicting opinions on the web? Me too – I can’t go to sleep when someone’s wrong on the Internet, but it’s tough to correct everybody. In this 30-minute session, I’ll explain the most common bad advice that I see, explain why it’s wrong, and show you how to set up your SQL Server for speed, not slowness. Register now.
Can’t be there on time? Yeah, we know how “busy” you are – you can catch recorded versions of past webcasts in our First Aid video archive.