Getting an Oracle Execution Plan

Execution plans are fantastic – they make it easier to visualize how a query is running and find tuning opportunities. SQL Server DBAs starting with Oracle may be a bit frustrated by the lack of a clear way to get to execution plans. This covers three ways to get to execution plans in Oracle – estimated plans, actual plans from an active session, and actual plans from the plan cache.

Getting Estimated Oracle Execution Plans

One way, likely more familiar to MySQL and PostgreSQL DBAs, is to use the EXPLAIN PLAN FOR command to get an execution plan out of Oracle. To do this, a DBA would simply add the command EXPLAIN PLAN FOR to the beginning of a SQL statement:

EXPLAIN PLAN FOR
SELECT o.order_id, v.product_name
FROM   oe.orders o,
       (  SELECT order_id, product_name
          FROM   oe.order_items o, oe.product_information p
          WHERE  p.product_id = o.product_id
                 AND list_price < 50
                 AND min_price < 40  ) v
WHERE  o.order_id = v.order_id;

Oracle will immediately finish and return Explained. SQL Server DBAs will likely exclaim “WTF, Oracle?” and stomp off in frustration in search of a GUI that will let them connect to Oracle and show them a graphical execution plan.

Oracle gives DBAs the ability to save off plans into tables, and that’s exactly what the EXPLAIN PLAN FOR syntax does – it saves the plan into a table named plan_table. You can view the execution plan using the following command:

SELECT plan_table_output
FROM   table(dbms_xplan.display());

This produces lovely text output:


Plan hash value: 1906736282

---------------------------------------------------------------------------------------------
| Id  | Operation             | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                     |     1 |    40 |     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS         |                     |     1 |    40 |     6   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN|                     |     4 |   128 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL  | PRODUCT_INFORMATION |     1 |    28 |     5   (0)| 00:00:01 |
|   4 |    BUFFER SORT        |                     |   105 |   420 |     1   (0)| 00:00:01 |
|   5 |     INDEX FULL SCAN   | ORDER_PK            |   105 |   420 |     1   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN   | ORDER_ITEMS_UK      |     1 |     8 |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("MIN_PRICE"<40 AND "LIST_PRICE"<50)
   6 - access("O"."ORDER_ID"="ORDER_ID" AND "P"."PRODUCT_ID"="O"."PRODUCT_ID")

19 rows selected.

There are three parts here - the plan hash value, the row source tree, and predicate information. Plans can be read similarly to SQL Server execution plans; just start at the deepest level and work your way out. In this case, start at Id 6 and work your way up to 3, then 2, then 1, then 0. An asterisk next to the Id means that there’s more detailed information about that step in the Predicate Information section.

You can get more detailed output from Oracle by supplying parameters to the DBMS_XPLAN.DISPLAY() function:

select plan_table_output
from table(dbms_xplan.display('plan_table', null, 'all'));

More documentation on DBMS_XPLAN can be found in the Oracle documentation.

Getting Actual Execution Plans From Oracle

To capture an actual execution plan, run the query first:

SELECT o.order_id, v.product_name
FROM   oe.orders o,
       (  SELECT order_id, product_name
          FROM   oe.order_items o, oe.product_information p
          WHERE  p.product_id = o.product_id
          AND    list_price < 50
          AND    min_price < 40  ) v
WHERE  o.order_id = v.order_id;

And then follow up by executing:

select * from table(dbms_xplan.display_cursor());

You should see the following execution plan:


SQL_ID  gpbdw897v4mbg, child number 0
-------------------------------------
SELECT o.order_id, v.product_name FROM   oe.orders o,        (  SELECT
order_id, product_name           FROM   oe.order_items o,
oe.product_information p           WHERE  p.product_id = o.product_id
        AND    list_price < 50           AND    min_price < 40  ) v
WHERE  o.order_id = v.order_id

Plan hash value: 1906736282

---------------------------------------------------------------------------------------------
| Id  | Operation             | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                     |       |       |     6 (100)|          |
|   1 |  NESTED LOOPS         |                     |     1 |    40 |     6   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN|                     |     4 |   128 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL  | PRODUCT_INFORMATION |     1 |    28 |     5   (0)| 00:00:01 |
|   4 |    BUFFER SORT        |                     |   105 |   420 |     1   (0)| 00:00:01 |
|   5 |     INDEX FULL SCAN   | ORDER_PK            |   105 |   420 |     1   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN   | ORDER_ITEMS_UK      |     1 |     8 |     0   (0)|          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
   6 - access("O"."ORDER_ID"="ORDER_ID" AND "P"."PRODUCT_ID"="O"."PRODUCT_ID")


28 rows selected.

Hey, that looks exactly the same! The default parameters for dbms_xplan.display_cursor won’t show us all the juicy details that we want. Adding in FORMAT => '+ALLSTATS' will do exactly what we want. Before you go running that execution plan query again, you have two choices: you can either execute the full select query again, or you can copy down the SQL_ID and use it in the dbms_xplan.display_cursor function like this:

select *
from table(dbms_xplan.display_cursor(sql_id => 'gpbdw897v4mbg',
                                     format => '+ALLSTATS'));

The output gets a bit more detailed and ends up looking like this:


SQL_ID  5p20zwvbgb93j, child number 0
-------------------------------------
    SELECT o.order_id, v.product_name     FROM   oe.orders o,
 (  SELECT order_id, product_name               FROM   oe.order_items
o, oe.product_information p               WHERE  p.product_id =
o.product_id               AND    list_price < 50               AND
min_price < 40  ) v     WHERE  o.order_id = v.order_id

Plan hash value: 1906736282

----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                     |      1 |        |    269 |00:00:00.08 |    1337 |       |       |          |
|   1 |  NESTED LOOPS         |                     |      1 |      1 |    269 |00:00:00.08 |    1337 |       |       |          |
|   2 |   MERGE JOIN CARTESIAN|                     |      1 |      4 |   9135 |00:00:00.03 |      34 |       |       |          |
|*  3 |    TABLE ACCESS FULL  | PRODUCT_INFORMATION |      1 |      1 |     87 |00:00:00.01 |      33 |       |       |          |
|   4 |    BUFFER SORT        |                     |     87 |    105 |   9135 |00:00:00.01 |       1 |  4096 |  4096 |     1/0/0|
|   5 |     INDEX FULL SCAN   | ORDER_PK            |      1 |    105 |    105 |00:00:00.01 |       1 |       |       |          |
|*  6 |   INDEX UNIQUE SCAN   | ORDER_ITEMS_UK      |   9135 |      1 |    269 |00:00:00.02 |    1303 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
   6 - access("O"."ORDER_ID"="ORDER_ID" AND "P"."PRODUCT_ID"="O"."PRODUCT_ID")


28 rows selected.

Now we can see the estimated (E-Rows) and actual (A-Rows) rows returned from each operation as well as timing and memory information. Parallel queries will return even more information (take a look at the example DBMS_XPLAN commands and output to see what that would look like).

Viewing The Oracle Execution Plan Cache

You might hear Oracle people talk about this as the library cache or the cursor cache. In the SQL Server world, we’d call it the execution plan cache. Whatever you want to call it, it’s contained in a system view: V$SQL_PLAN.

We could find our query in V$SQL_PLAN and get the execution plan doing something like this:

select plan_table_output
from   v$sql s,
       table(dbms_xplan.display_cursor(s.sql_id, s.child_number,
'basic')) t
where  s.sql_text like 'SELECT o.order_id, v.product_name%';

This query will pull back basic information about the different statements that have been executed that match our search condition. That output looks different because I used the basic format for plan display instead of the default of typical, but here’s an example:


EXPLAINED SQL STATEMENT:
------------------------
SELECT o.order_id, v.product_name FROM   oe.orders o,        (  SELECT
order_id, product_name       FROM   oe.order_items o,
oe.product_information p       WHERE  p.product_id = o.product_id
    AND    list_price < 50       AND    min_price < 40  ) v
WHERE  o.order_id = v.order_id

Plan hash value: 1906736282

-----------------------------------------------------
| Id  | Operation             | Name                |
-----------------------------------------------------
|   0 | SELECT STATEMENT      |                     |
|   1 |  NESTED LOOPS         |                     |
|   2 |   MERGE JOIN CARTESIAN|                     |
|   3 |    TABLE ACCESS FULL  | PRODUCT_INFORMATION |
|   4 |    BUFFER SORT        |                     |
|   5 |     INDEX FULL SCAN   | ORDER_PK            |
|   6 |   INDEX UNIQUE SCAN   | ORDER_ITEMS_UK      |
-----------------------------------------------------

If you want to get the full plan from any of these statements, you can either include the sql_id in the query to find all plans for a statement, or you can use the plan hash value to retrieve it from the v$sql_plan view and use it in the dbms_xplan.display_cursor function.

In the query to retrieve matching plans, our query uses Oracle's v$sql_plan system view to find matching cursors (everything's a cursor, yay) in memory. Once we have a list of matching cursors in memory, dbms_xplan.display_cursor is used to pull back information about the actual plan from the cursor cache.
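
For example, once you’ve copied a SQL_ID out of those results, a call along these lines (reusing the SQL_ID from earlier; the ALL format is optional) pulls the complete plan for that statement:

select plan_table_output
from   table(dbms_xplan.display_cursor(sql_id => 'gpbdw897v4mbg',
                                       format => 'ALL'));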

Wrapping Up

There you have it - three ways to view execution plans in Oracle:

  1. Using EXPLAIN PLAN FOR to get an estimated execution plan.
  2. Using dbms_xplan.display_cursor() after running a query to get the actual execution plan.
  3. Querying v$sql_plan to find plans in the plan cache.

SQL Server experts and novices alike can get the kind of query details from Oracle that they’re used to seeing from SQL Server.

Backing Up an Oracle Database

One of a DBA’s most important tasks is taking backups of databases. As SQL Server DBAs, we’re used to using the same tools to backup the database that we use for the rest of our job. With Oracle, there are two main ways to back up a database: user managed backups and Oracle Recovery Manager (RMAN).

User Managed Backups of Oracle

User managed backups leave the database out of the backup equation. It’s up to the DBA and system administrator to figure out the best way to accomplish backup and recovery.

In the distant past, user managed backups required that the Oracle database be taken offline in order for the backup to occur. The database has to be in a consistent state; since the OS doesn’t know how Oracle is writing data, it can’t be trusted to copy files correctly.

Obviously, it’s not possible to take most databases out of production just to take a backup, so Oracle has a feature called a “hot backup”. In a hot backup, the database or individual tablespaces can be placed into backup mode. Once in backup mode, a DBA or system administrator can use OS/storage tools to take a backup of the Oracle database. Writes can continue during this time, but Oracle changes its behavior to make it possible to get a consistent snapshot of the data.
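
A minimal sketch of what that looks like, assuming the database runs in ARCHIVELOG mode (a requirement for hot backups):

-- Put every data file into backup mode (ARCHIVELOG mode required)
ALTER DATABASE BEGIN BACKUP;

-- ...copy the data files with OS or storage tooling here...

-- Return the data files to normal operation
ALTER DATABASE END BACKUP;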

As the name implies, user managed backups place the entire burden of managing database backups on the DBA. While this is more primitive than what most SQL Server DBAs are used to, the fundamental principle is the same – it’s up to the DBA to figure out when and how to backup the database.

Good backups are like good plumbing – you don’t miss them until they’re broken.

Backing Up an Oracle Database With Recovery Manager

RMAN is the answer to your anguished cries of “There has to be a better way to back up a database!”

RMAN automates many aspects of user managed backups while providing additional functionality and fail-safe mechanisms. The overall upside is that RMAN ships with Oracle – there’s limited need to write custom software; instead, DBAs configure RMAN to work in ways that make sense for the environment.

Effectively, here’s what happens: the DBA sets up RMAN with space to store backups, the fast recovery area in RMAN terms. The fast recovery area is where RMAN will store data file backups as well as archived redo log files. The idea behind the fast recovery area is that RMAN itself will manage backup retention based on policies that a DBA sets up. By having a fast recovery area present, it’s also possible to rapidly handle online restore – a DBA just has to tell RMAN a point in time for the restore and the software does the rest.

RMAN is a highly configurable tool – it supports many different types of backups (including the hot backups mentioned earlier), full backups similar to SQL Server full backups (data + some log), incremental backups, and even collapsing incremental backups so that a DBA has to restore fewer files. It’s relatively simple to start out with RMAN, but it provides a wealth of functionality that lets a DBA create the backup and recovery strategy the business needs.
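
To give a sense of how simple it can be to start, here’s a sketch of a bare-bones RMAN session (the seven-day retention window is just an example, and it assumes the fast recovery area is already configured):

RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS;
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;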

Not everything is automatic: DBAs need to carefully analyze backup space requirements. RMAN handles backups, but if the fast recovery area fills up, the database will refuse writes until space is freed up.

By leveraging RMAN, DBAs can focus less on the technical details of backups and instead get to managing the business requirements of RPO and RTO while leaving tedious tasks to software.

How Should a SQL Server DBA Approach This?

From a SQL Server DBA’s perspective, RMAN is closer to a combination of Ola Hallengren’s maintenance scripts and a third party tool like Dell Litespeed or Red Gate SQL Backup. It handles many aspects of database backups, provides centralized reporting, and automates most tasks around backup and recovery. RMAN will seem more familiar to a SQL Server DBA, but some shops will still use user managed backups.

To get started learning more, check out the Oracle Database Backup and Recovery User’s Guide. It contains a number of detailed examples, syntax explanations, and considerations about how to best approach backing up an Oracle database.

Oracle Terminology for the SQL Server DBA

The translation team hard at work.

At some point you’re going to be confronted with an Oracle installation or even just an Oracle DBA. Communicating with a DBA who works on a different product can be difficult – it’s like speaking US English and having a conversation with a native English speaker from Scotland. The words are the same, but they have different meanings.

While this is by no means an exhaustive list, it will help SQL Server DBAs have a meaningful conversation with their Oracle colleagues.

Database
Oracle refers to the database as the data files on a disk that store data.

Database instance
The set of memory structures and system processes that manage database files. Basically, the instance is executables and memory. Oracle has some different terms to separate out plan cache, buffer pool, and other concepts. But at a high level, executables and memory make a database instance.

So far things seem the same. Up until Oracle 12c, though, these two concepts were close to one and the same – one instance of Oracle housed one database (things like Oracle RAC not included). One thing to take note of - Oracle on Windows runs within one process, just like SQL Server. On Linux, however, there will be multiple Oracle processes, each with a clearly defined purpose.

Tablespace
A tablespace is roughly analogous to a filegroup. You can create tables and indexes inside a tablespace. Like a filegroup, you can take tablespace backups separate from the rest of the database.

Unlike SQL Server, each tablespace can have many different options – some tablespaces can be logged while others are not. During tablespace creation, DBAs can manage a variety of features of each tablespace including a separate undo tablespace (see below), per user disk quotas, logging, or even on-disk block size (this can be helpful when dealing with LOB data).

In short, Oracle DBAs can customize database behavior at the tablespace level as well as at the database level. This can be useful for controlling archive data performance, blobs, or managing other aspects of storage.
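
As a purely hypothetical illustration (the file path and sizes are made up, and a 32K block size also requires a matching db_32k_cache_size), tablespace-level options look like this:

CREATE TABLESPACE archive_data
  DATAFILE '/u01/oradata/orcl/archive_data01.dbf' SIZE 10G
  BLOCKSIZE 32K
  NOLOGGING;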

This tablespace is vast and untouched.

Default tablespace
Every user is created with a default tablespace. The default tablespace defines where that user’s tables and indexes will be created unless a different location is specified. This is like setting up a default filegroup, but it can be set per user instead of per database, and it provides finer grained control. A default tablespace is not like a default schema in SQL Server – users can create objects with different schemas inside their default tablespace. This isn’t related to object ownership like schemas in SQL Server, it’s related to object placement on disk.
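
Here’s a hedged sketch of tying a new user to a default tablespace (the user name, password, and quota are made up):

CREATE USER report_app IDENTIFIED BY "SomePassword1"
  DEFAULT TABLESPACE users
  QUOTA 500M ON users;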

Temporary tablespace
You know how SQL Server has one tempdb? Within Oracle, database administrators can specify a different temporary work space on a user by user basis. Fast OLTP workloads can have access to SSD temporary tablespace while data warehouse queries and ETL jobs can have their own temporary tablespace that uses rotational disks. Heck, you could even allocate PCI-Express storage for executives’ temporary tablespace if they needed lightning fast joins or just wanted to foot the bill for PCI-Express storage.
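
A sketch of what that looks like, with hypothetical file paths and user names:

CREATE TEMPORARY TABLESPACE temp_ssd
  TEMPFILE '/ssd/oradata/orcl/temp_ssd01.dbf' SIZE 8G;

ALTER USER oltp_app TEMPORARY TABLESPACE temp_ssd;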

Undo tablespace
Oracle uses MVCC by default (in SQL Server you’d call it READ COMMITTED SNAPSHOT ISOLATION). Row versions have to be stored somewhere, but there’s no tempdb. The undo tablespace is used to track changes that have been made and to put the database back into a consistent state if a transaction is rolled back. Although it is possible to create multiple undo tablespaces, only one undo tablespace will be used by any single Oracle instance at a time.

If only one undo tablespace can be active per Oracle instance, why have multiple undo tablespaces? Oracle RAC can contain multiple Oracle instances reading the same database. Each of the Oracle RAC instances can have a separate undo tablespace. If this sounds confusing, don’t worry – Oracle RAC is complex and deserves a separate blog post.
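
As a hypothetical example (the file path and size are made up), creating a second undo tablespace and switching the instance over to it looks like this:

CREATE UNDO TABLESPACE undotbs2
  DATAFILE '/u01/oradata/orcl/undotbs02.dbf' SIZE 2G;

ALTER SYSTEM SET undo_tablespace = undotbs2;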

Rollback
Once upon a time, Oracle DBAs had to configure the undo tablespace by hand. This was called the rollback segment. Poorly configured rollback segments led to “snapshot too old” errors and grumpy DBAs. If you ever encounter Oracle using a rollback segment, kindly ask the DBA why they aren’t using automatic rollback management (undo tablespaces).

Redo log files
It’s a transaction log file! A key Oracle difference is that everything gets logged, even the undo information. Redo log files are used just like SQL Server transaction log files.

Like SQL Server’s transaction log, Oracle can have multiple redo log files. These log files are written to in a circular fashion – the log files are written to in order and, when all log files are full, Oracle will circle around to the beginning again.

Unlike SQL Server, having multiple redo log files is the preferred way to manage Oracle logging. By default, there are two groups of redo log files, but this can and should be configured based on RPO/RTO needs.
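
A quick sketch of adding another redo log group with two members (the file paths and size are made up):

ALTER DATABASE ADD LOGFILE GROUP 3
  ('/u01/oradata/orcl/redo03a.log',
   '/u02/oradata/orcl/redo03b.log') SIZE 512M;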

Archived redo log files
These are redo log files that have been backed up. There are a number of ways to manage the archiving of redo log files, ranging from manual to completely automated. If the disks storing these files fill up, Oracle will not be able to write to the data files – active redo log files can’t be archived any more. To ensure safety, writes are stopped.

Temporary tables
Oracle temporary tables are similar to SQL Server’s with one major exception – they’re statically defined. Even though an Oracle temp table definition will stick around until dropped, the data only persists for the duration of a session (or transaction if the table is configured that way).

The data inside a temporary table exists only for the current session – you can’t view data in another session’s temp table. The upside is that temp table metadata is always available for other users to query.
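
A quick sketch of the syntax (the table and column names are hypothetical):

CREATE GLOBAL TEMPORARY TABLE staging_orders (
  order_id   NUMBER,
  product_id NUMBER
) ON COMMIT PRESERVE ROWS;  -- session-scoped; use ON COMMIT DELETE ROWS for transaction scope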

Backups
Oracle backups are very different from SQL Server backups – they’re both simpler and more complex than SQL Server backups at the same time. Many Oracle shops use a tool called Oracle Recovery Manager (RMAN) to handle database and redo log backups, archival, and even the expiration and deletion of backup files.

In-Memory Analytics: What’s Different Between SQL Server and Oracle

Pick a memory, any memory.

Both Microsoft and Oracle have released what they call “in-memory analytics” features. Microsoft started two years ago with SQL Server 2012 and Oracle is following suit next month (July 2014) with the Oracle In-Memory Analytics feature. Most of us won’t pick our database platform based on just one feature, but it’s important to know the difference between these two so you can have a better discussion around the technical choices that you’re making.

SQL Server Columnstore Indexes

When they first arrived, SQL Server columnstore indexes were touted as memory optimized. By using a columnar data structure and columnar compression techniques, SQL Server columnstore indexes are able to get tremendous improvements in performance.

In effect, SQL Server columnstore indexes are “in-memory” because of the effects of columnar compression. The compression means that the index size is significantly smaller. It’s likely that the whole index can be cached in memory and will stay in memory because the index is smaller.

By thinking smarter, Microsoft created a technology that allowed BI professionals to achieve significant speeds on data warehouses. There was one problem: the SQL Server 2012 columnstore was read-only.

SQL Server 2014 columnstore fixed the read-only problem by introducing a writable clustered columnstore. When the index is created as a clustered columnstore index (it’s the only index on the table), SQL Server will use a writable deltastore to hold data. This advantage over SQL Server 2012 has prompted many teams to migrate to SQL Server 2014.
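
Creating one is a one-liner – here’s a minimal sketch against a hypothetical fact table on SQL Server 2014 Enterprise Edition:

-- The clustered columnstore index becomes the table's only index
CREATE CLUSTERED COLUMNSTORE INDEX ccsi_FactSales
    ON dbo.FactSales;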

The downside of both SQL Server columnstore implementations is that they’re designed for pure data warehouse workloads. If you wanted to combine an OLTP workload with a data warehouse workload on the same system, you’d have some work ahead of you to build something that would work with both regular indexes and columnstore indexes. At its core, SQL Server columnstore technology is a data warehouse technology.

Oracle Database In-Memory

Oracle Database In-Memory takes a different approach than SQL Server. The Oracle feature is designed to be a transparent drop-in. Rather than require a separate physical structure on disk, a DBA will activate the in-memory feature for a given table and wait as Oracle does the rest.
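
Based on the syntax Oracle has published so far, enabling it should look something like this sketch (the table name is hypothetical, and it assumes the In-Memory option is licensed and an INMEMORY_SIZE has been set for the instance):

-- Mark a hypothetical table for population into the in-memory column store
ALTER TABLE sales INMEMORY;

-- ...and take it back out again
ALTER TABLE sales NO INMEMORY;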

Oracle does this by creating a dual-format data structure. As data is read and written to tables using the in-memory feature, the database builds columnar indexes in-memory and on-the-fly. Yes, this means that you’ll need more memory in your database servers. Many Oracle customers use Oracle RAC for scale-out performance improvements. Dedicating one RAC node to reporting isn’t unheard of and can be used to alleviate the pressure caused by holding a second, albeit compressed, copy of a table or index in memory.

Once the row-level data has been merged into the columnar index, Oracle uses compression to further improve performance. Just like other columnar database engines, compression vastly improves the performance of queries.

Oracle’s offering doesn’t persist the in-memory analytic indexes to disk – they’re purely in-memory indexes. This avoids the overhead of logging and persisting them to disk. The double write hit is avoided. However, there’s one problem with this technology – restarts. Whenever the database engine is restarted, the in-memory analytic indexes will have to be rebuilt from disk.

Technical details about the Oracle offering are sparse – we can expect to learn more after Oracle has released the feature to the public.

Comparing In-Memory Analytics

While both companies have produced an in-memory analytic feature of one form or another, direct comparisons are difficult. Microsoft SQL Server columnstore indexes are designed to improve pure data warehouse implementations. Oracle Database In-Memory, on the other hand, is intended to help DBAs get the most out of the hardware they already have. Many business users want to report on live data and this feature makes live reporting a possibility.

Sizing SQL Server for AWS

Let’s skip the shenanigans and get right down to it – you’re responsible for SQL Server and someone in your company wants to move things into Amazon Web Services (AWS). You’ve got SQL Server setup covered thanks to our helpful SQL Server setup checklist and you’re confident you’ve got hardware under control, but things are different up in the clouds.

The question on your mind is “What’s the best way to size a SQL Server in the cloud?”

The Basics: Sizing the Server

Amazon gives you a set list of server sizes to choose from. It’s up to you to pick the right server for the job. The available choices can be overwhelming at first glance, but it gets simpler when you consider different classes of SQL Server and map them to different families of AWS instances.

By the way: we call servers instances in the cloud. Yes, it’s confusing. No, we can’t change it.

Instances are grouped into instance types – you can think of an instance type as a general category of performance. This is going to be your starting point for deploying SQL Server. Don’t worry, though, you can usually change the instance type after you’ve configured SQL Server.

Possible SQL Server instance types:

  • M3 General Purpose – These are basic servers with up to 8 cores and 30GB of memory. M3 instances won’t win any awards for speed, but they are cheap.
  • R3 Memory Optimized – These instances ship with up to 32 cores (2 sockets) and 244GB of memory. Two local SSDs are thrown in – seems like a good place for TempDB to me!
  • I2 IO Optimized – Sounds great at first, right? All the memory of the R3 with 6.4TB of local SSD. At only $7 an hour, it’s a steal!

So, which one do you need? Remember that you don’t want to just choose the biggest and baddest instance right off the bat – there’s a noticeable cost associated with any of these instances.

We’ll get back to picking an instance type after covering a few other details.

Support Level

In many organizations, DBAs are lucky to have tiered support. Mission critical applications, like POS systems or credit card processing, get put in the highest tier of support. Internal applications may end up in a second tier. And, finally, legacy applications or applications that are not critical in any way to the business end up in the third tier.

We can’t ask AWS to give us faster disks, SAN replication or some other magical enterprise HA feature. But we can use this support information to figure out if this SQL Server requires the attention of a full time DBA or if it can be offloaded to something like Amazon RDS. Making this decision is outside the scope of this article, but knowing that there is an option is enough.

  • What’s the support level required for this application?
  • Does the application require regular manual tuning?
  • What’s the support SLA?
  • Is there an incident response plan?
    • Has anyone ever reviewed the incident response plan?
    • Have you ever had to follow it? That is: does it actually work?

Networking

I bet you don’t even think about networking.

The first thing you need to know is: by default, AWS instances use a single network card (NIC) for all traffic. Storage and communication use the same pipe. If you want to use database mirroring and rebuild your indexes, you may be in for a nasty surprise when mirroring drops in the middle of a rebuild.

Second: not all instance types are networked equally. Look in the Networking Performance column to get a vague idea of what’s going on. “Moderate” equates to 500 Mbps, “High” is 1000 Mbps, and “Low”/”Very Low” shouldn’t be entering the equation for a SQL Server.

Watch the storage and network throughput of your SQL Server to verify how much total throughput you need.
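
If you need a starting point for the storage side, a rough sketch like this one (it only covers I/O SQL Server has done since the last restart) shows cumulative read and write volume per database file:

SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_bytes_read / 1048576.0 AS mb_read,
       vfs.num_of_bytes_written / 1048576.0 AS mb_written
FROM   sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
       JOIN sys.master_files AS mf
         ON vfs.database_id = mf.database_id
        AND vfs.file_id = mf.file_id
ORDER BY vfs.num_of_bytes_read + vfs.num_of_bytes_written DESC;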

For SQL Server it’s better to guarantee a certain level of both network and disk performance. AWS offers what are called EBS-Optimized Instances. These EBS-Optimized Instances use a separate network to communicate with storage in AWS. Traffic on the storage network won’t cause problems for traffic on the regular network.

Use EBS-Optimized Instances for your most demanding applications.

Other applications may not require as much throughput, but you should adjust maintenance to prevent the shared network from becoming a bottleneck.

Network considerations are:

  • What is typical network throughput for the application like?
  • Do you need separate storage and networking pathways?

Storage

I’m going to repeat myself: Use EBS-Optimized Instances for your most demanding applications. You can change this after the instance is up and running but it will require a restart. Plan carefully.

For I/O intensive workloads, AWS offers Provisioned IOPS (PIOPS). The PIOPS option lets you decide on a guaranteed number of IOPS (+/- 10%) available to the system using that disk. PIOPS disks have an increased cost associated with them – they’re 2.5 times more expensive per GB than regular storage and you pay $0.10 / IOP / month.

Stripe drives with caution. In a conventional data center, it’s a best practice to use RAID to configure large volumes and to stripe data across many disks for additional performance. Although RAID striping is possible in AWS, keep in mind that any RAID volume is only as fast as the slowest disks in the RAID volume. Since all permanent storage in AWS is network attached (and subject to unpredictable latency), think very carefully before setting up RAID volumes.

Storage is somewhat limited in AWS, too. Each AWS disk (EBS volume) can be up to 1TB in size. Up to either 16 or 25 disks can be attached to an instance (this depends on some driver limitations). At a maximum, 25TB of data can be stored on an AWS instance using EBS volumes. I bet you were worried for a minute, right? Anyone wanting to store more than 25TB of data will need scale out across multiple servers.

The storage considerations are:

  • How much storage do you need for data, logs, and backups?
  • How fast does that storage connectivity need to be?
  • How much are you going to access that storage in any given second?
  • Do you need to guarantee the accesses per second?
  • What is the rate of data growth?
  • What is the rate of data retention?
  • What is the total volume of data being stored?

Picking an AWS Instance Type for SQL Server

So how should you go about picking an AWS instance type for SQL Server?

Take a look at your memory requirements for SQL Server and compare them to the instance types on offer in AWS. This may exclude several instance types immediately.

After answering the storage questions, determine if you need provisioned IOPS, dedicated storage connectivity, or even 10Gb networking. This further narrows down the choices you have for instances.

Once you’ve reached this point, you should have several choices of instance size available.

Put It All Together:

  • Network
    • What is typical network throughput for the application like?
    • Do you need separate storage and networking pathways?
    • Are features in place that require additional network throughput?
      (This might be AlwaysOn Availability Groups, database mirroring,
      or log shipping.)
  • Storage
    • How much storage do you need for data, logs, and backups?
    • How fast does that storage connectivity need to be?
    • How much are you going to access that storage in any given second?
    • Do you need to guarantee the accesses per second (IOPS)?
    • What is the total volume of data being stored?
  • Memory
    • What are the memory requirements of the existing SQL Server?
    • What is the use case for this SQL Server? OLTP? DSS?
    • What other services will run on this system?
    • Do you need Enterprise Edition?
  • CPU
    • What is the current CPU count?
    • What is the current CPU speed?
    • Do you require any Enterprise Edition features?
  • Support
    • What is the support agreement for this SQL Server?
    • What’s the support level required for this application?
    • Does the application require regular manual tuning?
    • What’s the support SLA?
    • Is there an incident response plan?
      • Has anyone ever reviewed the incident response plan?
      • Have you ever had to follow it? That is: does it actually work?
  • Instance Size: If you still have choices left to make between one of several instance sizes, ask a few more questions:
    • What is our current rate of activity growth? (e.g. as data volume doubles, will user activity double?)
    • How much will this instance cost over 1 year? 3 years?
    • Is a reserved instance a possibility (taking into account your growth rate)?

When you put it all together, sizing a SQL Server in AWS is a lot like sizing an on-premises SQL Server with several constraints around sizing and capacity.

Bonus questions:

  • Performance Profile
    • Is there a performance SLA?
    • What are the top wait types of an existing SQL Server?
    • What aspects of the application can be changed?
  • Doing Less Work

Ignorable AWS Instance Types

Instance types we can completely ignore:

  • G2 GPU – This is for people who do math on graphics cards.
  • C3 Compute Optimized – Instances with a lot of CPU, some local SSD scratch space, and not a lot of memory. Great for HPC, not great for SQL Server.
  • HS1 High Storage Density – These monsters can have up to 24 2TB local drives delivering 2.6 gigabytes per second of disk throughput. At first glance, they look like a dream. However, the data stored on the local disks will be erased if you stop and then start your AWS instance. You could try to use AlwaysOn Availability Groups, but a total AWS Availability Zone or Regional outage could lead to massive data loss. These are primarily intended for clustered computing like Hadoop or distributed data warehouses (GreenPlum).

Introducing… Your Contributions

You’ve got great ideas, don’t lie. We want to help you get those ideas out there… and into our scripts. It’s great to say that our tools (sp_AskBrent®, sp_Blitz®, sp_BlitzIndex®, and sp_BlitzCache™) are as good as they are today because of your help. We’re adding two new ways to make it easier to get a chance to share code and collaborate on some awesome tools.

Feature Requests

Make your voice heard! Mostly by other users, not us. We don’t listen to that stuff. But we’ll pretend.

Starting right now, you can head over to http://support.brentozar.com and submit new feature requests or vote on existing feature requests. You only get so many votes until we implement your feature, so make ‘em count!

This is also your chance to interact with us during the development of the feature.

Contribute Code

A number of you have done this already! You’ve contributed a bunch of fixes, enhancements, and new checks for sp_Blitz®.

In the past you’ve had to email code contributions. We’ve made it easier – there’s now a form with a place for you to dump sample code. In fact, your sample code gets turned into a feature request in our version control system which, in turn, emails everyone. That really means we’ll get to your changes faster.

Ready to share your code? Head over to http://brentozar.com/contributing-code/ and get started.

Don’t freak out at the licensing agreement – the lawyers caught wind of what we were doing and said we should get you to sign it. All it says is that you’re the person who owns the code and you’re giving us the right to use your code; you still own the code.

What Are You Waiting For?

You’re full of great ideas! We’ve got two new ways for you to share them! Sound off or code off.

How Much Memory Does SQL Server Need?

A frequently asked question is “How much memory does SQL Server need?” It’s difficult to figure out how much memory you should allocate to SQL Server.

Sizing Memory for a New SQL Server

Sizing memory for a new SQL Server application sounds like it’s going to be tricky. But there is one thing that makes this simpler: SQL Server Standard Edition.

SQL Server Standard Edition limits available memory; the limit depends on the version of SQL Server:

  • SQL Server 2008 and earlier: You get unlimited memory. Rejoice in this, it’s about the only good thing your AMC Pacer of a database has going for it. Plan for an eye opener when you look at licensing changes.
  • SQL Server 2008R2 – 2012: Put 96GB of memory in the server. There is a limit of 64GB of memory on these versions of SQL Server: just give SQL Server 64GB and walk away.
  • SQL Server 2014 or newer: Put 192GB of memory in the server. SQL Server 2014 allows up to 128GB of memory for Standard and BI Edition. Give SQL Server the maximum and walk away. This can get more complex if you’re using SQL Server Analysis Services, but have you ever known me to talk about SSAS?

Sizing memory is more difficult with SQL Server Enterprise Edition. Experience with capacity planning makes it easier to size SQL Server. Let’s assume you don’t have a capacity planning team, what should you do?

Get the cheapest, biggest DIMMs that money can buy. What does that mean? Well, there’s a significant price difference between 16GB and 32GB DIMMs (that’s memory sticks to you and me). For example, a Dell 820 can accept up to 24 DIMMs. What’s the cost look like?

  • 16GB DIMMs - 384GB of system memory, $5,849.28
  • 32GB DIMMs - 768GB of system memory, $19,779.60
  • 64GB DIMMs - 1536GB of system memory, $108,310.32

The 32GB DIMMs fall into the “cheapest, biggest DIMMs that money can buy” category. Besides, the cost of RAM is likely to be a rounding error compared to the cost of licensing SQL Server Enterprise Edition.

Sizing Memory for an Existing SQL Server

This is where things start to get tricky. There are a few different metrics available to measure the amount of memory that SQL Server is working with.

Remember, we’re trying to measure the size of the active working set of data. This is difficult to do, but there are a few easy ways that we can get most of the way there.

Sizing Memory with Page Life Expectancy

On a single CPU socket system, this is easy:

SELECT  object_name,
        counter_name,
        cntr_value AS [value]
FROM    sys.dm_os_performance_counters
WHERE   LTRIM(RTRIM(object_name)) = 'SQLServer:Buffer Manager'
        AND LTRIM(RTRIM(counter_name)) = 'Page life expectancy' ;

If you’ve got more than one CPU socket (or you’re on a system with multiple NUMA nodes), then you might want to get a bit more specific. The previous query averages memory consumption across all CPU sockets/NUMA nodes. This query looks at each one individually:

SELECT  object_name,
        counter_name,
        instance_name AS [NUMA Node] ,
        cntr_value AS [value]
FROM    sys.dm_os_performance_counters
WHERE   LTRIM(RTRIM(object_name)) = 'SQLServer:Buffer Node'
        AND LTRIM(RTRIM(counter_name)) = 'Page life expectancy' ;

Why track the page life expectancy at all? PLE can be a simple indicator of memory and I/O health. If PLE ever decreases, then you know you’re reading from disk. That doesn’t mean the active working set of data is bigger than memory, it just means that you’re reading from disk at some point.

Track page life expectancy for a regular period of time. Make sure to measure PLE at the same time of day and at regular intervals. Graph your PLE and use that to determine trends about memory use. If PLE is trending down – you may need more RAM.

By the way, here are a few things that could need a read from disk (and decrease PLE):

  • Rebuilding all indexes
  • DBCC CHECKDB

Measuring Disk Activity with Performance Monitor

Perfmon is a great way to track physical disk activity. You don’t need to do any complicated math to determine if disks are being accessed. Once you’re ready to measure, fire up perfmon and add the following counters for every disk that contains non-system SQL Server data files; don’t include log file drives.

  • Physical Disk \ Avg. Disk sec/Read
  • Physical Disk \ Disk Reads/sec
  • SQLServer: SQL Statistics \ Batch Requests/sec

Monitor these counters for at least a week, just like you should monitor PLE for at least a week. You don’t need to collect this information at a particularly fine granularity – even once an hour will do. Just make sure you take into account when regular maintenance (index rebuilds, DBCC, etc) is running so you can account for any reads and writes during those periods.

If you don’t see heavy read activity during business hours, then you can assume you’re not under memory pressure.

By including the Batch Requests/sec counter, we can tie disk activity to SQL Server activity. We don’t need to worry about system level reads that happen when there’s no SQL Server activity. Also make sure anti-virus is configured to ignore SQL Server files.

If reads increase alongside Batch Requests/sec, then we know there’s some correlation – as user activity increases, so does disk activity. If you want to figure out the exact cause, you’ll need to get more invasive and use SQL Server Profiler or Extended Events.

For more information on collecting and analyzing perfmon counters, take a look at SQL Server Perfmon (Performance Monitor) Best Practices.

Measuring with the DMVs

We can get even more scientific using the DMVs. This is where things can get really tricky and it’s important to exercise a great deal of caution – not because the DMVs are inaccurate, but because they contain a great deal of information from many sources.

While we can use sys.dm_io_virtual_file_stats to measure I/O against specific files, it’s important to take into account regular maintenance on those database files. Database backups (both full and differential) will register reads against the data files. In order to determine how many reads are happening against a file, we need a quick and dirty measure.

DECLARE @startup DATETIME,
        @tbb DECIMAL(38, 0),
        @total_reads DECIMAL(38, 0),
        @total_data_size DECIMAL(38, 0),
        @mb DECIMAL(38, 0) ,
        @gb DECIMAL(38, 0) ;

SET @mb = 1024.0 * 1024.0 ;
SET @gb = @mb * 1024.0 ;

SELECT  @startup = create_date
FROM    sys.databases
WHERE   name='tempdb' ;

SELECT  @tbb = SUM(CAST(COALESCE(backup_size, 0) AS DECIMAL(38,0)))
FROM    sys.databases d
        LEFT JOIN msdb.dbo.backupset b ON d.name = b.database_name
WHERE   b.backup_start_date >= @startup
        AND b.type <> 'L'
        AND d.database_id > 4 ;

SELECT  @total_reads = SUM(num_of_bytes_read)
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) vfs
WHERE   vfs.database_id > 4 ;

SELECT  @total_data_size = SUM(CAST(size * 8 AS DECIMAL(38,0))) / @mb
FROM    sys.master_files
WHERE   [type] <> 1
        AND database_id > 4 ;

SELECT  (@total_reads - COALESCE(@tbb, 0)) / @gb AS [Non-backup reads (GB)] ,
        ((@total_reads - COALESCE(@tbb, 0)) / @gb) /
            DATEDIFF(DAY, @startup, CURRENT_TIMESTAMP) AS [Non-backup reads / day (GB)] ,
        @total_data_size  AS [Total Data Size (GB)] ;

This query, while not exact, will at least take into account the reads and writes against the file but will exclude regular backups. If you’re performing index maintenance, that index maintenance will also be included. There’s no easy way to measure the load of index maintenance apart from taking a snapshot of sys.dm_io_virtual_file_stats before and after the database maintenance occurs.

Again – this method isn’t exact, but it will give you a better understanding of the physical I/O requirements of your server since the last SQL Server start up.

Indexes?

There’s one other thing to keep in mind – if you have the right indexes in place, SQL Server won’t need to read from disk and you won’t need to keep the entire database in memory. Adding memory until you don’t perform additional I/O is one way to solve the problem, but tuning indexes is another. Check out sp_BlitzIndex® for ways to tune up your indexes.

Summarizing our Sizing Journey

For new SQL Servers – fill it with memory.

For existing SQL Servers there are three ways to check memory:

  1. Page life expectancy.
  2. Measuring disk activity with Performance Monitor.
  3. Measuring disk activity through the DMVs.

None of these are precise – they all miss certain key measurements – but they all give us a good enough idea without requiring heavyweight monitoring.

Curious About the SQL Server 2014 Cardinality Estimator?

If you’ve been following the blogosphere, or heading to SQL Saturdays, you’ll know that many folks are excited about the brand new SQL Server cardinality estimator (CE for short). As Kendra showed in her article The SQL 2014 Cardinality Estimator Eats Bad TSQL for Breakfast, the new CE provides dramatically improved query plans.

What if you want to know more about the new CE? What are your options? Well, a lot of bloggers, presenters, and crazy people (I’m looking at you, Paul White) will observe how the CE behaves by building carefully contrived scenarios and verifying the outcome. We can observe the CE in the same way that we observe the optimizer – by observing the side effects.

What if you want to know why something happened? To get an answer to that, you’d have to go directly to the source. Thankfully, the source has published their work in the paper Testing cardinality estimation models in SQL Server. Warning – this paper does cost money if you aren’t a member of the ACM.

SQL Server 2014 Cardinality Estimator

Why would you want to go and read an academic paper like this one? Well, it contains a lot of juicy information about what the SQL Server product team considered in terms of possible optimization paths for the CE. One item of interest is the concept of a simple join optimization. To summarize – the simple join ignores histograms and assumes that all distinct values from the child are contained in the other table. This could cause some query estimation issues, but there will be faster optimization. The authors note that for some workloads, this will be much more desirable.

If you’re not up for an academic paper or two, Joe Sack has put together an in depth look in Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator. It’s well researched, peer reviewed, and based on observable verifiable information that’s in the product right now.

What are you waiting for? Read a white paper or two today!

Introducing sp_BlitzCache™

Finding the worst queries in SQL Server is easy, right? You just query one of the dm_exec_* DMVs, bring in the SQL text and the execution plan, and then you spend hours looking at execution plans trying to figure out what’s wrong in the query, much less how to fix it.

All that’s changing – we’re introducing sp_BlitzCache™.

Life Should Be Easy

We try to make things easy for you. sp_Blitz® and sp_BlitzIndex® make life easy – they help you find configuration and indexing problems. sp_AskBrent® helps you find problems that are hitting your SQL Server right now. We decided to take it one step further and apply our expertise to analyzing the plan cache.

There are a lot of queries that will help you get data out of the plan cache. A lot of those queries are also fairly complicated. They present a lot of data, but they don’t help you interpret the results.

Work Just Got Easier

sp_BlitzCache™ analyzes the top poorly performing queries for many different problems. After finding your bad habits, sp_BlitzCache™ builds a list for easy analysis. Your bad habits are outlined in a summary result set:

A summary of your code sins.

In addition, each of the queries in the top 50 worst performers will have their warnings called out in a separate warnings column:

Your detailed confession.

Warning!

This will not work on SQL Server 2005. If you want to analyze your plan cache on SQL Server 2005, you can use our legacy plan cache script.

Options

There are a number of different options available with sp_BlitzCache™. If you want a refresher, just run EXEC sp_BlitzCache @get_help = 1. Otherwise, here’s a quick overview:

@get_help - Displays a summary of parameters and columns.

@top - Determines the number of records to retrieve and analyze from sys.dm_exec_query_stats, sys.dm_exec_procedure_stats, and sys.dm_exec_trigger_stats.

@sort_order - CPU, reads, writes, etc. This controls collection and display.

@use_triggers_anyway - Don’t set this unless you’re crazy. SQL Server 2008R2 and earlier have a bug where trigger counts can’t be trusted. It seemed safer to ignore triggers on those versions of SQL Server than give you bad information.

@results - There are three ways to display the detailed results grid:

  • Narrow will only display averages for CPU, reads, writes, and duration.
  • Simple, the default, displays totals and averages.
  • Expert displays all of the information collected by sp_BlitzCache in addition to the metrics normally displayed.

If you want to save the results to a table, you can use @output_database_name, @output_schema_name, and @output_table_name to persist the results to a table. If the table doesn’t exist, it’ll be created for you.
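
For instance, here’s a hypothetical call that grabs the top 20 plans by reads and saves the results to a table (the database, schema, and table names are made up; run @get_help = 1 to confirm the valid @sort_order values):

EXEC sp_BlitzCache
     @top = 20,
     @sort_order = 'reads',
     @output_database_name = 'DBAtools',
     @output_schema_name = 'dbo',
     @output_table_name = 'BlitzCacheResults';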

Download it now!

Download sp_BlitzCache now!

Introduction to Extended Events (video)

Extended Events were introduced with SQL Server 2008. Extended Events is a replacement for Server Side Traces and so much more – it’s a lightweight way to look at SQL Server performance, events, deadlocks, locking, blocking, and more. In this webcast, Jeremiah Peschka provides a high-level introduction to Extended Events. You’ll learn about how Extended Events are structured, how to find out what’s available, and where to get started.
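
As a tiny taste of the syntax, here’s a hedged sketch of a session that captures completed batches to a file (the session and file names are made up, and the event_file target name assumes SQL Server 2012 or later):

CREATE EVENT SESSION capture_batches ON SERVER
ADD EVENT sqlserver.sql_batch_completed
ADD TARGET package0.event_file (SET filename = N'capture_batches.xel');

ALTER EVENT SESSION capture_batches ON SERVER STATE = START;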

Find out more at our Extended Events resource page.
