Databases uses statistics about data to determine the best way to query tables. Should the database seek or scan? How many rows will be returned from tables that are being joined? Oracle is no exception, but the way Oracle treats statistics is different from SQL Server.
Oracle Statistics Collection
Just like SQL Server, Oracle can collect statistics manually. Unlike SQL Server, Oracle includes an automatic statistics collection job.
The Oracle automatic stats collection job is scheduled to run, by default, during configured maintenance windows. These maintenance windows are configured for each Oracle database that you’re running. This gives the people who know the data best, the DBAs, a chance to schedule statistics collection whenever it’s needed and whenever it won’t cause problems for production workloads.
Oracle also supplies manual stats gathering at the index, table, schema, or even database level through the DBMS_STATS package:
1 2 3 4 |
BEGIN DBMS_STATS.gather_table_stats('SH', 'SALES'); END; / |
The statistics gathering utilities have a wealth of options available that allow DBAs to choose to only rebuild existing stats, change the number of buckets in the histogram, choose sample percent, and control how new statistics are generated.
Controlling Statistics Collection
How much control do you have about how statistics are collected in SQL Server? Your options are basically: automatic generation is on or off, automatic updates are on or off, and the time your statistics collection job is allowed to run.
There are a number of options that are available in Oracle for determining how statistics are collected. The options exist at multiple levels – from the entire database down to a single table. This allows DBAs to tune for specific scenarios where one table may need to gather statistics with a much higher degree of parallelism, or at a different sample rate, than other tables.
Data Change Threshold
SQL Server will opt to update statistics if 20% of the data, plus 500 rows, has changed.
While Oracle will not automatically update statistics, the percentage of stale rows is tracked on an index by index basis. When the percentage of stale rows goes beyond the STALE_PERCENT
value, the statistics are considered to be stale and need to be collected again. Although the statistics update will not happen automatically, it is possible to monitor this value and take appropriate action, if necessary.
The Ascending Key Problem
What happens when you add new data that the engine isn’t aware of? Oracle will make a best guess based on the current distribution of data in the table. There’s additional logic built-in to assume that fewer rows exist the further the queried value gets from the minimum or maximum value that’s currently tracked. This is probably true for many systems, but it could cause problems when a lot of data is coming in. On the bright side, as a DBA, you have full control over when statistics get updated.
Learning More About Oracle Statistics
You can learn more about Oracle’s table statistics in the Optimizer Statistics Concepts documentation, the DBMS_STATS
package, and through the DBA_TAB_HISTOGRAMS
view.
2 Comments. Leave new
Excellent post Jeremiah. Especially for those of us who are primarily Oracle DBA’s but also have to manage SQL Server. This gives us more insight between the two. Please do more.
Thanks.
Thanks for the kind words!