Big Investment in Data Warehousing
Almost all of the new features reflect an increased focus on data warehousing. This is similar to recent SQL Server releases and, indeed, most mainstream database releases. Different vendors are betting on BI and Oracle’s most recent release is no exception.
Full Database Caching
Caching data in memory can avoid costly disk-based I/O. By default, databases will cache frequently accessed data. There’s a setting in Oracle called the small table threshold – scans of tables larger than the
_small_table_threshold will use a portion of memory that should age out faster. This idea (in SQL Server it’s called page disfavoring) exists to prevent large table scans from clearing out database server memory.
What if you want those tables to stick around in memory? Modern servers can hold huge amounts of memory and with Oracle RAC, you can scale your buffer pool across multiple servers.
The full database caching feature changes this behavior. Once full database caching is enabled, Oracle will cache all data in memory as the data is read.
It’s important to note the guidance around this new feature – it should only be used if the logical size of the database is no bigger than 80% of available memory. If you’re using Oracle RAC, DBAs need to be doubly careful to make sure that the workload is effectively partitioned across all RAC nodes. Read more about the features and gotchas in the Oracle Database Performance Tuning Guide.
Big Table Caching
Speaking of larger tables, the new patch includes a feature called automatic big table caching. If you have more data than you have memory, this feature seems a lot safer than full database caching.
Rather than attempt to cache all data in memory, big table caching sets aside a portion of the buffer pool for caching data. The DBA sets up a caching target (
DB_BIG_TABLE_CACHE_PERCENT_TARGET) and up to that percentage of memory will be used for caching really big tables.
Like the full database caching feature, this works with single instances of Oracle or Oracle RAC. There are, of course, considerations for using the feature. The considerations are documented in the VLDB and Partitioning Guide.
In-Memory Column Store.
I’ve already covered Oracle’s In-Memory Column Store. There are some additional features that have been announced that make the In-Memory Column Store feature a bit more interesting.
A new data flow operator has been added to Oracle: the
VECTOR GROUP BY. This new operator enables efficient querying for In-Memory Column Store tables. Although no new SQL operators have been added to Oracle for the
VECTOR GROUP BY, it’s possible to use query hints to control optimizer behavior. Six new hints have been added to push the optimizer in the right direction.
In the Oracle world, tables are almost always heaps. Clustering is done only rarely and in special circumstances.
Attribute clustering lets the data warehouse designer specify a table order based on one or more columns of the table. So far, this sounds like something that you could already do. It is also possible to use join attribute clustering to cluster on columns connected via a foreign key relationship. The data warehouse designer, or even the DBA, can modify table structure to cluster a fact table based on dimension hierarchy.
An additional feature called interleaved ordering uses some pretty advanced storage mechanisms to get better I/O out of an attribute clustered table. Interleaved clustering makes it more efficient to perform queries where you may not know the search predicates, or where search predicates may vary over multiple columns (sounds like a dimensional model to me).
There are a few upsides to attribute clustered tables – they require fewer (or no) indexes, they may require less I/O, and they improve data compression. As always, consult the Attribute Clustering documentation for all the details and fine print.
Zone maps are new to Oracle 22.214.171.124 – they provide an additional level of metadata about the data stored in a table. Instead of operating at the row level, like an index, a zone map operates at the level of a range of blocks. In a way, an Oracle zone map is a lot like a SQL Server Columnstore dictionary – it allows I/O pruning to happen at a much coarser granularity than the row level.
When zone maps are combined with features like attribute clustering or partitioning, the database can take advantage of metadata at many levels within the zone map and perform incredibly efficient I/O pruning. Data can be pruned at the zone, partition, or subpartition level, and when attribute clustering is in use the pruning can happen at the dimension level too.
Zone maps are going to have the most benefit for Oracle databases with a heavy data warehousing workload. Big scans with common predicates, especially with low cardinality predicates, will use zone maps to eliminate a lot of I/O. Users taking advantage of attribute clustering and partitioning will also see improvements from this feature. You can read more in the Zone maps section of the data warehousing guide.
More than Data Warehousing
This is a wealth of features. Seriously, that’s a crazy investment in data warehousing for any release of a product, much less a point release.
Every Developer’s Dream: JSON Support
Everybody hates XML. It’s cumbersome to query and people love making fun of it. Many databases have been adding support for JSON and Oracle is no exception.
Oracle’s JSON implementation supports search indexes for general querying (give me the third element of the array in property “sandwiches”). Multiple index types exist for the JSON data type, giving developers fine grained control over how they interact with the data stored in the database. Oracle full text search also supports full-text queries using a new
json_textcontains operator. If you want to jump down the JSON rabbit hole, make sure you check out the JSON documentation and pay particular attention to the sections about indexing – there are a number of examples of what will and won’t work.
Astute readers will notice that with the addition of JSON to Oracle 12c, SQL Server is now the only major database platform that doesn’t support JSON as a native datatype.
Cross Container Querying
Oracle 12c brought multi-tenant databases, sometimes referred to as contained databases or containers. If you wanted to query across databases, you’d typically have to go through some shenanigans. The
CONTAINERS clause makes it easier for developers to write queries that can go across all containers, or even some containers.
Basically, instead of querying directly from one table, a developer would put the name of the table in a
SELECT CON_NAME, *
FROM CONTAINERS(employees) ;
And if you wanted to query only specific containers:
SELECT CON_NAME, *
WHERE CON_ID IN (1,2,3,4) ;
While not useful for all systems, this does allow DBAs to perform administrative or back office queries across multiple database containers. The gotcha? All databases need to have the same schema.
Oracle 126.96.36.199 has a number of new features, some of which I didn’t mention since they relate entirely to managing the contained database features. If you want to learn more, check out the new feature guide – it’s full of links to the Oracle documentation on each new feature.