Tag Archive: fts

A Different World: Three Use Cases For Riak

Writing and deploying an application is pretty easy, there’s a process to follow that will get you most of the way there. The tricky part is getting the business logic down. But this isn’t about writing code, this is about what happens once you have an application out in the wild. The tricky part comes when you have to grow with your application. There are some tricky parts of keeping an application up and running. Things get complicated when you start trying to add complex functionality for your data. For those of us who use SQL Server there are a lot of options within SQL Server, but some of them require a seasoned DBA, expensive features, or both.

Durability

Making sure your data sticks around is important. Losing data can be catastrophic for a business. Even if you have a backup, it can take hours or even days, to restore your database and verify that everything is back up and running. That doesn’t even include the possibility that a backup is corrupted and you won’t be able to bring your database online. Or, worse yet, you could lose your SAN completely. Durability is important.

With SQL Server it’s possible to set up a cluster, mirroring, or replication to provide a second copy of your data for readability. These three solutions all require knowledge of some specialized features of SQL Server. Keeping each of these features up and running requires some knowledge, planning, set up, and monitoring. None of it is easy, some of it’s hard, and some features even need a lot of care and feeding to stay up and running. DBAs tell horror stories about hand feeding replication for days to get it up and running again after a catastrophic failure; it’s almost a badge of honor. Here’s the thing: none of this should be difficult. One way data durability (mirroring, transactional replication) is limited in its functionality – reads can happen everywhere, but writes can only happen in one place. If the master server goes down, it can be difficult to switch the master server around.

Durability problems plague DBAs every day, whether they know it or not. I spent a lot of time trying to solve my durability problems by configuring replication or log shipping and building complex application logic to support the possibility of multiple write locations. I later solved the problem by using Riak. One of the main benefits of Riak is the availability and fault tolerance it provides. That is: it’s durable as all get out. Data isn’t written to one place, it’s written to several places at once; writes won’t succeed unless more than one server responds that a write is successful. In a way, it’s like a really fancy version of database mirroring or SQL Server Denali’s Availability Groups feature that you can have right now.

When a server in a Riak cluster eventually fails, as hardware always does, you could recover it using a filesystem backup. Riak will handle making sure everyone has the right data by using a very cunning technique called read repair. If the server completely fails the other option is to simply replace it. You tell the cluster of servers that one server is gone and another server is taking its place. The cluster then figures out how to perform recovery or redistribute the data. It’s a lot less painful than shipping full database backups or replication snapshots across the data center or even across the country.

Latency

Relational database, and SQL Server in particular, use B+trees indexes to store data. B+trees provide reasonably fast look ups of data (any lookup of a single row will need to traverse the same number of pages on disk as any other lookup). By contrast, inserting data into a B+tree is not always particularly fast – page splits may occur, forcing data to be shuffled around on disk and moving things out of order could require intermediate pages to be updated. Because of the B+tree structure, many SQL Server DBAs advocate using an ordered, constantly increasing key for clustered indexes.

Riak, by contrast, defaults to using the Bitcask storage back end (with the option to use several others). Bitcask is unique in that it doesn’t use a B+tree to store data on disk. Instead Bitcask uses a pair of data structures – a series of data files (written as log-structured hash table) and an in memory directory of keys (a keydir) to make it easy to find records in the log-structured hash table. Reading data out of Bitcask will take two hops – one for the keydir and one for the data file – versus many potential hops in larger SQL Server tables (probably around 4 physical hops with a cold cache on a large-ish table).

There are two things that immediately stand out to me about Riak’s data latency. One is that Riak’s latency is predictable. Any write is going to take as much time as it takes to stream the data to disk. Since Bitcask is a log-structured hash, there’s no possibility of fragmentation; data is written in the order it arrives. Like some other databases, Riak does not perform in place updates of data. Instead, a new record is written in the log-structured hash table and the keydir is updated with the location of the new record. Because of this, it’s easy to predict how long an insert or update will take – as long as it takes to write that many bytes of data to disk.

Just as Riak provides a predictable low latency for writes, it also provides low latency for reading data from disk. Riak doesn’t use locking, so there can be no blocking. Instead read latency boils down to the amount of time that it takes to pull data off of disk. The more data there is in a given record, the slower the read is going to be. Obviously, there’s some seek time involved, but it’s negligible when you consider that a seek involves a single memory read and a single disk read.

Full Text Search

SQL Server’s full text search has caused a lot of problems for a lot of smart people. The more complex the schema and the more data load involved, the more likely that SQL Server’s full text search is going to have some problems. It’s a great tool, but there are some limitations and tuning full text queries is very different from tuning regular T-SQL queries.

A problem around full text search is the inability to scale the full text search service independently of the SQL Server. They’re tied together on the same physical instance. If you need to increase the performance of full text search, you must resort to increasing the performance of SQL Server, including the same licensing costs for SQL Server. Anyone how has gone from a 2 socket to a 4 socket machine can tell you that a doubling in license costs isn’t trivial. Full text search can, like many things, be moved to a separate SQL Server using replication, but SQL Server’s transactional replication has a reputation for being a bit manpower intensive. Many teams eventually move full text search to something other than SQL Server. Pushing full text search outside of the database engine frees up the database engine to serve other queries. Some people, like the fine folks at StackOverflow, have used Lucene.net. One of the advantages of using a full text search engine is that it offers phenomenal flexibility. Unfortunately, both SQL Server’s full text search Lucene and limited to a single node. Riak Search is Lucene compatible, flexible, and durable.

Riak Search automatically indexes documents whenever they’re saved in a specific bucket, just like SQL Server. Server side triggers are created to asynchronously index data as it is saved. Unlike SQL Server’s full text search, it’s possible to create custom functionality for searching. Different word breakers and tokenizers can be created to support different methods of indexing. Riak Search also allows complex documents to be indexed using custom schemas, much like Lucene.

The icing on the cake, for me, is that Riak Search can be used to provide linear scaling – as servers begin to run out of capacity, additional nodes can be brought online to quickly scale out. The upside of Riak Search is that there’s also durability built-in: the search indexes are spread across the cluster. Spreading indexes across the cluster, term partitioning, means that load from complex queries can be served by many nodes at once making it possible to provide overall higher query throughput on large data sets.

Making It Work

It’s fair to say that many people using SQL Server are also using the .NET Framework. While Riak is implemented in Erlang and runs on Unix based operating systems, it’s easy to work with SQL Server and Riak together in the same environment. CorrugatedIron is a community developed, open source, .NET library for Riak. While it’s still under heavy development pending Riak’s 1.0 release later this month, CorrugatedIron makes it possible to develop cross database functionality to save data in SQL Server and Riak without requiring developers to learn a new programming language or operating system. Functionality can be moved out of SQL Server and onto a distributed platform where functionality can be scaled horizontally and linearly.

This isn’t to say that I’m advocating for teams to abandon SQL Server in favor of Riak. In fact, that’s the opposite of what I want to say. Teams should pick a best of breed solution for their data storage needs. SQL Server is a great relational database, but there are some things it doesn’t do as well as we’d like. Riak is a great distributed database that is impervious to single node failure. They both provide different features and functionality that complement each other very well. Riak removes single points of failure, provides rich full text search functionality, and provides consistent latency guarantees. SQL Server provides rich querying semantics on high structure data and support for atomic transactions.

Jeremiah Peschka

Jeremiah Peschka has worked as a database and emerging technology expert at Quest Software where he researched new trends and technologies in the world of data storage. Over the course of his career he’s worked with companies across many industries as a system administrator, developer, and DBA. He’s been involved with all aspects of application development and deployment. He likes cheesecake, coffee, and ice cream.

Website - Twitter - Facebook - More Posts

StackOverflow’s SQL 2008 FTS issue solved

Whew – the Microsoft folks really came through on this one and tracked down the problem pretty quickly.  I want to personally thank Denis Churin, Nikita Shamgunov and Sundaram Narayanan for their detailed investigation into the issue and helping us get it fixed. I’m going to explain the issue to help anybody else get through it in case they run into similar problems.

How SQL 2008′s Full Text Search is Different than 2005

Let’s start with the basics: in SQL Server 2005, the full text data lived outside of SQL Server and wasn’t subject to transactional locking.  If you inserted a gazillion records into a table that had a full text index on it, the indexes were rebuilt without worrying about simultaneous inserts and locking.  It wasn’t like a traditional SQL Server table.

In SQL Server 2008, the full text indexes were moved inside the database server and became subject to transactional locking.  By default, the indexes are updated automatically whenever SQL Server determines they need to be updated.  When that merge process happens, SQL Server needs to obtain some locks on the indexed table.  Ideally, it grabs the locks when there’s a brief period of no load, does its merge work, and lets go of the locks.  The merge process can be quite brief (well under a second) as long as the amount of data hasn’t changed dramatically.

In a heavily transactional environment when there’s a whole lot of inserts/updates/deletes on the indexed table, the DBA may need to restrict those merge activities to only a certain time window of the day.  SQL Server 2008 gives you that ability to do merges manually, but I wanted to avoid that on StackOverflow.com.  Every time I put in a manual job into a solution, it requires manual maintenance, alerting and corrective actions, and I don’t take that lightly.

Where The Problem Comes In: Convoys of Queries

Imagine this scenario:

  1. A full-text select query is issued that looks like it’ll finish extremely quickly
  2. The SQL Server doesn’t see much activity in the full text table (only selects, not inserts/updates/deletes) so it kicks off a merge
  3. The select query doesn’t finish as quickly as the engine expected, and the merge can’t start until it obtains the locks it needs
  4. More full-text queries come in (could be selects, inserts, updates, deletes) that need to obtain locks

Those newly issued queries in step 4 are suddenly delayed while waiting for query #1 to finish.  The impact on full text performance varies depending on how long it takes query #1 to finish – might be milliseconds, might be seconds.

Denis Churin, Nikita Shamgunov and Sundaram Narayanan (the Microsoft heroes) suspected this might be our performance problem at StackOverflow, so they had us grab a memory dump and a database backup at the exact moment we were having performance problems.  They looked at the memory dump and isolated a single particular full-text query that was confusing the engine.  The engine was building an execution plan for it that didn’t work well, and instead of taking milliseconds, it took seconds (as many as 50 seconds).  During that time, performance went into the toilet.

They rewrote the query in a different way the engine would analyze better, and when that query was changed, presto, the performance problems disappeared.

There’s a QFE coming in a few weeks that will let the merge thread run without blocking other queries, but for now, we’re in good shape.

Diagnosing The Problem in Your Environment

Sundaram gave us this query to help troubleshoot when a long-running query is blocking other queries.  This helps identify the issue when a full-text select query suddenly blocks the SQL Server 2008 full text merge thread, and you can look at that query to make it run faster.  I haven’t tested this in depth, but these guys have proven to be much more qualified than me, so I have a hunch it’ll work, heh:

declare @temp int
declare @parent int
declare @final int
set @parent = 0
while (@parent = 0)
BEGIN
select @parent=blocked from sys.sysprocesses where lastwaittype=’LCK_M_SCH_M’ and waittime > 30000
WAITFOR DELAY ’00:00:01′;
END

WHILE (@parent <> 0)
BEGIN
set @final = @parent
select @temp=blocked from sys.sysprocesses where spid = @parent
set @parent = @temp
END
select * from sys.sysprocesses where spid = @final
select * from sys.dm_exec_requests ER
CROSS APPLY sys.dm_exec_sql_text(ER.plan_handle) ST
where ER.session_id = @final
select * from sys.dm_exec_requests ER
CROSS APPLY sys.dm_exec_query_plan(ER.plan_handle) ST
where ER.session_id = @final

And now, finally, I can start performance tuning that server!

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

Website - Twitter - Facebook - More Posts

Before you upgrade to SQL Server 2008…

This month’s Redmond Mag glows about some new features in SQL 2008, and yes, it does have a lot of cool tricks up its sleeve.  But before you go upgrading your servers to get those new features, there’s one thing you need to know.

New versions of SQL Server are not always faster for every query.

This may come as a surprise to you, but every versions of SQL Server have areas where they require manual tweaking in order to be as fast as the last version.  I worked with Jeff Atwood and the guys at StackOverflow this past weekend to move them onto SQL Server 2008, and we had a nasty surprise.  Jeff summed up the issues with the SQL 2008 upgrade on his blog, but I’ll cover it here from a DBA perspective.

The app and SQL 2005 were on the same box (which tells you a lot about the performance of his code, because that site is doing pretty well) and they got a new box running SQL 2008.  We restored the databases over to the 2008 box for testing, ran queries, and compared performance.

Full text search results were slower, but we didn’t catch just how slow they were because we focused on the queries that were currently running slow on the 2005 box.  Some of the queries we didn’t test had been running great on 2005, but were dead slow on 2008.

Why slower? Different execution plans.

We did catch a second issue: on a particularly slow set of queries, SQL 2008 was building a different execution plan than 2005.  This execution plan worked – but not for the kind of load StackOverflow has.  I narrowed down the differences and I was able to overcome it with trace flag 2301, which lets SQL spend a little more time building a good execution plan. By spending more time compiling the plan initially, and saving that plan in cache, we got phenomenally better results.  Query times went from 190ms on SQL 2005 to 40ms on SQL 2008.  Hubba hubba.  All systems go.

Denny Cherry, a performance tuning guru with a long history at MySpace.com and Awareness Technologies, asked me why I didn’t manually set up query execution plans for them.  If it was my server and I was Jeff’s full-time employee, that’s exactly what I’d do.  Problem is, if you don’t have a full-time DBA to watch the server and identify what the right (and wrong) execution plans are, you introduce an element of mystery.  I can imagine what would happen three months down the road: performance would go to hell in a handbasket as schemas, queries and indexes changed over time.  Jeff wouldn’t know what things were the fault of the engine, versus what things were the fault of the DBA who’d changed these settings a while back.  So I had to pick a solution that wouldn’t require StackOverflow to incur a huge new payroll expense.

Went live with 2008, -T2301 killed us.

We went live with SQL 2008, rebuilt the indexes & stats, turned on the site (now hosting IIS on a separate box, mind you) and immediately the server slowed to a crawl.  I figured it’d take a few minutes to get a good set of execution plans built, but the server just wasn’t recovering.  Doing diagnostics on the server, I discovered that queries using sp_executesql were just knocking the server over.  Ahhh-ha!  Those were dynamic SQL strings, and those would probably get new execution plans built every time.  The trace flag -T2301 failed us there, so we had to rip it back out.

How much would you pay to avoid a scenario like this?  $19.95?  $29.95?  But wait, there’s more!

After ripping out the trace flag, the server stabilized around 20-30% CPU, but those numbers were too high for a weekend night.  When they came up to full load during the week, the server fell over, averaging 100% CPU for a few minutes at a time.  The problem query was doing a union between three full text searches, but before you scream bloody murder about a union, even running the three searches independently was taking 60-70% of the time they took when unioned together.  We were screwed.  The guys had to make a change to their application and cache data on the web server’s hard drive in order to sustain their normal load.

Ugh.  As a DBA, that’s a failure when the app guys tell me that.  This is an application that used to live fine on a single box, and now, even with SQL 2008 on its own hardware, the app guys have to work around a weakness in SQL 2008.  Ouch.  I take that pretty personally.

The lesson: capture a full trace before you upgrade SQL Server.

The lesson: before you upgrade, capture a full trace of your server’s load and replay it against the new version.  Analyze before and after duration times and CPU numbers for both versions, and identify the list of queries that run slower.  Examine how often they actually run in production, and think about how that’s going to affect your load.  This was my own failure – after working with the guys at StackOverflow and seeing how tight their queries were, it seemed like the slowest queries on SQL 2005 were still in pretty good shape.  Unfortunately, hidden below the surface in queries that were running in 50-75ms on SQL 2005, were some queries that ballooned to over 1 second on SQL 2008, and went much higher under load.

Furthermore, a simple trace replay still won’t give you the full picture because traces don’t throw the same amount of load at the replay server in the same time.  In a web server scenario, you may have a hundred queries come in simultaneously, and you want to see exactly how the new server will be affected by that – but replaying a trace with the Microsoft native tools won’t give you that answer.  For that, you need a benchmarking tool like Quest Benchmark Factory or HP Mercury LoadRunner, etc – something that can capture a load and then replay it with the same velocity and bandwidth.

Do I like SQL Server 2008?  Yeah.  But do I wish we could have avoided what happened this week with StackOverflow?  Hell yeah.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

Website - Twitter - Facebook - More Posts