Aug 30, 2010


(This is relevant for Oracle 10.2.)

Let me tell you a story. It is about an application that “freezes” for a few seconds every hour. In fact, identifying the problem took us quite some time. The symptoms are visible at the application server level, so we checked this and that: the app, the network, the OS, the DB…

We noticed the following: this application uses one very busy table (among others). The information there is kept for a few hours and can then be deleted; but during those few hours we gather many gigabytes. The operations on this table are INSERT and SELECT. The table is partitioned by hour, and every hour a DB job runs to purge old data using

alter table ... truncate partition ... UPDATE GLOBAL INDEXES

The UPDATE GLOBAL INDEXES clause is there because the primary key uses a global index. For partitioned tables, an index can be unique only if it is global or if the partitioning key column is part of the index.
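A minimal sketch of that rule, with made-up table and column names (the real schema is not shown in this post):

```sql
-- Hypothetical hourly-partitioned table; the PK on (id) alone
-- can only be enforced by a GLOBAL unique index:
create table events (
  id         number not null,
  created_at date   not null,
  payload    varchar2(4000),
  constraint events_pk primary key (id)
)
partition by range (created_at) (
  partition p_h00 values less than
    (to_date('2010-08-30 01:00', 'YYYY-MM-DD HH24:MI')),
  partition p_h01 values less than
    (to_date('2010-08-30 02:00', 'YYYY-MM-DD HH24:MI'))
);

-- A LOCAL unique index on (id) alone is rejected, because the
-- partitioning column is not part of the key:
-- create unique index events_pk_local on events (id) local;
-- ORA-14039: partitioning columns must form a subset of key
-- columns of a UNIQUE index
```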

There is one telltale detail in this scheme: the strange application “freeze” happens every hour at the very same minute the TRUNCATE job starts.

So we looked carefully at the database holding the mentioned table. There we saw that, for a few seconds, almost all INSERT-ing sessions wait on cursor: pin S wait on X. There are also some library cache lock waits. We are talking about 4-6 seconds, but quite a lot of sessions.
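A diagnosis of this kind can be done with a quick look at the wait interface; this is a generic sketch, not the exact query we ran:

```sql
-- Count sessions currently stuck on the two suspicious events:
select event, count(*) as sessions
from   v$session_wait
where  event in ('cursor: pin S wait on X', 'library cache lock')
group  by event;

-- For 'cursor: pin S wait on X' in 10.2, P1 is the hash value of
-- the cursor, and the SID of the mutex holder is encoded in the
-- high-order bytes of P2RAW (exact decoding depends on platform
-- word size), which lets you find who is blocking.
```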

* * *

What is really happening?

Disclaimer: I might be wrong here! If you find a mistake, please write me back.

truncate (or, in our case, alter table ... truncate partition) is a DDL statement. Every DDL statement invalidates all cursors on the object (table) that gets modified (altered). Then another session comes, trying to do an insert (we have tens of INSERTing sessions every second). The first INSERT notices the invalid cursor and tries to parse it. We are talking about parsing a single INSERT of a single row into a table – it should complete in milliseconds. But for this parse to happen, the session needs to “lock” the definition of the table in the library cache. Otherwise it may parse the query and come up with a plan, but somebody else might change the structure of the table at the same time and the plan may be wrong. So the first INSERTing session needs a library cache lock.

But the truncate is still running. The insert has to wait for the truncate to finish. Then it will receive the lock on the table structure and complete the parse. It all seems a waste of time for a simple truncate – it does not change the table structure at all. But it is DDL. So the first insert is blocked, and waits.

A few milliseconds later, another session comes, trying to execute an insert. The second session has to check whether the insert statement is already parsed. It calculates the hash of the statement, goes to the library cache and, just to be sure, puts one instant lock on the hash. In 10g/11g this is done using a mutex. Yes, but the first insert is still trying to parse (in fact, it is waiting to begin parsing). And here we have the rare chance to see a real wait on a mutex – an operation that is supposed to be so fast that it happens in an instant. This wait is known as cursor: pin S wait on X.

And the same happens with the next session trying to do an insert. And the next. And next…

* * *

Usually a truncate is quite fast. It should be only a data dictionary update and does not depend on the table size. But here we have a heavier truncate that, besides lowering the HWM, also has to go and fix the global index.

After the completion of the truncate, everything goes back to normal. But until then, we have a few seconds of downtime. And this happens every hour.

Oracle has identified this as a problem in bug 7640597. They even made a patch. The idea of the patch is very simple: do not touch the logic; simply postpone the cursor invalidation until the very end of the truncate. This way all cursors can be re-parsed.

Unfortunately, after releasing the patch, some unwanted “side effects” appeared, so Oracle removed it from Metalink. The general solution is in bug 8441239, fixed in 11.2. But it has not been backported to older versions as of the writing of this post.

* * *

So we have a few options:

1. We may avoid doing a TRUNCATE (or other DDL – like alter table ... exchange partition). Instead, we can do a DELETE. This will add some load to the database, generate lots of redo (and undo), and work more slowly. But it is simple DML and does not invalidate any cursors.

2. We may try to make the TRUNCATE faster by creating partitions every 30 minutes, or even more often. This way each TRUNCATE has less work to do on the index. But this only masks the problem – it is not a solution.

3. We may add the partitioning key column to the primary key. Then we can make the index local, and the TRUNCATE will complete in an instant, as we usually expect from a TRUNCATE. But this is not always possible – it depends on the data and the application.

4. We may upgrade to 11.2. But this cannot be done easily – lots of tests are needed.

5. We may request a backport of the patch. But then we will need “a serious business justification”. This may take weeks, even months; and we would be the first ones to use it in production.
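Options 1 and 3 can be sketched in SQL; again the table and column names here are hypothetical, and the batch size and retention window are placeholders:

```sql
-- Option 1: purge with DML instead of DDL. No cursor invalidation,
-- but lots of redo/undo; delete in batches and repeat until no
-- rows are affected, to keep undo usage bounded:
delete from events
where  created_at < sysdate - 3/24   -- keep the last 3 hours
and    rownum <= 10000;
commit;

-- Option 3: include the partitioning key in the primary key, so
-- the unique index can be LOCAL; then the truncate touches only
-- one index partition and completes in an instant:
alter table events drop constraint events_pk;

create unique index events_pk on events (id, created_at) local;

alter table events add constraint events_pk
  primary key (id, created_at) using index events_pk;
```

Note that option 3 changes what the primary key guarantees (uniqueness of the pair instead of the id alone), which is exactly why it depends on the data and the application.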

The solution depends on the situation. I will show you my testing suite, which I used to evaluate the options.

Stay tuned!

Posted at 17:35
