Average DB load, RAC edition

 Oracle  Comments Off on Average DB load, RAC edition
Feb 172014
 

A few days ago I posted a query that returns the average load for given database per day, week or month. Works great on single instance databases, but not in RAC environments

RAC databases have one line per instance in DBA_HIST views, so my strategy of subtracting the value from the previous line returns very strange results. So here is the RAC-aware version of the query:

select trunc(this_day, 'MONTH'), avg(day_db_time), sum(cnt) scnt
  from (select this_day, count(1) cnt,
                sum(case
                       when snap_db_time > 0 then
                        snap_db_time/1000/1000/60/60/24
                       else
                        null
                     end) day_db_time
           from (select trunc(begin_interval_time) this_day, tm.snap_id,
                         value - lag(value, 1) over(order by tm.instance_number, tm.snap_id) snap_db_time
                    from dba_hist_sys_time_model tm, dba_hist_snapshot s
                   where tm.stat_name = 'DB CPU'
                     and s.snap_id = tm.snap_id
                     and s.instance_number = tm.instance_number
                     and s.begin_interval_time > add_months(trunc(sysdate, 'YEAR'), -24)
                     and s.begin_interval_time < trunc(sysdate, 'MONTH')
                     )
          group by this_day)
 group by trunc(this_day, 'MONTH')
 order by 1

The only difference is, I'm careful which is the instance on the previous line. This statement works also on single-instance databases.

 Posted by at 19:38

Average DB load

 Oracle  Comments Off on Average DB load
Feb 052014
 

Sooner or later, the year ends. Again, it’s time to do some reports, to plan a budget, to say “The overall performance of our database is this and this”. But how to find this data?
As I have told already, we keep AWR reports for very long period. One I calculated the database space usage growth, in order to plan the storage needs; now I need tha database server load.

select trunc(this_day, 'MONTH'), avg(day_db_time), sum(cnt) scnt
  from (select this_day, count(1) cnt,
                sum(case
                       when snap_db_time > 0 then
                        snap_db_time/1000/1000/60/60/24
                       else
                        null
                     end) day_db_time
           from (select trunc(begin_interval_time) this_day, tm.snap_id,
                         value - lag(value, 1) over(order by tm.snap_id) snap_db_time
                    from dba_hist_sys_time_model tm, dba_hist_snapshot s
                   where tm.stat_name = 'DB time'
                     and s.snap_id = tm.snap_id
                     and s.begin_interval_time > add_months(trunc(sysdate, 'YEAR'), -24)
                     and s.begin_interval_time < trunc(sysdate, 'MONTH')
                     )
          group by this_day)
 group by trunc(this_day, 'MONTH')
 order by 1

This query is a tricky one. It gives the average load (DB time) for every month in the last 2 years, if you keep such amount of snapshots. If you keep only one month, you can report by day - it it is not that impressive.

One of the detail worth mentioning is, DB time (just like every system stat) is an ever growing number, counted since the instance restart. The query takes the value for every snapshot and subtracts the value of the former snapshot. If the instance is restarted, the number is negative - because the counter starts form zero. To build proper stats, I simply skip such numbers. Instance restart happens so rarely, so I think it cannot influence the stats so much.

SQL> select sysdate, startup_time, trunc(sysdate-startup_time) Days_since_restart from v$instance;

SYSDATE   STARTUP_T DAYS_SINCE_RESTART
--------- --------- ------------------
05-FEB-14 19-FEB-13                350

The next thing I do in the query is sum the numers by day, and get he average for each month.

To be honest, the query took me quite a while to create. If you see some error, please share your thoughts in the comments below

Once I have the data, I draw a graph in Excel - one line for DB time, one for DB CPU (where tm.stat_name = 'DB CPU') and, to put some business context - number of business transactions for the month (which represents the actual, real-world "work done"). The I show to the Bug Boss how we do more and more actual work, but manage to keep the database load almost constant. In some months there's even decrease in the load - great example for the optimizations done. For example, we we have enabled result cache for some queries, this was easy to see on the graph; optimizations to the stats gathering procedure was also easily seen. The good admins alway suffer from lack of visibility of their efforts. It's hard to show the lack of problems. So a graph like this speaks the same language as the people which decide on salaries and head count.

Another interesting graph is the growth over some period - e.g. a year. "During Jan 2014 we made X% more payments through our system compared to Jan 2013, but the database server load growth is only Y%". Every manager aware of Oracle's CPU license cost will understand that this means.

P.S. RAC-aware edition of this query can be found here

 Posted by at 10:03