Feb 052014

This post is also available in: Bulgarian

Sooner or later, the year ends. Again, it’s time to do some reports, to plan a budget, to say “The overall performance of our database is this and this”. But how to find this data?
As I have told already, we keep AWR reports for very long period. One I calculated the database space usage growth, in order to plan the storage needs; now I need tha database server load.

select trunc(this_day, 'MONTH'), avg(day_db_time), sum(cnt) scnt
  from (select this_day, count(1) cnt,
                       when snap_db_time > 0 then
                     end) day_db_time
           from (select trunc(begin_interval_time) this_day, tm.snap_id,
                         value - lag(value, 1) over(order by tm.snap_id) snap_db_time
                    from dba_hist_sys_time_model tm, dba_hist_snapshot s
                   where tm.stat_name = 'DB time'
                     and s.snap_id = tm.snap_id
                     and s.begin_interval_time > add_months(trunc(sysdate, 'YEAR'), -24)
                     and s.begin_interval_time < trunc(sysdate, 'MONTH')
          group by this_day)
 group by trunc(this_day, 'MONTH')
 order by 1

This query is a tricky one. It gives the average load (DB time) for every month in the last 2 years, if you keep such amount of snapshots. If you keep only one month, you can report by day - it it is not that impressive.

One of the detail worth mentioning is, DB time (just like every system stat) is an ever growing number, counted since the instance restart. The query takes the value for every snapshot and subtracts the value of the former snapshot. If the instance is restarted, the number is negative - because the counter starts form zero. To build proper stats, I simply skip such numbers. Instance restart happens so rarely, so I think it cannot influence the stats so much.

SQL> select sysdate, startup_time, trunc(sysdate-startup_time) Days_since_restart from v$instance;

--------- --------- ------------------
05-FEB-14 19-FEB-13                350

The next thing I do in the query is sum the numers by day, and get he average for each month.

To be honest, the query took me quite a while to create. If you see some error, please share your thoughts in the comments below

Once I have the data, I draw a graph in Excel - one line for DB time, one for DB CPU (where tm.stat_name = 'DB CPU') and, to put some business context - number of business transactions for the month (which represents the actual, real-world "work done"). The I show to the Bug Boss how we do more and more actual work, but manage to keep the database load almost constant. In some months there's even decrease in the load - great example for the optimizations done. For example, we we have enabled result cache for some queries, this was easy to see on the graph; optimizations to the stats gathering procedure was also easily seen. The good admins alway suffer from lack of visibility of their efforts. It's hard to show the lack of problems. So a graph like this speaks the same language as the people which decide on salaries and head count.

Another interesting graph is the growth over some period - e.g. a year. "During Jan 2014 we made X% more payments through our system compared to Jan 2013, but the database server load growth is only Y%". Every manager aware of Oracle's CPU license cost will understand that this means.

P.S. RAC-aware edition of this query can be found here

 Posted by at 10:03

Sorry, the comment form is closed at this time.