2. Database side diagnostics

Once all the considerations on "Application side diagnostics" are applied, the next step is to look into the settings of ALTIBASE HDB.

Log file I/O contention

ALTIBASE HDB makes use of WAL (Write Ahead Logging) protocol which requires that all modifications are written to a transaction log (logfile) before they are applied to DBMS. If you want to enhance the performance of the application, you will need to check the state of log files.

There are two threads related to the state of log files. The first one is "logfile prepare thread" which allocates new memory in order to write a redo log. The second one is "log write thread" which applies all modifications into DBMS.
Once the speed of "log write thread" exceeds "logfile prepare thread", "log write thread" has to wait for "logfile prepare thread" to prepare log files. This symptom is called as "logfile IO contention".
ALTIBASE HDB provides a parameter called "PREPARE_LOG_FILE_COUNT"(default value is 5) to address this issue. Increasing this value sufficiently can prevent the problem in advance.
You can monitor the state of logfile as follows:

select LF_OPEN_COUNT, LF_PREPARE_COUNT, LF_PREPARE_WAIT_COUNT from v$lfg;

If the value of LF_PREPARE_WAIT_COUNT is greater than 1 from the result of the above query, you need to increase the value. The value can be estimated as following.
Double the parameter value of "PREPARE_LOG_FILE_COUNT", then see the value of LF_PREPARE_WAIT_COUNT is still larger than 1.
If so, double the value again and see the result until the value reaches 100.
If LF_PREPARE_WAIT_COUNT is still over 1, it means that the performance of disk is slow. Therefore, you have to consider changing the disk to resolve the problem.

Icon

Caution : ALTIBASE HDB allocates memory for log files based on the value of PREPARE_LOG_FILE_COUNT. Therefore, larger value means larger memory usage.

Checkpoint I/O

The fuzzy and ping-pong based implementation of ALTIBASE HDB checkpointing does not impact transaction response times. However, if the log files and data files are physically on the same file system there may be some read and write I/O contention.

In another case, If buffered I/O mechanisms of the Operating System (OS) is used, most of I/O the traffic resides in the OS buffer. Once OS flushes all data onto disk area, the performance again may fluctuate at that time. If buffered I/O of the OS causes performance degradations, you may want to consider changing it to DIRECT I/O. Please, refer to the OS manual how to change this setting.

The best approach to deal with such I/O contentions is to separate the physical location of the log files and datafiles.

ALTIBASE HDB provides the following parameters to scatter I/O writes.

CHECKPOINT_BULK_WRITE_PAGE_COUNT 0 => 100
CHECKPOINT_BULK_WRITE_SLEEP_SEC 0 => 0
CHECKPOINT_BULK_WRITE_SLEEP_USEC 0 => 5000
CHECKPOINT_BULK_SYNC_PAGE_COUNT 3200 => 100

Icon

Caution : Checkpoint duration with tuned parameters as above is longer than when using the default parameters. Therefore in this case, you have to have more disk space available for log files.

MULTIPLEXING_THREAD_COUNT setting

We highly recommend that the number of MULTIPLEXING_THREAD_COUNT be twice or four times bigger than the number of CPU cores. The best-suited value must be determined by testing with the application.

iSQL> select rpad(type, 30), count(*) from v$service_thread group by type
    2  union all
    3 select '# of IPC ', count(*) from v$session where comm_name like '%IPC%';
RPAD(TYPE, 30)                  COUNT
--------------------------------------------------------
DEDICATED                       1
SHARED                          16
# of IPC                        1

It is best to have the number of "DEDICATED" thread same as "# of IPC"
Having "DEDICATED" value greater than # of IPC means a long running query may exist, so you will need to locate and tune it.

iSQL> select rpad(type, 30), count(*) from v$service_thread group by type
    2  union all
    3 select rpad(name, 30), value1 from v$property where name like 'MULTIPLEXING%_THREAD_COUNT';
RPAD(TYPE, 30)                  COUNT
-----------------------------------------------
SOCKET                          16
MULTIPLEXING_THREAD_COUNT       16
MULTIPLEXING_MAX_THREAD_COUNT   1024

The number of "SOCKET" thread might grow up to the value of MULTIPLEXING_MAX_THREAD_COUNT(default 1024). If the value of "SOCKET" is bigger than the value of MULTIPLEXING_THREAD_COUNT, again there may exist some long running queries.

A long running query slows down the performance and takes much CPU usage in the system.

Memory Ager (Garbage Collector - GC)

The Memory Ager (GC) reclaims garbage or memory occupied by objects that are no longer in use by the application - the committed old versions by transactions. The GC works independently formed by threads and is consisted with MEM_LOGICAL_AGER and MEM_DELTHR. The former mostly clears unused "index pages" while the latter does unused "data pages", respectively. If GC can not process the work fast enough, it affects the performance of transactions. The reason is that transactions may look up not only the current version of record and index pages, but also the old versions of them.

ALTIBASE HDB provides the performance view tables to monitor the state of GC.

select add_oid_cnt, gc_oid_cnt , add_oid_cnt - gc_oid_cnt GCGAP from v$memgc;

If you get increasing values of GCGAP while executing the query above, there may exist some long running queries. You can locate such queries with the following SQL.

select c.session_id, comm_name, client_pid, execute_flag, total_time, execute_time, fetch_time, rpad(query,500)
  from (select * from v$memgc limit 1) a, v$transaction b, v$statement c, v$session d
 where (a.MINMEMSCNINTXS = b.MEMORY_VIEW_SCN or a.MINMEMSCNINTXS = b.MIN_MEMORY_LOB_VIEW_SCN)
   and b.id = c.tx_id
   and c.session_id = d.id;

If there are no symptoms related to the cases above, ALTIBASE HDB provides the parameters below to take control.

AGER_WAIT_MINIMUM=200000 (microseconds)
AGER_WAIT_MAXIMUM=1000000 (microseconds)

To enhance GC's performance, you need to lower the parameter values. Then the next wake-up time of GC, it will be faster.

Icon

Caution : The faster wake-up time of GC is, the more CPU cycle is consumed by GC which may result in slower transaction response times due to high CPU consumption.

BUFFER_AREA_SIZE

The Buffer Cache uses the LRU (least recently used) algorithm to keep track of what was used and when. The LRU allows HDB to determine which buffers have not been accessed for a long time so that they can be replaced first.In ALTIBASE HDB, the LRU list is separated into hot and cold zones, and can thus be called a “hot-cold LRU list”. Buffers that are accessed frequently are placed in the hot zone, whereas those that are not accessed frequently are placed in the cold zone. When a buffer needs to be replaced, only the cold zone is searched, meaning that hot buffers are not considered as replacement candidates.
When a page is first loaded into a buffer, it is inserted at the mid-point (LRU cold first) of the LRU list. When allocating a buffer to the new data page, if there are no free buffers in the prepare list, the end (LRU cold last) of this list is searched first, and a cold buffer is then replaced. The buffer that is replaced is called a “victim”. Buffers that are read frequently are moved to the “LRU hot first” position in the hot zone. Meanwhile, dirty buffers, that is, buffers containing pages that have been updated but haven't been flushed to disk, are moved to the flush list. Additionally, clean buffers, that is, buffers containing pages that haven't been updated, are designated as replacement buffers as long as they are not in the hot zone. The relative size of the hot zone can be set using the HOT_LIST_PCT property. The default is 50, which means that half of the LRU list is used as the hot zone. Users need to analyze access patterns in their application since HOT_LIST_PCT property values can influence the performance.

Example SQL how to alter HOT_LIST_PCT value

alter system set HOT_LIST_PCT = 60;

SQL PLAN CACHE

The Plan Cache is a piece of memory that ALTIBASE HDB uses in order to store SQL plans. This feature has been introduced to HDB because the Prepare-Phase(Parse-Validate-Optimize) is expensive in terms of CPU cycles. It's functionality can be compared to the Library Cache of Oracle.

ALTIBASE HDB provide performance-views to find out about the state of the SQL plan cache.

V$SQL_PLAN_CACHE
V$SQL_PLAN_CACHE_PCO
V$SQL_PLAN_CACHE_SQLTEXT

Following parameters are used to tune the performance of the SQL plan cache.

name	value
SQL_PLAN_CACHE_SIZE	67108864
SQL_PLAN_CACHE_BUCKET_CNT	127
SQL_PLAN_CACHE_HOT_REGION_LRU_RATIO	50
SQL_PLAN_CACHE_PREPARED_EXECUTION_CONTEXT_CNT	1

Check the hit_ratio of the plan cache in the time interval.

?
select CACHE_HIT_COUNT, CACHE_MISS_COUNT from v$sql_plan_cache;

If CACHE_HIT_COUNT is much bigger than CACHE_MISS_COUNT, that would be a desirable state.
We highly recommend you that the ratio of "CACHE_MISS_COUNT / CACHE_HIT_COUNT" be normally less than 1/10000. Please also note that this figure is not applicable to all cases.
Icon

Recommended Action :
a. extend the size of PLAN_CACHE
b. convert the source "direct-execute" to "prepare-execute"
c. configure the SQL_PLAN_CACHE_HOT_REGION_LRU_RATIO.
Check the number of statements and SQL_PLAN_CACHE_BUCKET_CNT
The plan cache is managed by a hash-table. SQL_PLAN_CACHE_BUCKET_CNT property inidicates the number of hash table buckets in a SQL plan cache. To reduce the scan time, we highly recommend that SQL_PLAN_CACHE_BUCKET_CNT would be 1/4 of total number of statements based on our experience.

?
select count(*) as total_number_of_statement from v$statement;

Set the SQL_PLAN_CACHE_BUCKET_CNT = total_number_of_statement/4
SQL_PLAN_CACHE_PREPARED_EXECUTION_CONTEXT_CNT
This property indicates the number of execution contexts that are initially created when plans are generated. The initial number of execution contexts is specified before plans are created, however, this only determines the initial number. The number of execution contexts increases or decreases automatically
as required during runtime. Increasing this value can help realize better performance when only one plan is executed at a time, however, in other cases the plan size is merely increased, without realizing improved performance.To avoid the concurrent contention, we recommend that SQL_PLAN_CACHE_PREPARED_EXECUTION_CONTEXT_CNT would be similar as the number of CPU's.

Others

If all considerations above are applied, and still observing performance issues, the following procedure will be helpful to figure out what the problem is.

pstack $ALTIBASE_PID

We highly recommend you to consecutively gather 3 times of pstack with sleep interval (for example: 10 seconds).

select * from v$mutex order by miss_count desc limit 10;

select event, wait_time from v$statement where wait_time > 1;

select event, TOTAL_WAITS, time_waited from v$system_event order by 2,3;

Log file I/O contention

Checkpoint I/O

MULTIPLEXING_THREAD_COUNT setting

Memory Ager (Garbage Collector - GC)

BUFFER_AREA_SIZE

SQL PLAN CACHE

Others

2 Comments

Anonymous

Technical Support 1 Team