Automated Checkpoint Tuning (MTTR)

Overview

Determining the time to recover from an instance failure is a necessary component for reaching required service levelsagreements. For example, if service levels dictate that when a node fails, instance recovery time can be no more than 3 minutes, FAST_START_MTTR_TARGET should be set to 180

Fast-start checkpointing refers to the periodic writes by the database writer (DBWn) processes for the purpose of writing changed data blocks from the Oracle buffer cache to disk and advancing the thread-checkpoint. Setting the database parameter FAST_START_MTTR_TARGET to a value greater than zero enables the fast-start checkpointing feature.

Fast-start checkpointing should always be enabled for the following reasons:

It reduces the time required for cache recovery, and makes instance recovery time-bounded and predictable. This is accomplished by limiting the number of dirty buffers (data blocks which have changes in memory that still need to be written to disk) and the number of redo records (changes in the database) generated between the most recent redo record and the last checkpoint.

Fast-Start checkpointing eliminates bulk writes and corresponding I/O spikes that occure traditionally with interval- based checkpoints, providing a smoother, more consistent I/O pattern that is more predictable and easier to manage.

If the system is not already near or at its maximum I/O capacity, fast-start checkpointing will have a negligible impact on performance. Although fast-start checkpointing results in increased write activity, there is little reduction in database throughout, provided the system has sufficient I/O capacity.

Check-Pointing

Check-pointing is an important Oracle activity which records the highest system change number (SCN,) so that all data blocks less than or equal to the SCN are known to be written out to the data files. If there is a failure and then subsequent cache recovery, only the redo records containing changes at SCN(s) higher than the checkpoint need to be applied during recovery.

As we are aware, instance and crash recovery occur in two steps - cache recovery followed by transaction recovery. During the cache recovery phase, also known as the rolling forward stage, Oracle applies all committed and uncommitted changes in the redo log files to the affected data blocks. The work required for cache recovery processing is proportional to the rate of change to the database and the time between checkpoints.

Mean time to recover (MTTR)

Fast-start recovery can greatly reduce the mean time to recover (MTTR), with minimal effects on online application performance. Oracle continuously estimates the recovery time and automatically adjusts the check-pointing rate to meet the target recovery time.

With 10g, the Oracle database can now self-tune check-pointing to achieve good recovery times with low impact on normal throughput. You no longer have to set any checkpoint-related parameters.

This method reduces the time required for cache recovery and makes the recovery bounded and predictable by limiting the number of dirty buffers and the number of redo records generated between the most recent redo record and the last checkpoint. Administrators specify a target (bounded) time to complete the cache recovery phase of recovery with the FAST_START_MTTR_TARGET initialization parameter, and Oracle automatically varies the incremental checkpoint writes to meet that target.

The TARGET_MTTR field of V$INSTANCE_RECOVERY contains the MTTR target in effect. The ESTIMATED_MTTR field of V$INSTANCE_RECOVERY contains the estimated MTTR should a crash happen right away.

Enable MTTR advisory

Enabling MTTR Advisory Enabling MTTR Advisory involves setting two parameters:

STATISTICS_LEVEL = TYPICAL
FAST_START_MTTR_TARGET > 0

Estimate the value for FAST_START_MTTR_TARGET as follows:

SELECT TARGET_MTTR,
       ESTIMATED_MTTR,
       CKPT_BLOCK_WRITES
FROM V$INSTANCE_RECOVERY;

TARGET_MTTR ESTIMATED_MTTR CKPT_BLOCK_WRITES
----------- -------------- -----------------
        214             12            269880

FAST_START_MTTR_TARGET = 214;

Whenever you set FAST_START_MTTR_TARGET to a nonzero value, then set the following parameters to 0.

LOG_CHECKPOINT_TIMEOUT = 0
LOG_CHECKPOINT_INTERVAL = 0
FAST_START_IO_TARGET = 0

Disable MTTR advisory

FAST_START_MTTR_TARGET = 0
LOG_CHECKPOINT_INTERVAL = 200000