Tuesday, December 6, 2016

SCN (System Change Number) and Checkpoints

System Change Number (SCN)
system change number (SCN) is a logical, internal time stamp used by Oracle Database. SCNs order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction. Oracle Database uses SCNs to mark the SCN before which all changes are known to be on disk so that recovery avoids applying unnecessary redo. The database also uses SCNs to mark the point at which no redo exists for a set of data so that recovery can stop.
SCNs occur in a monotonically increasing sequence. Oracle Database can use an SCN like a clock because an observed SCN indicates a logical point in time and repeated observations return equal or greater values. If one event has a lower SCN than another event, then it occurred at an earlier time with respect to the database. Several events may share the same SCN, which means that they occurred at the same time with respect to the database.
Every transaction has an SCN. For example, if a transaction updates a row, then the database records the SCN at which this update occurred. Other modifications in this transaction have the same SCN. When a transaction commits, the database records an SCN for this commit.
Contrary to popular belief, it does not get generated only at the time of commit, though that’s what the name itself suggests. SCN is there all the time in the database, representing a time portion for the operation which is happening at that instant of time. It is not completely inaccurate to say that SCN doesn’t get generated with a commit, it does – but that’s not the sole way SCN is generated.
Oracle Database increments SCNs in the system global area (SGA). When a transaction modifies data, the database writes a new SCN to the undo data segment assigned to the transaction. The log writer process then writes the commit record of the transaction immediately to the online redo log. The commit record has the unique SCN of the transaction. Oracle Database also uses SCNs as part of its instance recovery and media recovery mechanisms.
SCN is 6 Byte (48bit) number represented as 2 parts – SCN_BASE and SCN_WRAP. An SCN_BASE is a 4 Byte (32 bits) number and SCN_WRAP is a 2 Byte (16 bits) number. Whenever SCN_BASE reaches its maximum (2 power 32 = 4294967296 ) , SCN_WRAP goes up by one and SCN_BASE will be reset to 0. This continues till SCN_WRAP reaches it’s maximum (i.e. 2 power 16 = 65536). SCN = (SCN_WRAP * 4294967296) + SCN_BASE.
Now the question arises in our mind – what happens when SCN reaches it’s maximum? and how the Oracle handles that situation?
Let’s say you have fresh database with both SCN_WRAP and SCN_BASE at value 0. As the database is up and running, SCN value keeps on increasing for every second. For SCN_BASE to reach it’s maximum, it will take around 136 years ( 4294967296/(365*24*60*60) – Assuming one transaction per second.). This is just for one Wrap and imagine you have still 65535 wraps to complete. So, no need to fear about reaching maximum.
Even if it does reach maximum, then SCN will be reseted to 0 useless. Need to take fresh backup’s.
SMON_SCN_TIME Table
The easiest way to see both of these values is from an internal table owned by SYS user - SMON_SCN_TIME. The following is an output from the same (11204).
SELECT s.SCN_WRP, s.SCN_BAS, s.SCN from SMON_SCN_TIME s where ROWNUM<3;
SCN_WRP    SCN_BAS   SCN
----------      ----------        ----------
1                 86376436    4381343732
1                86245363    4381212659
This table contains the entries of the SCNs generated. It stores the data in roughly 5-minute increments and holds 5 days’ worth of data. This means the table contains approximately 1440 records. The exact number of records will vary slightly since the storage increment is not exactly 5 minutes.
SELECT count(*) from SMON_SCN_TIME s;
COUNT(*)
----------
1690

How can we map SCN with Timestamp in versions prior to 10g?
It’s important to remember that this table doesn’t show why there would be any increase in the rate of increase in the SCN numbers. With a little formatting, it is possible to have an idea of the SCN numbers generated but the source of their growth won’t be evident from this table.
Every redo log file has both a log sequence number and low (FIRST_CHANGE#) and high (NEXT_CHANGE#) SCN. The low SCN records the lowest SCN recorded in the log file, while the high SCN records the highest SCN in the log file.
Another way to check this is from the view V$LOG_HISTORY. The view contains the SCN in the form of the columns FIRST_CHANGE# and NEXT_CHANGE# and we can see through these two columns the amount of SCNs generated in the database over a period of time. The “first_change#” is the lowest SCN that appears in the archived log file at a given sequence number of this thread. The “next_change#” is the lowest SCN that appears in the next archived log file.
select thread#, first_change#,next_change# from V$log_history;

THREAD#  FIRST_CHANGE#    NEXT_CHANGE#
----------     -------------                  ------------
1               4211590818                4211593771
1              4211593771                 4211596653
1              4211596653                 4211599617
1             4211599617                 4211602580
1             4211602580                 4211605360
1             4211605360                 4211608174
1             4211608174                 4211610989
1             4211610989                 4211624985
1             4211624985                 4211629091
As with SMON_SCN_TABLE it’s not possible to find the source of the increase in the generation of the SCN numbers from this table view. Still, you can use this view in the single instance as well as in an RAC environment.

SCN and Checkpoints
A checkpoint occurs when all modified database buffers in the Oracle SGA are written out to datafiles by the database writer (DBWn) process. The checkpoint process (CKPT) updates all datafiles and control files with the SCN at the time of the checkpoint and signals DBWn to write out the blocks. A successful checkpoint guarantees that all database changes up to the checkpoint SCN have been recorded in the datafiles. In other words, the checkpoint records the highest SCN of all changes to blocks such that all data blocks with changes below that SCN have been written to disk by DBWn. As a result, only those changes made after the checkpoint need to be applied during recovery.
Checkpoints occur automatically as follows:
§  Whenever a redo log switch takes place (or by issuing the command ‘alter system switch logfile’ or ‘alter system checkpoint’.)
§  When all dirty buffers in the Oracle SGA are written to datafiles by DBWR.
§  Every 3 seconds (incremental checkpoint)
§  When instance shutdown normal/transactional/immediate
§  Whenever Alter Tablespace [Offline Normal|Read Only|begin Backup]
§  Whenever the time set by the LOG_CHECKPOINT_TIMEOUT initialization parameter is reached
§  Whenever the amount of redo written reaches the number of bytes associated with the LOG_CHECKPOINT_INTERVAL
Typically, LOG_CHECKPOINT_INTERVAL is chosen so that checkpoints only occur on log switches. Oracle stores the SCN associated with the checkpoint in four places: three of them in the control file and one in the datafile header for each datafile:
1. System checkpoint SCN   =======>  (SYSTEM CHECKPOINT SCN in control file)
2. Datafile checkpoint SCN  =======>  (DATAFILE CHECKPOINT SCN in control file)
3. Stop SCN                          =======>  (STOP SCN in control file)
4. Start SCN                          =======>  (DATAFILE HEADER)
Why CONTROL FILE stored in two places to be divided (SYSTEM CHECKPOINT SCN, DATAFILE CHECKPOINT SCN?)  When you put a tablespace set to read-only, SCN will freeze to stop him, this time DATAFILE CHECKPOINT SCN is no longer incremental changes, but the overall SYSTEM CHECKPOINT SCN will continue to increase, but still advancing. So, this is why you need are stored in two places SCN.
During a clean shutdown, a checkpoint occurs and the stop SCN for each datafile is set to the start SCN of the datafile header. Now, all the 4 SCN values are same.
Upon startup, Start SCN value is checked with Datafile Checkpoint SCN, if they both match then Start SCN value is checked with Stop SCN and if they do match, Oracle opens the database without any problem. Since in our case as mentioned above all the 4 values are same and so we have a clean startup. Once the database is opened, stop SCN of the control file will be resetted to NULL again to indicate that the datafile is open for normal use. At instance startup, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are checked for equality. If they don’t match, it is a signal that media recovery is required.
Purpose of Checkpoints
Oracle Database uses checkpoints to achieve the following goals:
           -    Reduce the time required for recovery in case of an instance or media failure
           -  Ensure that dirty buffers in the buffer cache are written to disk regularly
           -    Ensure that all committed data is written to disk during a consistent shutdown

The System Checkpoint SCN
After a checkpoint completes, Oracle stores the system checkpoint SCN in the control file. You can access the checkpoint SCN using the following SQL:

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
1677903

SQL> alter system checkpoint;

System altered.

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
1679716

The Datafile Checkpoint SCN
After a checkpoint completes, Oracle stores the SCN individually in the control file for each datafile. The following SQL shows the datafile checkpoint SCN for a single datafile in the control file:
SQL> select name, checkpoint_change# from v$datafile where name like '%users01%';
NAME                                   CHECKPOINT_CHANGE#
-----------------------------------             --------------------
/u02/oradata/OMFD1/users01.dbf        292767

Partial Checkpoint SCN
Operational non-full checkpoints for subset of system i.e. tablespace or a datafile etc, would set checkpoint for affected entities only
SQL> select name, checkpoint_change# from v$datafile_header where name like '%01.dbf';

NAME                                                      CHECKPOINT_CHANGE#
----------------------------------------------------      ------------------
/u02/app/oracle/oradata/mask11g/system01.dbf               1685610
/u02/app/oracle/oradata/mask11g/sysaux01.dbf               1685610
/u02/app/oracle/oradata/mask11g/undotbs01.dbf              1685610
/u02/app/oracle/oradata/mask11g/users01.dbf                1685610

SQL> alter tablespace users read only;

Tablespace altered.

SQL> select name, checkpoint_change# from v$datafile_header where name like '%01.dbf';

NAME                                                      CHECKPOINT_CHANGE#
----------------------------------------------------      ------------------
/u02/app/oracle/oradata/mask11g/system01.dbf               1685610
/u02/app/oracle/oradata/mask11g/sysaux01.dbf               1685610
/u02/app/oracle/oradata/mask11g/undotbs01.dbf              1685610
/u02/app/oracle/oradata/mask11g/users01.dbf                1685618

SQL> alter tablespace users read write;

Tablespace altered.

SQL> select name,checkpoint_change# from v$datafile_header where name like '%01.dbf';

NAME                                                     CHECKPOINT_CHANGE#
----------------------------------------------------     ------------------
/u02/app/oracle/oradata/mask11g/system01.dbf              1685610
/u02/app/oracle/oradata/mask11g/sysaux01.dbf              1685610
/u02/app/oracle/oradata/mask11g/undotbs01.dbf             1685610
/u02/app/oracle/oradata/mask11g/users01.dbf                 1685642

The Start SCN
Oracle stores the checkpoint SCN value in the header of each datafile. This is referred to as the start SCN because it is used at instance startup time to check if recovery is required. The following SQL shows the checkpoint SCN in the datafile header for a single datafile:
SQL> select name, checkpoint_change# from v$datafile_header where name like '%users01%';

NAME                                                   CHECKPOINT_CHANGE#
-----------------------------------               --------------------
/u02/oradata/OMFD1/users01.dbf     292767

The Stop SCN
The stop SCN is held in the control file for each datafile. The following SQL shows the stop SCN for a single datafile when the database is open for normal use:

SQL> select name, last_change# from v$datafile where name like '%users01%';

NAME                                                    LAST_CHANGE#
-----------------------------------                ------------
/u02/oradata/OMFD1/users01.dbf
Under the normal conditions, where database is running without any problem, the System Checkpoint SCN, Datafile Checkpoint SCN and Start SCN have equal values. Whereas, Stop SCN will always be NULL under normal conditions for all datafiles that are online in read-write mode.
SCN after a Clean Shutdown resulting from a SHUTDOWN IMMEDIATE or SHUTDOWN NORMAL of the database, followed by STARTUP MOUNT, the previous queries on v$database and v$datafile return the following:
SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
--------------------
293184

SQL> select name,checkpoint_change#,last_change# from v$datafile where name like '%user%';

NAME                                                  CHECKPOINT_CHANGE#    LAST_CHANGE#
-----------------------------------              --------------------                       --------------
/u02/oradata/OMFD1/users01.dbf     293184                                      293184

SQL> select name,checkpoint_change# from v$datafile_header where name like '%users01%';

NAME                                                    CHECKPOINT_CHANGE#
-----------------------------------                 --------------------
/u02/oradata/OMFD1/users01.dbf         293184

SCN after an Instance Crash
The previous example showed the behavior of the SCN after a clean shutdown. To demonstrate the behavior of the checkpoints after an instance crash, the following SQL creates a table (which performs an implicit commit) and inserts a row of data into it without a commit:

create table x(x number) tablespace users;
insert into x values(100);


If the instance is crashed by using SHUTDOWN ABORT, the previous queries on v$database and v$datafile return the following after the database is started up in mount mode:

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
--------------------
293185

SQL> select name,checkpoint_change#,last_change# from v$datafile where name like '%users01%';

NAME                                                 CHECKPOINT_CHANGE#     LAST_CHANGE#
-----------------------------------              --------------------                       --------------
/u02/oradata/OMFD1/users01.dbf     293185

SQL> select name,checkpoint_change# from v$datafile_header where name like '%users01%';

NAME                                                    CHECKPOINT_CHANGE#
-----------------------------------                 --------------------
/u02/oradata/OMFD1/users01.dbf        293185


In this case, the stop SCN is not set, which is indicated by the NULL value in the LAST_CHANGE# column. This information enables Oracle, at the time of the next startup, to determine that the instance crashed because the checkpoint on shutdown was not performed. If it had been performed, the LAST_CHANGE# and CHECKPOINT_CHANGE# values would match for each datafile as they did during a clean shutdown. If an instance crashes at shutdown, then instance crash recovery is required the next time the instance starts up.

Recovery from an Instance Crash
Upon the next instance startup that takes place after SHUTDOWN ABORT or a DBMS crash, the Oracle DBMS detects that the stop SCN for datafiles is not set in the control file during startup. Oracle then performs crash recovery. During crash recovery, Oracle applies redo log records from the online redo logs in a process referred to as roll forward to ensure that all transactions committed before the crash are applied to the datafiles. Following roll forward, active transactions that did not commit are identified from the rollback segments and are undone before the blocks involved in the active transactions can be accessed. This process is referred to as roll back. In our example, the following transaction was active but not committed at the time of the SHUTDOWN ABORT, so it needs to be rolled back:
SQL> insert into x values (100);
After instance startup, the X table exists, but remains empty. Instance recovery happens automatically at database startup without database administrator (DBA) intervention. It may take a while because of the need to apply large amounts of outstanding redo changes to data blocks for transactions that completed and those that didn’t complete and require roll back.

Recovery from a Media Failure
Up until this point, the checkpoint start SCN in the datafile header has always matched the datafile checkpoint SCN number held in the control file. This is reasonable because during a checkpoint, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are both updated, along with the system checkpoint SCN. The following SQL shows the start SCN from the datafile header and datafile checkpoint SCN from the control file for the same file:
SQL> select 'controlfile' "SCN location",name, checkpoint_change# from v$datafile where name like     '%users01%'
union
select 'file header',name,checkpoint_change# from v$datafile_header where name like '%users01%';

SCN location     NAME                                               CHECKPOINT_CHANGE#
--------------      -----------------------------------             --------------------
controlfile      /u02/oradata/OMFD1/users01.dbf      293188
file header     /u02/oradata/OMFD1/users01.dbf      293188

Unlike the v$datafile view, there is no stop SCN column in the v$datafile_header view because v$datafile_header is not used at instance startup time to indicate that an instance crash occurred. However, the v$datafile_header does provide the Oracle DBMS with the information it requires to perform media recovery. At instance startup, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are checked for equality. If they don’t match, it is a signal that media recovery is required.
For example, media recovery is required if a media failure has occurred and the original datafile has been replaced with a backup copy. In this case, the start SCN in the backup copy is less than the checkpoint SCN value in the control file, and Oracle requests archived redo logs—generated at the time of previous log switches—in order to reapply the changes required to bring the datafile up to the current point in time.
In order to recover the database from a media failure, you must run the database in ARCHIVELOG mode to ensure that all database changes from the online redo logs are stored permanently in archived redo log files.
You can identify files that need recovery after you have replaced a datafile with an older version by starting the instance in mount mode and running the following SQL:
SQL> select file#, change# from v$recover_file;
FILE#       CHANGE#
----------    ----------
4                313401
In this example, file 4 is the datafile in the USERS tablespace. By reexecuting the previous SQL to display the datafile checkpoint SCN in the control file and the start SCN in the datafile header, you can see that the start SCN is older due to the restore of the backup datafile that has taken place:
SQL> select 'controlfile' "SCN location",name,checkpoint_change#
       from v$datafile where name like '%users01%'
      union
      select 'file header',name,checkpoint_change#
     from v$datafile_header where name like '%users01%';
SCN location    NAME                                               CHECKPOINT_CHANGE#
--------------     -----------------------------------             --------------------
controlfile      /u02/oradata/OMFD1/users01.dbf     313551
file header     /u02/oradata/OMFD1/users01.dbf     313401
If you were to attempt to open the database, you would receive errors like the following:
ORA-01113: file 4 needs media recovery
ORA-01110: datafile 4: '/u02/oradata/OMFD1/users01.dbf'
You can recover the database by issuing RECOVER DATABASE from SQL*Plus while the database is in a mounted state. If the changes needed to recover the database to the point in time before the crash are in an archived redo log, then you will be prompted to accept the suggested name:
ORA-00279: change 313401 generated at 11/10/2001 18:50:23 needed for thread
ORA-00289: suggestion : /u02/oradata/OMFD1/arch/T0001S0000000072.ARC
ORA-00280: change 313401 for thread 1 is in sequence #72
Specify log: {=suggested | filename | AUTO | CANCEL}
If you respond to the prompt using AUTO, Oracle applies any archived redo logs it needs, followed by any necessary changes in the online redo logs, to bring the database right up to the last committed transaction before the media failure that caused the requirement for the restore.
So far, we’ve considered recovery scenarios where the goal is to recover the database to the most recent transaction. This is known as complete recovery. The RECOVER DATABASE command has several other options that enable you to recover from a backup to a point in time before the most recent transaction by rolling forward and then stopping the application of the redo log changes at a specified point. This is known as incomplete recovery. You can specify a time or an SCN as the recovery point. For example,
recover database until time '2001-11-10:18:52:00';
recover database until change 313459;
Before you perform incomplete recovery, it’s recommended that you restore a complete database backup first. After incomplete recovery, you must open the mounted database with ALTER DATABASE OPEN RESETLOGS. This creates a new incarnation of the database and clears the contents of the existing redo logs to make sure they can’t be applied.

Recovery from a Media Failure Using a Backup Control File
In the previous example, we had access to a current control file at the time of the media failure. This means that none of the start SCN values in the datafile headers exceeded the system checkpoint SCN number in the control file. To recap, the system checkpoint number is given by the following:
SQL> select checkpoint_change# from v$database;
You might be wondering why Oracle needs to maintain the last system checkpoint value in the control file as well as checkpoint SCNs in the control file for each datafile (as used in the previous example). There are two reasons for this. The first is that you might have read-only tablespaces in your database. In this case, the database checkpoint SCN increases, and the checkpoint SCN for the datafiles in the read-only tablespace remains frozen in the control file.
The following SQL report output shows a database with a read-write tablespace (USERS) and read-only tablespace (TEST). The start SCN in the file header and the checkpoint SCN in the control file for TEST are less than the system checkpoint value. Once a tablespace is read only, checkpoints have no effect on the files in it. The other read-write tablespace has checkpoint values that match the system checkpoint:
SCN location           NAME                                                CHECKPOINT_CHANGE#
--------------------     ----------------------------------              ----------------
controlfile               SYSTEM checkpoint                            355390
file header              /u02/oradata/OD2/users01.dbf           355390
file in controlfile    /u02/oradata/OD2/users01.dbf           355390
file header              /u02/oradata/OD2/test01.dbf             355383
file in controlfile    /u02/oradata/OD2/test01.dbf             355383
The second reason for the maintenance of multiple checkpoint SCNs in the control file is that you might not have a current control file available at recovery time. In this case, you need to restore an earlier control file before you can perform a recovery. The system checkpoint in the control file may indicate an earlier change than the start SCN in the datafile headers.
The following SQL shows an example where the system checkpoint SCN and datafile checkpoint SCN indicate an earlier change than the start SCN in the datafile header:
SQL> select 'controlfile'  "SCN location", 'SYSTEM checkpoint ' name, checkpoint_change#
       from v$database
       union
       select 'file in controlfile', name, checkpoint_change#
       from v$datafile where name like 'users01%'
       union
       select 'file header', name, checkpoint_change#
       from v$datafile_header where name like '%users01%';
       SCN location        NAME                                             CHECKPOINT_CHANGE#
       -------------------     ------------------------------               ------------------
       controlfile          SYSTEM checkpoint                          333765
       file header          /u02/oradata/OD2/users01.dbf        355253
       file in controlfile   /u02/oradata/OD2/users01.dbf     333765
If try you to recover a database in the usual way in this situation, Oracle detects that the control file is older than some of the datafiles, as indicated by the checkpoint SCN values in the datafile headers, and reports the following message:
SQL> recover database
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
If you want to proceed with recovery in this situation, you need to indicate to Oracle that a noncurrent control file - possibly containing mismatches in the SCN values identified by the previous error messages - is about to be specified for recovery by using the following command:
recover database using BACKUP CONTROLFILE;
A consistent backup of a database or part of a database is a backup in which all read-write datafiles and control files have been checkpointed with respect to the same SCN.
An inconsistent backup s a backup in which all read-write datafiles and control files have not been checkpointed with respect to the same SCN. For example, one read-write datafile header may contain an SCN of 100 while others contain an SCN of 95. Oracle cannot open the database until these SCNs are consistent, that is, until all changes recorded in the online redo logs have been made to the datafiles.
Oracle enables you to restore an older backup and apply partial redo data, thereby recovering the database to a specified non-current time or SCN. This type of recovery is called incomplete recovery. You must open your database with a RESETLOGS operation after performing incomplete recovery in order to reset the online redo logs.
As a query enters the execution stage, the current system change number (SCN) is determined. As data blocks are read on behalf of the query, only blocks written with the observed SCN are used. Blocks with changed data (more recent SCNs) are reconstructed from data in the rollback segments, and the reconstructed data is returned for the query. Therefore, each query returns all committed data with respect to the SCN recorded at the time that query execution began. Changes of other transactions that occur during a query's execution are not observed, guaranteeing that consistent data is returned for each query. By default a block-level SCN is recorded for each block. That SCN is the latest SCN for any record in the block.


Crash recovery vs media recovery

STOP SCN equal NULL ==> CRASH RECOVERY NEED
DATAFILE HEADER START SCN and SCN CONTROLFILE not equal ==> MEDIA RECOVERY NEED


Difference Between "SCN and ORA_ROWSCN"

SCN is like ticker/end of every transaction in database whereas ORA_ROWSCN is an observed point in time. The ora_rowscn is a value associated to a block or a row on a block that represents the “time” the block/row was last modified.
By default, the ora_rowscn pseudo column reports the scn at block level. This means that rows sharing the same block will have the same ora_rowscn, and if one row were to be changed, then scn would be altered for both blocks.


RECOVERY DATABASE two common problems!!!

1) RECOVER DATABASE UNTIL CANCEL ==> OPEN DATABASE RESETLOG ==> DATAFILE HEADER SCN will be less than the DATAFILE SCN CONTROLFILE
this time will be unable to open the database, you must redo archive log media recovery ~~ until the datafile header of SCN = current scn. After the restore datafile, you can mount database and then went to check the controlfile and datafile header SCN.

2) RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE ===> OPEN  DATABASE RESETLOG ==> DATAFILE HEADER SCN will be greater than the DATAFILE  SCN CONTROLFILE
this time after Oracle RESTORE CONTROL FILE SYSTEM SCN CONTROL FILE will find inside will be less than the current DATAFILE HEADER SCN, nor equal to the current stored in the LOG SCN FILE inside, then you must use the RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE to DROP DATAFILE OR DROP TABLESPACE before the SCN.

No comments:

Post a Comment